🔢 Digit Recognizer: Handwritten Digit Recognition, a Computer Vision Primer from Random Forest to CNN

Updated 2025/12/16 · Published 2025/12/16 · 44 min read

Winding lines speak of intent,  
yet often drift toward blurred boundaries.  
Finding order within chaos  
is the first exam I hand to the machine.

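📝 Preface

The problems I worked on before were mostly tabular classification or regression, where tree models do well. This time the data is images, and the challenge is completely different.

Digit Recognizer is a classic handwritten digit recognition task: train a model on a large set of handwritten digit images, then predict the digits in unseen images. On Kaggle it is a beginner problem because the underlying dataset is the well-known MNIST, which makes it an ideal first exercise for anyone new to image tasks.

For me, this is the first step into computer vision.

📊 A First Look at the Data

Whatever the problem, it pays to look closely at the data first, so let's start there.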

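import pandas as pd

# Load the data (train_path / test_path point to the competition CSV files)
train_data = pd.read_csv(train_path)
test_data = pd.read_csv(test_path)
# Inspect the data
train_data.info()
train_data.describe()
train_data.head()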

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 42000 entries, 0 to 41999
Columns: 785 entries, label to pixel783
dtypes: int64(785)
memory usage: 251.5 MB
label	pixel0	pixel1	pixel2	pixel3	pixel4	pixel5	pixel6	pixel7	pixel8	...	pixel774	pixel775	pixel776	pixel777	pixel778	pixel779	pixel780	pixel781	pixel782	pixel783
0	1	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
1	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
2	1	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
3	4	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
4	0	0	0	0	0	0	0	0	0	0	...	0	0	0	0	0	0	0	0	0	0
5 rows × 785 columns

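As you can see, the images have already been converted to grayscale values (0-255). There are 42,000 images, and each row is a label plus 28x28 = 784 pixel values (785 columns in total), so no image preprocessing is needed and we can go straight to data analysis and feature engineering.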
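To make the column layout concrete, here is a minimal sketch of how a column name such as pixel400 maps to a position in the 28x28 image. It assumes the row-major layout described on the competition's data page (pixel index = row * 28 + column); the helper names are hypothetical and not part of the original notebook.

```python
import numpy as np

IMG_SIDE = 28  # images are 28x28, so each CSV row holds 784 pixel columns


def pixel_to_row_col(k: int) -> tuple[int, int]:
    """Map a flat index k (as in the column name 'pixel{k}') to (row, col)."""
    return divmod(k, IMG_SIDE)


def row_col_to_pixel(row: int, col: int) -> int:
    """Inverse mapping: (row, col) back to the flat pixel index."""
    return row * IMG_SIDE + col


# A flat vector whose values equal their own index makes the layout visible.
flat = np.arange(IMG_SIDE * IMG_SIDE)
img = flat.reshape(IMG_SIDE, IMG_SIDE)  # row-major (C order), NumPy's default

print(pixel_to_row_col(0))    # first pixel: top-left corner
print(pixel_to_row_col(783))  # last pixel: bottom-right corner
print(img[1, 0])              # the pixel that starts the second image row
```

This is the same convention that reshape(28, 28) relies on when a row of the table is turned back into an image.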

We can of course also turn a row of the table back into a digit image.

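import matplotlib.pyplot as plt

# Separate the label from the pixels
row = train_data.iloc[0]
label = row['label']
pixels = row.iloc[1:]
# Reshape into image form
img = pixels.values.reshape(28, 28)
# Show the image
plt.title(label)
plt.imshow(img, cmap='gray')
plt.show()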

Next, let's analyze the data.

Data Analysis

As usual, start with the label distribution to check whether the digits are balanced.

# Count how many samples each digit has
train_data['label'].value_counts().sort_index().plot(kind='bar')

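The label distribution shows that the digits 0-9 are roughly evenly represented, with each class at around four thousand samples. 1 is the most frequent and 5 is slightly underrepresented, but there is no significant imbalance overall, so there is no need for class weights or extra oversampling.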
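The "roughly balanced" claim can be put into numbers. The sketch below uses per-digit totals obtained by adding the train and hold-out counts printed in the split output later in this article (they sum to 42,000); the imbalance ratio is just one simple way to summarize balance, chosen here for illustration.

```python
# Per-digit totals for the full training file: train count + hold-out count
# from the split output printed later in the article.
counts = {0: 4132, 1: 4684, 2: 4177, 3: 4351, 4: 4072,
          5: 3795, 6: 4137, 7: 4401, 8: 4063, 9: 4188}

total = sum(counts.values())
shares = {digit: n / total for digit, n in counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

for digit in sorted(shares):
    print(f"digit {digit}: {counts[digit]:5d}  ({shares[digit]:.1%})")
print(f"largest / smallest class: {imbalance_ratio:.2f}")
```

Every class holds between roughly 9% and 11% of the rows, which is consistent with skipping class weights here.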

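This confirms once more what Digit Recognizer is: a cleanly designed entry-level problem with complete data. Anyone new to image classification can skip complicated data cleaning and focus on model architecture and training strategy.

Next, we build and compare models.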

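🧪 Baseline RF | Random Forest

Since the data is already a tidy table of digit images, I started with the kind of model I know best: trees. I chose Random Forest and fed it the 28x28 pixels directly, flattened into a 784-dimensional vector. This approach ignores the spatial structure of the image and simply treats it as a high-dimensional table.

✂️ Splitting the Data

As with other problems, the Train dataset has to be split into a training set and a test set: one part to train on, one part to check how well the model generalizes. The split should also keep the proportion of each digit similar in both parts; otherwise the distribution can become skewed and bias the model toward certain digits.

Here I use 70% for training and 30% for testing, with a fixed random seed so the result is reproducible. Note that the call below is a plain random split (no stratify argument), so the class proportions are checked from the printed counts rather than enforced by the split.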

from sklearn.model_selection import train_test_split

# Split the dataset
x = train_data.drop('label', axis=1)
y = train_data['label']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)
# Check the result of the split
print(x_train.shape, x_test.shape)
print(y_train.value_counts().sort_index())
print(y_test.value_counts().sort_index())

(29400, 784) (12600, 784)
label
0    2932
1    3295
2    2883
3    2996
4    2850
5    2710
6    2881
7    3042
8    2854
9    2957
Name: count, dtype: int64
label
0    1200
1    1389
2    1294
3    1355
4    1222
5    1085
6    1256
7    1359
8    1209
9    1231
Name: count, dtype: int64

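After the split, train has 29,400 rows and test has 12,600. Each digit keeps roughly the same share in both parts, so training will not be pulled toward any one digit. With a stable split in place, we can move on to model training.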
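For a dataset this large and balanced, a plain random split keeps the proportions close on its own. When that is not guaranteed, scikit-learn's train_test_split accepts a stratify argument that enforces it. The following is a small sketch on synthetic, deliberately imbalanced labels (hypothetical data, not MNIST) to show the difference.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def class_share(labels: np.ndarray, cls: int) -> float:
    """Fraction of the labels that belong to class `cls`."""
    return float(np.mean(labels == cls))


# Synthetic data: 900 samples of class 0 and 100 of class 1 (a 90/10 mix).
X_demo = np.zeros((1000, 1))
y_demo = np.array([0] * 900 + [1] * 100)

# Plain random split: the minority share in each part is only approximate.
_, _, y_tr_plain, y_te_plain = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42)

# Stratified split: each part keeps the 90/10 mix (up to rounding).
_, _, y_tr_strat, y_te_strat = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo)

print("plain      train/test minority share:",
      class_share(y_tr_plain, 1), class_share(y_te_plain, 1))
print("stratified train/test minority share:",
      class_share(y_tr_strat, 1), class_share(y_te_strat, 1))
```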

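📈 Model Training and Results

With the training/test split in place, the data can go to a random forest for a baseline score. I set 250 trees with min_samples_split=20, and turn on OOB (out-of-bag) scoring as a quick second check on generalization.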

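from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Quick accuracy check with a random forest
rf_model = RandomForestClassifier(random_state=2, n_estimators=250, min_samples_split=20,
                                  oob_score=True, n_jobs=-1)
rf_model.fit(x_train, y_train)
rf_score = rf_model.score(x_test, y_test)
print('score=', rf_score)
# Predict
y_pred = rf_model.predict(x_test)
# Confusion matrix
print(confusion_matrix(y_test, y_pred))
# Classification report
print(classification_report(y_test, y_pred))
# OOB score
print('OOB score:', rf_model.oob_score_)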

score= 0.9569047619047619
[[1181    0    3    1    2    1    5    0    7    0]
 [   0 1366    9    5    1    2    4    0    1    1]
 [   4    3 1242    7   14    1    5    8    9    1]
 [   4    2   17 1261    2   21    4   16   19    9]
 [   2    0    2    0 1179    0    8    4    5   22]
 [   7    1    3   16    1 1022   15    1   10    9]
 [   9    3    1    0    4    6 1225    0    8    0]
 [   1    7   20    2   12    0    0 1283    4   30]
 [   2    5    4   15    4    7    7    2 1149   14]
 [   7    4    5   22   16    4    3   10   11 1149]]
              precision    recall  f1-score   support

           0       0.97      0.98      0.98      1200
           1       0.98      0.98      0.98      1389
           2       0.95      0.96      0.96      1294
           3       0.95      0.93      0.94      1355
           4       0.95      0.96      0.96      1222
           5       0.96      0.94      0.95      1085
           6       0.96      0.98      0.97      1256
           7       0.97      0.94      0.96      1359
           8       0.94      0.95      0.94      1209
           9       0.93      0.93      0.93      1231

    accuracy                           0.96     12600
   macro avg       0.96      0.96      0.96     12600
weighted avg       0.96      0.96      0.96     12600

OOB score: 0.9570408163265306

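On the test set the random forest reaches 95.7% accuracy (0.9569), and the OOB score is also about 95.7% (0.9570). The two are very close, which suggests the model is not noticeably overfitting.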
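The OOB score works because each tree is trained on a bootstrap sample and is then evaluated only on the rows it never saw. A rough simulation of how many rows that is, assuming the default behaviour of drawing as many rows as the training set has, with replacement (the number of simulated draws is an arbitrary choice for this sketch):

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 29_400   # size of the training split used above
n_draws = 50         # hypothetical number of simulated bootstrap samples

oob_fractions = []
for _ in range(n_draws):
    # One bootstrap sample: draw n_samples row indices with replacement.
    drawn = rng.integers(0, n_samples, size=n_samples)
    in_bag = np.zeros(n_samples, dtype=bool)
    in_bag[drawn] = True
    # Rows never drawn are "out of bag" for this tree.
    oob_fractions.append(1.0 - in_bag.mean())

mean_oob = float(np.mean(oob_fractions))
theory = (1 - 1 / n_samples) ** n_samples  # tends to exp(-1) for large n

print(f"simulated out-of-bag fraction: {mean_oob:.4f}")
print(f"(1 - 1/n)^n:                   {theory:.4f}")
```

Roughly a third of the rows are unseen by any given tree, which is why the OOB estimate behaves like a built-in validation score and lands so close to the hold-out accuracy.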

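A closer look at the classification report and the confusion matrix shows:

Per-class precision and recall all stay between 93% and 98%; no digit is particularly weak.
In the confusion matrix (rows are true digits, columns are predictions), the most common errors are:
- 3 and 5: some strokes are easily mistaken for each other (21 threes predicted as 5, 16 fives predicted as 3).
- 4 and 9: when the strokes are slanted or incomplete, the model misjudges (22 fours predicted as 9, 16 nines predicted as 4).
- 7 and 9, 8 and 3: some crossover as well (30 sevens predicted as 9; 15 eights predicted as 3 and 19 threes predicted as 8).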
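Reading confusions off a 10x10 matrix by eye is error-prone, so here is a small sketch that ranks digit pairs by their combined errors. The matrix is copied from the random forest output above; summing both directions of each pair is a choice made for this illustration.

```python
import numpy as np

# Confusion matrix from the random forest output above
# (rows = true digit, columns = predicted digit).
cm = np.array([
    [1181,    0,    3,    1,    2,    1,    5,    0,    7,    0],
    [   0, 1366,    9,    5,    1,    2,    4,    0,    1,    1],
    [   4,    3, 1242,    7,   14,    1,    5,    8,    9,    1],
    [   4,    2,   17, 1261,    2,   21,    4,   16,   19,    9],
    [   2,    0,    2,    0, 1179,    0,    8,    4,    5,   22],
    [   7,    1,    3,   16,    1, 1022,   15,    1,   10,    9],
    [   9,    3,    1,    0,    4,    6, 1225,    0,    8,    0],
    [   1,    7,   20,    2,   12,    0,    0, 1283,    4,   30],
    [   2,    5,    4,   15,    4,    7,    7,    2, 1149,   14],
    [   7,    4,    5,   22,   16,    4,    3,   10,   11, 1149],
])

# Accuracy is the diagonal (correct predictions) over all predictions.
accuracy = np.trace(cm) / cm.sum()

# Add both directions of each confusion (i read as j, and j read as i).
sym = cm + cm.T
pairs = [(int(sym[i, j]), i, j) for i in range(10) for j in range(i + 1, 10)]
pairs.sort(reverse=True)

print(f"accuracy from the matrix: {accuracy:.4f}")
for n_errors, i, j in pairs[:5]:
    print(f"{i} <-> {j}: {n_errors} errors")
```

The top of the ranking is 7/9, 4/9, and 3/5, in line with the pairs called out in the list above.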

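Overall, the random forest gives a solid, stable baseline. It shows that the data is clean and the digits are evenly distributed, so a traditional method alone already scores well.

Still, this approach flattens the image into 784 independent values and has no notion of local structure. To push accuracy further, the model has to actually "see" the shapes and combinations of strokes, which is where the convolutional neural network (CNN) comes in.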
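A tiny synthetic example may make the "no notion of local structure" point more tangible. It draws one vertical stroke, shifts it one pixel to the right, and compares the two flattened vectors. The stroke is made up for illustration and is not taken from the dataset.

```python
import numpy as np

# A synthetic 28x28 "stroke": one vertical line, 255 = ink, 0 = background.
img = np.zeros((28, 28), dtype=np.uint8)
img[4:24, 13] = 255

# The same stroke moved one pixel to the right.
shifted = np.roll(img, shift=1, axis=1)

flat_a = img.reshape(-1)
flat_b = shifted.reshape(-1)

# Pixel columns (features) that are "on" in each flattened vector.
on_a = set(np.flatnonzero(flat_a).tolist())
on_b = set(np.flatnonzero(flat_b).tolist())

print("active features in original:", len(on_a))
print("active features in shifted: ", len(on_b))
print("features active in both:    ", len(on_a & on_b))
```

To a model that treats every pixel as an unrelated column, these two inputs share no active feature even though a person sees the same stroke. A convolution applies the same filter at every position, so it can respond to the stroke wherever it appears.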

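🧬 Baseline RF1.5 | Fine-tuning Random Forest with a Genetic Algorithm

The random forest run above already scored 0.957, which made me curious how far a tree model alone could go, so I used a genetic algorithm to tune its hyperparameters.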

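import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# ===================== GA building blocks =====================
# Randomly generate one hyperparameter combination (an individual)
def random_individual():
    return {
        'n_estimators': random.randint(50, 300),   # number of trees
        'max_depth': random.randint(5, 30),        # maximum depth
        'min_samples_split': random.randint(2, 20) # minimum samples needed to split a node
    }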

# Mutation: resample each parameter with a given probability
def mutate(individual, mutation_rate=0.1):
    if random.random() < mutation_rate:
        individual['n_estimators'] = random.randint(50, 300)
    if random.random() < mutation_rate:
        individual['max_depth'] = random.randint(5, 30)
    if random.random() < mutation_rate:
        individual['min_samples_split'] = random.randint(2, 20)
    return individual

# Crossover: for each parameter, take the value from one of the two parents at random
def crossover(p1, p2):
    return {
        'n_estimators': random.choice([p1['n_estimators'], p2['n_estimators']]),
        'max_depth': random.choice([p1['max_depth'], p2['max_depth']]),
        'min_samples_split': random.choice([p1['min_samples_split'], p2['min_samples_split']])
    }

# Roulette-wheel selection: sample parents in proportion to fitness (validation accuracy here)
def roulette_wheel_selection(scored_population, num_parents):
    # scored_population is [(individual, val_acc), ...]
    total = sum(score for _, score in scored_population)
    inds = [ind for ind, _ in scored_population]
    if total <= 0:
        # If every score is 0, fall back to taking the first few individuals
        return inds[:num_parents]
    probs = [score / total for _, score in scored_population]
    return random.choices(inds, weights=probs, k=num_parents)

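# Fitness function: returns only the validation-set accuracy
# Note: OOB is not returned here, to avoid returning a tuple and causing type-comparison errors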
def fitness(individual, x_train, y_train, x_valid, y_valid):
    model = RandomForestClassifier(
        n_estimators=individual['n_estimators'],
        max_depth=individual['max_depth'],
        min_samples_split=individual['min_samples_split'],
        random_state=42,
        n_jobs=-1,
        oob_score=False,   # validation accuracy is enough while tuning; OOB is read after the final refit
        bootstrap=True
    )
    model.fit(x_train, y_train)
    y_pred = model.predict(x_valid)
    return accuracy_score(y_valid, y_pred)

# GA main loop
def genetic_algorithm_rf(x_train, y_train, x_valid, y_valid,
                         population_size, generations, mutation_rate):
    population = [random_individual() for _ in range(population_size)]
    best_individual = None
    best_score = -1.0
    history = []

    for gen in range(generations):
        print(f"\n🧬 Generation {gen+1}")

        # Evaluate the whole population, giving (individual, validation score)
        scored_population = [(ind, fitness(ind, x_train, y_train, x_valid, y_valid))
                             for ind in population]
        scored_population.sort(key=lambda x: x[1], reverse=True)

        # Best of this generation
        cur_best_ind, cur_best_score = scored_population[0]
        history.append(cur_best_score)
        if cur_best_score > best_score:
            best_score = cur_best_score
            best_individual = cur_best_ind

        print(f"Best Val-Acc: {cur_best_score:.4f} | Params: {cur_best_ind}")

        # Elitism (keep this generation's best individual)
        elites = [cur_best_ind]

        # Parent selection (roulette wheel)
        num_offspring = population_size - len(elites)
        parents = roulette_wheel_selection(scored_population, num_offspring)

        # Produce offspring (crossover + mutation)
        offspring = [mutate(crossover(random.choice(parents), random.choice(parents)),
                            mutation_rate)
                     for _ in range(num_offspring)]

        # Next-generation population
        population = elites + offspring

    return best_individual, best_score

# ===================== Usage =====================
# Never tune on the final x_test/y_test; carve a validation set out of the training set instead
x_tr, x_val, y_tr, y_val = train_test_split(
    x_train, y_train, test_size=0.2, random_state=42, stratify=y_train
)

best_params, best_val_acc = genetic_algorithm_rf(
    x_tr, y_tr, x_val, y_val,
    population_size=20,
    generations=20,
    mutation_rate=0.3
)

print("\n🎯 Best parameters:", best_params)
print("🎯 Best validation accuracy:", best_val_acc)

# Refit with the best parameters on the full training split; only now read OOB and the final test score
rf_best = RandomForestClassifier(
    **best_params,
    random_state=42,
    n_jobs=-1,
    oob_score=True,   # OOB is switched on only now
    bootstrap=True
)
rf_best.fit(x_train, y_train)
print("🎯 OOB score (refit):", rf_best.oob_score_)
print("🎯 Test accuracy:", rf_best.score(x_test, y_test))

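Because the data is fairly simple and CNN will be the final model anyway, I wrote a simple genetic algorithm by hand, mostly to satisfy my curiosity.

There is in fact a package called DEAP that offers many more built-in, configurable features (multiple selection/crossover/mutation operators, statistics logging, parallelization, NSGA-II multi-objective optimization, Hall of Fame, checkpointing), so you do not have to write these functions yourself.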

🧬 Generation 1
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 2
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 3
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 4
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 5
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 6
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 7
Best Val-Acc: 0.9633 | Params: {'n_estimators': 161, 'max_depth': 30, 'min_samples_split': 5}
🧬 Generation 8
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 9
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 10
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 11
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 12
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 13
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 14
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 15
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 16
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 17
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 18
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 19
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🧬 Generation 20
Best Val-Acc: 0.9648 | Params: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}

🎯 Best parameters: {'n_estimators': 145, 'max_depth': 20, 'min_samples_split': 2}
🎯 Best validation accuracy: 0.9647959183673469
🎯 OOB score (refit): 0.9589115646258504
🎯 Test accuracy: 0.9615079365079365

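The optimization result shows that the GA settled on 145 trees, depth 20, and a minimum split size of 2. Compared with my initial intuition of "100 trees of medium depth", this leans toward deep trees with fine splits, which fits data with many pixels where the decision boundary needs to be detailed. The hold-out accuracy rises by about 0.005 over the baseline (0.9569 to 0.9615), reaching roughly 0.96. Let's put it on the leaderboard.

The gain is limited, which suggests that random forest is close to its ceiling on this image task. The experiment was never about a big jump in score, but it did show me clearly where RF's limits are on this kind of data.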
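One possible contributor to the long plateaus in the GA log is worth a quick look: when every individual scores around 0.96, roulette-wheel selection on raw accuracy gives almost uniform probabilities. The scores below are hypothetical values in the same range as the log, and the baseline-shifted variant is a commonly used alternative shown for comparison, not what the code above does.

```python
# Hypothetical validation accuracies for a small population; illustrative
# values in the same range as the GA log above.
scores = [0.9648, 0.9633, 0.9600, 0.9550, 0.9500]

# Roulette wheel on raw accuracy, as in roulette_wheel_selection above.
total = sum(scores)
roulette_probs = [s / total for s in scores]

# Alternative (an assumption for this sketch): subtract a baseline just below
# the worst score before normalizing, so differences between individuals
# translate into clearly different selection probabilities.
baseline = min(scores) - 0.001
shifted = [s - baseline for s in scores]
shifted_total = sum(shifted)
shifted_probs = [s / shifted_total for s in shifted]

for s, p_raw, p_shift in zip(scores, roulette_probs, shifted_probs):
    print(f"score {s:.4f}: raw {p_raw:.3f}   shifted {p_shift:.3f}")
```

With raw accuracy, the best and worst individuals differ by well under one percentage point of selection probability, so selection is close to random and most of the search is driven by elitism and mutation. Whether a different scheme would have found a better forest here is untested.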

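A GA-tuned RF does well among tree models, but image tasks are by nature a better fit for CNNs. Next we turn to convolutional neural networks and see whether deep learning can push the score further.

🧠 CNN | A First Look at Convolutional Neural Networks

The random forest improved slightly after GA tuning, but this is image data, and a convolutional neural network (CNN) is the more natural choice.

🔧 Data Preparation

Before moving to a CNN, the tabular data has to be converted into the format a CNN expects:

Reshape + Normalize: restore each 784-dimensional pixel vector to a (28, 28, 1) grayscale image and scale pixel values to [0, 1] so that training is more stable.
One-hot Encoding: convert each label from a single digit to a length-10 one-hot vector, matching the softmax output layer.
Train/validation split: split the data again into training and validation sets (8:2) to see how the model does on unseen data.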

# Convert x and test_data to CNN format (reshape + normalize)
X = x.values.reshape(-1, 28, 28, 1) / 255.0  # training data
X_test = test_data.values.reshape(-1, 28, 28, 1) / 255.0  # test data

# Convert y to one-hot encoding (classification task)
from tensorflow.keras.utils import to_categorical
Y = to_categorical(y, num_classes=10)

# Train / validation split
from sklearn.model_selection import train_test_split
X_train, X_valid, Y_train, Y_valid = train_test_split(X, Y, test_size=0.2, random_state=42)
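As a sanity check on what these transformations produce, here is a NumPy-only sketch that mimics the reshape, scaling, and one-hot steps on stand-in data. The random pixels are hypothetical; the five labels are the ones shown in the head() output earlier. It does not call Keras, so np.eye is used in place of to_categorical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data with the same layout as the CSV: 5 rows of 784 pixel values.
pixels = rng.integers(0, 256, size=(5, 784))
labels = np.array([1, 0, 1, 4, 0])  # first five labels from head()

# Reshape to (batch, height, width, channels) and scale to [0, 1].
X_demo = pixels.reshape(-1, 28, 28, 1) / 255.0

# One-hot encode: row i of the identity matrix is the vector for class i.
Y_demo = np.eye(10)[labels]

print(X_demo.shape, float(X_demo.min()), float(X_demo.max()))
print(Y_demo.shape)
print(Y_demo[3])  # label 4 -> a single 1 in position 4
```

The trailing 1 in the shape is the channel axis that Conv2D expects for grayscale input.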

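🏗️ Model Design and Training Setup

With preprocessing done, it is time to design the CNN. I break this into three parts:

Data Augmentation
To help the model generalize across handwriting styles, I use ImageDataGenerator to apply random rotation, zoom, and shifts. The model gets to see more "deformed" handwritten digits, which reduces overfitting.
CNN architecture
- Three convolutional blocks (Conv + BN + Conv + BN + Pool), so the network can extract features layer by layer, from strokes to local structure to overall outline.
- Batch Normalization after every convolutional layer, to stabilize gradients and speed up convergence.
- After Flatten, one 512-unit Dense layer + Dropout, then a softmax output.
Training strategy (Callbacks)
- Optimizer: Adam, with adaptive learning rates; relatively stable for a beginner task.
- Loss: Categorical Crossentropy, paired with the one-hot labels.
- Callbacks: EarlyStopping (stop when the validation set shows no improvement for 5 epochs, and restore the best weights); ReduceLROnPlateau (automatically lower the learning rate when training stalls); ModelCheckpoint (save the best model to a file).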

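from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Build the data augmenter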
datagen = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1
)
datagen.fit(X_train)

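# Build the CNN model (original structure + BN)
cnn_model = Sequential([
    # Block 1
    # Note: the first Conv2D of each block has no activation argument, so it stays linear before BN
    Conv2D(32, (3, 3), padding='same', input_shape=(28, 28, 1)),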
    BatchNormalization(),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D(),

    # Block 2
    Conv2D(64, (3, 3), padding='same'),
    BatchNormalization(),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D(),

    # Block 3
    Conv2D(128, (3, 3), padding='same'),
    BatchNormalization(),
    Conv2D(128, (3, 3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D(),

    # Dense layers
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
cnn_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Set up callbacks
callbacks = [
    EarlyStopping(patience=5, restore_best_weights=True),
    ReduceLROnPlateau(patience=2, factor=0.5, verbose=1),
    ModelCheckpoint('best_cnn_model.h5', save_best_only=True, verbose=1)
]
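To get a feel for the size of the network defined above, the shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults visible in the code (3x3 kernels with 'same' padding, 2x2 max pooling with 'valid' padding, biases enabled) and counts four values per channel for each BatchNormalization layer. The helper functions are hypothetical, and the totals should agree with cnn_model.summary() if those assumptions hold.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def bn_params(channels: int) -> int:
    """gamma, beta, moving mean, moving variance: 4 values per channel."""
    return 4 * channels


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


side, channels, total = 28, 1, 0
for filters in (32, 64, 128):
    for _ in range(2):  # two Conv + BN pairs per block
        total += conv_params(3, channels, filters) + bn_params(filters)
        channels = filters
    side //= 2          # 2x2 max pooling halves the side (rounding down)

flat = side * side * channels  # size after Flatten
total += dense_params(flat, 512) + dense_params(512, 10)

print("feature map before Flatten:", (side, side, channels))
print("flattened size:", flat)
print("total parameters:", total)
```

Most of the parameters sit in the 512-unit Dense layer rather than in the convolutions, which is one reason Dropout is placed right after it.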

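🚀 Model Training

With everything in place, training can begin. I train on the augmented training set with batch_size=64 and a maximum of 30 epochs. With EarlyStopping and ReduceLROnPlateau in place, training usually does not run the full 30 epochs; it stops automatically once the validation set stops improving and restores the best weights.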

import warnings

# Start training
warnings.filterwarnings("ignore", category=UserWarning)

history = cnn_model.fit(
    datagen.flow(X_train, Y_train, batch_size=64),
    validation_data=(X_valid, Y_valid),
    epochs=30,
    callbacks=callbacks,
    verbose=1
)

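Training: datagen.flow applies random rotation/shift/zoom to each batch, so the digit images look slightly different every time the model sees them.
Validation: the validation set is not augmented, so what we observe is the true generalization ability.
Expected result: accuracy should be clearly higher than Random Forest, typically around 0.99; the loss curve should keep decreasing as the learning rate is adjusted.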


# (earlier, repetitive epochs omitted)
Epoch 22: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
Epoch 22: val_loss did not improve from 0.01422
525/525 ━━━━━━━━━━━━━━━━━━━━ 14s 27ms/step - accuracy: 0.9963 - loss: 0.0116 - val_accuracy: 0.9952 - val_loss: 0.0155 - learning_rate: 6.2500e-05
Epoch 23/30
524/525 ━━━━━━━━━━━━━━━━━━━━ 0s 26ms/step - accuracy: 0.9974 - loss: 0.0098
Epoch 23: val_loss did not improve from 0.01422
525/525 ━━━━━━━━━━━━━━━━━━━━ 15s 28ms/step - accuracy: 0.9974 - loss: 0.0098 - val_accuracy: 0.9944 - val_loss: 0.0162 - learning_rate: 3.1250e-05
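Two numbers in this log can be reproduced with simple arithmetic: the 525 steps per epoch and the learning rates. The sketch assumes the starting learning rate is 1e-3, which is the Keras default when the optimizer is given as the string 'adam'; everything else comes from the code above.

```python
import math

n_train = int(42_000 * 0.8)  # rows left for training after the 8:2 split
batch_size = 64
steps_per_epoch = math.ceil(n_train / batch_size)

initial_lr = 1e-3  # assumed: Keras' default learning rate for 'adam'
factor = 0.5       # from ReduceLROnPlateau(factor=0.5)

# Learning rate after k reductions by ReduceLROnPlateau.
schedule = [initial_lr * factor ** k for k in range(6)]

print("steps per epoch:", steps_per_epoch)
for k, lr in enumerate(schedule):
    print(f"after {k} reductions: {lr:.4e}")
```

Under that assumption, the 6.2500e-05 and 3.1250e-05 values in the log correspond to the fourth and fifth reductions, so the learning rate had been halved several times before EarlyStopping ended the run.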

Training stopped at epoch 23 because of EarlyStopping. Let's plot the accuracy and loss from the run.

📉 Observing the Training Process

# Plot training/validation accuracy
plt.figure(figsize=(12,4))
plt.subplot(1,2,1)
plt.plot(history.history['accuracy'], label='Train Acc')
plt.plot(history.history['val_accuracy'], label='Valid Acc')
plt.title('Accuracy Curve')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plot training/validation loss
plt.subplot(1,2,2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Valid Loss')
plt.title('Loss Curve')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()
plt.show()

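The figure shows training/validation accuracy and loss over the epochs that actually ran (training stopped at epoch 23 of a maximum of 30):

Accuracy curve (left):
- Training and validation accuracy climb quickly above 0.98 within the first few epochs, then settle at around 0.99 or slightly above (the last logged epochs show about 0.997 training and 0.995 validation accuracy).
- The two curves nearly coincide in the second half, indicating no obvious overfitting and good generalization.
Loss curve (right):
- Training and validation loss fall together and level off after about 5 epochs.
- Validation loss fluctuates slightly now and then, but the overall trend is consistent, so learning is stable.
- ReduceLROnPlateau did kick in later on (the log shows the learning rate down to 3.125e-05 by epoch 23), helping the curves converge.

Overall accuracy is around 99%, but a few digits are still misclassified. To see what kinds of errors remain, I plot a confusion matrix of the validation-set predictions: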

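import numpy as np
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Predict on the validation set
y_valid_true = np.argmax(Y_valid, axis=1)
y_valid_pred = np.argmax(cnn_model.predict(X_valid), axis=1)

# Confusion matrix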
cm = confusion_matrix(y_valid_true, y_valid_pred)

plt.figure(figsize=(8,6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel("Predicted")
plt.ylabel("True")
plt.title("Confusion Matrix on Validation Set")
plt.show()

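With the confusion matrix plotted, we can observe:

Overall performance
Nearly every digit is classified correctly; the diagonal is sharp and errors are very rare.
Main error patterns
- 5: of 703 samples, 3 were misclassified as other digits. When the tail of a handwritten 5 curls round, it can look close to a 6.
- 8 with 2/3: a few 8s were misclassified as 2 or 3, possibly because the strokes were not closed, so the model read them as 2 or 3.
- 4 and 9: a few mutual confusions as well, consistent with mistakes humans commonly make.

Now we can submit the predictions to Kaggle and see the actual score.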
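The article does not show the submission step itself. A minimal sketch of one way to build the file follows, assuming the two-column ImageId/Label format of the competition's sample submission, with ImageId starting at 1. The function name, the fake probabilities, and the output file name are all hypothetical.

```python
import numpy as np
import pandas as pd


def make_submission(pred_probs: np.ndarray) -> pd.DataFrame:
    """Turn an (n_samples, 10) array of class probabilities into a submission table."""
    labels = np.argmax(pred_probs, axis=1)  # most probable digit per row
    return pd.DataFrame({
        "ImageId": np.arange(1, len(labels) + 1),  # ids start at 1, not 0
        "Label": labels,
    })


# Stand-in for cnn_model.predict(X_test): three fake probability rows.
fake_probs = np.array([
    [0.1, 0.7, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
])

submission = make_submission(fake_probs)
print(submission)
# submission.to_csv("submission.csv", index=False)  # hypothetical file name
```

In the real notebook, fake_probs would be replaced by the model's predictions on X_test.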

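The result is in line with the training scores: 0.99371, so this training run turned out well.

🏁 Conclusion

This Digit Recognizer attempt started from a Random Forest baseline and went through genetic-algorithm tuning. The score improved slightly, but it remained limited by the fact that the image is flattened into a vector.

Next came the CNN baseline. With a basic three-block convolutional structure plus Batch Normalization, data augmentation, and regularization, the model quickly reached about 99% accuracy on the validation set and scored 0.99371 on the Kaggle submission, closely matching the training results.

A few observations:

Tree models are good for quick experiments, but they make limited use of image information.
A CNN naturally captures stroke and shape features and clearly outperforms the traditional approach.
Most errors come from digits that people also find easy to confuse (such as 4 and 9, 5 and 6, 8 and 2/3), suggesting that the bottleneck is the variety of handwriting rather than a shortcoming of the algorithm.

In summary: this experiment demonstrated the advantage of convolutional networks for image classification and lays the groundwork for more advanced exploration (deeper architectures, residual networks, stronger data augmentation).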

Author: 夕月之下
