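At first, encoding the playing cards went smoothly: I converted rank and suit into numeric values so that a machine learning model could train on them. But when I asked the language model to design the training model for me, I found it had left out several features I considered important, such as each player's stack size and the full card information (it used only the number of cards rather than their actual rank and suit). Most importantly, it completely ignored previous actions, a key feature, even though I repeatedly asked it to include them. In the end I only got there by breaking the work into very small steps and asking for them one at a time.
This reminded me of an important lesson from earlier work with language models: rather than digging into the details of every piece of code it provides, first make sure the program runs, even if it is not yet in an ideal state. This is very different from my past development habits. I used to try to fully understand the logic of every function and every action, but that becomes an obstacle when collaborating with a language model. Its logic and coding style often differ from what we are used to, so spending a lot of time on deep understanding can be wasted effort, especially when that direction turns out to be a dead end.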
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
seq_types_input (InputLayer)    [(None, 20)]         0
__________________________________________________________________________________________________
seq_amounts_input (InputLayer)  [(None, 20, 1)]      0
__________________________________________________________________________________________________
static_input (InputLayer)       [(None, 384)]        0
__________________________________________________________________________________________________
action_type_embedding (Embeddin (None, 20, 16)       416         seq_types_input[0][0]
__________________________________________________________________________________________________
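The one non-zero parameter count visible in this summary can be checked by hand. An embedding layer stores one vector per vocabulary index, so its parameter count is the vocabulary size times the embedding dimension. The sketch below assumes 25 action types plus one padding index (the 25 comes from a comment in the script, not from the data) and reads the dimension 16 from the output shape; `embedding_param_count` is a hypothetical helper, not part of the original code.

```python
def embedding_param_count(vocab_size, embedding_dim):
    """An embedding layer has no bias: one weight vector per vocabulary index."""
    return vocab_size * embedding_dim

num_action_types = 25   # assumption taken from the script's comment
embedding_dim = 16      # last axis of the (None, 20, 16) output shape
params = embedding_param_count(num_action_types + 1, embedding_dim)  # +1 for the padding index
print(params)  # 416
```

The result matches the 416 reported for `action_type_embedding`, which suggests the run used 26 embedding rows of width 16.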
Epoch 20/20
399180/399180 [==============================] - 136s 342us/sample - loss: 0.6324 - acc: 0.6971 - val_loss: 0.7220 - val_acc: 0.6883
99795/99795 [==============================] - 8s 76us/sample - loss: 0.7220 - acc: 0.6883
Validation Loss: 0.7220 | Validation Accuracy: 0.6883
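A validation accuracy of 0.6883 is hard to judge on its own, because action labels may be imbalanced. One rough sanity check is to compare it with the accuracy of always predicting the most frequent class. The helper below is a hypothetical sketch and was not part of the original run; the labels in the example are made up.

```python
import numpy as np

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    labels = np.asarray(labels, dtype=np.int64)
    counts = np.bincount(labels)  # occurrences of each non-negative integer label
    return counts.max() / labels.size

toy_labels = [0, 0, 0, 1, 2, 2]  # made-up labels, for illustration only
print(majority_baseline_accuracy(toy_labels))  # 0.5
```

On the real data this would be called on the validation labels (`y_val` in the script); the model is only informative to the extent that it beats that number.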
#!/usr/bin/env python3
import json
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (
    Input, Dense, LSTM, Embedding, Concatenate, Dropout
)
from tensorflow.keras.optimizers import Adam
# =============================================================================
# 1. Helper Functions for Feature Encoding
# =============================================================================
def one_hot_round(round_no):
    """One-hot encode round number (1=preflop, 2=flop, 3=turn, 4=river)."""
    vec = np.zeros(4)
    if 1 <= round_no <= 4:
        vec[round_no - 1] = 1
    return vec

def one_hot_position(pos, max_players=10):
    """One-hot encode a player's position (an integer in [0, max_players-1])."""
    vec = np.zeros(max_players)
    if 0 <= pos < max_players:  # also rejects negatives, which would index from the end
        vec[pos] = 1
    return vec

def card_to_onehot(card):
    """
    Convert a card string (e.g., 'S4', 'HA', 'C10') to a 52-dim one-hot vector.
    If the card is hidden (e.g., starts with 'X') it returns an all-zeros vector.
    """
    ranks = ['A','2','3','4','5','6','7','8','9','10','J','Q','K']
    suits = ['S','H','D','C']
    onehot = np.zeros(52)
    if not card or card.upper().startswith("X"):
        return onehot
    suit = card[0]
    rank = card[1:]
    if suit in suits and rank in ranks:
        suit_index = suits.index(suit)
        rank_index = ranks.index(rank)
        index = suit_index * 13 + rank_index
        onehot[index] = 1
    return onehot

def encode_board_cards(board_cards, max_cards=5):
    """
    Encode the board cards as the concatenation of one-hot vectors (52 dims each).
    Pads with zeros if there are fewer than max_cards.
    """
    encoded = []
    for card in board_cards:
        encoded.append(card_to_onehot(card))
    while len(encoded) < max_cards:
        encoded.append(np.zeros(52))  # pad slots for cards not yet dealt
    return np.concatenate(encoded[:max_cards])

def encode_hole_cards(hole_cards):
    """
    Encode the player's (or actor's) hole cards (expected to be a list of 2 cards)
    as a concatenation of two 52-dim one-hot vectors.
    """
    encoded = []
    for card in hole_cards:
        encoded.append(card_to_onehot(card))
    while len(encoded) < 2:
        encoded.append(np.zeros(52))  # pad if fewer than 2 cards are known
    return np.concatenate(encoded[:2])
# =============================================================================
# 2. Process Each Snapshot into Model Inputs and Target
# =============================================================================
def process_snapshot(snapshot):
    """
    From a snapshot dictionary, create:
      - A vector of static features
      - A sequence of previous actions (each with an action type and amount)
      - The target (opponent's current action type)

    **Static features include:**
      - One-hot encoded round (4 dims)
      - Pot size (1 dim; scaled)
      - Blinds: small, big, ante (3 dims; scaled)
      - Actor stack (1 dim; scaled)
      - Actor position (one-hot, 10 dims)
      - Number of players remaining (1 dim; scaled)
      - Board cards (5 fixed cards × 52 dims = 260 dims)
      - Actor hole cards (2 cards × 52 dims = 104 dims)

    **Sequential features:**
      For each previous action (up to a fixed max length) we use:
      - action_type (integer; offset by +1 so that 0 is reserved for padding)
      - action_sum (float; scaled)
    """
    # --- Static features ---
    # 1. Round (from current action)
    round_no = int(snapshot["action"]["round"])
    round_vec = one_hot_round(round_no)
    # 2. Pot size (scale by 100)
    pot_size = np.array([float(snapshot["pot_size"]) / 100.0])
    # 3. Blinds and ante (scaled)
    blinds = snapshot.get("blinds", {})
    small_blind = float(blinds.get("small_blind", 0)) / 100.0
    big_blind = float(blinds.get("big_blind", 0)) / 100.0
    ante = float(blinds.get("ante", 0)) / 100.0
    blinds_vec = np.array([small_blind, big_blind, ante])
    # 4. Actor stack size (scale by 1000)
    actor_stack = np.array([float(snapshot.get("actor_stack_size", 0)) / 1000.0])
    # 5. Actor position (one-hot with dimension 10)
    actor_pos = int(snapshot.get("actor_position", 0))
    pos_vec = one_hot_position(actor_pos, max_players=10)
    # 6. Number of players remaining (scale by 10)
    players_remaining = np.array([float(snapshot.get("players_remaining", 0)) / 10.0])
    # 7. Board cards (5 fixed cards)
    board_vec = encode_board_cards(snapshot.get("board_cards", []), max_cards=5)
    # 8. Actor hole cards (2 cards)
    hole_cards_vec = encode_hole_cards(snapshot.get("actor_hole_cards", []))
    # Concatenate all static features:
    # Total dims: 4 + 1 + 3 + 1 + 10 + 1 + 260 + 104 = 384
    static_features = np.concatenate([
        round_vec, pot_size, blinds_vec, actor_stack, pos_vec,
        players_remaining, board_vec, hole_cards_vec
    ])

    # --- Sequential features ---
    # For each previous action, we take:
    #   - action_type (offset by +1 so that 0 is our pad value)
    #   - action_sum (scaled by 100)
    seq_actions = snapshot.get("previous_actions", [])
    seq_types = []
    seq_amounts = []
    for action in seq_actions:
        act_type = int(action.get("action_type", 0)) + 1  # reserve 0 for padding
        act_sum = float(action.get("action_sum", 0)) / 100.0
        seq_types.append(act_type)
        seq_amounts.append(act_sum)
    MAX_SEQ_LENGTH = 20  # maximum number of previous actions to consider
    # Truncate if too long (this keeps the first MAX_SEQ_LENGTH actions)
    seq_types = seq_types[:MAX_SEQ_LENGTH]
    seq_amounts = seq_amounts[:MAX_SEQ_LENGTH]
    # Pad sequences (pad type=0, which for action_type will be masked in the Embedding layer)
    while len(seq_types) < MAX_SEQ_LENGTH:
        seq_types.append(0)
        seq_amounts.append(0.0)
    seq_types = np.array(seq_types, dtype=np.int32)
    seq_amounts = np.array(seq_amounts, dtype=np.float32).reshape((MAX_SEQ_LENGTH, 1))

    # --- Target: Opponent's current action type (as integer) ---
    target = int(snapshot["action"].get("action_type", 0))
    return static_features, seq_types, seq_amounts, target
# =============================================================================
# 3. Load and Preprocess Data
# =============================================================================
def load_and_preprocess_data(json_filename):
    """
    Load snapshot logs from a JSON file and create training arrays.
    The JSON file is expected to be a list of snapshots.
    """
    with open(json_filename, 'r') as f:
        data = json.load(f)
    static_features_list = []
    seq_types_list = []
    seq_amounts_list = []
    targets = []
    for snapshot in data:
        static_feat, seq_types, seq_amounts, target = process_snapshot(snapshot)
        static_features_list.append(static_feat)
        seq_types_list.append(seq_types)
        seq_amounts_list.append(seq_amounts)
        targets.append(target)
    X_static = np.stack(static_features_list)   # shape: (N, 384)
    X_seq_types = np.stack(seq_types_list)      # shape: (N, MAX_SEQ_LENGTH)
    X_seq_amounts = np.stack(seq_amounts_list)  # shape: (N, MAX_SEQ_LENGTH, 1)
    y = np.array(targets, dtype=np.int32)       # shape: (N,)
    return X_static, X_seq_types, X_seq_amounts, y
# Change the filename below to your JSON file produced by your XML parser.
JSON_FILENAME = 'logs.json'
X_static, X_seq_types, X_seq_amounts, y = load_and_preprocess_data(JSON_FILENAME)
# (Optional) Check the shapes of your training arrays:
print("X_static shape:", X_static.shape)
print("X_seq_types shape:", X_seq_types.shape)
print("X_seq_amounts shape:", X_seq_amounts.shape)
print("y shape:", y.shape)
# Split into training and validation sets
(X_static_train, X_static_val,
 X_seq_types_train, X_seq_types_val,
 X_seq_amounts_train, X_seq_amounts_val,
 y_train, y_val) = train_test_split(
    X_static, X_seq_types, X_seq_amounts, y, test_size=0.2, random_state=42
)
# =============================================================================
# 4. Build the Keras Model
# =============================================================================
# Parameters for the sequential branch
MAX_SEQ_LENGTH = 20      # must match the value used in process_snapshot
# Adjust NUM_ACTION_TYPES based on your data (here we assume 25; update if needed)
NUM_ACTION_TYPES = 25    # consistent with the 416 embedding params in the summary: (25 + 1) * 16
EMBEDDING_DIM = 16       # matches the (None, 20, 16) embedding output shape in the summary
NUM_ACTION_CLASSES = 30  # number of distinct action types to predict (update as needed)

# -- Static Input Branch --
static_input = Input(shape=(384,), name='static_input')
x_static = Dense(128, activation='relu')(static_input)
x_static = Dense(64, activation='relu')(x_static)

# -- Sequential Input Branch --
# Input for action types (integers; shape = (MAX_SEQ_LENGTH,))
seq_types_input = Input(shape=(MAX_SEQ_LENGTH,), dtype='int32', name='seq_types_input')
# Input for action amounts (floats; shape = (MAX_SEQ_LENGTH, 1))
seq_amounts_input = Input(shape=(MAX_SEQ_LENGTH, 1), dtype='float32', name='seq_amounts_input')
# Process action types with an Embedding layer.
# (We use mask_zero=True so that padded 0 values are ignored.)
x_seq_types = Embedding(
    input_dim=NUM_ACTION_TYPES + 1,  # +1 to reserve index 0 for padding
    output_dim=EMBEDDING_DIM,
    mask_zero=True,
    name='action_type_embedding'
)(seq_types_input)
# Process the amounts with a simple dense layer (applied to each time step).
x_seq_amounts = Dense(8, activation='relu', name='amount_dense')(seq_amounts_input)
# Concatenate along the feature dimension: now each time step has (EMBEDDING_DIM + 8) features.
x_seq = Concatenate(name='seq_concat')([x_seq_types, x_seq_amounts])
# Process the concatenated sequence with an LSTM.
x_seq = LSTM(64, name='lstm_seq')(x_seq)

# -- Merge Both Branches --
x = Concatenate(name='merge')([x_static, x_seq])
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(NUM_ACTION_CLASSES, activation='softmax', name='output')(x)
model = Model(
    inputs=[static_input, seq_types_input, seq_amounts_input],
    outputs=output
)
# Integer targets with a softmax output call for sparse categorical cross-entropy.
# Assumption: the optimizer settings were not preserved in this listing, so Adam's defaults are used.
model.compile(
    optimizer=Adam(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
model.summary()
# =============================================================================
# 5. Train the Model
# =============================================================================
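history = model.fit(
    {'static_input': X_static_train,
     'seq_types_input': X_seq_types_train,
     'seq_amounts_input': X_seq_amounts_train},
    y_train,
    validation_data=(
        {'static_input': X_static_val,
         'seq_types_input': X_seq_types_val,
         'seq_amounts_input': X_seq_amounts_val},
        y_val
    ),
    # "Epoch 20/20" in the log gives the epoch count; the batch size was not preserved, so the Keras default applies.
    epochs=20
)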
# =============================================================================
# 6. Evaluate / Save the Model
# =============================================================================
loss, acc = model.evaluate(
    {'static_input': X_static_val,
     'seq_types_input': X_seq_types_val,
     'seq_amounts_input': X_seq_amounts_val},
    y_val
)
print(f"Validation Loss: {loss:.4f} | Validation Accuracy: {acc:.4f}")
# Optionally, save your model (the filename below is a placeholder, not the original one):
# model.save('poker_action_model.h5')
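As a quick self-check of the card encoding, the index formula used in `card_to_onehot` (`suit_index * 13 + rank_index`) can be exercised on its own, together with the arithmetic behind the 384-dim static vector. The standalone `card_index` helper below is hypothetical; it repeats the script's rank and suit ordering and returns `None` for hidden or unrecognized cards.

```python
RANKS = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
SUITS = ['S', 'H', 'D', 'C']

def card_index(card):
    """Return the 0-51 slot for a card like 'S4' or 'C10', or None if hidden/unknown."""
    if not card or card.upper().startswith('X'):
        return None
    suit, rank = card[0], card[1:]
    if suit in SUITS and rank in RANKS:
        return SUITS.index(suit) * 13 + RANKS.index(rank)
    return None

print(card_index('S4'))   # 3
print(card_index('HA'))   # 13
print(card_index('C10'))  # 48
print(card_index('XX'))   # None

# Static feature width: round + pot + blinds + stack + position + players + board + hole cards
static_dims = 4 + 1 + 3 + 1 + 10 + 1 + 5 * 52 + 2 * 52
print(static_dims)  # 384
```

Each suit occupies a contiguous block of 13 slots, so two cards collide only if they share both suit and rank.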