Restarting the Poker Bot Journey -12: When Simplification Opens Up More Possibilities

Updated 2025/02/21 | Published 2025/02/21 | Reading time: about 26 minutes

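Yesterday, while improving the extraction process for my poker data, I had a few interesting realizations. I had originally wanted to account for every detail, and ended up stuck with data that was far too complex. This was especially true for raise actions: I used to think every distinct bet size should be treated as its own data point for training. Looking back, that over-precise approach actually limited what the model could learn.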

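This reminded me of my experience analyzing player pools with Hand2Note. Mathematically and in theory, a 30% bet and a 50% bet are quite different, but in real play the ranges players show up with do not differ much between the two. So I decided to reclassify bet sizes into a few main categories: small, medium, large, and so on. This makes the data easier to handle and, more importantly, gives each category enough samples for meaningful analysis.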
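A minimal sketch of this kind of bucketing, assuming bet sizes are expressed as a fraction of the pot. The function name `bucket_bet_size` and the default cutoffs are illustrative choices for this example, not values taken from Hand2Note:

```python
def bucket_bet_size(bet, pot, small_max=0.35, mid_max=0.70):
    """Map a bet to a coarse size label based on its fraction of the pot.

    The cutoffs are hypothetical defaults; tune them to your own data.
    """
    if pot <= 0:
        return "unknown"
    ratio = bet / pot
    if ratio < small_max:
        return "small"
    if ratio < mid_max:
        return "mid"
    return "big"


# Hypothetical (bet, pot) pairs: 30% and 50% pot bets land in
# neighbouring buckets, while 45% and 50% share the same one.
sample_bets = [(3, 10), (4.5, 10), (5, 10), (9, 10)]
labels = [bucket_bet_size(bet, pot) for bet, pot in sample_bets]
print(labels)
```

Collapsing a continuous size into a handful of labels trades precision for sample count per label, which is the point of the simplification described above.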

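While cleaning up the data I also found some problems I had overlooked. For example, the pot size was calculated incorrectly: I had been using the amount at the start of the hand and ignoring the money already put into the pot. Player IDs were another issue: using player names directly made the data inconsistent, and switching to simple numeric IDs improved consistency considerably.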
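Both fixes can be illustrated with a small sketch. It assumes a hand is a list of `(player_name, amount)` pairs in the order they occurred; the helper name `annotate_actions` and the "ID by first appearance" rule are assumptions made for this example only:

```python
def annotate_actions(actions):
    """Return (player_id, amount, pot_before_action) for each action.

    `actions` is a list of (player_name, amount) pairs in chronological order.
    """
    player_ids = {}   # name -> small integer ID
    pot = 0.0         # running pot, including everything already invested
    annotated = []
    for name, amount in actions:
        # Assign IDs in order of first appearance (an assumption of this sketch).
        player_id = player_ids.setdefault(name, len(player_ids))
        # Record the pot as it stood BEFORE this action, then add the chips.
        annotated.append((player_id, amount, pot))
        pot += amount
    return annotated


# Hypothetical hand: blinds of 0.5 and 1.0, then a raise and a call.
hand = [("Alice", 0.5), ("Bob", 1.0), ("Alice", 2.5), ("Bob", 2.0)]
annotated = annotate_actions(hand)
print(annotated)
```

Recording the pot before the action is what lets a later step express the bet as a fraction of the pot the player was actually facing.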

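I had originally planned to use Random Forest classification for this project, since it seemed like it would make the whole pipeline simpler and more direct. But while reviewing the training data, I realized that previous actions may matter far more in poker than I had expected. That got me wondering whether I need a model that handles sequential data better.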

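After discussing it with a language model, I decided to try splitting the features into two branches: one handling the global game state, and another dedicated to the action sequence. The new design uses an LSTM network to process the sequence data. I do not fully understand how it works yet, but at least the model can start training, which is always a good starting point.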
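The post does not show the model itself, so the following is only a sketch of the data preparation such a two-branch design would need, not the actual network. It assumes snapshots shaped like the ones produced by the extraction script in this post (`pot_size`, `actor_stack_size`, `players_remaining`, `previous_actions`); the vocabulary `ACTION_VOCAB`, the sequence length, and the function name `split_features` are hypothetical:

```python
# Hypothetical label-to-integer vocabulary; 0 is reserved for padding.
ACTION_VOCAB = {
    "<pad>": 0, "<unk>": 1, "fold": 2, "call": 3, "check": 4,
    "small raise preflop": 5, "mid raise preflop": 6,
}
MAX_SEQ_LEN = 4  # hypothetical fixed sequence length


def split_features(snapshot, max_len=MAX_SEQ_LEN):
    """Split one snapshot into (global_features, action_id_sequence)."""
    # Branch 1: fixed-size vector describing the global game state.
    global_features = [
        float(snapshot["pot_size"]),
        float(snapshot["actor_stack_size"]),
        float(snapshot["players_remaining"]),
    ]
    # Branch 2: integer-encoded action history for a sequence model.
    ids = [
        ACTION_VOCAB.get(act["simple_action_type"], ACTION_VOCAB["<unk>"])
        for act in snapshot["previous_actions"]
    ]
    ids = ids[-max_len:]  # keep only the most recent actions
    padded = [ACTION_VOCAB["<pad>"]] * (max_len - len(ids)) + ids  # left-pad
    return global_features, padded


example_snapshot = {
    "pot_size": 3.5,
    "actor_stack_size": 96.5,
    "players_remaining": 3,
    "previous_actions": [
        {"simple_action_type": "fold"},
        {"simple_action_type": "small raise preflop"},
    ],
}
global_vec, action_seq = split_features(example_snapshot)
print(global_vec, action_seq)
```

The first output would feed a dense branch and the second an embedding plus LSTM branch, with the two merged before the final classification layer; that wiring is left out here because it depends on the framework used.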

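Looking back on yesterday's progress, I found that "simplifying" can sometimes open up more possibilities. It is like when I used OpenHoldem: I always wanted to handle every possible situation, and the program became hard to maintain as a result. Now I have learned to simplify the implementation where appropriate while keeping the core functionality, and maybe that is the real progress.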

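Next I still need to dig into how LSTM networks work. I am still in a fog for now, but at least the direction is right. As I have written before, sometimes you do not need to fully understand every detail; getting something running first and improving it gradually may be the better strategy.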

The updated hand data extraction, for iPoker XML hand histories:

#!/usr/bin/env python3

import os
import xml.etree.ElementTree as ET
import json
import re



# ------------------------------
# Helper functions
# ------------------------------



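def safe_float(text):
    # Keep only digits and dots (drops currency symbols, commas, spaces).
    # Anything that still cannot be parsed, including None, becomes 0.0.
    try:
        return float(re.sub(r"[^\d\.]", "", text))
    except Exception:
        return 0.0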



def street_from_round(round_no):
    return {1: "preflop", 2: "flop", 3: "turn", 4: "river"}.get(round_no, "unknown")



def get_hole_cards(game, player):
    # Return the player's pocket cards, or two "unknown" placeholders
    # when the cards are hidden (shown as X) or missing.
    for r in game.findall('round'):
        for elem in r.findall('cards'):
            if elem.attrib.get("type") == "Pocket" and elem.attrib.get("player") == player:
                if elem.text:
                    cards = elem.text.split()
                    if any(c.upper().startswith("X") for c in cards):
                        return ["unknown", "unknown"]
                    else:
                        return cards
    return ["unknown", "unknown"]



def simplify_action(action_details, round_no, blinds, pot_before_action):
    """
    Simplify the action into a single string category.

    Allowed original types:
      - 0 → fold
      - 3 → call
      - 4 → check
      - 5, 7, 23 → raise, which are further categorized as follows:
          * Preflop (round 1), using the big blind as reference:
              < 2.3×BB          → "small raise preflop"
              2.3×BB to 3.0×BB  → "mid raise preflop"
              3.0×BB to 4.0×BB  → "big raise preflop"
              ≥ 4.0×BB          → "all in preflop"
          * Postflop (round > 1), using the pot (before the action) as reference:
              < 35% of pot        → "small raise postflop"
              35% to 70% of pot   → "mid raise postflop"
              70% to 110% of pot  → "big raise postflop"
              ≥ 110% of pot       → "all in postflop"

    Returns None for any action type not listed above.
    """
    allowed_types = {0: "fold", 3: "call", 4: "check", 5: "raise", 7: "raise", 23: "raise"}
    orig_type = action_details['action_type']
    if orig_type not in allowed_types:
        return None
    base_action = allowed_types[orig_type]
    new_action = action_details.copy()
    if base_action != "raise":
        new_action["simple_action_type"] = base_action
    elif round_no == 1:
        # Preflop: size the raise relative to the big blind.
        bb = blinds.get("big_blind", 1)
        ratio = action_details['action_sum'] / bb if bb != 0 else 0
        if ratio < 2.3:
            new_action["simple_action_type"] = "small raise preflop"
        elif ratio < 3.0:
            new_action["simple_action_type"] = "mid raise preflop"
        elif ratio < 4.0:
            new_action["simple_action_type"] = "big raise preflop"
        else:
            new_action["simple_action_type"] = "all in preflop"
    else:
        # Postflop: size the raise relative to the pot before the action.
        ratio = action_details['action_sum'] / pot_before_action if pot_before_action > 0 else 0
        if ratio < 0.35:
            new_action["simple_action_type"] = "small raise postflop"
        elif ratio < 0.70:
            new_action["simple_action_type"] = "mid raise postflop"
        elif ratio < 1.10:
            new_action["simple_action_type"] = "big raise postflop"
        else:
            new_action["simple_action_type"] = "all in postflop"
    return new_action



def parse_decision_logs(root, hero):
    logs = []
    for game in root.findall('game'):
        gamecode = game.attrib.get("gamecode", "")
        general = game.find('general')
        blinds = {
            'small_blind': safe_float(general.findtext('smallblind', default="0")),
            'big_blind': safe_float(general.findtext('bigblind', default="0")),
            'ante': safe_float(general.findtext('ante', default="0"))
        }
        # Get players element and determine initial stacks.
        players_elem = general.find('players')
        # Identify small blind and big blind using round 0 actions.
        small_blind_player = None
        big_blind_player = None
        for r in game.findall('round'):
            if int(r.attrib.get('no', 0)) == 0:
                for child in r:
                    if child.tag == "action":
                        if child.attrib.get('type') == "1":
                            small_blind_player = child.attrib.get('player')
                        elif child.attrib.get('type') == "2":
                            big_blind_player = child.attrib.get('player')
                break
        # Identify the button player from the dealer flag.
        button_player = None
        for p in players_elem.findall('player'):
            if p.attrib.get('dealer', '0') == '1':
                button_player = p.attrib.get('name')
                break
        # Create mapping: small blind → 0, big blind → 1, button → 2.
        # Note: if the button also posted a blind (e.g. heads-up), the
        # button assignment below overwrites that player's blind position.
        player_positions = {}
        if small_blind_player is not None:
            player_positions[small_blind_player] = 0
        if big_blind_player is not None:
            player_positions[big_blind_player] = 1
        if button_player is not None:
            player_positions[button_player] = 2
        # Get initial stacks for the mapped players.
        player_stacks = {}
        for p in players_elem.findall('player'):
            name = p.attrib['name']
            if name in player_positions:
                player_stacks[name] = safe_float(p.attrib.get('chips', "0"))
        active_players = {name: True for name in player_positions}
        pot_size = 0.0
        board_cards = []
        cumulative_actions = []  # simplified actions (each includes "action_round")
        snapshot_action_counter = 0
        # Track cumulative contributions from each player.
        player_contributions = {name: 0.0 for name in player_positions}
        for r in game.findall('round'):
            round_no = int(r.attrib.get('no', 0))
            for child in r:
                if child.tag == "cards":
                    if child.attrib.get("type") != "Pocket":
                        if child.text:
                            board_cards.extend(child.text.split())
                elif child.tag == "action":
                    action_details = {
                        'player': child.attrib.get('player'),
                        'action_type': int(child.attrib.get('type')),
                        'action_sum': safe_float(child.attrib.get('sum')),
                        'action_round': round_no
                    }
                    player = action_details['player']
                    contribution = action_details['action_sum']
                    # For round 0 (blinds/antes): update contributions and pot; no snapshot.
                    if round_no < 1:
                        if player in player_contributions:
                            player_contributions[player] += contribution
                        pot_size += contribution
                        continue
                    # For rounds ≥ 1, capture current pot (before adding current action).
                    current_pot = pot_size
                    # Actor's current stack (as seen when taking the action).
                    actor_current_stack = player_stacks.get(player, 0.0) - player_contributions.get(player, 0.0)
                    # Simplify the action.
                    simple_action = simplify_action(action_details, round_no, blinds, current_pot)
                    if simple_action is not None:
                        snapshot_action_counter += 1
                        simple_action['action_no'] = snapshot_action_counter
                        # Replace the "player" field with its numeric value.
                        numeric_player = player_positions.get(player)
                        simple_action["player"] = numeric_player
                        # Build current stacks keyed by numeric positions.
                        current_player_stacks = {
                            player_positions[name]: (player_stacks[name] - player_contributions.get(name, 0.0))
                            for name in player_positions
                        }
                        # Create snapshot only for non-hero actions.
                        if player != hero:
                            snapshot = {
                                "gamecode": gamecode,
                                "round_no": round_no,
                                "current_street": street_from_round(round_no),
                                "blinds": blinds,
                                "player_positions": player_positions,  # mapping from name to number
                                "player_stacks": current_player_stacks,  # mapping from number to current stack
                                "pot_size": current_pot,
                                "board_cards": board_cards.copy(),
                                "previous_actions": [],
                                "action": simple_action.copy(),
                                "players_remaining": sum(1 for v in active_players.values() if v),
                                "is_button": (player_positions.get(player) == 2),
                                "actor_hole_cards": get_hole_cards(game, player),
                                "actor_stack_size": actor_current_stack,
                                "actor_position": player_positions.get(player)
                            }
                            # For previous actions, use the numeric "player" already set.
                            for act in cumulative_actions:
                                act2 = act.copy()
                                # Since act["player"] is now numeric, we simply assign it.
                                act2["player_position"] = act["player"]
                                snapshot["previous_actions"].append(act2)
                            logs.append(snapshot)
                    # Update contributions and pot AFTER snapshot creation.
                    if player in player_contributions:
                        player_contributions[player] += contribution
                    pot_size += contribution
                    if simple_action is not None:
                        cumulative_actions.append(simple_action)
                    # Mark a player as inactive if they folded.
                    if action_details.get('action_type') == 0:
                        active_players[player] = False
    return logs



# ------------------------------
# Process All XML Files in ipoker_hh Folder
# ------------------------------



def process_all_hand_history(root_folder, hero=None):
    all_logs = []
    for dirpath, dirnames, filenames in os.walk(root_folder):
        for filename in filenames:
            if filename.endswith(".xml"):
                file_path = os.path.join(dirpath, filename)
                try:
                    tree = ET.parse(file_path)
                    root_xml = tree.getroot()
                    # If no hero was given, take the nickname from the first
                    # session header that has one and reuse it for later files.
                    if hero is None:
                        session_general = root_xml.find('general')
                        if session_general is not None and session_general.find('nickname') is not None:
                            hero = session_general.find('nickname').text.strip()
                    logs = parse_decision_logs(root_xml, hero)
                    all_logs.extend(logs)
                    print(f"Processed file: {file_path} -> {len(logs)} snapshots.")
                except ET.ParseError as e:
                    print(f"Error parsing XML file: {file_path}", e)
    return all_logs



# ------------------------------
# Main
# ------------------------------



if __name__ == '__main__':
    root_folder = "ipoker_hh_test"  # Adjust folder path as needed.
    hero_name = None  # Or manually set your hero's name.
    print("Processing all hand history XML files in folder:", root_folder)
    all_logs = process_all_hand_history(root_folder, hero=hero_name)
    print("Total snapshots extracted:", len(all_logs))
    with open("logs.json", "w") as outfile:
        json.dump(all_logs, outfile, indent=4)
    print("Data extraction complete. Saved to logs.json")
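Since the point of the coarser categories is to have enough samples in each one, a quick count per label is a useful sanity check on the output. The sketch below assumes snapshots shaped like the ones written to logs.json, with the label stored under `action` → `simple_action_type`; the helper name `count_action_categories` and the demo data are made up for illustration:

```python
from collections import Counter


def count_action_categories(logs):
    """Count how many snapshots fall into each simplified action label."""
    return Counter(snap["action"]["simple_action_type"] for snap in logs)


# Tiny in-memory stand-in for the contents of logs.json (hypothetical data).
demo_logs = [
    {"action": {"simple_action_type": "fold"}},
    {"action": {"simple_action_type": "call"}},
    {"action": {"simple_action_type": "fold"}},
    {"action": {"simple_action_type": "mid raise postflop"}},
]

category_counts = count_action_categories(demo_logs)
for label, n in category_counts.most_common():
    print(f"{label}: {n}")
```

To run it on real output, load the file with `json.load` and pass the resulting list to `count_action_categories`. Labels with very few samples are candidates for merging into a neighbouring category.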