Restarting the Poker Bot Journey -13: The Pitfalls of AI Coding

Updated 2025/02/23 | Published 2025/02/23 | Reading time: about 27 minutes

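I had assumed that an advanced machine learning model like LSTM would be hard to master, but once I actually worked with it I found that it becomes easier to use when you stop insisting on understanding every detail. For example, I only need to know that it is good at handling sequence data and can capture the order in which actions occur, and that is enough to apply it to my poker bot project.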

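While using a language model to help write code, I ran into a hidden pitfall. About 80% of the time the model produces code that works as is. About 10% of the time the code has obvious errors, which is actually not the most troublesome case, because such errors are easy to spot and fix. The most frustrating part is the remaining 10%: the code looks completely normal, but the model has quietly changed some of the original logic.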

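These hidden changes are especially hard to detect as the codebase grows. For example, while improving the data extraction, the model for some reason removed the code that filters out the Hero's actions. Had I not taken the time to inspect the data carefully, the problem could have lurked for a long time before being discovered. This made me realize that instead of asking the LLM to rewrite the whole program, it is better to ask it for the direction of the change and then implement it myself. It may take longer, but it avoids some unexpected problems.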
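One way to catch this kind of silent change is to run the old and the new version of the script on the same hand histories and compare summary statistics of the two outputs. The sketch below is an assumption about how such a check might look, not part of the original workflow. summarize and compare_runs are hypothetical helper names, and the snapshot fields (current_street, actor_position) follow the extraction script in this post.

```python
from collections import Counter


def summarize(snapshots):
    """Count snapshots per (street, actor position)."""
    return Counter((s["current_street"], s["actor_position"]) for s in snapshots)


def compare_runs(before, after):
    """Return {key: (old_count, new_count)} for every key whose count changed."""
    old, new = summarize(before), summarize(after)
    return {
        key: (old.get(key, 0), new.get(key, 0))
        for key in set(old) | set(new)
        if old.get(key, 0) != new.get(key, 0)
    }


# Toy data: the "after" run suddenly contains an action from position 2.
before = [
    {"current_street": "preflop", "actor_position": 0},
    {"current_street": "preflop", "actor_position": 1},
]
after = before + [{"current_street": "preflop", "actor_position": 2}]

changes = compare_runs(before, after)
print(changes)
```

An empty result means the two runs agree on these counts. A new position appearing, as in the toy data, is the kind of signal that a filter may have been dropped.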

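Speaking of data checks, this is an easily overlooked step. Who doesn't want to jump straight to training the model? But it was precisely because I took the time to review the dataset that I discovered hero actions had ended up in it. It reminded me of past experience: it is often the seemingly tedious, boring groundwork that ends up deciding whether a project succeeds or fails.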
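A check like the following is one way to surface hero actions in the dataset. It is a minimal sketch, assuming the snapshot layout produced by the extraction script in this post (player_positions maps nicknames to position numbers, actor_position is the position of the acting player). find_hero_snapshots and the sample nicknames are hypothetical.

```python
def find_hero_snapshots(snapshots, hero):
    """Return the snapshots whose acting player is the hero."""
    leaked = []
    for snap in snapshots:
        hero_pos = snap["player_positions"].get(hero)
        if hero_pos is not None and snap["actor_position"] == hero_pos:
            leaked.append(snap)
    return leaked


# Made-up sample: the second snapshot is an action taken by "Hero".
sample = [
    {"gamecode": "1", "player_positions": {"Hero": 0, "Villain": 1}, "actor_position": 1},
    {"gamecode": "1", "player_positions": {"Hero": 0, "Villain": 1}, "actor_position": 0},
]

leaked = find_hero_snapshots(sample, "Hero")
print(f"{len(leaked)} of {len(sample)} snapshots are hero actions")
```

Running something like this on the extracted snapshots after every change to the extraction code turns a manual inspection into a repeatable test.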

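This stretch of development made me rethink how I work with AI tools. A shortcut that seems to save time can end up introducing more hidden problems. As I have written before, over-relying on AI tools during development carries similar pitfalls. I need to find a balance: knowing when to lean on the tool and when to do the work myself.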

The improved code for extracting iPoker hand history data:

#!/usr/bin/env python3
import os
import xml.etree.ElementTree as ET
import json
import re

# ------------------------------
# Helper functions
# ------------------------------

def safe_float(text):
    try:
        return float(re.sub(r"[^\d\.]", "", text))
    except Exception:
        return 0.0

def street_from_round(round_no):
    return {1: "preflop", 2: "flop", 3: "turn", 4: "river"}.get(round_no, "unknown")

def get_hole_cards(game, player):
    for r in game.findall('round'):
        for elem in r.findall('cards'):
            if elem.attrib.get("type") == "Pocket" and elem.attrib.get("player") == player:
                if elem.text:
                    cards = elem.text.split()
                    if any(c.upper().startswith("X") for c in cards):
                        return ["unknown", "unknown"]
                    else:
                        return cards
    return ["unknown", "unknown"]



def simplify_action(action_details, round_no, blinds, pot_before_action, round_contributions, current_round_max):
    """
    Simplify the action into a single string category.
    For non-raise actions, the simplified action is just the same.
    For raises:
      - Preflop (round 1): use the big blind as reference.
      - Postflop (round > 1): first compute the call amount required
        (difference between the current round's highest bet and the player's current round contribution).
        Then, effective raise = action_sum - call_amount, and effective pot = pot_before_action + call_amount.
        The ratio effective_raise/effective_pot is used to determine raise size.
    """
    allowed_types = {0: "fold", 3: "call", 4: "check", 5: "raise", 7: "raise", 23: "raise"}
    orig_type = action_details['action_type']
    if orig_type not in allowed_types:
        return None
    base_action = allowed_types[orig_type]
    new_action = action_details.copy()
    if base_action != "raise":
        new_action["simple_action_type"] = base_action
    else:
        if round_no == 1:
            bb = blinds.get("big_blind", 1)
            ratio = action_details['action_sum'] / bb if bb != 0 else 0
            # Preflop thresholds (adjust as needed).
            if ratio <= 3:
                new_action["simple_action_type"] = "small raise preflop"
            elif ratio > 3:
                new_action["simple_action_type"] = "all in preflop"
            else:
                new_action["simple_action_type"] = "raise"
        else:
            # For postflop raises, compute call amount.
            current_player_contrib = round_contributions.get(action_details['player'], 0)
            call_amount = max(0, current_round_max - current_player_contrib)
            effective_raise = action_details['action_sum'] - call_amount
            effective_pot = pot_before_action + call_amount
            ratio = effective_raise / effective_pot if effective_pot > 0 else 0
            # Thresholds: adjust as needed.
            if ratio < 0.53:
                new_action["simple_action_type"] = "small raise postflop"
            else:
                new_action["simple_action_type"] = "big raise postflop"
    return new_action



def parse_decision_logs(root, hero):
    logs = []
    for game in root.findall('game'):
        gamecode = game.attrib.get("gamecode", "")
        general = game.find('general')
        blinds = {
            'small_blind': safe_float(general.findtext('smallblind', default="0")),
            'big_blind': safe_float(general.findtext('bigblind', default="0")),
            'ante': safe_float(general.findtext('ante', default="0"))
        }
        # Get players element and determine initial stacks.
        players_elem = general.find('players')
        # Determine hand type based on number of players.
        num_players = len(players_elem.findall('player'))
        if num_players == 2:
            head_up = True
            hand_type = "head-up"
        elif num_players == 3:
            head_up = False
            hand_type = "3-handed"
        else:
            head_up = False
            hand_type = f"{num_players}-handed"

        # Identify small blind and big blind using round 0 actions.
        small_blind_player = None
        big_blind_player = None
        for r in game.findall('round'):
            if int(r.attrib.get('no', 0)) == 0:
                for child in r:
                    if child.tag == "action":
                        if child.attrib.get('type') == "1":
                            small_blind_player = child.attrib.get('player')
                        elif child.attrib.get('type') == "2":
                            big_blind_player = child.attrib.get('player')
                break

        # Identify the button player from the dealer flag.
        button_player = None
        for p in players_elem.findall('player'):
            if p.attrib.get('dealer', '0') == '1':
                button_player = p.attrib.get('name')
                break

        # Create mapping: small blind → 0, big blind → 1, button → 2.
        player_positions = {}
        if small_blind_player is not None:
            player_positions[small_blind_player] = 0
        if big_blind_player is not None:
            player_positions[big_blind_player] = 1
        if button_player is not None:
            player_positions[button_player] = 2

        # Get initial stacks for the mapped players.
        player_stacks = {}
        for p in players_elem.findall('player'):
            name = p.attrib['name']
            if name in player_positions:
                player_stacks[name] = safe_float(p.attrib.get('chips', "0"))

        active_players = {name: True for name in player_positions}
        pot_size = 0.0
        board_cards = []
        cumulative_actions = []  # simplified actions (each includes "action_round")
        snapshot_action_counter = 0
        # Track cumulative contributions (over the whole game).
        player_contributions = {name: 0.0 for name in player_positions}

        for r in game.findall('round'):
            round_no = int(r.attrib.get('no', 0))
            # For rounds 1 and up, track contributions within the current betting round.
            if round_no >= 1:
                round_contributions = {player: 0.0 for player in player_positions}
                current_round_max = 0.0
            for child in r:
                if child.tag == "cards":
                    if child.attrib.get("type") != "Pocket":
                        if child.text:
                            board_cards.extend(child.text.split())
                elif child.tag == "action":
                    action_details = {
                        'player': child.attrib.get('player'),
                        'action_type': int(child.attrib.get('type')),
                        'action_sum': safe_float(child.attrib.get('sum')),
                        'action_round': round_no
                    }
                    player = action_details['player']
                    contribution = action_details['action_sum']
                    # For round 0 (blinds/antes): update cumulative contributions and pot; no snapshot.
                    if round_no < 1:
                        if player in player_contributions:
                            player_contributions[player] += contribution
                        pot_size += contribution
                        continue
                    # For rounds ≥ 1, capture the current pot (before adding current action).
                    current_pot = pot_size
                    # Actor's current stack (capped at 0 if contributions exceed chips).
                    actor_current_stack = max(0, player_stacks.get(player, 0.0) - player_contributions.get(player, 0.0))
                    # Simplify the action using the round-level data.
                    simple_action = simplify_action(action_details, round_no, blinds, current_pot, round_contributions, current_round_max)
                    if simple_action is not None:
                        snapshot_action_counter += 1
                        simple_action['action_no'] = snapshot_action_counter
                        # Replace the "player" field with its numeric value.
                        numeric_player = player_positions.get(player)
                        simple_action["player"] = numeric_player
                        # Build current stacks keyed by numeric positions.
                        current_player_stacks = {
                            player_positions[name]: max(0, player_stacks[name] - player_contributions.get(name, 0.0))
                            for name in player_positions
                        }
                        # Prepare previous actions.
                        previous_actions = []
                        for act in cumulative_actions:
                            act_copy = act.copy()
                            act_copy["player_position"] = act_copy["player"]
                            previous_actions.append(act_copy)
                        # Exclude hero's own actions from snapshots.
                        if player != hero:
                            snapshot = {
                                "gamecode": gamecode,
                                "round_no": round_no,
                                "current_street": street_from_round(round_no),
                                "blinds": {
                                    "small_blind": blinds.get("small_blind"),
                                    "big_blind": blinds.get("big_blind"),
                                    "ante": blinds.get("ante")
                                },
                                "player_positions": player_positions,  # mapping from name to number
                                "player_stacks": current_player_stacks,  # mapping from number to current stack
                                "pot_size": current_pot,
                                "board_cards": board_cards.copy(),
                                "previous_actions": previous_actions,
                                "action": simple_action.copy(),
                                "players_remaining": sum(1 for v in active_players.values() if v),
                                "is_button": (player_positions.get(player) == 2),
                                "actor_hole_cards": get_hole_cards(game, player),
                                "actor_stack_size": actor_current_stack,
                                "actor_position": player_positions.get(player),
                                "head_up": head_up,
                                "hand_type": hand_type
                            }
                            logs.append(snapshot)
                    # Update round-level contributions for this action.
                    if round_no >= 1:
                        round_contributions[player] = round_contributions.get(player, 0) + contribution
                        current_round_max = max(current_round_max, round_contributions[player])
                    # Update cumulative contributions and pot AFTER snapshot creation.
                    if player in player_contributions:
                        player_contributions[player] += contribution
                    pot_size += contribution
                    if simple_action is not None:
                        cumulative_actions.append(simple_action)
                    # Mark a player as inactive if they folded.
                    if action_details.get('action_type') == 0:
                        active_players[player] = False
    return logs



# ------------------------------
# Process All XML Files in ipoker_hh Folder
# ------------------------------

def process_all_hand_history(root_folder):
    all_logs = []
    for dirpath, dirnames, filenames in os.walk(root_folder):
        for filename in filenames:
            if filename.endswith(".xml"):
                file_path = os.path.join(dirpath, filename)
                try:
                    tree = ET.parse(file_path)
                    root_xml = tree.getroot()
                    # Re-read hero nickname from each file.
                    session_general = root_xml.find('general')
                    file_hero = None
                    if session_general is not None and session_general.find('nickname') is not None:
                        file_hero = session_general.find('nickname').text.strip()
                    logs = parse_decision_logs(root_xml, file_hero)
                    all_logs.extend(logs)
                    print(f"Processed file: {file_path} -> {len(logs)} snapshots.")
                except ET.ParseError as e:
                    print(f"Error parsing XML file: {file_path}", e)
    return all_logs

# ------------------------------
# Main
# ------------------------------

if __name__ == '__main__':
    root_folder = "ipoker_hh_test"  # Adjust folder path as needed.
    print("Processing all hand history XML files in folder:", root_folder)
    all_logs = process_all_hand_history(root_folder)
    print("Total snapshots extracted:", len(all_logs))
    with open("logs.json", "w") as outfile:
        json.dump(all_logs, outfile, indent=4)
    print("Data extraction complete. Saved to logs.json")
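To make the postflop sizing rule in simplify_action concrete, here is a standalone worked example of the same arithmetic. The function name postflop_raise_ratio and the chip amounts are made up for illustration. The 0.53 threshold is the one used in the script above.

```python
def postflop_raise_ratio(action_sum, pot_before_action, current_round_max, player_round_contrib):
    """Mirror the postflop branch of simplify_action: effective raise / effective pot."""
    call_amount = max(0, current_round_max - player_round_contrib)
    effective_raise = action_sum - call_amount
    effective_pot = pot_before_action + call_amount
    return effective_raise / effective_pot if effective_pot > 0 else 0


# Illustrative numbers: the pot is 100 at the start of the street, an opponent
# bets 50 (pot is now 150), and the player, who has put in nothing this round,
# puts in 200.
ratio = postflop_raise_ratio(
    action_sum=200,
    pot_before_action=150,
    current_round_max=50,
    player_round_contrib=0,
)
label = "small raise postflop" if ratio < 0.53 else "big raise postflop"
print(ratio, label)
```

Here the call amount is 50, so the effective raise is 150 into an effective pot of 200, a ratio of 0.75, which lands in the "big raise postflop" bucket.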

From 傑劉的沙龍.
