While figuring out how to integrate PioSOLVER solutions into the bot, I ran into some interesting problems. The original plan was to use the checkmathpoker.com API for preflop and postflop solutions, but at 154 a month with a 10,000-request cap, it was simply too expensive for early-stage development.
I switched to PioSOLVER 2 to generate heads-up preflop solutions, and immediately hit the problem of converting the results into a lookup table. After my OpenHoldem days I had no desire to go back down the old road of manually copying, pasting, and reformatting everything; it eats far too much time. I had hoped to handle it through PioSOLVER's UPI interface, only to find that every query requires reloading the 18GB solution, which is simply not viable at runtime.
After some thought I settled on a compromise: store the preflop charts copied out of PioSOLVER as JSON. It still involves some manual work, but lookups will be much faster. For now I'm focusing on heads-up, so the workload should be manageable, since the training app offers only limited options and there aren't many variations to cover.
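To make that compromise concrete, here is a minimal sketch of what such a JSON lookup could look like. The file name preflop_charts.json, the spot key, and the chart layout are my own assumptions for illustration; the frequencies would be whatever gets copied out of PioSOLVER by hand.

import json

# Hypothetical chart layout: hands in 169-combo notation mapped to action
# frequencies, copied by hand from the PioSOLVER preflop browser.
example_chart = {
    "hu_sb_open_100bb": {
        "A5s": {"RAISE 25": 100.0, "FOLD": 0.0},
        "72o": {"RAISE 25": 0.0, "FOLD": 100.0},
    }
}

# Save once, then the bot only has to read this small file at runtime.
with open("preflop_charts.json", "w") as f:
    json.dump(example_chart, f, indent=2)

def lookup_action(charts, spot, hand):
    """Return the stored action frequencies for a hand in a given spot, or None."""
    return charts.get(spot, {}).get(hand)

with open("preflop_charts.json") as f:
    charts = json.load(f)

print(lookup_action(charts, "hu_sb_open_100bb", "A5s"))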
Things get much more complicated, though, once I start thinking about expanding to 6-max. Just covering different open sizes (a min open, or 3BB and 10BB 3-bets) makes the game tree grow exponentially. It reminds me of the lesson from the OpenHoldem project: I tried to hard-code every situation I could think of, yet reality always had infinitely more variations, and the bot played terribly whenever it hit something unexpected.
Then it occurred to me that a machine learning model might be able to learn the solver output data. After some effort, and after adding features such as suited and pocket pair, accuracy climbed to nearly 100%. The model also generalized surprisingly well across different stack sizes, and for a while I thought I had found a breakthrough.
But just as I was about to dig deeper in this direction, I realized I had unknowingly drifted back onto the old path, the solver-based one. Looking back at my experience from a few years ago, whether I hard-coded solver strategies in OpenHoldem or implemented them some other way, the end result was always similar: mediocre play that barely covered the rake. By contrast, exploitative strategies that adjust to player types, while seemingly less "perfect", consistently produced better returns.
That realization left me thinking hard. Why does the "theoretically correct" solver strategy keep underperforming in practice? Even stranger, I have never heard of anyone achieving great success by playing strictly according to solver output. The successful pros learn from solvers rather than follow them blindly. Maybe solver strategy is a beautiful trap for me: because it is "perfect" and easy to verify, I keep falling back into what looks like the safe choice.
In the end I decided to pause this line of development and think instead about how to make good use of the 2 million hands of spin&go history I already have. I'm not yet sure how to process that data, but I believe it may be a more valuable direction. The decision comes with some regret, since the model work was already progressing nicely, but sometimes letting go of work you have invested in and admitting you took a wrong turn takes more courage than stubbornly pressing on.
This experience reminded me once again that in poker, what matters most is probably not chasing a theoretically perfect strategy but adjusting effectively to different opponents. That road looks less clear-cut than the solver one, but it is probably the one truly worth taking.
Here is the simple code I used to test a machine learning model on data for different stack sizes:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
# -------------------------
# 1. Load & Merge Data for Multiple Stack Sizes
# -------------------------
df_80 = pd.read_csv("hu_80bb_r_0.csv")
df_100 = pd.read_csv("hu_100bb_r_0.csv")
df_120 = pd.read_csv("hu_120bb_r_0.csv")
df_80["stack_size"] = 80
df_100["stack_size"] = 100
df_120["stack_size"] = 120
df = pd.concat([df_80, df_100, df_120], ignore_index=True)
# -------------------------
# 2. Clean Frequencies
# -------------------------
# We'll define a small function that sets frequencies near 0 → 0, near 100 → 100.
def fix_freq(freq, eps=1.0):
    """
    If freq >= 100 - eps, set it to 100.
    If freq <= eps, set it to 0.
    Otherwise, leave it as is.
    """
    if freq >= 100 - eps:
        return 100.0
    elif freq <= eps:
        return 0.0
    else:
        return freq
df["RAISE 25"] = df["RAISE 25"].apply(fix_freq)
df["FOLD"] = df["FOLD"].apply(fix_freq)
# -------------------------
# 3. Parse & Canonicalize Hole Cards
# -------------------------
rank_map = {'2': 2, '3': 3, '4': 4, '5': 5, '6': 6,
            '7': 7, '8': 8, '9': 9, 'T': 10,
            'J': 11, 'Q': 12, 'K': 13, 'A': 14}
suit_map = {'c': 1, 'd': 2, 'h': 3, 's': 4}
def parse_hand_to_canonical(hand_str):
    """
    hand_str like 'Qd2s' or '2sQd' (4 chars total).
    1) Extract card1, card2
    2) Convert each to (rank, suit)
    3) Canonicalize: ensure (rank1, suit1) >= (rank2, suit2)
       by rank primarily, then suit as tiebreaker
    4) Return (rank1, suit1, rank2, suit2)
    """
    card1 = hand_str[0:2]  # e.g. 'Qd'
    card2 = hand_str[2:4]  # e.g. '2s'
    # Parse ranks and suits
    r1 = rank_map[card1[:-1]]
    s1 = suit_map[card1[-1]]
    r2 = rank_map[card2[:-1]]
    s2 = suit_map[card2[-1]]
    # If the second card is "bigger" by rank, or ties on rank with a bigger suit,
    # swap so that (r1, s1) is always the "higher" / canonical card.
    # This ensures Qd2s == 2sQd => same final representation.
    if (r2 > r1) or (r2 == r1 and s2 > s1):
        r1, r2 = r2, r1
        s1, s2 = s2, s1
    return r1, s1, r2, s2

# Apply to the entire DataFrame
df[["rank1", "suit1", "rank2", "suit2"]] = df["Hand"].apply(
    lambda h: pd.Series(parse_hand_to_canonical(h))
)
# -------------------------
# 4. Additional Indicators
# -------------------------
df["is_suited"] = (df["suit1"] == df["suit2"]).astype(int)
df["is_pair"] = (df["rank1"] == df["rank2"]).astype(int)
def is_connector(row):
    return 1 if abs(row["rank1"] - row["rank2"]) == 1 else 0

def is_1_gap(row):
    return 1 if abs(row["rank1"] - row["rank2"]) == 2 else 0

df["is_connector"] = df.apply(is_connector, axis=1)
df["is_1_gap"] = df.apply(is_1_gap, axis=1)
# -------------------------
# 5. Build the Target: Fold Frequency in [0,1]
# -------------------------
df["fold_freq"] = df["FOLD"] / 100.0 # convert from [0..100] to [0..1]
# -------------------------
# 6. Define X (Features) and y (Target)
# -------------------------
feature_cols = [
    "rank1", "suit1", "rank2", "suit2",
    "stack_size",
    "is_suited", "is_pair", "is_connector", "is_1_gap"
]
X = df[feature_cols]
y = df["fold_freq"]
# -------------------------
# 7. K-Fold Cross-Validation
# -------------------------
kf = KFold(n_splits=5, shuffle=True, random_state=42)
model = RandomForestRegressor(n_estimators=100, random_state=42)
mse_list = []
for train_index, val_index in kf.split(X):
    X_train, X_val = X.iloc[train_index], X.iloc[val_index]
    y_train, y_val = y.iloc[train_index], y.iloc[val_index]
    model.fit(X_train, y_train)
    y_pred = model.predict(X_val)
    mse = mean_squared_error(y_val, y_pred)
    mse_list.append(mse)
mse_array = np.array(mse_list)
rmse_array = np.sqrt(mse_array)
print("MSE (per fold):", mse_array)
print("RMSE (per fold):", rmse_array)
print("Mean RMSE:", rmse_array.mean(), "Std dev:", rmse_array.std())
# -------------------------
# 8. Train Final Model on ALL Data
# -------------------------
final_model = RandomForestRegressor(n_estimators=100, random_state=42)
final_model.fit(X, y)
# -------------------------
# 9. Prepare Function to Query Any Hand + Stack
# -------------------------
def prepare_features(hand_str, stack_size):
    """
    Convert a hand like 'AcAd' or '2sQd' plus a stack size into
    the 9-column feature row, using the same canonical parse
    to ensure consistent ordering of the two cards.
    Returns a one-row DataFrame with the training column names,
    so model.predict() sees the same feature names it was fit on.
    """
    # Parse & canonicalize
    r1, s1, r2, s2 = parse_hand_to_canonical(hand_str)
    # Indicators
    is_suited = 1 if s1 == s2 else 0
    is_pair = 1 if r1 == r2 else 0
    is_connector = 1 if abs(r1 - r2) == 1 else 0
    is_1_gap = 1 if abs(r1 - r2) == 2 else 0
    features = [
        r1, s1,
        r2, s2,
        stack_size,
        is_suited,
        is_pair,
        is_connector,
        is_1_gap
    ]
    return pd.DataFrame([features], columns=feature_cols)
# -------------------------
# Example Testing
# -------------------------
test_hands = ["Qd2s", "2sQd", "AcAd", "5h9d", "9h5s"]
stack_size = 90
for hand in test_hands:
    X_custom = prepare_features(hand, stack_size)
    pred_fold = final_model.predict(X_custom)[0]
    pred_raise = 1.0 - pred_fold  # only two actions in this spot: RAISE 25 or FOLD
    print(f"Hand: {hand}, Stack: {stack_size}")
    print(f"  Predicted fold frequency: {pred_fold:.4f} ({pred_fold*100:.2f}%)")
    print(f"  Predicted raise frequency: {pred_raise:.4f} ({pred_raise*100:.2f}%)")
    print("----")