Image source: https://www.johnslots.com/en/responsible-gambling/
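This post collects and comments on three examples that illustrate the "accuracy-interpretability tradeoff".
On a fraud detection dataset, the tradeoff I expected to observe did not seem to appear. That made me very curious about where the story of this tradeoff comes from.
This post records three updates to my understanding of the tradeoff:
- The tradeoff seems to be metric dependent
- How interpretability is characterized is model-class specific
- For gambling prediction, the tradeoff does show up within a small range of models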
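Thought #1: The tradeoff does show up in R score and mean absolute error
This article provides a practical example of the "accuracy-interpretability tradeoff".
It considers three models:
- Linear Regression
- Decision Tree
- Gradient Boosting
The results it reports are:
#1 Linear Regression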
Mean Squared Error: 19592.4703292543
R score: 0.40700134640548247
Mean Absolute Error: 103.67180228987019
#2 Decision Tree
Mean Squared Error: 10880.635297455
R score: 0.6706795022162286
Mean Absolute Error: 73.76311613574498
#3 Gradient Boosting
Mean Squared Error: 1388.8979420780786
R score: 0.9579626971080454
Mean Absolute Error: 23.81293483364058
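As the models become less interpretable, the R score rises markedly and the Mean Absolute Error falls markedly, so predictive performance improves on both metrics (and on Mean Squared Error as well).
🤔 However, this does not mean that classification tasks show the same tradeoff. I need to keep observing.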
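To make the three regression metrics concrete, here is a minimal pure-Python sketch. It assumes the "R score" above is the coefficient of determination R², which the article does not state explicitly; the toy targets and predictions are hypothetical. Under that assumption, MSE / (1 - R²) equals the variance of the test targets, so the last lines check whether the three reported pairs imply the same variance.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data, not taken from the article.
y_true = [100.0, 200.0, 300.0, 400.0]
coarse = [250.0, 250.0, 250.0, 250.0]  # always predicts the mean
close = [110.0, 190.0, 310.0, 390.0]   # off by 10 everywhere

for name, y_pred in [("coarse", coarse), ("close", close)]:
    print(name,
          mean_squared_error(y_true, y_pred),
          r2_score(y_true, y_pred),
          mean_absolute_error(y_true, y_pred))

# (MSE, R score) pairs as reported above.
reported = {
    "linear regression": (19592.4703292543, 0.40700134640548247),
    "decision tree": (10880.635297455, 0.6706795022162286),
    "gradient boosting": (1388.8979420780786, 0.9579626971080454),
}
# If R is R^2 on a shared test set, MSE / (1 - R) is the target variance.
implied_variance = {name: mse / (1.0 - r) for name, (mse, r) in reported.items()}
print(implied_variance)
```

The three implied variances agree closely (about 33,040), which supports reading "R score" as R² computed on one shared test set.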
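Thought #2: Generalized additive models do not seem to have this tradeoff
This article offers reflections on the interpretability of generalized additive models (GAMs).
Its view is that inductive bias is an important ingredient of interpretable models.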
Our results suggest that inductive bias plays a crucial role in what interpretable models learn and that tree-based GAMs represent the best balance of sparsity, fidelity and accuracy and thus appear to be the most trustworthy GAM models.
The three properties it wants to balance are:
- Sparsity: use fewer features to make predictions
- Fidelity: reflect the true patterns in the data
- Accuracy: predictive accuracy
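As a rough illustration of why an additive model is easy to inspect, here is a hand-written sketch rather than a fitted GAM. The prediction is an intercept plus one shape function per feature, so each feature's contribution can be read off on its own. The feature names, shape functions, and the sparsity measure (fraction of available features left unused) are all assumptions made for this example, not definitions from the paper.

```python
# A hand-written additive model, not fitted to any data.
# All names and numbers below are hypothetical.
INTERCEPT = 1.0
SHAPE_FUNCTIONS = {
    "x1": lambda v: 2.0 * v,         # linear shape
    "x2": lambda v: (v - 1.0) ** 2,  # nonlinear shape
}
ALL_FEATURES = ["x1", "x2", "x3"]    # x3 is available but unused


def contributions(row):
    """Per-feature contributions; each depends on that feature alone."""
    return {name: f(row[name]) for name, f in SHAPE_FUNCTIONS.items()}


def predict(row):
    """Additive prediction: intercept plus the sum of contributions."""
    return INTERCEPT + sum(contributions(row).values())


def sparsity(all_features, used_features):
    """One possible sparsity measure: fraction of features not used."""
    return 1.0 - len(used_features) / len(all_features)


row = {"x1": 0.5, "x2": 3.0, "x3": 7.0}
print(contributions(row))
print(predict(row))
print(sparsity(ALL_FEATURES, SHAPE_FUNCTIONS))
```

Fidelity is harder to sketch this way, because it requires knowing the true pattern in the data, which is only available when the data-generating process is known.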
Thought #3: Explaining gambling does show this tradeoff within a small range of models
Section 2 discusses the related work in the application of machine learning to understand and interpret gambling behaviour. Section 5 discusses the interpretability of our empirical results, and concludes the need for further research of understanding and measuring algorithm interpretation.
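The same logic should apply to our setting as well.
The demand for interpretability comes from the Responsible Gambling community, which needs to produce knowledge about gambling behavior.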
As reported in [15], we polled the audience at a related presentation at the 2016 New Horizons in Responsible Gambling conference to explore the importance of knowledge extraction and algorithm interpretability.
Judging by the poll, people still prefer algorithms or models that can be explained.
Respondents were asked whether they would prefer a responsible gambling assessment algorithm that provided a 90% accurate assessment of problem gambling risk that they could not unpack or understand, or a model that provided a 75% accurate assessment that was fully interpretable and accountable. Only 20% chose the more accurate model, with 70% preferring to sacrifice 15 percentage points of accuracy for greater interpretability (10% were uncertain or felt it depended on the circumstances).
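The goal here is to predict harmful gambling, which is also a kind of classification problem.
The dataset it builds on is available from the Division on Addiction.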
Building on the work from the live action sports betting dataset available from the Division on Addiction public domain, in [12] nine supervised learning methods were assessed at identifying disordered Internet sports gamblers.
This paper focuses on knowledge extraction by using random forests and artificial neural networks and TREPAN on a new IGT dataset to not only predict, but also describe, self-excluders through knowledge extraction.
The paper compares random forest, neural network, and decision tree models by prediction accuracy. Random forest performs best.
I feel I will only really know after trying this myself.
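As a starting point for such an experiment, the comparison by prediction accuracy can be sketched as below. The labels and the two sets of predictions are made-up placeholders (1 = self-excluder, 0 = not); they are not results from the paper, and the model names are deliberately generic.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
predictions = {
    "model_a": [1, 0, 0, 1, 0, 0, 0, 0],
    "model_b": [1, 1, 0, 0, 0, 1, 0, 1],
}

scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In a real run, each entry of `predictions` would come from a trained classifier evaluated on the same held-out set.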
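Additional thoughts on synthetic data
Synthetic data and simulated data are not the same thing.
- Synthetic data: train a "dataset model" on real data, then generate data from that model
- Simulated data: generate data from a mathematical model, using probability theory for the randomness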
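The distinction can be sketched with the standard library alone. Here the "dataset model" is just a Gaussian fitted by mean and standard deviation, which is a deliberately minimal stand-in for a real synthetic-data generator; the "real" values and the simulation parameters are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable


def simulate(n, mean, stdev, rng):
    """Simulated data: sample from a model whose parameters we choose."""
    return [rng.gauss(mean, stdev) for _ in range(n)]


def fit_dataset_model(real_values):
    """Synthetic data, step 1: learn a model of the real data."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def synthesize(n, model, rng):
    """Synthetic data, step 2: sample from the learned model."""
    mean, stdev = model
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical stand-in for a real dataset.
real_values = [12.0, 15.5, 9.8, 14.1, 11.3, 13.7]

simulated = simulate(1000, mean=10.0, stdev=2.0, rng=rng)
model = fit_dataset_model(real_values)
synthetic = synthesize(1000, model, rng)

print(statistics.mean(simulated), statistics.mean(synthetic), model)
```

The simulated sample follows parameters chosen up front, while the synthetic sample follows whatever the fitted model learned from the real values.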
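When training on synthetic data, how do we sell interpretability?
I would like to frame this along three dimensions. So far I have fidelity and accuracy, but I do not know how to quantify interpretability.
Maybe I could try the Adult dataset: it is more raw and might allow better observations?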