匯東華's Weekend Mini Statistics Question :D
Scheduled for publication every weekend
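The British Medical Journal (BMJ) is a peer-reviewed general medical journal. Founded in 1840, it is one of the oldest medical journals. Its 2020 impact factor (IF) was 39.890, and it is highly regarded in medical research.
From September 2008 to 2015, two experts in epidemiology and statistics published more than 300 instalments of the BMJ's Statistical Question series. In each instalment they posed a multiple-choice question on statistics or epidemiology and then explained the answer. This series presents a selection of those questions: each is given in its original English wording together with a restated question and an option-by-option summary. Interested readers are welcome to try answering.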
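BMJ Statistical Question (1): Which analysis is most appropriate for assessing the degree of linear correlation between two continuous variables? (Single answer)
A. Scatter plot
B. Odds ratio (OR)
C. Bar chart
D. Correlation coefficient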
How would you best assess the degree of linear association between two continuous variables?
a) Scatter plot
b) Odds ratio
c) Bar chart
d) Correlation
Answer:
A scatter plot would display graphically the association between two continuous variables. The x axis displays values for one variable, and the y axis values for the other. Each point represents one paired measurement of each variable. This is a useful way to display the data and would show U shaped and other non-linear associations as well as linear associations.
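To make the idea concrete, here is a minimal text-only sketch of a scatter plot in Python. The function name `ascii_scatter`, the grid size, and the data (a U shaped relationship, y = x squared) are all assumptions made up for illustration; in practice a plotting library would normally be used instead.

```python
def ascii_scatter(xs, ys, width=21, height=9):
    """Return a text scatter plot with one '*' per (x, y) pair."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_span = (x_max - x_min) or 1  # avoid dividing by zero for constant data
    y_span = (y_max - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)


# Hypothetical paired measurements with a U shaped association.
xs = list(range(-5, 6))
ys = [x * x for x in xs]
plot = ascii_scatter(xs, ys)
print(plot)
```

The printed points trace a U: high at both ends of the x axis and low in the middle, a pattern that is obvious to the eye even though it is not a straight line.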
However, a scatter plot does not measure an association. To put a number to it requires an assessment of correlation. Pearson’s correlation coefficient measures the degree of linear association; Spearman’s rank correlation coefficient, which is Pearson’s coefficient calculated on the ranks of the values, is the usual alternative when Pearson’s assumptions are in doubt. The correlation coefficient may take any value between −1 and 1. For Pearson’s coefficient, 0 represents no linear association, 1 represents a perfect straight line with y values increasing with increasing x values, and −1 represents a perfect straight line with y values decreasing with increasing x values. Like other parametric techniques, Pearson’s correlation coefficient is sensitive to outlying values and may generate misleading P values when the y axis values are not normally distributed. Spearman’s rank correlation coefficient, like other non-parametric techniques, is less sensitive to outlying values and more robust.
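The behaviour described above can be checked with a small pure-Python sketch. The helper names (`pearson`, `ranks`, `spearman`) and all of the data are hypothetical and chosen only for illustration; P values are not computed here, and a statistics library would normally be used for real analyses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's coefficient: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data, invented for this example.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_line = [2 * v + 1 for v in x]      # perfect increasing straight line
y_outlier = y_line[:-1] + [100]      # same line, last value replaced by an outlier
y_u = [(v - 4.5) ** 2 for v in x]    # U shaped, symmetric about the mean of x

print(round(pearson(x, y_line), 2))      # 1.0
print(round(pearson(x, y_outlier), 2))   # about 0.67: pulled down by one outlier
print(round(spearman(x, y_outlier), 2))  # 1.0: the ranks are unchanged
print(round(pearson(x, y_u), 2))         # 0.0: strong association, but not linear
```

A single outlying value lowers Pearson's coefficient noticeably, while Spearman's coefficient is unaffected because the order of the values does not change. The U shaped data give a coefficient of 0 despite a clear pattern, which is why looking at the scatter plot first is still worthwhile.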
Odds ratios can be used only with dichotomous variables: an odds ratio compares the odds of an outcome between two groups. Odds express the probability of something happening divided by the probability of it not happening.
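As a rough illustration of the arithmetic, the sketch below computes odds from a probability and an odds ratio from a 2×2 table. The function names, the "exposed" and "unexposed" group labels, and the counts are hypothetical and are not taken from the BMJ question.

```python
def odds(probability):
    """Odds = P(event happens) / P(event does not happen)."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in the range [0, 1)")
    return probability / (1 - probability)


def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the event in group A divided by the odds in group B."""
    return (events_a / non_events_a) / (events_b / non_events_b)


# Hypothetical 2x2 table of a dichotomous outcome in two groups:
#              event   no event
# exposed        20       80
# unexposed      10       90
print(odds(0.2))                    # a probability of 0.2 gives odds of about 0.25
print(odds_ratio(20, 80, 10, 90))   # about 2.25
```

Both variables here have only two categories (exposed or not, event or not), which is exactly why an odds ratio cannot describe the association between two continuous variables.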
A bar chart is used to plot the frequency of a nominal variable (such as eye colour) or an ordinal variable (such as agreement with a survey question: “disagree strongly, disagree, agree, agree strongly”).
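A minimal sketch of the idea, assuming a handful of made-up survey responses: count how often each category occurs and draw one bar per category. The function name `text_bar_chart` and the response data are hypothetical.

```python
from collections import Counter

# Ordered categories of the ordinal variable, taken from the survey example.
LEVELS = ["disagree strongly", "disagree", "agree", "agree strongly"]


def text_bar_chart(responses, levels):
    """Return one line per category, with a bar whose length is its frequency."""
    counts = Counter(responses)
    width = max(len(level) for level in levels)
    lines = []
    for level in levels:
        n = counts[level]  # Counter returns 0 for categories never chosen
        lines.append(f"{level:<{width}} | {'#' * n} ({n})")
    return "\n".join(lines)


# Hypothetical responses to a single survey question.
responses = ["agree", "disagree", "agree", "agree strongly",
             "agree", "disagree strongly", "disagree"]
chart = text_bar_chart(responses, LEVELS)
print(chart)
```

The chart shows how many respondents fall into each category of one variable; it says nothing about how two continuous variables move together.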
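Summary of the options:
Scatter plot: displays graphically the association between two continuous variables. The x axis shows the values of one variable and the y axis the values of the other; each point represents one paired measurement. It is a useful way to display the data and reveals U shaped and other non-linear associations as well as linear ones. However, a scatter plot cannot quantify the strength of an association; that requires a numerical measure of correlation (a correlation coefficient).
Correlation coefficient: measures the degree of linear association between two variables. Pearson's correlation coefficient or Spearman's rank correlation coefficient may be used. The coefficient takes any value between −1 and 1: 0 means no linear association, 1 means a perfect straight line with y increasing as x increases, and −1 means a perfect straight line with y decreasing as x increases. Like other parametric methods, Pearson's coefficient is sensitive to outlying values and may give misleading P values when the y values are not normally distributed. Like other non-parametric methods, Spearman's rank correlation coefficient is less sensitive to outlying values and more robust.
Odds ratio (OR): can be used only with dichotomous variables. Odds are the probability of an event happening divided by the probability of it not happening.
Bar chart: plots the frequency of a nominal variable (such as eye colour) or an ordinal variable (such as agreement with a survey question: “disagree strongly, disagree, agree, agree strongly”).
The answer is therefore D (Correlation).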
Adapted from: BMJ Statistical Question and the official website of instructor 鄭衛軍,「醫學論文與統計分析」(Medical Papers and Statistical Analysis)