Interrater reliability: the kappa statistic

2012 | Mary L. McHugh
The kappa statistic measures interrater reliability: the degree to which raters agree on the scores they assign to the same variable. Unlike percent agreement, which simply counts the proportion of identical ratings, kappa adjusts for the agreement that would be expected by chance. Cohen's kappa ranges from -1 to +1, with higher values indicating greater agreement. Although kappa is widely used, it has limitations, such as its sensitivity to assumptions about rater independence and the possibility that it misrepresents the actual level of agreement. Cohen suggested interpretations for kappa values, but these may be too lenient for healthcare research. Percent agreement is also commonly reported, but it can overestimate true agreement because it makes no correction for chance.

Researchers are advised to calculate both percent agreement and kappa when assessing interrater reliability. Kappa is particularly useful when raters may be guessing, while percent agreement may be sufficient when raters are well trained. Confidence intervals around kappa provide additional information about the precision of the estimate. Overall, both percent agreement and kappa have strengths and limitations, and researchers should consider both measures to ensure accurate and reliable data collection.
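For concreteness, Cohen's kappa is computed as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's marginal distribution of scores. The short Python sketch below illustrates both percent agreement and kappa for two hypothetical raters scoring the same ten items on a binary scale; the ratings are invented for illustration and are not data from the article.

```python
# Minimal sketch: percent agreement and Cohen's kappa for two raters
# scoring the same items on a binary scale. Ratings are illustrative only.
from collections import Counter

rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater_b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

n = len(rater_a)

# Percent agreement: proportion of items given the same score by both raters.
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: for each category, multiply the raters' marginal
# proportions, then sum over categories.
counts_a = Counter(rater_a)
counts_b = Counter(rater_b)
categories = set(rater_a) | set(rater_b)
p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

# Cohen's kappa: observed agreement corrected for chance agreement.
kappa = (p_o - p_e) / (1 - p_e)

print(f"Percent agreement:  {p_o:.2f}")   # 0.80 for this example
print(f"Chance agreement:   {p_e:.2f}")   # 0.52 for this example
print(f"Cohen's kappa:      {kappa:.2f}") # about 0.58 for this example
```

The example shows why the two measures can tell different stories: the raters agree on 80% of items, but once chance agreement (0.52) is removed, kappa drops to roughly 0.58, a more conservative estimate of reliability.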