Performance Metric Distribution Characteristics of Medical School Exam Items

Exams are used to measure student progress and subject mastery. Exam item performance can be assessed by item Difficulty, Discrimination Index (DI), and Point Biserial (PB). A previous investigation descriptively characterized these metrics for 62 exams at the Oklahoma State University College of Osteopathic Medicine; however, their distribution characteristics remain unknown. The primary objective of this study was to assess the normality of item Difficulty, DI, and PB for these 62 exams.

Using R (version 4.0.2) and RStudio (version 1.3.959), we assessed normality both graphically and numerically, using Q-Q plots (ggqqplot) and the Shapiro-Wilk normality test (ggpubr), with p-values adjusted using the Benjamini-Hochberg procedure.
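The multiple-comparison adjustment described above can be illustrated with a minimal sketch. It is written in plain Python rather than the R workflow the authors used, and it covers only the Benjamini-Hochberg step applied to a list of per-exam Shapiro-Wilk p-values. The function names, the example p-values, and the 0.05 significance threshold are assumptions made for illustration; none of them come from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, ordered from largest to smallest p-value.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        candidate = p_values[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds the one
        # computed for the next-larger raw p-value.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


def percent_non_normal(p_values, alpha=0.05):
    """Percentage of exams whose adjusted p-value falls below alpha.

    alpha=0.05 is an assumed threshold, not one stated in the abstract.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    adjusted = benjamini_hochberg(p_values)
    flagged = sum(1 for p in adjusted if p < alpha)
    return 100.0 * flagged / len(p_values)


# Hypothetical Shapiro-Wilk p-values for five exams (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_adjusted = benjamini_hochberg(example_p)
example_percent = percent_non_normal(example_p)
```

In this hypothetical example, four of the five raw p-values fall below 0.05, but only two remain below it after adjustment, which shows why the adjusted values rather than the raw ones would be used to count exams that deviate from normality.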

For item Difficulty, 93.4% of exams had statistically significant deviations from normality. For DI, 63.9% of exams had statistically significant deviations from normality. For PB, 11.5% of exams had statistically significant deviations from normality.

Our results suggest that item performance indicators differ markedly in their distribution characteristics within our sample. Our findings support the use of inferential statistics that rely on the assumption of normality for PB, but not for item Difficulty or DI. These results may be useful for curriculum directors and test writers.



Nicholas Sajjadi

Lindsay Terry








