Item Response Theory

Item Response Theory (IRT) was developed so that different tests can yield comparable estimates of student ability, e.g. so that the November and May versions of the SAT give scores on the same scale in spite of having different items (questions).  Whereas classical test theory treats the whole test instrument as a unit, IRT separates the test into individual items (questions demanding a response) taken by individual students.  It then assigns an ability parameter to each student, and difficulty and discrimination parameters to each item (discrimination measures how sharply an item separates low-ability from high-ability students).
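
To make this parameterization concrete, here is a minimal sketch of the two-parameter logistic (2PL) item response function, one of the standard IRT models; the function name and numbers are illustrative, not taken from the papers below.

    import math

    def p_correct(theta, a, b):
        # 2PL item response function: the probability that a student of
        # ability theta answers an item of difficulty b correctly, where a
        # (discrimination) controls how sharply the item separates students
        # of low and high ability.
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # A student whose ability equals the item's difficulty succeeds half
    # the time, regardless of how discriminating the item is.
    print(p_correct(0.0, 1.2, 0.0))  # 0.5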

LPW08 Measuring student learning with item response theory, Y.-J. Lee, D. J. Palazzo, R. Warnakulasooriya, and D. E. Pritchard, Phys. Rev. ST Phys. Educ. Res. 4, 010102 (2008)

Whereas IRT is designed for a one-time test, we define a current ability (e.g. the ability estimated midway through a semester).  We used it to investigate the effect that wrong answers, wrong answers with associated specific hints, and hints have on the change of a student's ability on that problem.  We find that students who examine hints prior to answering (at some sacrifice in credit) then answer at a level nearly two standard deviations above their current ability.
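
As an illustration of the estimation step behind a "current ability", here is a minimal sketch (not the authors' code) of a maximum-likelihood ability estimate under the 2PL model, assuming the item parameters have already been calibrated:

    import math

    def ability_mle(responses, items, lo=-4.0, hi=4.0, tol=1e-6):
        # Maximum-likelihood estimate of ability theta, given 0/1 responses
        # and known 2PL item parameters (a = discrimination, b = difficulty).
        # The score function d(log-likelihood)/d(theta) = sum of a*(u - p)
        # is monotone decreasing in theta, so bisection finds its zero.
        def score(theta):
            s = 0.0
            for u, (a, b) in zip(responses, items):
                p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
                s += a * (u - p)
            return s
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if score(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Three items of increasing difficulty; the first two answered correctly.
    print(ability_mle([1, 1, 0], [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]))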

BDK12 Model-Based Collaborative Filtering Analysis of Student Response Data: Machine-Learning Item Response Theory, Y. Bergner, S. Dröschler, G. Kortemeyer, S. Rayyan, D. Seaton, and D. E. Pritchard, Proceedings of the 5th International Conference on Educational Data Mining, 95-102 (2012)

We show that standard collaborative filtering software is useful for analyzing the results of concept tests as well as homework scores, and that it recovers several common IRT models for particular choices of its parameters.  We show several examples of tests and assignments that are two-dimensional, i.e. that require two latent skill dimensions to fit the response data.
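
The connection can be sketched as logistic matrix factorization of the students-by-items response matrix: with a one-dimensional latent trait the model has the same form as 2PL IRT, and two dimensions give the multidimensional fits mentioned above.  The following is an illustrative sketch under those assumptions, not the software used in the paper:

    import numpy as np

    def fit_logistic_mf(U, dim=2, lr=0.05, steps=2000, seed=0):
        # Factor a 0/1 response matrix U (students x items) through a
        # logistic link: P(correct) = sigmoid(theta . a - b).  With dim = 1
        # this has the form of a 2PL IRT model; dim = 2 accommodates
        # two-dimensional tests.
        rng = np.random.default_rng(seed)
        n_students, n_items = U.shape
        theta = rng.normal(0.0, 0.1, (n_students, dim))  # student abilities
        a = rng.normal(0.0, 0.1, (n_items, dim))         # item discriminations
        b = np.zeros(n_items)                            # item difficulties
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-(theta @ a.T - b)))  # predicted P(correct)
            g = U - p                  # log-likelihood gradient in the logit
            theta += lr * (g @ a) / n_items
            a += lr * (g.T @ theta) / n_students
            b -= lr * g.mean(axis=0)
        return theta, a, b

This uses plain gradient ascent with no regularization, kept minimal for clarity; a real fit would add priors or regularization and a convergence check.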

CAB11 Item Response Theory Analysis of the Mechanics Baseline Test, C. N. Cardamone, J. Abbot, A. Barrantes, A. Pawl, S. Rayyan, D. Seaton, R. Teodorescu, and D. E. Pritchard, Physics Education Research Conference 2011, AIP Conference Proceedings 1413, 135-138