TY - JOUR
T1 - A Comparison of Constraint Programming and Mixed-Integer Programming for Automated Test-Form Generation
JF - Journal of Educational Measurement
Y1 - 2018
A1 - Li, Jie
A1 - van der Linden, Wim J.
AB - The final step of the typical process of developing educational and psychological tests is to place the selected test items in a formatted form. The step involves the grouping and ordering of the items to meet a variety of formatting constraints. As this activity tends to be time-intensive, the use of mixed-integer programming (MIP) has been proposed to automate it. The goal of this article is to show how constraint programming (CP) can be used as an alternative to automate test-form generation problems with a large variety of formatting constraints, and how it compares with MIP-based form generation as for its models, solutions, and running times. Two empirical examples are presented: (i) automated generation of a computerized fixed-form; and (ii) automated generation of shadow tests for multistage testing. Both examples show that CP works well with feasible solutions and running times likely to be better than that for MIP-based applications.
VL - 55
UR - https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12187
ER -

TY - JOUR
T1 - Monitoring Items in Real Time to Enhance CAT Security
JF - Journal of Educational Measurement
Y1 - 2016
A1 - Zhang, Jinming
A1 - Li, Jie
AB - An IRT-based sequential procedure is developed to monitor items for enhancing test security. The procedure uses a series of statistical hypothesis tests to examine whether the statistical characteristics of each item under inspection have changed significantly during CAT administration. This procedure is compared with a previously developed CTT-based procedure through simulation studies. The results show that when the total number of examinees is fixed both procedures can control the rate of type I errors at any reasonable significance level by choosing an appropriate cutoff point and meanwhile maintain a low rate of type II errors. Further, the IRT-based method has a much lower type II error rate or more power than the CTT-based method when the number of compromised items is small (e.g., 5), which can be achieved if the IRT-based procedure can be applied in an active mode in the sense that flagged items can be replaced with new items.
VL - 53
UR - http://dx.doi.org/10.1111/jedm.12104
ER -

TY - JOUR
T1 - Optimal Reassembly of Shadow Tests in CAT
JF - Applied Psychological Measurement
Y1 - 2016
A1 - Choi, Seung W.
A1 - Moellering, Karin T.
A1 - Li, Jie
A1 - van der Linden, Wim J.
AB - Even in the age of abundant and fast computing resources, concurrency requirements for large-scale online testing programs still put an uninterrupted delivery of computer-adaptive tests at risk. In this study, to increase the concurrency for operational programs that use the shadow-test approach to adaptive testing, we explored various strategies aiming for reducing the number of reassembled shadow tests without compromising the measurement quality. Strategies requiring fixed intervals between reassemblies, a certain minimal change in the interim ability estimate since the last assembly before triggering a reassembly, and a hybrid of the two strategies yielded substantial reductions in the number of reassemblies without degradation in the measurement accuracy. The strategies effectively prevented unnecessary reassemblies due to adapting to the noise in the early test stages. They also highlighted the practicality of the shadow-test approach by minimizing the computational load involved in its use of mixed-integer programming.
VL - 40
UR - http://apm.sagepub.com/content/40/7/469.abstract
ER -