TY - JOUR T1 - Latent Class Analysis of Recurrent Events in Problem-Solving Items JF - Applied Psychological Measurement Y1 - 2018 A1 - Haochen Xu A1 - Guanhua Fang A1 - Yunxiao Chen A1 - Jingchen Liu A1 - Zhiliang Ying AB - Computer-based assessment of complex problem-solving abilities is becoming increasingly popular. In such an assessment, the entire problem-solving process of an examinee is recorded, providing detailed information about the individual, such as behavioral patterns, speed, and learning trajectory. The problem-solving processes are recorded in a computer log file, which is a time-stamped documentation of events related to task completion. As opposed to cross-sectional response data from traditional tests, process data in log files are massive and irregularly structured, calling for effective exploratory data analysis methods. Motivated by a specific complex problem-solving item “Climate Control” in the 2012 Programme for International Student Assessment, the authors propose a latent class analysis approach to analyzing the events that occurred in the problem-solving processes. The exploratory latent class analysis yields meaningful latent classes. Simulation studies are conducted to evaluate the proposed approach. VL - 42 UR - https://doi.org/10.1177/0146621617748325 ER - TY - CONF T1 - A Large-Scale Progress Monitoring Application with Computerized Adaptive Testing T2 - IACAT 2017 Conference Y1 - 2017 A1 - Okan Bulut A1 - Damien Cormier KW - CAT KW - Large-Scale tests KW - Process monitoring AB -

Many conventional assessment tools are available to teachers in schools for monitoring student progress in a formative manner. The outcomes of these assessment tools are essential to teachers’ instructional modifications and schools’ data-driven educational strategies, such as using remedial activities and planning instructional interventions for students with learning difficulties. When measuring student progress toward instructional goals or outcomes, assessments should be not only considerably precise but also sensitive to individual change in learning. Unlike conventional paper-pencil assessments that are usually not appropriate for every student, computerized adaptive tests (CATs) are highly capable of estimating growth consistently with minimum and consistent error. Therefore, CATs can be used as a progress monitoring tool in measuring student growth.

This study focuses on an operational CAT assessment that has been used for measuring student growth in reading during the academic school year. The sample of this study consists of nearly 7 million students from the 1st grade to the 12th grade in the US. The students received a CAT-based reading assessment periodically during the school year. The purpose of these periodical assessments is to measure the growth in students’ reading achievement and identify the students who may need additional instructional support (e.g., academic interventions). Using real data, this study aims to address the following research questions: (1) How many CAT administrations are necessary to make psychometrically sound decisions about the need for instructional changes in the classroom or when to provide academic interventions?; (2) What is the ideal amount of time between CAT administrations to capture student growth for the purpose of producing meaningful decisions from assessment results?

To address these research questions, we first used the Theil-Sen estimator for robustly fitting a regression line to each student’s test scores obtained from a series of CAT administrations. Next, we used the conditional standard error of measurement (cSEM) from the CAT administrations to create an error band around the Theil-Sen slope (i.e., student growth rate). This process resulted in the normative slope values across all the grade levels. The optimal number of CAT administrations was established from grade-level regression results. The amount of time needed for progress monitoring was determined by calculating the amount of time required for a student to show growth beyond the median cSEM value for each grade level. The results showed that the normative slope values were the highest for lower grades and declined steadily as grade level increased. The results also suggested that the CAT-based reading assessment is most useful for grades 1 through 4, since most struggling readers requiring an intervention appear to be within this grade range. Because CAT yielded very similar cSEM values across administrations, the amount of error in the progress monitoring decisions did not seem to depend on the number of CAT administrations.

JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan UR - https://drive.google.com/open?id=1uGbCKenRLnqTxImX1fZicR2c7GRV6Udc ER - TY - JOUR T1 - Latent-Class-Based Item Selection for Computerized Adaptive Progress Tests JF - Journal of Computerized Adaptive Testing Y1 - 2017 A1 - van Buuren, Nikky A1 - Eggen, Theo J. H. M. KW - computerized adaptive progress test KW - item selection method KW - Kullback-Leibler information KW - Latent class analysis KW - log-odds scoring VL - 5 UR - http://iacat.org/jcat/index.php/jcat/article/view/62/29 IS - 2 ER - TY - JOUR T1 - Longitudinal Multistage Testing JF - Journal of Educational Measurement Y1 - 2013 A1 - Pohl, Steffi AB -

This article introduces longitudinal multistage testing (lMST), a special form of multistage testing (MST), as a method for adaptive testing in longitudinal large-scale studies. In lMST designs, test forms of different difficulty levels are used, whereas the values on a pretest determine the routing to these test forms. Since lMST allows for testing in paper and pencil mode, lMST may represent an alternative to conventional testing (CT) in assessments for which other adaptive testing designs are not applicable. In this article the performance of lMST is compared to CT in terms of test targeting as well as bias and efficiency of ability and change estimates. Using a simulation study, the effect of the stability of ability across waves, the difficulty level of the different test forms, and the number of link items between the test forms were investigated.

VL - 50 UR - http://dx.doi.org/10.1111/jedm.12028 ER - TY - CHAP T1 - Limiting item exposure for target difficulty ranges in a high-stakes CAT Y1 - 2009 A1 - Li, X. A1 - Becker, K. A1 - Gorham, J. A1 - Woo, A. CY - D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing. {PDF File, 1. N1 - MB} ER - TY - JOUR T1 - Logistics of collecting patient-reported outcomes (PROs) in clinical practice: an overview and practical examples JF - Quality of Life Research Y1 - 2009 A1 - Rose, M. A1 - Bezjak, A. AB - PURPOSE: Interest in collecting patient-reported outcomes (PROs), such as health-related quality of life (HRQOL), health status reports, and patient satisfaction, is on the rise, and practical aspects of collecting PROs in clinical practice are becoming more important. The purpose of this paper is to draw attention to a number of issues relevant for a successful integration of PRO measures into the daily work flow of busy clinical settings. METHODS: The paper summarizes the results from a breakout session held at an ISOQOL special topic conference for PRO measures in clinical practice in 2007. RESULTS: Different methodologies of collecting PROs are discussed, and the support needed for each methodology is highlighted. The discussion is illustrated by practical real-life examples from early adopters who administered paper-pencil or electronic PRO assessments (ePRO) for more than a decade. The paper also reports on new experiences with more recent technological developments, such as SmartPens and Computer Adaptive Tests (CATs), in daily practice. CONCLUSIONS: Methodological and logistical issues determine the resources needed for a successful integration of PRO measures into daily work flow procedures and significantly influence the usefulness of PRO data for clinical practice. 
VL - 18 SN - 0962-9343 (Print) N1 - Rose, Matthias; Bezjak, Andrea. Netherlands. Quality of life research: an international journal of quality of life aspects of treatment, care and rehabilitation. Qual Life Res. 2009 Feb;18(1):125-36. Epub 2009 Jan 20. ER - TY - JOUR T1 - Letting the CAT out of the bag: Comparing computer adaptive tests and an 11-item short form of the Roland-Morris Disability Questionnaire JF - Spine Y1 - 2008 A1 - Cook, K. F. A1 - Choi, S. W. A1 - Crane, P. K. A1 - Deyo, R. A. A1 - Johnson, K. L. A1 - Amtmann, D. KW - *Disability Evaluation KW - *Health Status Indicators KW - Adult KW - Aged KW - Aged, 80 and over KW - Back Pain/*diagnosis/psychology KW - Calibration KW - Computer Simulation KW - Diagnosis, Computer-Assisted/*standards KW - Humans KW - Middle Aged KW - Models, Psychological KW - Predictive Value of Tests KW - Questionnaires/*standards KW - Reproducibility of Results AB - STUDY DESIGN: A post hoc simulation of a computer adaptive administration of the items of a modified version of the Roland-Morris Disability Questionnaire. OBJECTIVE: To evaluate the effectiveness of adaptive administration of back pain-related disability items compared with a fixed 11-item short form. SUMMARY OF BACKGROUND DATA: Short form versions of the Roland-Morris Disability Questionnaire have been developed. An alternative to paper-and-pencil short forms is to administer items adaptively so that items are presented based on a person's responses to previous items. Theoretically, this allows precise estimation of back pain disability with administration of only a few items. MATERIALS AND METHODS: Data were gathered from 2 previously conducted studies of persons with back pain. An item response theory model was used to calibrate scores based on all items, items of a paper-and-pencil short form, and several computer adaptive tests (CATs). 
RESULTS: Correlations between each CAT condition and scores based on a 23-item version of the Roland-Morris Disability Questionnaire ranged from 0.93 to 0.98. Compared with an 11-item short form, an 11-item CAT produced scores that were significantly more highly correlated with scores based on the 23-item scale. CATs with even fewer items also produced scores that were highly correlated with scores based on all items. For example, scores from a 5-item CAT had a correlation of 0.93 with full scale scores. Seven- and 9-item CATs correlated at 0.95 and 0.97, respectively. A CAT with a standard-error-based stopping rule produced scores that correlated at 0.95 with full scale scores. CONCLUSION: A CAT-based back pain-related disability measure may be a valuable tool for use in clinical and research contexts. Use of CAT for other common measures in back pain research, such as other functional scales or measures of psychological distress, may offer similar advantages. VL - 33 SN - 1528-1159 (Electronic) N1 - Cook, Karon F; Choi, Seung W; Crane, Paul K; Deyo, Richard A; Johnson, Kurt L; Amtmann, Dagmar. 5 P60-AR48093/AR/United States NIAMS; 5U01AR052171-03/AR/United States NIAMS. Comparative Study; Research Support, N.I.H., Extramural. United States. Spine. Spine. 2008 May 20;33(12):1378-83. ER - TY - JOUR T1 - Local Dependence in an Operational CAT: Diagnosis and Implications JF - Journal of Educational Measurement Y1 - 2008 A1 - Pommerich, Mary A1 - Segall, Daniel O. AB -

The accuracy of CAT scores can be negatively affected by local dependence if the CAT utilizes parameters that are misspecified due to the presence of local dependence and/or fails to control for local dependence in responses during the administration stage. This article evaluates the existence and effect of local dependence in a test of Mathematics Knowledge. Diagnostic tools were first used to evaluate the existence of local dependence in items that were calibrated under a 3PL model. A simulation study was then used to evaluate the effect of local dependence on the precision of examinee CAT scores when the 3PL model was used for selection and scoring. The diagnostic evaluation showed strong evidence for local dependence. The simulation suggested that local dependence in parameters had a minimal effect on CAT score precision, while local dependence in responses had a substantial effect on score precision, depending on the degree of local dependence present.

VL - 45 UR - http://dx.doi.org/10.1111/j.1745-3984.2008.00061.x ER - TY - JOUR T1 - La Validez desde una óptica psicométrica [Validity from a psychometric perspective] JF - Acta Comportamentalia Y1 - 2005 A1 - Muñiz, J. KW - Factor Analysis KW - Measurement KW - Psychometrics KW - Scaling (Testing) KW - Statistical KW - Technology KW - Test Validity AB - El estudio de la validez constituye el eje central de los análisis psicométricos de los instrumentos de medida. En esta comunicación se traza una breve nota histórica de los distintos modos de concebir la validez a lo largo de los tiempos, se comentan las líneas actuales, y se tratan de vislumbrar posibles vías futuras, teniendo en cuenta el impacto que las nuevas tecnologías informáticas están ejerciendo sobre los propios instrumentos de medida en Psicología y Educación. Cuestiones como los nuevos formatos multimedia de los ítems, la evaluación a distancia, el uso intercultural de las pruebas, las consecuencias de su uso, o los tests adaptativos informatizados, reclaman nuevas formas de evaluar y conceptualizar la validez. También se analizan críticamente algunos planteamientos recientes sobre el concepto de validez. The study of validity constitutes a central axis of psychometric analyses of measurement instruments. This paper presents a historical sketch of different modes of conceiving validity, with commentary on current views, and it attempts to predict future lines of research by considering the impact of new computerized technologies on measurement instruments in psychology and education. Factors such as the new multimedia format of items, distance assessment, the intercultural use of tests, the consequences of the latter, or the development of computerized adaptive tests demand new ways of conceiving and evaluating validity. Some recent thoughts about the concept of validity are also critically analyzed. 
(PsycINFO Database Record (c) 2005 APA ) (journal abstract) VL - 13 ER - TY - CHAP T1 - A Learning Environment for English for Academic Purposes Based on Adaptive Tests and Task-Based Systems T2 - Intelligent Tutoring Systems Y1 - 2004 A1 - Gonçalves, Jean P. A1 - Aluisio, Sandra M. A1 - de Oliveira, Leandro H.M. A1 - Oliveira Jr., Osvaldo N. ED - Lester, James C. ED - Vicari, Rosa Maria ED - Paraguaçu, Fábio JF - Intelligent Tutoring Systems T3 - Lecture Notes in Computer Science PB - Springer Berlin / Heidelberg VL - 3220 SN - 978-3-540-22948-3 UR - http://dx.doi.org/10.1007/978-3-540-30139-4_1 ER - TY - CONF T1 - A learning environment for english for academic purposes based on adaptive tests and task-based systems T2 - Intelligent Tutoring Systems. Y1 - 2004 A1 - PITON-GONÇALVES, J. A1 - ALUISIO, S. M. A1 - MENDONCA, L. H. A1 - NOVAES, O. O. JF - Intelligent Tutoring Systems. PB - Springer Berlin Heidelberg ER - TY - JOUR T1 - La simulation d’un test adaptatif basé sur le modèle de Rasch [Simulation of a Rasch-based adaptive test] JF - Mesure et évaluation en éducation. Y1 - 2002 A1 - Raîche, G. N1 - (In French) {PDF file, 30 KB} ER - TY - CHAP T1 - Le testing adaptatif [Adaptive testing] Y1 - 2002 A1 - Raîche, G. CY - D. R. Bertrand and J.G. Blais (Eds) : Les théories modernes de la mesure [Modern theories of measurement]. Sainte-Foy: Presses de l’Université du Québec. N1 - (In French) {PDF file, 191 KB} ER - TY - BOOK T1 - La distribution d’échantillonnage en testing adaptatif en fonction de deux règles d’arrêt : selon l’erreur type et selon le nombre d’items administrés [Sampling distribution of the proficiency estimate in computerized adaptive testing according to two stopping... Y1 - 2000 A1 - Raîche, G. CY - Doctoral thesis, Montreal: University of Montreal N1 - . 
ER - TY - JOUR T1 - Lagrangian relaxation for constrained curve-fitting with binary variables: Applications in educational testing JF - Dissertation Abstracts International Section A: Humanities and Social Sciences Y1 - 2000 A1 - Koppel, N. B. KW - Analysis KW - Educational Measurement KW - Mathematical Modeling KW - Statistical AB - This dissertation offers a mathematical programming approach to curve fitting with binary variables. Various Lagrangian Relaxation (LR) techniques are applied to constrained curve fitting. Applications in educational testing with respect to test assembly are utilized. In particular, techniques are applied to both static exams (i.e. conventional paper-and-pencil (P&P)) and adaptive exams (i.e. a hybrid computerized adaptive test (CAT) called a multiple-forms structure (MFS)). This dissertation focuses on the development of mathematical models to represent these test assembly problems as constrained curve-fitting problems with binary variables and solution techniques for the test development. Mathematical programming techniques are used to generate parallel test forms with item characteristics based on item response theory. A binary variable is used to represent whether or not an item is present on a form. The problem of creating a test form is modeled as a network flow problem with additional constraints. In order to meet the target information and the test characteristic curves, a Lagrangian relaxation heuristic is applied to the problem. The Lagrangian approach works by multiplying the constraint by a "Lagrange multiplier" and adding it to the objective. By systematically varying the multiplier, the test form curves approach the targets. This dissertation explores modifications to Lagrangian Relaxation as it is applied to the classical paper-and-pencil exams. For the P&P exams, LR techniques are also utilized to include additional practical constraints to the network problem, which limit the item selection. 
An MFS is a type of computerized adaptive test. It is a hybrid of a standard CAT and a P&P exam. The concept of an MFS is introduced in this dissertation, as is the application of LR to constructing parallel MFSs. The approach is applied to the Law School Admission Test for the assembly of the conventional P&P test as well as an experimental computerized test using MFSs. (PsycINFO Database Record (c) 2005 APA ) VL - 61 ER - TY - BOOK T1 - Learning Potential Computerised Adaptive Test (LPCAT): Technical Manual Y1 - 2000 A1 - De Beer, M. CY - Pretoria: UNISA N1 - #deBE00-01 ER - TY - BOOK T1 - Learning Potential Computerised Adaptive Test (LPCAT): User's Manual Y1 - 2000 A1 - De Beer, M. CY - Pretoria: UNISA N1 - #deBE00-02 ER - TY - JOUR T1 - Limiting answer review and change on computerized adaptive vocabulary tests: Psychometric and attitudinal results JF - Journal of Educational Measurement Y1 - 2000 A1 - Vispoel, W. P. A1 - Hendrickson, A. B. A1 - Bleiler, T. VL - 37 ER - TY - JOUR T1 - Los tests adaptativos informatizados en la frontera del siglo XXI: Una revisión [Computerized adaptive tests at the turn of the 21st century: A review] JF - Metodología de las Ciencias del Comportamiento Y1 - 2000 A1 - Hontangas, P. A1 - Ponsoda, V. A1 - Olea, J. A1 - Abad, F. J. KW - computerized adaptive testing VL - 2 SN - 1575-9105 ER - TY - CONF T1 - Limiting answer review and change on computerized adaptive vocabulary tests: Psychometric and attitudinal results T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1999 A1 - Vispoel, W. P. A1 - Hendrickson, A. A1 - Bleiler, T. A1 - Widiatmo, H. A1 - Shrairi, S. A1 - Ihrig, D. 
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Montreal, Canada N1 - #VI99-01 ER - TY - ABST T1 - Linking scores for computer-adaptive and paper-and-pencil administrations of the SAT (Research Report No 97-12) Y1 - 1997 A1 - Lawrence, I. A1 - Feigenbaum, M. CY - Princeton NJ: Educational Testing Service N1 - #LA97-12 ER - TY - ABST T1 - La simulation de modèle sur ordinateur en tant que méthode de recherche : le cas concret de l’étude de la distribution d’échantillonnage de l’estimateur du niveau d’habileté en testing adaptatif en fonction de deux règles d’arrêt Y1 - 1994 A1 - Raîche, G. CY - Actes du 6e colloque de l‘Association pour la recherche au collégial. Montréal : Association pour la recherche au collégial, ARC ER - TY - CONF T1 - L'évaluation nationale individualisée et assistée par ordinateur [Large scale assessment: Tailored and computerized] T2 - Québec: Proceeding of the 14th Congress of the Association québécoise de pédagogie collégiale. Montréal: Association québécoise de pédagogie collégiale (AQPC). Y1 - 1994 A1 - Raîche, G. A1 - Béland, A. JF - Québec: Proceeding of the 14th Congress of the Association québécoise de pédagogie collégiale. Montréal: Association québécoise de pédagogie collégiale (AQPC). ER - TY - ABST T1 - Les tests adaptatifs en langue seconde Y1 - 1993 A1 - Laurier, M. CY - Communication lors de la 16e session d’étude de l’ADMÉÉ à Laval. Montréal: Association pour le développement de la mesure et de l’évaluation en éducation. ER - TY - JOUR T1 - Linking the standard and advanced forms of the Ravens Progressive Matrices in both the paper-and-pencil and computer-adaptive-testing formats JF - Educational and Psychological Measurement Y1 - 1993 A1 - Styles, I. A1 - Andrich, D. VL - 53 ER - TY - ABST T1 - The Language Training Division's computer adaptive reading proficiency test Y1 - 1992 A1 - Janczewski, D. A1 - Lowe, P. 
CY - Provo, UT: Language Training Division, Office of Training and Education ER - TY - JOUR T1 - Le testing adaptatif avec interprétation critérielle, une expérience de praticabilité du TAM pour l’évaluation sommative des apprentissages au Québec. JF - Mesure et évaluation en éducation Y1 - 1992 A1 - Auger, R. ED - Seguin, S. P. VL - 15-1 et 2 ER - TY - JOUR T1 - Latent structure and item sampling models for testing JF - Annual Review of Psychology Y1 - 1985 A1 - Traub, R. E. A1 - Lam, Y. R. VL - 36 ER - TY - CONF T1 - Legal and political considerations in large-scale adaptive testing T2 - Paper presented at the 23rd conference of the Military Testing Association. Y1 - 1982 A1 - B. K. Waters A1 - Lee, G. C. JF - Paper presented at the 23rd conference of the Military Testing Association. ER - TY - ABST T1 - A live tailored testing comparison study of the one- and three-parameter logistic models (Research Report 78-1) Y1 - 1978 A1 - Koch, W. J. A1 - Reckase, M. D. CY - Columbia MO: University of Missouri, Department of Psychology ER - TY - CHAP T1 - A Low-Cost Terminal Usable for Computerized Adaptive Testing Y1 - 1977 A1 - Lamos, J. P. A1 - B. K. Waters CY - D. J. Weiss (Ed.), Proceedings of the 1977 Computerized Adaptive Testing Conference. Minneapolis MN: University of Minnesota, Department of Psychology, Psychometric Methods Program ER - TY - JOUR T1 - Le développement de l’intelligence chez les enfants JF - L’Année Psychologique Y1 - 1908 A1 - Binet, A. A1 - Simon, T. VL - 14 N1 - In French ER -