TY - JOUR T1 - Item Calibration Methods With Multiple Subscale Multistage Testing JF - Journal of Educational Measurement Y1 - 2020 A1 - Wang, Chun A1 - Chen, Ping A1 - Jiang, Shengyu KW - EM KW - marginal maximum likelihood KW - missing data KW - multistage testing AB - Many large-scale educational surveys have moved from linear form design to multistage testing (MST) design. One advantage of MST is that it can provide more accurate latent trait (θ) estimates using fewer items than required by linear tests. However, MST generates incomplete response data by design; hence, questions remain as to how to calibrate items using the incomplete data from an MST design. Further complications arise when there are multiple correlated subscales per test, and when items from different subscales need to be calibrated according to their respective score reporting metrics. The current calibration-per-subscale method produces biased item parameters, and no method has been available to resolve this challenge. Drawing on missing data principles, we showed that when all items are calibrated together, Rubin's ignorability assumption is satisfied, such that traditional single-group calibration is sufficient. When calibrating items per subscale, we proposed a simple modification to the current calibration-per-subscale method that helps reinstate the missing-at-random assumption and therefore corrects the estimation bias that otherwise exists. Three mainstream calibration methods are discussed in the context of MST: marginal maximum likelihood estimation, the expectation maximization method, and fixed parameter calibration. An extensive simulation study is conducted, and a real data example from NAEP is analyzed, to provide convincing empirical evidence. VL - 57 UR - https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12241 ER - TY - JOUR T1 - How Adaptive Is an Adaptive Test: Are All Adaptive Tests Adaptive? 
JF - Journal of Computerized Adaptive Testing Y1 - 2019 A1 - Mark Reckase A1 - Unhee Ju A1 - Sewon Kim KW - computerized adaptive test KW - multistage test KW - statistical indicators of amount of adaptation VL - 7 UR - http://iacat.org/jcat/index.php/jcat/article/view/69/34 IS - 1 ER - TY - CONF T1 - Bayesian Perspectives on Adaptive Testing T2 - IACAT 2017 Conference Y1 - 2017 A1 - Wim J. van der Linden A1 - Bingnan Jiang A1 - Hao Ren A1 - Seung W. Choi A1 - Qi Diao KW - Bayesian Perspective KW - CAT AB -

Although adaptive testing is usually treated from the perspective of maximum-likelihood parameter estimation and maximum-information item selection, a Bayesian perspective is more natural, statistically efficient, and computationally tractable. This observation holds not only for the core process of ability estimation but also for such processes as item calibration and real-time monitoring of item security. Key elements of the approach are parametric modeling of each relevant process, updating of the parameter estimates after the arrival of each new response, and optimal design of the next step.

The purpose of the symposium is to illustrate the role of Bayesian statistics in this approach. The first presentation discusses a basic Bayesian algorithm for the sequential update of any parameter in adaptive testing and illustrates the idea of Bayesian optimal design for the two processes of ability estimation and online item calibration. The second presentation generalizes the ideas to the case of adaptive testing with polytomous items. The third presentation uses the fundamental Bayesian idea of sampling from updated posterior predictive distributions (“multiple imputations”) to deal with the problem of scoring incomplete adaptive tests.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - CONF T1 - MHK-MST Design and the Related Simulation Study T2 - IACAT 2017 Conference Y1 - 2017 A1 - Ling Yuyu A1 - Zhou Chenglin A1 - Ren Jie KW - language testing KW - MHK KW - multistage testing AB -

The MHK is a national standardized exam that tests and rates Chinese language proficiency. It assesses non-native Chinese minorities’ abilities to use the Chinese language in their daily, academic, and professional lives. Computerized multistage adaptive testing (MST) combines features of the conventional paper-and-pencil (P&P) test and the item-level computerized adaptive test (CAT): it is delivered by computer and uses the item set, rather than the single item, as the unit of adaptation and scoring. MST estimates extreme ability values more accurately than conventional P&P tests and, like CAT, uses adaptivity to reduce test length and shorten score reporting time. MST is already used in several large-scale tests, such as the Uniform CPA Examination and the Graduate Record Examination (GRE). It is therefore worthwhile to develop MST applications in China.

Based on consideration of the MHK's characteristics and its future development, the researchers began with the design of the MHK-MST. This simulation study is conducted to validate the performance of the MHK-MST system. Real difficulty parameters of MHK items and simulated ability parameters of the candidates are used to generate the original score matrix, and the item modules are delivered to the candidates following the adaptive procedures set according to the path rules. This simulation study provides a sound basis for the implementation of MHK-MST.


JF - IACAT 2017 Conference PB - Niigata Seiryo University CY - Niigata, Japan ER - TY - JOUR T1 - Uncertainties in the Item Parameter Estimates and Robust Automated Test Assembly JF - Applied Psychological Measurement Y1 - 2013 A1 - Veldkamp, Bernard P. A1 - Matteucci, Mariagiulia A1 - de Jong, Martijn G. AB -

Item response theory parameters have to be estimated, and because of the estimation process, they do have uncertainty in them. In most large-scale testing programs, the parameters are stored in item banks, and automated test assembly algorithms are applied to assemble operational test forms. These algorithms treat item parameters as fixed values, and uncertainty is not taken into account. As a consequence, resulting tests might be off target or less informative than expected. In this article, the process of parameter estimation is described to provide insight into the causes of uncertainty in the item parameters. The consequences of uncertainty are studied. In addition, an alternative automated test assembly algorithm is presented that is robust against uncertainties in the data. Several numerical examples demonstrate the performance of the robust test assembly algorithm and illustrate the consequences of not taking this uncertainty into account. Finally, some recommendations about the use of robust test assembly and some directions for further research are given.

VL - 37 UR - http://apm.sagepub.com/content/37/2/123.abstract ER - TY - JOUR T1 - Comparison Between Dichotomous and Polytomous Scoring of Innovative Items in a Large-Scale Computerized Adaptive Test JF - Educational and Psychological Measurement Y1 - 2012 A1 - Jiao, H. A1 - Liu, J. A1 - Haynie, K. A1 - Woo, A. A1 - Gorham, J. AB -

This study explored the impact of partial credit scoring of one type of innovative items (multiple-response items) in a computerized adaptive version of a large-scale licensure pretest and operational test settings. The impacts of partial credit scoring on the estimation of the ability parameters and classification decisions in operational test settings were explored in one real data analysis and two simulation studies when two different polytomous scoring algorithms, automated polytomous scoring and rater-generated polytomous scoring, were applied. For the real data analyses, the ability estimates from dichotomous and polytomous scoring were highly correlated; the classification consistency between different scoring algorithms was nearly perfect. Information distribution changed slightly in the operational item bank. In the two simulation studies comparing each polytomous scoring with dichotomous scoring, the ability estimates resulting from polytomous scoring had slightly higher measurement precision than those resulting from dichotomous scoring. The practical impact related to classification decision was minor because of the extremely small number of items that could be scored polytomously in this current study.

VL - 72 ER - TY - JOUR T1 - Design of a Computer-Adaptive Test to Measure English Literacy and Numeracy in the Singapore Workforce: Considerations, Benefits, and Implications JF - Journal of Applied Testing Technology Y1 - 2011 A1 - Jacobsen, J. A1 - Ackermann, R. A1 - Egüez, J. A1 - Ganguli, D. A1 - Rickard, P. A1 - Taylor, L. AB -

A computer adaptive test (CAT) is a delivery methodology that serves the larger goals of the assessment system in which it is embedded. A thorough analysis of the assessment system for which a CAT is being designed is critical to ensure that the delivery platform is appropriate and addresses all relevant complexities. As such, a CAT engine must be designed to support the validity and reliability of the overall system. This design takes the form of adherence to the assessment goals and objectives of the adaptive assessment system. When the assessment is adapted for use in another country, consideration must be given to any necessary revisions, including content differences. This article addresses these considerations while drawing, in part, on the process followed in the development of the CAT delivery system designed to test English language workplace skills for the Singapore Workforce Development Agency. Topics include item creation and selection, calibration of the item pool, analysis and testing of the psychometric properties, and reporting and interpretation of scores. The characteristics and benefits of the CAT delivery system are detailed, as well as implications for testing programs considering the use of a CAT delivery system.

VL - 12 UR - http://www.testpublishers.org/journal-of-applied-testing-technology IS - 1 ER - TY - CONF T1 - Practitioner’s Approach to Identify Item Drift in CAT T2 - Annual Conference of the International Association for Computerized Adaptive Testing Y1 - 2011 A1 - Huijuan Meng A1 - Susan Steinkamp A1 - Paul Jones A1 - Joy Matthews-Lopez KW - CUSUM method KW - G2 statistic KW - IPA KW - item drift KW - item parameter drift KW - Lord's chi-square statistic KW - Raju's NCDIF JF - Annual Conference of the International Association for Computerized Adaptive Testing ER - TY - CONF T1 - Small-Sample Shadow Testing T2 - Annual Conference of the International Association for Computerized Adaptive Testing Y1 - 2011 A1 - Wallace Judd KW - CAT KW - shadow test JF - Annual Conference of the International Association for Computerized Adaptive Testing ER - TY - JOUR T1 - Development and validation of patient-reported outcome measures for sleep disturbance and sleep-related impairments JF - Sleep Y1 - 2010 A1 - Buysse, D. J. A1 - Yu, L. A1 - Moul, D. E. A1 - Germain, A. A1 - Stover, A. A1 - Dodds, N. E. A1 - Johnston, K. L. A1 - Shablesky-Cade, M. A. A1 - Pilkonis, P. A. KW - *Outcome Assessment (Health Care) KW - *Self Disclosure KW - Adult KW - Aged KW - Aged, 80 and over KW - Cross-Sectional Studies KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Male KW - Middle Aged KW - Psychometrics KW - Questionnaires KW - Reproducibility of Results KW - Sleep Disorders/*diagnosis KW - Young Adult AB - STUDY OBJECTIVES: To develop an archive of self-report questions assessing sleep disturbance and sleep-related impairments (SRI), to develop item banks from this archive, and to validate and calibrate the item banks using classic validation techniques and item response theory analyses in a sample of clinical and community participants. DESIGN: Cross-sectional self-report study. SETTING: Academic medical center and participant homes. 
PARTICIPANTS: One thousand nine hundred ninety-three adults recruited from an Internet polling sample and 259 adults recruited from medical, psychiatric, and sleep clinics. INTERVENTIONS: None. MEASUREMENTS AND RESULTS: This study was part of PROMIS (Patient-Reported Outcomes Information System), a National Institutes of Health Roadmap initiative. Self-report item banks were developed through an iterative process of literature searches, collecting and sorting items, expert content review, qualitative patient research, and pilot testing. Internal consistency, convergent validity, and exploratory and confirmatory factor analysis were examined in the resulting item banks. Factor analyses identified 2 preliminary item banks, sleep disturbance and SRI. Item response theory analyses and expert content review narrowed the item banks to 27 and 16 items, respectively. Validity of the item banks was supported by moderate to high correlations with existing scales and by significant differences in sleep disturbance and SRI scores between participants with and without sleep disorders. CONCLUSIONS: The PROMIS sleep disturbance and SRI item banks have excellent measurement properties and may prove to be useful for assessing general aspects of sleep and SRI with various groups of patients and interventions. VL - 33 SN - 0161-8105 (Print)0161-8105 (Linking) N1 - Buysse, Daniel JYu, LanMoul, Douglas EGermain, AnneStover, AngelaDodds, Nathan EJohnston, Kelly LShablesky-Cade, Melissa APilkonis, Paul AAR052155/AR/NIAMS NIH HHS/United StatesU01AR52155/AR/NIAMS NIH HHS/United StatesU01AR52158/AR/NIAMS NIH HHS/United StatesU01AR52170/AR/NIAMS NIH HHS/United StatesU01AR52171/AR/NIAMS NIH HHS/United StatesU01AR52177/AR/NIAMS NIH HHS/United StatesU01AR52181/AR/NIAMS NIH HHS/United StatesU01AR52186/AR/NIAMS NIH HHS/United StatesResearch Support, N.I.H., ExtramuralValidation StudiesUnited StatesSleepSleep. 2010 Jun 1;33(6):781-92. 
U2 - 2880437 ER - TY - JOUR T1 - Replenishing a computerized adaptive test of patient-reported daily activity functioning JF - Quality of Life Research Y1 - 2009 A1 - Haley, S. M. A1 - Ni, P. A1 - Jette, A. M. A1 - Tao, W. A1 - Moed, R. A1 - Meyers, D. A1 - Ludlow, L. H. KW - *Activities of Daily Living KW - *Disability Evaluation KW - *Questionnaires KW - *User-Computer Interface KW - Adult KW - Aged KW - Cohort Studies KW - Computer-Assisted Instruction KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods AB - PURPOSE: Computerized adaptive testing (CAT) item banks may need to be updated, but before new items can be added, they must be linked to the previous CAT. The purpose of this study was to evaluate 41 pretest items prior to including them into an operational CAT. METHODS: We recruited 6,882 patients with spine, lower extremity, upper extremity, and nonorthopedic impairments who received outpatient rehabilitation in one of 147 clinics across 13 states of the USA. Forty-one new Daily Activity (DA) items were administered along with the Activity Measure for Post-Acute Care Daily Activity CAT (DA-CAT-1) in five separate waves. We compared the scoring consistency with the full item bank, test information function (TIF), person standard errors (SEs), and content range of the DA-CAT-1 to the new CAT (DA-CAT-2) with the pretest items by real data simulations. RESULTS: We retained 29 of the 41 pretest items. Scores from the DA-CAT-2 were more consistent (ICC = 0.90 versus 0.96) than DA-CAT-1 when compared with the full item bank. TIF and person SEs were improved for persons with higher levels of DA functioning, and ceiling effects were reduced from 16.1% to 6.1%. CONCLUSIONS: Item response theory and online calibration methods were valuable in improving the DA-CAT. 
VL - 18 SN - 0962-9343 (Print)0962-9343 (Linking) N1 - Haley, Stephen MNi, PengshengJette, Alan MTao, WeiMoed, RichardMeyers, DougLudlow, Larry HK02 HD45354-01/HD/NICHD NIH HHS/United StatesResearch Support, N.I.H., ExtramuralNetherlandsQuality of life research : an international journal of quality of life aspects of treatment, care and rehabilitationQual Life Res. 2009 May;18(4):461-71. Epub 2009 Mar 14. ER - TY - JOUR T1 - Adaptive short forms for outpatient rehabilitation outcome assessment JF - American Journal of Physical Medicine and Rehabilitation Y1 - 2008 A1 - Jette, A. M. A1 - Haley, S. M. A1 - Ni, P. A1 - Moed, R. KW - *Activities of Daily Living KW - *Ambulatory Care Facilities KW - *Mobility Limitation KW - *Treatment Outcome KW - Disabled Persons/psychology/*rehabilitation KW - Female KW - Humans KW - Male KW - Middle Aged KW - Questionnaires KW - Rehabilitation Centers AB - OBJECTIVE: To develop outpatient Adaptive Short Forms for the Activity Measure for Post-Acute Care item bank for use in outpatient therapy settings. DESIGN: A convenience sample of 11,809 adults with spine, lower limb, upper limb, and miscellaneous orthopedic impairments who received outpatient rehabilitation in 1 of 127 outpatient rehabilitation clinics in the United States. We identified optimal items for use in developing outpatient Adaptive Short Forms based on the Basic Mobility and Daily Activities domains of the Activity Measure for Post-Acute Care item bank. Patient scores were derived from the Activity Measure for Post-Acute Care computerized adaptive testing program. Items were selected for inclusion on the Adaptive Short Forms based on functional content, range of item coverage, measurement precision, item exposure rate, and data collection burden. 
RESULTS: Two outpatient Adaptive Short Forms were developed: (1) an 18-item Basic Mobility Adaptive Short Form and (2) a 15-item Daily Activities Adaptive Short Form, derived from the same item bank used to develop the Activity Measure for Post-Acute Care computerized adaptive testing program. Both Adaptive Short Forms achieved acceptable psychometric properties. CONCLUSIONS: In outpatient postacute care settings where computerized adaptive testing outcome applications are currently not feasible, item response theory-derived Adaptive Short Forms provide the efficient capability to monitor patients' functional outcomes. The development of Adaptive Short Form functional outcome instruments linked by a common, calibrated item bank has the potential to create a bridge to outcome monitoring across postacute care settings and can facilitate the eventual transformation from Adaptive Short Forms to computerized adaptive testing applications easier and more acceptable to the rehabilitation community. VL - 87 SN - 1537-7385 (Electronic) N1 - Jette, Alan MHaley, Stephen MNi, PengshengMoed, RichardK02 HD45354-01/HD/NICHD NIH HHS/United StatesR01 HD43568/HD/NICHD NIH HHS/United StatesResearch Support, N.I.H., ExtramuralResearch Support, U.S. Gov't, Non-P.H.S.Research Support, U.S. Gov't, P.H.S.United StatesAmerican journal of physical medicine & rehabilitation / Association of Academic PhysiatristsAm J Phys Med Rehabil. 2008 Oct;87(10):842-52. ER - TY - JOUR T1 - Computerized adaptive testing for follow-up after discharge from inpatient rehabilitation: II. Participation outcomes JF - Archives of Physical Medicine and Rehabilitation Y1 - 2008 A1 - Haley, S. M. A1 - Gandek, B. A1 - Siebens, H. A1 - Black-Schaffer, R. M. A1 - Sinclair, S. J. A1 - Tao, W. A1 - Coster, W. J. A1 - Ni, P. A1 - Jette, A. M. 
KW - *Activities of Daily Living KW - *Adaptation, Physiological KW - *Computer Systems KW - *Questionnaires KW - Adult KW - Aged KW - Aged, 80 and over KW - Chi-Square Distribution KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Longitudinal Studies KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Patient Discharge KW - Prospective Studies KW - Rehabilitation/*standards KW - Subacute Care/*standards AB - OBJECTIVES: To measure participation outcomes with a computerized adaptive test (CAT) and compare CAT and traditional fixed-length surveys in terms of score agreement, respondent burden, discriminant validity, and responsiveness. DESIGN: Longitudinal, prospective cohort study of patients interviewed approximately 2 weeks after discharge from inpatient rehabilitation and 3 months later. SETTING: Follow-up interviews conducted in patient's home setting. PARTICIPANTS: Adults (N=94) with diagnoses of neurologic, orthopedic, or medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Participation domains of mobility, domestic life, and community, social, & civic life, measured using a CAT version of the Participation Measure for Postacute Care (PM-PAC-CAT) and a 53-item fixed-length survey (PM-PAC-53). RESULTS: The PM-PAC-CAT showed substantial agreement with PM-PAC-53 scores (intraclass correlation coefficient, model 3,1, .71-.81). On average, the PM-PAC-CAT was completed in 42% of the time and with only 48% of the items as compared with the PM-PAC-53. Both formats discriminated across functional severity groups. The PM-PAC-CAT had modest reductions in sensitivity and responsiveness to patient-reported change over a 3-month interval as compared with the PM-PAC-53. 
CONCLUSIONS: Although continued evaluation is warranted, accurate estimates of participation status and responsiveness to change for group-level analyses can be obtained from CAT administrations, with a sizeable reduction in respondent burden. VL - 89 SN - 1532-821X (Electronic)0003-9993 (Linking) N1 - Haley, Stephen MGandek, BarbaraSiebens, HilaryBlack-Schaffer, Randie MSinclair, Samuel JTao, WeiCoster, Wendy JNi, PengshengJette, Alan MK02 HD045354-01A1/HD/NICHD NIH HHS/United StatesK02 HD45354-01/HD/NICHD NIH HHS/United StatesR01 HD043568/HD/NICHD NIH HHS/United StatesR01 HD043568-01/HD/NICHD NIH HHS/United StatesResearch Support, N.I.H., ExtramuralUnited StatesArchives of physical medicine and rehabilitationArch Phys Med Rehabil. 2008 Feb;89(2):275-83. U2 - 2666330 ER - TY - JOUR T1 - Item exposure control in a-stratified computerized adaptive testing JF - Psychological Testing Y1 - 2008 A1 - Jhu, Y.-J., A1 - Chen, S-Y. VL - 55 ER - TY - JOUR T1 - Letting the CAT out of the bag: Comparing computer adaptive tests and an 11-item short form of the Roland-Morris Disability Questionnaire JF - Spine Y1 - 2008 A1 - Cook, K. F. A1 - Choi, S. W. A1 - Crane, P. K. A1 - Deyo, R. A. A1 - Johnson, K. L. A1 - Amtmann, D. KW - *Disability Evaluation KW - *Health Status Indicators KW - Adult KW - Aged KW - Aged, 80 and over KW - Back Pain/*diagnosis/psychology KW - Calibration KW - Computer Simulation KW - Diagnosis, Computer-Assisted/*standards KW - Humans KW - Middle Aged KW - Models, Psychological KW - Predictive Value of Tests KW - Questionnaires/*standards KW - Reproducibility of Results AB - STUDY DESIGN: A post hoc simulation of a computer adaptive administration of the items of a modified version of the Roland-Morris Disability Questionnaire. OBJECTIVE: To evaluate the effectiveness of adaptive administration of back pain-related disability items compared with a fixed 11-item short form. 
SUMMARY OF BACKGROUND DATA: Short form versions of the Roland-Morris Disability Questionnaire have been developed. An alternative to paper-and-pencil short forms is to administer items adaptively so that items are presented based on a person's responses to previous items. Theoretically, this allows precise estimation of back pain disability with administration of only a few items. MATERIALS AND METHODS: Data were gathered from 2 previously conducted studies of persons with back pain. An item response theory model was used to calibrate scores based on all items, items of a paper-and-pencil short form, and several computer adaptive tests (CATs). RESULTS: Correlations between each CAT condition and scores based on a 23-item version of the Roland-Morris Disability Questionnaire ranged from 0.93 to 0.98. Compared with an 11-item short form, an 11-item CAT produced scores that were significantly more highly correlated with scores based on the 23-item scale. CATs with even fewer items also produced scores that were highly correlated with scores based on all items. For example, scores from a 5-item CAT had a correlation of 0.93 with full scale scores. Seven- and 9-item CATs correlated at 0.95 and 0.97, respectively. A CAT with a standard-error-based stopping rule produced scores that correlated at 0.95 with full scale scores. CONCLUSION: A CAT-based back pain-related disability measure may be a valuable tool for use in clinical and research contexts. Use of CAT for other common measures in back pain research, such as other functional scales or measures of psychological distress, may offer similar advantages. VL - 33 SN - 1528-1159 (Electronic) N1 - Cook, Karon FChoi, Seung WCrane, Paul KDeyo, Richard AJohnson, Kurt LAmtmann, Dagmar5 P60-AR48093/AR/United States NIAMS5U01AR052171-03/AR/United States NIAMSComparative StudyResearch Support, N.I.H., ExtramuralUnited StatesSpineSpine. 2008 May 20;33(12):1378-83. 
ER - TY - JOUR T1 - Severity of Organized Item Theft in Computerized Adaptive Testing: A Simulation Study JF - Applied Psychological Measurement Y1 - 2008 A1 - Yi, Qing A1 - Zhang, Jinming A1 - Chang, Hua-Hua AB -

Criteria had been proposed for assessing the severity of possible test security violations for computerized tests with high-stakes outcomes. However, these criteria resulted from theoretical derivations that assumed uniformly randomized item selection. This study investigated potential damage caused by organized item theft in computerized adaptive testing (CAT) for two realistic item selection methods, maximum item information and a-stratified with content blocking, using the randomized method as a baseline for comparison. Damage caused by organized item theft was evaluated by the number of compromised items each examinee could encounter and the impact of the compromised items on examinees' ability estimates. Severity of test security violation was assessed under self-organized and organized item theft simulation scenarios. Results indicated that though item theft could cause severe damage to CAT with either item selection method, the maximum item information method was more vulnerable to the organized item theft simulation than was the a-stratified method.

VL - 32 UR - http://apm.sagepub.com/content/32/7/543.abstract ER - TY - JOUR T1 - Computerized adaptive testing for measuring development of young children JF - Statistics in Medicine Y1 - 2007 A1 - Jacobusse, G. A1 - Buuren, S. KW - *Child Development KW - *Models, Statistical KW - Child, Preschool KW - Diagnosis, Computer-Assisted/*statistics & numerical data KW - Humans KW - Netherlands AB - Developmental indicators that are used for routine measurement in The Netherlands are usually chosen to optimally identify delayed children. Measurements on the majority of children without problems are therefore quite imprecise. This study explores the use of computerized adaptive testing (CAT) to monitor the development of young children. CAT is expected to improve the measurement precision of the instrument. We do two simulation studies - one with real data and one with simulated data - to evaluate the usefulness of CAT. It is shown that CAT selects developmental indicators that maximally match the individual child, so that all children can be measured to the same precision. VL - 26 SN - 0277-6715 (Print) N1 - Jacobusse, GertBuuren, Stef vanEnglandStatistics in medicineStat Med. 2007 Jun 15;26(13):2629-38. ER - TY - JOUR T1 - Prospective evaluation of the AM-PAC-CAT in outpatient rehabilitation settings JF - Physical Therapy Y1 - 2007 A1 - Jette, A. A1 - Haley, S. A1 - Tao, W. A1 - Ni, P. A1 - Moed, R. A1 - Meyers, D. A1 - Zurek, M. VL - 87 ER - TY - JOUR T1 - Comparison of the Psychometric Properties of Several Computer-Based Test Designs for Credentialing Exams With Multiple Purposes JF - Applied Measurement in Education Y1 - 2006 A1 - Jodoin, Michael G. A1 - Zenisky, April A1 - Hambleton, Ronald K. 
VL - 19 UR - http://www.tandfonline.com/doi/abs/10.1207/s15324818ame1903_3 ER - TY - JOUR T1 - Computer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank JF - Journal of Clinical Epidemiology Y1 - 2006 A1 - Haley, S. M. A1 - Ni, P. A1 - Hambleton, R. K. A1 - Slavin, M. D. A1 - Jette, A. M. KW - *Recovery of Function KW - Activities of Daily Living KW - Adolescent KW - Adult KW - Aged KW - Aged, 80 and over KW - Confidence Intervals KW - Factor Analysis, Statistical KW - Female KW - Humans KW - Male KW - Middle Aged KW - Outcome Assessment (Health Care)/*methods KW - Rehabilitation/*standards KW - Reproducibility of Results KW - Software AB - BACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. 
CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings. VL - 59 SN - 0895-4356 (Print) N1 - Haley, Stephen MNi, PengshengHambleton, Ronald KSlavin, Mary DJette, Alan MK02 hd45354-01/hd/nichdR01 hd043568/hd/nichdComparative StudyResearch Support, N.I.H., ExtramuralResearch Support, U.S. Gov't, Non-P.H.S.EnglandJournal of clinical epidemiologyJ Clin Epidemiol. 2006 Nov;59(11):1174-82. Epub 2006 Jul 11. ER - TY - JOUR T1 - [Item Selection Strategies of Computerized Adaptive Testing based on Graded Response Model.] JF - Acta Psychologica Sinica Y1 - 2006 A1 - Ping, Chen A1 - Shuliang, Ding A1 - Haijing, Lin A1 - Jie, Zhou KW - computerized adaptive testing KW - item selection strategy AB - Item selection strategy (ISS) is an important component of computerized adaptive testing (CAT). Its performance directly affects the security, efficiency, and precision of the test; thus, the ISS is one of the central issues in CATs based on the Graded Response Model (GRM). The goal of an ISS is to administer the next unused item remaining in the item bank that best fits the examinee's current ability estimate. In dichotomous IRT models, every item has only one difficulty parameter, and the item whose difficulty matches the examinee's current ability estimate is considered the best-fitting item. However, in GRM, each item has more than two ordered categories and no single value to represent its difficulty. Consequently, some researchers have employed the average or the median difficulty value across categories as the difficulty estimate for the item. 
Using the average value and the median value in effect introduced two corresponding ISSs. In this study, we used computer simulation to compare four ISSs based on the GRM. We also discussed the effect of a "shadow pool" on the uniformity of pool usage, as well as the influence of different item parameter distributions and different ability estimation methods on the evaluation criteria of CAT. In the simulation process, the Monte Carlo method was adopted to simulate the entire CAT process; 1,000 examinees drawn from a standard normal distribution and four 1,000-item pools with different item parameter distributions were also simulated. The simulation assumed that each polytomous item comprised six ordered categories. In addition, ability estimates were derived using two methods: expected a posteriori Bayesian (EAP) and maximum likelihood estimation (MLE). In MLE, the Newton-Raphson iteration method and the Fisher scoring iteration method were employed, respectively, to solve the likelihood equation. Moreover, the CAT process was simulated 30 times for each examinee to eliminate random error. The ISSs were evaluated by four indices commonly used in CAT, covering four aspects: the accuracy of ability estimation, the stability of the ISS, the usage of the item pool, and the test efficiency. Simulation results favored the ISS that matched the estimate of an examinee's current trait level with the difficulty values across categories. Setting a "shadow pool" in the ISS improved the uniformity of pool utilization. Finally, different distributions of the item parameters and different ability estimation methods affected the evaluation indices of CAT. (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Science Press: China VL - 38 SN - 0439-755X (Print) ER - TY - BOOK T1 - A comparison of adaptive mastery testing using testlets with the 3-parameter logistic model Y1 - 2005 A1 - Jacobs-Cassuto, M.S.
CY - Unpublished doctoral dissertation, University of Minnesota, Minneapolis, MN ER - TY - JOUR T1 - Contemporary measurement techniques for rehabilitation outcomes assessment JF - Journal of Rehabilitation Medicine Y1 - 2005 A1 - Jette, A. M. A1 - Haley, S. M. KW - *Disability Evaluation KW - Activities of Daily Living/classification KW - Disabled Persons/classification/*rehabilitation KW - Health Status Indicators KW - Humans KW - Outcome Assessment (Health Care)/*methods/standards KW - Recovery of Function KW - Research Support, N.I.H., Extramural KW - Research Support, U.S. Gov't, Non-P.H.S. KW - Sensitivity and Specificity KW - computerized adaptive testing AB - In this article, we review the limitations of traditional rehabilitation functional outcome instruments currently in use within the rehabilitation field to assess Activity and Participation domains as defined by the International Classification of Function, Disability, and Health. These include a narrow scope of functional outcomes, data incompatibility across instruments, and the precision vs feasibility dilemma. Following this, we illustrate how contemporary measurement techniques, such as item response theory methods combined with computer adaptive testing methodology, can be applied in rehabilitation to design functional outcome instruments that are comprehensive in scope, accurate, allow for compatibility across instruments, and are sensitive to clinically important change without sacrificing their feasibility. Finally, we present some of the pressing challenges that need to be overcome to provide effective dissemination and training assistance to ensure that current and future generations of rehabilitation professionals are familiar with and skilled in the application of contemporary outcomes measurement. VL - 37 N1 - 1650-1977 (Print). Journal Article. Review ER - TY - JOUR T1 - [Item characteristic curve equating under graded response models in IRT] JF - Acta Psychologica Sinica Y1 - 2005 A1 - Jun, Z.
A1 - Dongming, O. A1 - Shuyuan, X. A1 - Haiqi, D. A1 - Shuqing, Q. KW - graded response models KW - item characteristic curve KW - Item Response Theory AB - In the economist test, one of the largest qualification tests, item characteristic curve equating and an anchor-test equating design under graded response models in IRT were used to guarantee comparability across different years, construct an item bank, and prepare for computerized adaptive testing. These methods equated the item and ability parameters of five years of test data and succeeded in establishing an item bank. On this basis, the cut scores of different years were compared through equating, providing empirical support for setting the eligibility standard of the economist test. PB - Science Press: China VL - 37 SN - 0439-755X (Print) ER - TY - JOUR T1 - Activity outcome measurement for postacute care JF - Medical Care Y1 - 2004 A1 - Haley, S. M. A1 - Coster, W. J. A1 - Andres, P. L. A1 - Ludlow, L. H. A1 - Ni, P. A1 - Bond, T. L. A1 - Sinclair, S. J. A1 - Jette, A. M. KW - *Self Efficacy KW - *Sickness Impact Profile KW - Activities of Daily Living/*classification/psychology KW - Adult KW - Aftercare/*standards/statistics & numerical data KW - Aged KW - Boston KW - Cognition/physiology KW - Disability Evaluation KW - Factor Analysis, Statistical KW - Female KW - Human KW - Male KW - Middle Aged KW - Movement/physiology KW - Outcome Assessment (Health Care)/*methods/statistics & numerical data KW - Psychometrics KW - Questionnaires/standards KW - Rehabilitation/*standards/statistics & numerical data KW - Reproducibility of Results KW - Sensitivity and Specificity KW - Support, U.S. Gov't, Non-P.H.S. KW - Support, U.S. Gov't, P.H.S. AB - BACKGROUND: Efforts to evaluate the effectiveness of a broad range of postacute care services have been hindered by the lack of conceptually sound and comprehensive measures of outcomes.
It is critical to determine a common underlying structure before employing current methods of item equating across outcome instruments for future item banking and computer-adaptive testing applications. OBJECTIVE: To investigate the factor structure, reliability, and scale properties of items underlying the Activity domains of the International Classification of Functioning, Disability and Health (ICF) for use in postacute care outcome measurement. METHODS: We developed a 41-item Activity Measure for Postacute Care (AM-PAC) that assessed an individual's execution of discrete daily tasks in his or her own environment across major content domains as defined by the ICF. We evaluated the reliability and discriminant validity of the prototype AM-PAC in 477 individuals in active rehabilitation programs across 4 rehabilitation settings using factor analyses, tests of item scaling, internal consistency reliability analyses, Rasch item response theory modeling, residual component analysis, and modified parallel analysis. RESULTS: Results from an initial exploratory factor analysis produced 3 distinct, interpretable factors that accounted for 72% of the variance: Applied Cognition (44%), Personal Care & Instrumental Activities (19%), and Physical & Movement Activities (9%); these 3 activity factors were verified by a confirmatory factor analysis. Scaling assumptions were met for each factor in the total sample and across diagnostic groups. Internal consistency reliability was high for the total sample (Cronbach alpha = 0.92 to 0.94), and for specific diagnostic groups (Cronbach alpha = 0.90 to 0.95). Rasch scaling, residual factor, differential item functioning, and modified parallel analyses supported the unidimensionality and goodness of fit of each unique activity domain. 
CONCLUSIONS: This 3-factor model of the AM-PAC can form the conceptual basis for common-item equating and computer-adaptive applications, leading to a comprehensive system of outcome instruments for postacute care settings. VL - 42 N1 - 0025-7079. Journal Article. Multicenter Study ER - TY - CHAP T1 - Computerized adaptive testing and item banking Y1 - 2004 A1 - Bjorner, J. B. A1 - Kosinski, M. A1 - Ware, J. E., Jr. CY - P. M. Fayers and R. D. Hays (Eds.), Assessing Quality of Life. Oxford: Oxford University Press. N1 - {PDF file 371 KB} ER - TY - JOUR T1 - Computerized Adaptive Testing With Multiple-Form Structures JF - Applied Psychological Measurement Y1 - 2004 A1 - Armstrong, Ronald D. A1 - Jones, Douglas H. A1 - Koppel, Nicole B. A1 - Pashley, Peter J. AB -

A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of an examinee's answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible test forms, allowing test specialists the opportunity to review them before they are administered. Also, limiting the exposure of an individual MFS to a specific period of time can enhance test security. This article provides an overview of methods that have been developed to generate parallel MFSs. The approach is applied to the assembly of an experimental computerized Law School Admission Test (LSAT).

VL - 28 UR - http://apm.sagepub.com/content/28/3/147.abstract ER - TY - JOUR T1 - Computerized adaptive testing with multiple-form structures JF - Applied Psychological Measurement Y1 - 2004 A1 - Armstrong, R. D. A1 - Jones, D. H. A1 - Koppel, N. B. A1 - Pashley, P. J. KW - computerized adaptive testing KW - Law School Admission Test KW - multiple-form structure KW - testlets AB - A multiple-form structure (MFS) is an ordered collection or network of testlets (i.e., sets of items). An examinee's progression through the network of testlets is dictated by the correctness of an examinee's answers, thereby adapting the test to his or her trait level. The collection of paths through the network yields the set of all possible test forms, allowing test specialists the opportunity to review them before they are administered. Also, limiting the exposure of an individual MFS to a specific period of time can enhance test security. This article provides an overview of methods that have been developed to generate parallel MFSs. The approach is applied to the assembly of an experimental computerized Law School Admission Test (LSAT). (PsycINFO Database Record (c) 2007 APA, all rights reserved) PB - Sage Publications: US VL - 28 SN - 0146-6216 (Print) ER - TY - ABST T1 - An investigation of two combination procedures of SPRT for three-category decisions in computerized classification test Y1 - 2004 A1 - Jiao, H. A1 - Wang, S A1 - Lau, A CY - Paper presented at the annual meeting of the American Educational Research Association, San Diego CA N1 - {PDF file, 649 KB} ER - TY - Generic T1 - An investigation of two combination procedures of SPRT for three-category classification decisions in computerized classification test T2 - annual meeting of the American Educational Research Association Y1 - 2004 A1 - Jiao, H. 
A1 - Wang, S A1 - Lau, CA KW - computerized adaptive testing KW - Computerized classification testing KW - sequential probability ratio testing JF - annual meeting of the American Educational Research Association CY - San Antonio, Texas N1 - annual meeting of the American Educational Research Association, San Antonio ER - TY - ABST T1 - The effects of model misfit in computerized classification test Y1 - 2003 A1 - Jiao, H. A1 - Lau, A. C. CY - Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago IL N1 - {PDF file, 432 KB} ER - TY - JOUR T1 - The effects of model specification error in item response theory-based computerized classification test using sequential probability ratio test JF - Dissertation Abstracts International Section A: Humanities & Social Sciences Y1 - 2003 A1 - Jiao, H. AB - This study investigated the effects of model specification error on classification accuracy, error rates, and average test length in Item Response Theory (IRT) based computerized classification test (CCT) using sequential probability ratio test (SPRT) in making binary decisions from examinees' dichotomous responses. This study consisted of three sub-studies. In each sub-study, one of the three unidimensional dichotomous IRT models, the 1-parameter logistic (1PL), the 2-parameter logistic (2PL), and the 3-parameter logistic (3PL) model was set as the true model and the other two models were treated as the misfit models. Item pool composition, test length, and stratum depth were manipulated to simulate different test conditions. To ensure the validity of the study results, the true model based CCTs using the true and the recalibrated item parameters were compared first to study the effect of estimation error in item parameters in CCTs.
Then, the true model and the misfit model based CCTs were compared to accomplish the research goal. The results indicated that estimation error in item parameters did not affect classification results based on CCTs using SPRT. The effect of model specification error depended on the true model, the misfit model, and the item pool composition. When the 1PL or the 2PL IRT model was the true model, the use of another IRT model had little impact on the CCT results. When the 3PL IRT model was the true model, the use of the 1PL model raised the false positive error rates. The influence of using the 2PL instead of the 3PL model depended on the item pool composition. When the item discrimination parameters varied greatly from one, the use of the 2PL IRT model raised the false negative error rates to above the nominal level. In the simulated test conditions with test length and item exposure constraints, using a misfit model in CCTs most often affected the average test length. Its effects on error rates and classification accuracy were negligible. It was concluded that in CCTs using SPRT, IRT model selection and evaluation is indispensable. (PsycINFO Database Record (c) 2004 APA, all rights reserved). VL - 64 ER - TY - ABST T1 - A multidimensional IRT mechanism for better understanding adaptive test behavior Y1 - 2003 A1 - Jodoin, M. CY - Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago IL ER - TY - JOUR T1 - Psychometric properties of several computer-based test designs with ideal and constrained item pools JF - Dissertation Abstracts International: Section B: The Sciences & Engineering Y1 - 2003 A1 - Jodoin, M. G.
AB - The purpose of this study was to compare linear fixed-length test (LFT), multistage test (MST), and computer adaptive test (CAT) designs under three levels of item pool quality, two levels of match between test and item pool content specifications, two levels of test length, and several levels of exposure control expected to be practical for a number of testing programs. This design resulted in 132 conditions that were evaluated using a simulation study with 9,000 examinees on several measures: overall measurement precision, including reliability and the mean error and root mean squared error between true and estimated ability levels; classification precision, including decision accuracy, false positive and false negative rates, and kappa for cut scores corresponding to 30%, 50%, and 85% failure rates; and conditional measurement precision, using the conditional root mean squared error between true and estimated ability levels conditioned on 25 true ability levels. Test reliability, overall and conditional measurement precision, and classification precision increased with item pool quality and test length, and decreased as the match between the item pool and test specifications became less adequate. In addition, as the maximum exposure rate decreased and the type of exposure control implemented became more restrictive, test reliability, overall and conditional measurement precision, and classification precision decreased. Within item pool quality, match between test and item pool content specifications, test length, and exposure control, CAT designs showed superior psychometric properties as compared to MST designs, which in turn were superior to LFT designs. However, some caution is warranted in interpreting these results since the ability of the automated test assembly software to construct tests that met specifications was limited in conditions where pool usage was high.
The practical importance of the differences between test designs on the evaluation criteria studied is discussed with respect to the inferences test users seek to make from test scores and nonpsychometric factors that may be important in some testing programs. (PsycINFO Database Record (c) 2004 APA, all rights reserved). VL - 64 ER - TY - CONF T1 - Comparison of the psychometric properties of several computer-based test designs for credentialing exams T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Jodoin, M. A1 - Zenisky, A. L. A1 - Hambleton, R. K. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA N1 - {PDF file, 261 KB} ER - TY - CONF T1 - Impact of selected factors on the psychometric quality of credentialing examinations administered with a sequential testlet design T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2002 A1 - Hambleton, R. K. A1 - Jodoin, M. A1 - Zenisky, A. L. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - New Orleans LA ER - TY - CONF T1 - Reliability and decision accuracy of linear parallel form and multi stage tests with realistic and ideal item pools T2 - Paper presented at the International Conference on Computer-Based Testing and the Internet Y1 - 2002 A1 - Jodoin, M. G. JF - Paper presented at the International Conference on Computer-Based Testing and the Internet CY - Winchester, England ER - TY - CONF T1 - An investigation of the impact of items that exhibit mild DIF on ability estimation in CAT T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 2001 A1 - Jennings, J. A. A1 - Dodd, B. G. A1 - Fitzpatrick, S. 
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Seattle WA ER - TY - CHAP T1 - Computer-adaptive testing: A methodology whose time has come T2 - Development of Computerised Middle School Achievement Tests Y1 - 2000 A1 - Linacre, J. M. ED - Kang, U. ED - Jean, E. ED - Linacre, J. M. KW - computerized adaptive testing JF - Development of Computerised Middle School Achievement Tests PB - MESA CY - Chicago, IL. USA VL - 69 ER - TY - CONF T1 - A comparison of two methods of controlling item exposure in computerized adaptive testing T2 - Paper presented at the meeting of the American Educational Research Association. San Diego CA. Y1 - 1998 A1 - Tang, L. A1 - Jiang, H. A1 - Chang, Hua-Hua JF - Paper presented at the meeting of the American Educational Research Association. San Diego CA. ER - TY - ABST T1 - Computer adaptive testing – Approaches for item selection and measurement Y1 - 1998 A1 - Armstrong, R. D. A1 - Jones, D. H. CY - Rutgers Center for Operations Research, New Brunswick NJ ER - TY - CONF T1 - Computerized adaptive testing with multiple form structures T2 - Paper presented at the annual meeting of the Psychometric Society Y1 - 1998 A1 - Armstrong, R. D. A1 - Jones, D. H. A1 - Berliner, N. JF - Paper presented at the annual meeting of the Psychometric Society CY - Urbana, IL ER - TY - ABST T1 - The relationship between computer familiarity and performance on computer-based TOEFL test tasks (Research Report 98-08) Y1 - 1998 A1 - Taylor, C. A1 - Jamieson, J. A1 - Eignor, D. R. A1 - Kirsch, I. CY - Princeton NJ: Educational Testing Service ER - TY - CONF T1 - Assessing speededness in variable-length computer-adaptive tests T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1997 A1 - Bontempo, B A1 - Julian, E. R A1 - Gorham, J. L. 
JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL ER - TY - JOUR T1 - Evaluating an automatically scorable, open-ended response type for measuring mathematical reasoning in computer-adaptive tests Y1 - 1997 A1 - Bennett, R. E. A1 - Steffen, M. A1 - Singley, M.K. A1 - Morley, M. A1 - Jacquemin, D. ER - TY - CONF T1 - Mathematical programming approaches to computerized adaptive testing T2 - Paper presented at the annual meeting of the National Council on Measurement in Education Y1 - 1997 A1 - Jones, D. H. JF - Paper presented at the annual meeting of the National Council on Measurement in Education CY - Chicago IL ER - TY - JOUR T1 - Dispelling myths about the new NCLEX exam JF - Recruitment, Retention, and Restructuring Report Y1 - 1996 A1 - Johnson, S. H. KW - *Educational Measurement KW - *Licensure KW - Humans KW - Nursing Staff KW - Personnel Selection KW - United States AB - The new computerized NCLEX system is working well. Most new candidates, employers, and board of nursing representatives like the computerized adaptive testing system and the fast report of results. But, among the candidates themselves some myths have grown which cause them needless anxiety. VL - 9 N1 - Journal Article ER - TY - JOUR T1 - Shortfall of questions curbs use of computerized graduate exam JF - The Chronicle of Higher Education Y1 - 1995 A1 - Jacobson, R. L. ER - TY - JOUR T1 - Moving in a new direction: Computerized adaptive testing (CAT) JF - Nursing Management Y1 - 1993 A1 - Jones-Dickson, C. A1 - Dorsey, D. A1 - Campbell-Warnock, J. A1 - Fields, F. KW - *Computers KW - Accreditation/methods KW - Educational Measurement/*methods KW - Licensure, Nursing KW - United States VL - 24 SN - 0744-6314 (Print) N1 - Jones-Dickson, C; Dorsey, D; Campbell-Warnock, J; Fields, F. United States. Nursing Management. Nurs Manage. 1993 Jan;24(1):80, 82.
ER - TY - NEWS T1 - New computer technique seen producing a revolution in testing T2 - The Chronicle of Higher Education Y1 - 1993 A1 - Jacobson, R. L. JF - The Chronicle of Higher Education VL - 40 ER - TY - JOUR T1 - A comparison of self-adapted and computerized adaptive achievement tests JF - Journal of Educational Measurement Y1 - 1992 A1 - Wise, S. L. A1 - Plake, B. S. A1 - Johnson, P. L. A1 - Roos, L. L. VL - 29 ER - TY - ABST T1 - The Language Training Division's computer adaptive reading proficiency test Y1 - 1992 A1 - Janczewski, D. A1 - Lowe, P. CY - Provo, UT: Language Training Division, Office of Training and Education ER - TY - JOUR T1 - Correlates of examinee item choice behavior in self-adapted testing JF - Mid-Western Educational Researcher Y1 - 1991 A1 - Johnson, J. L. A1 - Roos, L. L. A1 - Wise, S. L. A1 - Plake, B. S. VL - 4 ER - TY - ABST T1 - An empirical study of a broad range test of verbal ability Y1 - 1980 A1 - Kreitzberg, C. B. A1 - Jones, D. J. CY - Princeton NJ: Educational Testing Service ER - TY - CHAP T1 - Parallel forms reliability and measurement accuracy comparison of adaptive and conventional testing strategies Y1 - 1980 A1 - Johnson, M. J. A1 - Weiss, D. J. CY - D. J. Weiss (Ed.), Proceedings of the 1979 Computerized Adaptive Testing Conference (pp. 16-34). Minneapolis: University of Minnesota, Department of Psychology, Psychometric Methods Program, Computerized Adaptive Testing Laboratory. N1 - {PDF file, 918 KB} ER - TY - CONF T1 - Student reaction to computerized adaptive testing in the classroom T2 - Paper presented at the 87th annual meeting of the American Psychological Association Y1 - 1979 A1 - Johnson, M. J.
JF - Paper presented at the 87th annual meeting of the American Psychological Association CY - New York N1 - #JO79-01 ER - TY - JOUR T1 - Bayesian tailored testing and the influence of item bank characteristics JF - Applied Psychological Measurement Y1 - 1977 A1 - Jensema, C J VL - 1 ER - TY - JOUR T1 - Bayesian Tailored Testing and the Influence of Item Bank Characteristics JF - Applied Psychological Measurement Y1 - 1977 A1 - Jensema, C J VL - 1 IS - 1 ER - TY - CHAP T1 - Bayesian tailored testing and the influence of item bank characteristics Y1 - 1976 A1 - Jensema, C J CY - C. K. Clark (Ed.), Proceedings of the First Conference on Computerized Adaptive Testing (pp. 82-89). Washington DC: U.S. Government Printing Office. N1 - {PDF file, 370 KB} ER - TY - JOUR T1 - An application of latent trait mental test theory JF - British Journal of Mathematical and Statistical Psychology Y1 - 1974 A1 - Jensema, C J VL - 27 N1 - #JE74029 ER - TY - ABST T1 - Computer-based adaptive testing models for the Air Force technical training environment: Phase I: Development of a computerized measurement system for Air Force technical Training Y1 - 1974 A1 - Hansen, D. N. A1 - Johnson, B. F. A1 - Fagan, R. L. A1 - Tan, P. A1 - Dick, W. CY - JSAS Catalogue of Selected Documents in Psychology, 5, 1-86 (MS No. 882). AFHRL Technical Report 74-48. ER - TY - JOUR T1 - The validity of Bayesian tailored testing JF - Educational and Psychological Measurement Y1 - 1974 A1 - Jensema, C J VL - 34 ER - TY - CHAP T1 - Computer-based psychological testing Y1 - 1973 A1 - Jones, D. A1 - Weinman, J. CY - A. Elithorn and D. Jones (Eds.), Artificial and human thinking (pp. 83-93). San Francisco CA: Jossey-Bass. ER - TY - ABST T1 - An application of latent trait mental test theory to the Washington Pre-College Testing Battery Y1 - 1972 A1 - Jensema, C J CY - Unpublished doctoral dissertation, University of Washington N1 - #JE72-01 ER -