01609nas a2200205 4500008004500000022001400045245005000059210004900109300001000158490000600168520099300174653003501167653003401202653002301236653001901259653002701278100002301305700001501328856006001343 2019 Engldsh a2165-659200aTime-Efficient Adaptive Measurement of Change0 aTimeEfficient Adaptive Measurement of Change a15-340 v73 a
The adaptive measurement of change (AMC) refers to the use of computerized adaptive testing (CAT) at multiple occasions to efficiently assess a respondent’s improvement, decline, or sameness from occasion to occasion. Whereas previous AMC research focused on administering the most informative item to a respondent at each stage of testing, the current research proposes the use of Fisher information per time unit as an item selection procedure for AMC. The latter procedure incorporates not only the amount of information provided by a given item but also the expected amount of time required to complete it. In a simulation study, the use of Fisher information per time unit item selection resulted in a lower false positive rate in the majority of conditions studied, and a higher true positive rate in all conditions studied, compared to item selection via Fisher information without accounting for the expected time taken. Future directions of research are suggested.
10aadaptive measurement of change10acomputerized adaptive testing10aFisher information10aitem selection10aresponse-time modeling1 aFinkelman, Matthew1 aWang, Chun uhttp://iacat.org/jcat/index.php/jcat/article/view/73/3501722nas a2200133 4500008004100000245009200041210006900133260005500202520120200257653000801459653001601467100001801483856008701501 2017 eng d00aA Comparison of Three Empirical Reliability Estimates for Computerized Adaptive Testing0 aComparison of Three Empirical Reliability Estimates for Computer aNiigata, JapanbNiigata Seiryo Universityc08/20173 aReliability estimates in Computerized Adaptive Testing (CAT) are derived from estimated thetas and standard error of estimated thetas. In practical, the observed standard error (OSE) of estimated thetas can be estimated by test information function for each examinee with respect to Item response theory (IRT). Unlike classical test theory (CTT), OSEs in IRT are conditional values given each estimated thetas so that those values should be marginalized to consider test reliability. Arithmetic mean, Harmonic mean, and Jensen equality were applied to marginalize OSEs to estimate CAT reliability. Based on different marginalization method, three empirical CAT reliabilities were compared with true reliabilities. Results showed that three empirical CAT reliabilities were underestimated compared to true reliability in short test length (< 40), whereas the magnitude of CAT reliabilities was followed by Jensen equality, Harmonic mean, and Arithmetic mean in long test length (> 40). Specifically, Jensen equality overestimated true reliability across all conditions in long test length (>50).
10aCAT10aReliability1 aSeo, Dong, Gi uhttps://drive.google.com/file/d/1gXgH-epPIWJiE0LxMHGiCAxZZAwy4dAH/view?usp=sharing03464nas a2200145 4500008004100000245004500041210004500086260005500131520301500186653000803201653001803209653001603227100001403243856006103257 2017 eng d00aItem Response Time on Task Effect in CAT0 aItem Response Time on Task Effect in CAT aNiigata, JapanbNiigata Seiryo Universityc08/20173 aIntroduction. In addition to reduced test length and increased measurement efficiency, computerized adaptive testing (CAT) can provide new insights into the cognitive process of task completion that cannot be mined via conventional tests. Response time is a primary characteristic of the task completion procedure. It has the potential to inform us about underlying processes. In this study, the relationship between response time and response accuracy will be investigated.
Hypothesis. The present study argues that the relationship between response time on task and response accuracy, which may be positive, negative, or curvilinear, depends on the cognitive nature of the task items when the ability of the subjects and the difficulty of the items are held constant. The interpretations of these associations are not uniform either.
Research question. Is there a homogeneous effect of response time on test outcome across Graduate
Proposed explanations. If the accuracy of cognitive test responses decreases with response time, then it is an indication that the underlying cognitive process is a degrading process such as knowledge retrieval. More accessible knowledge can be retrieved faster than less accessible knowledge. It is inherent to knowledge retrieval that the success rate declines with elapsing response time. For instance, in reading tasks, the time on task effect is negative and the more negative, the easier a task is. However, if the accuracy of cognitive test responses increases with response time, then the process is of an upgrading nature, with an increasing success rate as a function of response time. For example, problem-solving takes time, and fast responses are less likely to be well-founded responses. It is of course also possible that the relationship is curvilinear, as when an increasing success rate is followed by a decreasing success rate or vice versa.
Methodology. The data are from computer-based GRE quantitative and verbal tests and will be analyzed within a generalized linear mixed model (GLMM) framework after controlling for the effects of ability and item difficulty as possible confounding factors. A linear model means that a linear combination of predictors determines the probability that person p answers item i correctly. The models are equivalent to advanced IRT models that go beyond the regular modeling of test responses in terms of one or more latent variables and item parameters. The lme4 package for R will be used for the statistical calculations.
Implications. The right amount of testing time in CAT is important: too much is wasteful and costly, and too little undermines score validity. The study is expected to provide new insight into the relationship between response time and response accuracy, which in turn should contribute to a better understanding of time effects and the relevant cognitive processes in CAT.
10aCAT10aResponse time10aTask effect1 aShi, Yang uhttp://mail.iacat.org/item-response-time-task-effect-cat03772nas a2200145 4500008004100000245007300041210006900114260005500183520325500238653000803493653002203501653001803523100001403541856007103555 2017 eng d00aResponse Time and Response Accuracy in Computerized Adaptive Testing0 aResponse Time and Response Accuracy in Computerized Adaptive Tes aNiigata, JapanbNiigata Seiryo Universityc08/20173 aIntroduction. This study explores the relationship between response speed and response accuracy in Computerized Adaptive Testing (CAT). CAT provides a score as well as item response times, which can offer additional diagnostic information regarding behavioral processes of task completion that cannot be uncovered by paper-based instruments. The goal of this study is to investigate how the accuracy rate evolves as a function of response time. If the accuracy of cognitive test responses decreases with response time, then it is an indication that the underlying cognitive process is a degrading process such as knowledge retrieval. More accessible knowledge can be retrieved faster than less accessible knowledge. For instance, in reading tasks, the time on task effect is negative and the more negative, the easier a task is. However, if the accuracy of cognitive test responses increases with response time, then the process is of an upgrading nature, with an increasing success rate as a function of response time. For example, problem-solving takes time, and fast responses are less likely to be well-founded responses. It is of course also possible that the relationship is curvilinear, as when an increasing success rate is followed by a decreasing success rate or vice versa.
Hypothesis. The present study argues that the relationship between response time on task and response accuracy can be positive, negative, or curvilinear, depending on the cognitive nature of the task items when the ability of the subjects and the difficulty of the items are held constant.
Methodology. Data from a subsection of the GRE quantitative test were available. We will use generalized linear mixed models. A linear model means that a linear combination of predictors determines the probability that person p answers item i correctly. Modeling mixed effects means that both random effects and fixed effects are included. Fixed effects refer to constants across test takers. The models are equivalent to advanced IRT models that go beyond the regular modeling of test responses in terms of one or more latent variables and item parameters. The lme4 package for R will be used for the statistical calculations.
Research questions. 1. What is the relationship between response accuracy and response speed? 2. What is the correlation between response accuracy and type of response time (fast response vs. slow response) after controlling for person ability?
Preliminary Findings. 1. There is a negative relationship between response time and response accuracy. The success rate declines with elapsing response time. 2. The correlation between the two response latent variables (fast and slow) is 1.0, indicating that the time-on-task effects do not differ between the two response time types.
Implications. The right amount of testing time in CAT is important: too much is wasteful and costly, and too little undermines score validity. The study is expected to provide new insight into the relationship between response time and response accuracy, which in turn should contribute to the best timing strategy in CAT, with or without time constraints.
10aCAT10aresponse accuracy10aResponse time1 aShi, Yang uhttps://drive.google.com/open?id=1yYP01bzGrKvJnfLwepcAoQQ2F4TdSvZ202532nas a2200205 4500008004100000245011900041210006900160260001200229520179400241653000802035653000802043653003402051653003002085653000802115653003102123653001602154653001302170100002002183856012302203 2011 eng d00aFrom Reliability to Validity: Expanding Adaptive Testing Practice to Find the Most Valid Score for Each Test Taker0 aFrom Reliability to Validity Expanding Adaptive Testing Practice c10/20113 aCAT is an exception to the traditional conception of validity. It is one of the few examples of individualized testing. Item difficulty is tailored to each examinee. The intent, however, is increased efficiency. Focus on reliability (reduced standard error); Equivalence with paper & pencil tests is valued; Validity is enhanced through improved reliability.
How Else Might We Individualize Testing Using CAT?
An ISV-Based View of Validity
Test Event -- An examinee encounters a series of items in a particular context.
CAT Goal: individualize testing to address CIV threats to score validity (i.e., maximize ISV).
Some Research Issues:
Optimization
How can we exploit the advantages of Balanced Block Design while keeping the logistics manageable?
Homogeneous Designs: Overlap between test booklets as regular as possible
Conclusions:
This study is just a short exploration of the optimization of an MST. It is extremely hard, or maybe impossible, to chart the influence of the item pool and test specifications on the optimization process. Simulations are very helpful in finding an acceptable MST.
10aCAT10amst10amultistage testing10aRasch10arouting10atif1 aVerschoor, Angela1 aRadtke, Ingrid1 aEggen, Theo uhttp://mail.iacat.org/content/test-assembly-model-mst03104nas a2200445 4500008004100000020004100041245012000082210006900202250001500271260001000286300001100296490000700307520175400314653003802068653002102106653001002127653000902137653002202146653002802168653003302196653001102229653001102240653000902251653001602260653001802276653001902294653003102313653003102344653001602375100001602391700001002407700001402417700001502431700001402446700001502460700001802475700002402493700001802517856012302535 2010 eng d a0161-8105 (Print)0161-8105 (Linking)00aDevelopment and validation of patient-reported outcome measures for sleep disturbance and sleep-related impairments0 aDevelopment and validation of patientreported outcome measures f a2010/06/17 cJun 1 a781-920 v333 aSTUDY OBJECTIVES: To develop an archive of self-report questions assessing sleep disturbance and sleep-related impairments (SRI), to develop item banks from this archive, and to validate and calibrate the item banks using classic validation techniques and item response theory analyses in a sample of clinical and community participants. DESIGN: Cross-sectional self-report study. SETTING: Academic medical center and participant homes. PARTICIPANTS: One thousand nine hundred ninety-three adults recruited from an Internet polling sample and 259 adults recruited from medical, psychiatric, and sleep clinics. INTERVENTIONS: None. MEASUREMENTS AND RESULTS: This study was part of PROMIS (Patient-Reported Outcomes Information System), a National Institutes of Health Roadmap initiative. Self-report item banks were developed through an iterative process of literature searches, collecting and sorting items, expert content review, qualitative patient research, and pilot testing. Internal consistency, convergent validity, and exploratory and confirmatory factor analysis were examined in the resulting item banks. 
Factor analyses identified 2 preliminary item banks, sleep disturbance and SRI. Item response theory analyses and expert content review narrowed the item banks to 27 and 16 items, respectively. Validity of the item banks was supported by moderate to high correlations with existing scales and by significant differences in sleep disturbance and SRI scores between participants with and without sleep disorders. CONCLUSIONS: The PROMIS sleep disturbance and SRI item banks have excellent measurement properties and may prove to be useful for assessing general aspects of sleep and SRI with various groups of patients and interventions.10a*Outcome Assessment (Health Care)10a*Self Disclosure10aAdult10aAged10aAged, 80 and over10aCross-Sectional Studies10aFactor Analysis, Statistical10aFemale10aHumans10aMale10aMiddle Aged10aPsychometrics10aQuestionnaires10aReproducibility of Results10aSleep Disorders/*diagnosis10aYoung Adult1 aBuysse, D J1 aYu, L1 aMoul, D E1 aGermain, A1 aStover, A1 aDodds, N E1 aJohnston, K L1 aShablesky-Cade, M A1 aPilkonis, P A uhttp://mail.iacat.org/content/development-and-validation-patient-reported-outcome-measures-sleep-disturbance-and-sleep02882nas a2200493 4500008004100000020004100041245014100082210006900223250001500292260000800307300001100315490000700326520125100333653003001584653001001614653000901624653004601633653003301679653001101712653003101723653001101754653000901765653003301774653001601807653002401823653004601847653005501893653005501948653004602003653001902049653003102068653001402099100001602113700001502129700001302144700001402157700001502171700001702186700001502203700001702218700001502235700001302250856012502263 2009 eng d a0090-5550 (Print)0090-5550 (Linking)00aDevelopment of an item bank for the assessment of depression in persons with mental illnesses and physical diseases using Rasch analysis0 aDevelopment of an item bank for the assessment of depression in a2009/05/28 cMay a186-970 v543 aOBJECTIVE: The calibration of item banks provides 
the basis for computerized adaptive testing that ensures high diagnostic precision and minimizes participants' test burden. The present study aimed at developing a new item bank that allows for assessing depression in persons with mental and persons with somatic diseases. METHOD: The sample consisted of 161 participants treated for a depressive syndrome, and 206 participants with somatic illnesses (103 cardiologic, 103 otorhinolaryngologic; overall mean age = 44.1 years, SD =14.0; 44.7% women) to allow for validation of the item bank in both groups. Persons answered a pool of 182 depression items on a 5-point Likert scale. RESULTS: Evaluation of Rasch model fit (infit < 1.3), differential item functioning, dimensionality, local independence, item spread, item and person separation (>2.0), and reliability (>.80) resulted in a bank of 79 items with good psychometric properties. CONCLUSIONS: The bank provides items with a wide range of content coverage and may serve as a sound basis for computerized adaptive testing applications. 
It might also be useful for researchers who wish to develop new fixed-length scales for the assessment of depression in specific rehabilitation settings.10aAdaptation, Psychological10aAdult10aAged10aDepressive Disorder/*diagnosis/psychology10aDiagnosis, Computer-Assisted10aFemale10aHeart Diseases/*psychology10aHumans10aMale10aMental Disorders/*psychology10aMiddle Aged10aModels, Statistical10aOtorhinolaryngologic Diseases/*psychology10aPersonality Assessment/statistics & numerical data10aPersonality Inventory/*statistics & numerical data10aPsychometrics/statistics & numerical data10aQuestionnaires10aReproducibility of Results10aSick Role1 aForkmann, T1 aBoecker, M1 aNorra, C1 aEberle, N1 aKircher, T1 aSchauerte, P1 aMischke, K1 aWesthofen, M1 aGauggel, S1 aWirtz, M uhttp://mail.iacat.org/content/development-item-bank-assessment-depression-persons-mental-illnesses-and-physical-diseases02752nas a2200433 4500008004100000020004600041245012800087210006900215250001500284300001200299490000700311520139300318653003401711653001501745653001001760653000901770653002201779653002501801653001101826653001101837653000901848653001601857653001501873653003801888653001901926653003101945653002801976653004802004653002202052100002002074700001202094700001402106700001602120700001402136700001702150700001502167700001502182856012102197 2009 eng d a1878-5921 (Electronic)0895-4356 (Linking)00aAn evaluation of patient-reported outcomes found computerized adaptive testing was efficient in assessing stress perception0 aevaluation of patientreported outcomes found computerized adapti a2008/07/22 a278-2870 v623 aOBJECTIVES: This study aimed to develop and evaluate a first computerized adaptive test (CAT) for the measurement of stress perception (Stress-CAT), in terms of the two dimensions: exposure to stress and stress reaction. STUDY DESIGN AND SETTING: Item response theory modeling was performed using a two-parameter model (Generalized Partial Credit Model). 
The evaluation of the Stress-CAT comprised a simulation study and real clinical application. A total of 1,092 psychosomatic patients (N1) were studied. Two hundred simulees (N2) were generated for a simulated response data set. Then the Stress-CAT was given to n=116 inpatients, (N3) together with established stress questionnaires as validity criteria. RESULTS: The final banks included n=38 stress exposure items and n=31 stress reaction items. In the first simulation study, CAT scores could be estimated with a high measurement precision (SE<0.32; rho>0.90) using 7.0+/-2.3 (M+/-SD) stress reaction items and 11.6+/-1.7 stress exposure items. The second simulation study reanalyzed real patients data (N1) and showed an average use of items of 5.6+/-2.1 for the dimension stress reaction and 10.0+/-4.9 for the dimension stress exposure. Convergent validity showed significantly high correlations. CONCLUSIONS: The Stress-CAT is short and precise, potentially lowering the response burden of patients in clinical decision making.10a*Diagnosis, Computer-Assisted10aAdolescent10aAdult10aAged10aAged, 80 and over10aConfidence Intervals10aFemale10aHumans10aMale10aMiddle Aged10aPerception10aQuality of Health Care/*standards10aQuestionnaires10aReproducibility of Results10aSickness Impact Profile10aStress, Psychological/*diagnosis/psychology10aTreatment Outcome1 aKocalevent, R D1 aRose, M1 aBecker, J1 aWalter, O B1 aFliege, H1 aBjorner, J B1 aKleiber, D1 aKlapp, B F uhttp://mail.iacat.org/content/evaluation-patient-reported-outcomes-found-computerized-adaptive-testing-was-efficient01655nas a2200289 4500008004100000020004100041245011100082210006900193250001500262260000800277300001100285490000700296520053700303653004800840653006200888653005700950653001101007653002701018653002401045653005101069653004701120653003101167653001301198100001301211700001901224856012201243 2009 eng d a0007-1102 (Print)0007-1102 (Linking)00aThe maximum priority index method for severely constrained item selection 
in computerized adaptive testing0 amaximum priority index method for severely constrained item sele a2008/06/07 cMay a369-830 v623 aThis paper introduces a new heuristic approach, the maximum priority index (MPI) method, for severely constrained item selection in computerized adaptive testing. Our simulation study shows that it is able to accommodate various non-statistical constraints simultaneously, such as content balancing, exposure control, answer key balancing, and so on. Compared with the weighted deviation modelling method, it leads to fewer constraint violations and better exposure control while maintaining the same level of measurement precision.10aAptitude Tests/*statistics & numerical data10aDiagnosis, Computer-Assisted/*statistics & numerical data10aEducational Measurement/*statistics & numerical data10aHumans10aMathematical Computing10aModels, Statistical10aPersonality Tests/*statistics & numerical data10aPsychometrics/*statistics & numerical data10aReproducibility of Results10aSoftware1 aCheng, Y1 aChang, Hua-Hua uhttp://mail.iacat.org/content/maximum-priority-index-method-severely-constrained-item-selection-computerized-adaptive02905nas a2200289 4500008004100000020004100041245011100082210006900193250001500262260000800277300001400285490000700299520193300306653002702239653003802266653004102304653001902345653001102364653001402375653003102389100001502420700001302435700001202448700001602460700001302476856012602489 2009 eng d a0315-162X (Print)0315-162X (Linking)00aProgress in assessing physical function in arthritis: PROMIS short forms and computerized adaptive testing0 aProgress in assessing physical function in arthritis PROMIS shor a2009/09/10 cSep a2061-20660 v363 aOBJECTIVE: Assessing self-reported physical function/disability with the Health Assessment Questionnaire Disability Index (HAQ) and other instruments has become central in arthritis research. 
Item response theory (IRT) and computerized adaptive testing (CAT) techniques can increase reliability and statistical power. IRT-based instruments can improve measurement precision substantially over a wider range of disease severity. These modern methods were applied and the magnitude of improvement was estimated. METHODS: A 199-item physical function/disability item bank was developed by distilling 1865 items to 124, including Legacy Health Assessment Questionnaire (HAQ) and Physical Function-10 items, and improving precision through qualitative and quantitative evaluation in over 21,000 subjects, which included about 1500 patients with rheumatoid arthritis and osteoarthritis. Four new instruments, (A) Patient-Reported Outcomes Measurement Information (PROMIS) HAQ, which evolved from the original (Legacy) HAQ; (B) "best" PROMIS 10; (C) 20-item static (short) forms; and (D) simulated PROMIS CAT, which sequentially selected the most informative item, were compared with the HAQ. RESULTS: Online and mailed administration modes yielded similar item and domain scores. The HAQ and PROMIS HAQ 20-item scales yielded greater information content versus other scales in patients with more severe disease. The "best" PROMIS 20-item scale outperformed the other 20-item static forms over a broad range of 4 standard deviations. The 10-item simulated PROMIS CAT outperformed all other forms. CONCLUSION: Improved items and instruments yielded better information. The PROMIS HAQ is currently available and considered validated. The new PROMIS short forms, after validation, are likely to represent further improvement. 
CAT-based physical function/disability assessment offers superior performance over static forms of equal length.10a*Disability Evaluation10a*Outcome Assessment (Health Care)10aArthritis/diagnosis/*physiopathology10aHealth Surveys10aHumans10aPrognosis10aReproducibility of Results1 aFries, J F1 aCella, D1 aRose, M1 aKrishnan, E1 aBruce, B uhttp://mail.iacat.org/content/progress-assessing-physical-function-arthritis-promis-short-forms-and-computerized-adaptive02598nas a2200337 4500008004100000020004600041245012800087210006900215250001500284300000700299490000600306520149200312653003201804653002301836653002501859653003401884653001101918653001101929653000901940653002601949653003101975653002702006653001102033653001802044100001502062700001202077700001402089700001802103700001202121856012702133 2009 eng d a1477-7525 (Electronic)1477-7525 (Linking)00aReduction in patient burdens with graphical computerized adaptive testing on the ADL scale: tool development and simulation0 aReduction in patient burdens with graphical computerized adaptiv a2009/05/07 a390 v73 aBACKGROUND: The aim of this study was to verify the effectiveness and efficacy of saving time and reducing burden for patients, nurses, and even occupational therapists through computer adaptive testing (CAT). 
METHODS: Based on an item bank of the Barthel Index (BI) and the Frenchay Activities Index (FAI) for assessing comprehensive activities of daily living (ADL) function in stroke patients, we developed a visual basic application (VBA)-Excel CAT module, and (1) investigated whether the averaged test length via CAT is shorter than that of the traditional all-item-answered non-adaptive testing (NAT) approach through simulation, (2) illustrated the CAT multimedia on a tablet PC showing data collection and response errors of ADL clinical functional measures in stroke patients, and (3) demonstrated the quality control of endorsing scale with fit statistics to detect responding errors, which will be further immediately reconfirmed by technicians once patient ends the CAT assessment. RESULTS: The results show that endorsed items could be shorter on CAT (M = 13.42) than on NAT (M = 23) at 41.64% efficiency in test length. However, averaged ability estimations reveal insignificant differences between CAT and NAT. 
CONCLUSION: This study found that mobile nursing services, placed at the bedsides of patients could, through the programmed VBA-Excel CAT module, reduce the burden to patients and save time, more so than the traditional NAT paper-and-pencil testing appraisals.10a*Activities of Daily Living10a*Computer Graphics10a*Computer Simulation10a*Diagnosis, Computer-Assisted10aFemale10aHumans10aMale10aPoint-of-Care Systems10aReproducibility of Results10aStroke/*rehabilitation10aTaiwan10aUnited States1 aChien, T W1 aWu, H M1 aWang, W-C1 aCastillo, R V1 aChou, W uhttp://mail.iacat.org/content/reduction-patient-burdens-graphical-computerized-adaptive-testing-adl-scale-tool-development02866nas a2200325 4500008004100000020002700041245007400068210006900142250001500211260000800226300001100234490000700245520188300252653003202135653003202167653002502199653002302224653004802247653001102295653001102306653000902317653001602326653001902342653002702361100001502388700001502403700001002418700001202428856010002440 2008 eng d a1537-7385 (Electronic)00aAdaptive short forms for outpatient rehabilitation outcome assessment0 aAdaptive short forms for outpatient rehabilitation outcome asses a2008/09/23 cOct a842-520 v873 aOBJECTIVE: To develop outpatient Adaptive Short Forms for the Activity Measure for Post-Acute Care item bank for use in outpatient therapy settings. DESIGN: A convenience sample of 11,809 adults with spine, lower limb, upper limb, and miscellaneous orthopedic impairments who received outpatient rehabilitation in 1 of 127 outpatient rehabilitation clinics in the United States. We identified optimal items for use in developing outpatient Adaptive Short Forms based on the Basic Mobility and Daily Activities domains of the Activity Measure for Post-Acute Care item bank. Patient scores were derived from the Activity Measure for Post-Acute Care computerized adaptive testing program. 
Items were selected for inclusion on the Adaptive Short Forms based on functional content, range of item coverage, measurement precision, item exposure rate, and data collection burden. RESULTS: Two outpatient Adaptive Short Forms were developed: (1) an 18-item Basic Mobility Adaptive Short Form and (2) a 15-item Daily Activities Adaptive Short Form, derived from the same item bank used to develop the Activity Measure for Post-Acute Care computerized adaptive testing program. Both Adaptive Short Forms achieved acceptable psychometric properties. CONCLUSIONS: In outpatient postacute care settings where computerized adaptive testing outcome applications are currently not feasible, item response theory-derived Adaptive Short Forms provide the efficient capability to monitor patients' functional outcomes. The development of Adaptive Short Form functional outcome instruments linked by a common, calibrated item bank has the potential to create a bridge to outcome monitoring across postacute care settings and can facilitate the eventual transformation from Adaptive Short Forms to computerized adaptive testing applications easier and more acceptable to the rehabilitation community.10a*Activities of Daily Living10a*Ambulatory Care Facilities10a*Mobility Limitation10a*Treatment Outcome10aDisabled Persons/psychology/*rehabilitation10aFemale10aHumans10aMale10aMiddle Aged10aQuestionnaires10aRehabilitation Centers1 aJette, A M1 aHaley, S M1 aNi, P1 aMoed, R uhttp://mail.iacat.org/content/adaptive-short-forms-outpatient-rehabilitation-outcome-assessment03437nas a2200481 
4500008004100000020004600041245013800087210006900225250001500294260000800309300001200317490000700329520191400336653002702250653002302277653003102300653001502331653001602346653001002362653002102372653002402393653002302417653003802440653001102478653002202489653001102511653001102522653000902533653003702542653002102579653003102600653002602631653001702657653003202674653001602706653002802722100001602750700001502766700001002781700001502791700002502806856012402831 2008 eng d a1532-821X (Electronic)0003-9993 (Linking)00aAssessing self-care and social function using a computer adaptive testing version of the pediatric evaluation of disability inventory0 aAssessing selfcare and social function using a computer adaptive a2008/04/01 cApr a622-6290 v893 aOBJECTIVE: To examine score agreement, validity, precision, and response burden of a prototype computer adaptive testing (CAT) version of the self-care and social function scales of the Pediatric Evaluation of Disability Inventory compared with the full-length version of these scales. DESIGN: Computer simulation analysis of cross-sectional and longitudinal retrospective data; cross-sectional prospective study. SETTING: Pediatric rehabilitation hospital, including inpatient acute rehabilitation, day school program, outpatient clinics; community-based day care, preschool, and children's homes. PARTICIPANTS: Children with disabilities (n=469) and 412 children with no disabilities (analytic sample); 38 children with disabilities and 35 children without disabilities (cross-validation sample). INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Summary scores from prototype CAT applications of each scale using 15-, 10-, and 5-item stopping rules; scores from the full-length self-care and social function scales; time (in seconds) to complete assessments and respondent ratings of burden. 
RESULTS: Scores from both computer simulations and field administration of the prototype CATs were highly consistent with scores from full-length administration (r range, .94-.99). Using computer simulation of retrospective data, discriminant validity, and sensitivity to change of the CATs closely approximated that of the full-length scales, especially when the 15- and 10-item stopping rules were applied. In the cross-validation study the time to administer both CATs was 4 minutes, compared with over 16 minutes to complete the full-length scales. CONCLUSIONS: Self-care and social function score estimates from CAT administration are highly comparable with those obtained from full-length scale administration, with small losses in validity and precision and substantial decreases in administration time.10a*Disability Evaluation10a*Social Adjustment10aActivities of Daily Living10aAdolescent10aAge Factors10aChild10aChild, Preschool10aComputer Simulation10aCross-Over Studies10aDisabled Children/*rehabilitation10aFemale10aFollow-Up Studies10aHumans10aInfant10aMale10aOutcome Assessment (Health Care)10aReference Values10aReproducibility of Results10aRetrospective Studies10aRisk Factors10aSelf Care/*standards/trends10aSex Factors10aSickness Impact Profile1 aCoster, W J1 aHaley, S M1 aNi, P1 aDumas, H M1 aFragala-Pinkham, M A uhttp://mail.iacat.org/content/assessing-self-care-and-social-function-using-computer-adaptive-testing-version-pediatric03042nas a2200481 4500008004100000020004600041245012200087210006900209250001500278260000800293300001200301490000700313520155700320653003201877653003101909653002201940653002001962653001001982653000901992653002202001653002802023653003302051653001102084653001102095653002502106653000902131653001602140653004602156653002202202653002402224653003002248653002902278100001502307700001402322700001502336700002402351700001802375700001102393700001602404700001002420700001502430856011502445 2008 eng d a1532-821X (Electronic)0003-9993 
(Linking)00aComputerized adaptive testing for follow-up after discharge from inpatient rehabilitation: II. Participation outcomes0 aComputerized adaptive testing for followup after discharge from a2008/01/30 cFeb a275-2830 v893 aOBJECTIVES: To measure participation outcomes with a computerized adaptive test (CAT) and compare CAT and traditional fixed-length surveys in terms of score agreement, respondent burden, discriminant validity, and responsiveness. DESIGN: Longitudinal, prospective cohort study of patients interviewed approximately 2 weeks after discharge from inpatient rehabilitation and 3 months later. SETTING: Follow-up interviews conducted in patient's home setting. PARTICIPANTS: Adults (N=94) with diagnoses of neurologic, orthopedic, or medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Participation domains of mobility, domestic life, and community, social, & civic life, measured using a CAT version of the Participation Measure for Postacute Care (PM-PAC-CAT) and a 53-item fixed-length survey (PM-PAC-53). RESULTS: The PM-PAC-CAT showed substantial agreement with PM-PAC-53 scores (intraclass correlation coefficient, model 3,1, .71-.81). On average, the PM-PAC-CAT was completed in 42% of the time and with only 48% of the items as compared with the PM-PAC-53. Both formats discriminated across functional severity groups. The PM-PAC-CAT had modest reductions in sensitivity and responsiveness to patient-reported change over a 3-month interval as compared with the PM-PAC-53. 
CONCLUSIONS: Although continued evaluation is warranted, accurate estimates of participation status and responsiveness to change for group-level analyses can be obtained from CAT administrations, with a sizeable reduction in respondent burden.10a*Activities of Daily Living10a*Adaptation, Physiological10a*Computer Systems10a*Questionnaires10aAdult10aAged10aAged, 80 and over10aChi-Square Distribution10aFactor Analysis, Statistical10aFemale10aHumans10aLongitudinal Studies10aMale10aMiddle Aged10aOutcome Assessment (Health Care)/*methods10aPatient Discharge10aProspective Studies10aRehabilitation/*standards10aSubacute Care/*standards1 aHaley, S M1 aGandek, B1 aSiebens, H1 aBlack-Schaffer, R M1 aSinclair, S J1 aTao, W1 aCoster, W J1 aNi, P1 aJette, A M uhttp://mail.iacat.org/content/computerized-adaptive-testing-follow-after-discharge-inpatient-rehabilitation-ii03314nas a2200433 4500008004100000020004600041245007700087210006900164250001500233260001100248300001200259490000700271520203200278653002702310653003002337653002102367653001002388653000902398653001502407653003602422653002102458653004402479653002402523653001102547653001102558653001302569653000902582653001602591653003002607653003002637653003102667100001502698700001302713700001502726700001402741700001502755700001402770856009602784 2008 eng d a1528-1159 (Electronic)0362-2436 (Linking)00aComputerized adaptive testing in back pain: Validation of the CAT-5D-QOL0 aComputerized adaptive testing in back pain Validation of the CAT a2008/05/23 cMay 20 a1384-900 v333 aSTUDY DESIGN: We have conducted an outcome instrument validation study. OBJECTIVE: Our objective was to develop a computerized adaptive test (CAT) to measure 5 domains of health-related quality of life (HRQL) and assess its feasibility, reliability, validity, and efficiency. SUMMARY OF BACKGROUND DATA: Kopec and colleagues have recently developed item response theory based item banks for 5 domains of HRQL relevant to back pain and suitable for CAT applications. 
The domains are Daily Activities (DAILY), Walking (WALK), Handling Objects (HAND), Pain or Discomfort (PAIN), and Feelings (FEEL). METHODS: An adaptive algorithm was implemented in a web-based questionnaire administration system. The questionnaire included CAT-5D-QOL (5 scales), Modified Oswestry Disability Index (MODI), Roland-Morris Disability Questionnaire (RMDQ), SF-36 Health Survey, and standard clinical and demographic information. Participants were outpatients treated for mechanical back pain at a referral center in Vancouver, Canada. RESULTS: A total of 215 patients completed the questionnaire and 84 completed a retest. On average, patients answered 5.2 items per CAT-5D-QOL scale. Reliability ranged from 0.83 (FEEL) to 0.92 (PAIN) and was 0.92 for the MODI, RMDQ, and Physical Component Summary (PCS-36). The ceiling effect was 0.5% for PAIN compared with 2% for MODI and 5% for RMDQ. The CAT-5D-QOL scales correlated as anticipated with other measures of HRQL and discriminated well according to the level of satisfaction with current symptoms, duration of the last episode, sciatica, and disability compensation. The average relative discrimination index was 0.87 for PAIN, 0.67 for DAILY and 0.62 for WALK, compared with 0.89 for MODI, 0.80 for RMDQ, and 0.59 for PCS-36. CONCLUSION: The CAT-5D-QOL is feasible, reliable, valid, and efficient in patients with back pain. 
This methodology can be recommended for use in back pain research and should improve outcome assessment, facilitate comparisons across studies, and reduce patient burden.10a*Disability Evaluation10a*Health Status Indicators10a*Quality of Life10aAdult10aAged10aAlgorithms10aBack Pain/*diagnosis/psychology10aBritish Columbia10aDiagnosis, Computer-Assisted/*standards10aFeasibility Studies10aFemale10aHumans10aInternet10aMale10aMiddle Aged10aPredictive Value of Tests10aQuestionnaires/*standards10aReproducibility of Results1 aKopec, J A1 aBadii, M1 aMcKenna, M1 aLima, V D1 aSayre, E C1 aDvorak, M uhttp://mail.iacat.org/content/computerized-adaptive-testing-back-pain-validation-cat-5d-qol02561nas a2200313 4500008004100000020004100041245011500082210006900197250001500266300001100281490000700292520149300299653002701792653001001819653001401829653005301843653001501896653001101911653003701922653001801959653003101977653002602008653001402034653003202048100001502080700001002095700001502105856012702120 2008 eng d a0963-8288 (Print)0963-8288 (Linking)00aEfficiency and sensitivity of multidimensional computerized adaptive testing of pediatric physical functioning0 aEfficiency and sensitivity of multidimensional computerized adap a2008/02/26 a479-840 v303 aPURPOSE: Computerized adaptive tests (CATs) have efficiency advantages over fixed-length tests of physical functioning but may lose sensitivity when administering extremely low numbers of items. Multidimensional CATs may efficiently improve sensitivity by capitalizing on correlations between functional domains. Using a series of empirical simulations, we assessed the efficiency and sensitivity of multidimensional CATs compared to a longer fixed-length test. METHOD: Parent responses to the Pediatric Evaluation of Disability Inventory before and after intervention for 239 children at a pediatric rehabilitation hospital provided the data for this retrospective study. 
Reliability, effect size, and standardized response mean were compared between full-length self-care and mobility subscales and simulated multidimensional CATs with stopping rules at 40, 30, 20, and 10 items. RESULTS: Reliability was lowest in the 10-item CAT condition for the self-care (r = 0.85) and mobility (r = 0.79) subscales; all other conditions had high reliabilities (r > 0.94). All multidimensional CAT conditions had equivalent levels of sensitivity compared to the full set condition for both domains. CONCLUSIONS: Multidimensional CATs efficiently retain the sensitivity of longer fixed-length measures even with 5 items per dimension (10-item CAT condition). Measuring physical functioning with multidimensional CATs could enhance sensitivity following intervention while minimizing response burden.10a*Disability Evaluation10aChild10aComputers10aDisabled Children/*classification/rehabilitation10aEfficiency10aHumans10aOutcome Assessment (Health Care)10aPsychometrics10aReproducibility of Results10aRetrospective Studies10aSelf Care10aSensitivity and Specificity1 aAllen, D D1 aNi, P1 aHaley, S M uhttp://mail.iacat.org/content/efficiency-and-sensitivity-multidimensional-computerized-adaptive-testing-pediatric-physical03233nas a2200397 4500008004100000020002700041245014200068210006900210250001500279260001100294300001200305490000700317520193600324653002702260653003002287653001002317653000902327653002202336653003602358653001602394653002402410653004402434653001102478653001602489653002602505653003002531653003002561653003102591100001302622700001402635700001502649700001402664700001702678700001502695856012502710 2008 eng d a1528-1159 (Electronic)00aLetting the CAT out of the bag: Comparing computer adaptive tests and an 11-item short form of the Roland-Morris Disability Questionnaire0 aLetting the CAT out of the bag Comparing computer adaptive tests a2008/05/23 cMay 20 a1378-830 v333 aSTUDY DESIGN: A post hoc simulation of a computer adaptive administration of the items of 
a modified version of the Roland-Morris Disability Questionnaire. OBJECTIVE: To evaluate the effectiveness of adaptive administration of back pain-related disability items compared with a fixed 11-item short form. SUMMARY OF BACKGROUND DATA: Short form versions of the Roland-Morris Disability Questionnaire have been developed. An alternative to paper-and-pencil short forms is to administer items adaptively so that items are presented based on a person's responses to previous items. Theoretically, this allows precise estimation of back pain disability with administration of only a few items. MATERIALS AND METHODS: Data were gathered from 2 previously conducted studies of persons with back pain. An item response theory model was used to calibrate scores based on all items, items of a paper-and-pencil short form, and several computer adaptive tests (CATs). RESULTS: Correlations between each CAT condition and scores based on a 23-item version of the Roland-Morris Disability Questionnaire ranged from 0.93 to 0.98. Compared with an 11-item short form, an 11-item CAT produced scores that were significantly more highly correlated with scores based on the 23-item scale. CATs with even fewer items also produced scores that were highly correlated with scores based on all items. For example, scores from a 5-item CAT had a correlation of 0.93 with full scale scores. Seven- and 9-item CATs correlated at 0.95 and 0.97, respectively. A CAT with a standard-error-based stopping rule produced scores that correlated at 0.95 with full scale scores. CONCLUSION: A CAT-based back pain-related disability measure may be a valuable tool for use in clinical and research contexts. 
Use of CAT for other common measures in back pain research, such as other functional scales or measures of psychological distress, may offer similar advantages.10a*Disability Evaluation10a*Health Status Indicators10aAdult10aAged10aAged, 80 and over10aBack Pain/*diagnosis/psychology10aCalibration10aComputer Simulation10aDiagnosis, Computer-Assisted/*standards10aHumans10aMiddle Aged10aModels, Psychological10aPredictive Value of Tests10aQuestionnaires/*standards10aReproducibility of Results1 aCook, KF1 aChoi, S W1 aCrane, P K1 aDeyo, R A1 aJohnson, K L1 aAmtmann, D uhttp://mail.iacat.org/content/letting-cat-out-bag-comparing-computer-adaptive-tests-and-11-item-short-form-roland-morris03429nas a2200385 4500008004100000020004100041245010600082210006900188250001500257260001200272300001000284490000700294520220300301653002702504653001502531653001002546653002102556653002402577653002802601653003802629653001102667653001102678653001102689653003902700653000902739653002402748653003102772653004002803100001802843700001502861700001302876700001702889700001402906856012302920 2008 eng d a0271-6798 (Print)0271-6798 (Linking)00aMeasuring physical functioning in children with spinal impairments with computerized adaptive testing0 aMeasuring physical functioning in children with spinal impairmen a2008/03/26 cApr-May a330-50 v283 aBACKGROUND: The purpose of this study was to assess the utility of measuring current physical functioning status of children with scoliosis and kyphosis by applying computerized adaptive testing (CAT) methods. Computerized adaptive testing uses a computer interface to administer the most optimal items based on previous responses, reducing the number of items needed to obtain a scoring estimate. METHODS: This was a prospective study of 77 subjects (0.6-19.8 years) who were seen by a spine surgeon during a routine clinic visit for progress spine deformity. 
Using a multidimensional version of the Pediatric Evaluation of Disability Inventory CAT program (PEDI-MCAT), we evaluated content range, accuracy and efficiency, known-group validity, concurrent validity with the Pediatric Outcomes Data Collection Instrument, and test-retest reliability in a subsample (n = 16) within a 2-week interval. RESULTS: We found the PEDI-MCAT to have sufficient item coverage in both self-care and mobility content for this sample, although most patients tended to score at the higher ends of both scales. Both the accuracy of PEDI-MCAT scores as compared with a fixed format of the PEDI (r = 0.98 for both mobility and self-care) and test-retest reliability were very high [self-care: intraclass correlation (3,1) = 0.98, mobility: intraclass correlation (3,1) = 0.99]. The PEDI-MCAT took an average of 2.9 minutes for the parents to complete. The PEDI-MCAT detected expected differences between patient groups, and scores on the PEDI-MCAT correlated in expected directions with scores from the Pediatric Outcomes Data Collection Instrument domains. CONCLUSIONS: Use of the PEDI-MCAT to assess the physical functioning status, as perceived by parents of children with complex spinal impairments, seems to be feasible and achieves accurate and efficient estimates of self-care and mobility function. Additional item development will be needed at the higher functioning end of the scale to avoid ceiling effects for older children. 
LEVEL OF EVIDENCE: This is a level II prospective study designed to establish the utility of computer adaptive testing as an evaluation method in a busy pediatric spine practice.10a*Disability Evaluation10aAdolescent10aChild10aChild, Preschool10aComputer Simulation10aCross-Sectional Studies10aDisabled Children/*rehabilitation10aFemale10aHumans10aInfant10aKyphosis/*diagnosis/rehabilitation10aMale10aProspective Studies10aReproducibility of Results10aScoliosis/*diagnosis/rehabilitation1 aMulcahey, M J1 aHaley, S M1 aDuffy, T1 aPengsheng, N1 aBetz, R R uhttp://mail.iacat.org/content/measuring-physical-functioning-children-spinal-impairments-computerized-adaptive-testing01948nas a2200277 4500008004100000020004100041245007300082210006900155250001500224260000800239300001000247490000700257520099700264653001601261653002901277653004801306653006201354653001101416653002401427653004601451653003101497653001301528100001401541700001501555856010001570 2008 eng d a0007-1102 (Print)0007-1102 (Linking)00aPredicting item exposure parameters in computerized adaptive testing0 aPredicting item exposure parameters in computerized adaptive tes a2008/05/17 cMay a75-910 v613 aThe purpose of this study is to find a formula that describes the relationship between item exposure parameters and item parameters in computerized adaptive tests by using genetic programming (GP) - a biologically inspired artificial intelligence technique. Based on the formula, item exposure parameters for new parallel item pools can be predicted without conducting additional iterative simulations. Results show that an interesting formula between item exposure parameters and item parameters in a pool can be found by using GP. The item exposure parameters predicted based on the found formula were close to those observed from the Sympson and Hetter (1985) procedure and performed well in controlling item exposure rates. 
Similar results were observed for the Stocking and Lewis (1998) multinomial model for item selection and the Sympson and Hetter procedure with content balancing. The proposed GP approach has provided a knowledge-based solution for finding item exposure parameters.10a*Algorithms10a*Artificial Intelligence10aAptitude Tests/*statistics & numerical data10aDiagnosis, Computer-Assisted/*statistics & numerical data10aHumans10aModels, Statistical10aPsychometrics/statistics & numerical data10aReproducibility of Results10aSoftware1 aChen, S-Y1 aDoong, S H uhttp://mail.iacat.org/content/predicting-item-exposure-parameters-computerized-adaptive-testing03158nas a2200493 4500008004100000020002200041245008900063210006900152250001500221260000800236300001000244490000700254520169600261653003401957653002001991653001502011653001002026653000902036653002602045653003202071653003102103653001102134653001102145653000902156653003202165653001602197653002902213653004402242653002902286653003102315653003102346653001702377100001702394700001402411700001602425700001302441700001702454700002202471700001702493700001402510700001402524700001702538856010902555 2008 eng d a1075-2730 (Print)00aUsing computerized adaptive testing to reduce the burden of mental health assessment0 aUsing computerized adaptive testing to reduce the burden of ment a2008/04/02 cApr a361-80 v593 aOBJECTIVE: This study investigated the combination of item response theory and computerized adaptive testing (CAT) for psychiatric measurement as a means of reducing the burden of research and clinical assessments. METHODS: Data were from 800 participants in outpatient treatment for a mood or anxiety disorder; they completed 616 items of the 626-item Mood and Anxiety Spectrum Scales (MASS) at two times. The first administration was used to design and evaluate a CAT version of the MASS by using post hoc simulation. The second confirmed the functioning of CAT in live testing. 
RESULTS: Tests of competing models based on item response theory supported the scale's bifactor structure, consisting of a primary dimension and four group factors (mood, panic-agoraphobia, obsessive-compulsive, and social phobia). Both simulated and live CAT showed a 95% average reduction (585 items) in items administered (24 and 30 items, respectively) compared with administration of the full MASS. The correlation between scores on the full MASS and the CAT version was .93. For the mood disorder subscale, differences in scores between two groups of depressed patients--one with bipolar disorder and one without--on the full scale and on the CAT showed effect sizes of .63 (p<.003) and 1.19 (p<.001) standard deviation units, respectively, indicating better discriminant validity for CAT. CONCLUSIONS: Instead of using small fixed-length tests, clinicians can create item banks with a large item pool, and a small set of the items most relevant for a given individual can be administered with no loss of information, yielding a dramatic reduction in administration time and patient and clinician burden.10a*Diagnosis, Computer-Assisted10a*Questionnaires10aAdolescent10aAdult10aAged10aAgoraphobia/diagnosis10aAnxiety Disorders/diagnosis10aBipolar Disorder/diagnosis10aFemale10aHumans10aMale10aMental Disorders/*diagnosis10aMiddle Aged10aMood Disorders/diagnosis10aObsessive-Compulsive Disorder/diagnosis10aPanic Disorder/diagnosis10aPhobic Disorders/diagnosis10aReproducibility of Results10aTime Factors1 aGibbons, R D1 aWeiss, DJ1 aKupfer, D J1 aFrank, E1 aFagiolini, A1 aGrochocinski, V J1 aBhaumik, D K1 aStover, A1 aBock, R D1 aImmekus, J C uhttp://mail.iacat.org/content/using-computerized-adaptive-testing-reduce-burden-mental-health-assessment02309nas a2200301 
4500008004100000020002200041245011900063210006900182250001500251260000800266300001000274490000700284520125000291653001501541653001001556653006201566653001101628653001101639653000901650653003801659653005601697653004601753653002101799653003101820100001601851700002001867856012001887 2007 eng d a1040-3590 (Print)00aComputerized adaptive personality testing: A review and illustration with the MMPI-2 Computerized Adaptive Version0 aComputerized adaptive personality testing A review and illustrat a2007/03/21 cMar a14-240 v193 aComputerized adaptive testing in personality assessment can improve efficiency by significantly reducing the number of items administered to answer an assessment question. Two approaches have been explored for adaptive testing in computerized personality assessment: item response theory and the countdown method. In this article, the authors review the literature on each and report the results of an investigation designed to explore the utility, in terms of item and time savings, and validity, in terms of correlations with external criterion measures, of an expanded countdown method-based research version of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2), the MMPI-2 Computerized Adaptive Version (MMPI-2-CA). Participants were 433 undergraduate college students (170 men and 263 women). Results indicated considerable item savings and corresponding time savings for the adaptive testing modalities compared with a conventional computerized MMPI-2 administration. Furthermore, computerized adaptive administration yielded comparable results to computerized conventional administration of the MMPI-2 in terms of both test scores and their validity. 
Future directions for computerized adaptive personality testing are discussed.10aAdolescent10aAdult10aDiagnosis, Computer-Assisted/*statistics & numerical data10aFemale10aHumans10aMale10aMMPI/*statistics & numerical data10aPersonality Assessment/*statistics & numerical data10aPsychometrics/statistics & numerical data10aReference Values10aReproducibility of Results1 aForbey, J D1 aBen-Porath, Y S uhttp://mail.iacat.org/content/computerized-adaptive-personality-testing-review-and-illustration-mmpi-2-computerized02876nas a2200313 4500008004100000020002200041245010100063210006900164250001500233260000800248300001200256490000700268520179500275653005102070653002002121653003702141653002602178653001902204653001102223653003002234653004602264653003502310653002802345653001302373100002102386700001702407700001502424856012302439 2007 eng d a0315-162X (Print)00aImproving patient reported outcomes using item response theory and computerized adaptive testing0 aImproving patient reported outcomes using item response theory a a2007/06/07 cJun a1426-310 v343 aOBJECTIVE: Patient reported outcomes (PRO) are considered central outcome measures for both clinical trials and observational studies in rheumatology. More sophisticated statistical models, including item response theory (IRT) and computerized adaptive testing (CAT), will enable critical evaluation and reconstruction of currently utilized PRO instruments to improve measurement precision while reducing item burden on the individual patient. METHODS: We developed a domain hierarchy encompassing the latent trait of physical function/disability from the more general to most specific. Items collected from 165 English-language instruments were evaluated by a structured process including trained raters, modified Delphi expert consensus, and then patient evaluation. Each item in the refined data bank will undergo extensive analysis using IRT to evaluate response functions and measurement precision. 
CAT will allow for real-time questionnaires of potentially smaller numbers of questions tailored directly to each individual's level of physical function. RESULTS: Physical function/disability domain comprises 4 subdomains: upper extremity, trunk, lower extremity, and complex activities. Expert and patient review led to consensus favoring use of present-tense "capability" questions using a 4- or 5-item Likert response construct over past-tense "performance" items. Floor and ceiling effects, attribution of disability, and standardization of response categories were also addressed. CONCLUSION: By applying statistical techniques of IRT through use of CAT, existing PRO instruments may be improved to reduce questionnaire burden on the individual patients while increasing measurement precision that may ultimately lead to reduced sample size requirements for costly clinical trials.10a*Rheumatic Diseases/physiopathology/psychology10aClinical Trials10aData Interpretation, Statistical10aDisability Evaluation10aHealth Surveys10aHumans10aInternational Cooperation10aOutcome Assessment (Health Care)/*methods10aPatient Participation/*methods10aResearch Design/*trends10aSoftware1 aChakravarty, E F1 aBjorner, J B1 aFries, J F uhttp://mail.iacat.org/content/improving-patient-reported-outcomes-using-item-response-theory-and-computerized-adaptive02413nas a2200361 4500008004500000020001400045245011100059210006900170300001200239490000700251520135900258653001601617653002001633653001301653653002401666653002501690653001101715653001401726653001301740653001801753653002701771653001001798653001101808100001501819700001201834700001601846700001201862700001601874700001301890700001301903700001401916856012101930 2007 Engldsh a1057-924900aThe initial development of an item bank to assess and screen for psychological distress in cancer patients0 ainitial development of an item bank to assess and screen for psy a724-7320 v163 aPsychological distress is a common problem among cancer patients. 
Despite the large number of instruments that have been developed to assess distress, their utility remains disappointing. This study aimed to use Rasch models to develop an item-bank which would provide the basis for better means of assessing psychological distress in cancer patients. An item bank was developed from eight psychological distress questionnaires using Rasch analysis to link common items. Items from the questionnaires were added iteratively with common items as anchor points and misfitting items (infit mean square > 1.3) removed, and unidimensionality assessed. A total of 4914 patients completed the questionnaires providing an initial pool of 83 items. Twenty items were removed resulting in a final pool of 63 items. Good fit was demonstrated and no additional factor structure was evident from the residuals. However, there was little overlap between item locations and person measures, since items mainly targeted higher levels of distress. The Rasch analysis allowed items to be pooled and generated a unidimensional instrument for measuring psychological distress in cancer patients. Additional items are required to more accurately assess patients across the whole continuum of psychological distress. 
(PsycINFO Database Record (c) 2007 APA ) (journal abstract)10a3293 Cancer10acancer patients10aDistress10ainitial development10aItem Response Theory10aModels10aNeoplasms10aPatients10aPsychological10apsychological distress10aRasch10aStress1 aSmith, A B1 aRush, R1 aVelikova, G1 aWall, L1 aWright, E P1 aStark, D1 aSelby, P1 aSharpe, M uhttp://mail.iacat.org/content/initial-development-item-bank-assess-and-screen-psychological-distress-cancer-patients01689nas a2200217 4500008004100000020002200041245008000063210006900143260002600212300001200238490000700250520095600257653003401213653002701247653002301274653002901297100001601326700001801342700001301360856009801373 2007 eng d a0146-6216 (Print)00aTest design optimization in CAT early stage with the nominal response model0 aTest design optimization in CAT early stage with the nominal res bSage Publications: US a213-2320 v313 aThe early stage of computerized adaptive testing (CAT) refers to the phase of the trait estimation during the administration of only a few items. This phase can be characterized by bias and instability of estimation. In this study, an item selection criterion is introduced in an attempt to lessen this instability: the D-optimality criterion. A polytomous unconstrained CAT simulation is carried out to evaluate this criterion's performance under different test premises. The simulation shows that the extent of early stage instability depends primarily on the quality of the item pool information and its size and secondarily on the item selection criteria. The efficiency of the D-optimality criterion is similar to the efficiency of other known item selection criteria. Yet, it often yields estimates that, at the beginning of CAT, display a more robust performance against instability. 
(PsycINFO Database Record (c) 2007 APA, all rights reserved)10acomputerized adaptive testing10anominal response model10arobust performance10atest design optimization1 aPassos, V L1 aBerger, M P F1 aTan, F E uhttp://mail.iacat.org/content/test-design-optimization-cat-early-stage-nominal-response-model02653nas a2200397 4500008004100000020002200041245013500063210006900198250001500267260000800282300001200290490000700302520140700309653002601716653003101742653001501773653001001788653000901798653002201807653002501829653003301854653001101887653001101898653000901909653001601918653004601934653003001980653003102010653001302041100001502054700001002069700001802079700001602097700001502113856012702128 2006 eng d a0895-4356 (Print)00aComputer adaptive testing improved accuracy and precision of scores over random item selection in a physical functioning item bank0 aComputer adaptive testing improved accuracy and precision of sco a2006/10/10 cNov a1174-820 v593 aBACKGROUND AND OBJECTIVE: Measuring physical functioning (PF) within and across postacute settings is critical for monitoring outcomes of rehabilitation; however, most current instruments lack sufficient breadth and feasibility for widespread use. Computer adaptive testing (CAT), in which item selection is tailored to the individual patient, holds promise for reducing response burden, yet maintaining measurement precision. We calibrated a PF item bank via item response theory (IRT), administered items with a post hoc CAT design, and determined whether CAT would improve accuracy and precision of score estimates over random item selection. METHODS: 1,041 adults were interviewed during postacute care rehabilitation episodes in either hospital or community settings. Responses for 124 PF items were calibrated using IRT methods to create a PF item bank. We examined the accuracy and precision of CAT-based scores compared to a random selection of items. 
RESULTS: CAT-based scores had higher correlations with the IRT-criterion scores, especially with short tests, and resulted in narrower confidence intervals than scores based on a random selection of items; gains, as expected, were especially large for low and high performing adults. CONCLUSION: The CAT design may have important precision and efficiency advantages for point-of-care functional assessment in rehabilitation practice settings.10a*Recovery of Function10aActivities of Daily Living10aAdolescent10aAdult10aAged10aAged, 80 and over10aConfidence Intervals10aFactor Analysis, Statistical10aFemale10aHumans10aMale10aMiddle Aged10aOutcome Assessment (Health Care)/*methods10aRehabilitation/*standards10aReproducibility of Results10aSoftware1 aHaley, S M1 aNi, P1 aHambleton, RK1 aSlavin, M D1 aJette, A M uhttp://mail.iacat.org/content/computer-adaptive-testing-improved-accuracy-and-precision-scores-over-random-item-selectio-003330nas a2200469 4500008004100000020002200041245011600063210006900179250001500248260000800263300001200271490000700283520189400290653003202184653003102216653002202247653002002269653001002289653000902299653002202308653002802330653003302358653001102391653001102402653002502413653000902438653001602447653004602463653002202509653002402531653003002555653002902585100001502614700001502629700001602644700001102660700002402671700001402695700001802709700001002727856012302737 2006 eng d a0003-9993 (Print)00aComputerized adaptive testing for follow-up after discharge from inpatient rehabilitation: I. Activity outcomes0 aComputerized adaptive testing for followup after discharge from a2006/08/01 cAug a1033-420 v873 aOBJECTIVE: To examine score agreement, precision, validity, efficiency, and responsiveness of a computerized adaptive testing (CAT) version of the Activity Measure for Post-Acute Care (AM-PAC-CAT) in a prospective, 3-month follow-up sample of inpatient rehabilitation patients recently discharged home. 
DESIGN: Longitudinal, prospective 1-group cohort study of patients followed approximately 2 weeks after hospital discharge and then 3 months after the initial home visit. SETTING: Follow-up visits conducted in patients' home setting. PARTICIPANTS: Ninety-four adults who were recently discharged from inpatient rehabilitation, with diagnoses of neurologic, orthopedic, and medically complex conditions. INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Summary scores from AM-PAC-CAT, including 3 activity domains of movement and physical, personal care and instrumental, and applied cognition were compared with scores from a traditional fixed-length version of the AM-PAC with 66 items (AM-PAC-66). RESULTS: AM-PAC-CAT scores were in good agreement (intraclass correlation coefficient model 3,1 range, .77-.86) with scores from the AM-PAC-66. On average, the CAT programs required 43% of the time and 33% of the items compared with the AM-PAC-66. Both formats discriminated across functional severity groups. The standardized response mean (SRM) was greater for the movement and physical fixed form than the CAT; the effect size and SRM of the 2 other AM-PAC domains showed similar sensitivity between CAT and fixed formats. Using patients' own report as an anchor-based measure of change, the CAT and fixed length formats were comparable in responsiveness to patient-reported change over a 3-month interval. 
CONCLUSIONS: Accurate estimates for functional activity group-level changes can be obtained from CAT administrations, with a considerable reduction in administration time.10a*Activities of Daily Living10a*Adaptation, Physiological10a*Computer Systems10a*Questionnaires10aAdult10aAged10aAged, 80 and over10aChi-Square Distribution10aFactor Analysis, Statistical10aFemale10aHumans10aLongitudinal Studies10aMale10aMiddle Aged10aOutcome Assessment (Health Care)/*methods10aPatient Discharge10aProspective Studies10aRehabilitation/*standards10aSubacute Care/*standards1 aHaley, S M1 aSiebens, H1 aCoster, W J1 aTao, W1 aBlack-Schaffer, R M1 aGandek, B1 aSinclair, S J1 aNi, P uhttp://mail.iacat.org/content/computerized-adaptive-testing-follow-after-discharge-inpatient-rehabilitation-i-activity02347nas a2200217 4500008004100000020002200041245008000063210006900143260002600212300001000238490000700248520160400255653003001859653002101889653003201910653003001942653002501972100001501997700001502012856010202027 2006 eng d a0146-6216 (Print)00aSIMCAT 1.0: A SAS computer program for simulating computer adaptive testing0 aSIMCAT 10 A SAS computer program for simulating computer adaptiv bSage Publications: US a60-610 v303 aMonte Carlo methodologies are frequently applied to study the sampling distribution of the estimated proficiency level in adaptive testing. These methods eliminate real situational constraints. However, these Monte Carlo methodologies are not currently supported by the available software programs, and when these programs are available, their flexibility is limited. SIMCAT 1.0 is aimed at the simulation of adaptive testing sessions under different adaptive expected a posteriori (EAP) proficiency-level estimation methods (Blais & Raîche, 2005; Raîche & Blais, 2005) based on the one-parameter Rasch logistic model. 
These methods are all adaptive in the a priori proficiency-level estimation, the proficiency-level estimation bias correction, the integration interval, or a combination of these factors. The use of these adaptive EAP estimation methods diminishes considerably the shrinking, and therefore biasing, effect of the estimated a priori proficiency level encountered when this a priori is fixed at a constant value independently of the computed previous value of the proficiency level. SIMCAT 1.0 also computes empirical and estimated skewness and kurtosis coefficients, such as the standard error, of the estimated proficiency-level sampling distribution. In this way, the program allows one to compare empirical and estimated properties of the estimated proficiency-level sampling distribution under different variations of the EAP estimation method: standard error and bias, like the skewness and kurtosis coefficients. (PsycINFO Database Record (c) 2007 APA, all rights reserved)10acomputer adaptive testing10acomputer program10aestimated proficiency level10aMonte Carlo methodologies10aRasch logistic model1 aRaîche, G1 aBlais, J-G uhttp://mail.iacat.org/content/simcat-10-sas-computer-program-simulating-computer-adaptive-testing02112nas a2200229 4500008004100000245013800041210006900179300001400248490000700262520127000269653003101539653003401570653002501604653001701629653001901646653002401665100001401689700001801703700001701721700001901738856012501757 2006 eng d00aSimulated computerized adaptive test for patients with lumbar spine impairments was efficient and produced valid measures of function0 aSimulated computerized adaptive test for patients with lumbar sp a947–9560 v593 aObjective: To equate physical functioning (PF) items with Back Pain Functional Scale (BPFS) items, develop a computerized adaptive test (CAT) designed to assess lumbar spine functional status (LFS) in people with lumbar spine impairments, and compare discriminant validity of LFS measures (theta(IRT)) generated using 
all items analyzed with a rating scale Item Response Theory model (RSM) and measures generated using the simulated CAT (theta(CAT)). Methods: We performed a secondary analysis of retrospective intake rehabilitation data. Results: Unidimensionality and local independence of 25 BPFS and PF items were supported. Differential item functioning was negligible for levels of symptom acuity, gender, age, and surgical history. The RSM fit the data well. A lumbar spine specific CAT was developed that was 72% more efficient than using all 25 items to estimate LFS measures. theta(IRT) and theta(CAT) measures did not discriminate patients by symptom acuity, age, or gender, but discriminated patients by surgical history in similar clinically logical ways. theta(CAT) measures were as precise as theta(IRT) measures. Conclusion: A body part specific simulated CAT developed from an LFS item bank was efficient and produced precise measures of LFS without eroding discriminant validity.10aBack Pain Functional Scale10acomputerized adaptive testing10aItem Response Theory10aLumbar spine10aRehabilitation10aTrue-score equating1 aHart, D L1 aMioduski, J E1 aWerneke, M W1 aStratford, P W uhttp://mail.iacat.org/content/simulated-computerized-adaptive-test-patients-lumbar-spine-impairments-was-efficient-and-002654nas a2200409 4500008004100000245013400041210006900175300001000244490000700254520123100261653002501492653003201517653003101549653001001580653000901590653002201599653003301621653001101654653001101665653000901676653001601685653002401701653003101725653004101756653004501797653006801842653006101910653003001971653002802001653002202029100001402051700001302065700001802078700001402096700001502110856011902125 2006 eng d00aSimulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function0 aSimulated computerized adaptive test for patients with shoulder a290-80 v593 aBACKGROUND AND OBJECTIVE: To test unidimensionality and local independence of a set of shoulder 
functional status (SFS) items, develop a computerized adaptive test (CAT) of the items using a rating scale item response theory model (RSM), and compare discriminant validity of measures generated using all items (theta(IRT)) and measures generated using the simulated CAT (theta(CAT)). STUDY DESIGN AND SETTING: We performed a secondary analysis of data collected prospectively during rehabilitation of 400 patients with shoulder impairments who completed 60 SFS items. RESULTS: Factor analytic techniques supported that the 42 SFS items formed a unidimensional scale and were locally independent. Except for five items, which were deleted, the RSM fit the data well. The remaining 37 SFS items were used to generate the CAT. On average, 6 items were needed to estimate precise measures of function using the SFS CAT, compared with all 37 SFS items. The theta(IRT) and theta(CAT) measures were highly correlated (r = .96) and resulted in similar classifications of patients. CONCLUSION: The simulated SFS CAT was efficient and produced precise, clinically relevant measures of functional status with good discriminating ability.10a*Computer Simulation10a*Range of Motion, Articular10aActivities of Daily Living10aAdult10aAged10aAged, 80 and over10aFactor Analysis, Statistical10aFemale10aHumans10aMale10aMiddle Aged10aProspective Studies10aReproducibility of Results10aResearch Support, N.I.H., Extramural10aResearch Support, U.S. 
Gov't, Non-P.H.S.10aShoulder Dislocation/*physiopathology/psychology/rehabilitation10aShoulder Pain/*physiopathology/psychology/rehabilitation10aShoulder/*physiopathology10aSickness Impact Profile10aTreatment Outcome1 aHart, D L1 aCook, KF1 aMioduski, J E1 aTeal, C R1 aCrane, P K uhttp://mail.iacat.org/content/simulated-computerized-adaptive-test-patients-shoulder-impairments-was-efficient-and02073nas a2200217 4500008004500000245013400045210006900179300001200248490000700260520127300267653003401540653004201574653002501616653001901641100001401660700001301674700001801687700001401705700001501719856012101734 2006 Engldsh 00aSimulated computerized adaptive test for patients with shoulder impairments was efficient and produced valid measures of function0 aSimulated computerized adaptive test for patients with shoulder a290-2980 v593 aBackground and Objective: To test unidimensionality and local independence of a set of shoulder functional status (SFS) items,
develop a computerized adaptive test (CAT) of the items using a rating scale item response theory model (RSM), and compare discriminant validity of measures generated using all items (theta(IRT)) and measures generated using the simulated CAT (theta(CAT)).
Study Design and Setting: We performed a secondary analysis of data collected prospectively during rehabilitation of 400 patients
with shoulder impairments who completed 60 SFS items.
Results: Factor analytic techniques supported that the 42 SFS items formed a unidimensional scale and were locally independent. Except for five items, which were deleted, the RSM fit the data well. The remaining 37 SFS items were used to generate the CAT. On average, 6 items were needed to estimate precise measures of function using the SFS CAT, compared with all 37 SFS items. The theta(IRT) and theta(CAT) measures were highly correlated (r = .96) and resulted in similar classifications of patients.
Conclusion: The simulated SFS CAT was efficient and produced precise, clinically relevant measures of functional status with good
discriminating ability.