Showing 1 to 15 of 22 results
Peer reviewed
Evers, Arne; Sijtsma, Klaas; Lucassen, Wouter; Meijer, Rob R. – International Journal of Testing, 2010
This article describes the 2009 revision of the Dutch Rating System for Test Quality and presents the results of test ratings from almost 30 years. The rating system evaluates the quality of a test on seven criteria: theoretical basis, quality of the testing materials, comprehensiveness of the manual, norms, reliability, construct validity, and…
Descriptors: Rating Scales, Documentation, Educational Quality, Educational Testing
Peer reviewed
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
Peer reviewed
Barry, Carol L.; Horst, S. Jeanne; Finney, Sara J.; Brown, Allison R.; Kopp, Jason P. – International Journal of Testing, 2010
Given the prevalence of low-stakes testing internationally (e.g., NAEP, TIMSS, PIRLS), it is crucial to try to better understand examinee motivation in these contexts. In the current study, mixture modeling results supported three different profiles of test-taking effort over the course of five tests. Classes 1 and 2 had varying levels of effort…
Descriptors: Testing, Comparative Analysis, Accountability, College Students
Peer reviewed
Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011
Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…
Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)
Peer reviewed
Zenisky, April L.; Crotts, Katrina M. – International Journal of Testing, 2010
The "International Journal of Testing" (IJT) is the journal of the International Test Commission. It is intended to support the dissemination of scholarly research on tests and test use worldwide. The purpose of this article is to reflect on what has been published in IJT over its nine volumes to date, with a focus on the extent to which…
Descriptors: Test Use, Testing, Evaluation, Tests
Peer reviewed
In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2010
Because structural equation models are widely used in testing and assessment, investigation into the accuracy of such models may help raise awareness of the value of reanalysis or replication. We focused on second language testing and learning studies and examined: (a) To what extent is information necessary for replication provided by authors?…
Descriptors: Structural Equation Models, Second Language Learning, Second Languages, Testing
Peer reviewed
Kim, Se-Kang – International Journal of Testing, 2010
The aim of the current study is to validate the invariance of major profile patterns derived from multidimensional scaling (MDS) by bootstrapping. Profile Analysis via Multidimensional Scaling (PAMS) was employed to obtain profiles and bootstrapping was used to construct the sampling distributions of the profile coordinates and the empirical…
Descriptors: Intervals, Multidimensional Scaling, Profiles, Evaluation
Peer reviewed
Reeve, Charlie L.; Lam, Holly – International Journal of Testing, 2007
Prior research regarding practice effects on ability tests has focused primarily on the differences in average score gains across different types of preparation (e.g., test retaking vs. focused item practice vs. coaching). In contrast, there has been little concerted effort towards understanding the significant variance in score gains across…
Descriptors: Achievement Gains, Test Wiseness, Tests, Testing
Peer reviewed
Ullstadius, Eva; Carlstedt, Berit; Gustafsson, Jan-Eric – International Journal of Testing, 2008
The influence of general and verbal ability on each of 72 verbal analogy test items was investigated with new factor analytical techniques. The analogy items together with the Computerized Swedish Enlistment Battery (CAT-SEB) were given randomly to two samples of 18-year-old male conscripts (n = 8566 and n = 5289). Thirty-two of the 72 items had…
Descriptors: Test Items, Verbal Ability, Factor Analysis, Swedish
Peer reviewed
Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009
Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…
Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment
Peer reviewed
Brown, Richard S.; Villarreal, Julio C. – International Journal of Testing, 2007
There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation,…
Descriptors: Psychometrics, Test Bias, Testing, Simulation
Peer reviewed
Veldkamp, Bernard P. – International Journal of Testing, 2008
Integrity[TM], an online application for testing both the statistical integrity of the test and the academic integrity of the examinees, was evaluated for this review. Program features and the program output are described. An overview of the statistics in Integrity[TM] is provided, and the application is illustrated with a small simulation study.…
Descriptors: Simulation, Integrity, Statistics, Computer Assisted Testing
Peer reviewed
Evers, Arne – International Journal of Testing, 2001
Describes the Dutch rating system for test quality, which evaluates a test for seven criteria, and analyses the results of test ratings from the past 18 years. Results show a steady increase in test quality in the Netherlands that can be attributed to use of better tests and declining use of tests of less quality after evaluation. (SLD)
Descriptors: Criteria, Educational Testing, Evaluation Methods, Foreign Countries
Peer reviewed
Bodkin-Andrews, Gawaian H.; Ha, My Trinh; Craven, Rhonda G.; Yeung, Alexander Seesing – International Journal of Testing, 2010
This investigation reports on the cross-cultural equivalence testing of the Self-Description Questionnaire II (short version; SDQII-S) for Indigenous and non-Indigenous Australian secondary student samples. A variety of statistical analysis techniques were employed to assess the psychometric properties of the SDQII-S for both the Indigenous and…
Descriptors: Indigenous Populations, Disadvantaged, Testing, Measures (Individuals)
Peer reviewed
Mahoney, Kate – International Journal of Testing, 2008
Education policy in many countries has undergone changes regarding the testing of English Language Learners (ELLs), who by definition are not yet proficient in the language of the test. As policies mandate the inclusion of ELLs in large-scale testing, many question the validity of achievement test scores because the degree to which the test score…
Descriptors: Test Items, Linguistics, Testing, Second Language Learning