NotesFAQContact Us
Collection
Advanced
Search Tips
Publication Date
In 20250
Since 20240
Since 2021 (last 5 years)0
Since 2016 (last 10 years)5
Since 2006 (last 20 years)10
Source
Applied Measurement in…13
Audience
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing all 13 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017
In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…
Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Peer reviewed Peer reviewed
Direct linkDirect link
De Lisle, Jerome – Applied Measurement in Education, 2015
This article explores the challenge of setting performance standards in a non-Western context. The study is centered on standard-setting practice in the national learning assessments of Trinidad and Tobago. Quantitative and qualitative data from annual evaluations between 2005 and 2009 were compiled, analyzed, and deconstructed. In the mixed…
Descriptors: Foreign Countries, National Standards, Educational Assessment, Standard Setting
Peer reviewed Peer reviewed
Direct linkDirect link
Michaelides, Michalis P. – Applied Measurement in Education, 2019
The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively- and negatively keyed items. The current research…
Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests
Peer reviewed Peer reviewed
Direct linkDirect link
McGrane, Joshua Aaron; Humphry, Stephen Mark; Heldsinger, Sandra – Applied Measurement in Education, 2018
National standardized assessment programs have increasingly included extended written performances, amplifying the need for reliable, valid, and efficient methods of assessment. This article examines a two-stage method using comparative judgments and calibrated exemplars as a complement and alternative to existing methods of assessing writing.…
Descriptors: Standardized Tests, Foreign Countries, Writing Tests, Writing Evaluation
Peer reviewed Peer reviewed
Direct linkDirect link
Deunk, Marjolein I.; van Kuijk, Mechteld F.; Bosker, Roel J. – Applied Measurement in Education, 2014
Standard setting methods, like the Bookmark procedure, are used to assist education experts in formulating performance standards. Small group discussion is meant to help these experts in setting more reliable and valid cutoff scores. This study is an analysis of 15 small group discussions during two standards setting trajectories and their effect…
Descriptors: Cutting Scores, Standard Setting, Group Discussion, Reading Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Eklöf, Hanna; Pavešic, Barbara Japelj; Grønmo, Liv Sissel – Applied Measurement in Education, 2014
The purpose of the study was to measure students' reported test-taking effort and the relationship between reported effort and performance on the Trends in International Mathematics and Science Study (TIMSS) Advanced mathematics test. This was done in three countries participating in TIMSS Advanced 2008 (Sweden, Norway, and Slovenia), and the…
Descriptors: Mathematics Tests, Cross Cultural Studies, Foreign Countries, Correlation
Peer reviewed Peer reviewed
Direct linkDirect link
Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011
The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates…
Descriptors: Test Construction, Validity, Evidence, Science Tests
Peer reviewed Peer reviewed
Direct linkDirect link
Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011
The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…
Descriptors: Test Items, Item Analysis, Achievement Tests, French
Peer reviewed Peer reviewed
Bong, Mimi; Hocevar, Dennis – Applied Measurement in Education, 2002
Examined convergent and discriminant validity of various self-efficacy measures across two studies, one involving 358 U.S. high school students and another involving 235 Korean female high school students. Across the studies the first-order confirmatory factor analyses provide support for both convergent validity of different self-efficacy…
Descriptors: Comparative Analysis, Foreign Countries, High School Students, High Schools
Peer reviewed Peer reviewed
Byrne, Barbara M. – Applied Measurement in Education, 1990
The extent to which convergent and discriminant validity of 4 self-concept traits and the method effects associated with 3 measurement scales that were equivalent across gender was studied. Data from 832 students (412 boys and 420 girls) in grades 11 and 12 in suburban Canada were analyzed. (SLD)
Descriptors: Adolescents, Foreign Countries, Grade 11, Grade 12
Peer reviewed Peer reviewed
Byrne, Barbara M.; Schneider, Barry H. – Applied Measurement in Education, 1988
For 241 normal and 132 gifted fifth graders and 113 normal and 117 gifted seventh graders in Ottawa (Ontario), exploratory and confirmatory factor analyses investigated the factorial validity of the Perceived Competence Scale for Children (PCSC). Overall, the PCSC demonstrated sound psychometric properties. (SLD)
Descriptors: Ability, Academically Gifted, Age Differences, Competence