Showing 1 to 15 of 71 results
Peer reviewed
Lee, Samuel David; Walmsley, Philip T.; Sackett, Paul R.; Kuncel, Nathan – Applied Measurement in Education, 2021
Providing assessment validity information to decision makers in a clear and useful format is an ongoing challenge for the educational and psychological measurement community. We identify issues with a previous approach to a graphical presentation, noting that it is mislabeled as presenting incremental validity, when in fact it displays the effects…
Descriptors: Test Validity, Predictor Variables, Charts
Marcelo Andrade da Silva; A. Corinne Huggins-Manley; Jorge Luis Bazán; Amber Benedict – Applied Measurement in Education, 2024
A Q-matrix is a binary matrix that defines the relationship between items and latent variables and is widely used in diagnostic classification models (DCMs), and can also be adopted in multidimensional item response theory (MIRT) models. The construction process of the Q-matrix is typically carried out by experts in the subject area of the items…
Descriptors: Q Methodology, Matrices, Item Response Theory, Educational Assessment
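The Q-matrix defined above can be made concrete with a small sketch; the items, attributes, and binary entries below are invented purely for illustration and do not come from this study:

```python
import numpy as np

# A Q-matrix maps each test item (row) to the latent attributes (columns)
# it is assumed to measure. Entries are binary: 1 = item requires attribute.
q_matrix = np.array([
    [1, 0, 0],  # item 1 requires attribute A only
    [1, 1, 0],  # item 2 requires attributes A and B
    [0, 0, 1],  # item 3 requires attribute C only
    [0, 1, 1],  # item 4 requires attributes B and C
])

# Attribute indices required by item 2 (row 1, 0-indexed):
required = np.flatnonzero(q_matrix[1])
print(required.tolist())  # [0, 1]
```

In a DCM or MIRT application, each row constrains which latent dimensions can influence the corresponding item's response probability, which is why expert misspecification of entries is consequential.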
Wise, Steven; Kuhfeld, Megan – Applied Measurement in Education, 2021
Effort-moderated (E-M) scoring is intended to estimate how well a disengaged test taker would have performed had they been fully engaged. It accomplishes this adjustment by excluding disengaged responses from scoring and estimating performance from the remaining responses. The scoring method, however, assumes that the remaining responses are not…
Descriptors: Scoring, Achievement Tests, Identification, Validity
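The E-M adjustment described above — exclude disengaged responses, score the rest — can be sketched as follows. The response data and the fixed per-item time cutoffs are invented assumptions for illustration, not the authors' calibrated thresholds:

```python
# Effort-moderated (E-M) scoring sketch: a response whose time falls below
# an item's rapid-guessing threshold is treated as disengaged and excluded;
# performance is estimated from the remaining (presumed engaged) responses.
def em_score(responses, times, thresholds):
    """responses: 0/1 correctness; times: seconds; thresholds: per-item cutoffs."""
    engaged = [r for r, t, th in zip(responses, times, thresholds) if t >= th]
    if not engaged:
        return None  # nothing left to score
    return sum(engaged) / len(engaged)  # proportion correct among engaged

responses = [1, 0, 1, 1, 0]
times = [12.0, 1.5, 9.0, 2.0, 15.0]
thresholds = [3.0] * 5
print(round(em_score(responses, times, thresholds), 3))  # 0.667
```

Note the assumption the abstract flags: scoring the remaining responses this way is only unbiased if the engaged responses are representative of the test taker's true performance.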
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
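As a minimal illustration of a conventional DIF check for dichotomous items, a Mantel-Haenszel common odds ratio can be computed over score-matched strata. This is a standard technique named here for illustration, not necessarily among the methods this study compares, and the counts are invented:

```python
# Mantel-Haenszel common odds ratio sketch for DIF: examinees are matched
# on total score, and the odds of answering the studied item correctly are
# compared across reference and focal groups within each matched stratum.
def mh_odds_ratio(strata):
    """strata: list of (a, b, c, d) per score level:
    a = reference correct, b = reference incorrect,
    c = focal correct,     d = focal incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den  # values near 1.0 suggest no DIF

strata = [(30, 10, 28, 12), (20, 20, 18, 22)]
print(round(mh_odds_ratio(strata), 2))  # 1.25
```

For polytomous items, DSF extends this idea to each step of the response scale rather than a single correct/incorrect dichotomy.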
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to efficiently assess students' cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Karadavut, Tugba – Applied Measurement in Education, 2021
Mixture IRT models address the heterogeneity in a population by extracting latent classes and allowing item parameters to vary between latent classes. Once the latent classes are extracted, they need to be further examined to be characterized. Some approaches have been adopted in the literature for this purpose. These approaches examine either the…
Descriptors: Item Response Theory, Models, Test Items, Maximum Likelihood Statistics
Myers, Aaron J.; Ames, Allison J.; Leventhal, Brian C.; Holzman, Madison A. – Applied Measurement in Education, 2020
When rating performance assessments, raters may ascribe different scores for the same performance when rubric application does not align with the intended application of the scoring criteria. Given performance assessment score interpretation assumes raters apply rubrics as rubric developers intended, misalignment between raters' scoring processes…
Descriptors: Scoring Rubrics, Validity, Item Response Theory, Interrater Reliability
Bostic, Jonathan David; Sondergeld, Toni A.; Matney, Gabriel; Stone, Gregory; Hicks, Tiara – Applied Measurement in Education, 2021
Response process validity evidence provides a window into a respondent's cognitive processing. The purpose of this study is to describe a new data collection tool called a whole-class think aloud (WCTA). This work is performed as part of test development for a series of problem-solving measures to be used in elementary and middle grades. Data from…
Descriptors: Data Collection, Protocol Analysis, Problem Solving, Cognitive Processes
Su, Shiyang; Davison, Mark L. – Applied Measurement in Education, 2019
Response times have often been used as ancillary information to improve parameter estimation. Under the dual processing theory, assuming reading comprehension requires an automatic process, a fast, correct response is an indicator of effective automatic processing. A skilled, automatic comprehender should be high in response accuracy and low in…
Descriptors: Reaction Time, Reading Comprehension, Reading Tests, Predictive Validity
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
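For context, the Rasch model referenced above makes the probability of a correct response depend only on the difference between person ability and item difficulty, which is what makes its parameter estimates comparatively stable at small sample sizes. A minimal sketch with illustrative values:

```python
import math

# Rasch model item response function:
# P(correct) = exp(theta - b) / (1 + exp(theta - b)),
# where theta is person ability and b is item difficulty (both in logits).
def rasch_p(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(rasch_p(0.0, 0.0))            # 0.5: ability equals difficulty
print(round(rasch_p(1.0, 0.0), 3))  # 0.731: ability one logit above difficulty
```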
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
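Computer-based testing makes item-level response times available, and one common heuristic for flagging the disengaged (rapid-guessing) behavior described above compares each response time to a fraction of the item's mean time. The cutoff fraction and the data below are assumptions for illustration, not necessarily this study's procedure:

```python
# Sketch of rapid-guess flagging on a computer-based test: a response is
# flagged as a likely rapid guess when its time falls below a fixed
# fraction of the item's mean response time (an assumed heuristic).
def flag_rapid(times_by_item, fraction=0.10):
    """times_by_item: {item_id: [response times in seconds]};
    returns {item_id: [True if flagged as rapid guess]}."""
    flags = {}
    for item, times in times_by_item.items():
        threshold = fraction * (sum(times) / len(times))
        flags[item] = [t < threshold for t in times]
    return flags

times = {"item1": [20.0, 1.0, 30.0, 25.0]}
print(flag_rapid(times))  # {'item1': [False, True, False, False]}
```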
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Cohen, Dale J.; Ballman, Alesha; Rijmen, Frank; Cohen, Jon – Applied Measurement in Education, 2020
Computer-based, pop-up glossaries are perhaps the most promising accommodation aimed at mitigating the influence of linguistic structure and cultural bias on the performance of English Learner (EL) students on statewide assessments. To date, there is no established procedure for identifying the words that require a glossary for EL students that is…
Descriptors: Glossaries, Testing Accommodations, English Language Learners, Computer Assisted Testing
Bonner, Sarah; Chen, Peggy; Jones, Kristi; Milonovich, Brandon – Applied Measurement in Education, 2021
We describe the use of think alouds to examine substantive processes involved in performance on a formative assessment of computational thinking (CT) designed to support self-regulated learning (SRL). Our task design model included three phases of work on a computational thinking problem: forethought, performance, and reflection. The cognitive…
Descriptors: Formative Evaluation, Thinking Skills, Metacognition, Computer Science Education