Publication Date
In 2025 | 0 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 16 |
Since 2016 (last 10 years) | 38 |
Since 2006 (last 20 years) | 70 |
Descriptor
Scores | 133 |
Test Items | 36 |
Item Response Theory | 25 |
Test Construction | 24 |
Mathematics Tests | 20 |
Comparative Analysis | 18 |
Validity | 18 |
Test Results | 17 |
Scoring | 16 |
Models | 15 |
Reliability | 15 |
Source
Applied Measurement in Education | 133 |
Author
Hambleton, Ronald K. | 4 |
Sireci, Stephen G. | 4 |
Wise, Steven L. | 4 |
Carney, Michele | 3 |
Huff, Kristen | 3 |
Johnson, Robert L. | 3 |
Lane, Suzanne | 3 |
Linn, Robert L. | 3 |
Meijer, Rob R. | 3 |
Sackett, Paul R. | 3 |
Bridgeman, Brent | 2 |
Publication Type
Journal Articles | 133 |
Reports - Research | 86 |
Reports - Evaluative | 42 |
Speeches/Meeting Papers | 6 |
Information Analyses | 4 |
Reports - Descriptive | 4 |
Tests/Questionnaires | 2 |
Book/Product Reviews | 1 |
Reports - General | 1 |
Education Level
Secondary Education | 14 |
High Schools | 13 |
Higher Education | 13 |
Elementary Education | 9 |
Postsecondary Education | 8 |
Grade 8 | 7 |
Elementary Secondary Education | 6 |
Middle Schools | 6 |
Grade 3 | 5 |
Grade 4 | 5 |
Junior High Schools | 5 |
Location
Canada | 3 |
Arizona | 2 |
Georgia | 2 |
Vermont | 2 |
Virginia | 2 |
California | 1 |
California (Los Angeles) | 1 |
Europe | 1 |
Indiana | 1 |
Iran | 1 |
Kansas | 1 |
Sarah Alahmadi; Christine E. DeMars – Applied Measurement in Education, 2024
Large-scale educational assessments are sometimes considered low-stakes, increasing the possibility of confounding true performance level with low motivation. These concerns are amplified in remote testing conditions. To remove the effects of low effort levels in responses observed in remote low-stakes testing, several motivation filtering methods…
Descriptors: Multiple Choice Tests, Item Response Theory, College Students, Scores
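For context: the abstract does not say which motivation filter the authors applied. A widely used option in this literature is response-time effort (given here only as a reference point, not as a description of this particular study), which flags rapid guesses by comparing each item response time to a threshold:
\[ SB_{ij} = \begin{cases} 1 & \text{if } RT_{ij} \ge T_i \\ 0 & \text{otherwise,} \end{cases} \qquad RTE_j = \frac{1}{k}\sum_{i=1}^{k} SB_{ij}, \]
where RT_{ij} is examinee j's response time on item i, T_i is an item-level time threshold, and k is the number of items; examinees whose RTE_j falls below a cutoff (often around .90) are removed before scoring.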
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
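For context: the sampling-model distinction in this entry can be written out. If n_{jk} counts responses scored in category j at Time A and category k at Time B, the product multinomial fixes the Time A row totals n_{j\cdot} and models each row separately (notation assumed here for illustration):
\[ P(\{n_{jk}\}) = \prod_{j} \frac{n_{j\cdot}!}{\prod_{k} n_{jk}!} \prod_{k} \pi_{k\mid j}^{\,n_{jk}}, \]
whereas a single multinomial would treat all cells as draws from one distribution with only the grand total fixed.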
Rutkowski, David; Rutkowski, Leslie; Valdivia, Dubravka Svetina; Canbolat, Yusuf; Underhill, Stephanie – Applied Measurement in Education, 2023
Several states in the US have removed time limits on their state assessments. In Indiana, where this study takes place, the state assessment is both untimed during the testing window and allows unlimited breaks during the testing session. Using grade 3 and 8 math and English state assessment data, in this paper we focus on time used for testing…
Descriptors: Testing, Time, Intervals, Academic Achievement
Rios, Joseph A. – Applied Measurement in Education, 2022
Testing programs are confronted with the decision of whether to report individual scores for examinees that have engaged in rapid guessing (RG). As noted by the "Standards for Educational and Psychological Testing," this decision should be based on a documented criterion that determines score exclusion. To this end, a number of heuristic…
Descriptors: Testing, Guessing (Tests), Academic Ability, Scores
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to efficiently assess students' cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
DeMars, Christine E. – Applied Measurement in Education, 2021
Estimation of parameters for the many-facets Rasch model requires that conditional on the values of the facets, such as person ability, item difficulty, and rater severity, the observed responses within each facet are independent. This requirement has often been discussed for the Rasch models and 2PL and 3PL models, but it becomes more complex…
Descriptors: Item Response Theory, Test Items, Ability, Scores
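For context: a common statement of the many-facets Rasch model discussed here, for person n, item i, rater j, and rating category k (notation assumed, not quoted from the article), is
\[ \ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k, \]
where \theta_n is person ability, \delta_i item difficulty, \lambda_j rater severity, and \tau_k the step threshold; the local-independence requirement in the abstract means responses are independent once these facet parameters are conditioned on.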
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article presents the performance of item response theory (IRT) models when double ratings are used as item scores over single ratings when rater effects are present. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
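For context: the generalized partial credit model named in this entry is usually written in Muraki's parameterization (given here as background, not quoted from the article) as
\[ P_{ik}(\theta) = \frac{\exp\!\left(\sum_{v=1}^{k} a_i(\theta - b_{iv})\right)}{\sum_{c=0}^{m_i}\exp\!\left(\sum_{v=1}^{c} a_i(\theta - b_{iv})\right)}, \qquad k = 0, 1, \ldots, m_i, \]
with the empty sum for k = 0 defined as zero, a_i the item discrimination, and b_{iv} the step difficulties.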
Clark, Amy K.; Nash, Brooke; Karvonen, Meagan – Applied Measurement in Education, 2022
Assessments scored with diagnostic models are increasingly popular because they provide fine-grained information about student achievement. Because of differences in how diagnostic assessments are scored and how results are used, the information teachers must know to interpret and use results may differ from concepts traditionally included in…
Descriptors: Elementary School Teachers, Secondary School Teachers, Assessment Literacy, Diagnostic Tests
Almehrizi, Rashid S. – Applied Measurement in Education, 2021
KR-21 reliability and its extension (coefficient α) give the reliability estimate of test scores under the assumption of tau-equivalent forms. KR-21 reliability gives the reliability estimate for summed scores for dichotomous items when items are randomly sampled from an infinite pool of similar items (randomly parallel forms). The article…
Descriptors: Test Reliability, Scores, Scoring, Computation
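For reference, the two coefficients named in this entry (the standard formulas, not the article's proposed extension) are
\[ \mathrm{KR\text{-}21} = \frac{k}{k-1}\left(1 - \frac{\bar{X}(k-\bar{X})}{k\,s_X^{2}}\right), \qquad \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^{2}}{s_X^{2}}\right), \]
where k is the number of items, \bar{X} the mean total score, s_X^{2} the total-score variance, and s_i^{2} the variance of item i; KR-21 applies to dichotomous items and uses only the mean and variance of the summed scores.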
Rios, Joseph – Applied Measurement in Education, 2021
Four decades of research have shown that students' low test-taking effort is a serious threat to the validity of score-based inferences from low-stakes, group-based educational assessments. This meta-analysis sought to identify effective interventions for improving students' test-taking effort in such contexts. Included studies: (1) used a…
Descriptors: Test Wiseness, Student Motivation, Meta Analysis, Intervention
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Sackett, Paul R.; Sharpe, Melissa S.; Kuncel, Nathan – Applied Measurement in Education, 2021
The literature is replete with references to a disproportionate reliance on admission test scores (e.g., the ACT or SAT) in the college admissions process. School-reported reliance on test scores and grades has been used to study this question, generally indicating relatively equal reliance on the two, with a slightly higher endorsement of grades.…
Descriptors: College Admission, Admission Criteria, College Entrance Examinations, College Applicants
Yiling Cheng; I-Chien Chen; Barbara Schneider; Mark Reckase; Joseph Krajcik – Applied Measurement in Education, 2024
The current study expands on previous research on gender differences and similarities in science test scores. Using three different approaches -- differential item functioning, differential distractor functioning, and decision tree analysis -- we examine a high school science assessment administered to 3,849 10th-12th graders, of whom 2,021 are…
Descriptors: Gender Differences, Science Achievement, Responses, Testing
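For context: the abstract does not name the specific DIF procedure used. One standard choice (shown only as an illustration) is the Mantel-Haenszel common odds ratio and its ETS delta transformation,
\[ \hat{\alpha}_{MH} = \frac{\sum_{s} A_s D_s / N_s}{\sum_{s} B_s C_s / N_s}, \qquad \Delta_{MH} = -2.35\,\ln \hat{\alpha}_{MH}, \]
where, within each matched score stratum s, A_s and B_s are the reference-group correct and incorrect counts, C_s and D_s are the corresponding focal-group counts, and N_s is the stratum total; values of \Delta_{MH} near zero indicate negligible DIF.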
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Dahlke, Jeffrey A.; Sackett, Paul R.; Kuncel, Nathan R. – Applied Measurement in Education, 2023
We examine longitudinal data from 120,384 students who took a version of the PSAT/SAT in the 9th, 10th, 11th, and 12th grades. We investigate score changes over time and show that socioeconomic status (SES) is related to the degree of score improvement. We note that the 9th and 10th grade PSAT are low-stakes tests, while the operational SAT is a…
Descriptors: Scores, College Entrance Examinations, Socioeconomic Status, Test Preparation