Showing 1 to 15 of 50 results
Jose Antonio Mola Avila – ProQuest LLC, 2023
Accountability in education was implemented to improve poor learning outcomes by documenting and monitoring learning achievement results. In this process, external standardized achievement tests have played a central role, being the mechanism most frequently used to measure learning outcomes. However, several decades after its initial…
Descriptors: Foreign Countries, Standardized Tests, Achievement Tests, Accountability
Peer reviewed
José Manuel Arencibia Alemán; Astrid Marie Jorde Sandsør; Henrik Daae Zachrisson; Sigrid Blömeke – Assessment in Education: Principles, Policy & Practice, 2024
Modest correlations between teacher-assigned grades and external assessments of academic achievement (r = 0.40-0.60) have led many educational stakeholders to deem grades subjective and unreliable. However, theoretical and methodological challenges, such as construct misalignment, data unavailability and sample unrepresentativeness, limit the…
Descriptors: Grades (Scholastic), Grading, Achievement Tests, Test Validity
Peer reviewed
Willse, John T. – Measurement and Evaluation in Counseling and Development, 2017
This article provides a brief introduction to the Rasch model. Motivation for using Rasch analyses is provided. Important Rasch model concepts and key aspects of result interpretation are introduced, with major points reinforced using a simulation demonstration. Concrete guidelines are provided regarding sample size and the evaluation of items.
Descriptors: Item Response Theory, Test Results, Test Interpretation, Simulation
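As a rough companion to the Rasch overview above (not code from the article), a minimal sketch of the dichotomous Rasch item response function; the ability and difficulty values are hypothetical.

```python
import numpy as np

def rasch_prob(theta, b):
    """Dichotomous Rasch model: probability of a correct response given
    person ability theta and item difficulty b, both on the logit scale."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Hypothetical values: one person, three items of increasing difficulty.
theta = 0.5
difficulties = np.array([-1.0, 0.0, 1.5])
print(rasch_prob(theta, difficulties))  # probabilities drop as items get harder
```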
New York State Education Department, 2018
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 English Language Arts (ELA) and Mathematics 2018 Operational Tests. This report includes information about test content and test development, item (i.e., individual…
Descriptors: English, Language Arts, Language Tests, Mathematics Tests
Lichtenstein, Robert – Communique, 2013
Assessment of human abilities and behaviors is enormously enhanced by the use of standardized assessment measures that yield norm-referenced scores. As school psychologists, we rely on quantitative findings to anchor our judgments about a child's developmental and educational functioning and to enhance our capacity to draw diagnostic…
Descriptors: Test Results, School Psychologists, Psychoeducational Methods, Scores
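To illustrate the norm-referenced scores discussed above, a minimal sketch of converting a raw score to a z-score and a standard score (mean 100, SD 15); the norm-group mean and SD are hypothetical, not from any published instrument.

```python
# Hypothetical norm-group raw-score mean and SD; real norms come from the publisher.
NORM_MEAN, NORM_SD = 31.4, 6.2

def standard_score(raw, mean=NORM_MEAN, sd=NORM_SD):
    """Convert a raw score to a z-score and a standard score (M = 100, SD = 15)."""
    z = (raw - mean) / sd
    return z, 100 + 15 * z

z, ss = standard_score(38)
print(f"z = {z:.2f}, standard score = {ss:.0f}")  # z = 1.06, standard score = 116
```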
Powers, Sonya; Li, Dongmei; Suh, Hongwook; Harris, Deborah J. – ACT, Inc., 2016
ACT reporting categories and ACT Readiness Ranges are new features added to the ACT score reports starting in fall 2016. For each reporting category, the number correct score, the maximum points possible, the percent correct, and the ACT Readiness Range, along with an indicator of whether the reporting category score falls within the Readiness…
Descriptors: Scores, Classification, College Entrance Examinations, Error of Measurement
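The score-report fields described above lend themselves to a simple illustration; the sketch below uses invented category names, counts, and Readiness Ranges (not ACT's actual values) to show how a percent-correct figure and a within-range indicator could be derived.

```python
# Hypothetical reporting categories; names, counts, and ranges are illustrative only.
categories = {
    "Number & Quantity": {"correct": 7, "max": 9, "readiness_range": (5, 9)},
    "Algebra":           {"correct": 4, "max": 8, "readiness_range": (6, 8)},
}

for name, c in categories.items():
    pct = 100.0 * c["correct"] / c["max"]
    lo, hi = c["readiness_range"]
    within = lo <= c["correct"] <= hi
    print(f"{name}: {c['correct']}/{c['max']} ({pct:.0f}% correct), "
          f"within Readiness Range: {within}")
```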
New York State Education Department, 2017
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 English Language Arts (ELA) and Mathematics 2017 Operational Tests. This report includes information about test content and test development, item (i.e., individual…
Descriptors: English, Language Arts, Language Tests, Mathematics Tests
Peer reviewed
Bradshaw, Jenny; Wheater, Rebecca – Research Papers in Education, 2013
This review examined a range of approaches internationally to the reporting of assessment results for individual students, with a particular focus on how results are represented, the level of detail reported and the steps taken to quantify, report and explain error and uncertainty in the results' reports or certificates given to students in a…
Descriptors: Test Reliability, Error of Measurement, High Stakes Tests, Foreign Countries
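One common way to quantify the uncertainty such result reports describe is a band around the reported score based on the standard error of measurement; a minimal sketch, with hypothetical score, standard deviation, and reliability values:

```python
import math

def score_band(score, sd, reliability, z=1.96):
    """Approximate 95% band around a reported score using the standard error
    of measurement: SEM = SD * sqrt(1 - reliability)."""
    sem = sd * math.sqrt(1.0 - reliability)
    return score - z * sem, score + z * sem

# Hypothetical reported scale score and illustrative test statistics.
lo, hi = score_band(score=512, sd=40, reliability=0.91)
print(f"Reported 512; 95% band roughly {lo:.0f}-{hi:.0f}")  # roughly 488-536
```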
Peer reviewed
Woods, Carol M. – Applied Psychological Measurement, 2011
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
Descriptors: Simulation, Item Response Theory, Testing, Questionnaires
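The IRT-LR approach compares a model that constrains the studied item's parameters to be equal across groups against one that frees them; a minimal sketch of that comparison step, assuming the log-likelihoods have already been obtained from fitted IRT models (the values shown are hypothetical):

```python
from scipy.stats import chi2

def irt_lr_dif(loglik_constrained, loglik_free, n_freed_params):
    """Likelihood-ratio DIF test: a model constraining the studied item's
    parameters to be equal across groups versus one that frees them.
    Returns the LR statistic and its chi-square p-value."""
    lr = -2.0 * (loglik_constrained - loglik_free)
    return lr, chi2.sf(lr, df=n_freed_params)

# Hypothetical log-likelihoods, as would come from two fitted IRT models.
lr, p = irt_lr_dif(loglik_constrained=-10452.7, loglik_free=-10447.9, n_freed_params=2)
print(f"LR = {lr:.2f}, p = {p:.4f}")  # a small p suggests DIF on the studied item
```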
Brockmann, Frank – Council of Chief State School Officers, 2011
State testing programs today are more extensive than ever, and their results are required to serve more purposes and inform more high-stakes decisions than one might have imagined. Assessment results are used to hold schools, districts, and states accountable for student performance and to help guide a multitude of important decisions. This report describes…
Descriptors: Accuracy, Measurement, Testing, Expertise
Peer reviewed
He, Qingping; Boyle, Andrew; Opposs, Dennis – Evaluation & Research in Education, 2011
Building on findings from existing qualitative research into public perceptions of reliability in examination results in England, a questionnaire was developed and administered to samples of teachers, students and employers to study their awareness of and opinions about various aspects of reliability quantitatively. Main findings from the study…
Descriptors: Qualitative Research, Student Evaluation, Tests, Program Effectiveness
Peer reviewed
Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010
"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…
Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores
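A scale score is often produced from a raw score by a transformation fixed during scale construction; a minimal sketch of a linear raw-to-scale conversion with rounding and truncation to a reporting range (all constants are hypothetical):

```python
def raw_to_scale(raw, slope=2.5, intercept=200, lo=200, hi=400):
    """Linear raw-to-scale conversion, rounded and truncated to the reporting
    range. All constants here are hypothetical; operational programs derive
    them during scale construction."""
    scale = round(slope * raw + intercept)
    return max(lo, min(hi, scale))

print([raw_to_scale(r) for r in (0, 30, 62, 80)])  # [200, 275, 355, 400]
```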
Doorey, Nancy A. – Council of Chief State School Officers, 2011
The work reported in this paper reflects a collaborative effort of many individuals representing multiple organizations. It began during a session at the October 2008 meeting of TILSA when a representative of a member state asked the group if any of their programs had experienced unexpected fluctuations in the annual state assessment scores, and…
Descriptors: Testing, Sampling, Expertise, Testing Programs
Peer reviewed
Woods, Carol M. – Applied Psychological Measurement, 2009
Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…
Descriptors: Test Results, Testing, Item Response Theory, Test Bias
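To illustrate the matching idea in the abstract above, a crude sketch that matches examinees on a rest score (total score excluding the studied item) and compares proportion correct between groups within each stratum; the response data are simulated and this is not a complete DIF statistic:

```python
import numpy as np

def matched_proportions(responses, group, item):
    """Proportion correct on the studied item for a reference (0) and focal (1)
    group, within strata matched on a rest score (total score excluding that
    item). A crude observed-score matching illustration, not a DIF statistic."""
    rest = responses.sum(axis=1) - responses[:, item]
    rows = []
    for s in np.unique(rest):
        ref = responses[(rest == s) & (group == 0), item]
        foc = responses[(rest == s) & (group == 1), item]
        if len(ref) and len(foc):
            rows.append((int(s), ref.mean(), foc.mean()))
    return rows

# Simulated 0/1 responses (examinees x items) and group labels, for illustration only.
rng = np.random.default_rng(1)
responses = rng.integers(0, 2, size=(300, 6))
group = rng.integers(0, 2, size=300)
for stratum, p_ref, p_foc in matched_proportions(responses, group, item=0):
    print(f"rest score {stratum}: reference {p_ref:.2f}, focal {p_foc:.2f}")
```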
Peer reviewed
Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009
Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…
Descriptors: Test Results, Test Items, Testing, Aptitude Tests