ERIC - Search Results

Publication Date

In 2025	1
Since 2024	3
Since 2021 (last 5 years)	6
Since 2016 (last 10 years)	10
Since 2006 (last 20 years)	16

Descriptor

Scores	70
Test Construction	70
Test Use	70
Test Validity	24
Test Interpretation	15
Test Reliability	15
Elementary Secondary Education	13
Test Results	12
Achievement Tests	11
Higher Education	11
Test Items	11
Testing Programs	11
Evaluation Methods	10
Language Tests	10
Standardized Tests	10
Academic Achievement	9
Educational Assessment	9
Psychometrics	9
Student Evaluation	9
Decision Making	8
State Programs	8
Test Format	8
Criterion Referenced Tests	7
Foreign Countries	7
Norm Referenced Tests	7

Publication Type

Journal Articles	27
Reports - Research	20
Speeches/Meeting Papers	19
Reports - Evaluative	16
Guides - Non-Classroom	11
Tests/Questionnaires	10
Reports - Descriptive	9
Books	6
Information Analyses	5
Numerical/Quantitative Data	2
Opinion Papers	2
Book/Product Reviews	1
Collected Works - General	1
Dissertations/Theses -…	1
Dissertations/Theses -…	1
Guides - Classroom - Teacher	1
Guides - General	1
Reports - General	1

Education Level

Elementary Secondary Education	5
Higher Education	4
Postsecondary Education	4
Elementary Education	2
Secondary Education	2
Adult Basic Education	1
Adult Education	1
Junior High Schools	1
Middle Schools	1

Audience

Practitioners	8
Teachers	5
Administrators	2
Researchers	2
Community	1
Parents	1
Students	1

Location

Hong Kong	2
Indiana	2
Ohio	2
Alabama	1
Colorado	1
Delaware	1
Europe	1
Indonesia	1
Kansas	1
Massachusetts	1
Michigan	1
Minnesota	1
New Jersey	1
New York	1
North Carolina	1
Oregon	1
Tennessee	1
United Kingdom	1
Vermont	1

Laws, Policies, & Programs

Comprehensive Education…	1
Every Student Succeeds Act…	1

Showing 1 to 15 of 70 results

Language Testers and Their Place in the Policy Web

Peer reviewed

Laura Schildt; Bart Deygers; Albert Weideman – Language Testing, 2024

In the context of policy-driven language testing for citizenship, a growing body of research examines the political justifications and ethical implications of language requirements and test use. However, virtually no studies have looked at the role that language testers play in the evolution of language requirements. Critical gaps remain in our…

Descriptors: Language Tests, Citizenship, Educational Policy, Assessment Literacy

Moving the Field of Vocabulary Assessment Forward: The Need for More Rigorous Test Development and Validation

Peer reviewed

Schmitt, Norbert; Nation, Paul; Kremmel, Benjamin – Language Teaching, 2020

Recently, a large number of vocabulary tests have been made available to language teachers, testers, and researchers. Unfortunately, most of them have been launched with inadequate validation evidence. The field of language testing has become increasingly more rigorous in the area of test validation, but developers of vocabulary tests have…

Descriptors: Test Construction, Test Validity, Language Tests, Test Use

Using Multilabel Neural Network to Score High-Dimensional Assessments for Different Use Foci: An Example with College Major Preference Assessment

Peer reviewed

Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025

Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…

Descriptors: Tests, Testing, Scores, Test Construction

Revisiting Rating Scale Development for Rater-Mediated Language Performance Assessments: Modelling Construct and Contextual Choices Made by Scale Developers

Peer reviewed

Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021

Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…

Descriptors: Rating Scales, Test Construction, Language Tests, Test Use

The Intent of ChatGPT Usage and Its Robustness in Medical Proficiency Exams: A Systematic Review

Peer reviewed

Tatiana Chaiban; Zeinab Nahle; Ghaith Assi; Michelle Cherfane – Discover Education, 2024

Background: Since it was first launched, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly the medical field. Objective: The main aim of this review is to thoroughly assess the performance of the distinct version of ChatGPT in subspecialty written medical proficiency exams and the factors that…

Descriptors: Medical Education, Accuracy, Artificial Intelligence, Computer Software

Delineating Discrepancies between TOEFL PBT and CBT

Peer reviewed

Yulianto, Ahmad; Pudjitriherwanti, Anastasia; Kusumah, Chevy; Oktavia, Dies – International Journal of Language Testing, 2023

The increasing use of computer-based mode in language testing raises concern over its similarities with and differences from paper-based format. The present study aimed to delineate discrepancies between TOEFL PBT and CBT. For that objective, a quantitative method was employed to probe into scores equivalence, the performance of male-female…

Descriptors: Computer Assisted Testing, Test Format, Comparative Analysis, Scores

Making the Case for the Quality and Use of a New Language Proficiency Assessment: Validity Argument for the Redesigned "TOEIC Bridge"® Tests. Research Report. ETS RR-21-20

Peer reviewed

Schmidgall, Jonathan; Cid, Jaime; Carter Grissom, Elizabeth; Li, Lucy – ETS Research Report Series, 2021

The redesigned "TOEIC Bridge"® tests were designed to evaluate test takers' English listening, reading, speaking, and writing skills in the context of everyday adult life. In this paper, we summarize the initial validity argument that supports the use of test scores for the purpose of selection, placement, and evaluation of a test…

Descriptors: Language Tests, Second Language Learning, English (Second Language), Language Proficiency

"Quality Testing Standards" -- A Starter Kit for States. Version 6.17.2020

New Meridian Corporation, 2020

New Meridian Corporation has developed the "Quality Testing Standards and Criteria for Comparability Claims" (QTS) to provide guidance to states that are interested in including New Meridian content and would like to either keep reporting scores on the New Meridian Scale or use the New Meridian performance levels; that is, the state…

Descriptors: Testing, Standards, Comparative Analysis, Test Content

A Brief Guide to Selecting and Using Pre-Post Assessments

Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019

This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…

Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing

Does Test Item Performance Increase with Test-to-Standards Alignment?

Peer reviewed

Traynor, Anne – Educational Assessment, 2017

Variation in test performance among examinees from different regions or national jurisdictions is often partially attributed to differences in the degree of content correspondence between local school or training program curricula, and the test of interest. This posited relationship between test-curriculum correspondence, or "alignment,"…

Descriptors: Test Items, Test Construction, Alignment (Education), Curriculum

Building an "Assessment Use Argument" for Sign Language: The BSL Nonsense Sign Repetition Test

Peer reviewed

Mann, Wolfgang; Marshall, Chloe R. – International Journal of Bilingual Education and Bilingualism, 2010

In this article, we adapt a concept designed to structure language testing more effectively, the "Assessment Use Argument" ("AUA"), as a framework for the development and/or use of sign language assessments for deaf children who are taught in a sign bilingual education setting. By drawing on data from a recent investigation of…

Descriptors: Sign Language, Bilingual Education, Deafness, Language Tests

Consequences of Test Score Use as Validity Evidence: Roles and Responsibilities

Peer reviewed

Nichols, Paul D.; Williams, Natasha – Educational Measurement: Issues and Practice, 2009

This article has three goals. The first goal is to clarify the role that the consequences of test score use play in validity judgments by reviewing the role that modern writers on validity have ascribed for consequences in supporting validity judgments. The second goal is to summarize current views on who is responsible for collecting evidence of…

Descriptors: Tests, Test Validity, Scores, Data Collection

Justifying the Use of a Second Language Oral Test as an Exit Test in Hong Kong: An Application of Assessment Use Argument Framework

Jia, Yujie – ProQuest LLC, 2013

This study employed Bachman and Palmer's (2010) Assessment Use Argument framework to investigate to what extent the use of a second language oral test as an exit test in a Hong Kong university can be justified. It also aimed to help test developers of this oral test identify the most critical areas in the current test design that might need…

Descriptors: Test Use, Language Tests, Oral Language, Second Language Learning

Initial Development and Score Validation of the Adolescent Anger Rating Scale.

Peer reviewed

Burney, DeAnna McKinnie; Kromrey, Jeffrey – Educational and Psychological Measurement, 2001

Studied the construct validity of scores on the Adolescent Anger Rating Scale (AARS) developed to measure instrumental and reactive anger. Results for 792 12- to 19-year-olds indicate that AARS scores are internally consistent and stable when anger subtypes are measured. (SLD)

Descriptors: Adolescents, Anger, Scores, Test Construction

High-Stakes Testing in Education: Science and Practice in K-12 Settings

Bovaird, James A., Ed.; Geisinger, Kurt F., Ed.; Buckendahl, Chad W., Ed. – APA Books, 2011

Educational assessment and, more broadly, educational research in the United States have entered into an era characterized by a dramatic increase in the prevalence and importance of test score use in accountability systems. This volume covers a selection of contemporary issues about testing science and practice that impact the nation's public…

Descriptors: Graduate Students, Test Use, Student Placement, Educational Research

Source

Educational and Psychological…	4
Applied Measurement in…	3
Educational Measurement:…	2
Language Testing	2
APA Books	1
Alberta Journal of…	1
Corwin Press	1
Discover Education	1
ETS Research Report Series	1
Educational Assessment	1
Educational Researcher	1
Evaluation and the Health…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Educational…	1
Language Teaching	1
Measurement and Evaluation in…	1
National Technical Assistance…	1
New Meridian Corporation	1
ProQuest LLC	1
Psychometrika	1
SAGE Publications (CA)	1
TESL Canada Journal	1
Teaching Pre K-8	1

Author

Thompson, Bruce	3
Hambleton, Ronald K.	2
Messick, Samuel	2
Saurino, Dan R.	2
Albert Weideman	1
Amery D. Wu	1
Andrich, David	1
Armstrong, Anne-Marie	1
Arter, Judith A.	1
Ashmore, Robert J.	1
Bart Deygers	1
Bond, Linda A.	1
Bovaird, James A., Ed.	1
Bowman, Harry L.	1
Bright, Elizabeth L.	1
Buckendahl, Chad W., Ed.	1
Burney, DeAnna McKinnie	1
Buser, Karen	1
Calkins, Lucy	1
Campbell, Chari A.	1
Campbell, Clifton P.	1
Carter Grissom, Elizabeth	1
Cheng, Sheung-Tak	1
Chongruksa, Jiratha	1

Assessments and Surveys

National Assessment of…	2
Test of English as a Foreign…	2
Armed Services Vocational…	1
Bender Gestalt Test	1
Delaware Student Testing…	1
Human Figure Drawing Test	1
Iowa Tests of Basic Skills	1
Measures of Academic Progress	1
Myers Briggs Type Indicator	1
National Teacher Examinations	1
New Jersey College Basic…	1
North Carolina End of Course…	1
Pennsylvania Educational…	1
SAT (College Admission Test)	1
Slosson Intelligence Test	1
Test of Adult Basic Education	1
Test of English for…	1
Trends in International…	1
Wide Range Achievement Test	1
Woodcock Johnson Tests of…	1