Showing 1 to 15 of 17 results
Peer reviewed
Direct link
Baldwin, Peter; Clauser, Brian E. – Journal of Educational Measurement, 2022
While score comparability across test forms typically relies on common (or randomly equivalent) examinees or items, innovations in item formats, test delivery, and efforts to extend the range of score interpretation may require a special data collection before examinees or items can be used in this way--or may be incompatible with common examinee…
Descriptors: Scoring, Testing, Test Items, Test Format
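The comparability designs this abstract alludes to are usually operationalized through equating. As a minimal illustration (a sketch, not the authors' method; all data and names are invented for the example), the code below performs linear equating under a randomly-equivalent-groups design:

```python
import numpy as np

def linear_equate(x_scores, y_scores, x_new):
    """Linear equating under a randomly-equivalent-groups design:
    a form-X score is mapped onto the form-Y scale by matching the
    two groups' means and standard deviations."""
    mu_x, sd_x = np.mean(x_scores), np.std(x_scores, ddof=1)
    mu_y, sd_y = np.mean(y_scores), np.std(y_scores, ddof=1)
    return mu_y + (sd_y / sd_x) * (x_new - mu_x)

# Toy data: two randomly equivalent groups, one per test form.
rng = np.random.default_rng(0)
form_x = rng.normal(50, 10, 1000)  # group that took form X
form_y = rng.normal(53, 9, 1000)   # group that took form Y
print(linear_equate(form_x, form_y, x_new=60.0))  # a form-X score of 60 on the Y scale
```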
Peer reviewed
Direct link
Peng, Yue; Yan, Wei; Cheng, Liying – Language Testing, 2021
This test review focuses on the current version (2009) of [Chinese characters omitted] (Hanyu Shuiping Kaoshi), literally translated as the Chinese Language Proficiency Test and abbreviated as HSK. Tailored to non-native speakers of the Chinese language, this test consists of six proficiency levels (Levels 1 and 2 as beginners, Levels 3 and 4 as…
Descriptors: Language Proficiency, Language Tests, Chinese, Decision Making
Peer reviewed
PDF on ERIC (full text available)
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
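Mode-comparability studies of the kind reviewed here often begin by quantifying the score difference between administration modes. A minimal sketch, assuming independent CBT and PBT groups (not a method taken from the article):

```python
import numpy as np

def mode_effect_d(cbt_scores, pbt_scores):
    """Cohen's d (pooled SD) between computer-based and paper-based
    administrations; a common first look at mode effects."""
    cbt = np.asarray(cbt_scores, float)
    pbt = np.asarray(pbt_scores, float)
    n1, n2 = len(cbt), len(pbt)
    pooled = ((n1 - 1) * cbt.var(ddof=1) + (n2 - 1) * pbt.var(ddof=1)) / (n1 + n2 - 2)
    return (cbt.mean() - pbt.mean()) / np.sqrt(pooled)

# Toy groups with a small mode difference.
rng = np.random.default_rng(1)
print(mode_effect_d(rng.normal(70, 8, 400), rng.normal(71, 8, 400)))
```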
Keng, Leslie; Boyer, Michelle – National Center for the Improvement of Educational Assessment, 2020
ACT requested assistance from the National Center for the Improvement of Educational Assessment (Center for Assessment) to investigate declines in scores for states administering the ACT to their 11th-grade students in 2018. This request emerged from conversations among state leaders, the Center for Assessment, and ACT in trying to understand the…
Descriptors: College Entrance Examinations, Scores, Test Score Decline, Educational Trends
Peer reviewed
Direct link
Toroujeni, Seyyed Morteza Hashemi – Education and Information Technologies, 2022
Score interchangeability of Computerized Fixed-Length Linear Testing (henceforth CFLT) and Paper-and-Pencil-Based Testing (henceforth PPBT) has become a controversial issue over the last decade, as technology has meaningfully restructured methods of educational assessment. Given this controversy, various testing guidelines published on…
Descriptors: Computer Assisted Testing, Reading Tests, Reading Comprehension, Scoring
Peer reviewed
PDF on ERIC (full text available)
Eckerly, Carol; Smith, Russell; Sowles, John – Practical Assessment, Research & Evaluation, 2018
The Discrete Option Multiple Choice (DOMC) item format was introduced by Foster and Miller (2009) with the intent of improving the security of test content. However, by changing the amount and order of the content presented, the test taking experience varies by test taker, thereby introducing potential fairness issues. In this paper we…
Descriptors: Culture Fair Tests, Multiple Choice Tests, Testing, Test Items
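For readers unfamiliar with the format, the toy simulation below shows why the amount and order of content seen varies across test takers in DOMC: options appear one at a time in random order, and the item can resolve before all options are shown. The stopping and scoring rules here are simplified assumptions, not necessarily the operational DOMC rules.

```python
import random

def domc_item(options, key, respond):
    """Toy DOMC presentation: options appear one at a time in random
    order and the test taker answers yes/no to each. Simplified rules
    (an assumption): YES on the key ends the item correct; YES on a
    distractor or NO on the key ends it incorrect."""
    for option in random.sample(options, k=len(options)):
        endorsed = respond(option)
        if option == key:
            return endorsed   # item resolved at the keyed option
        if endorsed:
            return False      # endorsed a distractor
    return False

# A test taker who endorses only the keyed option.
print(domc_item(["A", "B", "C", "D"], key="C", respond=lambda o: o == "C"))
```

Because the item can end early, different examinees may see different numbers of options, which is the source of the fairness concern the abstract raises.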
Peer reviewed
Direct link
Aviad-Levitzky, Tami; Laufer, Batia; Goldstein, Zahava – Language Assessment Quarterly, 2019
This article describes the development and validation of the new CATSS (Computer Adaptive Test of Size and Strength), which measures vocabulary knowledge in four modalities -- productive recall, receptive recall, productive recognition, and receptive recognition. In the first part of the paper we present the assumptions that underlie the test --…
Descriptors: Foreign Countries, Test Construction, Test Validity, Test Reliability
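This snippet does not detail CATSS's adaptive algorithm, but the core step of any computer-adaptive test is choosing the next item to maximize statistical information at the current ability estimate. A generic sketch, assuming a Rasch model purely for illustration:

```python
import numpy as np

def rasch_information(theta, b):
    """Fisher information of a Rasch item with difficulty b at ability theta."""
    p = 1.0 / (1.0 + np.exp(-(theta - b)))
    return p * (1.0 - p)

def select_next_item(theta_hat, difficulties, administered):
    """Maximum-information item selection, the core step of a CAT."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return max(candidates, key=lambda i: rasch_information(theta_hat, difficulties[i]))

bank = np.array([-1.5, -0.5, 0.0, 0.7, 1.8])  # toy difficulty parameters
print(select_next_item(theta_hat=0.4, difficulties=bank, administered={2}))  # -> 3
```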
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
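The rating scale category structure mentioned here is typically examined with Andrich's rating scale model, a Rasch-family model in which all items share one set of category thresholds. A minimal sketch of its category probabilities (parameter values invented for the example):

```python
import numpy as np

def rating_scale_probs(theta, delta, taus):
    """Category probabilities under the Rasch rating scale model
    (Andrich): theta is the person location, delta the item (or
    criterion) difficulty, taus the shared category thresholds."""
    taus = np.asarray(taus, float)
    k = np.arange(len(taus) + 1)                                  # categories 0..K
    logits = k * (theta - delta) - np.concatenate(([0.0], np.cumsum(taus)))
    expv = np.exp(logits - logits.max())                          # stabilized softmax
    return expv / expv.sum()

# A 4-category rubric (scores 0-3) with illustrative thresholds.
print(rating_scale_probs(theta=0.5, delta=0.0, taus=[-1.2, 0.1, 1.4]))
```

Disordered threshold estimates or rarely used categories under this model are the kind of diagnostic information useful for interpreting ratings of constructed-response questions.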
Wolf, Raffaela; Zahner, Doris; Kostoris, Fiorella; Benjamin, Roger – Council for Aid to Education, 2014
The measurement of higher-order competencies within a tertiary education system across countries presents methodological challenges due to differences in educational systems, socio-economic factors, and perceptions as to which constructs should be assessed (Blömeke, Zlatkin-Troitschanskaia, Kuhn, & Fege, 2013). According to Hart Research…
Descriptors: Case Studies, International Assessment, Performance Based Assessment, Critical Thinking
Jin, Yan – Journal of Pan-Pacific Association of Applied Linguistics, 2011
The College English Test (CET) is an English language test designed for educational purposes, administered on a very large scale, and used for making high-stakes decisions. This paper discusses the key issues facing the CET during the course of its development in the past two decades. It argues that the most fundamental and critical concerns of…
Descriptors: High Stakes Tests, Language Tests, Measures (Individuals), Graduates
Peer reviewed
Direct link
Judd, Wallace – Practical Assessment, Research & Evaluation, 2009
Over the past twenty years in performance testing a specific item type with distinguishing characteristics has arisen time and time again. It's been invented independently by dozens of test development teams. And yet this item type is not recognized in the research literature. This article is an invitation to investigate the item type, evaluate…
Descriptors: Test Items, Test Format, Evaluation, Item Analysis
Allen, Nancy L.; And Others – 1992
Many testing programs include a section of optional questions in addition to mandatory parts of a test. These optional parts of a test are not often truly parallel to one another, and groups of examinees selecting each optional test section are not equivalent to one another. This paper provides a general method based on missing-data methods for…
Descriptors: Comparative Testing, Estimation (Mathematics), Graphs, Scaling
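The abstract frames unselected optional sections as a missing-data problem, but the specific method is not spelled out in this snippet. The sketch below is therefore only an illustrative stand-in: predict missing optional-section scores from the mandatory section using a regression fit on the self-selected examinees who took that option.

```python
import numpy as np

def fill_optional_scores(mandatory, optional, took_optional):
    """Illustrative missing-data treatment (not the paper's method):
    regress optional-section scores on the mandatory section for the
    examinees who took the option, then predict for everyone else."""
    m = np.asarray(mandatory, float)
    o = np.asarray(optional, float)
    mask = np.asarray(took_optional, bool)
    slope, intercept = np.polyfit(m[mask], o[mask], 1)
    return np.where(mask, o, intercept + slope * m)

rng = np.random.default_rng(3)
mandatory = rng.normal(0, 1, 500)
optional = 0.8 * mandatory + rng.normal(0, 0.5, 500)  # observed only for takers
took = rng.random(500) < 0.5
print(fill_optional_scores(mandatory, optional, took).mean())
```

Conditioning on the mandatory section at least partially adjusts for the nonequivalence of the self-selected groups that the abstract notes.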
Peer reviewed
Harasym, P. H.; And Others – Evaluation and the Health Professions, 1980
Coded items, as opposed to free-response items, in a multiple-choice physiology test had a cueing effect that raised students' scores, especially for lower achievers. The reliability of the coded items was also lower. Item format and scoring method both had an effect on test results. (GDC)
Descriptors: Achievement Tests, Comparative Testing, Cues, Higher Education
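For dichotomously scored items like these, reliability comparisons of this kind are typically computed with KR-20. A self-contained sketch on simulated 0/1 response data (illustrative only, not the study's data):

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson Formula 20 for a persons-by-items matrix of 0/1 scores."""
    X = np.asarray(responses, float)
    k = X.shape[1]
    p = X.mean(axis=0)                       # proportion correct per item
    pq = (p * (1.0 - p)).sum()               # summed item variances
    return (k / (k - 1)) * (1.0 - pq / X.sum(axis=1).var(ddof=1))

# Simulate Rasch-style data so the items share a common ability factor.
rng = np.random.default_rng(2)
theta = rng.normal(0, 1, (200, 1))           # person abilities
b = rng.normal(0, 1, (1, 30))                # item difficulties
X = (rng.random((200, 30)) < 1 / (1 + np.exp(-(theta - b)))).astype(int)
print(kr20(X))
```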
Peer reviewed
Direct link
Clariana, Roy B.; Wallace, Patricia – Journal of Educational Computing Research, 2007
This proof-of-concept investigation describes a computer-based approach for deriving the knowledge structure of individuals and of groups from their written essays, and considers the convergent criterion-related validity of the computer-based scores relative to human rater essay scores and multiple-choice test scores. After completing a…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Construct Validity, Cognitive Structures
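The article's own algorithm for deriving knowledge structure from essays is not reproduced in this snippet, so the sketch below substitutes a deliberately simple stand-in: link two key concepts whenever they co-occur in a sentence, then compare the link sets derived from an essay and from a referent text.

```python
import re
from itertools import combinations

def concept_links(text, concepts):
    """Toy knowledge-structure extraction (a stand-in, not the article's
    method): two concepts are linked if they co-occur in a sentence."""
    links = set()
    for sentence in re.split(r"[.!?]+", text.lower()):
        present = [c for c in concepts if c in sentence]
        links.update(frozenset(p) for p in combinations(present, 2))
    return links

def overlap(a, b):
    """Jaccard overlap between two link sets, a crude structural score."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

concepts = ["heart", "blood", "oxygen", "lungs"]
essay = "The heart pumps blood. Blood carries oxygen from the lungs."
referent = "The heart moves blood around the body. The lungs load oxygen into the blood."
print(overlap(concept_links(essay, concepts), concept_links(referent, concepts)))
```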
Myerberg, N. James – 1996
The Montgomery County (Maryland) public school system has started using assessments other than multiple-choice tests because it is felt that this will provide school staff with better information about the success of the instructional program. One of the ways assessments can provide better information is by having teachers score student papers.…
Descriptors: Accountability, Achievement Tests, Educational Assessment, Elementary Secondary Education