Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 2
  Since 2016 (last 10 years): 2
  Since 2006 (last 20 years): 12
Descriptor
  Difficulty Level: 16
  Error of Measurement: 16
  Statistical Analysis: 16
  Test Items: 13
  Item Response Theory: 7
  Equated Scores: 5
  Goodness of Fit: 5
  Multiple Choice Tests: 5
  Simulation: 4
  College Entrance Examinations: 3
  Comparative Analysis: 3
Author
  Alonzo, Julie: 3
  Tindal, Gerald: 3
  Feigenbaum, Miriam: 2
  Liu, Jinghua: 2
  Sinharay, Sandip: 2
  Akin-Arikan, Çigdem: 1
  Benson, Jeri: 1
  Cook, Linda: 1
  Curley, Edward: 1
  DeMars, Christine E.: 1
  Duong, Minh Q.: 1
Publication Type
  Reports - Research: 12
  Journal Articles: 8
  Numerical/Quantitative Data: 4
  Reports - Evaluative: 3
  Speeches/Meeting Papers: 2
  Tests/Questionnaires: 1
Education Level
  Elementary Education: 3
  Grade 5: 2
  Higher Education: 2
  Postsecondary Education: 2
  Early Childhood Education: 1
  Grade 2: 1
  Grade 4: 1
  Grade 7: 1
  Intermediate Grades: 1
  Middle Schools: 1
  Primary Education: 1
Audience
  Researchers: 1
Location
  Austria: 1
  Belgium: 1
  Germany: 1
  Japan: 1
  Luxembourg: 1
Assessments and Surveys
  SAT (College Admission Test): 2
  Progress in International…: 1
Laukaityte, Inga; Wiberg, Marie – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine the effects of differences in group ability and of anchor test form features on equating bias and the standard error of equating (SEE), using both real and simulated data. Chained kernel equating, poststratification kernel equating, and circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Akin-Arikan, Çigdem; Gelbal, Selahattin – Eurasian Journal of Educational Research, 2021
Purpose: This study aims to compare the performances of Item Response Theory (IRT) equating and kernel equating (KE) methods based on equating errors (RMSD) and standard error of equating (SEE) using the anchor item nonequivalent groups design. Method: Within this scope, a set of conditions, including ability distribution, type of anchor items…
Descriptors: Equated Scores, Item Response Theory, Test Items, Statistical Analysis
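The kernel equating (KE) method compared in the two studies above continuizes each form's discrete score distribution with a Gaussian kernel and then applies an equipercentile mapping. A much-simplified sketch (function names are illustrative; operational KE also includes a linear adjustment that preserves the mean and variance of the discrete distribution, omitted here):

```python
import math

def kernel_cdf(x, scores, probs, h=0.6):
    """Gaussian-kernel continuized CDF of a discrete score distribution.
    h is the bandwidth; larger h gives a smoother CDF."""
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sum(p * phi((x - s) / h) for s, p in zip(scores, probs))

def ke_equate(x, xs, px, ys, py, h=0.6):
    """Equipercentile mapping: find y with F_Y(y) = F_X(x) by bisection
    on the monotone continuized CDF."""
    target = kernel_cdf(x, xs, px, h)
    lo, hi = min(ys) - 5.0, max(ys) + 5.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if kernel_cdf(mid, ys, py, h) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0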
Suzuki, Yuichi – Language Testing, 2015
Self-assessment has been used to assess second language proficiency; however, because sources of measurement error vary, they may threaten the validity and reliability of the tools. The present paper investigated how experience using Japanese as a second language in a naturalistic acquisition context affects the accuracy of the…
Descriptors: Self Evaluation (Individuals), Error of Measurement, Japanese, Second Language Learning
DeMars, Christine E. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
In structural equation modeling software, either limited-information (bivariate proportions) or full-information item parameter estimation routines could be used for the 2-parameter item response theory (IRT) model. Limited-information methods assume the continuous variable underlying an item response is normally distributed. For skewed and…
Descriptors: Item Response Theory, Structural Equation Models, Computation, Computer Software
Duong, Minh Q.; von Davier, Alina A. – International Journal of Testing, 2012
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Descriptors: Ability Grouping, Difficulty Level, Psychometrics, Statistical Analysis
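The adjustment described above can be illustrated with the simplest equating function, the linear (mean-sigma) method: a score on form X is mapped onto the form-Y scale so that the transformed X scores match Y's mean and standard deviation. A minimal sketch (function name is illustrative, and this ignores the anchor-test machinery the studies in this list actually use):

```python
from statistics import mean, pstdev

def linear_equate(x, x_scores, y_scores):
    """Map a form-X score x to the form-Y scale:
    y = mu_Y + (sigma_Y / sigma_X) * (x - mu_X)."""
    mx, sx = mean(x_scores), pstdev(x_scores)
    my, sy = mean(y_scores), pstdev(y_scores)
    return my + (sy / sx) * (x - mx)
```

For example, a score one X-standard-deviation above the X mean is mapped to one Y-standard-deviation above the Y mean.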
Han, Kyung T. – Practical Assessment, Research & Evaluation, 2012
For several decades, the "three-parameter logistic model" (3PLM) has been the dominant choice for practitioners in the field of educational measurement for modeling examinees' response data from multiple-choice (MC) items. Past studies, however, have pointed out that the c-parameter of 3PLM should not be interpreted as a guessing…
Descriptors: Statistical Analysis, Models, Multiple Choice Tests, Guessing (Tests)
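The 3PLM discussed above gives the probability of a correct response as a logistic curve with a nonzero lower asymptote c; Han's point is that c is a lower asymptote, not literally the probability of a lucky guess. A minimal sketch (function name is illustrative; D is the conventional scaling constant, often set to 1.7):

```python
import math

def p_3pl(theta, a, b, c, D=1.0):
    """Three-parameter logistic item response function:
    a = discrimination, b = difficulty, c = lower asymptote
    (often loosely called the guessing parameter)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
```

At theta = b the probability is c + (1 - c)/2, halfway between the asymptote and 1; setting c = 0 recovers the 2PL model used in the DeMars (2012) study above.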
Irvin, P. Shawn; Alonzo, Julie; Lai, Cheng-Fei; Park, Bitnara Jasmine; Tindal, Gerald – Behavioral Research and Teaching, 2012
In this technical report, we present the results of a reliability study of the seventh-grade multiple choice reading comprehension measures available on the easyCBM learning system conducted in the spring of 2011. Analyses include split-half reliability, alternate form reliability, person and item reliability as derived from Rasch analysis,…
Descriptors: Reading Comprehension, Testing Programs, Statistical Analysis, Grade 7
Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009
This study explores the use of a different type of anchor, a "midi anchor", that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor". The impact of different anchors on observed score equating were evaluated and compared with respect to systematic…
Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement
Stubbe, Tobias C. – Educational Research and Evaluation, 2011
The challenge inherent in cross-national research of providing instruments in different languages measuring the same construct is well known. But even instruments in a single language may be biased towards certain countries or regions due to local linguistic specificities. Consequently, it may be appropriate to use different versions of an…
Descriptors: Test Items, International Studies, Foreign Countries, German
Alonzo, Julie; Liu, Kimy; Tindal, Gerald – Behavioral Research and Teaching, 2008
This technical report describes the development of reading comprehension assessments designed for use as progress monitoring measures appropriate for 2nd Grade students. The creation, piloting, and technical adequacy of the measures are presented. The following are appended: (1) Item Specifications for MC [Multiple Choice] Comprehension - Passage…
Descriptors: Reading Comprehension, Reading Tests, Grade 2, Elementary School Students
Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2008
This technical report describes the development and piloting of reading comprehension measures developed for use by fifth-grade students as part of an online progress monitoring assessment system, http://easycbm.com. Each comprehension measure is comprised of an original work of narrative fiction approximately 1500 words in length followed by 20…
Descriptors: Reading Comprehension, Reading Tests, Grade 5, Multiple Choice Tests
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level

Shoemaker, David M. – Educational and Psychological Measurement, 1972
Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation
Livingston, Samuel A. – 1986
This paper deals with test fairness regarding a test consisting of two parts: (1) a "common" section, taken by all students; and (2) a "variable" section, in which some students may answer a different set of questions from other students. For example, a test taken by several thousand students each year contains a common multiple-choice portion and…
Descriptors: Difficulty Level, Error of Measurement, Essay Tests, Mathematical Models
Liu, Jinghua; Feigenbaum, Miriam; Cook, Linda – College Entrance Examination Board, 2004
This study explored possible configurations of the new SAT® critical reading section without analogy items. The item pool contained items from SAT verbal (SAT-V) sections of 14 previously administered SAT tests, calibrated using the three-parameter logistic IRT model. Multiple versions of several prototypes that do not contain analogy items were…
Descriptors: College Entrance Examinations, Critical Reading, Logical Thinking, Difficulty Level