ERIC - Search Results

Showing all 6 results

Improvement of Norm Score Quality via Regression-Based Continuous Norming

Peer reviewed

Lenhard, Wolfgang; Lenhard, Alexandra – Educational and Psychological Measurement, 2021

The interpretation of psychometric test results is usually based on norm scores. We compared semiparametric continuous norming (SPCN) with conventional norming methods by simulating results for test scales with different item numbers and difficulties via an item response theory approach. Subsequently, we modeled the norm scores based on random…

Descriptors: Test Norms, Scores, Regression (Statistics), Test Items

Population Invariance of Vertical Scaling Results

Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012

The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…

Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests

The Effects of Different Types of Anchor Tests on Observed Score Equating. Research Report. ETS RR-09-41

Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009

This study explores the use of a different type of anchor, a "midi anchor", that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor". The impact of different anchors on observed score equating was evaluated and compared with respect to systematic…

Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement

Examining an Alternative to Score Equating: A Randomly Equivalent Forms Approach. Research Report. ETS RR-08-14

Peer reviewed

Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008

Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…

Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level

Equating Tests With the Rasch Model.

Kreines, David C.; Mead, Ronald J. – 1979

An explanation is given of what is meant by "sample-free" item calibration and by "item-free" person measurement as these terms are applied to the one-parameter logistic test theory model of Georg Rasch. When the difficulty of an item is calibrated separately for two different samples, the results may differ; but, according to the…

Descriptors: Difficulty Level, Equated Scores, Goodness of Fit, Item Analysis

A Comparison of Three Equating Procedures on the Certifying Examination for Primary Care Physician's Assistants.

Bell, Anita I. – 1979

An equating study was conducted on the Certifying Examination for Primary Care Physician's Assistants to compare the ability of current examinees with the standardization group and to determine whether current test items are more difficult than previous items. Using 46 common items from the multiple-choice section, the 1978 exam was equated to the 1976…

Descriptors: Comparative Analysis, Difficulty Level, Educational Trends, Equated Scores