Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 2
Since 2006 (last 20 years): 18
Descriptor
Test Items: 22
Item Response Theory: 10
Difficulty Level: 6
Equated Scores: 6
Scores: 6
Statistical Analysis: 6
Scoring: 5
Computer Assisted Testing: 4
Error of Measurement: 4
Language Tests: 4
Models: 4
Source
Educational Testing Service: 22
Author
Sinharay, Sandip: 3
Davey, Tim: 2
Dorans, Neil J.: 2
Haberman, Shelby J.: 2
Holland, Paul W.: 2
Livingston, Samuel A.: 2
Tan, Xuan: 2
Baron, Patricia: 1
Cheng, Peter C. H.: 1
Curley, Edward: 1
DeCarlo, Lawrence T.: 1
Publication Type
Reports - Research: 11
Reports - Evaluative: 6
Information Analyses: 2
Numerical/Quantitative Data: 2
Reports - Descriptive: 2
Guides - Classroom - Learner: 1
Guides - General: 1
Opinion Papers: 1
Tests/Questionnaires: 1
Education Level
Higher Education: 3
Postsecondary Education: 3
Adult Education: 1
Elementary Education: 1
Elementary Secondary Education: 1
High Schools: 1
Junior High Schools: 1
Middle Schools: 1
Secondary Education: 1
Audience
Practitioners: 1
Location
California: 2
Canada: 1
Connecticut: 1
Georgia: 1
Indiana: 1
Iowa: 1
Michigan: 1
Wisconsin: 1
Assessments and Surveys
SAT (College Admission Test): 2
Test of English as a Foreign…: 2

Livingston, Samuel A. – Educational Testing Service, 2020
This booklet is a conceptual introduction to item response theory (IRT), which many large-scale testing programs use for constructing and scoring their tests. Although IRT is essentially mathematical, the approach here is nonmathematical, in order to serve as an introduction for people who want to understand why IRT is used and what…
Descriptors: Item Response Theory, Scoring, Test Items, Scaling
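
The booklet stays nonmathematical by design; for readers who want to see the machinery anyway, a minimal sketch of the kind of model IRT rests on follows, here the two-parameter logistic (2PL) item response function. The item parameters are invented for illustration.

import math

def irt_2pl(theta, a, b):
    # 2PL model: P(correct) = 1 / (1 + exp(-a * (theta - b))),
    # where a is item discrimination and b is item difficulty.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item with discrimination 1.2 and difficulty 0.5.
for theta in (-2, -1, 0, 1, 2):
    print(f"ability {theta:+d}: P(correct) = {irt_2pl(theta, 1.2, 0.5):.3f}")
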
Weeks, Jonathan; Baron, Patricia – Educational Testing Service, 2021
The current project, Exploring Math Education Relations by Analyzing Large Data Sets (EMERALDS) II, is an attempt to identify specific Common Core State Standards procedural, conceptual, and problem-solving competencies in earlier grades that best predict success in algebraic areas in later grades. The data for this study include two cohorts of…
Descriptors: Mathematics Education, Common Core State Standards, Problem Solving, Mathematics Tests
Livingston, Samuel A. – Educational Testing Service, 2014
This booklet grew out of a half-day class on equating that author Samuel Livingston teaches for new statistical staff at Educational Testing Service (ETS). The class is a nonmathematical introduction to the topic, emphasizing conceptual understanding and practical applications. The class consists of illustrated lectures, interspersed with…
Descriptors: Equated Scores, Scoring, Self Evaluation (Individuals), Scores
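
As a concrete counterpart to the booklet's conceptual treatment, the simplest case it covers, linear (mean-sigma) equating, fits in a few lines: scores on form X map to the form Y scale by matching means and standard deviations. The single-group score data below are invented.

import statistics

def linear_equate(x_scores, y_scores):
    # Map a form-X raw score to the form-Y scale by matching
    # means and standard deviations (mean-sigma equating).
    mx, sx = statistics.mean(x_scores), statistics.pstdev(x_scores)
    my, sy = statistics.mean(y_scores), statistics.pstdev(y_scores)
    return lambda x: my + (sy / sx) * (x - mx)

# Invented single-group data: the same examinees took both forms.
form_x = [12, 15, 18, 20, 22, 25, 28]
form_y = [14, 16, 20, 21, 24, 26, 30]
equate = linear_equate(form_x, form_y)
print(f"a form-X score of 20 equates to {equate(20):.1f} on form Y")
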
Grant, Mary C. – Educational Testing Service, 2011
The "single group with nearly equivalent tests" (SiGNET) design proposed here was developed to address the problem of equating scores on multiple-choice test forms with very small single-administration samples. In this design, the majority of items in each new test form consist of items from the previous form, and the new items that were…
Descriptors: Multiple Choice Tests, Equated Scores, Test Items
Xu, Xueli; Jia, Yue – Educational Testing Service, 2011
Estimation of item response model parameters and ability distribution parameters has been, and will remain, an important topic in the educational testing field. Much research has been dedicated to addressing this task. Some studies have focused on item parameter estimation when the latent ability was assumed to follow a normal distribution,…
Descriptors: Test Items, Statistical Analysis, Computation, Item Response Theory
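
To make the estimation problem concrete: under marginal maximum likelihood, the likelihood of a response pattern integrates the item response model over the assumed latent ability distribution. The sketch below uses a 2PL model, a standard normal ability distribution, and a crude rectangular quadrature; all item parameters are invented.

import math

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def marginal_likelihood(responses, items, nodes=41):
    # Marginal probability of a response pattern under the 2PL model,
    # integrating over a standard normal ability distribution with a
    # simple rectangular quadrature on [-4, 4].
    total, step = 0.0, 8.0 / (nodes - 1)
    for k in range(nodes):
        theta = -4.0 + k * step
        weight = math.exp(-theta * theta / 2) / math.sqrt(2 * math.pi) * step
        like = 1.0
        for u, (a, b) in zip(responses, items):
            p = p_2pl(theta, a, b)
            like *= p if u == 1 else (1 - p)
        total += weight * like
    return total

items = [(1.0, -0.5), (1.3, 0.0), (0.8, 0.7)]  # hypothetical (a, b) pairs
print(f"P(pattern 1,1,0) = {marginal_likelihood([1, 1, 0], items):.4f}")
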
Tan, Xuan; Xiang, Bihua; Dorans, Neil J.; Qu, Yanxuan – Educational Testing Service, 2010
The nature of the matching criterion (usually the total score) in the study of differential item functioning (DIF) has been shown to impact the accuracy of different DIF detection procedures. One of the topics related to the nature of the matching criterion is whether the studied item should be included. Although many studies exist that suggest…
Descriptors: Test Bias, Test Items, Item Response Theory
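
The question the paper studies, whether the studied item belongs in the matching criterion, can be made concrete with a Mantel-Haenszel DIF sketch that computes the common odds ratio both ways. The response data and group labels (R for reference, F for focal) are invented.

from collections import defaultdict

def mh_odds_ratio(data, studied, include_studied):
    # Mantel-Haenszel common odds ratio for the studied item,
    # stratifying examinees on total score with or without it.
    # `data` holds (group, responses) pairs, group 'R' or 'F'.
    strata = defaultdict(lambda: {"R": [0, 0], "F": [0, 0]})  # [right, wrong]
    for group, resp in data:
        score = sum(resp) - (0 if include_studied else resp[studied])
        strata[score][group][0 if resp[studied] else 1] += 1
    num = den = 0.0
    for cell in strata.values():
        a, b = cell["R"]  # reference group right / wrong on studied item
        c, d = cell["F"]  # focal group right / wrong on studied item
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den if den else float("nan")

# Invented responses to three items; item 0 is the studied item.
data = [("R", [1, 1, 0]), ("R", [0, 1, 1]), ("R", [1, 1, 1]),
        ("R", [0, 0, 1]), ("R", [1, 0, 0]), ("R", [0, 1, 0]),
        ("F", [1, 1, 0]), ("F", [0, 1, 1]), ("F", [1, 0, 1]),
        ("F", [0, 0, 1]), ("F", [1, 0, 0]), ("F", [0, 0, 0])]
for inc in (True, False):
    ratio = mh_odds_ratio(data, studied=0, include_studied=inc)
    print(f"studied item in criterion: {inc} -> MH odds ratio {ratio:.2f}")
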
Haberman, Shelby J.; Sinharay, Sandip; Lee, Yi-Hsuan – Educational Testing Service, 2011
Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement (Carroll, 1993). Scale anchoring (Beaton & Allen, 1992), a technique that describes what students at different points on a score scale know and can do,…
Descriptors: Statistical Analysis, Scores, Regression (Statistics), Item Response Theory
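
A simplified version of the scale-anchoring idea: under an IRT model, an item "anchors" at the lowest score point where examinees at that point answer it correctly with at least some criterion probability (0.65 below, in the spirit of response-probability rules; the exact criteria in Beaton & Allen differ). Item parameters are invented.

import math

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def anchor_level(a, b, levels=(-2, -1, 0, 1, 2), criterion=0.65):
    # Lowest scale point at which examinees answer the item
    # correctly with probability >= criterion.
    for theta in levels:
        if p_2pl(theta, a, b) >= criterion:
            return theta
    return None

# Hypothetical items (a, b); each anchors where it first becomes "easy".
for a, b in [(1.2, -1.0), (0.9, 0.3), (1.5, 1.2)]:
    print(f"item (a={a}, b={b}) anchors at theta = {anchor_level(a, b)}")
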
Ling, Guangming; Rijmen, Frank – Educational Testing Service, 2011
The factorial structure of the Time Management (TM) scale of the Student 360: Insight Program (S360) was evaluated based on a national sample. A general procedure with a variety of methods was introduced and implemented, including the computation of descriptive statistics, exploratory factor analysis (EFA), and confirmatory factor analysis (CFA).…
Descriptors: Time Management, Measures (Individuals), Statistical Analysis, Factor Analysis
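
As a minimal illustration of the exploratory step in such an analysis, the eigenvalues of the item correlation matrix give a rough first answer to how many factors a scale supports (the Kaiser greater-than-one rule). The correlation matrix below is invented.

import numpy as np

# Invented correlation matrix for five time-management items; the
# block structure hints at two clusters of items.
R = np.array([[1.00, 0.55, 0.48, 0.20, 0.18],
              [0.55, 1.00, 0.52, 0.22, 0.16],
              [0.48, 0.52, 1.00, 0.19, 0.21],
              [0.20, 0.22, 0.19, 1.00, 0.50],
              [0.18, 0.16, 0.21, 0.50, 1.00]])

# Count eigenvalues exceeding 1.0 as a crude factor-count heuristic.
eigvals = np.linalg.eigvalsh(R)[::-1]
print("eigenvalues:", np.round(eigvals, 2))
print("factors suggested:", int((eigvals > 1.0).sum()))
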
Educational Testing Service, 2011
Choosing whether to test via computer is the most difficult and consequential decision the designers of a testing program can make. The decision is difficult because of the wide range of choices available. Designers can choose where and how often the test is made available, how the test items look and function, how those items are combined into…
Descriptors: Test Items, Testing Programs, Testing, Computer Assisted Testing
Tan, Xuan; Ricker, Kathryn L.; Puhan, Gautam – Educational Testing Service, 2010
This study examines the differences in equating outcomes between two trend score equating designs resulting from two different scoring strategies for trend scoring when operational constructed-response (CR) items are double-scored--the single group (SG) design, where each trend CR item is double-scored, and the nonequivalent groups with anchor…
Descriptors: Equated Scores, Scoring, Responses, Test Items
DeCarlo, Lawrence T. – Educational Testing Service, 2010
A basic consideration in large-scale assessments that use constructed response (CR) items, such as essays, is how to allocate the essays to the raters that score them. Designs that are used in practice are incomplete, in that each essay is scored by only a subset of the raters, and also unbalanced, in that the number of essays scored by each rater…
Descriptors: Test Items, Responses, Essay Tests, Scoring
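
A toy version of the allocation problem: the sketch below cycles essays through rater pairs, producing a design that is incomplete (no rater scores every essay) and, when the essay count is not a multiple of the number of pairs, unbalanced. Rater names and the essay count are hypothetical.

from itertools import combinations, cycle

def allocate(n_essays, raters, per_essay=2):
    # Assign each essay to `per_essay` raters by cycling through
    # rater pairs: an incomplete design that still links raters
    # through shared essays.
    pairs = cycle(combinations(raters, per_essay))
    return {essay: next(pairs) for essay in range(n_essays)}

# Hypothetical pool of four raters scoring eight essays.
design = allocate(8, ["r1", "r2", "r3", "r4"])
for essay, pair in design.items():
    print(f"essay {essay}: scored by {pair[0]} and {pair[1]}")
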
Dorans, Neil J. – Educational Testing Service, 2010
Santelices and Wilson (2010) claimed to have addressed technical criticisms of Freedle (2003) presented in Dorans (2004a) and elsewhere. Santelices and Wilson's abstract claimed that their study confirmed that SAT® verbal items do function differently for African American and White subgroups. In this commentary, I demonstrate that the…
Descriptors: College Entrance Examinations, Verbal Tests, Test Bias, Test Items
Haberman, Shelby J. – Educational Testing Service, 2010
Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…
Descriptors: Testing Programs, Equated Scores, Sampling, Accuracy
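
The flavor of such a bound in the simplest case: if linking rests on the difference of two form means, each mean carries sampling noise of about sd/sqrt(n), so no procedure can estimate the linking constant more precisely than the sketch below suggests. The numbers are invented.

import math

def se_linear_link(sd, n_x, n_y):
    # Rough lower bound on the standard error of a mean-mean linking
    # constant: each form mean contributes sd**2 / n of variance.
    return math.sqrt(sd**2 / n_x + sd**2 / n_y)

# Invented figures: score SD of 10 and 500 examinees per form.
print(f"SE >= {se_linear_link(10, 500, 500):.2f} score points")
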
Kim, Sooyeon; Walker, Michael E. – Educational Testing Service, 2011
This study examines the use of subpopulation invariance indices to evaluate the appropriateness of using a multiple-choice (MC) item anchor in mixed-format tests, which include both MC and constructed-response (CR) items. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using an MC-only anchor set for 4…
Descriptors: Test Format, Multiple Choice Tests, Test Items, Gender Differences
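
One way to compute an invariance index of the kind the study uses: derive the linking function within each subgroup and for the total group, then take a weighted root mean square difference over score points (in the spirit of Dorans & Holland's RMSD; the operational index may differ in detail). All scores below are invented.

import statistics

def linear_link(x, y):
    # Linear (mean-sigma) linking function from the X scale to Y.
    mx, sx = statistics.mean(x), statistics.pstdev(x)
    my, sy = statistics.mean(y), statistics.pstdev(y)
    return lambda s: my + (sy / sx) * (s - mx)

def rmsd(groups, score_points):
    # Weighted root mean square difference between each subgroup's
    # linking function and the total-group linking function.
    total_x = [s for g in groups for s in g["x"]]
    total_y = [s for g in groups for s in g["y"]]
    overall = linear_link(total_x, total_y)
    n = sum(len(g["x"]) for g in groups)
    links = [(len(g["x"]) / n, linear_link(g["x"], g["y"])) for g in groups]
    msd = sum(w * (link(pt) - overall(pt)) ** 2
              for pt in score_points for w, link in links) / len(score_points)
    return msd ** 0.5

# Invented anchor (x) and total (y) scores for two subgroups.
groups = [{"x": [10, 12, 14, 16, 18], "y": [20, 24, 27, 31, 35]},
          {"x": [11, 13, 15, 17, 19], "y": [22, 25, 29, 33, 36]}]
print(f"RMSD over the anchor range: {rmsd(groups, range(10, 20)):.3f}")
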
Stone, Elizabeth; Davey, Tim – Educational Testing Service, 2011
There has been an increased interest in developing computer-adaptive testing (CAT) and multistage assessments for K-12 accountability assessments. The move to adaptive testing has been met with some resistance by those in the field of special education who express concern about routing of students with divergent profiles (e.g., some students with…
Descriptors: Disabilities, Adaptive Testing, Accountability, Computer Assisted Testing
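
The routing step at issue reduces, in the simplest CAT, to maximum-information item selection: administer the unused item whose Fisher information is largest at the current ability estimate. The item pool below is invented, and a real CAT would update the ability estimate after each response.

import math

def info_2pl(theta, a, b):
    # Fisher information of a 2PL item at ability theta.
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def next_item(theta, pool, used):
    # Pick the unused item that is most informative at the
    # current ability estimate (maximum-information selection).
    return max((i for i in range(len(pool)) if i not in used),
               key=lambda i: info_2pl(theta, *pool[i]))

# Hypothetical item pool of (discrimination, difficulty) pairs.
pool = [(0.8, -1.5), (1.2, -0.5), (1.5, 0.0), (1.0, 0.8), (1.4, 1.5)]
theta_hat, used = 0.0, set()
for step in range(3):
    item = next_item(theta_hat, pool, used)
    used.add(item)
    print(f"step {step + 1}: administer item {item} "
          f"(a={pool[item][0]}, b={pool[item][1]})")
    # A real CAT would re-estimate theta_hat from the response here.
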