ERIC - Search Results

Showing 1 to 15 of 17 results

An Information-Correction Method for Testlet-Based Test Analysis: From the Perspectives of Item Response Theory and Generalizability Theory. Research Report. ETS RR-17-27

Peer reviewed

Li, Feifei – ETS Research Report Series, 2017

An information-correction method for testlet-based tests is introduced. This method takes advantage of both generalizability theory (GT) and item response theory (IRT). The measurement error for the examinee proficiency parameter is often underestimated when a unidimensional conditional-independence IRT model is specified for a testlet dataset. By…

Descriptors: Item Response Theory, Generalizability Theory, Tests, Error of Measurement

Classification Errors and Bias Regarding Research on Sexual Minority Youths

Peer reviewed

Cimpian, Joseph R. – Educational Researcher, 2017

Quantitative research on sexual minority youths (SMYs) has likely contributed to misperceptions about the risk and deviance of this population. In part because it often relies on self-reported data from population-based self-administered questionnaires, this research is susceptible to misclassification bias whereby youths who are not SMYs are…

Descriptors: Secondary School Students, Adolescents, Minority Group Students, Homosexuality

Exploring Alternative Test Form Linking Designs with Modified Equating Sample Size and Anchor Test Length. Research Report. ETS RR-13-02

Peer reviewed

Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013

The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…

Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

The Creation and Validation of a Listening Vocabulary Levels Test

Peer reviewed

McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015

An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…

Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning

Examining the Technical Adequacy of Second-Grade Reading Comprehension Measures in a Progress Monitoring Assessment System. Technical Report # 08-08

Alonzo, Julie; Liu, Kimy; Tindal, Gerald – Behavioral Research and Teaching, 2008

This technical report describes the development of reading comprehension assessments designed for use as progress monitoring measures appropriate for 2nd Grade students. The creation, piloting, and technical adequacy of the measures are presented. The following are appended: (1) Item Specifications for MC [Multiple Choice] Comprehension - Passage…

Descriptors: Reading Comprehension, Reading Tests, Grade 2, Elementary School Students

A Test of Inclusion Which Allows for Errors of Measurement

Peer reviewed

White, Richard T.; Clark, R. Malcolm – Psychometrika, 1973

A test which allows for errors of measurement is derived for the hypothesis that all the members of a population who possess a certain skill are a sub-set of the members who possess another skill. (Author)

Descriptors: Error of Measurement, Mathematical Applications, Psychometrics, Statistical Analysis

Effect of Variation in Probability of Guessing Correctly on Reliability of Multiple-Choice Tests

Frary, Robert B.; Zimmerman, Donald W. – Educational and Psychological Measurement, 1970

Descriptors: Error of Measurement, Guessing (Tests), Multiple Choice Tests, Probability

Effect of Error of Measurement on the Power of Statistical Tests. Final Report.

Cleary, T. A.; Linn, Robert L. – 1967

The purpose of this research was to study the effect of error of measurement upon the power of statistical tests. Attention was focused on the F-test of the single factor analysis of variance. Formulas were derived to show the relationship between the noncentrality parameters for analyses using true scores and those using observed scores. The…

Descriptors: Analysis of Variance, Error of Measurement, Measurement Techniques, Psychological Testing

The Technical Quality of Performance Assessments: Standard Errors of Percents of Pupils Reaching Standards.

Peer reviewed

Yen, Wendy M. – Educational Measurement: Issues and Practice, 1997

The accuracy of statistics based on performance assessments that represent percentages of students reaching standards is explored using data from a large-scale performance assessment, the Maryland School Performance Assessment Program. Results with students in grades 3, 5, and 8 support the accuracy of pooling results to produce the statistics.…

Descriptors: Achievement Tests, Elementary Education, Error of Measurement, Performance Based Assessment

The Effect of Sequential Dependence on the Sampling Distributions of KR-20, KR-21, and Split-Halves Reliabilities.

Sullins, Walter L. – 1971

Five-hundred dichotomously scored response patterns were generated with sequentially independent (SI) items and 500 with dependent (SD) items for each of thirty-six combinations of sampling parameters (i.e., three test lengths, three sample sizes, and four item difficulty distributions). KR-20, KR-21, and Split-Half (S-H) reliabilities were…

Descriptors: Comparative Analysis, Correlation, Error of Measurement, Item Analysis

A Simulation Study to Explore Configuring the New SAT® Critical Reading Section without Analogy Items. Research Report No. 2004-2. ETS RR-04-01

Liu, Jinghua; Feigenbaum, Miriam; Cook, Linda – College Entrance Examination Board, 2004

This study explored possible configurations of the new SAT® critical reading section without analogy items. The item pool contained items from SAT verbal (SAT-V) sections of 14 previously administered SAT tests, calibrated using the three-parameter logistic IRT model. Multiple versions of several prototypes that do not contain analogy items were…

Descriptors: College Entrance Examinations, Critical Reading, Logical Thinking, Difficulty Level

A Comparison of Three Types of Test Development Procedures Using Classical and Latent Trait Methods.

Benson, Jeri; Wilson, Michael – 1979

Three methods of item selection were used to select sets of 38 items from a 50-item verbal analogies test and the resulting item sets were compared for internal consistency, standard errors of measurement, item difficulty, biserial item-test correlations, and relative efficiency. Three groups of 1,500 cases each were used for item selection. First…

Descriptors: Comparative Analysis, Difficulty Level, Efficiency, Error of Measurement

Some Applications of Generalizability Theory to the Dependability of Domain-Referenced Tests. ACT Technical Bulletin No. 32.

Brennan, Robert L. – 1979

Using the basic principles of generalizability theory, a psychometric model for domain-referenced interpretations is proposed, discussed, and illustrated. This approach, assuming an analysis of variance or linear model, is applicable to numerous data collection designs, including the traditional persons-crossed-with-items design, which is treated…

Descriptors: Analysis of Variance, Cost Effectiveness, Criterion Referenced Tests, Cutting Scores

The Rasch Model for Dichotomous Items: Theory, Applications and a Computer Program. No. 63.

Gustafsson, Jan-Eric – 1977

The Rasch model for test analysis is described and compared with two-parameter and three-parameter latent-trait models. Conditional maximum likelihood equations for estimating item parameters are derived, and estimates of person parameters are described together with their confidence intervals. Goodness of fit tests are discussed, including a…

Descriptors: Adaptive Testing, Computer Programs, Equated Scores, Error of Measurement
