Showing all 13 results
Peer reviewed
Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
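For context on the "traditional approaches" the abstract above refers to, the sketch below shows a standard logistic-regression DIF screen on simulated data. It is not the regression-discontinuity framework the article proposes; all variable names and values are illustrative assumptions.

import numpy as np
from scipy.stats import chi2
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)            # 0 = standard form, 1 = accommodated form
theta = rng.normal(0.4 * group, 1.0, size=n)  # groups differ in ability (non-random assignment)
total = theta + rng.normal(0.0, 0.5, size=n)  # noisy matching variable (proxy for total score)
p_correct = 1.0 / (1.0 + np.exp(-(1.2 * theta - 0.3)))  # studied item has no true DIF
item = rng.binomial(1, p_correct)

# Compare a model with the matching variable only against one that adds
# group and group-by-score terms; a likelihood-ratio test flags DIF.
x_reduced = sm.add_constant(total)
x_full = sm.add_constant(np.column_stack([total, group, total * group]))
fit_reduced = sm.Logit(item, x_reduced).fit(disp=0)
fit_full = sm.Logit(item, x_full).fit(disp=0)
lr = 2 * (fit_full.llf - fit_reduced.llf)
print(f"LR = {lr:.2f}, p = {chi2.sf(lr, df=2):.4f}")
# Because the matching variable is measured with error and the groups were
# not randomly assigned, this screen can flag the item even though no DIF
# was simulated -- the selection-bias problem the article addresses.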
Peer reviewed
Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014
Likert-type rating scales are still the most widely used method when measuring psychoeducational constructs. The present study investigates a long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…
Descriptors: Likert Scales, Responses, Item Response Theory, Classification
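As a concrete illustration (assumed, not taken from the study) of generating categorical data under the IRT Graded-Response Model, the following sketch simulates responses to a single four-category item:

import numpy as np

rng = np.random.default_rng(1)

def simulate_grm_item(theta, a, thresholds):
    """Simulate one graded-response item; thresholds must be increasing."""
    # Cumulative probabilities P(X >= k | theta) for k = 1..K-1
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - thresholds[None, :])))
    # Category probabilities P(X = k) as differences of adjacent cumulatives
    cum = np.hstack([np.ones((theta.size, 1)), p_star, np.zeros((theta.size, 1))])
    probs = cum[:, :-1] - cum[:, 1:]
    return np.array([rng.choice(probs.shape[1], p=p) for p in probs])

theta = rng.normal(size=500)                                        # latent trait
responses = simulate_grm_item(theta, a=1.5,
                              thresholds=np.array([-1.0, 0.0, 1.0]))  # 4 categories
print(np.bincount(responses, minlength=4))                          # respondents per category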
Peer reviewed
Aydin, Selami – Education 3-13, 2012
Studies conducted so far have mainly focused on the relationship between test perceptions and test anxiety among adult foreign language learners, while research on this issue among young learners is lacking. Thus, this study aims to examine the relationship between test anxiety among young learners who study…
Descriptors: Test Length, Content Validity, Validity, Measures (Individuals)
Peer reviewed
Lewandowski, Lawrence; Cohen, Justin; Lovett, Benjamin J. – Journal of Psychoeducational Assessment, 2013
Students with disabilities often receive test accommodations in schools and on high-stakes tests. Students with learning disabilities (LD) represent the largest disability group in schools, and extended time is the most common test accommodation requested by such students. This pairing persists despite controversy over the validity of extended time…
Descriptors: Testing Accommodations, Learning Disabilities, Reading Comprehension, Undergraduate Students
Kim, Jihye – ProQuest LLC, 2010
In DIF studies, a Type I error refers to the mistake of identifying non-DIF items as DIF items, and a Type I error rate refers to the proportion of Type I errors in a simulation study. The possibility of making a Type I error in DIF studies is always present, and a high probability of making such an error can weaken the validity of the assessment.…
Descriptors: Test Bias, Test Length, Simulation, Testing
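To make the definition above concrete, here is a minimal sketch (with made-up counts, not values from the dissertation) of computing an empirical Type I error rate in a DIF simulation:

import numpy as np

rng = np.random.default_rng(2)
n_replications, n_non_dif_items = 100, 40
alpha = 0.05

# Hypothetical outcome of a DIF procedure applied to items simulated
# WITHOUT DIF: under a well-calibrated procedure, each non-DIF item is
# flagged with probability alpha.
flagged = rng.random((n_replications, n_non_dif_items)) < alpha

# Type I error rate = proportion of non-DIF items incorrectly flagged.
print(f"Empirical Type I error rate: {flagged.mean():.3f}")  # close to 0.05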
Evans, Josiah Jeremiah – ProQuest LLC, 2010
In measurement research, data simulations are a commonly used analytical technique. While simulation designs have many benefits, it is unclear if these artificially generated datasets are able to accurately capture real examinee item response behaviors. This potential lack of comparability may have important implications for administration of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Educational Testing, Admission (School)
Peer reviewed
Woods, Carol M. – Applied Psychological Measurement, 2008
In Ramsay-curve item response theory (RC-IRT), the latent variable distribution is estimated simultaneously with the item parameters of a unidimensional item response model using marginal maximum likelihood estimation. This study evaluates RC-IRT for the three-parameter logistic (3PL) model with comparisons to the normal model and to the empirical…
Descriptors: Test Length, Computation, Item Response Theory, Maximum Likelihood Statistics
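For reference, the three-parameter logistic (3PL) item response function evaluated in the study above is commonly written as

P_j(\theta) = c_j + (1 - c_j)\,\frac{1}{1 + \exp[-a_j(\theta - b_j)]}

where a_j is the discrimination, b_j the difficulty, and c_j the lower-asymptote (pseudo-guessing) parameter of item j. Per the abstract, RC-IRT differs from the standard approach by estimating the distribution of the latent trait theta simultaneously with these item parameters rather than fixing it to the normal.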
Neustel, Sandra – 2001
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Descriptors: College Applicants, College Entrance Examinations, Higher Education, Medical Education
Peer reviewed
Sher, Kenneth J.; And Others – Psychological Assessment, 1995
Interrelated analyses were conducted with more than 4,000 college students to examine the reliability and validity of the Tridimensional Personality Questionnaire (TPQ) and to develop and validate a short version of the scale. Results provide moderate support for the reliability and validity of both the TPQ and the short form. (SLD)
Descriptors: College Students, Factor Analysis, Higher Education, Personality Assessment
Peer reviewed
Huynh, Huynh; Casteel, Jim – Journal of Experimental Education, 1987
In the context of pass/fail decisions, using the Bock multi-nominal latent trait model for moderate-length tests does not produce decisions that differ substantially from those based on the raw scores. The Bock decisions appear to relate less strongly to outside criteria than those based on the raw scores. (Author/JAZ)
Descriptors: Cutting Scores, Error Patterns, Grade 6, Intermediate Grades
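For context, and assuming the "Bock multi-nominal latent trait model" above refers to Bock's (1972) nominal categories model, that model gives the probability of response category k on an m-category item as

P(X = k \mid \theta) = \frac{\exp(a_k \theta + c_k)}{\sum_{h=1}^{m} \exp(a_h \theta + c_h)}

with a category slope a_k and intercept c_k, identified by constraints such as sum-to-zero. The abstract's finding is that pass/fail decisions from this model track raw-score decisions closely for moderate-length tests.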
Wingersky, Marilyn S.; Lord, Frederic M. – 1983
The sampling errors of maximum likelihood estimates of item-response theory parameters are studied in the case where both people and item parameters are estimated simultaneously. A check on the validity of the standard error formulas is carried out. The effect of varying sample size, test length, and the shape of the ability distribution is…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Banks, Latent Trait Theory
Steinheiser, Frederick H., Jr.; And Others – 1978
Alternative mathematical models for scoring and decision making with criterion referenced tests are described, especially as they concern appropriate test length and methods of establishing statistically valid cutting scores. Several of these approaches are reviewed and compared on formal-analytic and empirical grounds: (1) Block's approach to…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Cutting Scores, Decision Making
Peer reviewed
Jones, Brett D.; Egley, Robert J. – ERS Spectrum, 2005
The purpose of this paper is to discuss Florida teachers' recommendations for improving the Florida Comprehensive Assessment Test (FCAT) and to compare their recommendations with those of Florida administrators. Although teachers' suggestions varied as to the types and extent of remedies needed to improve the FCAT, some common themes emerged. The…
Descriptors: Test Results, Core Curriculum, Student Evaluation, Accountability