ERIC - Search Results

Showing 1 to 15 of 59 results

Somers' D as an Alternative for the Item-Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings

Peer reviewed

Metsämuuronen, Jari – International Journal of Educational Methodology, 2020

Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…

Descriptors: Correlation, Test Items, Scores, Difficulty Level

The Effect of the Ratio of Common Items and the Separation of Grade Distributions on the Precision of Vertical Scaling

Peer reviewed

Guangming Li; Zhengyan Liang – SAGE Open, 2024

In order to investigate the influence of separation of grade distributions and ratio of common items on the precision of vertical scaling, this simulation study chooses common item design and first grade as base grade. There are four grades with 1,000 students each to take part in a test which has 100 items. Monte Carlo simulation method is used…

Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3

Using Reliability and Item Analysis to Evaluate a Teacher-Developed Test in Educational Measurement and Evaluation

Peer reviewed

Quaigrain, Kennedy; Arhin, Ato Kwamina – Cogent Education, 2017

Item analysis is essential in improving items which will be used again in later tests; it can also be used to eliminate misleading items in a test. The study focused on item and test quality and explored the relationship between difficulty index (p-value) and discrimination index (DI) with distractor efficiency (DE). The study was conducted among…

Descriptors: Item Analysis, Teacher Developed Materials, Test Reliability, Educational Assessment

Using the Major Field Test for a Bachelor's Degree in Business as a Learning Outcomes Assessment: Evidence from a Review of 20 Years of Institution-Based Research

Peer reviewed

Ling, Guangming; Bochenek, Jennifer; Burkander, Kri – Journal of Education for Business, 2015

By applying multilevel models with random effects, the authors reviewed and synthesized findings from 30 studies that were published in the last 20 years exploring the relationship between the Educational Testing Service Major Field Test for a Bachelor's Degree in Business (MFTB) and related factors. The results suggest that MFTB scores correlated…

Descriptors: Bachelors Degrees, Institutional Research, Educational Testing, Scores

Does It Matter Whether One Takes a Test on an iPad or a Desktop Computer?

Peer reviewed

Ling, Guangming – International Journal of Testing, 2016

To investigate possible iPad related mode effect, we tested 403 8th graders in Indiana, Maryland, and New Jersey under three mode conditions through random assignment: a desktop computer, an iPad alone, and an iPad with an external keyboard. All students had used an iPad or computer for six months or longer. The 2-hour test included reading, math,…

Descriptors: Educational Testing, Computer Assisted Testing, Handheld Devices, Computers

The Reliability of Results from National Curriculum Testing in England

Peer reviewed

Newton, Paul E. – Educational Research, 2009

Background: National curriculum tests have been administered in England for well over a decade. Although reliability evidence has been published, critics have argued that there is not enough evidence (of the right kind) and that test results may be insufficiently reliable. Purpose: This article collates and discusses evidence on the reliability of…

Descriptors: National Curriculum, Test Results, Foreign Countries, Elementary Secondary Education

Keeping It "R-E-A-L" with Authentic Assessment

Peer reviewed

Macy, Marisa; Bagnato, Stephen J. – NHSA Dialog, 2010

The inclusion of young children with disabilities has remained a function of the Head Start program since its inception in the 1960s when the United States Congress mandated that children with disabilities comprise 10% of the Head Start enrollment (Zigler & Styfco, 2000). Standardized, norm-referenced tests used to identify children with…

Descriptors: Performance Based Assessment, Disadvantaged Youth, Norm Referenced Tests, Disabilities

Significant Improvement in Freshman Composition as Measured by Impromptu Essays: A Large-Scale Experiment.

Peer reviewed

Davis, Ken – Research in the Teaching of English, 1979

Impromptu pre- and post-test essays by 302 students randomly selected from over 80 sections of a first-semester freshman composition course revealed significant improvement. (DD)

Descriptors: College Freshmen, Educational Research, Educational Testing, Higher Education

On the Virtues and Vices of the Standard Error of Measurement.

Peer reviewed

Williams, Richard H.; Zimmerman, Donald W. – Journal of Experimental Education, 1984

This paper provides a list of 10 salient features of the standard error of measurement, contrasting it to the reliability coefficient. It is concluded that the standard error of measurement should be regarded as a primary characteristic of a mental test. (Author/DWH)

Descriptors: Educational Testing, Error of Measurement, Evaluation Methods, Psychological Testing

Estimating the Reliability of Criterion-Referenced Tests before Administration.

Peer reviewed

Chase, Clint – Mid-Western Educational Researcher, 1996

Classical procedures for calculating the two indices of decision consistency (P and Kappa) for criterion-referenced tests require two testings on each child. Huynh, Peng, and Subkoviak have presented one-testing procedures for these indices. These indices can be estimated without any test administration using Ebel's estimates of the mean, standard…

Descriptors: Criterion Referenced Tests, Educational Research, Educational Testing, Estimation (Mathematics)

An Investigation of the Accuracy of Alternative Methods of True Score Estimation in High-Stakes Mixed-Format Examinations.

Peer reviewed

Klinger, Don A.; Rogers, W. Todd – Alberta Journal of Educational Research, 2003

The estimation accuracy of procedures based on classical test score theory and item response theory (generalized partial credit model) were compared for examinations consisting of multiple-choice and extended-response items. Analysis of British Columbia Scholarship Examination results found an error rate of about 10 percent for both methods, with…

Descriptors: Academic Achievement, Educational Testing, Foreign Countries, High Stakes Tests

The Effect Three Testing Modes--Reading while Listening, Listening and Silent Reading--Have on Sixth Grade Boys and Girls.

Peer reviewed

Klein, Howard A. – Reading Improvement, 1989

Examines whether using a combined silent reading-listening mode to administer the "Social Studies Inference Test" optimized information gathering. Finds that the combined modality produced more correct inferences than did silent reading alone. Finds only one gender difference--girls' "caution score" was higher than that for…

Descriptors: Data Collection, Educational Testing, Grade 6, Intermediate Grades

Stability of the K-ABC and S-B:4 with Preschool Children.

Bauer, Joseph J.; Smith, Douglas K. – 1988

Stability of performance on the Kaufman Assessment Battery for Children (K-ABC) and the Stanford-Binet Intelligence Scale: Fourth Edition (S-B:4) over a 1-year interval was examined with a sample of 28 nonhandicapped preschoolers. Each child was administered both tests in counterbalanced order and retested in 1 year with either the K-ABC or the…

Descriptors: Early Childhood Education, Educational Testing, Intelligence Tests, Middle Class

Relative Effectiveness of Single and Double Multiple-Choice Questions in Educational Measurement.

Peer reviewed

Weiten, Wayne – Journal of Experimental Education, 1982

A comparison of double as opposed to single multiple-choice questions yielded significant differences in regard to item difficulty, item discrimination, and internal reliability, but not concurrent validity. (Author/PN)

Descriptors: Difficulty Level, Educational Testing, Higher Education, Multiple Choice Tests

An Alternate Procedure to Obtain Ability Estimates in Latent Trait Models.

Houser, Ronald L.; And Others – 1983

This report describes a procedure that promises to improve the stability, accuracy, and efficiency of the employment of latent trait models and an application of the procedure to the Rasch model. Data were collected from the Portland Public Schools Level Tests administered to 25,740 students. Since each of the 173 items (chosen from the total…

Descriptors: Academic Achievement, Educational Testing, Item Banks, Latent Trait Theory
