ERIC - Search Results

Showing all 13 results

A Crash Course in Good and Bad Controls

Peer reviewed

Carlos Cinelli; Andrew Forney; Judea Pearl – Sociological Methods & Research, 2024

Many students of statistics and econometrics express frustration with the way a problem known as "bad control" is treated in the traditional literature. The issue arises when the addition of a variable to a regression equation produces an unintended discrepancy between the regression coefficient and the effect that the coefficient is…

Descriptors: Regression (Statistics), Robustness (Statistics), Error of Measurement, Testing Problems

Reframing Research and Assessment Practices: Advancing an Antiracist and Anti-Ableist Research Agenda

Peer reviewed

Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024

Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…

Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement

Bad Questions: An Essay Involving Item Response Theory

Peer reviewed

Thissen, David – Journal of Educational and Behavioral Statistics, 2016

David Thissen, a professor in the Quantitative Program of the Department of Psychology and Neuroscience at the University of North Carolina, has consulted and served on technical advisory committees for assessment programs that use item response theory (IRT) over the past couple of decades. He has come to the conclusion that there are usually two purposes…

Descriptors: Item Response Theory, Test Construction, Testing Problems, Student Evaluation

Theory of Test Translation Error

Peer reviewed

Solano-Flores, Guillermo; Backhoff, Eduardo; Contreras-Nino, Luis Angel – International Journal of Testing, 2009

In this article, we present a theory of test translation whose intent is to provide the conceptual foundation for effective, systematic work in the process of test translation and test translation review. According to the theory, translation error is multidimensional; it is not simply the consequence of defective translation but an inevitable fact…

Descriptors: Test Items, Investigations, Semantics, Translation

Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…

Descriptors: Error of Measurement, Reliability, Scores, Test Construction

Culture and Consequences: The Canaries in the Coal Mine

Peer reviewed

Murphy, Sandra – Research in the Teaching of English, 2007

The persistent gap between the performance of mainstream students and racially and linguistically diverse students--for example, African Americans, Hispanic Americans, and Native Americans--on standardized tests may well signal problems with procedures for the development and use of standardized tests in general, and for their use with culturally…

Descriptors: Language Minorities, Standardized Tests, Test Validity, Prior Learning

The Truth about Scores Children Achieve on Tests.

Peer reviewed

Brown, Jonathan R. – Language, Speech, and Hearing Services in Schools, 1989

The importance of using the standard error of measurement (SEm) in determining reliability in test scores is emphasized. The SEm is compared to the hypothetical true score for standardized tests, and procedures for calculation of the SEm are explained. (JDD)

Descriptors: Elementary Secondary Education, Error of Measurement, Scores, Standardized Tests

A Study of Three-option and Four-option Multiple Choice Exams.

Cooper, Terence H. – Journal of Agronomic Education (JAE), 1988

Describes a study used to determine differences in exam reliability, difficulty, and student evaluations. Indicates that when a fourth option was added to the three-option items, the exams became more difficult. Includes methods, results, discussion, and tables on student characteristics, whole test analyses, and selected items. (RT)

Descriptors: Agronomy, College Science, Error of Measurement, Evaluation Methods

To Be or Not to Be: Control and Balancing of Type I and Type II Errors.

Peer reviewed

Cohen, Patricia – Evaluation and Program Planning: An International Journal, 1982

The various costs of Type I and Type II errors of inference from data are discussed. Six methods for minimizing each error type are presented: those for Type I errors may be employed even after data collection, while those for Type II errors combine study design with analytical means. (Author/CM)

Descriptors: Analysis of Variance, Data Analysis, Data Collection, Error of Measurement

Maintaining Scoring Standards over a Rubric Transition Process.

Goldberg, Gail Lynn; Walker-Bartnick, Leslie – 1988

A scoring rubric transition study is described. It was designed to evaluate possible drift in scoring the Maryland Writing Test from year to year (when using a modified holistic scoring method), to evaluate strategies for revising scoring rubrics for narrative and explanatory writing while maintaining original scoring standards, and to establish…

Descriptors: Educational Assessment, Elementary Secondary Education, Error of Measurement, Grading

Considerations for Creating Multi-Language Personality Norms: A Three-Component Model of Error

Peer reviewed

Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008

With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…

Descriptors: Global Approach, Cultural Differences, Norms, Human Resources

The Case for Unobtrusive Measures.

Terenzini, Patrick T. – 1986

Unobtrusive measures are recommended as a means of assessing educational outcomes of colleges. Such measures can counteract the response bias which is common in questionnaires and interviews. Outcomes researchers are, in fact, asked to supplement standard measures with unobtrusive measures. Interesting data may result from observation of students'…

Descriptors: Colleges, Cost Effectiveness, Educational Assessment, Error of Measurement

An Application of Latent Trait Test Methodology to a Large School District Testing Program.

Ridgeway, Gretchen Freiheit – 1982

A one-parameter latent trait model was the basis of the test development procedures in the Basic Skills Assessment Program (BSAP) of the Department of Defense Dependents Schools (DoDDS). Several issues are involved in applying the Rasch model to an assessment program in a large school district. Separate sets of skills continua are arranged by…

Descriptors: Achievement Tests, Basic Skills, Dependents Schools, Difficulty Level
