ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	2
Since 2016 (last 10 years)	5
Since 2006 (last 20 years)	14

Descriptor

Error of Measurement	44
Item Analysis	44
Test Reliability	44
Test Validity	19
Test Construction	13
Test Items	13
Mathematical Models	12
Comparative Analysis	8
Correlation	7
Test Interpretation	7
Test Theory	7
Testing Problems	7
Criterion Referenced Tests	6
Measurement Techniques	6
Psychometrics	6
Statistical Analysis	6
Achievement Tests	5
Factor Analysis	5
Item Response Theory	5
Latent Trait Theory	5
Sampling	5
Adaptive Testing	4
Goodness of Fit	4
Measurement	4
Probability	4

Publication Type

Reports - Research	22
Journal Articles	18
Reports - Evaluative	5
Reports - Descriptive	4
Speeches/Meeting Papers	4
Tests/Questionnaires	4
Numerical/Quantitative Data	1
Reference Materials -…	1

Education Level

Elementary Secondary Education	4
Higher Education	2
Elementary Education	1
Grade 4	1
Grade 5	1
High Schools	1
Intermediate Grades	1
Middle Schools	1

Audience

Teachers

Location

Maine	1
Mississippi	1

Assessments and Surveys

General Educational…	2
Dimensions of Self Concept	1
Expressive One Word Picture…	1
Peabody Picture Vocabulary…	1

Showing 1 to 15 of 44 results

A Tutorial on Cross Wave Measurement Invariance Testing with Item Factor Analysis

Peer reviewed

R. Noah Padgett – Practical Assessment, Research & Evaluation, 2023

The consistency of psychometric properties across waves of data collection provides valuable evidence that scores can be interpreted consistently. Evidence supporting the consistency of psychometric properties can come from using a longitudinal extension of item factor analysis to account for the lack of independence of observation when evaluating…

Descriptors: Psychometrics, Factor Analysis, Item Analysis, Validity

Measuring Text Structure Awareness in Upper Elementary Grades

Peer reviewed

Strong, John Z. – Reading & Writing Quarterly, 2023

Awareness of informational text structures is related to reading comprehension and varies according to characteristics of readers and texts. The purpose of this study was to develop and refine a measure of text structure awareness, the Text Structure Identification Test (TSIT), by investigating its internal consistency reliability and construct…

Descriptors: Text Structure, Reading Instruction, Construct Validity, Grade 4

CTT Package in R

Peer reviewed

Sheng, Yanyan – Measurement: Interdisciplinary Research and Perspectives, 2019

The classical approach to test theory has been the foundation for educational and psychological measurement for over 90 years. This approach is concerned with measurement error and hence test reliability, which in part relies on individual test items. The CTT package, developed in light of this, provides functions for test- and item-level analyses of…

Descriptors: Item Response Theory, Test Reliability, Item Analysis, Error of Measurement

Item Response Theory: An Introduction to Latent Trait Models to Test and Item Development

Peer reviewed

Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018

Testing in an educational system performs a number of functions; the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and…

Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making

Examination of Different Item Response Theory Models on Tests Composed of Testlets

Peer reviewed

Kogar, Esin Yilmaz; Kelecioglu, Hülya – Journal of Education and Learning, 2017

The purpose of this research is to first estimate the item and ability parameters and the standard error values related to those parameters obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF) and Testlet Response Theory models (TRT) in the tests including testlets, when the number of testlets, number of independent items, and…

Descriptors: Item Response Theory, Models, Mathematics Tests, Test Items

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Assumptions of Multiple Regression: Correcting Two Misconceptions

Peer reviewed

Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013

In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…

Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables

Multidimensional CAT Item Selection Methods for Domain Scores and Composite Scores: Theory and Applications

Peer reviewed

Yao, Lihua – Psychometrika, 2012

Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…

Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing

Construct Validity and Measurement Invariance of the Peabody Picture Vocabulary Test-III Form A

Peer reviewed

Pae, Hye K.; Greenberg, Daphne; Morris, Robin D. – Language Assessment Quarterly, 2012

The aim of this study was to apply the Rasch model to an analysis of the psychometric properties of the Peabody Picture Vocabulary Test-III Form A (PPVT-IIIA) items with struggling adult readers. The PPVT-IIIA was administered to 229 African American adults whose isolated word reading skills were between third and fifth grades. Conformity of…

Descriptors: African Americans, Test Items, Construct Validity, Test Validity

Bridging the Educational Research-Teaching Practice Gap: Tools for Evaluating the Quality of Assessment Instruments

Peer reviewed

Anderson, Trevor R.; Rogan, John M. – Biochemistry and Molecular Biology Education, 2010

Student assessment is central to the educational process and can be used for multiple purposes, including to promote student learning, to grade student performance, and to evaluate the educational quality of qualifications. It is, therefore, of utmost importance that assessment instruments are of a high quality. In this article, we present various…

Descriptors: Educational Assessment, Educational Quality, Student Evaluation, Educational Research

Tests in Europe: Where We Are and Where We Should Go

Peer reviewed

Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012

Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…

Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries

Reliability Analysis for the Internationally Administered 2002 Series GED Tests. GED Testing Service[R] Research Studies, 2009-3

Setzer, J. Carl; He, Yi – GED Testing Service, 2009

Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…

Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability

Reliability and Validity Evidence for the GED[R] English as a Second Language Test. GED Testing Service[R] Research Studies, 2009-4

Setzer, J. Carl – GED Testing Service, 2009

The GED[R] English as a Second Language (GED ESL) Test was designed to serve as an adjunct to the GED test battery when an examinee takes either the Spanish- or French-language version of the tests. The GED ESL Test is a criterion-referenced, multiple-choice instrument that assesses the functional, English reading skills of adults whose first…

Descriptors: Language Tests, High School Equivalency Programs, Psychometrics, Reading Skills

A Review of the Beta-Binomial Model and Its Extensions.

Peer reviewed

Wilcox, Rand R. – Journal of Educational Statistics, 1981

Both the binomial and beta-binomial models are applied to various problems occurring in mental test theory. The paper reviews and critiques these models. The emphasis is on the extensions of the models that have been proposed in recent years, and that might not be familiar to many educators. (Author)

Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Test Reliability

Why is a Longer Test Usually a More Reliable Test?

Peer reviewed

Ebel, Robert L. – Educational and Psychological Measurement, 1972

The author supports the credibility of the propositions that (1) the true component of a score is proportional to the number of equivalent elements that contribute to it, and (2) the error component of a score is proportional to the square root of the number of equivalent elements that contribute to it. (Author/MB)

Descriptors: Error of Measurement, Item Analysis, Mathematical Applications, Scores

Source

Journal of Educational…	4
Educational and Psychological…	3
GED Testing Service	2
Practical Assessment,…	2
Psychometrika	2
Applied Measurement in…	1
Applied Psychological…	1
Biochemistry and Molecular…	1
Evaluation and the Health…	1
International Journal of…	1
International Journal of…	1
Journal of Education and…	1
Journal of Educational…	1
Journal of Experimental…	1
Language Assessment Quarterly	1
Measurement:…	1
National Center for Research…	1
Reading & Writing Quarterly	1
Research Quarterly for…	1
School Psychology Review	1

Author

Bashaw, W. L.	2
Haladyna, Tom	2
Patience, Wayne M.	2
Reckase, Mark D.	2
Rentz, R. Robert	2
Setzer, J. Carl	2
Whitely, Susan E.	2
Altepeter, Tom	1
Anderson, Trevor R.	1
Benson, Jeri	1
Bichi, Ado Abdu	1
Bowes, Neal	1
Brennan, Robert L.	1
Bridgeman, Brent	1
Crocker, A. C.	1
Cromack, Theodore R.	1
Cuttance, Peter F.	1
Dawis, Rene V.	1
Diamond, James J.	1
Diederich, Paul B.	1
Ebel, Robert L.	1
Elosua, Paula	1
Emrick, John A.	1
Enders, Craig K.	1