R. Noah Padgett – Practical Assessment, Research & Evaluation, 2023
The consistency of psychometric properties across waves of data collection provides valuable evidence that scores can be interpreted consistently. Evidence supporting the consistency of psychometric properties can come from using a longitudinal extension of item factor analysis to account for the lack of independence of observations when evaluating…
Descriptors: Psychometrics, Factor Analysis, Item Analysis, Validity
Strong, John Z. – Reading & Writing Quarterly, 2023
Awareness of informational text structures is related to reading comprehension and varies according to characteristics of readers and texts. The purpose of this study was to develop and refine a measure of text structure awareness, the Text Structure Identification Test (TSIT), by investigating its internal consistency reliability and construct…
Descriptors: Text Structure, Reading Instruction, Construct Validity, Grade 4
Sheng, Yanyan – Measurement: Interdisciplinary Research and Perspectives, 2019
The classical approach to test theory has been the foundation of educational and psychological measurement for over 90 years. This approach is concerned with measurement error and hence test reliability, which in part relies on individual test items. The CTT package, developed in light of this, provides functions for test- and item-level analyses of…
Descriptors: Item Response Theory, Test Reliability, Item Analysis, Error of Measurement
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions, and the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Kogar, Esin Yilmaz; Kelecioglu, Hülya – Journal of Education and Learning, 2017
The purpose of this research is first to estimate the item and ability parameters, and the standard errors of those parameters, obtained from Unidimensional Item Response Theory (UIRT), bifactor (BIF), and Testlet Response Theory (TRT) models in tests that include testlets, when the number of testlets, number of independent items, and…
Descriptors: Item Response Theory, Models, Mathematics Tests, Test Items
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
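As an illustration of why ignoring a design effect understates sampling error, the sketch below uses the textbook Kish approximation for cluster samples, deff = 1 + (m - 1) * ICC. This formula, the function names, and the numbers are assumptions chosen for illustration; the abstract does not say which estimator the article uses.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Kish approximation: deff = 1 + (m - 1) * ICC (assumed, not from the article)."""
    return 1.0 + (cluster_size - 1.0) * icc


def adjusted_standard_error(srs_se: float, deff: float) -> float:
    """Inflate a simple-random-sample standard error by sqrt(deff)."""
    return srs_se * math.sqrt(deff)


# Hypothetical numbers: 25 students per school, intraclass correlation 0.2.
deff = design_effect(cluster_size=25, icc=0.2)
naive_se = 1.0
print(f"design effect: {deff:.2f}")
print(f"naive SE: {naive_se:.2f}, adjusted SE: {adjusted_standard_error(naive_se, deff):.2f}")
```

With these hypothetical values the naive standard error would need to be multiplied by the square root of the design effect, which is the kind of underestimation the abstract describes.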
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Yao, Lihua – Psychometrika, 2012
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…
Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing
Pae, Hye K.; Greenberg, Daphne; Morris, Robin D. – Language Assessment Quarterly, 2012
The aim of this study was to apply the Rasch model to an analysis of the psychometric properties of the Peabody Picture Vocabulary Test-III Form A (PPVT-IIIA) items with struggling adult readers. The PPVT-IIIA was administered to 229 African American adults whose isolated word reading skills were between third and fifth grades. Conformity of…
Descriptors: African Americans, Test Items, Construct Validity, Test Validity
Anderson, Trevor R.; Rogan, John M. – Biochemistry and Molecular Biology Education, 2010
Student assessment is central to the educational process and can be used for multiple purposes, including to promote student learning, to grade student performance, and to evaluate the educational quality of qualifications. It is, therefore, of utmost importance that assessment instruments are of high quality. In this article, we present various…
Descriptors: Educational Assessment, Educational Quality, Student Evaluation, Educational Research
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
Setzer, J. Carl – GED Testing Service, 2009
The GED® English as a Second Language (GED ESL) Test was designed to serve as an adjunct to the GED test battery when an examinee takes either the Spanish- or French-language version of the tests. The GED ESL Test is a criterion-referenced, multiple-choice instrument that assesses the functional English reading skills of adults whose first…
Descriptors: Language Tests, High School Equivalency Programs, Psychometrics, Reading Skills

Wilcox, Rand R. – Journal of Educational Statistics, 1981
Both the binomial and beta-binomial models are applied to various problems occurring in mental test theory. The paper reviews and critiques these models. The emphasis is on the extensions of the models that have been proposed in recent years, and that might not be familiar to many educators. (Author)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Test Reliability
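For readers unfamiliar with the binomial error model, a minimal sketch follows. It assumes the standard textbook forms: given a true proportion-correct zeta on an n-item test, the observed score is Binomial(n, zeta), so the conditional standard error of measurement is sqrt(n * zeta * (1 - zeta)), and Lord's estimate from an observed score x is sqrt(x * (n - x) / (n - 1)). The abstract gives no formulas, and the function names here are hypothetical.

```python
import math


def binomial_sem(n_items: int, zeta: float) -> float:
    """Conditional SEM when the observed score is Binomial(n_items, zeta)."""
    return math.sqrt(n_items * zeta * (1.0 - zeta))


def lord_sem(score: int, n_items: int) -> float:
    """Lord's estimate of an examinee's SEM from the observed number-correct score."""
    return math.sqrt(score * (n_items - score) / (n_items - 1))


# Hypothetical 40-item test.
print(f"SEM at zeta = 0.5: {binomial_sem(40, 0.5):.3f}")
print(f"Lord's SEM for a score of 20: {lord_sem(20, 40):.3f}")
```

Under this model the error is largest for mid-range scores and shrinks to zero at the extremes, which is one reason a single test-wide SEM can mislead.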

Ebel, Robert L. – Educational and Psychological Measurement, 1972
The author supports the credibility of two propositions: (1) the true component of a score is proportional to the number of equivalent elements that contribute to it, and (2) the error component of a score is proportional to the square root of the number of equivalent elements that contribute to it. (Author/MB)
Descriptors: Error of Measurement, Item Analysis, Mathematical Applications, Scores
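The two propositions can be checked numerically. The sketch below reads "proportional to" as a statement about the standard deviation of each component, which is an interpretive assumption: if the true-score SD grows like k and the error SD like sqrt(k) when a test is lengthened by a factor k, then the true variance grows like k^2 and the error variance like k, and the resulting reliability matches the Spearman-Brown formula. The abstract does not name Spearman-Brown, and the function names are hypothetical.

```python
def lengthened_reliability(rho: float, k: float) -> float:
    """Reliability after lengthening by factor k, derived from variance scaling."""
    true_var = rho             # unit-length true variance (total variance set to 1)
    error_var = 1.0 - rho      # unit-length error variance
    scaled_true = k ** 2 * true_var   # true SD scales with k
    scaled_error = k * error_var      # error SD scales with sqrt(k)
    return scaled_true / (scaled_true + scaled_error)


def spearman_brown(rho: float, k: float) -> float:
    """Spearman-Brown prophecy formula for a test lengthened by factor k."""
    return k * rho / (1.0 + (k - 1.0) * rho)


# Hypothetical starting reliability of 0.5.
for k in (1, 2, 4):
    print(k, round(lengthened_reliability(0.5, k), 4), round(spearman_brown(0.5, k), 4))
```

The two columns agree for every k, since k^2 * rho / (k^2 * rho + k * (1 - rho)) simplifies to k * rho / (1 + (k - 1) * rho).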