Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Balkin, Richard S. – Measurement and Evaluation in Counseling and Development, 2017
An overview of standards related to demonstrating evidence regarding relationships with criteria as it pertains to instrument development was presented, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…
Descriptors: Evidence, Academic Standards, Test Construction, Evaluation Criteria
Moses, Tim – ETS Research Report Series, 2013
The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…
Descriptors: Psychometrics, Scores, Correlation, Regression (Statistics)
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011
For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…
Descriptors: Scores, Reliability, Equated Scores, Test Construction
Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013
We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…
Descriptors: Measurement, Testing, Language Proficiency, Test Construction
Stoolmiller, Michael; Biancarosa, Gina; Fien, Hank – Assessment for Effective Intervention, 2013
Lack of psychometric equivalence of oral reading fluency (ORF) passages used within a grade for screening and progress monitoring has recently become an issue with calls for the use of equating methods to ensure equivalence. To investigate the nature of the nonequivalence and to guide the choice of equating method to correct for nonequivalence,…
Descriptors: School Personnel, Reading Fluency, Emergent Literacy, Psychometrics
Bristow, M.; Erkorkmaz, K.; Huissoon, J. P.; Jeon, Soo; Owen, W. S.; Waslander, S. L.; Stubley, G. D. – IEEE Transactions on Education, 2012
Any meaningful initiative to improve the teaching and learning in introductory control systems courses needs a clear test of student conceptual understanding to determine the effectiveness of proposed methods and activities. The authors propose a control systems concept inventory. Development of the inventory was collaborative and iterative. The…
Descriptors: Diagnostic Tests, Concept Formation, Undergraduate Students, Engineering Education
Hughes, Gail D. – Research in the Schools, 2009
The impacts of incorrect responses to reverse-coded survey items were examined in this simulation study by reversing responses to traditional Likert-format items from 700 administrators in randomly selected schools in a 7-county region in central Arkansas that were obtained from an archival dataset. Specifically, the number of reverse-coded items…
Descriptors: Surveys, Coding, Context Effect, Measures (Individuals)
Vach, Werner; Bleses, Dorthe; Jorgensen, Rune – Clinical Linguistics & Phonetics, 2010
Several research groups have previously constructed short forms of the MacArthur-Bates Communicative Development Inventories (CDI) for different languages. We consider the specific aim of constructing such a short form to be used for language screening in a specific age group. We present a novel strategy for the construction, which is applicable…
Descriptors: Age, Test Reliability, Measures (Individuals), Error of Measurement
Briggs, Derek C. – Partnership for Assessment of Readiness for College and Careers, 2011
There is often confusion about distinctions between growth models and value-added models. The first half of this paper attempts to dispel some of these confusions by clarifying terminology and illustrating by example how the results from a large-scale assessment can and will be used to make inferences about student growth and the value-added…
Descriptors: Value Added Models, Language Usage, Measurement, Inferences

Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
Descriptors: Error of Measurement, Reliability, Scores, Test Construction
Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010
The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment