ERIC - Search Results

Showing 1 to 15 of 30 results

Lagged Dependent Variable Predictors, Classical Measurement Error, and Path Dependency: The Conditions under Which Various Estimators Are Appropriate

Peer reviewed

Holm, Anders; Hjorth-Trolle, Anders; Andersen, Robert – Sociological Methods & Research, 2025

Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…

Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Processes and Procedures for Estimating Score Reliability and Precision

Peer reviewed

Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…

Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests

Evaluating Evidence Regarding Relationships with Criteria

Peer reviewed

Balkin, Richard S. – Measurement and Evaluation in Counseling and Development, 2017

An overview of standards related to demonstrating evidence regarding relationships with criteria as it pertains to instrument development was presented, along with heuristic examples. Additional measures and a comprehensive design are necessary to establish evidence related to the use and interpretation of test scores for the validation of a…

Descriptors: Evidence, Academic Standards, Test Construction, Evaluation Criteria

ETS Psychometric Contributions: Focus on Test Scores. Research Report. ETS RR-13-15. ETS R&D Scientific and Policy Contributions Series. ETS SPC-13-03

Peer reviewed

Moses, Tim – ETS Research Report Series, 2013

The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…

Descriptors: Psychometrics, Scores, Correlation, Regression (Statistics)

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Sources of Score Scale Inconsistency. Research Report. ETS RR-11-10

Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011

For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…

Descriptors: Scores, Reliability, Equated Scores, Test Construction

Generalizability Theory and the Fair and Valid Assessment of Linguistic Minorities

Peer reviewed

Solano-Flores, Guillermo; Li, Min – Educational Research and Evaluation, 2013

We discuss generalizability (G) theory and the fair and valid assessment of linguistic minorities, especially emergent bilinguals. G theory allows examination of the relationship between score variation and language variation (e.g., variation of proficiency across languages, language modes, and social contexts). Studies examining score variation…

Descriptors: Measurement, Testing, Language Proficiency, Test Construction

Measurement Properties of DIBELS Oral Reading Fluency in Grade 2: Implications for Equating Studies

Peer reviewed

Stoolmiller, Michael; Biancarosa, Gina; Fien, Hank – Assessment for Effective Intervention, 2013

Lack of psychometric equivalence of oral reading fluency (ORF) passages used within a grade for screening and progress monitoring has recently become an issue with calls for the use of equating methods to ensure equivalence. To investigate the nature of the nonequivalence and to guide the choice of equating method to correct for nonequivalence,…

Descriptors: School Personnel, Reading Fluency, Emergent Literacy, Psychometrics

A Control Systems Concept Inventory Test Design and Assessment

Peer reviewed

Bristow, M.; Erkorkmaz, K.; Huissoon, J. P.; Jeon, Soo; Owen, W. S.; Waslander, S. L.; Stubley, G. D. – IEEE Transactions on Education, 2012

Any meaningful initiative to improve the teaching and learning in introductory control systems courses needs a clear test of student conceptual understanding to determine the effectiveness of proposed methods and activities. The authors propose a control systems concept inventory. Development of the inventory was collaborative and iterative. The…

Descriptors: Diagnostic Tests, Concept Formation, Undergraduate Students, Engineering Education

The Impact of Incorrect Responses to Reverse-Coded Survey Items

Peer reviewed

Hughes, Gail D. – Research in the Schools, 2009

The impacts of incorrect responses to reverse-coded survey items were examined in this simulation study by reversing responses to traditional Likert-format items from 700 administrators in randomly selected schools in a 7-county region in central Arkansas that were obtained from an archival dataset. Specifically, the number of reverse-coded items…

Descriptors: Surveys, Coding, Context Effect, Measures (Individuals)

Construction of a Danish CDI Short Form for Language Screening at the Age of 36 Months: Methodological Considerations and Results

Peer reviewed

Vach, Werner; Bleses, Dorthe; Jorgensen, Rune – Clinical Linguistics & Phonetics, 2010

Several research groups have previously constructed short forms of the MacArthur-Bates Communicative Development Inventories (CDI) for different languages. We consider the specific aim of constructing such a short form to be used for language screening in a specific age group. We present a novel strategy for the construction, which is applicable…

Descriptors: Age, Test Reliability, Measures (Individuals), Error of Measurement

Making Inferences about Growth and Value-Added: Design Issues for the PARCC Consortium. A White Paper

Briggs, Derek C. – Partnership for Assessment of Readiness for College and Careers, 2011

There is often confusion about distinctions between growth models and value-added models. The first half of this paper attempts to dispel some of these confusions by clarifying terminology and illustrating by example how the results from a large-scale assessment can and will be used to make inferences about student growth and the value-added…

Descriptors: Value Added Models, Language Usage, Measurement, Inferences

Reliability Estimation When a Test Is Split into Two Parts of Unknown Effective Length

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 2002

Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…

Descriptors: Error of Measurement, Reliability, Scores, Test Construction

E-Assessment within the Bologna Paradigm: Evidence from Portugal

Peer reviewed

Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010

The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment
