Metsämuuronen, Jari – International Journal of Educational Methodology, 2021
Although Goodman-Kruskal gamma (G) is used relatively rarely, it has promising potential as a coefficient of association in educational settings. Characteristics of G are studied in three sub-studies related to educational measurement settings. G appears to be unexpectedly appealing as an estimator of association between an item and a score because…
Descriptors: Educational Assessment, Measurement, Item Analysis, Correlation
Zhang, Xiuyuan – AERA Online Paper Repository, 2019
The main purpose of the study is to evaluate the qualities of human essay ratings for a large-scale assessment using Rasch measurement theory. Specifically, Many-Facet Rasch Measurement (MFRM) was utilized to examine the rating scale category structure and provide important information about interpretations of ratings in the large-scale…
Descriptors: Essays, Evaluators, Writing Evaluation, Reliability
Culpepper, Steven Andrew – Journal of Educational and Behavioral Statistics, 2017
In the absence of clear incentives, achievement tests may be subject to the effect of slipping where item response functions have upper asymptotes below one. Slipping reduces score precision for higher latent scores and distorts test developers' understandings of item and test information. A multidimensional four-parameter normal ogive model was…
Descriptors: Measurement, Achievement Tests, Item Response Theory, National Competency Tests
Liu, Leping; Ripley, Darren – International Journal of Technology in Teaching and Learning, 2014
Propensity score matching (PSM) has been used to estimate causal effects of treatment, especially in studies where random assignment to treatment is difficult to obtain. The main purpose of this article is to provide some practical guidance for propensity score sample matching, including definitions, procedures, decisions on each step, and methods…
Descriptors: Probability, Scores, Technology Integration, Science Education
National Assessment Governing Board, 2017
The National Assessment of Educational Progress (NAEP) is the only continuing and nationally representative measure of trends in academic achievement of U.S. elementary and secondary school students in various subjects. For more than four decades, NAEP assessments have been conducted periodically in reading, mathematics, science, writing, U.S.…
Descriptors: Mathematics Achievement, Multiple Choice Tests, National Competency Tests, Educational Trends
Cramer, Angelique O. J. – Measurement: Interdisciplinary Research and Perspectives, 2012
What is validity? A simple question but apparently one with many answers, as Paul Newton highlights in his review of the history of validity. The current definition of validity, as entertained in the 1999 "Standards for Educational and Psychological Testing," is indeed a consensus, one between the classical notion of attributes, and measures…
Descriptors: Validity, Educational Testing, Depression (Psychology), Psychology
National Assessment Governing Board, 2017
Since 1973, the National Assessment of Educational Progress (NAEP) has gathered information about student achievement in mathematics. Results of these periodic assessments, produced in print and web-based formats, provide valuable information to a wide variety of audiences. They inform citizens about the nature of students' comprehension of the…
Descriptors: Mathematics Tests, Mathematics Achievement, Mathematics Instruction, Grade 4
Abayomi, Kobi; Pizarro, Gonzalo – Social Indicators Research, 2013
We offer a straightforward framework for measurement of progress, across many dimensions, using cross-national social indices, which we classify as linear combinations of multivariate country level data onto a univariate score. We suggest a Bayesian approach which yields probabilistic (confidence type) intervals for the point estimates of country…
Descriptors: Bayesian Statistics, Intervals, Guidelines, Measurement
Kruyen, Peter M.; Emons, Wilco H. M.; Sijtsma, Klaas – International Journal of Testing, 2012
Personnel selection shows an enduring need for short stand-alone tests consisting of, say, 5 to 15 items. Despite their efficiency, short tests are more vulnerable to measurement error than longer test versions. Consequently, the question arises to what extent reducing test length deteriorates decision quality due to increased impact of…
Descriptors: Measurement, Personnel Selection, Decision Making, Error of Measurement
Sadaghiani, Homeyra R.; Pollock, Steven J. – Physical Review Special Topics - Physics Education Research, 2015
As part of an ongoing investigation of students' learning in first semester upper-division quantum mechanics, we needed a high-quality conceptual assessment instrument for comparing outcomes of different curricular approaches. The process of developing such a tool started with converting a preliminary version of a 14-item open-ended quantum…
Descriptors: Science Instruction, Quantum Mechanics, Mechanics (Physics), Multiple Choice Tests
van der Ark, L. Andries; Bergsma, Wicher P. – Psychometrika, 2010
In contrast to dichotomous item response theory (IRT) models, most well-known polytomous IRT models do not imply stochastic ordering of the latent trait by the total test score (SOL). This has been thought to make the ordering of respondents on the latent trait using the total test score questionable and throws doubt on the justifiability of using…
Descriptors: Scores, Nonparametric Statistics, Item Response Theory, Models
Anderson, Lezley Barker – ProQuest LLC, 2013
The purpose of this causal-comparative study was to examine whether differences exist in the mathematics achievement of fifth grade gifted students based on the instructional delivery model used for mathematics instruction, cluster or collaborative, as defined by the Georgia Department of Education. The content area of mathematics, an area…
Descriptors: Comparative Analysis, Grade 5, Elementary School Students, Academically Gifted
Kreiner, Svend – Applied Psychological Measurement, 2011
To rule out the need for a two-parameter item response theory (IRT) model during item analysis by Rasch models, it is important to check the Rasch model's assumption that all items have the same item discrimination. Biserial and polyserial correlation coefficients measuring the association between items and restscores are often used in an informal…
Descriptors: Item Analysis, Correlation, Item Response Theory, Models
Webber, Douglas A. – Economics of Education Review, 2012
Using detailed individual-level data from public universities in the state of Ohio, I estimate the effect of various institutional expenditures on the probability of graduating from college. Using a competing risks regression framework, I find differential impacts of expenditure categories across student characteristics. I estimate that student…
Descriptors: Student Characteristics, Educational Finance, Measurement, Probability
Livingston, Samuel A. – 1976
A distinction is made between reliability of measurement and reliability of classification; the "criterion-referenced reliability coefficient" describes the former. Application of this coefficient to the probability distribution of possible scores for a single student yields a meaningful way to describe the reliability of a single score. (Author)
Descriptors: Classification, Criterion Referenced Tests, Error of Measurement, Measurement