Stinson, Jessica – ProQuest LLC, 2024
Intelligence tests have been used in the United States since the early 1900s for assessing soldiers during World War I (Kaufman & Harrison, 2008; White & Hall, 1980). Presently, cognitive assessments are used in school, civil service, military, clinical, and industry settings (White & Hall, 1980). Although the results of these…
Descriptors: Graduate Students, Masters Programs, Doctoral Programs, Comparative Analysis
Relkin, Emily; Johnson, Sara K.; Bers, Marina U. – Educational Technology & Society, 2023
"TechCheck" is an assessment of Computational Thinking (CT) for early elementary school children consisting of fifteen developmentally appropriate unplugged challenges that probe six CT domains. The first version of "TechCheck" showed good psychometric properties as well as ease of administration and scoring in a validation…
Descriptors: Elementary School Students, Developmentally Appropriate Practices, Computation, Thinking Skills
Betts, Joe; Muntean, William; Kim, Doyoung; Kao, Shu-chuan – Educational and Psychological Measurement, 2022
The multiple response structure can underlie several different technology-enhanced item types. With the increased use of computer-based testing, multiple response items are becoming more common. This response type holds the potential for being scored polytomously for partial credit. However, there are several possible methods for computing raw…
Descriptors: Scoring, Test Items, Test Format, Raw Scores
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2018
The Rasch facets model was developed to account for facet data, such as student essays graded by raters, but it accounts for only one kind of rater effect (severity). In practice, raters may exhibit various tendencies such as using middle or extreme scores in their ratings, which is referred to as the rater centrality/extremity response style. To…
Descriptors: Scoring, Models, Interrater Reliability, Computation
Brann, Kristy L.; Boone, William J.; Splett, Joni W.; Clemons, Courtney; Bidwell, Sarah L. – Journal of Psychoeducational Assessment, 2021
Given the important role that teachers play in supporting student mental health, it is critical teachers feel confident in their ability to fill such roles. To inform strategies intended to improve teacher confidence in supporting student mental health, a psychometrically sound tool assessing teacher school mental health self-efficacy is needed.…
Descriptors: Teacher Surveys, Test Construction, Psychometrics, Mental Health
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
Goodrich, J. Marc; Lonigan, Christopher J. – Journal of Child Language, 2018
This study evaluated the development of vocabulary knowledge over the course of two academic years, beginning in preschool, in a large sample (N = 944) of language-minority children using scores from single-language vocabulary assessments and conceptual scores. Results indicated that although children began the study with higher raw scores for…
Descriptors: Language Acquisition, Native Language, Second Language Learning, Vocabulary Development
Livingston, Samuel A. – Educational Testing Service, 2014
This booklet grew out of a half-day class on equating that author Samuel Livingston teaches for new statistical staff at Educational Testing Service (ETS). The class is a nonmathematical introduction to the topic, emphasizing conceptual understanding and practical applications. The class consists of illustrated lectures, interspersed with…
Descriptors: Equated Scores, Scoring, Self Evaluation (Individuals), Scores
Sheehan, Dwayne P.; Lafave, Mark R.; Katz, Larry – Measurement in Physical Education and Exercise Science, 2011
This study was designed to test the intra- and inter-rater reliability of the University of North Carolina's Balance Error Scoring System in 9- and 10-year-old children. Additionally, a modified version of the Balance Error Scoring System was tested to determine if it was more sensitive in this population ("raw scores"). Forty-six…
Descriptors: Elementary School Students, Interrater Reliability, Scoring, Raw Scores
van Kleeck, Anne; Lange, Alissa; Schwarz, Amy Louise – Journal of Speech, Language, and Hearing Research, 2011
Purpose: The Renfrew Bus Story--North American Edition (RBS-NA; C. Glasgow & J. Cowley, 1994) is widely used in clinical and research settings to determine children's language abilities, although possible influences of race and maternal education on RBS-NA performance are unknown. The current study compared RBS-NA retells of 4 groups of children:…
Descriptors: Sentences, Mothers, Scoring, Raw Scores
Hennessey, Stephen – International Journal of Disability, Development and Education, 2011
This article describes a method for identifying test items as disability neutral for children with vision and motor disabilities. Graduate students rated 130 items of the Preschool Language Scale and obtained inter-rater correlation coefficients of 0.58 for ratings of items as disability neutral for children with vision disability, and 0.77 for…
Descriptors: Graduate Students, Test Items, Physical Disabilities, Multiple Disabilities
Klesch, Heather S. – ProQuest LLC, 2010
The reporting of scores on educational tests is at times misunderstood, misinterpreted, and potentially confusing to examinees and other stakeholders who may need to interpret test scores. In reporting test results to examinees, there is a need for clarity in the message communicated. As pressure rises for students to demonstrate performance at a…
Descriptors: Feedback (Response), Test Results, Focus Groups, Educational Testing

May, Kim O.; Nicewander, W. Alan – Journal of Educational Measurement, 1997
Dato de Gruijter is correct in the recent conclusion that one equation derived by the present authors should be changed to reflect that it is an approximation, but it is still argued that the conclusion holds that percentile ranks for difficult tests can have substantially lower reliability and information relative to their number-correct scores. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Raw Scores, Reliability

Scheiblechner, Hartmann – Psychometrika, 1995
The isotonic ordinal probabilistic model (ISOP) is introduced as a common nonparametric theoretical structure for unidimensional models for quantitative, ordinal, and dichotomous variables. Fundamental theorems on dichotomous and polytomous weakly independent ordered systems are derived, and testing at the observed empirical level is discussed.…
Descriptors: Equations (Mathematics), Nonparametric Statistics, Probability, Raw Scores

Haynes, Jack R. – Educational and Psychological Measurement, 1975
Descriptors: Classification, Comparative Analysis, Factor Analysis, Factor Structure