Showing 1 to 15 of 42 results
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Peer reviewed
Trierweiler, Tammy J.; Lewis, Charles; Smith, Robert L. – Journal of Educational Measurement, 2016
In this study, we describe what factors influence the observed score correlation between an (external) anchor test and a total test. We show that the anchor to full-test observed score correlation is based on two components: the true score correlation between the anchor and total test, and the reliability of the anchor test. Findings using an…
Descriptors: Scores, Correlation, Tests, Test Reliability
Peer reviewed
Otoyo, Lucia; Bush, Martin – Practical Assessment, Research & Evaluation, 2018
This article presents the results of an empirical study of "subset selection" tests, which are a generalisation of traditional multiple-choice tests in which test takers are able to express partial knowledge. Similar previous studies have mostly been supportive of subset selection, but the deduction of marks for incorrect responses has…
Descriptors: Multiple Choice Tests, Grading, Test Reliability, Test Format
Peer reviewed
Ketterlin-Geller, Leanne R.; Perry, Lindsey; Platas, Linda M.; Sitbakhan, Yasmin – Global Education Review, 2018
Test scoring procedures should align with the intended uses and interpretations of test results. In this paper, we examine three test scoring procedures for an operational assessment of early numeracy, the Early Grade Mathematics Assessment (EGMA). The EGMA is an assessment that tests young children's foundational mathematics knowledge and has…
Descriptors: Alignment (Education), Scoring, Test Use, Mathematics Tests
Allen, Jeff M.; Mattern, Krista – ACT, Inc., 2019
States and districts have expressed interest in administering the ACT® to 10th-grade students. Given that the ACT was designed to be administered in the spring of 11th grade or fall of 12th grade, the appropriateness of this use should be evaluated. As such, the focus of this paper is to summarize empirical evidence evaluating the use of the ACT…
Descriptors: Test Validity, College Entrance Examinations, High School Students, Grade 10
Peer reviewed
Gokturk, Nazlinur – Language Assessment Quarterly, 2018
The number of public school students who are English learners (ELs) has been increasing steadily in the United States. According to a report by Snyder, de Brey, and Dillow (2016), in the 2014-2015 school year, nearly 4.6 million students enrolled in Kindergarten through grade 12 (K-12) in U.S. schools were English learners, representing…
Descriptors: Public Schools, English (Second Language), Second Language Learning, Second Language Instruction
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
New York State Education Department, 2015
This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes utilized to develop and implement the NYSAA program, and Stakeholder involvement in those processes. By comparing the intent of the NYSAA with its process and design, the validity of the…
Descriptors: Alternative Assessment, Grade 3, Grade 4, Grade 5
Peer reviewed
Kambas, A.; Venetsanou, F.; Giannakidou, D.; Fatouros, I. G.; Avloniti, A.; Chatzinikolaou, A.; Draganidis, D.; Zimmer, R. – Research in Developmental Disabilities: A Multidisciplinary Journal, 2012
Given the negative influence of motor difficulties on people's quality of life, their early identification seems to be crucial, and consequently the information provided by a sound assessment tool is of great importance. The aim of this study was to examine the suitability of the MOT 4-6 (Zimmer & Volkamer, 1987) for use with preschoolers in…
Descriptors: Quality of Life, Preschool Children, Identification, Psychomotor Skills
Proctor, Thomas P.; Kim, YoungKoung Rachel – College Board, 2009
Presented at the national conference for the American Educational Research Association (AERA) in April 2009. This study examined the utility of scores on the SAT writing test, specifically examining the reliability of scores using generalizability and item response theories. The study also provides an overview of current predictive validity…
Descriptors: College Entrance Examinations, Writing Tests, Psychometrics, Predictive Validity
Peer reviewed
Burrell, Brenda; And Others – Educational and Psychological Measurement, 1995
The measurement characteristics of the Perceived Adequacy of Resources Scale, a measure of family functioning, were investigated. The reliability and validity of total and subtest scores were studied with 113 mothers. Results were generally favorable regarding the integrity of scores from the measure. (SLD)
Descriptors: Family Characteristics, Mothers, Psychometrics, Scores
Peer reviewed
Graham, Steve – Journal of School Psychology, 1986
Examines influences of construct, writer, assignment, and rater variables on the evaluation of handwriting products and investigates the instructional applicability of handwriting scales. Results indicated handwriting scales do not provide an adequate means of determining competence, individualizing instruction, or monitoring progress. (Author/ABB)
Descriptors: Evaluation Criteria, Handwriting, Influences, Instructional Material Evaluation
Peer reviewed
MacKay, Gilbert; Lundie, Jennifer – International Journal of Disability, Development and Education, 1998
Recognizes the attraction of Goal Attainment Scaling (GAS), a technique that uses a scale to measure client's achievement, but suggests that there are concerns about the calculation of its standard scores. Examples show how GAS may be used in service development, whether or not numerical values are attached. (Author/CR)
Descriptors: Achievement Gains, Achievement Rating, Adults, Children
Kirisci, Levent; Clark, Duncan B. – 1996
The reliability and validity of the State-Trait Anxiety Inventory for Children (STAIC) were studied with 675 adolescents aged 12 to 18 recruited from clinical and community sources. The STAIC is a self-report measure that has been widely used to assess state and trait anxiety of children. It has been suggested that the child version may be more…
Descriptors: Adolescents, Anxiety, Children, Factor Structure