Showing 1 to 15 of 31 results
Wenjing Guo – ProQuest LLC, 2021
Constructed response (CR) items are widely used in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district and state-level assessments in the United States. One unique feature of CR items is that they depend on human raters to assess the quality of examinees' work. The judgment of human…
Descriptors: National Competency Tests, Responses, Interrater Reliability, Error of Measurement
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
Katzman, John – New England Journal of Higher Education, 2014
It is so easy to criticize the SAT that most observers overlook the weaknesses of its architect, the College Board. This author contends that, until the latter is replaced, however, the former will never be fixed. The College Board has every incentive to create a complex, stressful, expensive college admissions system. Because it is accountable to…
Descriptors: Standardized Tests, Testing Programs, Program Administration, Cost Effectiveness
Peer reviewed
Ling, Guangming – International Journal of Testing, 2016
To investigate possible iPad related mode effect, we tested 403 8th graders in Indiana, Maryland, and New Jersey under three mode conditions through random assignment: a desktop computer, an iPad alone, and an iPad with an external keyboard. All students had used an iPad or computer for six months or longer. The 2-hour test included reading, math,…
Descriptors: Educational Testing, Computer Assisted Testing, Handheld Devices, Computers
Peer reviewed
Casey, Beth M.; Lombardi, Caitlin McPherran; Pollock, Amanda; Fineman, Bonnie; Pezaris, Elizabeth – Journal of Cognition and Development, 2017
This study investigated longitudinal pathways leading from early spatial skills in first-grade girls to their fifth-grade analytical math reasoning abilities (N = 138). First-grade assessments included spatial skills, verbal skills, addition/subtraction skills, and frequency of choice of a decomposition or retrieval strategy on the…
Descriptors: Females, Arithmetic, Mathematics Instruction, Predictor Variables
Peer reviewed
Mislevy, Robert J. – Educational Measurement: Issues and Practice, 2012
This article presents the author's observations on Neil Dorans's NCME Career Award Address: "The Contestant Perspective on Taking Tests: Emanations from the Statue within." He calls attention to some points that Dr. Dorans made in his address, and offers his thoughts in response.
Descriptors: Testing, Test Reliability, Psychometrics, Scores
Peer reviewed
Wu, Hung-Hsi – Journal of Mathematics Education at Teachers College, 2012
This article makes two simple observations about high-stakes assessments. The first is that, because mathematics is a very technical subject, an assessment item can be mathematically flawed regardless of how elementary it is. For this reason, every assessment project needs the active participation of high level mathematicians. A second point is…
Descriptors: Mathematics Education, High Stakes Tests, Student Evaluation, Scores
Peer reviewed
Tong, Ye; Kolen, Michael J. – Educational Measurement: Issues and Practice, 2010
"Scaling" is the process of constructing a score scale that associates numbers or other ordered indicators with the performance of examinees. Scaling typically is conducted to aid users in interpreting test results. This module describes different types of raw scores and scale scores, illustrates how to incorporate various sources of…
Descriptors: Test Results, Scaling, Measures (Individuals), Raw Scores
Peer reviewed
Bennett, Randy Elliot; Persky, Hilary; Weiss, Andy; Jenkins, Frank – Journal of Technology, Learning, and Assessment, 2010
This paper describes a study intended to demonstrate how an emerging skill, problem solving with technology, might be measured in the National Assessment of Educational Progress (NAEP). Two computer-delivered assessment scenarios were designed, one on solving science-related problems through electronic information search and the other on solving…
Descriptors: National Competency Tests, Problem Solving, Technology Uses in Education, Computer Assisted Testing
Peer reviewed
Somers, Marie-Andree; Zhu, Pei; Wong, Edmond – National Center for Education Evaluation and Regional Assistance, 2011
This study examines the practical implications of using state tests to measure student achievement in impact evaluations that span multiple states and grades. In particular, the study examines the sensitivity of impact findings to (1) the type of assessment used to measure achievement (state tests or an external assessment administered by the…
Descriptors: Evaluators, Grades (Scholastic), Academic Achievement, Program Effectiveness
Peer reviewed
Nichols, Paul; Kuehl, Barbara Jean – Applied Measurement in Education, 1999
An approach is presented that can predict internal consistency of cognitively complex assessments on two dimensions, those of adding tasks with similar or different solution strategies and adding test takers with different solution strategies. Data from the 1992 National Assessment of Educational Progress mathematics assessment are used to…
Descriptors: Cognitive Tests, Mathematics Tests, Prediction, Test Reliability
Peer reviewed
Burton, Nancy W. – Educational and Psychological Measurement, 1981
This study was concerned with selecting a measure of scorer agreement for use with the National Assessment of Educational Progress. The simple percent of agreement and Cohen's kappa were compared. It was concluded that Cohen's kappa does not add sufficient information to make its calculation worthwhile. (Author/BW)
Descriptors: Educational Assessment, Elementary Secondary Education, Quality Control, Scoring
Hogan, Thomas P.; Mishler, Carol – 1982
This literature review summarizes what is currently known about the agreement among six measures of writing skills. Three of these methods involve the application of human judgment in scoring or rating a piece of writing: holistic, analytical, and primary trait scoring. Two methods involve anatomical or taxonomic analysis of a piece of writing:…
Descriptors: Comparative Testing, Criterion Referenced Tests, Measurement Techniques, Scoring
Farrell, Edmund J. – 1971
Conclusions from an examination of the results of the National Assessment of Educational Progress indicate that it furnishes little help for those involved in the publication of composition textbooks. Four main difficulties in making inferences from the Assessment data on writing are (1) it is not clear why individuals perform as well or as poorly…
Descriptors: Comparative Testing, Educational Research, Evaluation, Test Interpretation
Peer reviewed
Burton, Nancy W. – Journal of Educational Measurement, 1980
Analysis of variance methods were used to investigate the reliability of scores on open ended items in the National Assessment of Educational Progress. The study was designed to determine their stability over seven different scorers and time of scoring during a three-month interval. (Author/CTM) Aspect of National Assessment (NAEP) dealt with in…
Descriptors: Career Development, Educational Assessment, Elementary Secondary Education, Item Analysis