Showing 1 to 15 of 16 results
Peer reviewed
PDF on ERIC
Amanda A. Wolkowitz; Russell Smith – Practical Assessment, Research & Evaluation, 2024
A decision consistency (DC) index is an estimate of the consistency of a classification decision on an exam. More specifically, DC estimates the percentage of examinees who would receive the same classification decision if they retook the same or a parallel form of the exam without memory of the first attempt.…
Descriptors: Testing, Test Reliability, Replication (Evaluation), Decision Making
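To make the DC idea concrete, here is a minimal Python sketch with simulated data: two parallel forms are generated as true score plus independent error, and DC is estimated as the share of examinees who receive the same pass/fail decision on both. The sample size, score scale, error SD, and cut score are illustrative, not values from the article.

```python
import numpy as np

# Minimal simulation of the decision-consistency (DC) idea: the share of
# examinees classified the same way on two parallel forms of an exam.
# All parameters (N, score scale, error SD, cut score) are illustrative.
rng = np.random.default_rng(0)

n_examinees = 10_000
true_score = rng.normal(70, 10, n_examinees)   # hypothetical true scores
error_sd = 5.0                                 # per-form measurement error
cut = 65.0                                     # hypothetical passing score

# Two independent administrations of parallel forms
form_a = true_score + rng.normal(0, error_sd, n_examinees)
form_b = true_score + rng.normal(0, error_sd, n_examinees)

same_decision = (form_a >= cut) == (form_b >= cut)
print(f"Estimated decision consistency: {same_decision.mean():.3f}")
```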
Peer reviewed
Direct link
Feinberg, Richard A. – Educational Measurement: Issues and Practice, 2021
Unforeseen complications during the administration of large-scale testing programs are inevitable and can prevent examinees from accessing all test material. For classification tests in which the primary purpose is to yield a decision, such as a pass/fail result, the current study investigated a model-based standard error approach, Bayesian…
Descriptors: High Stakes Tests, Classification, Decision Making, Bayesian Statistics
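One plausible reading of a model-based approach, sketched in Python: treat the ability estimate and its standard error as a normal posterior and compute the probability that true ability clears the passing standard. The numbers, and the normal approximation itself, are assumptions for illustration; the study's actual Bayesian procedure may differ.

```python
from scipy.stats import norm

# Hedged sketch: given an ability estimate and its (model-based) standard
# error, approximate the probability that the examinee's true ability lies
# at or above the passing standard. Numbers are illustrative only.
def pass_probability(theta_hat: float, se: float, theta_cut: float) -> float:
    """P(true theta >= cut) under a normal approximation to the posterior."""
    return 1.0 - norm.cdf(theta_cut, loc=theta_hat, scale=se)

# A shortened test (inaccessible material) typically widens the standard
# error, pulling the pass probability toward 0.5 near the cut score.
print(pass_probability(0.30, 0.25, 0.0))  # full-length test
print(pass_probability(0.30, 0.45, 0.0))  # test with inaccessible items
```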
Peer reviewed
Direct link
Liu, Ren; Qian, Hong; Luo, Xiao; Woo, Ada – Educational and Psychological Measurement, 2018
Subscore reporting under item response theory models has always been a challenge, partly because the test length of each subdomain is too short to locate individuals precisely on multiple continua. Diagnostic classification models (DCMs), which provide a pass/fail decision and an associated probability of passing each subdomain, are promising…
Descriptors: Classification, Probability, Pass Fail Grading, Scores
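A toy illustration of the DCM output described here, assuming a simple DINA-style model in which a single attribute underlies the subdomain: the posterior probability of mastery is computed from the item responses, with slip and guess rates that are purely illustrative, not values from the article.

```python
import numpy as np

# Toy DINA-style calculation for one subdomain: posterior probability of
# mastery given item responses, slip, and guess parameters.
def mastery_posterior(responses, slip, guess, prior=0.5):
    responses = np.asarray(responses, dtype=float)
    # Likelihood of the response string under mastery vs. non-mastery
    p_master = np.prod(np.where(responses == 1, 1 - slip, slip))
    p_nonmaster = np.prod(np.where(responses == 1, guess, 1 - guess))
    return prior * p_master / (prior * p_master + (1 - prior) * p_nonmaster)

# Five items on the subdomain, each with slip = .10 and guess = .20
post = mastery_posterior([1, 1, 0, 1, 1], slip=0.10, guess=0.20)
print(f"P(mastery) = {post:.3f}; decision: {'pass' if post >= 0.5 else 'fail'}")
```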
Peer reviewed
PDF on ERIC
Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019
Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…
Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy
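A hedged sketch of the cost-benefit logic, assuming a point-biserial screening rule (a common criterion, not necessarily the one the study examined): tighten the cutoff and track the cost (items lost) against the benefit (change in coefficient alpha) on simulated data.

```python
import numpy as np

# Cost-benefit sketch: vary an item-screening cutoff and compare items
# retained against the resulting coefficient alpha. Data are simulated.
rng = np.random.default_rng(1)
ability = rng.normal(size=2000)
difficulty = rng.uniform(-1.5, 1.5, size=40)
discrimination = rng.uniform(0.2, 1.2, size=40)
p = 1 / (1 + np.exp(-discrimination * (ability[:, None] - difficulty)))
X = rng.binomial(1, p)                     # 2000 examinees x 40 items

def alpha(X):
    """Cronbach's alpha from a 0/1 item-response matrix."""
    k = X.shape[1]
    return k / (k - 1) * (1 - X.var(axis=0, ddof=1).sum()
                          / X.sum(axis=1).var(ddof=1))

total = X.sum(axis=1)
r_pb = np.array([np.corrcoef(X[:, j], total - X[:, j])[0, 1]
                 for j in range(X.shape[1])])   # corrected item-total r

for cutoff in (0.00, 0.10, 0.20, 0.30):
    keep = r_pb >= cutoff
    print(f"cutoff {cutoff:.2f}: {keep.sum():2d} items kept, "
          f"alpha = {alpha(X[:, keep]):.3f}")
```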
Peer reviewed
PDF on ERIC
Casey, Kevin – Journal of Learning Analytics, 2017
Learning analytics offers insights into student behaviour and the potential to detect poor performers before they fail exams. When the activity is primarily online (for example, computer programming), a wealth of low-level data becomes available that allows unprecedented accuracy in predicting which students will pass or fail. In this paper, we…
Descriptors: Keyboarding (Data Entry), Educational Research, Data Collection, Data Analysis
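A sketch of the general approach with hypothetical features: fit a logistic regression predicting pass/fail from simulated low-level activity measures. The feature names (keystroke latency, active hours) and the data-generating model are placeholders, not the paper's actual predictors.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Predict pass/fail from simulated low-level activity features.
rng = np.random.default_rng(2)
n = 500
latency = rng.normal(300, 60, n)          # ms between keystrokes (assumed)
active_hours = rng.gamma(4, 3, n)         # hours of observed activity
# Simulated outcome: more activity and faster typing raise the pass odds
logit = -2.0 + 0.15 * active_hours - 0.004 * (latency - 300)
passed = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([latency, active_hours])
X_tr, X_te, y_tr, y_te = train_test_split(X, passed, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
print(f"Held-out accuracy: {model.score(X_te, y_te):.3f}")
```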
Peer reviewed
Direct link
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Peer reviewed
Direct link
van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012
While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. The latter is the situation in the Dutch system of final examinations for secondary education, which is used as an example in this paper. This…
Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making
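A simplified simulation of the multi-test decision problem: each subject grade carries its own measurement error, and the pass/fail decision depends on the whole grade profile. The compensatory rule below (mean grade of at least 5.5 on a 1-10 scale) is an assumption for illustration, not the actual Dutch examination regulations.

```python
import numpy as np

# Decision accuracy when the pass/fail decision aggregates several
# error-prone subject grades. All numbers are illustrative.
rng = np.random.default_rng(3)
n_students, n_subjects = 20_000, 6
true_grades = rng.normal(6.5, 1.0, (n_students, n_subjects))
sem = 0.5                                  # per-subject error (assumed)

true_pass = true_grades.mean(axis=1) >= 5.5
observed = true_grades + rng.normal(0, sem, (n_students, n_subjects))
observed_pass = observed.mean(axis=1) >= 5.5

accuracy = (true_pass == observed_pass).mean()
print(f"Decision accuracy across the whole examination: {accuracy:.3f}")
```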
Mundy, C. Jean – Journal of Health, Physical Education and Recreation, 1974
Descriptors: Classification, Competency Based Education, Course Objectives, Evaluation
Peer reviewed
Dwyer, Carol Anne – Psychological Assessment, 1996
The uses and abuses of cut scores are examined. The article demonstrates (1) that cut scores always entail judgment; (2) that cut scores inherently result in misclassification; (3) that cut scores impose an artificial dichotomy on an essentially continuous distribution of knowledge, skill, or ability; and (4) that no true cut scores exist. (SLD)
Descriptors: Classification, Cutting Scores, Educational Testing, Error of Measurement
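Points (2) and (3) are easy to demonstrate by simulation: with a continuous ability distribution and fallible scores, misclassification is unavoidable and concentrates near the cut. All numbers below are illustrative.

```python
import numpy as np

# Dichotomizing a continuous ability distribution at a cut score: the
# examinees nearest the cut are the ones most often misclassified.
rng = np.random.default_rng(4)
true = rng.normal(0, 1, 50_000)
observed = true + rng.normal(0, 0.4, 50_000)   # score with measurement error
cut = 0.0

misclassified = (true >= cut) != (observed >= cut)
print(f"Overall misclassification rate: {misclassified.mean():.3f}")
near = np.abs(true - cut) < 0.25
print(f"...among examinees near the cut: {misclassified[near].mean():.3f}")
print(f"...among examinees far from it: {misclassified[~near].mean():.3f}")
```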
Breyer, F. Jay; Lewis, Charles – 1994
A single-administration classification reliability index is described that estimates the probability of consistently classifying examinees to mastery or nonmastery states as if those examinees had been tested with two alternate forms. The procedure is applicable to any test used for classification purposes, subdividing that test into two…
Descriptors: Classification, Cutting Scores, Objective Tests, Pass Fail Grading
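A skeleton of the single-administration idea in Python, assuming a simple odd/even split and a proportionally scaled cut score; the article's procedure is more refined, so treat this as the bare logic only.

```python
import numpy as np

# Split one administration into two half-forms, classify on each against
# a scaled cut, and take the agreement rate as a rough consistency index.
rng = np.random.default_rng(5)
ability = rng.normal(size=5000)
n_items = 40
p = 1 / (1 + np.exp(-(ability[:, None] - rng.uniform(-1, 1, n_items))))
X = rng.binomial(1, p)                       # 5000 examinees x 40 items

half_a, half_b = X[:, 0::2], X[:, 1::2]      # odd/even item split
cut_full = 26                                # hypothetical cut on 40 items
cut_half = cut_full / 2                      # scaled to the 20-item halves

pass_a = half_a.sum(axis=1) >= cut_half
pass_b = half_b.sum(axis=1) >= cut_half
print(f"Half-form classification agreement: {(pass_a == pass_b).mean():.3f}")
```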
Schulz, E. Matthew; Wang, Lin – 2001
In this study, items were drawn from a full-length test of 30 items to construct shorter tests for making accurate pass/fail classifications at a specific criterion point on the latent ability metric. A three-parameter item response theory (IRT) framework was used. The criterion point on the latent ability…
Descriptors: Ability, Classification, Item Response Theory, Pass Fail Grading
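A sketch of one way to build such a short form, assuming simulated 3PL item parameters: rank items by Fisher information at the criterion point and keep the most informative ones. The selection rule is a standard heuristic and may not match the study's exact procedure.

```python
import numpy as np

# Select the items most informative at the pass/fail criterion point
# under a three-parameter logistic (3PL) model. Parameters are simulated.
rng = np.random.default_rng(6)
n_items, theta_cut, k_short = 30, 0.0, 10
a = rng.uniform(0.8, 2.0, n_items)      # discrimination
b = rng.uniform(-2.0, 2.0, n_items)     # difficulty
c = rng.uniform(0.1, 0.25, n_items)     # pseudo-guessing

def info_3pl(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = c + (1 - c) / (1 + np.exp(-a * (theta - b)))
    return a**2 * ((p - c) / (1 - c))**2 * (1 - p) / p

info = info_3pl(theta_cut, a, b, c)
chosen = np.argsort(info)[::-1][:k_short]
print(f"Items selected for the short form: {np.sort(chosen)}")
print(f"Short-form information at the cut: {info[chosen].sum():.2f} "
      f"(full form: {info.sum():.2f})")
```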
Peer reviewed
Spray, Judith A.; Reckase, Mark D. – Journal of Educational and Behavioral Statistics, 1996
Two procedures for classifying examinees into categories, one based on the sequential probability ratio test (SPRT) and the other on sequential Bayes methodology, were compared to determine which required fewer items for classification. Results showed that the SPRT procedure requires fewer items to achieve the same accuracy level. (SLD)
Descriptors: Ability, Bayesian Statistics, Classification, Comparative Analysis
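A minimal Wald SPRT classifier of the kind compared here, with illustrative success probabilities for masters (p1) and non-masters (p0) and nominal error rates: after each item the log-likelihood ratio is updated, and testing stops as soon as a boundary is crossed.

```python
import math

# Sequential probability ratio test for mastery classification.
# p0/p1 and the error rates are illustrative choices.
def sprt_classify(responses, p0=0.5, p1=0.75, alpha=0.05, beta=0.05):
    upper = math.log((1 - beta) / alpha)     # decide "master"
    lower = math.log(beta / (1 - alpha))     # decide "non-master"
    llr = 0.0
    for n, x in enumerate(responses, start=1):
        llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
        if llr >= upper:
            return "master", n
        if llr <= lower:
            return "non-master", n
    return "undecided", len(responses)

decision, items_used = sprt_classify([1, 1, 0, 1, 1, 1, 1, 1, 1, 1])
print(decision, "after", items_used, "items")
```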
Avner, R. A. – 1970
This report compares maximum linear prediction, maximum total correct classifications for a group, and maximum probability of correct classification for an individual as objective criteria for univariate grading scales. Since the goals of valid prediction and valid classification lead to conflicting criteria, it is possible that a compromise…
Descriptors: Achievement Rating, Classification, Evaluation, Evaluation Criteria
DeMauro, Gerald E. – 1989
It is difficult to estimate the percentage of examinees who pass National Teacher Examinations (NTE) tests because many users of the tests require examinees to pass different combinations of tests or use different passing scores for each test. This study first develops a taxonomy of state NTE requirements and then computes passing rates…
Descriptors: Blacks, Classification, Cutting Scores, Ethnicity
Sykes, Robert C.; And Others – 1992
A part-form methodology was used to study the effect of varying degrees of multidimensionality on the consistency of pass/fail classification decisions obtained from simulated unidimensional item response theory (IRT)-based licensure examinations. A control on the degree of form multidimensionality permitted an assessment throughout the range of…
Descriptors: Classification, Comparative Testing, Computer Simulation, Decision Making
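A compact sketch of the part-form logic under assumed conditions (two correlated ability dimensions, with forms weighted toward different dimensions): pass/fail agreement between the part-forms drops as the forms diverge in what they measure. The design and numbers are illustrative, not the study's actual conditions.

```python
import numpy as np

# Simulate responses driven by two correlated ability dimensions, score
# two part-forms that load on the dimensions differently, and check
# pass/fail agreement between them.
rng = np.random.default_rng(7)
n = 10_000
rho = 0.6                                   # correlation between dimensions
theta = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], n)

def simulate_form(theta, w, n_items=20):
    """Responses to items loading on dim 1 with weight w, dim 2 with 1-w."""
    composite = w * theta[:, 0] + (1 - w) * theta[:, 1]
    b = rng.uniform(-1, 1, n_items)
    p = 1 / (1 + np.exp(-(composite[:, None] - b)))
    return rng.binomial(1, p).sum(axis=1)

cut = 12                                    # hypothetical cut on 20 items
pass_a = simulate_form(theta, w=0.8) >= cut   # form leaning on dimension 1
pass_b = simulate_form(theta, w=0.2) >= cut   # form leaning on dimension 2
print(f"Pass/fail consistency across part-forms: {(pass_a == pass_b).mean():.3f}")
```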