Showing all 14 results
Peer reviewed
Jones, Andrew T.; Kopp, Jason P.; Ong, Thai Q. – Educational Measurement: Issues and Practice, 2020
Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form…
Descriptors: Test Construction, Test Bias, Classification, Accuracy
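As a rough illustration of the selection-invariance idea in the abstract above, the sketch below simulates classification accuracy at a common cut score for two groups. It is illustrative only, not the authors' method: the group distributions, the error model, and the cut score are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed latent traits and a noisy test score for two groups; all
# distributions, the error model, and the cut score are illustrative.
theta_a = rng.normal(0.5, 1.0, 5000)    # group A latent trait
theta_b = rng.normal(0.0, 1.0, 5000)    # group B latent trait
status_a, status_b = theta_a >= 0.0, theta_b >= 0.0   # true classifications

x_a = theta_a + rng.normal(0.0, 0.5, 5000)   # observed score = trait + error
x_b = theta_b + rng.normal(0.0, 0.5, 5000)   # same error model in both groups

cut = 0.0   # common cut score applied to both groups
acc_a = np.mean((x_a >= cut) == status_a)
acc_b = np.mean((x_b >= cut) == status_b)
print(f"classification accuracy: group A {acc_a:.3f}, group B {acc_b:.3f}")
```

With these assumed values the test behaves identically in both groups, yet the group whose trait distribution is centered at the cut score is classified less accurately; that is the sense in which selection invariance can fail even without measurement bias.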
Peer reviewed
Adrian Adams; Lauren Barth-Cohen – CBE - Life Sciences Education, 2024
In undergraduate research settings, students are likely to encounter anomalous data, that is, data that do not meet their expectations. Most of the research that directly or indirectly captures the role of anomalous data in research settings uses post-hoc reflective interviews or surveys. These data collection approaches focus on recall of past…
Descriptors: Undergraduate Students, Physics, Science Instruction, Laboratory Experiments
Peer reviewed
Grundin, Hans U. – Literacy, 2018
This paper aims to present a critical analysis of the Year 1 Phonics Screening Check (PSC), with special focus on the relationship between the UK Department for Education's policy-making and the evidence considered in the process of developing and evaluating the PSC. The reports from the in-house Standards and Testing Agency and from commissioned…
Descriptors: Foreign Countries, Criticism, Screening Tests, Phonics
Peer reviewed
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions; the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize tests in educational policies and quality assurance, their validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
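The abstract is truncated, but the descriptors point to item response theory. As a minimal, generic illustration of the kind of IRT machinery involved (not anything specific to this paper), the sketch below evaluates the Rasch item response function; the ability and difficulty values are arbitrary.

```python
import math

def rasch_p(theta: float, b: float) -> float:
    """Rasch model: P(correct) for ability theta on an item of
    difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Arbitrary values: an average-ability examinee (theta = 0) on items
# of varying difficulty.
for b in (-1.0, 0.0, 1.0):
    print(f"difficulty {b:+.1f}: P(correct) = {rasch_p(0.0, b):.3f}")
```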
Colton, Dean A.; Gao, Xiaohong; Harris, Deborah J.; Kolen, Michael J.; Martinovich-Barhite, Dara; Wang, Tianyou; Welch, Catherine J. – 1997
This collection consists of six papers, each dealing with some aspect of reliability and performance testing. Each paper has an abstract, and each contains its own references. Papers include: (1) "Using Reliabilities To Make Decisions" (Deborah J. Harris); (2) "Conditional Standard Errors, Reliability, and Decision Consistency…
Descriptors: Decision Making, Error of Measurement, Item Response Theory, Performance Based Assessment
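One topic listed above, conditional standard errors, has a classic closed form under Lord's binomial error model. Whether the paper in this collection uses exactly that model is not stated, so treat the sketch below as a generic illustration.

```python
import math

def binomial_csem(x: int, n: int) -> float:
    """Conditional standard error of measurement at raw score x on an
    n-item test under Lord's binomial error model: sqrt(x(n-x)/(n-1))."""
    return math.sqrt(x * (n - x) / (n - 1))

# The conditional SEM peaks mid-range and vanishes at the extremes.
n = 40
for x in (5, 20, 35):
    print(f"score {x}/{n}: CSEM = {binomial_csem(x, n):.2f}")
```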
Livingston, Samuel A. – 1978
The traditional reliability coefficient and standard error of measurement are not adequate measures of reliability for tests used to make pass/fail decisions. Answering the important reliability questions requires estimation of the joint distribution of true and observed scores. Lord's "Method 20" estimates this distribution without the…
Descriptors: Cutting Scores, Decision Making, Efficiency, Error of Measurement
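Lord's "Method 20" fits a flexible true-score distribution; a simpler stand-in that conveys the same idea of a joint true/observed-score model is the two-parameter beta-binomial (Keats-Lord) model sketched below. The item count and beta parameters are assumptions for illustration, not values from the paper.

```python
import numpy as np
from math import comb
from scipy.special import betaln

def beta_binomial_pmf(x: int, n: int, a: float, b: float) -> float:
    """Marginal P(observed score = x) on an n-item test when true
    proportion-correct scores are Beta(a, b) and observed scores are
    binomial given the true score (Keats-Lord model)."""
    return comb(n, x) * float(np.exp(betaln(a + x, b + n - x) - betaln(a, b)))

# Assumed illustrative values: a 20-item test with Beta(6, 4) true scores.
n, a, b = 20, 6.0, 4.0
pmf = [beta_binomial_pmf(x, n, a, b) for x in range(n + 1)]
print(f"sum of probabilities: {sum(pmf):.6f}")   # sanity check: ~1
```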
Peer reviewed Peer reviewed
Livingston, Samuel A.; Wingersky, Marilyn A. – Journal of Educational Measurement, 1979
Procedures are described for studying the reliability of decisions based on specific passing scores with tests made up of discrete items and designed to measure continuous rather than categorical traits. These procedures are based on the estimation of the joint distribution of true scores and observed scores. (CTM)
Descriptors: Cutting Scores, Decision Making, Efficiency, Error of Measurement
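Given such a joint distribution, the decision-reliability question reduces to the chance that two parallel forms classify an examinee the same way. The sketch below computes that probability under an assumed beta-binomial true-score model with an arbitrary cut score; it is a generic illustration, not the authors' procedure.

```python
import numpy as np
from scipy.stats import beta, binom

# Assumed beta-binomial true-score model and cut score (illustrative).
n, cut = 20, 14        # test length and passing score
a, b = 6.0, 4.0        # Beta(a, b) distribution of true scores

# P(same decision on two parallel forms), averaged over true scores.
t = np.linspace(0.0, 1.0, 2001)         # grid of true proportion-correct
dt = t[1] - t[0]
p_pass = binom.sf(cut - 1, n, t)        # P(X >= cut | true score t)
agree = p_pass**2 + (1.0 - p_pass)**2   # both pass or both fail
consistency = float(np.sum(agree * beta.pdf(t, a, b)) * dt)
print(f"estimated decision consistency: {consistency:.3f}")
```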
Thompson, Bruce; Crowley, Susan – 1994
Most training programs in education and psychology focus on classical test theory techniques for assessing score dependability. This paper discusses generalizability theory and explores its concepts using a small heuristic data set. Generalizability theory subsumes and extends classical test score theory. It is able to estimate the magnitude of…
Descriptors: Analysis of Variance, Cutting Scores, Decision Making, Error of Measurement
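As a concrete taste of the generalizability analysis the paper walks through, the sketch below estimates variance components for a small, made-up persons-by-items data set via the usual expected-mean-square equations, then forms coefficients for relative and absolute decisions. The data are assumptions; the paper's own heuristic data set is not reproduced here.

```python
import numpy as np

# Assumed small heuristic data set: 5 persons x 4 items (not the
# paper's own data).
X = np.array([[7, 6, 7, 5],
              [9, 8, 9, 8],
              [4, 5, 3, 4],
              [6, 6, 7, 6],
              [8, 7, 9, 8]], dtype=float)
n_p, n_i = X.shape
grand = X.mean()

# Mean squares for the crossed persons x items design.
ms_p = n_i * np.sum((X.mean(axis=1) - grand) ** 2) / (n_p - 1)
ms_i = n_p * np.sum((X.mean(axis=0) - grand) ** 2) / (n_i - 1)
resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1))

# Variance components from the expected-mean-square equations.
var_p = max((ms_p - ms_res) / n_i, 0.0)   # universe-score (person) variance
var_i = max((ms_i - ms_res) / n_p, 0.0)   # item variance
var_pi = ms_res                           # person x item interaction + error

# Coefficients for relative and absolute decisions over n_i items.
g_rel = var_p / (var_p + var_pi / n_i)
phi = var_p / (var_p + (var_i + var_pi) / n_i)
print(f"E(rho^2) = {g_rel:.3f}, Phi = {phi:.3f}")
```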
Peer reviewed
Mellenbergh, Gideon J.; van der Linden, Wim J. – Applied Psychological Measurement, 1979
For six tests, coefficient delta as an index for internal optimality is computed. Internal optimality is defined as the magnitude of risk of the decision procedure with respect to the true score. Results are compared with an alternative index (coefficient kappa) for assessing the consistency of decisions. (Author/JKS)
Descriptors: Classification, Comparative Analysis, Decision Making, Error of Measurement
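Coefficient delta is the paper's proposal and is not reproduced here; the comparison index, coefficient kappa, is standard and easy to compute from a master/non-master decision table for two administrations, as in the sketch below. The table values are hypothetical.

```python
import numpy as np

def cohens_kappa(table) -> float:
    """Cohen's kappa for a square agreement table (rows: decision on
    administration 1; columns: decision on administration 2)."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    p_o = np.trace(t) / n                          # observed agreement
    p_e = (t.sum(axis=0) @ t.sum(axis=1)) / n**2   # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical master/non-master decisions from two administrations.
print(f"kappa = {cohens_kappa([[40, 5], [7, 48]]):.3f}")
```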
Peer reviewed
Brennan, Robert L.; Johnson, Eugene G. – Educational Measurement: Issues and Practice, 1995
The application of generalizability theory to reliability and error variance estimation for performance assessment scores is discussed. Decision makers concerned with performance assessment need to recognize the restrictions that limit generalizability, such as constraints that reduce the number of tasks possible, rater quality,…
Descriptors: Decision Making, Educational Assessment, Error of Measurement, Estimation (Mathematics)
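For a fully crossed persons x tasks x raters design, one common form of the generalizability coefficient with n_t tasks and n_r raters is the standard G-theory result below (not a formula quoted from the article):

```latex
% Generalizability coefficient for a crossed p x t x r design,
% with n_t tasks and n_r raters (standard G-theory result).
\mathrm{E}\rho^{2} =
  \frac{\sigma^{2}_{p}}
       {\sigma^{2}_{p}
        + \sigma^{2}_{pt}/n_{t}
        + \sigma^{2}_{pr}/n_{r}
        + \sigma^{2}_{ptr,e}/(n_{t}\,n_{r})}
```

The abstract's point is visible in the formula: when practical constraints cap n_t or n_r, the corresponding error terms shrink more slowly, limiting generalizability.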
Marshall, J. Laird – 1976
A summary is provided of: the rationale for questioning the applicability of classical reliability measures to criterion-referenced tests; an extension of the classical theory of true and error scores to incorporate a theory of dichotomous decisions; and a presentation of the mean split-half coefficient of agreement, a single-administration test index…
Descriptors: Career Development, Computer Programs, Criterion Referenced Tests, Decision Making
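The mean split-half coefficient of agreement is defined precisely in the paper; the sketch below is a plausible reconstruction under stated assumptions: items are randomly split in half, each half classifies examinees against a proportionally scaled cutoff, and the proportion of identical classifications is averaged over many random splits. The function name and data are invented for illustration.

```python
import numpy as np

def mean_split_half_agreement(X, cut, n_splits=200, seed=0):
    """Average, over random half-splits of the items, of the proportion
    of examinees classified the same way (master/non-master) by the two
    halves, each scored against a proportionally scaled cutoff."""
    rng = np.random.default_rng(seed)
    n_items = X.shape[1]
    half = n_items // 2
    half_cut = cut * half / n_items          # scale cutoff to half length
    agreements = []
    for _ in range(n_splits):
        perm = rng.permutation(n_items)
        d1 = X[:, perm[:half]].sum(axis=1) >= half_cut
        d2 = X[:, perm[half:half * 2]].sum(axis=1) >= half_cut
        agreements.append(np.mean(d1 == d2))
    return float(np.mean(agreements))

# Hypothetical 0/1 item-response matrix: 50 examinees x 20 items.
rng = np.random.default_rng(1)
X = (rng.random((50, 20)) < rng.uniform(0.3, 0.9, 50)[:, None]).astype(int)
print(f"mean split-half agreement: {mean_split_half_agreement(X, cut=12):.3f}")
```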
Haladyna, Tom – 1976
The existence of criterion-referenced (CR) measurement is questioned in this paper. Despite beliefs that differences exist between the two alternative forms of measurement, CR and norm-referenced (NR), an analysis of philosophical and psychological descriptions of measurement, as well as a growing number of empirical studies, reveals that the common…
Descriptors: Academic Standards, Achievement Tests, Career Development, Comparative Analysis
Fruen, Mary – NCME Measurement in Education, 1978
There are both strengths and weaknesses of using standardized test scores as a criterion for admission to institutions of higher education. The relative importance of scores is dependent on the institution's degree of selectivity. In general, decision processes and admissions criteria are not well defined. Advantages of test scores include: use of…
Descriptors: Admission Criteria, College Admission, College Entrance Examinations, Competitive Selection
Macpherson, Colin R.; Rowley, Glenn L. – 1986
Teacher-made mastery tests were administered to a classroom-sized sample to study their decision consistency. The decision consistency of criterion-referenced tests is usually defined in terms of the proportion of examinees who are classified in the same way after two test administrations. Single-administration estimates of decision consistency were…
Descriptors: Classroom Research, Comparative Testing, Criterion Referenced Tests, Cutting Scores
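One well-known family of single-administration estimates is Subkoviak's binomial approach; whether it is among the estimates the authors examined is not stated in the truncated abstract, so the sketch below is a generic illustration with assumed reliability, cut score, and data.

```python
import numpy as np
from scipy.stats import binom

def single_admin_consistency(scores, n_items, cut, reliability):
    """Subkoviak-style estimate: shrink each observed proportion toward
    the group mean using the reliability, then, under a binomial model,
    average each examinee's chance of identical pass/fail decisions on
    two parallel forms."""
    scores = np.asarray(scores, dtype=float)
    p_obs = scores / n_items
    p_true = reliability * p_obs + (1.0 - reliability) * p_obs.mean()
    p_pass = binom.sf(cut - 1, n_items, p_true)   # P(X >= cut) per person
    return float(np.mean(p_pass**2 + (1.0 - p_pass)**2))

# Hypothetical classroom-sized sample: 30 examinees, 25 items, cut of 18,
# assumed reliability 0.85 (none of these values come from the paper).
rng = np.random.default_rng(2)
scores = rng.binomial(25, rng.uniform(0.5, 0.95, 30))
est = single_admin_consistency(scores, 25, 18, 0.85)
print(f"estimated decision consistency: {est:.3f}")
```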