Showing all 15 results
Peer reviewed
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
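As a minimal illustration of the response-time analysis described in the abstract above (not taken from the article itself), the sketch below flags item responses faster than a fixed time threshold as probable rapid guesses and summarizes them as a response-time effort proportion. The 3-second threshold, function names, and example response times are illustrative assumptions.

```python
# Minimal sketch, not from the article: flag item responses faster than a fixed
# threshold as probable rapid guesses, then summarize response-time effort.
# The 3-second threshold and the example response times are illustrative assumptions.

def flag_rapid_guesses(response_times, threshold_seconds=3.0):
    """Return one boolean per item: True if the response looks like a rapid guess."""
    return [t < threshold_seconds for t in response_times]

def response_time_effort(response_times, threshold_seconds=3.0):
    """Proportion of responses not flagged as rapid guesses (higher = more effort)."""
    flags = flag_rapid_guesses(response_times, threshold_seconds)
    return 1.0 - sum(flags) / len(flags)

# One examinee's per-item response times, in seconds.
times = [12.4, 1.1, 45.0, 2.3, 30.7, 18.2]
print(flag_rapid_guesses(times))              # [False, True, False, True, False, False]
print(round(response_time_effort(times), 2))  # 0.67
```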
Peer reviewed
Pommerich, Mary – Educational Measurement: Issues and Practice, 2012
Neil Dorans has made a career of advocating for the examinee. He continues to do so in his NCME career award address, providing a thought-provoking commentary on some current trends in educational measurement that could potentially affect the integrity of test scores. Concerns expressed in the address call attention to a conundrum that faces…
Descriptors: Testing, Scores, Measurement, Test Construction
Peer reviewed
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
Peer reviewed
Kane, Michael; Crooks, Terence; Cohen, Allan – Educational Measurement: Issues and Practice, 1999
Analyzes the three major inferences involved in interpretation of performance assessments: (1) scoring of the observed performances; (2) generalization to a domain of assessment performances like those included in the assessment; and (3) extrapolation to the large performance domain of interest. Suggests ways to improve the validity of performance…
Descriptors: Performance Based Assessment, Performance Factors, Scoring, Test Interpretation
Peer reviewed
Hills, John R. – Educational Measurement: Issues and Practice, 1984
Normal Curve Equivalents (NCEs), a new score system for standardized tests, are used by school districts in reporting results to federal funding agencies. The author uses a quiz format to answer questions on the use of NCE scores. (EGS)
Descriptors: Scores, Scoring, Standardized Tests, Test Interpretation
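As a worked illustration of the score system discussed above (not part of the quiz-format article), Normal Curve Equivalents are normalized standard scores with mean 50 and standard deviation 21.06, so NCE = 50 + 21.06z, where z is the normal deviate for a given national percentile rank. A minimal sketch, assuming SciPy is available:

```python
# Minimal sketch, not from the article: convert a national percentile rank to a
# Normal Curve Equivalent. NCEs are normalized scores with mean 50 and standard
# deviation 21.06, i.e. NCE = 50 + 21.06 * z, where z is the normal deviate
# corresponding to the percentile rank.
from scipy.stats import norm

def percentile_to_nce(percentile_rank):
    z = norm.ppf(percentile_rank / 100.0)
    return 50.0 + 21.06 * z

for pr in (1, 25, 50, 75, 99):
    print(pr, round(percentile_to_nce(pr), 1))
# 1 -> 1.0, 25 -> 35.8, 50 -> 50.0, 75 -> 64.2, 99 -> 99.0
```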
Peer reviewed
Kolen, Michael J. – Educational Measurement: Issues and Practice, 1988
An instructional module is presented to promote a conceptual understanding of test form equating using traditional methods. Equating is distinguished from scaling. The equating methods described are: (1) mean; (2) linear; and (3) equipercentile. The module includes a self-test. (SLD)
Descriptors: College Entrance Examinations, College Students, Equated Scores, Higher Education
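To make the distinction among the module's three traditional methods concrete, here is a minimal sketch (not taken from the module itself) of mean and linear equating from summary statistics; equipercentile equating, which matches percentile ranks across forms, is omitted for brevity. The function names and the example means and standard deviations are illustrative assumptions.

```python
# Minimal sketch, not from the module: mean and linear equating of a form-X score
# onto the form-Y scale from summary statistics. Equipercentile equating (matching
# percentile ranks) is omitted. The example means and SDs are illustrative.

def mean_equate(x, mean_x, mean_y):
    """Mean equating: shift form-X scores by the difference between form means."""
    return x + (mean_y - mean_x)

def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
    """Linear equating: match mean and SD, y = mean_y + (sd_y / sd_x) * (x - mean_x)."""
    return mean_y + (sd_y / sd_x) * (x - mean_x)

# Form X: mean 72, SD 10.  Form Y: mean 75, SD 12.
print(mean_equate(80, 72, 75))            # 83
print(linear_equate(80, 72, 10, 75, 12))  # 84.6
```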
Peer reviewed
Linn, Robert L.; Burton, Elizabeth – Educational Measurement: Issues and Practice, 1994
Generalizability of performance-based assessment scores across raters and tasks is examined, focusing on implications of generalizability analyses for specific uses and interpretations of assessment results. Although it seems probable that assessment conditions, task characteristics, and interactions with instructional experiences affect the…
Descriptors: Educational Assessment, Educational Experience, Generalizability Theory, Interaction
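As a hedged illustration of the generalizability analyses mentioned above (not reproduced from the article), the sketch below computes the generalizability coefficient for relative decisions in a fully crossed persons x tasks x raters design from estimated variance components. All component values are invented for the example.

```python
# Minimal sketch, not from the article: generalizability coefficient for relative
# decisions in a fully crossed persons x tasks x raters design, computed from
# estimated variance components. All component values below are invented.

def g_coefficient(var_p, var_pt, var_pr, var_ptr_e, n_tasks, n_raters):
    """E(rho^2) = var_p / (var_p + var_pt/n_t + var_pr/n_r + var_ptr_e/(n_t * n_r))."""
    rel_error = var_pt / n_tasks + var_pr / n_raters + var_ptr_e / (n_tasks * n_raters)
    return var_p / (var_p + rel_error)

# When task-related variance dominates the error, adding tasks raises the
# coefficient far more than adding raters.
print(round(g_coefficient(0.30, 0.25, 0.02, 0.10, n_tasks=2, n_raters=1), 2))  # 0.61
print(round(g_coefficient(0.30, 0.25, 0.02, 0.10, n_tasks=8, n_raters=1), 2))  # 0.82
```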
Peer reviewed
Geisinger, Kurt F. – Educational Measurement: Issues and Practice, 1991
Ways to use standard-setting data to adjust cutoff scores on examinations are reviewed. Ten sources of information to be used in determining standards are listed. The decision to modify passing scores should be based on these types of information and consideration of adverse impact or rating process irregularities. (SLD)
Descriptors: Cutting Scores, Evaluation Utilization, Evaluators, Interrater Reliability
Peer reviewed
Fisher, Thomas M.; Smith, Julia – Educational Measurement: Issues and Practice, 1991
Incidents affecting the implementation of large-scale testing programs are described to illustrate associated problems. Issues addressed include creation of test materials, preparation of answer documents, transportation of test materials, scoring and analysis of tests, and dissemination and utilization of test results. (TJH)
Descriptors: Answer Keys, Computer Assisted Testing, Information Dissemination, Program Implementation
Peer reviewed
Plake, Barbara S.; And Others – Educational Measurement: Issues and Practice, 1991
Possible sources of intrajudge inconsistency in standard setting are reviewed, and approaches are presented to improve the accuracy of rating. Procedures for providing judges with feedback through discussion or computerized communication are discussed. Monitoring and maintaining judges' consistency throughout the rating process are essential. (SLD)
Descriptors: Computer Assisted Instruction, Evaluators, Examiners, Feedback
Peer reviewed
Mills, Craig N.; And Others – Educational Measurement: Issues and Practice, 1991
An approach is presented to the definition of minimal competence for judges to use in standard setting. Panelists in standard setting must receive training to ensure that differences in rating result from differences in perceptions of item difficulty, not in differences of opinion about the definition of minimal competence. (SLD)
Descriptors: Cutting Scores, Decision Making, Definitions, Difficulty Level
Peer reviewed
Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991
Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)
Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners
Peer reviewed
Reid, Jerry B. – Educational Measurement: Issues and Practice, 1991
Training judges to generate item ratings in standard setting once the reference group has been defined is discussed. It is proposed that sensitivity to the factors that determine difficulty can be improved through training. Three criteria for determining when training is sufficient are offered. (SLD)
Descriptors: Computer Assisted Instruction, Difficulty Level, Evaluators, Interrater Reliability
Peer reviewed
Bond, Lloyd – Educational Measurement: Issues and Practice, 1995
The extent to which performance assessments can have unintended and undesirable consequences is not yet clear, but preliminary evidence suggests several issues of bias and fairness. Among these is the sheer difficulty of scoring complex performance assessments, as Vermont's experiences with portfolio assessment illustrate. (SLD)
Descriptors: Black Students, Culture Fair Tests, Educational Assessment, Performance Based Assessment
Peer reviewed
Plake, Barbara S.; And Others – Educational Measurement: Issues and Practice, 1993
Approximately 900 teachers in Virginia were surveyed to assess teachers' competencies in the 7 basic assessment areas identified in the "Standards for Teacher Competence in Educational Assessment of Students." Results will be used in designing training material in assessment for inservice teacher education. (SLD)
Descriptors: Academic Standards, Educational Assessment, Elementary School Teachers, Elementary Secondary Education