Showing 4,861 to 4,875 of 9,547 results
Peer reviewed
Braun, Henry I. – Journal of Educational Statistics, 1988
A statistical experiment was conducted in an operational setting to determine the contributions of different sources of variability to the unreliability of scoring essays and other free-response questions. Partially balanced incomplete block designs facilitated the unbiased estimation of certain main effects without requiring readers to assess the…
Descriptors: Essay Tests, Grading, Reliability, Scoring
Cantor, Jeffrey A. – Training and Development Journal, 1987
The author discusses writing items for a multiple-choice test. Topics include (1) formatting, (2) central theme development, (3) stem revision, (4) distractors, and (5) test validity and reliability. (CH)
Descriptors: Adult Education, Multiple Choice Tests, Test Construction, Test Items
Peer reviewed
Cziko, Gary A. – Educational and Psychological Measurement, 1984
Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)
Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items
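The Cziko entry above concerns the reproducibility criterion used in Guttman scalogram analysis. As a purely illustrative sketch (not the program presented in the article), the following Python function computes a coefficient of reproducibility for dichotomous items intended to form a cumulative scale, using the common Goodenough-Edwards error count; the function and variable names are assumptions for this example.

```python
# Illustrative sketch: coefficient of reproducibility for dichotomous items
# intended to form a cumulative (Guttman) scale. For a respondent with total
# score s, the ideal pattern endorses the s most popular items; deviations
# from that ideal pattern are counted as errors (Goodenough-Edwards count).
import numpy as np

def reproducibility(responses: np.ndarray) -> float:
    """responses: persons x items matrix of 0/1 scores."""
    n_persons, n_items = responses.shape
    # Order items from most to least frequently endorsed (easiest to hardest).
    order = np.argsort(-responses.mean(axis=0))
    ordered = responses[:, order]
    errors = 0
    for row in ordered:
        s = int(row.sum())
        ideal = np.array([1] * s + [0] * (n_items - s))
        errors += int((row != ideal).sum())
    return 1.0 - errors / (n_persons * n_items)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data: six items with decreasing endorsement rates.
    data = (rng.random((200, 6)) < np.linspace(0.9, 0.3, 6)).astype(int)
    print(f"Coefficient of reproducibility: {reproducibility(data):.3f}")
```

By convention, a coefficient of reproducibility of about 0.90 or higher is usually taken to support a cumulative interpretation of the scale.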
Rizavi, Saba; Way, Walter D.; Davey, Tim; Herbert, Erin – 2002
The purpose of this study was to investigate and quantify the tolerable error in item parameter estimates for different sets of items used in computer-based testing. The study examined items that were administered repeatedly to different examinee samples over time, including items that were administered linearly in a fixed order each time they…
Descriptors: Adaptive Testing, Estimation (Mathematics), High Stakes Tests, Test Items
Ferdous, Abdullah; Plake, Barbara – 2003
In the Angoff standard setting procedure, subject matter experts (SMEs) estimate the probability that a hypothetical, randomly selected, minimally competent candidate will correctly answer each item on the test. In many cases, these item performance estimates are made twice, with information shared with SMEs between estimates. This…
Descriptors: Cost Effectiveness, Estimation (Mathematics), Standard Setting (Scoring), Test Items
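The Ferdous and Plake entry above describes how Angoff ratings are collected. As a minimal sketch of how such ratings are commonly turned into a cut score (not the authors' own analysis), the ratings can be averaged over SMEs for each item and summed over items; the data and names below are hypothetical.

```python
# Illustrative sketch: converting Angoff ratings to a recommended cut score.
# Each SME rates, for every item, the probability that a minimally competent
# candidate answers it correctly; item ratings are averaged across SMEs and
# then summed, giving the expected raw score of the borderline candidate.
import numpy as np

def angoff_cut_score(ratings: np.ndarray) -> float:
    """ratings: SMEs x items matrix of probability estimates in [0, 1]."""
    item_means = ratings.mean(axis=0)   # consensus estimate per item
    return float(item_means.sum())      # expected score of the borderline candidate

# Hypothetical panel of five SMEs rating a four-item test.
ratings = np.array([
    [0.70, 0.55, 0.80, 0.40],
    [0.65, 0.60, 0.75, 0.45],
    [0.75, 0.50, 0.85, 0.35],
    [0.70, 0.65, 0.80, 0.50],
    [0.60, 0.55, 0.70, 0.40],
])
print(f"Recommended cut score: {angoff_cut_score(ratings):.2f} of {ratings.shape[1]} points")
```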
Rudner, Lawrence M. – 2001
This paper describes and evaluates the use of decision theory as a tool for classifying examinees based on their item response patterns. Decision theory, developed by A. Wald (1947) and now widely used in engineering, agriculture, and computing, provides a simple model for the analysis of categorical data. Measurement decision theory requires only…
Descriptors: Classification, Mathematical Models, Measurement Techniques, Responses
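The Rudner entry above describes classifying examinees from their item response patterns with decision theory. The sketch below illustrates the general idea under a simple Bayesian formulation with conditionally independent items; it is not the paper's exact notation, and the state labels, parameter values, and function names are assumptions.

```python
# Minimal sketch: assign an examinee's 0/1 response pattern to the mastery
# state with the highest posterior probability, assuming items are
# conditionally independent given the state.
import numpy as np

def classify(pattern, p_correct, priors):
    """
    pattern:   length-I vector of 0/1 item responses
    p_correct: K x I matrix, probability of a correct response to item i
               for an examinee in state k (estimated from calibration data)
    priors:    length-K vector of prior state probabilities
    """
    pattern = np.asarray(pattern)
    # Likelihood of the observed pattern under each state.
    likelihoods = np.prod(
        p_correct ** pattern * (1 - p_correct) ** (1 - pattern), axis=1
    )
    posterior = likelihoods * priors
    posterior /= posterior.sum()
    return int(np.argmax(posterior)), posterior

# Hypothetical two-state (non-master / master) example with four items.
p_correct = np.array([[0.30, 0.40, 0.20, 0.50],    # non-masters
                      [0.80, 0.90, 0.70, 0.85]])   # masters
state, post = classify([1, 1, 0, 1], p_correct, priors=np.array([0.5, 0.5]))
print(f"Assigned state {state}, posterior {np.round(post, 3)}")
```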
Plumer, Gilbert E. – 1999
The nontechnical ability to identify or match argumentative structure is considered by many to be an important reasoning skill. Instruments that have questions designed to measure this skill include major standardized tests for graduate school admission, for example, the Law School Admission Test (LSAT), the Graduate Record Examination (GRE), and…
Descriptors: College Entrance Examinations, Persuasive Discourse, Test Construction, Test Items
Nevada State Dept. of Education, Carson City. – 1999
This document presents a released mathematics test, Form E, from the Nevada High School Proficiency Tests. The first section contains 31 word problems for which students must select the correct answer. The second part of the test contains 29 more test items. Directions, a sheet of formulas to use in calculating answers, and an answer key are…
Descriptors: High School Students, High Schools, Mathematics Tests, Test Items
Shen, Linjun – 2001
Two standard setting methods, the Angoff method and the Rasch model-based Item Map approach, were compared in setting a standard for a high-stakes medical licensure examination, the final test in a three-examination national licensing series. The standard setting committee consisted of 23 physicians who were…
Descriptors: Comparative Analysis, Licensing Examinations (Professions), Physicians, Standards
Kim, Seonghoon; Lee, Won-Chan – ACT Inc, 2004
Under item response theory (IRT), obtaining a common proficiency scale is required in many applications. Four IRT linking methods, including the mean/mean, mean/sigma, Haebara, and Stocking-Lord methods, have been developed and widely used to estimate linking coefficients (slope and intercept) for a linear transformation from one scale to…
Descriptors: Measures (Individuals), Simulation, Item Response Theory, Test Items
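The Kim and Lee entry above concerns estimating the slope and intercept of the linear transformation that places one IRT scale on another. As an illustrative sketch (one possible implementation, not the study's code), the two moment-based methods can be written in a few lines; the array names and example values below are hypothetical.

```python
# Illustrative sketch of the mean/sigma and mean/mean linking methods: given
# common-item parameter estimates on a source ("old") and a target ("new")
# scale, estimate the slope A and intercept B of theta_new = A * theta_old + B.
import numpy as np

def mean_sigma(b_old, b_new):
    # Match the mean and standard deviation of the difficulty estimates.
    A = np.std(b_new, ddof=1) / np.std(b_old, ddof=1)
    B = np.mean(b_new) - A * np.mean(b_old)
    return A, B

def mean_mean(a_old, a_new, b_old, b_new):
    # Match the mean discriminations and mean difficulties.
    A = np.mean(a_old) / np.mean(a_new)
    B = np.mean(b_new) - A * np.mean(b_old)
    return A, B

def transform(a_old, b_old, A, B):
    """Place source-scale item parameters on the target scale."""
    return a_old / A, A * b_old + B

# Hypothetical common-item estimates from two separate calibrations.
a_old = np.array([1.1, 0.8, 1.4, 0.9]); b_old = np.array([-0.5, 0.2, 1.0, -1.2])
a_new = np.array([1.0, 0.7, 1.3, 0.8]); b_new = np.array([-0.3, 0.5, 1.4, -1.0])
A, B = mean_sigma(b_old, b_new)
print("mean/sigma: A = %.3f, B = %.3f" % (A, B))
print("transformed parameters:", transform(a_old, b_old, A, B))
```

The Haebara and Stocking-Lord methods mentioned in the abstract instead choose A and B by minimizing differences between item or test characteristic curves, which requires numerical optimization and is omitted here.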
Peer reviewed
Rozeboom, William W. – Psychometrika, 1982
Bounds are developed for the multiple correlation of common factors with the items that comprise those factors. It is then shown, under broad but not completely general conditions, when an infinite item domain does or does not perfectly determine selected subsets of its common factors. (Author/JKS)
Descriptors: Factor Analysis, Item Analysis, Multiple Regression Analysis, Test Items
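For orientation on the quantity the Rozeboom entry above bounds, the standard expression for the multiple correlation (determinacy) of a single common factor with the items is shown below in generic factor-analytic notation, which is not necessarily the article's notation or its specific bounds.

```latex
% Multiple correlation of a unit-variance common factor f with the item
% vector x: with loading vector \lambda and item covariance matrix \Sigma,
\[
  \rho_{f \cdot x} \;=\; \sqrt{\boldsymbol{\lambda}^{\top} \boldsymbol{\Sigma}^{-1} \boldsymbol{\lambda}},
  \qquad 0 \le \rho_{f \cdot x} \le 1 .
\]
% Perfect determination of the factor by an infinite item domain corresponds
% to \rho_{f \cdot x} \to 1 as the number of items grows.
```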
Peer reviewed
Lucas, Peter A.; McConkie, George W. – American Educational Research Journal, 1980
An approach is described for characterizing test questions in terms of the passage information relevant to answering them and how that information relates to the questions. The approach offers several advantages over previous algorithms for producing test items. (Author/GDC)
Descriptors: Content Analysis, Cues, Test Construction, Test Format
Peer reviewed
Plake, Barbara S. – Journal of Experimental Education, 1980
Three item orderings and two levels of knowledge of the ordering were used to study differences in test results, students' perceptions of the test's fairness and difficulty, and students' estimates of their test performance. No significant order effect was found. (Author/GK)
Descriptors: Difficulty Level, Higher Education, Scores, Test Format
Peer reviewed
Lumsden, James – Applied Psychological Measurement, 1980
A test theory model based on the Thurstone judgmental model is described. By restricting various parameters of the model, three Rasch models, two pseudo-Rasch models, three two-parameter models, and a Weber's Law model are derived. (Author/CTM)
Descriptors: Latent Trait Theory, Mathematical Models, Scaling, Test Items
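To illustrate the kind of restriction the Lumsden entry above describes, a generic Thurstonian item response function and the constraint that yields a Rasch-type model are sketched below; this is a textbook-style illustration, not necessarily Lumsden's exact parameterization.

```latex
% Generic Thurstonian (normal-ogive) item response function: the probability
% of a correct response comes from comparing person value \theta with item
% value b_i, with item-specific dispersion \sigma_i,
\[
  P(X_i = 1 \mid \theta) \;=\; \Phi\!\left(\frac{\theta - b_i}{\sigma_i}\right).
\]
% Restricting \sigma_i = \sigma for all items (and replacing the normal ogive
% with a logistic curve) yields a Rasch-type model:
\[
  P(X_i = 1 \mid \theta) \;=\; \frac{\exp(\theta - b_i)}{1 + \exp(\theta - b_i)} .
\]
```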
Peer reviewed
Chen, Wen-Hung; Thissen, David – Journal of Educational and Behavioral Statistics, 1997
Four statistics are proposed for the detection of local dependence (LD) among items analyzed using item response theory. Simulation results show that, under the locally dependent condition, the X-squared and G-squared indexes appear to be sensitive in detecting LD or multidimensionality among items. (SLD)
Descriptors: Identification, Item Response Theory, Simulation, Test Construction
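The Chen and Thissen entry above concerns pairwise local-dependence indexes. The sketch below shows the general idea in simplified form (not the authors' exact statistics): the observed 2x2 table for an item pair is compared with the table expected under a fitted 2PL model, integrating over a standard normal latent trait; the item parameters and observed counts are hypothetical.

```python
# Simplified sketch of pairwise local-dependence diagnostics in the spirit of
# the X^2 and G^2 indexes: compare observed and model-expected 2x2 tables for
# an item pair under a fitted 2PL model.
import numpy as np

def p2pl(theta, a, b):
    # 2PL item response function.
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def expected_table(a1, b1, a2, b2, n, n_quad=61):
    # Discrete approximation of a standard normal latent-trait distribution.
    theta = np.linspace(-4, 4, n_quad)
    w = np.exp(-0.5 * theta**2); w /= w.sum()
    p1, p2 = p2pl(theta, a1, b1), p2pl(theta, a2, b2)
    # Joint cell probabilities (0/0, 0/1, 1/0, 1/1), averaged over theta.
    probs = np.array([[(1 - p1) * (1 - p2), (1 - p1) * p2],
                      [p1 * (1 - p2),       p1 * p2]])
    return n * (probs * w).sum(axis=2)

def ld_statistics(observed, expected):
    x2 = np.sum((observed - expected) ** 2 / expected)          # Pearson X^2
    g2 = 2.0 * np.sum(observed * np.log(observed / expected))   # likelihood-ratio G^2
    return x2, g2

# Hypothetical observed table for 1,000 examinees and fitted item parameters.
obs = np.array([[230.0, 170.0], [150.0, 450.0]])
exp = expected_table(a1=1.2, b1=0.0, a2=1.0, b2=-0.5, n=1000.0)
print("X2 = %.2f, G2 = %.2f" % ld_statistics(obs, exp))
```

Unusually large values of either index for an item pair, relative to their reference distribution, flag possible local dependence or multidimensionality.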