Jiang, Hai; Tang, K. Linda – 1998
This discussion of new methods for calibrating item response theory (IRT) models examines new optimization procedures, such as the genetic algorithm (GA), as improvements on the Newton-Raphson procedure. The advantage of using a global optimization procedure like GA is that this kind of procedure is not easily affected by local optima and…
Descriptors: Algorithms, Item Response Theory, Mathematical Models, Simulation
van der Linden, Wim J. – 1998
Six methods for assembling tests from a pool with an item-set structure are presented. All methods are computational and based on the technique of mixed integer programming. The methods are evaluated using such criteria as the feasibility of their linear programming problems and their expected solution times. The methods are illustrated for two…
Descriptors: Higher Education, Item Banks, Selection, Test Construction
Estimating Multidimensional Item Response Models with Mixed Structure. Research Report. ETS RR-05-04
Zhang, Jinming – ETS Research Report Series, 2005
This study derived an expectation-maximization (EM) algorithm for estimating the parameters of multidimensional item response models. A genetic algorithm (GA) was developed to be used in the maximization step in each EM cycle. The focus of the EM-GA algorithm developed in this paper was on multidimensional items with "mixed structure."…
Descriptors: Item Response Theory, Computation, Mathematics, Simulation
Frisbie, David A. – 1980
The development of a new technique, the Relative Difficulty Ratio (RDR), is described, as well as how it can be used to determine the difficulty level of a test so that meaningful inter-test difficulty comparisons can be made. Assumptions made in computing RDR include: 1) each item must be scored dichotomously with only one answer choice keyed as…
Descriptors: Difficulty Level, Item Analysis, Measurement Techniques, Scores
Peer reviewed: Vegelius, Jan – Educational and Psychological Measurement, 1977
The G index of agreement does not permit the use of different weights for its various items. The weighted G index described here makes it possible to use unequal weights. An example of the procedure is provided. (Author/JKS)
Descriptors: Correlation, Item Analysis, Multidimensional Scaling, Test Items
Peer reviewed: Braun, Henry I. – Journal of Educational Statistics, 1988
A statistical experiment was conducted in an operational setting to determine the contributions of different sources of variability to the unreliability of scoring essays and other free-response questions. Partially balanced incomplete block designs facilitated the unbiased estimation of certain main effects without requiring readers to assess the…
Descriptors: Essay Tests, Grading, Reliability, Scoring
Cantor, Jeffrey A. – Training and Development Journal, 1987
The author discusses writing items for a multiple-choice test. Topics include (1) formatting, (2) central theme development, (3) stem revision, (4) distractors, and (5) test validity and reliability. (CH)
Descriptors: Adult Education, Multiple Choice Tests, Test Construction, Test Items
Peer reviewed: Cziko, Gary A. – Educational and Psychological Measurement, 1984
Some problems associated with the criteria of reproducibility and scalability as they are used in Guttman scalogram analysis to evaluate cumulative, nonparametric scales of dichotomous items are discussed. A computer program is presented which analyzes response patterns elicited by dichotomous scales designed to be cumulative. (Author/DWH)
Descriptors: Scaling, Statistical Analysis, Test Construction, Test Items
Rizavi, Saba; Way, Walter D.; Davey, Tim; Herbert, Erin – 2002
The purpose of this study was to investigate and to quantify the tolerable error in item parameter estimates for different sets of items used in computer-based testing. The study examined items that were administered repeatedly to different examinee samples over time, examining items that were administered linearly in a fixed order each time they…
Descriptors: Adaptive Testing, Estimation (Mathematics), High Stakes Tests, Test Items
Ferdous, Abdullah; Plake, Barbara – 2003
In the Angoff standard setting procedure, subject matter experts (SMEs) estimate the probability that a hypothetical, randomly selected, minimally competent candidate will correctly answer each item on the test. In many cases, these item performance estimates are made twice, with information shared with SMEs between estimates. This…
Descriptors: Cost Effectiveness, Estimation (Mathematics), Standard Setting (Scoring), Test Items
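As a rough illustration of the Angoff procedure summarized in the record above, the sketch below averages SME probability estimates per item and sums the item means to get a raw cut score. This is the commonly described basic form of the method, not the specific design studied by Ferdous and Plake; the function name `angoff_cut_score` and the ratings layout are assumptions made for the example.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return a raw cut score from Angoff-style ratings.

    ratings[r][i] is rater r's estimated probability that a minimally
    competent candidate answers item i correctly (hypothetical layout).
    """
    n_items = len(ratings[0])
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every rater must rate every item")
    # Average the estimates for each item across raters.
    item_means = [mean(row[i] for row in ratings) for i in range(n_items)]
    # The expected raw score of a minimally competent candidate.
    return sum(item_means)


# Two raters, two items (made-up numbers for illustration only).
example = [[0.5, 0.75], [0.5, 0.25]]
cut = angoff_cut_score(example)
```

With two rounds of estimates, the same function could be applied to each round separately to see how much the cut score moves.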
Rudner, Lawrence M. – 2001
This paper describes and evaluates the use of decision theory as a tool for classifying examinees based on their item response patterns. Decision theory, developed by A. Wald (1947) and now widely used in engineering, agriculture, and computing, provides a simple model for the analysis of categorical data. Measurement decision theory requires only…
Descriptors: Classification, Mathematical Models, Measurement Techniques, Responses
Plumer, Gilbert E. – 1999
The nontechnical ability to identify or match argumentative structure is considered by many to be an important reasoning skill. Instruments that have questions designed to measure this skill include major standardized tests for graduate school admission, for example, the Law School Admission Test (LSAT), the Graduate Record Examination (GRE), and…
Descriptors: College Entrance Examinations, Persuasive Discourse, Test Construction, Test Items
Nevada State Dept. of Education, Carson City. – 1999
This document presents a released mathematics test, Form E, from the Nevada High School Proficiency Tests. The first section contains 31 word problems for which students must select the correct answer. The second part of the test contains 29 more test items. Directions, a sheet of formulas to use in calculating answers, and an answer key are…
Descriptors: High School Students, High Schools, Mathematics Tests, Test Items
Shen, Linjun – 2001
Two standard setting methods, the Angoff method and the Rasch model based Item Map approach, were compared for setting a standard for a high-stakes medical licensure examination, the last examination of a three-examination series of a national medical licensing examination. The standard setting committee consisted of 23 physicians who were…
Descriptors: Comparative Analysis, Licensing Examinations (Professions), Physicians, Standards
Kim, Seonghoon; Lee, Won-Chan – ACT Inc, 2004
Under item response theory (IRT), obtaining a common proficiency scale is required in many applications. Four IRT linking methods, including the mean/mean, mean/sigma, Haebara, and Stocking-Lord methods, have been developed and widely used to estimate linking coefficients (slope and intercept) for a linear transformation from one scale to…
Descriptors: Measures (Individuals), Simulation, Item Response Theory, Test Items
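Of the four linking methods named in the record above, mean/sigma is the simplest to write down: the slope is the ratio of the standard deviations of the common items' difficulty estimates on the two scales, and the intercept aligns their means. The sketch below assumes difficulty (b) parameters for the same common items on both scales; the function name and the made-up values are illustrative and are not taken from Kim and Lee's report.

```python
from statistics import mean, pstdev


def mean_sigma_linking(b_from, b_to):
    """Return (slope, intercept) so that slope * b_from + intercept
    places the source-scale difficulties on the target scale."""
    if len(b_from) != len(b_to):
        raise ValueError("common items must appear on both scales")
    # Slope: ratio of the spread of difficulties on the two scales.
    slope = pstdev(b_to) / pstdev(b_from)
    # Intercept: shift that aligns the two means after rescaling.
    intercept = mean(b_to) - slope * mean(b_from)
    return slope, intercept


# Made-up common-item difficulties for illustration only.
slope, intercept = mean_sigma_linking([-1.0, 0.0, 1.0], [0.0, 2.0, 4.0])
```

The mean/mean, Haebara, and Stocking-Lord methods estimate the same two coefficients by different criteria and are not shown here.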


