Publication Date
| Publication Date | Records |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Audience | Records |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Location | Records |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Rating | Records |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Peer reviewed: Kim, Seock-Ho; Cohen, Allan S. – Applied Psychological Measurement, 1998
Investigated Type I error rates of the likelihood-ratio test for the detection of differential item functioning (DIF) using Monte Carlo simulations under the graded-response model. Type I error rates were within theoretically expected values for all six combinations of sample sizes and ability-matching conditions at each of the nominal alpha…
Descriptors: Ability, Item Bias, Item Response Theory, Monte Carlo Methods
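The likelihood-ratio DIF procedure examined in this study compares two nested IRT fits: a compact model that constrains the studied item's parameters to be equal across groups and an augmented model that lets them differ. A minimal sketch of the test statistic is below; the log-likelihoods and parameter counts are placeholders, not values from the study, and a real analysis would obtain them from IRT estimation software.

```python
# Hedged sketch: likelihood-ratio DIF test from two nested IRT fits.
# The numeric values below are invented placeholders.
from scipy.stats import chi2

loglik_compact = -10234.7    # studied item's parameters constrained equal across groups
loglik_augmented = -10228.1  # studied item's parameters free to differ by group
extra_params = 5             # e.g., graded-response item: 1 slope + 4 category thresholds

g2 = -2.0 * (loglik_compact - loglik_augmented)   # likelihood-ratio statistic
p_value = chi2.sf(g2, df=extra_params)
print(f"G^2 = {g2:.2f}, df = {extra_params}, p = {p_value:.4f}")
# Flag DIF if p < alpha. Repeating this over many simulated no-DIF datasets
# yields the empirical Type I error rate that the study evaluates.
```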
Peer reviewed: O'Neill, Thomas; Lunz, Mary E.; Thiede, Keith – Journal of Applied Measurement, 2000
Studied item exposure in a computerized adaptive test when the item selection algorithm presents examinees with questions they were asked in a previous test administration. Results with 178 repeat examinees on a medical technologists' test indicate that the combined use of an adaptive algorithm to select items and latent trait theory to estimate…
Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Item Response Theory
Peer reviewed: Dassa, Clement; Lambert, Jean; Blais, Regis; Potvin, Diane; Gauthier, Natalie – Canadian Journal of Program Evaluation/La Revue canadienne d'evaluation de programme, 1997
Whether a middle alternative in the response choices to a questionnaire influences the reliability and validity of survey responses was studied with 1,390 physicians, nurses, and midwives. Including a neutral option had little effect on overall reliability and validity, but allowed better coherence when items were considered globally. (SLD)
Descriptors: Attitude Measures, Nurses, Obstetrics, Opinions
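The "reliability" question here is typically checked with an internal-consistency index such as Cronbach's alpha, computed with and without the neutral middle option. A minimal sketch, using simulated Likert-type data rather than anything from the study:

```python
# Hedged sketch: Cronbach's alpha for a Likert scale (simulated data only).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scored responses."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

rng = np.random.default_rng(0)
trait = rng.normal(size=(200, 1))                       # shared latent attitude
noise = rng.normal(scale=0.8, size=(200, 10))
scores_5pt = np.clip(np.rint(3 + trait + noise), 1, 5)  # 1-5 scale with a neutral midpoint
print(round(cronbach_alpha(scores_5pt), 3))
```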
Peer reviewed: Bennett, Randy Elliot; Morley, Mary; Quardt, Dennis – Applied Psychological Measurement, 2000
Describes three open-ended response types that could broaden the conception of mathematical problem solving used in computerized admissions tests: (1) mathematical expression (ME); (2) generating examples (GE); and (3) graphical modeling (GM). Illustrates how combining ME, GE, and GM can form extended constructed response problems. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Constructed Response, Mathematics Tests
Peer reviewed: Rushton, J. Philippe; Skuy, Mervyn – Intelligence, 2000
Administered untimed Raven's Standard Progressive Matrices (SPM) to 173 African and 136 White college students in South Africa. In comparison with the 1993 U.S. normative sample, African students scored at the 14th percentile, and White students at the 61st percentile. Differences were greater on SPM items with the highest item total correlations,…
Descriptors: Black Students, College Students, Correlation, Foreign Countries
Peer reviewed: Reise, Steven P.; Flannery, Wm. Peter – Applied Measurement in Education, 1996
Discusses statistical and theoretical issues that arise in assessing person-fit on measures of typical performance, including the frequently attenuated detection of person misfit, the need for methods that identify sources of response aberrancy, and the use of person-fit measures as moderators of trait-criterion relations. (SLD)
Descriptors: Item Response Theory, Measurement Techniques, Performance, Responses
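A common person-fit index in this literature is the standardized log-likelihood statistic l_z, which compares an examinee's observed response-pattern likelihood with its expectation under the IRT model. The sketch below assumes dichotomous items and that model-implied probabilities at the examinee's estimated trait level are already available; the example vectors are invented.

```python
# Hedged sketch: l_z person-fit statistic for dichotomous responses.
import numpy as np

def lz_person_fit(responses: np.ndarray, probs: np.ndarray) -> float:
    """responses: 0/1 vector; probs: model-implied P(correct) per item."""
    p, u = probs, responses
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))          # observed log-likelihood
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))    # its expectation
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)     # its variance
    return (l0 - expected) / np.sqrt(variance)

u = np.array([1, 1, 0, 1, 0, 0, 1, 0])
p = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
print(round(lz_person_fit(u, p), 3))  # large negative values suggest aberrant responding
```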
Peer reviewed: Kim, Mikyung – Language Testing, 2001
Investigates differential item functioning (DIF) across two different broad language groupings, Asian and European, in a speaking test in which the test takers' responses were rated polytomously. Data were collected from 1038 nonnative speakers of English from France, Hong Kong, Japan, Spain, Switzerland, and Thailand who took the SPEAK test in…
Descriptors: English (Second Language), Foreign Countries, Item Analysis, Language Tests
Peer reviewed: Flannelly, Laura T. – Journal of Nursing Education, 2001
Between administrations of a test, 36 nursing students were given a practice test and answer key that provided feedback; 30 were not. Those who performed poorly on the test were more overconfident about answers to hard questions. This judgment bias can be reduced by providing feedback about their performance and confidence. (Contains 54…
Descriptors: Bias, Feedback, Higher Education, Nursing Education
Peer reviewed: Alderson, J. Charles; Percsich, Richard; Szabo, Gabor – Language Testing, 2000
Reports on the potential problems in scoring responses to sequencing tests, the development of a computer program to overcome these difficulties, and an exploration of the value of scoring procedures. (Author/VWL)
Descriptors: Computer Software, Foreign Countries, Item Analysis, Language Tests
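The scoring difficulty with sequencing items is that a response can be "almost right" in several different ways. The sketch below contrasts two simple scoring rules; these are illustrative only and are not necessarily the procedures implemented in the authors' program.

```python
# Hedged sketch: two scoring rules for a sequencing item (e.g., ordering sentences).
def exact_position_score(response: list[str], key: list[str]) -> int:
    """Credit only for elements placed in exactly the right slot."""
    return sum(r == k for r, k in zip(response, key))

def adjacent_pair_score(response: list[str], key: list[str]) -> int:
    """Credit for each adjacent pair that also appears adjacently in the key."""
    key_pairs = set(zip(key, key[1:]))
    return sum(pair in key_pairs for pair in zip(response, response[1:]))

key = ["A", "B", "C", "D", "E"]
resp = ["B", "C", "D", "E", "A"]          # everything shifted by one position
print(exact_position_score(resp, key))    # 0 -- harsh on a near-correct ordering
print(adjacent_pair_score(resp, key))     # 3 -- rewards preserved local order
```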
Peer reviewed: Parke, Carol S. – Educational Assessment, 2001
Discusses an approach to analyzing performance assessments that identifies potential reasons for misfitting items and uses this information to improve on items and rubrics for these assessments. Illustrates the approach through a 53-item mathematics performance assessment completed by approximately 500 middle school students. (SLD)
Descriptors: Goodness of Fit, Mathematics Tests, Middle School Students, Middle Schools
Peer reviewed: Subkoviak, Michael J.; Kane, Michael T.; Duncan, Patrick H. – Mid-Western Educational Researcher, 2002
Compares Angoff and Nedelsky methods for setting passing scores on tests. Using one of the methods, 84 college students were taught to estimate their probable scores on a vocabulary test. Estimates were compared to their later actual scores. The Nedelsky method was considerably less accurate under certain conditions, and both methods…
Descriptors: Cutting Scores, Difficulty Level, Evaluation Research, Test Construction
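For readers unfamiliar with the two standard-setting methods compared here, the sketch below shows how Angoff and Nedelsky judgments are converted into a passing score. All ratings are invented for illustration and do not come from the study.

```python
# Hedged sketch: turning Angoff and Nedelsky judgments into cut scores.

# Angoff: per item, estimate the probability that a minimally competent
# examinee answers correctly; the cut score is the sum of those probabilities.
angoff_probs = [0.6, 0.8, 0.5, 0.9, 0.7]
angoff_cut = sum(angoff_probs)

# Nedelsky: per multiple-choice item, eliminate the options a minimally
# competent examinee would recognize as wrong; the item's value is
# 1 / (number of options remaining), and the cut score is again the sum.
options_remaining = [2, 4, 2, 1, 3]
nedelsky_cut = sum(1.0 / k for k in options_remaining)

print(f"Angoff cut: {angoff_cut:.2f} out of {len(angoff_probs)} items")
print(f"Nedelsky cut: {nedelsky_cut:.2f} out of {len(options_remaining)} items")
```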
Peer reviewed: van der Linden, Wim J.; Scrams, David J.; Schnipke, Deborah L. – Applied Psychological Measurement, 1999
Proposes an item-selection algorithm for neutralizing the differential effects of time limits on computerized adaptive test scores. Uses a statistical model for distributions of examinees' response times on items in a bank that is updated each time an item is administered. Demonstrates the method using an item bank from the Armed Services…
Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Item Banks
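The general idea of time-aware item selection is to keep picking informative items while ensuring that predicted response times fit the remaining time budget. The sketch below uses a 2PL information function and a lognormal-style response-time prediction of the kind common in this literature; the selection rule, parameter names, and numbers are all illustrative assumptions, not the authors' algorithm.

```python
# Hedged sketch: time-aware item selection in a CAT (illustrative only).
import math

def info_2pl(a: float, b: float, theta: float) -> float:
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def predicted_time(beta: float, tau: float, sigma: float = 0.4) -> float:
    """Expected seconds under a lognormal response-time model: exp(beta - tau + sigma^2/2)."""
    return math.exp(beta - tau + sigma * sigma / 2.0)

def select_item(bank, theta, tau, time_left, items_left):
    budget_per_item = time_left / items_left
    feasible = [it for it in bank if predicted_time(it["beta"], tau) <= budget_per_item]
    pool = feasible or bank   # fall back to the full bank if nothing fits the budget
    return max(pool, key=lambda it: info_2pl(it["a"], it["b"], theta))

bank = [
    {"a": 1.2, "b": 0.0, "beta": 4.2},   # beta: item time intensity (log-seconds)
    {"a": 0.8, "b": 0.5, "beta": 3.4},
    {"a": 1.5, "b": -0.3, "beta": 4.8},
]
print(select_item(bank, theta=0.2, tau=0.1, time_left=300, items_left=5))
```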
Peer reviewed: Stocking, Martha L. – Applied Psychological Measurement, 1997
Investigated three models that permit restricted examinee control over revising previous answers in the context of adaptive testing, using simulation. Two models permitting item revisions worked well in preserving test fairness and accuracy, and one model may preserve some cognitive processing styles developed by examinees for a linear testing…
Descriptors: Adaptive Testing, Cognitive Processes, Comparative Analysis, Computer Assisted Testing
Peer reviewed: Ryan, Katherine E.; Chiu, Shuwan – Applied Measurement in Education, 2001
Examined whether patterns of gender differential item functioning (DIF) in parcels of items are influenced by changes in item position. Findings for more than 2,000 college freshmen taking a test of mathematics suggest that the amounts of gender DIF and DIF present in item parcels tend not to be influenced by changes in item position. (SLD)
Descriptors: College Freshmen, Context Effect, Higher Education, Item Bias
Hwang, Gwo-Jen; Lin, Bertrand M. T.; Lin, Tsung-Liang – Computers and Education, 2006
A well-constructed test sheet not only helps the instructor evaluate the learning status of the students, but also facilitates the diagnosis of the problems embedded in the students' learning process. This paper addresses the problem of selecting proper test items to compose a test sheet that conforms to such assessment requirements as average…
Descriptors: Test Items, Item Banks, Student Evaluation, Difficulty Level
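The test-sheet composition problem described here is an item-selection optimization over a bank. The paper treats it formally; the sketch below is only a simple greedy heuristic for one constraint (a target average difficulty), with an invented bank, to make the flavor of the problem concrete.

```python
# Hedged sketch: compose a fixed-length test sheet whose average difficulty
# approximates a target, by greedy selection from an item bank (toy heuristic).
def compose_sheet(bank: dict[str, float], length: int, target_avg: float) -> list[str]:
    chosen, total = [], 0.0
    remaining = dict(bank)
    for k in range(1, length + 1):
        # Pick the item that keeps the running average closest to the target.
        best = min(remaining,
                   key=lambda i: abs((total + remaining[i]) / k - target_avg))
        chosen.append(best)
        total += remaining.pop(best)
    return chosen

bank = {"Q1": 0.30, "Q2": 0.55, "Q3": 0.70, "Q4": 0.45, "Q5": 0.90, "Q6": 0.60}
sheet = compose_sheet(bank, length=3, target_avg=0.55)
print(sheet, round(sum(bank[q] for q in sheet) / 3, 3))
```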

