Showing all 12 results
Peer reviewed
Marc Brysbaert – Cognitive Research: Principles and Implications, 2024
Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many researchers have not learned during their training. The lack of expertise can pose…
Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis
Peer reviewed
Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019
Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…
Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy
Lorié, William A. – Online Submission, 2013
A reverse engineering approach to automatic item generation (AIG) was applied to a figure-based publicly released test item from the Organisation for Economic Cooperation and Development (OECD) Programme for International Student Assessment (PISA) mathematical literacy cognitive instrument as part of a proof of concept. The author created an item…
Descriptors: Numeracy, Mathematical Concepts, Mathematical Logic, Difficulty Level
Peer reviewed
Waller, Niels G. – Applied Psychological Measurement, 2008
Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…
Descriptors: Test Items, Reliability, Scores, Psychometrics
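The internal-consistency indices this abstract names can be illustrated with a small sketch. The formula below is standard Cronbach's alpha, which reduces to KR-20 for 0/1-scored items; the response data are made up for the demonstration, not taken from the article.

```python
# Illustrative computation of an internal-consistency index named in the
# abstract above: coefficient alpha (equal to KR-20 for dichotomous items).

def coefficient_alpha(scores):
    """Cronbach's alpha for a list of per-person item-score lists."""
    k = len(scores[0])                      # number of items
    def var(xs):                            # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 responses: 5 examinees x 4 items.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(coefficient_alpha(data), 3))  # -> 0.8
```

As the abstract notes, such coefficients are lower-bound estimates under single-population sampling, so a value like 0.8 here understates rather than overstates reliability under those conditions.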
Peer reviewed
Shoemaker, David M. – Journal of Educational Measurement, 1971
Descriptors: Difficulty Level, Item Sampling, Statistical Analysis, Test Construction
Peer reviewed
van der Linden, Wim J. – Applied Psychological Measurement, 1979
The restrictions on item difficulties that must be met when binomial models are applied to domain-referenced testing are examined. Both a deterministic and a stochastic conception of item responses are discussed with respect to difficulty and Guttman-type items. (Author/BH)
Descriptors: Difficulty Level, Item Sampling, Latent Trait Theory, Mathematical Models
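The binomial model discussed in this entry treats the number-correct score on an n-item sample from a domain as Binomial(n, p), where p is the examinee's true domain proportion. A minimal sketch (the 10-item test, 8-correct cutoff, and p = 0.7 are illustrative values, not from the article):

```python
from math import comb

# Binomial test model: number-correct score X on an n-item domain sample
# follows Binomial(n, p) for an examinee with true domain proportion p.

def score_prob(x, n, p):
    """P(X = x) for number-correct score x on an n-item test."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability a borderline examinee (p = 0.7) passes a >= 8-of-10 cutoff:
p_pass = sum(score_prob(x, 10, 0.7) for x in range(8, 11))
print(round(p_pass, 3))  # -> 0.383
```

Under the deterministic (Guttman-type) conception the abstract contrasts with this stochastic one, the same examinee would pass or fail with certainty depending on which items were sampled.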
Peer reviewed
Passmore, David Lynn – Journal of Studies in Technical Careers, 1983
Vocational and technical education researchers need to be aware of the uses and limits of various statistical models. The author reviews the Rasch Model and applies it to results from a nutrition test given to student nurses. (Author)
Descriptors: Educational Research, Item Sampling, Nursing Education, Nutrition
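The Rasch model this entry reviews gives the probability of a correct response as a logistic function of person ability minus item difficulty. A minimal sketch (parameter values are illustrative, not estimates from the nutrition test):

```python
import math

# Rasch model: P(correct) depends only on the difference between person
# ability (theta) and item difficulty (b).

def rasch_p(theta, b):
    """P(correct | ability theta, difficulty b) under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(round(rasch_p(0.0, 0.0), 3))   # ability matches difficulty -> 0.5
print(round(rasch_p(1.0, -1.0), 3))  # able person, easy item -> near 1
```

Because only the difference theta - b enters the formula, persons and items are placed on a single scale, which is what makes the model attractive for the comparisons Passmore discusses.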
Berk, Ronald A. – 1978
Sixteen item statistics recommended for use in the development of criterion-referenced tests were evaluated. There were two major criteria: (1) practicability in terms of ease of computation and interpretation and (2) meaningfulness in the context of the development process. Most of the statistics were based on a comparison of performance changes…
Descriptors: Achievement Tests, Criterion Referenced Tests, Difficulty Level, Guides
Scheetz, James P.; Forsyth, Robert A. – 1977
Empirical evidence is presented related to the effects of using a stratified sampling of items in multiple matrix sampling on the accuracy of estimates of the population mean. Data were obtained from a sample of 600 high school students for a 36-item mathematics test and a 40-item vocabulary test, both subtests of the Iowa Tests of Educational…
Descriptors: Achievement Tests, Difficulty Level, Item Analysis, Item Sampling
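The matrix-sampling design studied here can be sketched in simulation: each examinee answers only a subset of items, and the full-test population mean is estimated by scaling up the observed proportion correct. This is a simplified sketch using simple random item sampling (the article examines stratified sampling), with all data simulated rather than taken from the Iowa Tests.

```python
import random

# Multiple matrix sampling: estimate the population mean total-test score
# when each examinee answers only a subset of the K items. Simulated data.

random.seed(1)
K = 36                                                  # items in full test
p_true = [random.uniform(0.3, 0.9) for _ in range(K)]   # true item p-values

def sampled_mean_estimate(n_examinees=600, items_per_form=12):
    total_correct = total_answered = 0
    for _ in range(n_examinees):
        form = random.sample(range(K), items_per_form)  # random item subset
        for j in form:
            total_correct += random.random() < p_true[j]
            total_answered += 1
    # Scale observed proportion correct up to the full K-item test.
    return K * total_correct / total_answered

est = sampled_mean_estimate()
true_mean = sum(p_true)          # expected total score on all K items
print(round(est, 2), round(true_mean, 2))
```

With 600 examinees the estimate lands close to the true mean; the question the article addresses is whether stratifying the item sampling (e.g., by difficulty) tightens that accuracy further.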
Douglass, James B. – 1979
A general process for testing the feasibility of applying alternative mathematical or statistical models to the solution of a practical problem is presented and flowcharted. The system is used to compare five models for test equating: (1) anchor test equating using classical test theory; (2) anchor test equating using the one-parameter logistic…
Descriptors: Comparative Analysis, Equated Scores, Flow Charts, Goodness of Fit
Theunissen, Phiel J. J. M. – 1983
Any systematic approach to the assessment of students' ability implies the use of a model. The more explicit the model is, the more its users know about what they are doing and what the consequences are. The Rasch model is a strong model where measurement is a bonus of the model itself. It is based on four ideas: (1) separation of observable…
Descriptors: Ability Grouping, Difficulty Level, Evaluation Criteria, Item Sampling
Lewy, Arieh; Doron, Rina – 1977
The concept of tailored testing for individuals is applied to the construction of tests for special groups and extended to apply to item content as well as item difficulty. It is suggested that evaluators may decide to construct tests on the basis of a unique combination of items drawn from an item bank to fit the need of a particular group. At…
Descriptors: Achievement Tests, Adaptive Testing, Criterion Referenced Tests, Group Norms