Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 6 |
Descriptor
Computer Software | 12 |
Test Items | 12 |
Test Reliability | 12 |
Difficulty Level | 6 |
Item Analysis | 5 |
Test Validity | 4 |
Estimation (Mathematics) | 3 |
Goodness of Fit | 3 |
Models | 3 |
Testing | 3 |
Computer Assisted Testing | 2 |
More ▼ |
Source
Journal of Educational… | 2 |
Collegiate Microcomputer | 1 |
Educational and Psychological… | 1 |
IEEE Transactions on Learning… | 1 |
International Journal of… | 1 |
Language Assessment Quarterly | 1 |
Language Testing | 1 |
System | 1 |
Author
Aiken, Lewis R. | 1 |
Aryadoust, Vahid | 1 |
Boekkooi-Timminga, Ellen | 1 |
Cobern, William W. | 1 |
Davidson, Fred | 1 |
Greenberg, Daphne | 1 |
Haladyna, Thomas M. | 1 |
Jin, Kuan-Yu | 1 |
Kuan-Yu Jin | 1 |
Kuswanto, Heru | 1 |
Levitov, Justin E. | 1 |
More ▼ |
Publication Type
Journal Articles | 9 |
Reports - Research | 8 |
Reports - Descriptive | 2 |
Reports - Evaluative | 2 |
Computer Programs | 1 |
Speeches/Meeting Papers | 1 |
Education Level
High Schools | 1 |
Higher Education | 1 |
Secondary Education | 1 |
Audience
Practitioners | 1 |
Location
Indonesia | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Peabody Picture Vocabulary… | 1 |
What Works Clearinghouse Rating
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020
Automatic multiple choice question (MCQ) generation from a text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been attracted toward automatic MCQ generation since the late 90's.…
Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software
Aryadoust, Vahid; Ng, Li Ying; Sayama, Hiroki – Language Testing, 2021
Over the past decades, the application of Rasch measurement in language assessment has gradually increased. In the present study, we coded 215 papers using Rasch measurement published in 21 applied linguistics journals for multiple features. We found that seven Rasch models and 23 software packages were adopted in these papers, with many-facet…
Descriptors: Language Tests, Testing, Test Items, Network Analysis
Maghfiroh, Anissa; Kuswanto, Heru – International Journal of Instruction, 2022
This research aims to reveal the effectiveness of the use of Kofie GeBoL media in improving (1) vector representation ability and (2) critical thinking ability in physics instruction. It is a descriptive quantitative study with the quasi-experiment design. It was conducted in two stages: empirical try out and implementation of Kofie GeboL to see…
Descriptors: Physics, Instructional Effectiveness, Critical Thinking, Thinking Skills
Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014
Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…
Descriptors: Student Evaluation, Item Response Theory, Models, Simulation
Pae, Hye K.; Greenberg, Daphne; Morris, Robin D. – Language Assessment Quarterly, 2012
The aim of this study was to apply the Rasch model to an analysis of the psychometric properties of the Peabody Picture Vocabulary Test--III Form A (PPVT--IIIA) items with struggling adult readers. The PPVT--IIIA was administered to 229 African American adults whose isolated word reading skills were between third and fifth grades. Conformity of…
Descriptors: African Americans, Test Items, Construct Validity, Test Validity
A Zero-One Programming Approach to Gulliksen's Matched Random Subtests Method. Research Report 86-4.
van der Linden, Wim J.; Boekkooi-Timminga, Ellen – 1986
In order to estimate the classical coefficient of test reliability, parallel measurements are needed. H. Gulliksen's matched random subtests method, which is a graphical method for splitting a test into parallel test halves, has practical relevance because it maximizes the alpha coefficient as a lower bound of the classical test reliability…
Descriptors: Algorithms, Computer Assisted Testing, Computer Software, Difficulty Level

Davidson, Fred – System, 2000
Statistical analysis tools in language testing are described, chiefly classical test theory and item response theory. Computer software for statistical analysis is briefly reviewed and divided into three tiers: commonly available; statistical packages; and specialty software. (Author/VWL)
Descriptors: Computer Software, Language Tests, Second Language Learning, Statistical Analysis
Sympson, J. Bradford; Haladyna, Thomas M. – 1988
A new approach to polychotomous scoring of test items, similar to "max-alpha" scaling (MAS) and known as polyweighting, has been developed. Unlike MAS, this new method of polychotomous scoring provides scoring weights for a given item that are independent of the difficulty of other items in the analysis. Moreover, the scoring weights are…
Descriptors: Computer Software, Difficulty Level, Item Analysis, Latent Trait Theory

Cobern, William W. – 1986
This computer program, written in BASIC, performs three different calculations of test reliability: (1) the Kuder-Richardson method; (2); the "common split-half" method; and (3) the Rulon-Guttman split-half method. The program reads sequential access data files for microcomputers that have been set up by statistical packages such as…
Descriptors: Computer Software, Difficulty Level, Educational Research, Equations (Mathematics)
Thompson, Bruce; Levitov, Justin E. – Collegiate Microcomputer, 1985
Discusses features of a microcomputer program, SCOREIT, used at New Orleans' Loyola University and several high schools to score and analyze test results. Benefits and dimensions of the program's automated test and item analysis are outlined, and several examples illustrating test and item analyses by SCOREIT are presented. (MBR)
Descriptors: Computer Assisted Testing, Computer Software, Difficulty Level, Higher Education

Aiken, Lewis R. – Educational and Psychological Measurement, 1989
Two alternatives to traditional item analysis and reliability estimation procedures are considered for determining the difficulty, discrimination, and reliability of optional items on essay and other tests. A computer program to compute these measures is described, and illustrations are given. (SLD)
Descriptors: College Entrance Examinations, Computer Software, Difficulty Level, Essay Tests