ERIC - Search Results

Showing all 12 results

Modeling Directional Testlet Effects on Multiple Open-Ended Questions

Peer reviewed

Jin, Kuan-Yu; Siu, Wai-Lok – Journal of Educational Measurement, 2025

Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…

Descriptors: Models, Test Items, Educational Assessment, Scores

Automatic Multiple Choice Question Generation From Text: A Survey

Peer reviewed

Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020

Automatic multiple choice question (MCQ) generation from a text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been attracted toward automatic MCQ generation since the late 90's.…

Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software

A Comprehensive Review of Rasch Measurement in Language Assessment: Recommendations and Guidelines for Research

Peer reviewed

Aryadoust, Vahid; Ng, Li Ying; Sayama, Hiroki – Language Testing, 2021

Over the past decades, the application of Rasch measurement in language assessment has gradually increased. In the present study, we coded 215 papers using Rasch measurement published in 21 applied linguistics journals for multiple features. We found that seven Rasch models and 23 software packages were adopted in these papers, with many-facet…

Descriptors: Language Tests, Testing, Test Items, Network Analysis

Benthik Android Physics Comic Effectiveness for Vector Representation and Critical Thinking Students' Improvement

Peer reviewed

Maghfiroh, Anissa; Kuswanto, Heru – International Journal of Instruction, 2022

This research aims to reveal the effectiveness of the use of Kofie GeBoL media in improving (1) vector representation ability and (2) critical thinking ability in physics instruction. It is a descriptive quantitative study with the quasi-experiment design. It was conducted in two stages: empirical try out and implementation of Kofie GeboL to see…

Descriptors: Physics, Instructional Effectiveness, Critical Thinking, Thinking Skills

Item Response Theory Models for Performance Decline during Testing

Peer reviewed

Jin, Kuan-Yu; Wang, Wen-Chung – Journal of Educational Measurement, 2014

Sometimes, test-takers may not be able to attempt all items to the best of their ability (with full effort) due to personal factors (e.g., low motivation) or testing conditions (e.g., time limit), resulting in poor performances on certain items, especially those located toward the end of a test. Standard item response theory (IRT) models fail to…

Descriptors: Student Evaluation, Item Response Theory, Models, Simulation

Construct Validity and Measurement Invariance of the Peabody Picture Vocabulary Test-III Form A

Peer reviewed

Pae, Hye K.; Greenberg, Daphne; Morris, Robin D. – Language Assessment Quarterly, 2012

The aim of this study was to apply the Rasch model to an analysis of the psychometric properties of the Peabody Picture Vocabulary Test-III Form A (PPVT-IIIA) items with struggling adult readers. The PPVT-IIIA was administered to 229 African American adults whose isolated word reading skills were between third and fifth grades. Conformity of…

Descriptors: African Americans, Test Items, Construct Validity, Test Validity

A Zero-One Programming Approach to Gulliksen's Matched Random Subtests Method. Research Report 86-4.

van der Linden, Wim J.; Boekkooi-Timminga, Ellen – 1986

In order to estimate the classical coefficient of test reliability, parallel measurements are needed. H. Gulliksen's matched random subtests method, which is a graphical method for splitting a test into parallel test halves, has practical relevance because it maximizes the alpha coefficient as a lower bound of the classical test reliability…

Descriptors: Algorithms, Computer Assisted Testing, Computer Software, Difficulty Level

The Language Tester's Statistical Toolbox.

Peer reviewed

Davidson, Fred – System, 2000

Statistical analysis tools in language testing are described, chiefly classical test theory and item response theory. Computer software for statistical analysis is briefly reviewed and divided into three tiers: commonly available; statistical packages; and specialty software. (Author/VWL)

Descriptors: Computer Software, Language Tests, Second Language Learning, Statistical Analysis

An Evaluation of "Polyweighting" in Domain-Referenced Testing.

Sympson, J. Bradford; Haladyna, Thomas M. – 1988

A new approach to polychotomous scoring of test items, similar to "max-alpha" scaling (MAS) and known as polyweighting, has been developed. Unlike MAS, this new method of polychotomous scoring provides scoring weights for a given item that are independent of the difficulty of other items in the analysis. Moreover, the scoring weights are…

Descriptors: Computer Software, Difficulty Level, Item Analysis, Latent Trait Theory

A BASIC Microcomputer Program for Estimating Test Reliability.

Cobern, William W. – 1986

This computer program, written in BASIC, performs three different calculations of test reliability: (1) the Kuder-Richardson method; (2) the "common split-half" method; and (3) the Rulon-Guttman split-half method. The program reads sequential access data files for microcomputers that have been set up by statistical packages such as…

Descriptors: Computer Software, Difficulty Level, Educational Research, Equations (Mathematics)
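
The three reliability calculations named in the abstract above can be illustrated with a minimal Python sketch. This is not Cobern's BASIC program. It assumes that "Kuder-Richardson" means the KR-20 formula, that the "common split-half" method is an odd-even split with the Spearman-Brown correction, and that the Rulon-Guttman method is Rulon's difference-score formula; it also uses population variances throughout. All function names and the sample data are hypothetical.

```python
from statistics import mean, pvariance


def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    # Proportion of examinees answering each item correctly.
    p = [mean(row[i] for row in responses) for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))


def split_halves(responses):
    """Odd-even split: half scores from items 1, 3, ... and 2, 4, ..."""
    odd = [sum(row[0::2]) for row in responses]
    even = [sum(row[1::2]) for row in responses]
    return odd, even


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def spearman_brown_split_half(responses):
    """Half-test correlation stepped up to full test length."""
    odd, even = split_halves(responses)
    r = pearson(odd, even)
    return 2 * r / (1 + r)


def rulon_split_half(responses):
    """Rulon's formula: 1 - var(half difference) / var(total score)."""
    odd, even = split_halves(responses)
    diffs = [a - b for a, b in zip(odd, even)]
    totals = [a + b for a, b in zip(odd, even)]
    return 1 - pvariance(diffs) / pvariance(totals)


# Hypothetical data: 5 examinees by 4 dichotomous items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(data), 4))
print(round(spearman_brown_split_half(data), 4))
print(round(rulon_split_half(data), 4))
```

In this sample the two half-tests happen to have equal variances, so the Spearman-Brown and Rulon estimates coincide; with unequal half variances the two generally differ.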

Using Microcomputers to Score and Evaluate Items.

Thompson, Bruce; Levitov, Justin E. – Collegiate Microcomputer, 1985

Discusses features of a microcomputer program, SCOREIT, used at New Orleans' Loyola University and several high schools to score and analyze test results. Benefits and dimensions of the program's automated test and item analysis are outlined, and several examples illustrating test and item analyses by SCOREIT are presented. (MBR)

Descriptors: Computer Assisted Testing, Computer Software, Difficulty Level, Higher Education

Analyzing Optional Test Items.

Peer reviewed

Aiken, Lewis R. – Educational and Psychological Measurement, 1989

Two alternatives to traditional item analysis and reliability estimation procedures are considered for determining the difficulty, discrimination, and reliability of optional items on essay and other tests. A computer program to compute these measures is described, and illustrations are given. (SLD)

Descriptors: College Entrance Examinations, Computer Software, Difficulty Level, Essay Tests
