ERIC - Search Results

Showing 1 to 15 of 116 results

Linking Errors Introduced by Rapid Guessing Responses When Employing Multigroup Concurrent IRT Scaling

Jiayi Deng – ProQuest LLC, 2024

Test score comparability in international large-scale assessments (LSA) is of utmost importance in measuring the effectiveness of education systems and understanding the impact of education on economic growth. To effectively compare test scores on an international scale, score linking is widely used to convert raw scores from different linguistic…

Descriptors: Item Response Theory, Scoring Rubrics, Scoring, Error of Measurement

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

Preventing Satisficing: A Narrative Review

Peer reviewed

Danielle R. Blazek; Jason T. Siegel – International Journal of Social Research Methodology, 2024

Social scientists have long agreed that satisficing behavior increases error and reduces the validity of survey data. There have been numerous reviews on detecting satisficing behavior, but preventing this behavior has received less attention. The current narrative review provides empirically supported guidance on preventing satisficing by…

Descriptors: Response Style (Tests), Responses, Reaction Time, Test Interpretation

An Experimental Validation of Sequential Multiple-Choice Tests

Peer reviewed

Papenberg, Martin; Diedenhofen, Birk; Musch, Jochen – Journal of Experimental Education, 2021

Testwiseness may introduce construct-irrelevant variance to multiple-choice test scores. Presenting response options sequentially has been proposed as a potential solution to this problem. In an experimental validation, we determined the psychometric properties of a test based on the sequential presentation of response options. We created a strong…

Descriptors: Test Wiseness, Test Validity, Test Reliability, Multiple Choice Tests

Reconsidering the Assessment Policy: Practical Use of Liberal Multiple-Choice Tests (SAC Method)

Peer reviewed

Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019

Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly all standardized achievement tests make use of MC test items, and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…

Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)

Strategic Omission and Risk Aversion: A Bias-Reliability Tradeoff

Peer reviewed

Lang, David – Grantee Submission, 2019

Whether high-stakes exams such as the SAT or College Board AP exams should penalize incorrect answers is a controversial question. In this paper, we document that penalty functions can have differential effects depending on a student's risk tolerance. Moreover, literature shows that risk aversion tends to vary along other areas of concern such as…

Descriptors: High Stakes Tests, Risk, Item Response Theory, Test Bias

A Simulation-Based Method for Finding the Optimal Number of Options for Multiple-Choice Items on a Test. Research Report. ETS RR-18-22

Peer reviewed

Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018

For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…

Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction

Exploration of Factors Affecting the Added Value of Test Subscores

Peer reviewed

Wang, Xiaolin; Svetina, Dubravka; Dai, Shenghai – Journal of Experimental Education, 2019

Recently, interest in test subscore reporting for diagnosis purposes has been growing rapidly. The two simulation studies here examined factors (sample size, number of subscales, correlation between subscales, and three factors affecting subscore reliability: number of items per subscale, item parameter distribution, and data generating model)…

Descriptors: Value Added Models, Scores, Sample Size, Correlation

Multiple Choice Questions: Answering Correctly and Knowing the Answer

Peer reviewed

McKenna, Peter – Interactive Technology and Smart Education, 2019

Purpose: This paper aims to examine whether multiple choice questions (MCQs) can be answered correctly without knowing the answer and whether constructed response questions (CRQs) offer more reliable assessment. Design/methodology/approach: The paper presents a critical review of existing research on MCQs, then reports on an experimental study…

Descriptors: Multiple Choice Tests, Accuracy, Test Wiseness, Objective Tests

Addressing the Shortcomings of Traditional Multiple-Choice Tests: Subset Selection without Mark Deductions

Peer reviewed

Otoyo, Lucia; Bush, Martin – Practical Assessment, Research & Evaluation, 2018

This article presents the results of an empirical study of "subset selection" tests, which are a generalisation of traditional multiple-choice tests in which test takers are able to express partial knowledge. Similar previous studies have mostly been supportive of subset selection, but the deduction of marks for incorrect responses has…

Descriptors: Multiple Choice Tests, Grading, Test Reliability, Test Format

Same Test, Better Scores: Boosting the Reliability of Short Online Intelligence Recruitment Tests with Nested Logit Item Response Theory Models

Peer reviewed

Storme, Martin; Myszkowski, Nils; Baron, Simon; Bernard, David – Journal of Intelligence, 2019

Assessing job applicants' general mental ability online poses psychometric challenges due to the necessity of having brief but accurate tests. Recent research (Myszkowski & Storme, 2018) suggests that recovering distractor information through Nested Logit Models (NLM; Suh & Bolt, 2010) increases the reliability of ability estimates in…

Descriptors: Intelligence Tests, Item Response Theory, Comparative Analysis, Test Reliability

Evaluating the Impact of Guessing and Its Interactions with Other Test Characteristics on Confidence Interval Procedures for Coefficient Alpha

Peer reviewed

Paek, Insu – Educational and Psychological Measurement, 2016

The effect of guessing on the point estimate of coefficient alpha has been studied in the literature, but the impact of guessing and its interactions with other test characteristics on the interval estimators for coefficient alpha has not been fully investigated. This study examined the impact of guessing and its interactions with other test…

Descriptors: Guessing (Tests), Computation, Statistical Analysis, Test Length

Validity Considerations for 10th-Grade ACT State and District Testing. Insights in Education and Work

Allen, Jeff M.; Mattern, Krista – ACT, Inc., 2019

States and districts have expressed interest in administering the ACT® to 10th-grade students. Given that the ACT was designed to be administered in the spring of 11th grade or fall of 12th grade, the appropriateness of this use should be evaluated. As such, the focus of this paper is to summarize empirical evidence evaluating the use of the ACT…

Descriptors: Test Validity, College Entrance Examinations, High School Students, Grade 10

Reducing the Need for Guesswork in Multiple-Choice Tests

Peer reviewed

Bush, Martin – Assessment & Evaluation in Higher Education, 2015

The humble multiple-choice test is very widely used within education at all levels, but its susceptibility to guesswork makes it a suboptimal assessment tool. The reliability of a multiple-choice test is partly governed by the number of items it contains; however, longer tests are more time consuming to take, and for some subject areas, it can be…

Descriptors: Guessing (Tests), Multiple Choice Tests, Test Format, Test Reliability

The "Test of Financial Literacy": Development and Measurement Characteristics

Peer reviewed

Walstad, William B.; Rebeck, Ken – Journal of Economic Education, 2017

The "Test of Financial Literacy" (TFL) was created to measure the financial knowledge of high school students. Its content is based on the standards and benchmarks stated in the "National Standards for Financial Literacy" (Council for Economic Education 2013). The test development process involved extensive item writing and…

Descriptors: Tests, Money Management, Literacy, High School Students
