ERIC - Search Results

Publication Date

In 2025	0
Since 2024	3
Since 2021 (last 5 years)	17
Since 2016 (last 10 years)	24
Since 2006 (last 20 years)	42

Descriptor

Comparative Analysis	79
Test Items	79
Test Validity	79
Test Reliability	35
Foreign Countries	28
Item Analysis	27
Test Construction	24
Difficulty Level	21
Item Response Theory	16
Scores	15
Statistical Analysis	14
Test Format	14
Higher Education	13
Multiple Choice Tests	12
Achievement Tests	11
English (Second Language)	11
Mathematics Tests	11
Reading Tests	11
Correlation	10
Language Tests	10
Test Bias	10
College Students	8
Computer Assisted Testing	8
Scoring	8
Undergraduate Students	8

Publication Type

Reports - Research	55
Journal Articles	44
Reports - Evaluative	13
Speeches/Meeting Papers	13
Reports - Descriptive	5
Dissertations/Theses -…	3
Tests/Questionnaires	3
Books	1
Collected Works - General	1
Collected Works - Serials	1
Information Analyses	1
Non-Print Media	1
Numerical/Quantitative Data	1
Opinion Papers	1
Reference Materials - General	1

Education Level

Higher Education	15
Postsecondary Education	14
Secondary Education	7
Elementary Education	6
Intermediate Grades	3
Early Childhood Education	2
Elementary Secondary Education	2
Grade 4	2
High Schools	2
Middle Schools	2
Grade 5	1
Grade 6	1

Audience

Administrators	1
Parents	1
Policymakers	1

Location

Israel	3
Japan	3
Australia	2
China	2
Germany	2
Indonesia	2
Ohio	2
United States	2
Canada	1
Colorado	1
District of Columbia	1
Europe	1
Georgia	1
Idaho	1
Illinois	1
Iran	1
Malaysia	1
Minnesota	1
Nevada	1
New Jersey	1
New York	1
Oregon	1
Russia	1
South Africa	1
South Dakota	1

Assessments and Surveys

Raven Progressive Matrices	2
SAT (College Admission Test)	2
Trends in International…	2
California Achievement Tests	1
Comprehensive Tests of Basic…	1
Defining Issues Test	1
Graduate Record Examinations	1
Gray Oral Reading Test	1
International Association for…	1
Minnesota Multiphasic…	1
National Assessment of…	1
Program for International…	1
Progress in International…	1
Sequential Tests of…	1
Stanford Achievement Tests	1
Test of English as a Foreign…	1
Wechsler Adult Intelligence…	1

Showing 1 to 15 of 79 results

Generating Social and Emotional Skill Items: Humans vs. ChatGPT. ACT Research. Issue Brief

Kate E. Walton; Cristina Anguiano-Carrasco – ACT, Inc., 2024

Large language models (LLMs), such as ChatGPT, are becoming increasingly prominent. Their use is becoming more and more popular to assist with simple tasks, such as summarizing documents, translating languages, rephrasing sentences, or answering questions. Reports like McKinsey's (Chui & Yee, 2023) estimate that by implementing LLMs,…

Descriptors: Artificial Intelligence, Man Machine Systems, Natural Language Processing, Test Construction

The Social Shapes Test as a Self-Administered, Online Measure of Social Intelligence: Two Studies with Typically Developing Adults and Adults with Autism Spectrum Disorder

Peer reviewed

Matt I. Brown; Patrick R. Heck; Christopher F. Chabris – Journal of Autism and Developmental Disorders, 2024

The Social Shapes Test (SST) is a measure of social intelligence which does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…

Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability

Validity of Multiple-Choice Digital Formative Assessment for Assessing Students' (Mis)Conceptions: Evidence from a Mixed-Methods Study in Algebra

Peer reviewed

Katrin Klingbeil; Fabian Rösken; Bärbel Barzel; Florian Schacht; Kaye Stacey; Vicki Steinle; Daniel Thurm – ZDM: Mathematics Education, 2024

Assessing students' (mis)conceptions is a challenging task for teachers as well as for researchers. While individual assessment, for example through interviews, can provide deep insights into students' thinking, this is very time-consuming and therefore not feasible for whole classes or even larger settings. For those settings, automatically…

Descriptors: Multiple Choice Tests, Formative Evaluation, Mathematics Tests, Misconceptions

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

How Administration Stakes and Settings Affect Student Behavior and Performance on a Biology Concept Assessment

Peer reviewed

Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023

Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…

Descriptors: Biology, Science Tests, High Stakes Tests, Scores

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Liu, Xiaowen; Rogers, H. Jane – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30xfree-response short answer to a question, SAQ) and end-of-year paper (4xSAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format

The Pattern of Test-Taking Effort across Items in Cognitive Ability Test: A Latent Class Analysis

Peer reviewed

Akhtar, Hanif – International Association for Development of the Information Society, 2022

When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…

Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

Developing the Diagnostic Test of Misconceptions of Fractions

Peer reviewed

Aleyna Altan; Zehra Taspinar Sener – Online Submission, 2023

This research aimed to develop a valid and reliable test to be used to detect sixth grade students' misconceptions and errors regarding the subject of fractions. A misconception diagnostic test has been developed that includes the concept of fractions, different representations of fractions, ordering and comparing fractions, equivalence of…

Descriptors: Diagnostic Tests, Mathematics Tests, Fractions, Misconceptions

Gender Bias in Test Item Formats: Evidence from PISA 2009, 2012, and 2015 Math and Reading Tests

Peer reviewed

Shear, Benjamin R. – Journal of Educational Measurement, 2023

Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…

Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests

A Design for Comparing CTT and IRT in Test Assembly, Scoring and Argumentation: Differences among Reliability, Information and Validation

Peer reviewed

Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019

This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…

Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring

Benthik Android Physics Comic Effectiveness for Vector Representation and Critical Thinking Students' Improvement

Peer reviewed

Maghfiroh, Anissa; Kuswanto, Heru – International Journal of Instruction, 2022

This research aims to reveal the effectiveness of the use of Kofie GeBoL media in improving (1) vector representation ability and (2) critical thinking ability in physics instruction. It is a descriptive quantitative study with the quasi-experiment design. It was conducted in two stages: empirical try out and implementation of Kofie GeboL to see…

Descriptors: Physics, Instructional Effectiveness, Critical Thinking, Thinking Skills

Comparison of Confirmatory Factor Analysis Estimation Methods on Mixed-Format Data

Peer reviewed

Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021

Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…

Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

Examining Provision and Sufficiency of Testing Accommodations for English Learners

Peer reviewed

Roschmann, Sarina; Witmer, Sara E.; Volker, Martin A. – International Journal of Testing, 2021

Accommodations are commonly provided to address language-related barriers students may experience during testing. Research on the validity of scores from accommodated test administrations remains somewhat inconclusive. The current study investigated item response patterns to understand whether accommodations, as used in practice among English…

Descriptors: Testing Accommodations, English Language Learners, Scores, Item Response Theory

Source

Journal of Educational…	4
Educational and Psychological…	3
Language Testing	3
Research in Developmental…	3
CBE - Life Sciences Education	2
International Journal of…	2
ProQuest LLC	2
ACT, Inc.	1
Assessment & Evaluation in…	1
British Journal of Language…	1
College Board	1
Communique	1
Diaspora, Indigenous, and…	1
Early Education and…	1
Edinburgh Working Papers in…	1
Educational Assessment	1
Educational Studies	1
Focus	1
International Association for…	1
International Association for…	1
International Journal of…	1
International Journal of…	1
International Research in…	1
Journal of Autism and…	1
Journal of Chemical Education	1

Author

Haladyna, Tom	2
Roid, Gale	2
Afflerbach, Peter	1
Akhtar, Hanif	1
Alderson, J. Charles	1
Aleyna Altan	1
Allen, Nancy L.	1
Alqarni, Abdulelah Mohammed	1
Arth, Thomas O.	1
Awwad, Abeer	1
Bardovi-Harlig, Kathleen	1
Basset, Katherine	1
Benderson, Albert, Ed.	1
Berman, Ye'Elah	1
Betjemann, Rebecca S.	1
Bishop, Pamela R.	1
Blair, Bernadette	1
Boldt, Robert F.	1
Brown, James Dean	1
Brown, Ted	1
Burts, Diane C.	1
Bärbel Barzel	1
Canan Karababa, Z.	1
Chien, Chi-Wen	1