ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	26
Since 2016 (last 10 years)	43
Since 2006 (last 20 years)	73

Descriptor

Comparative Analysis	121
Test Items	121
Test Validity	79
Foreign Countries	43
Item Analysis	39
Test Reliability	35
Test Construction	34
Difficulty Level	31
Item Response Theory	28
Validity	24
Language Tests	23
Scores	23
Test Format	22
Statistical Analysis	21
English (Second Language)	20
Construct Validity	19
Multiple Choice Tests	18
Correlation	17
Higher Education	17
Second Language Learning	16
Mathematics Tests	15
Reading Tests	15
Achievement Tests	13
Psychometrics	12
College Students	11
More ▼

Publication Type

Reports - Research	89
Journal Articles	75
Speeches/Meeting Papers	20
Reports - Evaluative	19
Tests/Questionnaires	6
Reports - Descriptive	5
Dissertations/Theses -…	4
Information Analyses	2
Numerical/Quantitative Data	2
Books	1
Collected Works - General	1
Collected Works - Serials	1
Non-Print Media	1
Opinion Papers	1
Reference Materials - General	1
Reports - General	1
More ▼

Education Level

Higher Education	23
Postsecondary Education	21
Secondary Education	12
Elementary Education	11
High Schools	5
Early Childhood Education	4
Elementary Secondary Education	3
Intermediate Grades	3
Grade 4	2
Middle Schools	2
Grade 5	1
Grade 6	1
More ▼

Audience

Administrators	1
Parents	1
Policymakers	1
Researchers	1

Location

Canada	4
Germany	4
Indonesia	3
Israel	3
Japan	3
United States	3
Australia	2
China	2
Colorado	2
Georgia	2
Nevada	2
Ohio	2
Oregon	2
Turkey	2
Turkey (Ankara)	2
United Kingdom	2
Africa	1
Belgium	1
China (Beijing)	1
District of Columbia	1
Europe	1
Idaho	1
Illinois	1
Iran	1
Malaysia	1
More ▼

Laws, Policies, & Programs

What Works Clearinghouse Rating

Showing 1 to 15 of 121 results Save | Export

Developing an MLA-Test for Young Learners -- Insights from Measurement Theory and Language Testing

Peer reviewed

Direct link

Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025

This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…

Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed
PDF on ERIC

Download full text

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30xfree-response short answer to a question, SAQ) and end-of-year paper (4xSAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format

Generating Social and Emotional Skill Items: Humans vs. ChatGPT. ACT Research. Issue Brief

Download full text

Kate E. Walton; Cristina Anguiano-Carrasco – ACT, Inc., 2024

Large language models (LLMs), such as ChatGPT, are becoming increasingly prominent. Their use is becoming more and more popular to assist with simple tasks, such as summarizing documents, translating languages, rephrasing sentences, or answering questions. Reports like McKinsey's (Chui, & Yee, 2023) estimate that by implementing LLMs,…

Descriptors: Artificial Intelligence, Man Machine Systems, Natural Language Processing, Test Construction

The Social Shapes Test as a Self-Administered, Online Measure of Social Intelligence: Two Studies with Typically Developing Adults and Adults with Autism Spectrum Disorder

Peer reviewed

Direct link

Matt I. Brown; Patrick R. Heck; Christopher F. Chabris – Journal of Autism and Developmental Disorders, 2024

The Social Shapes Test (SST) is a measure of social intelligence which does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…

Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability

How Useful Is Comparative Judgement of Item Difficulty for Standard Maintaining?

Download full text

Benton, Tom – Research Matters, 2020

This article reviews the evidence on the extent to which experts' perceptions of item difficulties, captured using comparative judgement, can predict empirical item difficulties. This evidence is drawn from existing published studies on this topic and also from statistical analysis of data held by Cambridge Assessment. Having reviewed the…

Descriptors: Test Items, Difficulty Level, Expertise, Comparative Analysis

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Download full text

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Validity of Multiple-Choice Digital Formative Assessment for Assessing Students' (Mis)Conceptions: Evidence from a Mixed-Methods Study in Algebra

Peer reviewed

Direct link

Katrin Klingbeil; Fabian Rösken; Bärbel Barzel; Florian Schacht; Kaye Stacey; Vicki Steinle; Daniel Thurm – ZDM: Mathematics Education, 2024

Assessing students' (mis)conceptions is a challenging task for teachers as well as for researchers. While individual assessment, for example through interviews, can provide deep insights into students' thinking, this is very time-consuming and therefore not feasible for whole classes or even larger settings. For those settings, automatically…

Descriptors: Multiple Choice Tests, Formative Evaluation, Mathematics Tests, Misconceptions

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Direct link

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

How Administration Stakes and Settings Affect Student Behavior and Performance on a Biology Concept Assessment

Peer reviewed

Direct link

Uminski, Crystal; Hubbard, Joanna K.; Couch, Brian A. – CBE - Life Sciences Education, 2023

Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or…

Descriptors: Biology, Science Tests, High Stakes Tests, Scores

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Direct link

Liu, Xiaowen; Jane Rogers, H. – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

A Regression Discontinuity Design Framework for Controlling Selection Bias in Evaluations of Differential Item Functioning

Peer reviewed

Direct link

Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022

Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…

Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations

Changes in the Speed-Ability Relation through Different Treatments of Rapid Guessing

Peer reviewed

Direct link

Deribo, Tobias; Goldhammer, Frank; Kroehne, Ulf – Educational and Psychological Measurement, 2023

As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a…

Descriptors: Reaction Time, Guessing (Tests), Behavior Patterns, Bias

The Pattern of Test-Taking Effort across Items in Cognitive Ability Test: A Latent Class Analysis

Peer reviewed
PDF on ERIC

Download full text

Akhtar, Hanif – International Association for Development of the Information Society, 2022

When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…

Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

Developing the Diagnostic Test of Misconceptions of Fractions

Peer reviewed
PDF on ERIC

Download full text

Aleyna Altan; Zehra Taspinar Sener – Online Submission, 2023

This research aimed to develop a valid and reliable test to be used to detect sixth grade students' misconceptions and errors regarding the subject of fractions. A misconception diagnostic test has been developed that includes the concept of fractions, different representations of fractions, ordering and comparing fractions, equivalence of…

Descriptors: Diagnostic Tests, Mathematics Tests, Fractions, Misconceptions

Gender Bias in Test Item Formats: Evidence from PISA 2009, 2012, and 2015 Math and Reading Tests

Peer reviewed

Direct link

Shear, Benjamin R. – Journal of Educational Measurement, 2023

Large-scale standardized tests are regularly used to measure student achievement overall and for student subgroups. These uses assume tests provide comparable measures of outcomes across student subgroups, but prior research suggests score comparisons across gender groups may be complicated by the type of test items used. This paper presents…

Descriptors: Gender Bias, Item Analysis, Test Items, Achievement Tests

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Educational and Psychological…	6
Language Testing	6
Journal of Educational…	4
Language Assessment Quarterly	4
ProQuest LLC	3
Research in Developmental…	3
CBE - Life Sciences Education	2
International Journal of…	2
International Journal of…	2
Language Testing in Asia	2
Online Submission	2
Research Matters	2
ACT, Inc.	1
Applied Measurement in…	1
Applied Psychological…	1
Assessment & Evaluation in…	1
Assessment in Education:…	1
Bilingual Research Journal	1
Biochemistry and Molecular…	1
British Journal of Language…	1
College Board	1
Communique	1
Diaspora, Indigenous, and…	1
Early Education and…	1
Edinburgh Working Papers in…	1
More ▼

Allen, Nancy L.	2
Haladyna, Tom	2
Roid, Gale	2
Wainer, Howard	2
Abel, Michael B.	1
Acar, Tülin	1
Aesaert, Koen	1
Afflerbach, Peter	1
Akbari, Alireza	1
Akhtar, Hanif	1
Alderson, J. Charles	1
Aleyna Altan	1
Alqarni, Abdulelah Mohammed	1
Arth, Thomas O.	1
Awwad, Abeer	1
Baker, Harley E.	1
Bardovi-Harlig, Kathleen	1
Basset, Katherine	1
Benderson, Albert, Ed.	1
Benton, Tom	1
Berman, Ye'Elah	1
Betebenner, Damian W.	1
Betjemann, Rebecca S.	1
Bishop, Pamela R.	1
More ▼

Test of English as a Foreign…	5
SAT (College Admission Test)	3
Trends in International…	3
Program for International…	2
Raven Progressive Matrices	2
Armed Services Vocational…	1
California Achievement Tests	1
Comprehensive Tests of Basic…	1
Defining Issues Test	1
Early Childhood Environment…	1
Graduate Record Examinations	1
Gray Oral Reading Test	1
International Association for…	1
Minnesota Multiphasic…	1
National Assessment of…	1
Progress in International…	1
Sequential Tests of…	1
Stanford Achievement Tests	1
Strong Interest Inventory	1
Test of Written English	1
Wechsler Adult Intelligence…	1
More ▼