ERIC - Search Results

Publication Date

In 2025

Descriptor

Difficulty Level	18
Test Items	18
Foreign Countries	8
Item Response Theory	8
Test Reliability	8
Psychometrics	7
Test Validity	7
Item Analysis	6
Test Construction	6
Accuracy	3
Cognitive Processes	3
English (Second Language)	3
Language Tests	3
Multiple Choice Tests	3
Science Tests	3
Undergraduate Students	3
Universities	3
Artificial Intelligence	2
College Students	2
Comparative Analysis	2
Computer Software	2
Evaluators	2
Factor Analysis	2
Factor Structure	2
Prior Learning	2
More ▼

Source

Chemistry Education Research…	1
Educational Process:…	1
Educational Psychology Review	1
Educational and Psychological…	1
Instructional Science: An…	1
International Electronic…	1
International Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Autism and…	1
Journal of Biological…	1
Journal of Education and…	1
Journal of Educational and…	1
Journal of Positive Behavior…	1
Language Education &…	1
Language Testing in Asia	1
National Center for Research…	1
Teaching of Psychology	1
More ▼

Publication Type

Reports - Research	18
Journal Articles	17
Tests/Questionnaires	1

Education Level

Higher Education	6
Postsecondary Education	6
Secondary Education	4
Junior High Schools	2
Middle Schools	2
Early Childhood Education	1
Elementary Education	1
Elementary Secondary Education	1
Grade 2	1
High Schools	1
Primary Education	1
More ▼

Audience

Location

Bosnia and Herzegovina	1
Germany	1
Indonesia	1
Iran	1
Oman	1
Slovakia	1
Thailand (Bangkok)	1
United Kingdom	1

Laws, Policies, & Programs

Assessments and Surveys

Big Five Inventory

What Works Clearinghouse Rating

Showing 1 to 15 of 18 results Save | Export

Embedding Embedded Standard Setting: An Application of Cross-Classified Item Response Theory. CRESST Report 876

Download full text

Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025

This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…

Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level

Interaction of Social Deference and Cognitive Processing in the Prediction of Acquiescence

Peer reviewed

Direct link

Patrik Havan; Michal Kohút; Peter Halama – International Journal of Testing, 2025

Acquiescence is the tendency of participants to shift their responses to agreement. Lechner et al. (2019) introduced the following mechanisms of acquiescence: social deference and cognitive processing. We added their interaction into a theoretical framework. The sample consists of 557 participants. We found significant medium strong relationship…

Descriptors: Cognitive Processes, Attention, Difficulty Level, Reflection

The Accuracy of Estimating Parameters of Multiple-Choice Test Items, Following Item-Response Theory: A Simulation Study

Peer reviewed
PDF on ERIC

Download full text

Aiman Mohammad Freihat; Omar Saleh Bani Yassin – Educational Process: International Journal, 2025

Background/purpose: This study aimed to reveal the accuracy of estimation of multiple-choice test items parameters following the models of the item-response theory in measurement. Materials/methods: The researchers depended on the measurement accuracy indicators, which express the absolute difference between the estimated and actual values of the…

Descriptors: Accuracy, Computation, Multiple Choice Tests, Test Items

Comparative Evaluation of C-Test Reliability Using Classical and Modern Psychometric Methods

Peer reviewed
PDF on ERIC

Download full text

Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025

This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…

Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests

Empirically Deriving Cut Scores in the Positive Behavioral Interventions and Supports (PBIS) Tiered Fidelity Inventory (TFI) through a Bookmarking Process

Peer reviewed

Direct link

Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025

We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…

Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Direct link

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

Investigating Construct Validity of Cognitive Load Measurement Using Single-Item Subjective Rating Scales

Peer reviewed

Direct link

Katrin Schuessler; Vanessa Fischer; Maik Walpuski – Instructional Science: An International Journal of the Learning Sciences, 2025

Cognitive load studies are mostly centered on information on perceived cognitive load. Single-item subjective rating scales are the dominant measurement practice to investigate overall cognitive load. Usually, either invested mental effort or perceived task difficulty is used as an overall cognitive load measure. However, the extent to which the…

Descriptors: Cognitive Processes, Difficulty Level, Rating Scales, Construct Validity

Argument-Based Validation of Chulalongkorn University Language Institute (CULI) Test: A Rasch-Based Evidence Investigation

Peer reviewed

Direct link

Apichat Khamboonruang – Language Testing in Asia, 2025

Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…

Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests

Developing the Mental Effort and Load-Translingual Scale (MEL-TS) as a Foundation for Translingual Research in Self-Regulated Learning

Peer reviewed

Direct link

Tino Endres; Lisa Bender; Stoo Sepp; Shirong Zhang; Louise David; Melanie Trypke; Dwayne Lieck; Juliette C. Désiron; Johanna Bohm; Sophia Weissgerber; Juan Cristobal Castro-Alonso; Fred Paas – Educational Psychology Review, 2025

Assessing cognitive demand is crucial for research on self-regulated learning; however, discrepancies in translating essential concepts across languages can hinder the comparison of research findings. Different languages often emphasize various components and interpret certain constructs differently. This paper aims to develop a translingual set…

Descriptors: Cognitive Processes, Difficulty Level, Metacognition, Translation

Developing and Validating a Biological System Thinking Test for Middle School Students

Peer reviewed

Direct link

Ruying Li; Gaofeng Li – International Journal of Science and Mathematics Education, 2025

Systems thinking (ST) is an essential competence for future life and biology learning. Appropriate assessment is critical for collecting sufficient information to develop ST in biology education. This research offers an ST framework based on a comprehensive understanding of biological systems, encompassing four skills across three complexity…

Descriptors: Test Construction, Test Validity, Science Tests, Cognitive Tests

Validity and Reliability Analysis of a Socioscientific Issues-Based Critical Thinking Self-Assessment Instrument Using the Rasch Model

Peer reviewed
PDF on ERIC

Download full text

Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025

Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of the assessment tool is still questionable, so a more objective evaluation is needed. Objective of this investigation is to assess the self-assessment tools in evaluating students' critical thinking…

Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity

Design, Development, and Evaluation of the Organic Chemistry Representational Competence Assessment (ORCA)

Peer reviewed

Direct link

Lyniesha Ward; Fridah Rotich; Jeffrey R. Raker; Regis Komperda; Sachin Nedungadi; Maia Popova – Chemistry Education Research and Practice, 2025

This paper describes the design and evaluation of the Organic chemistry Representational Competence Assessment (ORCA). Grounded in Kozma and Russell's representational competence framework, the ORCA measures the learner's ability to "interpret," "translate," and "use" six commonly used representations of molecular…

Descriptors: Organic Chemistry, Science Tests, Test Construction, Student Evaluation

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed
PDF on ERIC

Download full text

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

The Knowledge of Autism Questionnaire-UK: Development and Initial Psychometric Evaluation

Peer reviewed

Direct link

Sophie Langhorne; Nora Uglik-Marucha; Charlotte Broadhurst; Elena Lieven; Amelia Pearson; Silia Vitoratou; Kathy Leadbitter – Journal of Autism and Developmental Disorders, 2025

Tools to measure autism knowledge are needed to assess levels of understanding within particular groups of people and to evaluate whether awareness-raising campaigns or interventions lead to improvements in understanding. Several such measures are in circulation, but, to our knowledge, there are no psychometrically-validated questionnaires that…

Descriptors: Foreign Countries, Autism Spectrum Disorders, Questionnaires, Psychometrics

Content and Item Response Theory Analysis of ChatGPT-4-Generated Multiple-Choice Items

Peer reviewed

Direct link

Roger Young; Emily Courtney; Alexander Kah; Mariah Wilkerson; Yi-Hsin Chen – Teaching of Psychology, 2025

Background: Multiple-choice item (MCI) assessments are burdensome for instructors to develop. Artificial intelligence (AI, e.g., ChatGPT) can streamline the process without sacrificing quality. The quality of AI-generated MCIs and human experts is comparable. However, whether the quality of AI-generated MCIs is equally good across various domain-…

Descriptors: Item Response Theory, Multiple Choice Tests, Psychology, Textbooks

Previous Page | Next Page »

Pages: 1 | 2

Ahmed Al - Badri	1
Aiman Mohammad Freihat	1
Alexander Kah	1
Amelia Pearson	1
Apichat Khamboonruang	1
Benjamin W. Domingue	1
Charlotte Broadhurst	1
Dina Kamber Hamzic	1
Dwayne Lieck	1
Elena Lieven	1
Emily Courtney	1
Fred Paas	1
Fridah Rotich	1
Gaofeng Li	1
Golam Reza Rohani	1
Hamdollah Ravand	1
Ismar Hadžalic	1
Jeffrey R. Raker	1
Jerin Kim	1
Jiayi Deng	1
Johanna Bohm	1
Joseph A. Rios	1
Joshua B. Gilbert	1
Juan Cristobal Castro-Alonso	1
Juliette C. Désiron	1
More ▼