ERIC - Search Results

Showing 1 to 15 of 16 results

Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles

Peer reviewed

Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025

Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…

Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes

Estimating the Psychometric Properties ("Item Difficulty, Discrimination and Reliability Indices") of Test Items Using Kuder-Richardson Approach (KR-20)

Peer reviewed

Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023

There is no need or point to testing of knowledge, attributes, traits, behaviours or abilities of an individual if information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classrooms has been completely ignored otherwise dying slowly in most testing environments. In…

Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

A Review of Subscore Estimation Methods. ETS RR-18-17

Peer reviewed

Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018

Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…

Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items

Bayesian Approaches to Test Score Measurement Errors in Student Growth Prediction Models

Pei-Hsuan Chiu – ProQuest LLC, 2018

Evidence of student growth is a primary outcome of interest for educational accountability systems. When three or more years of student test data are available, questions around how students grow and what their predicted growth is can be answered. Given that test scores contain measurement error, this error should be considered in growth and…

Descriptors: Bayesian Statistics, Scores, Error of Measurement, Growth Models

Towards Optimal Measurement and Theoretical Grounding of L2 English Elicited Imitation: Examining Scales, (Mis)Fits, and Prompt Features from Item Response Theory and Random Forest Approaches

Ji-young Shin – ProQuest LLC, 2021

The present dissertation investigated the impact of scales/scoring methods and prompt linguistic features on the measurement quality of L2 English elicited imitation (EI). Scales/scoring methods are an important feature for the validity and reliability of L2 EI test, but less is known (Yan et al., 2016). Prompt linguistic features are also known…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Semantics

Determining Item Screening Criteria Using Cost-Benefit Analysis

Peer reviewed

Bashkov, Bozhidar M.; Clauser, Jerome C. – Practical Assessment, Research & Evaluation, 2019

Successful testing programs rely on high-quality test items to produce reliable scores and defensible exams. However, determining what statistical screening criteria are most appropriate to support these goals can be daunting. This study describes and demonstrates cost-benefit analysis as an empirical approach to determining appropriate screening…

Descriptors: Test Items, Test Reliability, Evaluation Criteria, Accuracy

Hierarchical Diagnostic Classification Modeling of Reading Comprehension

Peer reviewed

Tabatabaee-Yazdi, Mona – SAGE Open, 2020

The Hierarchical Diagnostic Classification Model (HDCM) reflects on the sequences of the presentation of the essential materials and attributes to answer the items of a test correctly. In this study, a foreign language reading comprehension test was analyzed employing HDCM and the generalized deterministic-input, noisy and gate (G-DINA) model to…

Descriptors: Diagnostic Tests, Classification, Models, Reading Comprehension

A Rasch Analysis of the Junior Metacognitive Awareness Inventory with Singapore Students

Peer reviewed

Ning, Hoi Kwan – Measurement and Evaluation in Counseling and Development, 2018

The psychometric properties of the 2 versions of the Junior Metacognitive Awareness Inventory were examined with Singapore student samples. Other than 2 misfitting items and an underutilized response scale, Rasch analysis demonstrated that the instruments have good measurement precision, and no differential item functioning was detected across…

Descriptors: Foreign Countries, Metacognition, Measures (Individuals), Item Response Theory

Adapting the Autistic Behavioural Indicators Instrument (ABII) as a Parent Questionnaire (ABII-PQ)

Peer reviewed

Ward, Samantha L.; Sullivan, Karen A.; Gilmore, Linda – Journal of Intellectual & Developmental Disability, 2017

Background: Both parent-report and clinician-administered autism spectrum disorder (ASD) screening instruments are important to accurately inform ASD risk ascertainment. The aim of this study was to adapt a clinician-administered ASD screening instrument, the Autistic Behavioural Indicators Instrument (ABII), as a parent questionnaire equivalent…

Descriptors: Foreign Countries, Autism, Diagnostic Tests, Observation

Variance Difference between Maximum Likelihood Estimation Method and Expected A Posteriori Estimation Method Viewed from Number of Test Items

Peer reviewed

Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S. – Educational Research and Reviews, 2016

The aim of this study is to determine variance difference between maximum likelihood and expected A posteriori estimation methods viewed from number of test items of aptitude test. The variance presents an accuracy generated by both maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…

Descriptors: Maximum Likelihood Statistics, Computation, Item Response Theory, Test Items

Findings from the Quality of Items/Tasks/Stimuli Investigations: PARCC Field Tests

Thacker, Arthur A.; Dickinson, Emily R.; Bynum, Bethany H.; Wen, Yao; Smith, Erin; Sinclair, Andrea L.; Deatz, Richard C.; Wise, Lauress L. – Partnership for Assessment of Readiness for College and Careers, 2015

The Partnership for Assessment of Readiness for College and Careers (PARCC) field tests during the spring of 2014 provided an opportunity to investigate the quality of the items, tasks, and associated stimuli. HumRRO conducted several research studies summarized in this report. Quality of test items is integral to the "Theory of Action"…

Descriptors: Achievement Tests, Test Items, Common Core State Standards, Difficulty Level

Short-Form Philadelphia Naming Test: Rationale and Empirical Evaluation

Peer reviewed

Walker, Grant M.; Schwartz, Myrna F. – American Journal of Speech-Language Pathology, 2012

Purpose: To create two matched short forms of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) that yield similar results to the PNT for measuring anomia. Method: In Study 1, archived naming data from 94 individuals with aphasia were used to identify which PNT items should be included in the short forms. The 2…

Descriptors: Naming, Tests, Aphasia, Test Items

Multilevel Item Factor Analysis and Student Perceptions of Teacher Effectiveness

Kuhfeld, Megan Rebecca – ProQuest LLC, 2016

Measures of teacher effectiveness have become a major research and policy issue due to the increased focus on teacher accountability during the past decade. Growing concerns about the variability in the quality of teaching and traditional approaches to measuring teacher effectiveness led to federal and state policies calling for more rigorous…

Descriptors: Factor Analysis, Student Attitudes, Teacher Effectiveness, Student Evaluation of Teacher Performance

Test Items and Translation: Capturing Early Conceptual Development in Mathematics Reliably?

Peer reviewed

Dampier, Graham; Mawila, Daphney – South African Journal of Childhood Education, 2012

Translating items of educational tests from one language to another is problematic. Arriving at accurate translations of concepts formulated in a language that is grammatically and syntactically incommensurable with a target language is a concern that probably won't find resolution. And the very act of translation can obscure the accuracy of test…

Descriptors: Test Items, Translation, Test Reliability, Item Response Theory
