ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	10
Since 2017 (last 10 years)	20
Since 2007 (last 20 years)	25

Descriptor

Accuracy	26
Test Items	26
Test Validity	26
Item Response Theory	10
Foreign Countries	9
Test Construction	9
Item Analysis	8
Psychometrics	8
English (Second Language)	6
Language Tests	6
Scores	6
Test Reliability	6
Second Language Learning	5
Classification	4
College Students	4
Difficulty Level	4
Factor Analysis	4
Goodness of Fit	4
Mathematics Tests	4
Comparative Analysis	3
Diagnostic Tests	3
Error of Measurement	3
Grammar	3
Language Proficiency	3
Models	3

Publication Type

Journal Articles	19
Reports - Research	19
Dissertations/Theses -…	5
Tests/Questionnaires	2
Information Analyses	1
Reports - Descriptive	1
Reports - Evaluative	1
Speeches/Meeting Papers	1

Education Level

Higher Education	8
Postsecondary Education	7
Elementary Education	2
Secondary Education	2
Adult Education	1
Early Childhood Education	1
Elementary Secondary Education	1
High Schools	1
Junior High Schools	1
Middle Schools	1
Primary Education	1

Location

Iran	2
Japan	2
Australia	1
Japan (Tokyo)	1
Thailand (Bangkok)	1
Turkey	1

Assessments and Surveys

Force Concept Inventory	1
Graduate Record Examinations	1
Social Skills Improvement…	1
Test of English as a Foreign…	1
Trends in International…	1

Showing 1 to 15 of 26 results

Autism Knowledge Assessments: A Closer Examination of Validity by Autism Experts

Peer reviewed

Camilla M. McMahon; Maryellen Brunson McClain; Savannah Wells; Sophia Thompson; Jeffrey D. Shahidullah – Journal of Autism and Developmental Disorders, 2025

Purpose: The goal of the current study was to conduct a substantive validity review of four autism knowledge assessments with prior psychometric support (Gillespie-Lynch in J Autism and Dev Disord 45(8):2553-2566, 2015; Harrison in J Autism and Dev Disord 47(10):3281-3295, 2017; McClain in J Autism and Dev Disord 50(3):998-1006, 2020; McMahon…

Descriptors: Measures (Individuals), Psychometrics, Test Items, Accuracy

Estimating the Psychometric Properties ("Item Difficulty, Discrimination and Reliability Indices") of Test Items Using Kuder-Richardson Approach (KR-20)

Peer reviewed

Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023

There is no need or point to testing of knowledge, attributes, traits, behaviours or abilities of an individual if information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classrooms has been completely ignored otherwise dying slowly in most testing environments. In…

Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis

An Example of Redeveloping Checklists to Support Assessors Who Check Draft Exam Papers for Errors

Vitello, Sylvia; Crisp, Victoria; Ireland, Jo – Research Matters, 2023

Assessment materials must be checked for errors before they are presented to candidates. Any errors have the potential to reduce validity. For example, in the most extreme cases, an error may turn an otherwise well-designed exam question into one that is impossible to answer. In Cambridge University Press & Assessment, assessment materials are…

Descriptors: Check Lists, Test Validity, Error Correction, Test Construction

Argument-Based Validation of Chulalongkorn University Language Institute (CULI) Test: A Rasch-Based Evidence Investigation

Peer reviewed

Apichat Khamboonruang – Language Testing in Asia, 2025

Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…

Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests

Real-Life Applications of Competence-Based Test Development to the Construction, Improvement, and Shortening of Tests

Peer reviewed

Pasquale Anselmi; Jürgen Heller; Luca Stefanutti; Egidio Robusto; Giulia Barillari – Education and Information Technologies, 2025

Competence-based test development (CbTD) is a novel method for constructing tests that are as informative as possible about the competence state (the set of skills an individual masters) underlying the item responses. If desired, the tests can also be minimal, meaning that no item can be eliminated without reducing their informativeness. To…

Descriptors: Competency Based Education, Test Construction, Test Length, Usability

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

Improving Test Security and Efficiency of Computerized Adaptive Testing for the Force Concept Inventory

Peer reviewed

Yasuda, Jun-ichiro; Hull, Michael M.; Mae, Naohiro – Physical Review Physics Education Research, 2022

This paper presents improvements made to a computerized adaptive testing (CAT)-based version of the FCI (FCI-CAT) in regards to test security and test efficiency. First, we will discuss measures to enhance test security by controlling for item overexposure, decreasing the risk that respondents may (i) memorize the content of a pretest for use on…

Descriptors: Adaptive Testing, Computer Assisted Testing, Test Items, Risk Management

Impact of Item Parameter Drift on Rasch Scale Stability in Small Samples over Multiple Administrations

Peer reviewed

Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020

Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…

Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling

Probing the Internal Validity of the LLAMA Language Aptitude Tests

Peer reviewed

Bokander, Lars; Bylund, Emanuel – Language Learning, 2020

Over the past decade, the LLAMA language aptitude test battery has come to play an increasingly important role as an instrument in research on individual differences in language development. However, a potentially serious problem that has been pointed out by several scholars is that the LLAMA has not yet been carefully validated. We addressed this…

Descriptors: Item Analysis, Language Tests, Test Items, Individual Differences

Examining the Robustness of the Latent Growth Curve Model to Violations of Longitudinal Measurement Equivalence: A Methodological Study with Practical Applications in Child Development

Rachel A. Gross – ProQuest LLC, 2020

The present study was motivated by the theory-method mismatch between heterotypic continuity (aspects of development that manifest differently across the lifespan thus cannot be measured the same way over time) and longitudinal measurement equivalence (the statistical assumption that the developmental phenomenon studied is measured on the same…

Descriptors: Robustness (Statistics), Structural Equation Models, Longitudinal Studies, Error of Measurement

Comparison of DIF Methods for the Student Experience in the Research University Survey: A Validity and Methodological Study

Thapelo Ncube Whitfield – ProQuest LLC, 2021

Student Experience surveys are used to measure student attitudes towards their campus as well as to initiate conversations for institutional change. Validity evidence to support the interpretations of these surveys' results, however, is lacking. The first purpose of this study was to compare three Differential Item Functioning (DIF) methods on…

Descriptors: College Students, Student Surveys, Student Experience, Student Attitudes

Developing a Level-Specific Checklist for Assessing EFL Writing

Peer reviewed

Lukácsi, Zoltán – Language Testing, 2021

In second language writing assessment, rating scales and scores from human-mediated assessment have been criticized for a number of shortcomings including problems with adequacy, relevance, and reliability (Hamp-Lyons, 1990; McNamara, 1996; Weigle, 2002). In its testing practice, Euroexam International also detected that the rating scales for…

Descriptors: Test Construction, Test Validity, Test Items, Check Lists

Comparison of Confirmatory Factor Analysis Estimation Methods on Mixed-Format Data

Peer reviewed

Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021

Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…

Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics

The Construction and Validation of a Q-Matrix for Cognitive Diagnostic Analysis: The Case of the Reading Comprehension Section of the IAUEPT

Peer reviewed

Boori, Ali Akbar; Ghazanfari, Mohammad; Ghonsooly, Behzad; Baghaei, Purya – International Journal of Language Testing, 2023

Cognitive diagnostic models (CDMs) have received sustained attention in educational settings because they can be used to operationalize formative assessment to provide diagnostic feedback and inform instruction. A large number of CDMs have been developed over the past few years. An important component of all CDMs is a Q-matrix that specifies a…

Descriptors: Reading Comprehension, Reading Tests, English (Second Language), Islam

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

Source

ProQuest LLC	5
American Journal of…	1
Applied Measurement in…	1
Applied Psycholinguistics	1
ETS Research Report Series	1
Education and Information…	1
Educational Research and…	1
Educational and Psychological…	1
Grantee Submission	1
InSight: A Journal of…	1
International Journal of…	1
International Journal of…	1
Journal of Autism and…	1
Language Learning	1
Language Testing	1
Language Testing in Asia	1
Mathematics Education…	1
Physical Review Physics…	1
Research Matters	1
SAGE Open	1
Shanlax International Journal…	1
TESL-EJ	1

Author

Agbenyo, Sheilla	1
Apichat Khamboonruang	1
Asquith, Steven	1
Baghaei, Purya	1
Bokander, Lars	1
Bonifay, Wes E.	1
Boori, Ali Akbar	1
Bulala, Tapela	1
Bylund, Emanuel	1
Camilla M. McMahon	1
Crisp, Victoria	1
Dogan, Nuri	1
Dronjic, Vedran	1
Egidio Robusto	1
Eklund, Katie	1
Ghazanfari, Mohammad	1
Ghonsooly, Behzad	1
Giulia Barillari	1
Graf, Edith Aurora	1
Helms-Park, Rena	1
Hull, Michael M.	1
Ireland, Jo	1
Izumi, Jared	1
Jeffrey D. Shahidullah	1
Ji-young Shin	1