ERIC - Search Results

Publication Date

In 2025	6
Since 2024	13

Descriptor

Test Interpretation	13
Test Validity	13
Test Reliability	6
Foreign Countries	5
Scores	5
Test Construction	5
Evaluation Methods	4
Academic Achievement	2
Alternative Assessment	2
Assessment Literacy	2
Error of Measurement	2
Factor Analysis	2
Factor Structure	2
Intervention	2
Item Analysis	2
Item Response Theory	2
Measurement Techniques	2
Personality Traits	2
Psychometrics	2
Racial Differences	2
Response Style (Tests)	2
Scoring Rubrics	2
Test Bias	2
Test Items	2
Test Use	2

Source

Assessment for Effective…	1
Autism: The International…	1
ETS Research Report Series	1
Educational and Psychological…	1
Evaluation Review	1
International Journal of…	1
Interpreter and Translator…	1
Journal of Educational…	1
Language Testing in Asia	1
National Assessment Governing…	1
School Leadership Review	1
Society for Research on…	1
Structural Equation Modeling:…	1

Publication Type

Journal Articles	11
Reports - Research	10
Reports - Evaluative	2
Information Analyses	1
Reports - Descriptive	1
Tests/Questionnaires	1

Education Level

Higher Education	4
Postsecondary Education	4
Secondary Education	3
Elementary Education	2
Junior High Schools	2
Middle Schools	2
Grade 12	1
Grade 4	1
Grade 8	1
High Schools	1
Intermediate Grades	1

Location

China	1
Greece	1
Illinois	1
Iran (Tehran)	1
Kentucky (Louisville)	1
Spain	1
United Kingdom	1
United States	1

Assessments and Surveys

Early Childhood Longitudinal…	1
High School Longitudinal…	1
National Assessment of…	1
Progress in International…	1
Social Responsiveness Scale	1
Stages of Concern…	1

Showing all 13 results

New Developments in Measurement Invariance Testing: An Overview and Comparison of EFA-Based Approaches

Peer reviewed

Philipp Sterner; Kim De Roover; David Goretzko – Structural Equation Modeling: A Multidisciplinary Journal, 2025

When comparing relations and means of latent variables, it is important to establish measurement invariance (MI). Most methods to assess MI are based on confirmatory factor analysis (CFA). Recently, new methods have been developed based on exploratory factor analysis (EFA); most notably, as extensions of multi-group EFA, researchers introduced…

Descriptors: Error of Measurement, Measurement Techniques, Factor Analysis, Structural Equation Models

A Historic Review and Empirical Revitalization of the Stages of Concern Questionnaire

Peer reviewed

Kent Anderson Seidel – School Leadership Review, 2025

This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since developed in 1973, in early childhood, higher education, medical, business, community, and military settings. The SoCQ…

Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention

A Note on the Use of Categorical Subscores

Peer reviewed

Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025

Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…

Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment

Raters' Scoring Process in Assessment of Interpreting: An Empirical Study Based on Eye Tracking and Retrospective Verbalisation

Peer reviewed

Chao Han; Binghan Zheng; Mingqing Xie; Shirong Chen – Interpreter and Translator Trainer, 2024

Human raters' assessment of interpreting is a complex process. Previous researchers have mainly relied on verbal reports to examine this process. To advance our understanding, we conducted an empirical study, collecting raters' eye-movement and retrospection data in a computerised interpreting assessment in which three groups of raters (n = 35)…

Descriptors: Foreign Countries, College Students, College Graduates, Interrater Reliability

The Broad Autism Phenotype--International Test (BAP-IT): A Two-Domain-Based Test for the Assessment of the Broad Autism Phenotype

Peer reviewed

Marta Godoy-Giménez; Ángel García-Pérez; Fernando Cañadas; Angeles F. Estévez; Pablo Sayans-Jiménez – Autism: The International Journal of Research and Practice, 2024

The broad autism phenotype is the phenotypic expression of the primary characteristics of autism. However, currently available tests do not agree with the two-domain operationalization of broad autism phenotype or autism, and their internal structure has shown instability across applications. This study presents the Broad Autism…

Descriptors: Autism Spectrum Disorders, Genetics, Diagnostic Tests, Foreign Countries

Development of the Quantitative Modelling Observation Protocol (QMOP) for Undergraduate Biology Courses: Validity Evidence for Score Interpretation and Uses

Peer reviewed

Lyrica Lucas; Anum Khushal; Robert Mayes; Brian A. Couch; Joseph Dauer – International Journal of Science Education, 2025

Educational reform priorities such as emphasis on quantitative modelling (QM) have positioned undergraduate biology instructors as designers of QM experiences to engage students in authentic science practices that support the development of data-driven and evidence-based reasoning. Yet, little is known about how biology instructors adapt to the…

Descriptors: Undergraduate Students, College Science, Biology, Classroom Observation Techniques

Separation of Traits and Extreme Response Style in IRTree Models: The Role of Mimicry Effects for the Meaningful Interpretation of Estimates

Peer reviewed

Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024

Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…

Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity

A Rasch-Based Validation of the University of Tehran English Proficiency Test (UTEPT)

Peer reviewed

Shadi Noroozi; Hossein Karami – Language Testing in Asia, 2024

Recently, psychometricians and researchers have voiced their concern over the exploration of language test items in light of Messick's validation framework. Validity has been central to test development and use; however, it has not received due attention in language tests having grave consequences for test takers. The present study sought to…

Descriptors: Foreign Countries, Doctoral Students, Graduate Students, Language Proficiency

Re-Examining Measurement Invariance of School Climate Surveys across Race/Ethnicity

Peer reviewed

Stephen M. Leach; Jason C. Immekus; Jeffrey C. Valentine; Prathiba Batley; Dena Dossett; Tamara Lewis; Thomas Reece – Assessment for Effective Intervention, 2025

Educators commonly use school climate survey scores to inform and evaluate interventions for equitably improving learning and reducing educational disparities. Unfortunately, validity evidence to support these (and other) score uses often falls short. In response, Whitehouse et al. proposed a collaborative, two-part validity testing framework for…

Descriptors: School Surveys, Measurement, Hierarchical Linear Modeling, Educational Environment

Calibrating Items Using an Unfolding Model of Item Response Theory: The Case of the Trait Personality Questionnaire 5 (TPQue5)

Peer reviewed

Eirini M. Mitropoulou; Leonidas A. Zampetakis; Ioannis Tsaousis – Evaluation Review, 2024

Unfolding item response theory (IRT) models are important alternatives to dominance IRT models in describing the response processes on self-report tests. Their usage is common in personality measures, since they indicate potential differentiations in test score interpretation. This paper aims to gain a better insight into the structure of trait…

Descriptors: Foreign Countries, Adults, Item Response Theory, Personality Traits

NAEP Achievement Levels Validity Argument Report

Anne H. Davidson – National Assessment Governing Board, 2025

The purpose of this National Assessment of Educational Progress (NAEP) Achievement Levels Validity Argument Report is to synthesize evidence currently available to address the validity of the interpretations and uses of the NAEP Achievement Levels. Validity is the extent to which theory and evidence supports or refutes proposed and enacted test…

Descriptors: National Competency Tests, Academic Achievement, Test Validity, College Entrance Examinations

The Development of a Social-Emotional Competence and Well-Being Measure for Educators

Peer reviewed

Jingtong Pan; Kimberly Kendziora; Christina LiCalsi; Karthik Ramesh; George Stifel – Society for Research on Educational Effectiveness, 2024

Background/Context: Research consistently emphasizes the importance of social-emotional learning (SEL) in education settings (Cipriano et al., 2023; Wigelsworth et al., 2023). In addition, it has become increasingly evident that educators' social-emotional competence and well-being plays a crucial role in fostering SEL among students (Braun et…

Descriptors: Social Emotional Learning, Elementary School Teachers, Secondary School Teachers, Well Being

Charting the Future of Assessments. Research Report. ETS RR-24-13

Peer reviewed

Patrick Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Report Series, 2024

Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…

Descriptors: Assessment Literacy, Testing, Test Bias, Test Construction

Author

Amit Sevak	1
Angeles F. Estévez	1
Anne H. Davidson	1
Anum Khushal	1
Binghan Zheng	1
Brian A. Couch	1
Caroline M. Böhm	1
Chao Han	1
Christina LiCalsi	1
Daniel Fishtein	1
David Goretzko	1
Dena Dossett	1
Eirini M. Mitropoulou	1
Fernando Cañadas	1
George Stifel	1
Hossein Karami	1
Ikkyu Choi	1
Ioannis Tsaousis	1
Jason C. Immekus	1
Jeffrey C. Valentine	1
Jesse Sparks	1
Jingtong Pan	1
Joseph Dauer	1
Karthik Ramesh	1
Kent Anderson Seidel	1