Publication Date
In 2025 | 3 |
Since 2024 | 13 |
Since 2021 (last 5 years) | 20 |
Since 2016 (last 10 years) | 45 |
Since 2006 (last 20 years) | 81 |
Descriptor
Error of Measurement | 164 |
Test Construction | 164 |
Test Reliability | 61 |
Test Items | 50 |
Test Validity | 41 |
Item Response Theory | 40 |
Scores | 30 |
Item Analysis | 27 |
Psychometrics | 26 |
Equated Scores | 20 |
Achievement Tests | 19 |
Author
Haladyna, Tom | 4 |
Alonzo, Julie | 3 |
Brennan, Robert L. | 3 |
Hambleton, Ronald K. | 3 |
Livingston, Samuel A. | 3 |
Lord, Frederic M. | 3 |
Roid, Gale | 3 |
Solano-Flores, Guillermo | 3 |
Tindal, Gerald | 3 |
Dever, Jill A. | 2 |
Dorans, Neil J. | 2 |
Education Level
Elementary Education | 16 |
Secondary Education | 15 |
Higher Education | 13 |
Postsecondary Education | 11 |
Grade 3 | 10 |
Middle Schools | 10 |
Grade 4 | 9 |
Junior High Schools | 9 |
Early Childhood Education | 8 |
Grade 5 | 8 |
Grade 8 | 8 |
Audience
Researchers | 6 |
Practitioners | 1 |
Students | 1 |
Teachers | 1 |
Location
New York | 5 |
Canada | 3 |
Australia | 2 |
Japan | 2 |
New Mexico | 2 |
Turkey | 2 |
Arkansas | 1 |
Chile | 1 |
Colorado (Boulder) | 1 |
Denmark | 1 |
Ethiopia | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Race to the Top | 1 |
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
Philipp Sterner; Kim De Roover; David Goretzko – Structural Equation Modeling: A Multidisciplinary Journal, 2025
When comparing relations and means of latent variables, it is important to establish measurement invariance (MI). Most methods to assess MI are based on confirmatory factor analysis (CFA). Recently, new methods have been developed based on exploratory factor analysis (EFA); most notably, as extensions of multi-group EFA, researchers introduced…
Descriptors: Error of Measurement, Measurement Techniques, Factor Analysis, Structural Equation Models
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Stella Y. Kim; Carl Westine; Tong Wu; Derek Maher – Journal of College Student Retention: Research, Theory & Practice, 2024
The primary purpose of this study is to validate a student engagement measure for its use in evaluation of a learning assistant (LA) program. A series of psychometric evaluations were made for both the original scale of Higher Education Student Engagement Scale (HESES) and its adapted version designed to be used in gauging the effectiveness of…
Descriptors: Learner Engagement, Teaching Assistants, Test Validity, Test Reliability
Nicolas Pichot; Boris Forthmann; Eric Bonetto; Thomas Arciszewski; Nathalie Bonnardel; Sara Jaubert; Jean B. Pavani – Journal of Creative Behavior, 2024
The term "creative" is commonly used in everyday language and in academic discourse to discuss the nature of artistic and innovative productions. This usage inherently implies the existence of a variable of creativity that allows different creative works to be compared. The standard definition of creativity asserts that a production must…
Descriptors: Creativity, Test Construction, Test Validity, Productive Thinking
Practical Considerations in Choosing an Anchor Test Form for Equating under the Random Groups Design
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful considerations are necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
Montserrat Beatriz Valdivia Medinaceli – ProQuest LLC, 2023
My dissertation examines three current challenges of international large-scale assessments (ILSAs) associated with the transition from linear testing to an adaptive testing design. ILSAs are important for making comparisons among populations and informing countries about the quality of their educational systems. ILSAs' results inform policymakers…
Descriptors: International Assessment, Achievement Tests, Adaptive Testing, Test Items
G. R. Quintana; I. Dufraix; J. I. Escudero-Pasten; J. F. Santibáñez-Palma; C. Figueroa-Grenett – Cogent Education, 2024
Scientific research is vital for students' education, fostering critical thinking and problem-solving skills and deepening subject knowledge. To assess students' attitudes towards research, the Attitude Towards Research Scale (EACIN) was developed. This study addresses three gaps regarding this instrument: inconsistent latent structure, lack of…
Descriptors: Foreign Countries, Undergraduate Students, Psychometrics, Gender Differences
Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
AL-Dossary, Saeed A.; Almohayya, Bander M. – Psychology in the Schools, 2024
The present study aims to validate the Flourishing Scale (FS) in a convenience sample of 233 special education teachers. The FS's psychometric properties were investigated using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA). EFA had a one-factor solution that explained 49.9% of the variance, a Cronbach's alpha internal…
Descriptors: Error of Measurement, Arabic, Test Construction, Special Education Teachers
D. Steger; S. Weiss; O. Wilhelm – Creativity Research Journal, 2023
Creativity can be measured with a variety of methods including self-reports, others' reports, and ability tests. While typical self-reports are best understood as weak proxies of creativity, biographical reports that assess previous creative activities seem more promising. Drawbacks of such measures -- including skewed item distributions, a lack of…
Descriptors: Creativity, Creativity Tests, Test Construction, Algorithms
Frazier, Thomas W.; Khaliq, Izma; Scullin, Keeley; Uljarevic, Mirko; Shih, Andy; Karpur, Arun – Journal of Autism and Developmental Disorders, 2023
At present, there are no brief, freely-available, informant-report measures that evaluate key challenging behaviors relevant to youth with autism spectrum disorder (ASD) or other developmental disabilities (DD). This paper describes the development, refinement, and initial psychometric evaluation of a new 18-item measure, the Open-Source…
Descriptors: Test Construction, Psychometrics, Behavior Problems, Autism Spectrum Disorders
Firdissa J. Aga – Intersection: A Journal at the Intersection of Assessment and Learning, 2024
The study investigated hurdles to the quality of student learning assessment by examining issues related to assessment procedures and practices, learners and learning, learning resources and test constructs, and test administration and feedback. Quantitative and qualitative data were collected from two Ethiopian universities using two types of…
Descriptors: Foreign Countries, College Faculty, College Students, Test Construction
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement