Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023
The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…
Descriptors: Scoring, Tests, Evaluation Methods, Test Items
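As a rough illustration of the aggregate scoring idea mentioned in the abstract above, the sketch below uses one commonly described convention: each response option earns credit equal to the number of panel experts who chose it divided by the number who chose the most popular (modal) option. This is an assumption made for illustration and may differ from the variant the authors examine. The function names, the panel data, and the examinee responses are all hypothetical.

```python
from collections import Counter


def aggregate_weights(expert_answers):
    """Credit per option = experts choosing it / experts choosing the modal option."""
    counts = Counter(expert_answers)
    modal = max(counts.values())
    return {option: n / modal for option, n in counts.items()}


def score_examinee(responses, panel):
    """Sum the credit earned on each item; options no expert chose earn 0."""
    total = 0.0
    for item, answer in responses.items():
        weights = aggregate_weights(panel[item])
        total += weights.get(answer, 0.0)
    return total


# Hypothetical reference panel: ten expert ratings per item.
panel = {
    "item1": [1, 1, 1, 0, 0, -1, 1, 1, 0, 1],
    "item2": [2, 2, 1, 1, 2, 2, 2, 1, 2, 2],
}

# Hypothetical examinee responses, keyed by item.
examinee = {"item1": 0, "item2": 2}

print(score_examinee(examinee, panel))
```

Because the weights come entirely from the panel, scoring each expert with the same function shows how far the panel's own average falls below the maximum possible score, which is the quantity the abstract flags as a critical issue.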
Sophie Lilit Litschwartz – ProQuest LLC, 2021
In education research, test scores are a common object of analysis. Across studies, test scores can be an important outcome, a highly predictive covariate, or a means of assigning treatment. However, test scores are a measure of an underlying proficiency we can't observe directly and so contain error. This measurement error has implications for how…
Descriptors: Scores, Inferences, Educational Research, Evaluation Methods
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Xue Zhang; Chun Wang – Grantee Submission, 2022
Item-level fit analysis not only serves as a complementary check to global fit analysis, it is also essential in scale development because the fit results will guide item revision and/or deletion (Liu & Maydeu-Olivares, 2014). During data collection, missing response data may likely happen due to various reasons. Chi-square-based item fit…
Descriptors: Goodness of Fit, Item Response Theory, Scores, Test Length
Dumas, Denis; McNeish, Daniel; Greene, Jeffrey A. – Educational Psychologist, 2020
Scholars have lamented that current methods of assessing student performance do not align with contemporary views of learning as situated within students, contexts, and time. Here, we introduce and describe one theoretical-psychometric paradigm, termed "dynamic measurement," designed to provide a valid representation of the way students…
Descriptors: Alternative Assessment, Psychometrics, Educational Psychology, Student Evaluation
Clark McKown; Nicole Russo-Ponsaran; Matthew Wronski; Ashley Karls – Grantee Submission, 2025
This study describes the rationale, design, development, and technical properties of SELweb MS, a direct assessment of social and emotional competencies in middle school students. Assessment and item design were iteratively developed with input from youth and experts to measure five domains: Self-Awareness, Self-Management, Social Awareness,…
Descriptors: Psychometrics, Social Emotional Learning, Middle School Students, Correlation
Weinstein, Theresa J.; Ceh, Simon Majed; Meinel, Christoph; Benedek, Mathias – Creativity Research Journal, 2022
Evaluating creativity of verbal responses or texts is a challenging task due to psychometric issues associated with subjective ratings and the peculiarities of textual data. We explore an approach to objectively assess the creativity of responses in a sentence generation task to (1) better understand what language-related aspects are valued by…
Descriptors: Creativity, Sentences, Natural Language Processing, Computation
Keng, Leslie; Boyer, Michelle – National Center for the Improvement of Educational Assessment, 2020
ACT requested assistance from the National Center for the Improvement of Educational Assessment (Center for Assessment) to investigate declines of scores for states administering the ACT to its 11th grade students in 2018. This request emerged from conversations among state leaders, the Center for Assessment, and ACT in trying to understand the…
Descriptors: College Entrance Examinations, Scores, Test Score Decline, Educational Trends
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2018
Dynamic measurement modeling (DMM) has been shown to improve the consequential validity of longitudinal mathematics assessment in the Early Childhood Longitudinal Study-Kindergarten (ECLS-K) database. Here, the authors demonstrate the capability of DMM to similarly improve the consequential validity of ECLS-K reading assessment through the…
Descriptors: Measurement Techniques, Student Evaluation, Alternative Assessment, Evaluation Methods
Leventhal, Brian – ProQuest LLC, 2017
More robust and rigorous psychometric models, such as multidimensional Item Response Theory models, have been advocated for survey applications. However, item responses may be influenced by construct-irrelevant variance factors such as preferences for extreme response options. Through empirical and simulation methods, this study evaluates the use…
Descriptors: Psychometrics, Item Response Theory, Simulation, Models
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity
Raikes, Abbie; Sayre, Rebecca; Davis, Dawn; Anderson, Kate; Hyson, Marilou; Seminario, Evelyn; Burton, Anna – Early Years: An International Journal of Research and Development, 2019
Measuring Early Learning Quality & Outcomes (MELQO) was initiated to address needs for child development and quality of early childhood education (ECE) data, specifically for low- and middle-income countries. Drawing from existing tools, MELQO convened a consortium to create open-source tools to be adapted to national contexts, simultaneously…
Descriptors: Educational Quality, Outcomes of Education, Child Development, Early Childhood Education
Wedman, Jonathan; Lyrén, Per-Erik – Practical Assessment, Research & Evaluation, 2015
When subscores on a test are reported to the test taker, the appropriateness of reporting them depends on whether they provide useful information above what is provided by the total score. Subscores that fail to do so lack adequate psychometric quality and should not be reported. There are several methods for examining the quality of subscores,…
Descriptors: Evaluation Methods, Psychometrics, Scores, Tests
Sriram, Rishi – Journal of Student Affairs Research and Practice, 2014
The study of competencies in student affairs began more than 4 decades ago, but no instrument currently exists to measure competencies broadly. This study builds upon previous research by developing an instrument to measure student affairs competencies. Results not only validate the competencies espoused by NASPA and ACPA, but also suggest adding…
Descriptors: Reliability, Psychometrics, Student Personnel Services, Student Personnel Workers
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2017
Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the…
Descriptors: Measurement Techniques, Student Evaluation, Evaluation Methods, Testing