Katherine E. Castellano; Daniel F. McCaffrey; Joseph A. Martineau – Educational Measurement: Issues and Practice, 2025
Growth-to-standard models evaluate student growth against the growth needed to reach a future standard or target of interest, such as proficiency. A common growth-to-standard model involves comparing the popular Student Growth Percentile (SGP) to Adequate Growth Percentiles (AGPs). AGPs follow from an involved process based on fitting a series of…
Descriptors: Student Evaluation, Growth Models, Student Educational Objectives, Educational Indicators
Sanford R. Student; Derek C. Briggs; Laurie Davis – Educational Measurement: Issues and Practice, 2025
Vertical scales are frequently developed using common item nonequivalent group linking. In this design, one can use upper-grade, lower-grade, or mixed-grade common items to estimate the linking constants that underlie the absolute measurement of growth. Using the Rasch model and a dataset from Curriculum Associates' i-Ready Diagnostic in math in…
Descriptors: Elementary School Mathematics, Elementary School Students, Middle School Mathematics, Middle School Students
Wilkerson, Judy R. – Educational Measurement: Issues and Practice, 2020
Validity and reliability are a major focus in teacher education accreditation by the Council for Accreditation of Educator Preparation (CAEP). CAEP requires the use of "accepted research standards," but many faculty and administrators are unsure how to meet this requirement. The Standards of Educational and Psychological Testing…
Descriptors: Test Construction, Test Validity, Test Reliability, Teacher Education Programs
Jones, Andrew T.; Kopp, Jason P.; Ong, Thai Q. – Educational Measurement: Issues and Practice, 2020
Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form…
Descriptors: Test Construction, Test Bias, Classification, Accuracy
Yocarini, Iris E.; Bouwmeester, Samantha; Smeets, Guus; Arends, Lidia R. – Educational Measurement: Issues and Practice, 2018
This real-data-guided simulation study systematically evaluated the decision accuracy of complex decision rules combining multiple tests within different realistic curricula. Specifically, complex decision rules combining conjunctive aspects and compensatory aspects were evaluated. A conjunctive aspect requires a minimum level of performance,…
Descriptors: Comparative Analysis, Decision Making, Accuracy, Higher Education
Attali, Yigal – Educational Measurement: Issues and Practice, 2019
Rater training is an important part of developing and conducting large-scale constructed-response assessments. As part of this process, candidate raters have to pass a certification test to confirm that they are able to score consistently and accurately before they begin scoring operationally. Moreover, many assessment programs require raters to…
Descriptors: Evaluators, Certification, High Stakes Tests, Scoring
Lewis, Charlie; Chajewski, Michael; Rupp, André A. – Educational Measurement: Issues and Practice, 2018
In this ITEMS module, we provide a two-part introduction to the topic of reliability from the perspective of "classical test theory" (CTT). In the first part, which is directed primarily at beginning learners, we review and build on the content presented in the original didactic ITEMS article by Traub and Rowley (1991). Specifically, we…
Descriptors: Test Reliability, Test Theory, Computation, Data Collection
Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019
Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…
Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests
Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018
Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…
Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability
Mislevy, Robert J.; Oliveri, Maria Elena – Educational Measurement: Issues and Practice, 2019
In this digital ITEMS module, Dr. Robert [Bob] Mislevy and Dr. Maria Elena Oliveri introduce and illustrate a sociocognitive perspective on educational measurement, which focuses on a variety of design and implementation considerations for creating fair and valid assessments for learners from diverse populations with diverse sociocultural…
Descriptors: Educational Testing, Reliability, Test Validity, Test Reliability
Jonson, Jessica L.; Trantham, Pamela; Usher-Tate, Betty Jean – Educational Measurement: Issues and Practice, 2019
One of the substantive changes in the 2014 Standards for Educational and Psychological Testing was the elevation of fairness in testing as a foundational element of practice in addition to validity and reliability. Previous research indicates that testing practices often do not align with professional standards and guidelines. Therefore, to raise…
Descriptors: Culture Fair Tests, Test Validity, Test Reliability, Intelligence Tests
Ferrara, Steve – Educational Measurement: Issues and Practice, 2017
Test security is not an end in itself; it is important because we want to be able to make valid interpretations from test scores. In this article, I propose a framework for comprehensive test security systems: prevention, detection, investigation, and resolution. The article discusses threats to test security, roles and responsibilities, rigorous…
Descriptors: Testing Programs, Educational Practices, Educational Policy, Program Improvement
Johnson, Evelyn S.; Crawford, Angela; Moylan, Laura A.; Zheng, Yuzhu – Educational Measurement: Issues and Practice, 2018
The evidence-centered design framework was used to create a special education teacher observation system, Recognizing Effective Special Education Teachers. Extensive reviews of research informed the domain analysis and modeling stages, and led to the conceptual framework in which effective special education teaching is operationalized as the…
Descriptors: Evidence Based Practice, Special Education Teachers, Observation, Disabilities
Castellano, Katherine E.; McCaffrey, Daniel F. – Educational Measurement: Issues and Practice, 2017
Mean or median student growth percentiles (MGPs) are a popular measure of educator performance, but they lack rigorous evaluation. This study investigates the error in MGP due to test score measurement error (ME). Using analytic derivations, we find that errors in the commonly used MGP are correlated with average prior latent achievement: Teachers…
Descriptors: Teacher Evaluation, Teacher Effectiveness, Value Added Models, Achievement Gains
Furtak, Erin Marie; Ruiz-Primo, Maria Araceli; Bakeman, Roger – Educational Measurement: Issues and Practice, 2017
Formative assessment is a classroom practice that has received much attention in recent years for its established potential at increasing student learning. A frequent analytic approach for determining the quality of formative assessment practices is to develop a coding scheme and determine frequencies with which the codes are observed; however,…
Descriptors: Sequential Approach, Formative Evaluation, Alternative Assessment, Incidence