Traynor, Anne; Christopherson, Sara C. – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Wise, Steven L. – Applied Measurement in Education, 2020
In achievement testing there is typically a practical requirement that the set of items administered should be representative of some target content domain. This is accomplished by establishing test blueprints specifying the content constraints to be followed when selecting the items for a test. Sometimes, however, students give disengaged…
Descriptors: Test Items, Test Content, Achievement Tests, Guessing (Tests)
Pools, Elodie – Applied Measurement in Education, 2022
Many low-stakes assessments, such as international large-scale surveys, are administered during time-limited testing sessions and some test-takers are not able to endorse the last items of the test, resulting in not-reached (NR) items. However, because the test has no consequence for the respondents, these NR items can also stem from quitting the…
Descriptors: Achievement Tests, Foreign Countries, International Assessment, Secondary School Students
Traynor, Anne; Li, Tingxuan; Zhou, Shuqi – Applied Measurement in Education, 2020
During the development of large-scale school achievement tests, panels of independent subject-matter experts use systematic judgmental methods to rate the correspondence between a given test's items and performance objective statements. The individual experts' ratings may then be used to compute summary indices to quantify the match between a…
Descriptors: Alignment (Education), Achievement Tests, Curriculum, Error of Measurement
El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Haladyna, Thomas M.; Rodriguez, Michael C.; Stevens, Craig – Applied Measurement in Education, 2019
The evidence is mounting regarding the guidance to employ more three-option multiple-choice items. From theoretical analyses, empirical results, and practical considerations, such items are of equal or higher quality than four- or five-option items, and more items can be administered to improve content coverage. This study looks at 58 tests,…
Descriptors: Multiple Choice Tests, Test Items, Testing Problems, Guessing (Tests)
Wise, Steven L. – Applied Measurement in Education, 2015
Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…
Descriptors: Achievement Tests, Scores, Validity, Test Items
Cohen, Dale J.; Zhang, Jin; Wothke, Werner – Applied Measurement in Education, 2019
Construct-irrelevant cognitive complexity of some items in the statewide grade-level assessments may impose performance barriers for students with disabilities who are ineligible for alternate assessments based on alternate achievement standards. This has spurred research into whether items can be modified to reduce complexity without affecting…
Descriptors: Test Items, Accessibility (for Disabled), Students with Disabilities, Low Achievement
Michaelides, Michalis P. – Applied Measurement in Education, 2019
The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively- and negatively keyed items. The current research…
Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests
George, Ann Cathrice; Robitzsch, Alexander – Applied Measurement in Education, 2018
This article presents a new perspective on measuring gender differences in the large-scale assessment study Trends in International Mathematics and Science Study (TIMSS). The suggested empirical model is directly based on the theoretical competence model of the domain mathematics and thus includes the interaction between content and cognitive sub-competencies.…
Descriptors: Achievement Tests, Elementary Secondary Education, Mathematics Achievement, Mathematics Tests
Traynor, Anne – Applied Measurement in Education, 2017
It has long been argued that U.S. states' differential performance on nationwide assessments may reflect differences in students' opportunity to learn the tested content that is primarily due to variation in curricular content standards, rather than in instructional quality or educational investment. To quantify the effect of differences in…
Descriptors: Test Items, Difficulty Level, State Standards, Academic Standards
Wan, Lei; Henly, George A. – Applied Measurement in Education, 2012
Many innovative item formats have been proposed over the past decade, but little empirical research has been conducted on their measurement properties. This study examines the reliability, efficiency, and construct validity of two innovative item formats--the figural response (FR) and constructed response (CR) formats used in a K-12 computerized…
Descriptors: Test Items, Test Format, Computer Assisted Testing, Measurement
Wise, Steven L.; Pastor, Dena A.; Kong, Xiaojing J. – Applied Measurement in Education, 2009
Previous research has shown that rapid-guessing behavior can degrade the validity of test scores from low-stakes proficiency tests. This study examined, using hierarchical generalized linear modeling, examinee and item characteristics for predicting rapid-guessing behavior. Several item characteristics were found significant; items with more text…
Descriptors: Guessing (Tests), Achievement Tests, Correlation, Test Items
Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011
The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…
Descriptors: Test Items, Item Analysis, Achievement Tests, French