Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 7 |
Since 2006 (last 20 years) | 20 |
Descriptor
Scores | 33 |
Validity | 18 |
Test Validity | 12 |
Test Construction | 9 |
Test Interpretation | 8 |
Test Items | 7 |
Decision Making | 5 |
Item Response Theory | 5 |
Scoring | 5 |
Student Evaluation | 5 |
Achievement Tests | 4 |
Source
Applied Measurement in… | 33 |
Author
Wise, Steven L. | 4 |
Carney, Michele | 3 |
Linn, Robert L. | 3 |
Huff, Kristen | 2 |
Mehrens, William A. | 2 |
Sawyer, Richard | 2 |
Bejar, Isaac I. | 1 |
Cavey, Laurie | 1 |
Champion, Joe | 1 |
Crawford, Angela | 1 |
Dunbar, Stephen B. | 1 |
Publication Type
Journal Articles | 33 |
Reports - Research | 17 |
Reports - Evaluative | 13 |
Reports - Descriptive | 2 |
Speeches/Meeting Papers | 2 |
Book/Product Reviews | 1 |
Information Analyses | 1 |
Reports - General | 1 |
Education Level
Higher Education | 6 |
High Schools | 5 |
Secondary Education | 4 |
Grade 8 | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Postsecondary Education | 2 |
Grade 12 | 1 |
Grade 4 | 1 |
Grade 9 | 1 |
Location
California (Los Angeles) | 1 |
New York | 1 |
Norway | 1 |
Slovenia | 1 |
Sweden | 1 |
Assessments and Surveys
Bar Examinations | 1 |
Law School Admission Test | 1 |
National Assessment of… | 1 |
Program for International… | 1 |
SAT (College Admission Test) | 1 |
Trends in International… | 1 |
Carney, Michele; Paulding, Katie; Champion, Joe – Applied Measurement in Education, 2022
Teachers need ways to efficiently assess students' cognitive understanding. One promising approach involves easily adapted and administered item types that yield quantitative scores that can be interpreted in terms of whether or not students likely possess key understandings. This study illustrates an approach to analyzing response process…
Descriptors: Middle School Students, Logical Thinking, Mathematical Logic, Problem Solving
Mo, Ya; Carney, Michele; Cavey, Laurie; Totorica, Tatia – Applied Measurement in Education, 2021
There is a need for assessment items that assess complex constructs but can also be efficiently scored for evaluation of teacher education programs. In an effort to measure the construct of teacher attentiveness in an efficient and scalable manner, we are using exemplar responses elicited by constructed-response item prompts to develop…
Descriptors: Protocol Analysis, Test Items, Responses, Mathematics Teachers
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
Wise, Steven L. – Applied Measurement in Education, 2015
Whenever the purpose of measurement is to inform an inference about a student's achievement level, it is important that we be able to trust that the student's test score accurately reflects what that student knows and can do. Such trust requires the assumption that a student's test event is not unduly influenced by construct-irrelevant factors…
Descriptors: Achievement Tests, Scores, Validity, Test Items
Oliveri, Maria; McCaffrey, Daniel; Ezzo, Chelsea; Holtzman, Steven – Applied Measurement in Education, 2017
The assessment of noncognitive traits is challenging due to possible response biases, "subjectivity" and "faking." Standardized third-party evaluations where an external evaluator rates an applicant on their strengths and weaknesses on various noncognitive traits are a promising alternative. However, accurate score-based…
Descriptors: Factor Analysis, Decision Making, College Admission, Likert Scales
Schmidgall, Jonathan – Applied Measurement in Education, 2017
This study utilizes an argument-based approach to validation to examine the implications of reliability in order to further differentiate the concepts of score and decision consistency. In a methodological example, the framework of generalizability theory was used to estimate appropriate indices of score consistency and evaluations of the…
Descriptors: Scores, Reliability, Validity, Generalizability Theory
Steedle, Jeffrey T. – Applied Measurement in Education, 2014
Possible lack of motivation is a perpetual concern when tests have no stakes attached to performance. Specifically, the validity of test score interpretations may be compromised when examinees are unmotivated to exert their best efforts. Motivation filtering, a procedure that filters out apparently unmotivated examinees, was applied to the…
Descriptors: College Outcomes Assessment, Student Motivation, Sampling, Validity
Sawyer, Richard – Applied Measurement in Education, 2013
Correlational evidence suggests that high school GPA is better than admission test scores in predicting first-year college GPA, although test scores have incremental predictive validity. The usefulness of a selection variable in making admission decisions depends in part on its predictive validity, but also on institutions' selectivity and…
Descriptors: High Schools, Grade Point Average, College Entrance Examinations, College Admission
Eklöf, Hanna; Pavešić, Barbara Japelj; Grønmo, Liv Sissel – Applied Measurement in Education, 2014
The purpose of the study was to measure students' reported test-taking effort and the relationship between reported effort and performance on the Trends in International Mathematics and Science Study (TIMSS) Advanced mathematics test. This was done in three countries participating in TIMSS Advanced 2008 (Sweden, Norway, and Slovenia), and the…
Descriptors: Mathematics Tests, Cross Cultural Studies, Foreign Countries, Correlation
Huff, Kristen; Steinberg, Linda; Matts, Thomas – Applied Measurement in Education, 2010
The cornerstone of evidence-centered assessment design (ECD) is an evidentiary argument that requires that each target of measurement (e.g., learning goal) for an assessment be expressed as a "claim" to be made about an examinee that is relevant to the specific purpose and audience(s) for the assessment. The "observable evidence" required to…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction
Wise, Lauress L. – Applied Measurement in Education, 2010
The articles in this special issue make two important contributions to our understanding of the impact of accommodations on test score validity. First, they illustrate a variety of methods for collection and rigorous analyses of empirical data that can supplant expert judgment of the impact of accommodations. These methods range from internal…
Descriptors: Reading Achievement, Educational Assessment, Test Reliability, Learning Disabilities
Stone, Clement A.; Ye, Feifei; Zhu, Xiaowen; Lane, Suzanne – Applied Measurement in Education, 2010
Although reliability of subscale scores may be suspect, subscale scores are the most common type of diagnostic information included in student score reports. This research compared methods for augmenting the reliability of subscale scores for an 8th-grade mathematics assessment. Yen's Objective Performance Index, Wainer et al.'s augmented scores,…
Descriptors: Item Response Theory, Case Studies, Reliability, Scores
Wolf, Mikyung Kim; Kim, Jinok; Kao, Jenny – Applied Measurement in Education, 2012
Glossary and reading aloud test items are commonly allowed in many states' accommodation policies for English language learner (ELL) students for large-scale mathematics assessments. However, little research is available regarding the effects of these accommodations on ELL students' performance. Further, no research exists that examines how…
Descriptors: Testing Accommodations, Glossaries, Reading Aloud to Others, Validity