Henninger, Mirka; Debelak, Rudolf; Strobl, Carolin – Educational and Psychological Measurement, 2023
To detect differential item functioning (DIF), Rasch trees search for optimal split-points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF…
Descriptors: Item Response Theory, Test Items, Effect Size, Statistical Significance
Lions, Séverin; Dartnell, Pablo; Toledo, Gabriela; Godoy, María Inés; Córdova, Nora; Jiménez, Daniela; Lemarié, Julie – Educational and Psychological Measurement, 2023
Even though the impact of the position of response options on answers to multiple-choice items has been investigated for decades, it remains debated. Research on this topic is inconclusive, perhaps because too few studies have obtained experimental data from large-sized samples in a real-world context and have manipulated the position of both…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Responses
Konstantopoulos, Spyros; Li, Wei; Miller, Shazia; van der Ploeg, Arie – Educational and Psychological Measurement, 2019
This study discusses quantile regression methodology and its usefulness in education and social science research. First, quantile regression is defined and its advantages vis-à-vis ordinary least squares regression are illustrated. Second, specific comparisons are made between ordinary least squares and quantile regression methods. Third, the…
Descriptors: Regression (Statistics), Statistical Analysis, Educational Research, Social Science Research
Huggins-Manley, Anne Corinne – Educational and Psychological Measurement, 2017
This study defines subpopulation item parameter drift (SIPD) as a change in item parameters over time that is dependent on subpopulations of examinees, and hypothesizes that the presence of SIPD in anchor items is associated with bias and/or lack of invariance in three psychometric outcomes. Results show that SIPD in anchor items is associated…
Descriptors: Psychometrics, Test Items, Item Response Theory, Hypothesis Testing
Dimitrov, Dimiter M.; Raykov, Tenko; AL-Qataee, Abdullah Ali – Educational and Psychological Measurement, 2015
This article is concerned with developing a measure of general academic ability (GAA) for high school graduates who apply to colleges, as well as with the identification of optimal weights of the GAA indicators in a linear combination that yields a composite score with maximal reliability and maximal predictive validity, employing the framework of…
Descriptors: Foreign Countries, Academic Ability, Aptitude Tests, High School Students
Attali, Yigal – Educational and Psychological Measurement, 2011
Contrary to previous research on sequential ratings of student performance, this study found that professional essay raters of a large-scale standardized testing program produced ratings that were drawn toward previous ratings, creating an assimilation effect. Longer intervals between the two adjacent ratings and higher degree of agreement with…
Descriptors: Essay Tests, Standardized Tests, Sequential Approach, Test Bias
Engelhard, George, Jr. – Educational and Psychological Measurement, 2011
The purpose of this study is to describe a new approach for evaluating the judgments of standard-setting panelists within the context of the bookmark procedure. The bookmark procedure is widely used for setting performance standards on high-stakes assessments. A many-faceted Rasch (MFR) model is proposed for evaluating the bookmark judgments of…
Descriptors: Educational Assessment, Performance Based Assessment, Grade 3, Evaluation Methods
Kobrin, Jennifer L.; Kim, YoungKoung; Sackett, Paul R. – Educational and Psychological Measurement, 2012
There is much debate on the merits and pitfalls of standardized tests for college admission, with questions regarding the format (multiple-choice vs. constructed response), cognitive complexity, and content of these assessments (achievement vs. aptitude) at the forefront of the discussion. This study addressed these questions by investigating the…
Descriptors: Grade Point Average, Standardized Tests, Predictive Validity, Predictor Variables
Scherbaum, Charles A.; Goldstein, Harold W. – Educational and Psychological Measurement, 2008
Recent research examining racial differences on standardized cognitive tests has focused on the impact of test item difficulty. Studies using data from the SAT and GRE have reported a correlation between item difficulty and differential item functioning (DIF) such that minority test takers are less likely than majority test takers to respond…
Descriptors: Race, Test Items, Standardized Tests, Cognitive Tests
Nugent, William R. – Educational and Psychological Measurement, 2006
One of the most important effect sizes used in meta-analysis is the standardized mean difference (SMD). In this article, the conditions under which SMD effect sizes based on different measures of the same construct are directly comparable are investigated. The results show that SMD effect sizes from different measures of the same construct are…
Descriptors: Effect Size, Meta Analysis, True Scores, Error of Measurement
Chang, Shun-Wen – Educational and Psychological Measurement, 2006
This study evaluates the effects of employing the linear, normalizing, and arcsine transformation methods for constructing scale scores on the Basic Competence Test (BCTEST). Tests in three subject areas (Chinese, English, and Mathematics) were studied using the data of test administrations from 2001 to 2003. The resulting scale scores for each…
Descriptors: Standardized Tests, Achievement Tests, Test Theory, True Scores