Weigl, Klemens; Forstner, Thomas – Educational and Psychological Measurement, 2021
Paper-based visual analogue scale (VAS) items were developed 100 years ago. Although they gained great popularity in clinical and medical research for assessing pain, they have been scarcely applied in other areas of psychological research for several decades. However, since the beginning of digitization, VAS have attracted growing interest among…
Descriptors: Test Construction, Visual Measures, Gender Differences, Foreign Countries
Jin, Kuan-Yu; Eckes, Thomas – Educational and Psychological Measurement, 2022
Performance assessments heavily rely on human ratings. These ratings are typically subject to various forms of error and bias, threatening the assessment outcomes' validity and fairness. Differential rater functioning (DRF) is a special kind of threat to fairness manifesting itself in unwanted interactions between raters and performance- or…
Descriptors: Performance Based Assessment, Rating Scales, Test Bias, Student Evaluation
Liu, Xiaowen; Rogers, H. Jane – Educational and Psychological Measurement, 2022
Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…
Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity
Walters, Glenn D.; Espelage, Dorothy L. – Educational and Psychological Measurement, 2019
The purpose of this study was to investigate the latent structure type (categorical vs. dimensional) of bullying perpetration in a large sample of middle school students. A nine-item bullying scale was administered to 1,222 (625 boys, 597 girls) early adolescents enrolled in middle schools in a Midwestern state. Based on the results of a principal…
Descriptors: Early Adolescents, Bullying, Middle School Students, Scores
Stoevenbelt, Andrea H.; Wicherts, Jelte M.; Flore, Paulette C.; Phillips, Lorraine A. T.; Pietschnig, Jakob; Verschuere, Bruno; Voracek, Martin; Schwabe, Inga – Educational and Psychological Measurement, 2023
When cognitive and educational tests are administered under time limits, tests may become speeded and this may affect the reliability and validity of the resulting test scores. Prior research has shown that time limits may create or enlarge gender gaps in cognitive and academic testing. On average, women complete fewer items than men when a test…
Descriptors: Timed Tests, Gender Differences, Item Response Theory, Correlation
Lee, HyeSun; Smith, Weldon Z. – Educational and Psychological Measurement, 2020
Based on the framework of testlet models, the current study suggests the Bayesian random block item response theory (BRB IRT) model to fit forced-choice formats where an item block is composed of three or more items. To account for local dependence among items within a block, the BRB IRT model incorporated a random block effect into the response…
Descriptors: Bayesian Statistics, Item Response Theory, Monte Carlo Methods, Test Format
Engelhard, George, Jr.; Rabbitt, Matthew P.; Engelhard, Emily M. – Educational and Psychological Measurement, 2018
This study focuses on model-data fit with a particular emphasis on household-level fit within the context of measuring household food insecurity. Household fit indices are used to examine the psychometric quality of household-level measures of food insecurity. In the United States, measures of food insecurity are commonly obtained from the U.S.…
Descriptors: Food, Hunger, Psychometrics, Low Income Groups
Marcoulides, Katerina M.; Grimm, Kevin J. – Educational and Psychological Measurement, 2017
Synthesizing results from multiple studies is a daunting task during which researchers must tackle a variety of challenges. The task is even more demanding when studying developmental processes longitudinally and when different instruments are used to measure constructs. Data integration methodology is an emerging field that enables researchers to…
Descriptors: Growth Models, Longitudinal Studies, Mathematics Skills, Achievement Tests
Wang, Qiu; Diemer, Matthew A.; Maier, Kimberly S. – Educational and Psychological Measurement, 2013
This study integrated Bayesian hierarchical modeling and receiver operating characteristic analysis (BROCA) to evaluate how interest strength (IS) and interest differentiation (ID) predicted low–socioeconomic status (SES) youth's interest-major congruence (IMC). Using large-scale Kuder Career Search online-assessment data, this study fit three…
Descriptors: Bayesian Statistics, Socioeconomic Status, Student Interests, Gender Differences
Okumura, Taichi – Educational and Psychological Measurement, 2014
This study examined the empirical differences between the tendency to omit items and reading ability by applying tree-based item response (IRTree) models to the Japanese data of the Programme for International Student Assessment (PISA) held in 2009. For this purpose, existing IRTree models were expanded to contain predictors and to handle…
Descriptors: Foreign Countries, Item Response Theory, Test Items, Reading Ability
Wetzel, Eunike; Xu, Xueli; von Davier, Matthias – Educational and Psychological Measurement, 2015
In large-scale educational surveys, a latent regression model is used to compensate for the shortage of cognitive information. Conventionally, the covariates in the latent regression model are principal components extracted from background data. This operational method has several important disadvantages, such as the handling of missing data and…
Descriptors: Surveys, Regression (Statistics), Models, Research Methodology
Liu, Ou Lydia; Bridgeman, Brent; Gu, Lixiong; Xu, Jun; Kong, Nan – Educational and Psychological Measurement, 2015
Research on examinees' response changes on multiple-choice tests over the past 80 years has yielded some consistent findings, including that most examinees make score gains by changing answers. This study expands the research on response changes by focusing on a high-stakes admissions test--the Verbal Reasoning and Quantitative Reasoning measures…
Descriptors: College Entrance Examinations, High Stakes Tests, Graduate Study, Verbal Ability
Albano, Anthony D.; Rodriguez, Michael C. – Educational and Psychological Measurement, 2013
Although a substantial amount of research has been conducted on differential item functioning in testing, studies have focused on detecting differential item functioning rather than on explaining how or why it may occur. Some recent work has explored sources of differential functioning using explanatory and multilevel item response models. This…
Descriptors: Test Bias, Hierarchical Linear Modeling, Gender Differences, Educational Opportunities
Kaliski, Pamela K.; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna L.; Plake, Barbara S.; Reshetar, Rosemary A. – Educational and Psychological Measurement, 2013
The many-faceted Rasch (MFR) model has been used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR model for examining the quality of ratings obtained from a standard…
Descriptors: Item Response Theory, Models, Standard Setting (Scoring), Science Tests
Mitchelson, Jacqueline K.; Wicher, Eliza W.; LeBreton, James M.; Craig, S. Bartholomew – Educational and Psychological Measurement, 2009
The current study evaluates the measurement precision of the Abridged Big Five Circumplex (AB5C) of personality traits by identifying those items that demonstrate differential item functioning by gender and ethnicity. Differential item functioning is found in 33 of 45 (73%) of the AB5C scales, across gender and ethnic groups (Caucasian vs. African…
Descriptors: Personality Measures, Personality Traits, Test Bias, Ethnicity