Glazerman, Steven M.; Potamites, Liz – Mathematica Policy Research, Inc., 2011
There are many ways to use student test scores to evaluate schools. This paper defines and examines different estimators, including regression-based value-added indicators, average gains, and successive cohort differences in achievement levels. Given that regression-based indicators are theoretically preferred but not always feasible, we consider…
Descriptors: Academic Achievement, Accountability, Achievement Gains, Educational Indicators
Paek, Insu – ETS Research Report Series, 2009
Three statistical testing procedures well-known in the maximum likelihood approach are the Wald, likelihood ratio (LR), and score tests. Although well-known, these three testing procedures have not yet been rigorously applied in the logistic regression method to investigate differential item functioning (DIF). Employing a variety of…
Descriptors: Test Bias, Statistical Analysis, Regression (Statistics), Maximum Likelihood Statistics
MacInnes, Jann Marie Wise – ProQuest LLC, 2009
Multilevel data often exist in educational studies. The focus of this study is to consider differential item functioning (DIF) for dichotomous items from a multilevel perspective. One of the most often used methods for detecting DIF in dichotomously scored items is the Mantel-Haenszel log odds-ratio. However, the Mantel-Haenszel reduces the…
Descriptors: Test Bias, Simulation, Item Response Theory, Test Items
Davis, Susan L.; Buckendahl, Chad W. – Applied Measurement in Education, 2009
In response to a Congressional mandate, an evaluation of the National Assessment of Educational Progress (NAEP) was undertaken beginning in 2004. The evaluation design included a series of studies that encompassed the breadth and selected areas of depth of the NAEP program. Studies were identified with input from key stakeholders and were…
Descriptors: National Competency Tests, Evaluation Methods, Evaluation Criteria, Test Results
Sinharay, Sandip; Dorans, Neil J.; Grant, Mary C.; Blew, Edwin O. – Journal of Educational and Behavioral Statistics, 2009
Test administrators often face the challenge of detecting differential item functioning (DIF) with samples of size smaller than that recommended by experts. A Bayesian approach can incorporate, in the form of a prior distribution, existing information on the inference problem at hand, which yields more stable estimation, especially for small…
Descriptors: Test Bias, Computation, Bayesian Statistics, Data
Wuang, Yee-Pay; Lin, Yueh-Hsien; Su, Chwen-Yng – Research in Developmental Disabilities: A Multidisciplinary Journal, 2009
The Bruininks-Oseretsky Test of Motor Proficiency-Second Edition (BOT-2) is widely used to assess motor skills for both clinical and research purposes; however, its validity has not been adequately assessed in intellectual disabilities (ID). This study used the partial credit Rasch model to examine the measurement properties of the BOT-2 among 446…
Descriptors: Mental Retardation, Item Response Theory, Ability, Test Items
Weitzman, R. A. – Educational and Psychological Measurement, 2009
Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…
Descriptors: Item Response Theory, Test Items, Difficulty Level, Test Bias
Abrahams, Fatima; Friedrich, Christian; Tredoux, Nanette – Industry and Higher Education, 2012
South African higher education institutions are experiencing challenges regarding access, redress and the successful completion of programmes in an environment where there are still imbalances in the schooling system. Tools are needed that will assist with the process of selecting students. The aim of this study is to determine whether a test…
Descriptors: Higher Education, Test Results, Abstract Reasoning, Gender Differences
Jiang, Bo; Xu, Xiaoying; Garcia, Alicia; Lewis, Jennifer E. – Journal of Chemical Education, 2010
The Test of Logical Thinking (TOLT) and the Group Assessment of Logical Thinking (GALT) are two of the instruments most widely used by science educators and researchers to measure students' formal reasoning abilities. Based on Piaget's cognitive development theory, formal thinking ability has been shown to be essential for student achievement in…
Descriptors: Test Bias, Test Reliability, Chemistry, Logical Thinking
Penfield, Randall D.; Lee, Okhee – Journal of Research in Science Teaching, 2010
Recent test-based accountability policy in the U.S. has involved annually assessing all students in core subjects and holding schools accountable for adequate progress of all students by implementing sanctions when adequate progress is not met. Despite its potential benefits, basing educational policy on assessments developed for a student…
Descriptors: Science Tests, Student Diversity, Accountability, Minority Groups
Walpuski, Maik; Ropohl, Mathias; Sumfleth, Elke – Chemistry Education Research and Practice, 2011
In 2004 national educational standards for chemistry were implemented in Germany. While the standards describe different competencies to be reached after grade 10, no compulsory contents are defined. The contents are defined by the different German states individually. This means that there are no defined common topics taught in all states that…
Descriptors: Test Items, Foreign Countries, Reading Skills, Grade 10
Mitchelson, Jacqueline K.; Wicher, Eliza W.; LeBreton, James M.; Craig, S. Bartholomew – Educational and Psychological Measurement, 2009
The current study evaluates the measurement precision of the Abridged Big Five Circumplex (AB5C) of personality traits by identifying those items that demonstrate differential item functioning by gender and ethnicity. Differential item functioning is found in 33 of 45 (73%) of the AB5C scales, across gender and ethnic groups (Caucasian vs. African…
Descriptors: Personality Measures, Personality Traits, Test Bias, Ethnicity
Wu, Li-Tzy; Ringwalt, Christopher L.; Yang, Chongming; Reeve, Bryce B.; Pan, Jeng-Jong; Blazer, Dan G. – Journal of the American Academy of Child & Adolescent Psychiatry, 2009
DSM-IV's hierarchical distinction between abuse of and dependence on prescription opioids is not supported since the symptoms of abuse in adolescents are not less severe than dependence. The finding is based on the examination of the DSM-IV criteria for opioid use disorders using item response theory.
Descriptors: Test Bias, Adolescents, Item Response Theory, Drug Abuse
Helms, Janet E. – American Psychologist, 2009
In defending tests of cognitive abilities, knowledge, or skills (CAKS) from the skepticism of their "family members, friends, and neighbors" and aiding psychologists forced to defend tests from "myth and hearsay" in their own skeptical social networks (p. 215), Sackett, Borneman, and Connelly focused on evaluating validity coefficients, racial or…
Descriptors: Test Validity, Cognitive Ability, Error of Measurement, Test Bias
Keller, Christopher M.; Kros, John F. – Marketing Education Review, 2011
Measures of survey reliability are commonly addressed in marketing courses. One statistic of reliability is "Cronbach's alpha." This paper presents an application of survey reliability as a reflexive application of multiple-choice exam validation. The application provides an interactive decision support system that incorporates survey item…
Descriptors: Test Validity, Marketing, Test Reliability, Multiple Choice Tests

