Roohr, Katrina Crotts; Liu, Ou Lydia; Liu, Huili – ETS Research Report Series, 2017
The ETS® Proficiency Profile (EPP), a college-level assessment, has been widely used to evaluate general education student learning outcomes (SLOs) in college. The purpose of this study was to investigate validity evidence for the EPP by evaluating the relationship with outcomes such as student retention, cumulative grade point average…
Descriptors: Research Reports, Outcome Measures, Test Validity, Predictor Variables
Stricker, Lawrence J.; Rock, Donald A.; Bridgeman, Brent – ETS Research Report Series, 2015
This study explores stereotype threat on low-stakes tests used in a large-scale assessment, math and reading tests in the Education Longitudinal Study of 2002 (ELS). Issues identified in laboratory research (though not observed in studies of high-stakes tests) were assessed: whether inquiring about their race and gender is related to the…
Descriptors: Stereotypes, Reading Tests, Mathematics Tests, Longitudinal Studies
Qian, Xiaoyu; Nandakumar, Ratna; Glutting, Joseph; Ford, Danielle; Fifield, Steve – ETS Research Report Series, 2017
In this study, we investigated gender and minority achievement gaps on 8th-grade science items employing a multilevel item response methodology. Both gaps were wider on physics and earth science items than on biology and chemistry items. Larger gender gaps were found on items with specific topics favoring male students than other items, for…
Descriptors: Item Analysis, Gender Differences, Achievement Gap, Grade 8
Attali, Yigal – ETS Research Report Series, 2014
Previous research on calculator use in standardized assessments of quantitative ability focused on the effect of calculator availability on item difficulty and on whether test developers can predict these effects. With the introduction of an on-screen calculator on the Quantitative Reasoning measure of the GRE® revised General Test, it…
Descriptors: College Entrance Examinations, Graduate Study, Calculators, Test Items
Chubbuck, Kay; Curley, W. Edward; King, Teresa C. – ETS Research Report Series, 2016
This study gathered quantitative and qualitative evidence concerning gender differences in performance by using critical reading material on the SAT® test with sports and science content. The fundamental research questions guiding the study were: If sports and science are to be included in a skills test, what kinds of material are…
Descriptors: College Entrance Examinations, Gender Differences, Critical Reading, Reading Tests
Yu, Guoxing; He, Lianzhen; Rea-Dickins, Pauline; Kiely, Richard; Lu, Yanbin; Zhang, Jing; Zhang, Yan; Xu, Shasha; Fang, Lin – ETS Research Report Series, 2017
Language test preparation has often been studied within the consequential validity framework in relation to ethics, equity, fairness, and washback of assessment. The use of independent and integrated speaking tasks in the TOEFL iBT® test represents a significant development and innovation in assessing speaking ability in academic…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Paek, Insu – ETS Research Report Series, 2009
Three statistical testing procedures well-known in the maximum likelihood approach are the Wald, likelihood ratio (LR), and score tests. Although well-known, the application of these three testing procedures in the logistic regression method to investigate differential item functioning (DIF) has not been rigorously made yet. Employing a variety of…
Descriptors: Test Bias, Statistical Analysis, Regression (Statistics), Maximum Likelihood Statistics
Liu, Jinghua; Zhu, Xiaowen – ETS Research Report Series, 2008
The purpose of this paper is to explore methods to approximate population invariance without conducting multiple linkings for subpopulations. Under the single group or equivalent groups design, no linking needs to be performed for the parallel-linear system linking functions. The unequated raw score information can be used as an approximation. For…
Descriptors: Raw Scores, Test Format, Comparative Analysis, Test Construction
Moses, Tim – ETS Research Report Series, 2008
Nine statistical strategies for selecting equating functions in an equivalent groups design were evaluated. The strategies of interest were likelihood ratio chi-square tests, regression tests, Kolmogorov-Smirnov tests, and significance tests for equated score differences. The most accurate strategies in the study were the likelihood ratio tests…
Descriptors: Equated Scores, Statistical Analysis, Statistical Significance, Regression (Statistics)
Stankov, Lazar; Lee, Jihyun – ETS Research Report Series, 2007
This paper examines the nature of confidence in relation to cognitive abilities, personality traits, and metacognition. Confidence was measured as it was expressed in answers to each test item during the administration of reading and listening sections of the TOEFL® iBT. The confidence scores were correlated with the accuracy scores from the TOEFL…
Descriptors: English (Second Language), Grade Point Average, High Schools, Personality Traits
Sinharay, Sandip; Dorans, Neil J.; Grant, Mary C.; Blew, Edwin O.; Knorr, Colleen M. – ETS Research Report Series, 2006
The application of the Mantel-Haenszel test statistic (and other popular DIF-detection methods) to determine DIF requires large samples, but test administrators often need to detect DIF with small samples. There is no universally agreed upon statistical approach for performing DIF analysis with small samples; hence there is substantial scope of…
Descriptors: Test Bias, Computation, Sample Size, Bayesian Statistics
Moses, Tim; Kim, Sooyeon – ETS Research Report Series, 2007
This study evaluated the impact of unequal reliability on test equating methods in the nonequivalent groups with anchor test (NEAT) design. Classical true score-based models were compared in terms of their assumptions about how reliability impacts test scores. These models were related to treatment of population ability differences by different…
Descriptors: Reliability, Equated Scores, Test Items, Statistical Analysis
Rock, Donald A.; Pollack, Judy M.; Weiss, Michael – ETS Research Report Series, 2004
This study attempts to identify different patterns of cognitive growth in kindergarten and first grade associated with selected subpopulations. The results are based upon a nationally representative sample of fall kindergartners who were retested in the spring of their kindergarten year and then again in the fall and spring of their first grade…
Descriptors: Cognitive Development, Kindergarten, Grade 1, Age Differences
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2006
More than a dozen statistical models have been developed for the purpose of cognitive diagnosis. These models are supposed to extract a much finer level of information from item responses than traditional unidimensional item response models. In this paper, a general diagnostic model (GDM) was used to analyze a set of simulated sparse data and real…
Descriptors: Statistical Analysis, National Competency Tests, Diagnostic Tests, Item Response Theory