Liou, Gloria; Bonner, Cavan V.; Tay, Louis – International Journal of Testing, 2022
With the advent of big data and advances in technology, psychological assessments have become increasingly sophisticated and complex. Nevertheless, traditional psychometric issues concerning the validity, reliability, and measurement bias of such assessments remain fundamental in determining whether score inferences of human attributes are…
Descriptors: Psychometrics, Computer Assisted Testing, Adaptive Testing, Data
Wesolowski, Brian C. – Music Educators Journal, 2020
Validity, reliability, and fairness are three prominent indicators for evaluating the quality of assessment processes. Each of the indicators is most often written about and applied in the context of large-scale assessment. As a result, the technical properties of these indicators make them limited in both their practicality and relevance for…
Descriptors: Music Education, Test Validity, Test Reliability, Student Evaluation
Flanagan, Agnes; Cormier, Damien C. – Communique, 2019
One of the areas subsumed under the data-based decision making and accountability practice identified in the National Association of School Psychologists' (NASP) "Model for Integrated School Psychological Services" is to collect information on psychological and educational variables to make decisions at a number of levels of service…
Descriptors: Test Bias, School Psychologists, Measurement, Data Collection
Center on Standards and Assessments Implementation, 2018
Reliability is a measure of consistency. It is the degree to which student results are the same when they take the same test on different occasions, when different scorers score the same item or task, and when different but equivalent tests are taken at the same time or at different times. Reliability is about making sure that different test forms…
Descriptors: Test Reliability, Test Validity, Student Evaluation, Test Bias
Choi, Youn-Jeng; Asilkalkan, Abdullah – Measurement: Interdisciplinary Research and Perspectives, 2019
About 45 R packages to analyze data using item response theory (IRT) have been developed over the last decade. This article introduces these 45 R packages with their descriptions and features. It also describes possible advanced IRT models using R packages, as well as dichotomous and polytomous IRT models, and R packages that contain applications…
Descriptors: Item Response Theory, Data Analysis, Computer Software, Test Bias
Lian, Lim Hooi; Yew, Wun Thiam; Meng, Chew Cheng – International Education Studies, 2014
Currently, in order to reform the Malaysian education system, there have been a number of education policy initiatives launched by the Malaysian Ministry of Education (MOE). All these initiatives have encouraged and inculcated teaching and learning for creativity, critical, innovative and higher-order thinking skills rather than conceptual…
Descriptors: Foreign Countries, Educational Policy, Evaluation Methods, Teacher Competencies
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Popham, W. James – Phi Delta Kappan, 2014
The tests we use to evaluate student achievement may well be sound measures of what students know, but they are faulty indicators at best of how well they have been taught. A remedy to this situation of judging teachers by the performance of their students on high-stakes tests may be in hand already. We should look to the methods successfully…
Descriptors: High Stakes Tests, Academic Achievement, Teacher Evaluation, Evaluation Methods
Skinner, Rebecca R.; Lomax, Erin – Congressional Research Service, 2017
Federal education legislation continues to emphasize the role of assessment in elementary and secondary schools. Perhaps most prominently, the Elementary and Secondary Education Act (ESEA), as amended by the Every Student Succeeds Act (ESSA; P.L. 114-95), requires the use of test-based educational accountability systems in states and specifies the…
Descriptors: Educational Assessment, Educational Legislation, Elementary Secondary Education, Federal Legislation
New Meridian Corporation, 2020
The purpose of this report is to describe the technical qualities of the 2018-2019 operational administration of the English language arts/literacy (ELA/L) and mathematics summative assessments in grades 3 through 8 and high school. The ELA/L assessments focus on reading and comprehending a range of sufficiently complex texts independently and…
Descriptors: Language Arts, Literacy Education, Mathematics Education, Summative Evaluation
New Meridian Corporation, 2020
The purpose of this report is to describe the technical qualities of the 2018-2019 operational administration of the English language arts/literacy (ELA/L) and mathematics assessments in grades 3 through 8 and high school. New Meridian, in coordination with multiple states and vendors, developed an alternate form of the summative assessment to…
Descriptors: Language Arts, Literacy Education, Mathematics Education, Summative Evaluation
Coniam, David; Falvey, Peter – Language Testing, 2013
The "Language Proficiency Assessment for Teachers of English" (LPATE) is a test of standards of English language ability for Hong Kong primary and secondary school teachers of English. The impetus for the creation of the LPATE arose, in 1996, because of concerns in business and education communities over falling English language…
Descriptors: English Teachers, Elementary School Teachers, Secondary School Teachers, Language Tests
New York State Education Department, 2016
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2016 Operational Tests. This report includes information about test content and test development, item (i.e.,…
Descriptors: Testing Programs, English, Language Arts, Mathematics Tests
New York State Education Department, 2015
This technical report provides detailed information regarding the technical, statistical, and measurement attributes of the New York State Testing Program (NYSTP) for the Grades 3-8 Common Core English Language Arts (ELA) and Mathematics 2015 Operational Tests. This report includes information about test content and test development, item (i.e.,…
Descriptors: Testing Programs, English, Language Arts, Mathematics Tests
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias