Yixi Wang – ProQuest LLC, 2020
Binary item response theory (IRT) models are widely used to analyze educational testing data. These models are not perfect because they simplify the individual item response process, ignore differences among response patterns, cannot handle the multidimensionality that lies behind the options within a single item, and cannot manage missing response…
Descriptors: Item Response Theory, Educational Testing, Data, Models
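As a rough illustration of the binary IRT models this abstract refers to, the sketch below uses the two-parameter logistic (2PL) model, one common binary IRT model. The abstract does not say which models the dissertation examines, so the model choice, the function names, and the item parameters are all assumptions made for illustration.

```python
import math


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model.

    theta: examinee ability; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def response_pattern_loglik(responses, theta, items):
    """Log-likelihood of a 0/1 response pattern at a given ability.

    responses: list of 0/1 item scores.
    items: list of (a, b) tuples, hypothetical item parameters.
    """
    loglik = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct_2pl(theta, a, b)
        loglik += math.log(p) if u == 1 else math.log(1.0 - p)
    return loglik


# Hypothetical parameters for three items, as (discrimination, difficulty).
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
print(response_pattern_loglik([1, 1, 0], theta=0.0, items=items))
```

Because every response is scored only as 0 or 1, the model cannot tell which incorrect option an examinee selected, which is one of the simplifications the abstract lists.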
Pakprod, Nuttakan; Jirasatjanukul, Kanokrat; Tumthong, Damrong; Amklad, Prapa; Lekchom, Wipa – International Education Studies, 2021
The objective of this research is to study the results of activities, run by a cluster of teachers at Phetchaburi Rajabhat University, to increase scores on the Ordinary National Education Test, by comparing the test results from 2017-2018 and studying satisfaction with the activities. The target group is 49 schools in Phetchaburi…
Descriptors: Foreign Countries, National Competency Tests, Scores, Test Results
Suthathip Thirakunkovit – Language Testing in Asia, 2025
Establishing a cut score is a crucial aspect of the test development process since the selected cut score has the potential to impact students' performance outcomes and shape instructional strategies within the classroom. Therefore, it is vital for those involved in test development to set a cut score that is both fair and justifiable. This cut…
Descriptors: Cutting Scores, Culture Fair Tests, Language Tests, Test Construction
Cooke, Gillian; Elliott, Gill – Research Matters, 2021
In times of crisis it is good to look back. Not only is it comforting, but better understanding of events in our past can inform decision-making and help us find direction at uncertain times. COVID-19 may have presented new challenges, but this exploration of historical disruptions to school exams highlights themes and a recognisable human spirit.…
Descriptors: Educational History, Educational Testing, Pandemics, War
Salmani Nodoushan, Mohammad Ali – Online Submission, 2021
This paper follows a line of logical argumentation to claim that what Samuel Messick conceptualized about construct validation has probably been misunderstood by some educational policy makers, practicing educators, and classroom teachers. It argues that, while Messick's unified theory of test validation aimed at (a) warning educational…
Descriptors: Construct Validity, Test Theory, Test Use, Affordances
Lim Hooi Lian; Wun Thiam Yew – International Journal of Assessment Tools in Education, 2023
The majority of students from elementary to tertiary levels have misunderstandings and challenges acquiring various statistical concepts and skills. However, the existing statistics assessment frameworks challenge practice in a classroom setting. The purpose of this research is to develop and validate a statistical thinking assessment tool…
Descriptors: Psychometrics, Grade 7, Middle School Mathematics, Statistics Education
Tavares, Walter; Kuper, Ayelet; Kulasegaram, Kulamakan; Whitehead, Cynthia – Advances in Health Sciences Education, 2020
The array of different philosophical positions underlying contemporary views on competence, assessment strategies and justification has led to advances in assessment science. Challenges may arise when these philosophical positions are not considered in assessment design. These can include (a) a logical incompatibility leading to varied or…
Descriptors: Performance Based Assessment, Educational Testing, Test Interpretation, Test Results
Phelps, Richard P. – Online Submission, 2019
If it is not possible for one to critique other research and succeed--or even remain securely employed--in a research profession, how is the profession ever to rid itself of flawed, biased, or fraudulent research? Answer: it will not. Any community that disallows accusations of bad behavior condones bad behavior. Any community that disallows…
Descriptors: Educational Research, Deception, Ethics, Information Dissemination
Sainan Xu; Jing Lu; Jiwei Zhang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
With the growing attention on large-scale educational testing and assessment, the ability to process substantial volumes of response data becomes crucial. Current estimation methods within item response theory (IRT), despite their high precision, often pose considerable computational burdens with large-scale data, leading to reduced computational…
Descriptors: Educational Assessment, Bayesian Statistics, Statistical Inference, Item Response Theory
Guangming Li; Zhengyan Liang – SAGE Open, 2024
To investigate the influence of the separation of grade distributions and the ratio of common items on the precision of vertical scaling, this simulation study uses a common-item design with first grade as the base grade. Four grades of 1,000 students each take a test of 100 items. The Monte Carlo simulation method is used…
Descriptors: Elementary School Students, Grade 1, Grade 2, Grade 3
Hong, Seong Eun; Monroe, Scott; Falk, Carl F. – Journal of Educational Measurement, 2020
In educational and psychological measurement, a person-fit statistic (PFS) is designed to identify aberrant response patterns. For parametric PFSs, valid inference depends on several assumptions, one of which is that the item response theory (IRT) model is correctly specified. Previous studies have used empirical data sets to explore the effects…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Error of Measurement
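To make the idea of a parametric person-fit statistic concrete, here is a minimal sketch of the standardized log-likelihood statistic, usually written l_z. The abstract does not say which statistics the study examines, so l_z is an assumed example; the function name and the probabilities are hypothetical.

```python
import math


def lz_statistic(responses, probs):
    """Standardized log-likelihood person-fit statistic (l_z).

    responses: 0/1 item scores for one examinee.
    probs: model-implied probabilities of a correct response on each
           item at the examinee's ability (assumed known here).
    Large negative values suggest an aberrant response pattern.
    Note: the variance is zero if every probability equals 0.5.
    """
    observed = sum(
        u * math.log(p) + (1 - u) * math.log(1 - p)
        for u, p in zip(responses, probs)
    )
    expected = sum(p * math.log(p) + (1 - p) * math.log(1 - p) for p in probs)
    variance = sum(p * (1 - p) * math.log(p / (1 - p)) ** 2 for p in probs)
    return (observed - expected) / math.sqrt(variance)


# Hypothetical probabilities, ordered from easy to hard items.
probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
consistent = lz_statistic([1, 1, 1, 0, 0, 0], probs)  # easy right, hard wrong
aberrant = lz_statistic([0, 0, 0, 1, 1, 1], probs)    # easy wrong, hard right
print(consistent, aberrant)
```

The probabilities come from the fitted IRT model, so if that model is misspecified, both the expected value and the variance used for standardization are off. That dependence is the assumption the abstract highlights.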
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
Kelley's Discrimination Index (DI) is a simple, robust, classical non-parametric shortcut for estimating item discrimination power (IDP) in practical educational settings. Unlike the item-total correlation, DI can reach the extreme values of +1 and -1, and it is stable against outliers. Because of its computational ease, DI is…
Descriptors: Test Items, Computation, Item Analysis, Nonparametric Statistics
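The index described in this abstract can be sketched in a few lines. The version below assumes the conventional form of Kelley's DI: the proportion answering the item correctly among the top 27% of examinees by total score, minus the same proportion among the bottom 27%. The function name, the tie handling (stable sort order), and the data are illustrative assumptions, not taken from the article.

```python
def kelley_di(item_scores, total_scores, fraction=0.27):
    """Kelley's Discrimination Index for one dichotomous item.

    item_scores: 0/1 scores on the item, one per examinee.
    total_scores: total test scores, in the same examinee order.
    fraction: share of examinees in each extreme group (0.27 by convention).
    """
    n = len(total_scores)
    if n != len(item_scores):
        raise ValueError("item_scores and total_scores must match in length")
    k = max(1, round(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: ten examinees with total scores 1..10.
totals = list(range(1, 11))
item_good = [1 if t > 5 else 0 for t in totals]   # only high scorers succeed
item_bad = [0 if t > 5 else 1 for t in totals]    # only low scorers succeed
print(kelley_di(item_good, totals), kelley_di(item_bad, totals))
```

With these data the two items give +1 and -1, the extreme values the abstract says DI can reach. Only the two extreme groups enter the calculation, so scores in the middle of the distribution have no effect on the index.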
Mari Quanbeck; Andrew R. Hinkle; Sheryl S. Lazarus; Virginia A. Ressa; Martha M. Thurlow – National Center on Educational Outcomes, 2023
This report contains the proceedings of a forum held on June 28, 2023 in New Orleans, Louisiana, to discuss issues surrounding meaningful accessibility of assessments. The forum was a post-session to the Council of Chief State School Officers (CCSSO) National Conference on Student Assessment (NCSA) and was a collaboration of the "Assessment,…
Descriptors: Accessibility (for Disabled), Educational Testing, Technology Integration, Barriers
Nisbet, Isabel; Shaw, Stuart D. – Assessment in Education: Principles, Policy & Practice, 2019
Fairness in assessment is seen as increasingly important but there is a need for greater clarity in use of the term 'fair'. Also, fairness is perceived through a range of 'lenses' reflecting different traditions of thought. The lens used determines how fairness is seen and described. This article distinguishes different uses of 'fair' which have…
Descriptors: Test Bias, Measurement, Theories, Educational Assessment
Sinharay, Sandip – Grantee Submission, 2019
Benefiting from item preknowledge (e.g., McLeod, Lewis, & Thissen, 2003) is a major type of fraudulent behavior during educational assessments. This paper suggests a new statistic that can be used for detecting the examinees who may have benefitted from item preknowledge using their response times. The statistic quantifies the difference in…
Descriptors: Test Items, Cheating, Reaction Time, Identification