Soland, James; Kuhfeld, Megan – Educational Assessment, 2019
Considerable research has examined the use of rapid guessing measures to identify disengaged item responses. However, little is known about students who rapidly guess over the course of several tests. In this study, we use achievement test data from six administrations over three years to investigate whether rapid guessing is a stable trait-like…
Descriptors: Testing, Guessing (Tests), Reaction Time, Achievement Tests
Simon, Molly N.; Prather, Edward E.; Buxner, Sanlyn R.; Impey, Chris D. – International Journal of Science Education, 2019
The discovery and characterisation of planets orbiting distant stars has shed light on the origin of our own Solar System. It is important that college-level introductory astronomy students have a general understanding of the planet formation process before they are able to draw parallels between extrasolar systems and our own Solar System. In…
Descriptors: Measures (Individuals), Test Validity, Test Reliability, Student Evaluation
Gafni, Naomi – Assessment in Education: Principles, Policy & Practice, 2016
Naomi Gafni, director of Research and Development, National Institute for Testing and Evaluation, Jerusalem, Israel, has devoted a substantial part of her career to the development of admissions tests and other educational tests and to the investigation of their validity. As such she is keenly aware of the complexities involved in this process.…
Descriptors: Test Validity, Test Interpretation, Test Use, Test Construction
Sunny, Cijy Elizabeth – ProQuest LLC, 2018
Meeting Science, Technology, Engineering, and Mathematics (STEM) workforce demands requires students to have positive attitudes and persist through the STEM pipeline. There is limited research on reliable and valid instruments that measure non-cognitive skills towards all STEM fields. Tools that have been developed have mostly used traditional…
Descriptors: Student Attitudes, Academic Persistence, STEM Education, Stakeholders
Bazvand, Ali Darabi; Kheirzadeh, Shiela; Ahmadi, Alireza – International Journal of Assessment Tools in Education, 2019
The findings of previous research into the compatibility of stakeholders' perceptions with statistical estimations of item difficulty are not seemingly consistent. Furthermore, most research shows that teachers' estimation of item difficulty is not reliable since they tend to overestimate the difficulty of easy items and underestimate the…
Descriptors: Foreign Countries, High Stakes Tests, Test Items, Difficulty Level
Twing, Jon S. – Assessment in Education: Principles, Policy & Practice, 2016
This special issue of "Assessment in Education" contains the type of debate needed about what Cizek (2015) calls a "… lingering flaw in the concept of validity…." Some practitioners might not agree that the current theory of validation is flawed. Specifically, the debate Jon Twing is referencing concerns the role of the…
Descriptors: Test Validity, Misconceptions, Evidence, Scores
Andrich, David – Educational Measurement: Issues and Practice, 2016
Since Cronbach's (1951) elaboration of α from its introduction by Guttman (1945), this coefficient has become ubiquitous in characterizing assessment instruments in education, psychology, and other social sciences. Also ubiquitous are caveats on the calculation and interpretation of this coefficient. This article summarizes a recent contribution…
Descriptors: Computation, Correlation, Test Theory, Measures (Individuals)
Chirkina, T. A.; Khavenson, T. E. – Russian Education & Society, 2018
School climate is one of the significant factors determining educational achievement. However, the lack of instruments to measure it has complicated the study of this concept in Russia. We review the history of the study of the concept of "school climate," and we discuss approaches to how it can be defined. We describe the most widely…
Descriptors: Educational Environment, Definitions, Measurement, Questionnaires
Chin, Huan; Chew, Cheng Meng; Lim, Hooi Lian; Thien, Lei Mee – International Journal of Science and Mathematics Education, 2022
Cognitive Diagnostic Assessment (CDA) is an alternative assessment which can give a clear picture of pupils' learning process and cognitive structures to education stakeholders so that appropriate instructional strategies can be designed and tailored to pupils' needs. Coinciding with this function, the Ordered Multiple-Choice (OMC) items were…
Descriptors: Mathematics Instruction, Mathematics Tests, Multiple Choice Tests, Diagnostic Tests
James, Mary – Assessment in Education: Principles, Policy & Practice, 2017
In this commentary, Mary James highlights two problems she deemed critical during her work exploring the relationships between assessment and learning in theory and practice. First, efforts to improve assessment for learning were not always successful either in improving performance or in other ways. Second, and this may be a reason for the first…
Descriptors: Educational Assessment, Learning Theories, Test Theory, Learning
Kim, Peter – Language Teaching Research Quarterly, 2021
Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…
Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction
Shanmugam, S. Kanageswari Suppiah; Wong, Vincent; Rajoo, Murugan – Malaysian Journal of Learning and Instruction, 2020
Purpose: This study examined the quality of English test items using psychometric and linguistic characteristics among Grade Six pupils. Method: Contrary to the conventional approach of relying only on statistics when investigating item quality, this study adopted a mixed-method approach by employing psychometric analysis and cognitive interviews.…
Descriptors: English (Second Language), Second Language Instruction, Language Tests, Psychometrics
Different Analyses, Different Conclusions? Validity Evidence from the EGMA Spatial Reasoning Subtask
Perry, Lindsey – Global Education Review, 2018
As the global development community shifts its focus from improving access to education to improving learning and instruction, the need for instruments that accurately measure student achievement in mathematics and meet technical standards is increasing. This paper explores the importance of collecting high-quality validity evidence that aligns…
Descriptors: Mathematics Tests, Test Validity, Spatial Ability, Foreign Countries
Raykov, Tenko; Marcoulides, George A.; Patelis, Thanos – Educational and Psychological Measurement, 2015
A critical discussion of the assumption of uncorrelated errors in classical psychometric theory and its applications is provided. It is pointed out that this assumption is essential for a number of fundamental results and underlies the concept of parallel tests, the Spearman-Brown's prophecy and the correction for attenuation formulas as well as…
Descriptors: Psychometrics, Correlation, Validity, Reliability
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2014
Brennan (Brennan, R. L., 2012) noted that users of test scores often want (indeed, demand) that subscores be reported, along with total test scores, for diagnostic purposes. Haberman (Haberman, S. J., 2008) suggested a method based on classical test theory (CTT) to determine if subscores have added value over the total score. According to this…
Descriptors: Scores, Test Theory, Test Interpretation

