Publication Date
| In 2026 | 0 |
| Since 2025 | 2142 |
| Since 2022 (last 5 years) | 12652 |
| Since 2017 (last 10 years) | 33777 |
| Since 2007 (last 20 years) | 68268 |
Descriptor
| Foreign Countries | 30502 |
| Test Validity | 21718 |
| Scores | 18245 |
| Academic Achievement | 16904 |
| Test Construction | 16724 |
| Test Reliability | 15006 |
| Achievement Tests | 14836 |
| Standardized Tests | 14707 |
| Comparative Analysis | 14429 |
| Elementary Secondary Education | 13033 |
| Language Tests | 12545 |
Audience
| Practitioners | 5033 |
| Teachers | 3390 |
| Researchers | 2630 |
| Policymakers | 1229 |
| Administrators | 976 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2813 |
| Australia | 2425 |
| Canada | 2269 |
| California | 1851 |
| United States | 1725 |
| Texas | 1613 |
| China | 1577 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1120 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Randall, Jennifer – Educational Assessment, 2023
In a justice-oriented antiracist assessment process, attention to the disruption of white supremacy must occur at every stage--from construct articulation to score reporting. An important step in the assessment development process is the item review stage often referred to as Bias/Fairness and Sensitivity Review. I argue that typical approaches to…
Descriptors: Social Justice, Racism, Test Bias, Test Items
Jiang, Zhehan; Han, Yuting; Xu, Lingling; Shi, Dexin; Liu, Ren; Ouyang, Jinying; Cai, Fen – Educational and Psychological Measurement, 2023
The part of responses that is absent in the nonequivalent groups with anchor test (NEAT) design can be managed to a planned missing scenario. In the context of small sample sizes, we present a machine learning (ML)-based imputation technique called chaining random forests (CRF) to perform equating tasks within the NEAT design. Specifically, seven…
Descriptors: Test Items, Equated Scores, Sample Size, Artificial Intelligence
Davis, Sara D.; Chan, Jason C. K. – Educational Psychology Review, 2023
Prior testing can facilitate subsequent learning, a phenomenon termed the forward testing effect (FTE). We examined a metacognitive account of this effect, which proposes that the FTE occurs because retrieval leads to strategy optimizations during later learning. One prediction of this account is that tests that require less retrieval effort…
Descriptors: Metacognition, Futures (of Society), Tests, Difficulty Level
Yan Yan; Caleb P. Hood – Texas Association for Literacy Education Yearbook, 2023
The authors' institution exceeded the Texas Science of Teaching Reading (STR) exam's passing rate of 86.6% for the 2021-2022 academic year. The authors think this success was largely due to conducting an analysis of test questions and helping preservice teachers better prepare for the exam. The authors helped preservice teachers supplement the…
Descriptors: Preservice Teachers, Test Wiseness, Reading Instruction, Test Coaching
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
Zari Saeedi; Hessameddin Ghanbar; Mahdi Rezaei – International Journal of Language Testing, 2024
Despite being a popular topic in language testing, cognitive load has not received enough attention in vocabulary test items. The purpose of the current study was to scrutinize the cognitive load and vocabulary test items' differences, examinees' reaction times, and perceived difficulty. To this end, 150 students were selected using…
Descriptors: Language Tests, Test Items, Difficulty Level, Vocabulary Development
Pablo Robles-García; Stuart McLean; Jeffrey Stewart; Ji-young Shin; Claudia Helena Sánchez-Gutiérrez – Language Assessment Quarterly, 2024
Recent literature in the field of L2 vocabulary assessment has advocated for the development of written receptive vocabulary tests such as Vocabulary Levels Tests (VLTs) that use: (a) meaning-recall item formats, (b) a minimum of 40 item counts per 1,000-frequency band to improve level estimates, and (c) lemmas (not word-families) as the lexical…
Descriptors: Spanish, Test Validity, Test Construction, Vocabulary Development
Jingtong Pan; Kimberly Kendziora; Christina LiCalsi; Karthik Ramesh; George Stifel – Society for Research on Educational Effectiveness, 2024
Background/Context: Research consistently emphasizes the importance of social-emotional learning (SEL) in education settings (Cipriano et al., 2023; Wigelsworth et al., 2023). In addition, it has become increasingly evident that educators' social-emotional competence and well-being plays a crucial role in fostering SEL among students (Braun et…
Descriptors: Social Emotional Learning, Elementary School Teachers, Secondary School Teachers, Well Being
Katie J. Robinson; David R. Lubans; Myrto F. Mavilidi; Francisco B. Ortega; Nicholas Riley – Measurement in Physical Education and Exercise Science, 2024
The purpose of our study was to assess the test-retest reliability and concurrent validity of the 30-sec Sit to Stand test in a sample of adolescents. We recruited 30 male (58%) and 22 female (42%) participants (mean age = 15.77 years ± 0.46). Participants completed the 30-sec Sit to Stand and standing long jump tests on two occasions separated by…
Descriptors: Test Validity, Adolescents, Psychomotor Skills, Test Reliability
Mirian Agus; Giovanni Bonaiuti; Arianna Marras – Journal of Science Education and Technology, 2024
In recent years, numerous research studies have highlighted how teachers' perceptions of educational robotics (ER) and their sense of self-efficacy can influence the learning process. Although different instruments exist to investigate teachers' perspectives on ER, the Robotics Interest Questionnaire (RIQ) scale, developed within the Portuguese…
Descriptors: Teacher Attitudes, Self Efficacy, Test Validity, Test Reliability
Tíscar Rodríguez-Jiménez; Verónica Vidal-Arenas; Raquel Falcó; Beatriz Moreno-Amador; Juan C. Marzo; José A. Piqueras – Child & Youth Care Forum, 2024
Background: The Social Emotional Distress Scale-Secondary (SEDS-S) is a short measure designed for comprehensive school-based mental health screening, particularly for using very brief self-reported measures of well-being and distress. Whereas prior studies have shown validity and reliability evidence for the English version, there is a lack of…
Descriptors: Measures (Individuals), Psychometrics, Spanish, Well Being
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
Lauren N. Currie; Robinder P. Bedi; Anita M. Hubley – Measurement and Evaluation in Counseling and Development, 2024
This study evaluated the psychometric properties of the Hope-Action Inventory (HAI) scores with a problematic substance use population (N = 783). The hierarchical seven-factor structure of the HAI fit the data well. Further, the HAI scores had satisfactory internal consistency reliability and good convergent evidence for validity.
Descriptors: Psychometrics, Substance Abuse, Test Validity, Test Reliability
Blaženka Divjak; Barbi Svetec; Damir Horvat – Journal of Computer Assisted Learning, 2024
Background: Sound learning design should be based on the constructive alignment of intended learning outcomes (LOs), teaching and learning activities and formative and summative assessment. Assessment validity strongly relies on its alignment with LOs. Valid and reliable formative assessment can be analysed as a predictor of students' academic…
Descriptors: Automation, Formative Evaluation, Test Validity, Test Reliability
Süreyya Yörük – International Journal for Talent Development and Creativity, 2024
The Torrance Tests of Creative Thinking Figural Forms A and B are widely used to measure creative potential. Despite their common application in research, there has been a lack of focus on the psychometric properties of the tests. Thus, the scoring of the items is based on some unexamined hypotheses. The items are hypothesized to be equally…
Descriptors: Foreign Countries, Grade 2, Creative Thinking, Creativity Tests

