Publication Date
| In 2026 | 8 |
| Since 2025 | 2276 |
| Since 2022 (last 5 years) | 12791 |
| Since 2017 (last 10 years) | 33916 |
| Since 2007 (last 20 years) | 68407 |
Descriptor
| Foreign Countries | 30560 |
| Test Validity | 21743 |
| Scores | 18256 |
| Academic Achievement | 16928 |
| Test Construction | 16756 |
| Test Reliability | 15028 |
| Achievement Tests | 14859 |
| Standardized Tests | 14720 |
| Comparative Analysis | 14431 |
| Elementary Secondary Education | 13042 |
| Language Tests | 12551 |
Audience
| Practitioners | 5034 |
| Teachers | 3393 |
| Researchers | 2630 |
| Policymakers | 1232 |
| Administrators | 978 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2822 |
| Australia | 2426 |
| Canada | 2270 |
| California | 1854 |
| United States | 1726 |
| Texas | 1615 |
| China | 1578 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1202 |
| Germany | 1122 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Tim Moses; YoungKoung Kim – Journal of Educational Measurement, 2025
This study considers the estimation of marginal reliability and conditional accuracy measures using a generalized recursion procedure with several IRT-based ability and score estimators. The estimators include MLE, TCC, and EAP abilities, and corresponding test scores obtained with different weightings of the item scores. We consider reliability…
Descriptors: Item Response Theory, Scoring, Reliability, Accuracy
Casandra Koevoets-Beach; Donya Kurdi; Morgan Balabanoff – Practical Assessment, Research & Evaluation, 2025
Confidence tiers have been paired with multiple choice items across different fields since the early twentieth century and have seen widespread adoption in discipline-based education research fields seeking to evaluate aspects of self-regulated learning. The design of two-tiered confidence judgments impacts interpretability and perception of their…
Descriptors: Confidence Testing, Interviews, Metacognition, Undergraduate Students
Kondo, Kanako; Mizuta, Masanobu; Kawai, Yoshitaka; Sogami, Tohru; Fujimura, Shintaro; Kojima, Tsuyoshi; Abe, Chika; Tanaka, Ryo; Shiromoto, Osamu; Uozumi, Ryuji; Kishimoto, Yo; Tateya, Ichiro; Omori, Koichi; Haji, Tomoyuki – Journal of Speech, Language, and Hearing Research, 2021
Purpose: Auditory-perceptual evaluation is essential for the assessment of voice quality. The Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) provides a standardized protocol and assessment form for clinicians to analyze the voice quality and has been adapted into several different languages. The aims of this study were to develop the…
Descriptors: Japanese, Test Validity, Test Reliability, Voice Disorders
Lenz, A. Stephen; Rocha, Lauran; Aras, Yahyahan – International Journal for the Advancement of Counselling, 2021
A systematic search was conducted to identify measures of school climate developed and reported between 1993 and 2017. We coded data related to participant and setting characteristics, qualities of measures, amounts of validity evidence, and degrees of reliability estimates. Results indicated 9 school climate measures featuring disparate…
Descriptors: Educational Environment, Evaluation, Literature Reviews, Test Construction
Gao, Xuliang; Ma, Wenchao; Wang, Daxun; Cai, Yan; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2021
This article proposes a class of cognitive diagnosis models (CDMs) for polytomously scored items with different link functions. Many existing polytomous CDMs can be considered as special cases of the proposed class of polytomous CDMs. Simulation studies were carried out to investigate the feasibility of the proposed CDMs and the performance of…
Descriptors: Cognitive Measurement, Models, Test Items, Scoring
Kaya, Fatih – Education Quarterly Reviews, 2021
The aim of this study was to develop a valid and reliable measurement tool in order to determine the democracy levels of teacher candidates. During the scale development process in the research, the validity and reliability studies were conducted through three independent study groups. The first study group consisted of 627 students studying at…
Descriptors: Democracy, Measures (Individuals), Preservice Teachers, Test Validity
Oh, Seungbin; Shillingford-Butler, Ann – Measurement and Evaluation in Counseling and Development, 2021
The authors present the development and examination of the "Client Assessment of Multicultural Competent Behavior" (CAMCB) scores. The CAMCB was designed to measure therapists' multicultural competent behaviors within the context of therapeutic process, from clients' perspective. In this article, three phases of the study are presented…
Descriptors: Counselor Evaluation, Test Construction, Cultural Awareness, Test Validity
Tunc, Emine Burcu; Parlak, Simel; Uluman, Muge; Eryigit, Derya – International Journal of Assessment Tools in Education, 2021
The aim of this research is to develop the Hostility in Pandemic Scale (HPS) for the population of Turkey to determine the hostility levels of individuals, which is a factor affecting the mental well-being of the society during the pandemic. The study group consists of 855 individuals between the ages of 18-65 from different genders, and have experienced the…
Descriptors: Psychological Patterns, Pandemics, COVID-19, Test Construction
Schulte, Niklas; Holling, Heinz; Bürkner, Paul-Christian – Educational and Psychological Measurement, 2021
Forced-choice questionnaires can prevent faking and other response biases typically associated with rating scales. However, the derived trait scores are often unreliable and ipsative, making interindividual comparisons in high-stakes situations impossible. Several studies suggest that these problems vanish if the number of measured traits is high.…
Descriptors: Questionnaires, Measurement Techniques, Test Format, Scoring
Markelz, Andrew M.; Riden, Benjamin S.; Zoder-Martell, Kimberly A.; Miller, Joseph E.; Bolinger, Sarah J. – Journal of Positive Behavior Interventions, 2021
Supported by decades of research on praise and its effect on student behaviors, we developed the Behavior-Specific Praise--Observation Tool (BSP-OT) to measure characteristics of effective praise. We evaluated interrater reliability of the BSP-OT to measure praise specificity, contingency, and variety using intraclass correlation (ICC) and Cohen's…
Descriptors: Test Reliability, Classroom Observation Techniques, Positive Reinforcement, Interrater Reliability
Tabin, Mireille; Diacquenod, Cindy; De Palma, Nicola; Gerber, Fabienne; Straccia, Claudio; Wilson, Carlene; Kosel, Markus; Petitpierre, Geneviève – Journal of Intellectual & Developmental Disability, 2021
Background: Social vulnerability refers to the ways in which an individual is at risk of being victimised. The Test of Interpersonal Competences and Personal Vulnerability [TICPV] is an Australian assessment tool designed for adults with intellectual disabilities (ID) [Wilson et al. (1996). Vulnerability to criminal exploitation: Influence of…
Descriptors: Test Validity, At Risk Persons, Intellectual Disability, Test Reliability
Kats, Daniel J.; Skotko, Brian G.; de Graaf, Gert; Skladzien, Ellen; Hooper, Brian Takashi; Mordi, Rose; Mykhailenko, Tetiana; Buckley, Frank; Patsiogiannis, Vasiliki; Krell, Kavita; Haugen, Kelsey; Donelan, Karen – Journal of Applied Research in Intellectual Disabilities, 2023
Background: Down syndrome is the most common liveborn genetic condition. However, there are no surveys measuring societal services and supports for people with Down syndrome. We developed a questionnaire so that initiatives could be targeted towards countries most in need of assistance. Method: We formed a geographically diverse group of…
Descriptors: Down Syndrome, Social Services, Questionnaires, Test Construction
Atar, Burcu; Atalay Kabasakal, Kubra; Kibrislioglu Uysal, Nermin – Journal of Experimental Education, 2023
The purpose of this study was to evaluate the population invariance of equating functions across country subgroups in TIMSS 2015 mathematics tests in relation to the raw-score distribution, DIF, and DTF. We used equipercentile and IRT observed-score equating methods. The results of the study indicate that there is a relationship between the…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Mathematics Tests
Weeks, Sean N.; Renshaw, Tyler L.; Serang, Sarfaraz – Journal of Psychoeducational Assessment, 2023
Minority stress theory is a model for understanding health disparities among sexual minorities, defined as those who experience a level of same-sex attraction, identity, or behavior. Methods for assessing minority stress among youth included only adult measures until the development of the Sexual Minority Adolescent Stress Inventory (SMASI). The…
Descriptors: Adolescents, LGBTQ People, Test Validity, Stress Variables
Rutkowski, David; Rutkowski, Leslie; Valdivia, Dubravka Svetina; Canbolat, Yusuf; Underhill, Stephanie – Applied Measurement in Education, 2023
Several states in the US have removed time limits on their state assessments. In Indiana, where this study takes place, the state assessment is both untimed during the testing window and allows unlimited breaks during the testing session. Using grade 3 and 8 math and English state assessment data, in this paper we focus on time used for testing…
Descriptors: Testing, Time, Intervals, Academic Achievement

