Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Grace C. Tetschner; Sachin Nedungadi – Chemistry Education Research and Practice, 2025
Many undergraduate chemistry students hold alternate conceptions related to resonance--an important and fundamental topic of organic chemistry. To help address these alternate conceptions, an organic chemistry instructor could administer the resonance concept inventory (RCI), which is a multiple-choice assessment that was designed to identify…
Descriptors: Scientific Concepts, Concept Formation, Item Response Theory, Scores
Marzieh Haghayeghi; Ali Moghadamzadeh; Hamdollah Ravand; Mohamad Javadipour; Hossein Kareshki – Journal of Psychoeducational Assessment, 2025
This study aimed to address the need for a comprehensive assessment tool to evaluate the mathematical abilities of first-grade students through cognitive diagnostic assessment (CDA). The primary challenge involved in this endeavor was to delineate the specific cognitive skills and sub-skills pertinent to first-grade mathematics (FG-M) and to…
Descriptors: Test Construction, Cognitive Measurement, Check Lists, Mathematics Tests
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since it was developed in 1973, in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Hakyung Sung; Sooyeon Cho; Kristopher Kyle – Language Assessment Quarterly, 2024
Lexical diversity (LD) is an important indicator of second language lexical development. Much research has investigated LD indices, with a focus on learners of English. However, further research is needed in languages that are typologically distinct from English, such as Korean. In this study, we evaluated the reliability and validity of LD…
Descriptors: Second Language Learning, Korean, Persuasive Discourse, Language Tests
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Kylie Gorney; Sandip Sinharay – Journal of Educational Measurement, 2025
Although there exists an extensive amount of research on subscores and their properties, limited research has been conducted on categorical subscores and their interpretations. In this paper, we focus on the claim of Feinberg and von Davier that categorical subscores are useful for remediation and instructional purposes. We investigate this claim…
Descriptors: Tests, Scores, Test Interpretation, Alternative Assessment
Marta Godoy-Giménez; Ángel García-Pérez; Fernando Cañadas; Angeles F. Estévez; Pablo Sayans-Jiménez – Autism: The International Journal of Research and Practice, 2024
The broad autism phenotype is the phenotypic expression of the primary characteristics of autism. However, currently available tests do not agree with the two-domain operationalization of broad autism phenotype or autism, and their internal structure has shown instability across applications. This study presents the Broad Autism…
Descriptors: Autism Spectrum Disorders, Genetics, Diagnostic Tests, Foreign Countries
Mustafa Ilhan; Nese Güler; Gülsen Tasdelen Teker; Ömer Ergenekon – International Journal of Assessment Tools in Education, 2024
This study aimed to examine the effects of reverse items created with different strategies on psychometric properties and respondents' scale scores. To this end, three versions of a 10-item scale in the research were developed: 10 positive items were integrated in the first form (Form-P) and five positive and five reverse items in the other two…
Descriptors: Test Items, Psychometrics, Scores, Measures (Individuals)
John Jerrim; Luis Alejandro Lopez-Agudo; Oscar David Marcenaro-Gutierrez – British Journal of Educational Studies, 2024
International large-scale assessments have gained much attention since the beginning of the twenty-first century, influencing education legislation in many countries. This includes Spain, where they have been used by successive governments to justify education policy change. Unfortunately, there was a problem with the PISA 2018 reading scores for…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Ehri Ryu – Society for Research on Educational Effectiveness, 2024
Background/Context: The confirmatory factor analysis (CFA) model is a commonly adopted framework to estimate and test a measurement model. Once a well-fitting final CFA model is selected, the selected model may be used to test structural relationships of the latent constructs with other variables, to construct a test with desired reliability and…
Descriptors: Research Problems, Factor Analysis, Scores, Computation
Thompson, Kathryn N. – ProQuest LLC, 2023
It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…
Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores
Meyer, J. Patrick; Hu, Ann; Li, Sylvia – NWEA, 2023
The Content Proximity Project was designed to improve the content validity of the MAP® Growth™ assessments while retaining the ability for the test to adapt off-grade and meet students wherever they are in their learning. Two main features of the project were the development of an enhanced item selection algorithm, and a spring pilot study…
Descriptors: Achievement Tests, Mathematics Achievement, Content Validity, Mathematics Tests
Acikgul, Kubra; Sad, Suleyman Nihat; Altay, Bilal – International Journal of Assessment Tools in Education, 2023
This study aimed to develop a useful test to measure university students' spatial abilities validly and reliably. Following a sequential explanatory mixed methods research design, first, qualitative methods were used to develop the trial items for the test; next, the psychometric properties of the test were analyzed through quantitative methods…
Descriptors: Spatial Ability, Scores, Multiple Choice Tests, Test Validity