Publication Date
In 2025 | 2 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 13 |
Since 2016 (last 10 years) | 53 |
Since 2006 (last 20 years) | 103 |
Descriptor
Scoring | 265 |
Test Construction | 265 |
Test Reliability | 212 |
Test Validity | 161 |
Testing | 67 |
Test Items | 63 |
Interrater Reliability | 50 |
Item Analysis | 43 |
Test Interpretation | 43 |
Psychometrics | 36 |
Language Tests | 35 |
More ▼ |
Source
Author
Publication Type
Education Level
Location
Canada | 6 |
Nebraska | 6 |
New York | 6 |
Florida | 5 |
Turkey | 5 |
United States | 4 |
Australia | 3 |
California | 3 |
Chile | 2 |
Germany | 2 |
Japan | 2 |
More ▼ |
Laws, Policies, & Programs
Individuals with Disabilities… | 3 |
No Child Left Behind Act 2001 | 2 |
Education Consolidation… | 1 |
Elementary and Secondary… | 1 |
Elementary and Secondary… | 1 |
Kentucky Education Reform Act… | 1 |
Race to the Top | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Bolton, Tiffany; Stevenson, Brittney; Janes, William – Journal of Occupational Therapy, Schools & Early Intervention, 2023
Researchers utilized a cross-sectional secondary analysis of data within an ongoing non-randomized controlled trial study design to establish the reliability and internal consistency of a novel handwriting assessment for preschoolers, the Just Write! (JW), written by the authors. Seventy-eight children from an area preschool participated in the…
Descriptors: Handwriting, Writing Skills, Writing Evaluation, Preschool Children
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Safak, Pinar; Cakmak, Salih; Karakoc, Tamer; Aydin O'Dwyer, Pinar – European Journal of Educational Research, 2021
This study aimed to develop a valid and reliable instrument that measures the functional vision of students with low vision. Thus, an assessment tool and performance activities were developed for three vision skill groups (near vision skills, distance vision skills, and visual field) that include functional vision skills. The universe was 1485…
Descriptors: Foreign Countries, Vision Tests, Diagnostic Tests, Vision
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
Roohr, Katrina Crotts; Burkander, Kri; Mao, Liyang – ETS Research Report Series, 2018
Oral communication has been identified as an important skill by higher education institutions and by the workforce community. Despite its importance, minimal research has been conducted around the development of tasks to measure oral communication skills and behaviors. The purpose of this preliminary study is to evaluate the different factors…
Descriptors: Speech Communication, Video Technology, Test Construction, Scoring
Maxwell, Bruce; Boon, Helen; Tanchuk, Nicolas; Rauwerda, Bryan – Journal of Moral Education, 2021
This article documents the adaptation, piloting and validation of a measure of teachers' ethical sensitivity. To create the test, we modified a measure from dentistry drawing on literature in teacher professional ethics and drew on the expertise of professional ethics scholars and practitioners. Based on the results of Rasch analysis combined with…
Descriptors: Ethics, Moral Values, Scores, Teacher Education Programs
Johnson, Jennifer M.; Meisinger, Elizabeth B.; Robinson, Melissa F. – Journal of Psychoeducational Assessment, 2019
This review focuses on the Feifer Assessment of Reading (FAR) produced by S. G. Feifer and R. G. Nader in 2015. The FAR is a comprehensive reading test that is individually administered to children and adults aged 4 to 21 years. The structure of the FAR is based on a gradient model of brain functioning (Goldberg, 1990; Luria, 1980) and reflects a…
Descriptors: Reading Tests, Scoring, Test Construction, Test Norms
Güntay Tasçi – Science Insights Education Frontiers, 2024
The present study has aimed to develop and validate a protein concept inventory (PCI) consisting of 25 multiple-choice (MC) questions to assess students' understanding of protein, which is a fundamental concept across different biology disciplines. The development process of the PCI involved a literature review to identify protein-related content,…
Descriptors: Science Instruction, Science Tests, Multiple Choice Tests, Biology
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Attali, Yigal – Educational Measurement: Issues and Practice, 2019
Rater training is an important part of developing and conducting large-scale constructed-response assessments. As part of this process, candidate raters have to pass a certification test to confirm that they are able to score consistently and accurately before they begin scoring operationally. Moreover, many assessment programs require raters to…
Descriptors: Evaluators, Certification, High Stakes Tests, Scoring
Er, Zübeyde; Dinç Artut, Perihan; Bal, Ayten Pinar – Pegem Journal of Education and Instruction, 2023
The aim of this research is to develop a reliable and valid scale to determine the mathematical thinking skills of gifted students. In addition, with the developed scale, thinking skills of gifted students was examined in terms of various variables. In this context, the research was carried out on two different study groups. The first stage of…
Descriptors: Measures (Individuals), Rating Scales, Test Construction, Construct Validity
Benton, Tom – Cambridge Assessment, 2018
One of the questions with the longest history in educational assessment is whether it is possible to increase the reliability of a test simply by altering the way in which scores on individual test items are combined to make the overall test score. Most usually, the score available on each item is communicated to the candidate within a question…
Descriptors: Test Items, Scoring, Predictive Validity, Test Reliability
Cesur, Kursat – Educational Policy Analysis and Strategic Research, 2019
Examinees' performances are assessed using a wide variety of different techniques. Multiple-choice (MC) tests are among the most frequently used ones. Nearly, all standardized achievement tests make use of MC test items and there is a variety of ways to score these tests. The study compares number right and liberal scoring (SAC) methods. Mixed…
Descriptors: Multiple Choice Tests, Scoring, Evaluation Methods, Guessing (Tests)