Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 7 |
Since 2016 (last 10 years) | 26 |
Since 2006 (last 20 years) | 50 |
Descriptor
Evaluation Methods | 80 |
Test Items | 80 |
Test Reliability | 50 |
Test Validity | 37 |
Test Construction | 27 |
Scores | 20 |
Student Evaluation | 20 |
Foreign Countries | 19 |
Interrater Reliability | 17 |
Item Response Theory | 17 |
Reliability | 17 |
Author
Friedman, Greg | 2 |
McGinty, Dixie | 2 |
Michaels, Hillary | 2 |
Neel, John H. | 2 |
Ochieng, Charles | 2 |
Yen, Shu Jing | 2 |
Aaron McVay | 1 |
Adadan, Emine | 1 |
Ahmed, Wondimu | 1 |
Akarsu, Bayram | 1 |
Akaygun, Sevil | 1 |
Location
China | 2 |
India | 2 |
Israel | 2 |
Turkey | 2 |
United Kingdom | 2 |
United States | 2 |
Canada | 1 |
Colorado | 1 |
Dominica | 1 |
Egypt | 1 |
Ethiopia | 1 |
Laws, Policies, & Programs
Every Student Succeeds Act… | 3 |
Individuals with Disabilities… | 3 |
Rehabilitation Act 1973… | 3 |
No Child Left Behind Act 2001 | 1 |
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of the α, λ2, λ_4, λ_2, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Langbeheim, Elon; Akaygun, Sevil; Adadan, Emine; Hlatshwayo, Manzini; Ramnarain, Umesh – International Journal of Science and Mathematics Education, 2023
Linking assessment and curriculum in science education, particularly within the topic of matter and its changes, is often taken for granted. Some of the fundamental elements of the assessment, such as the choice of wording and visual representations, as well as its relation to the curricular sequence, remain understudied. In addition, very few…
Descriptors: Student Evaluation, Evaluation Methods, Science Education, Test Items
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Aaron McVay – ProQuest LLC, 2021
As assessments move toward computerized testing and continuous test availability, the need for rapid assembly of forms is increasing. The objective of this study was to investigate variability in assembled forms through the lens of first- and second-order equity properties of equating, by examining three factors and their interactions. Two…
Descriptors: Automation, Computer Assisted Testing, Test Items, Reaction Time
Toker, Turker – International Journal of Curriculum and Instruction, 2023
Achievement tests are among the most widely used data collection tools to measure the knowledge and skill levels of individuals. For this reason, the existence of valid and reliable achievement tests that can perfectly reveal the competencies that a person should have in any discipline is of great importance. The purpose of this research is to…
Descriptors: Basic Skills, Evaluation Methods, Test Items, Test Validity
Sewagegn, Abatihun A. – International Journal of Instruction, 2019
Assessment plays a significant role in determining the quality of education. This is particularly so when students are properly assessed using various appropriate methods of assessment. This study investigates teachers' assessment methods and the challenges they encounter in assessing learning in an Ethiopian university. A convergent parallel…
Descriptors: Evaluation Methods, Teaching Experience, College Faculty, Foreign Countries
Koçak, Duygu – International Electronic Journal of Elementary Education, 2020
One of the most commonly used methods for measuring higher-order thinking skills such as problem-solving or written expression is open-ended items. Three main approaches are used to evaluate responses to open-ended items: general evaluation, rating scales, and rubrics. In order to measure and improve problem-solving skills of students, firstly, an…
Descriptors: Interrater Reliability, Item Response Theory, Test Items, Rating Scales
Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
Torres, Anthony; Sriraman, Vedaraman; Ortiz, Araceli – International Journal of Instruction, 2021
The focus of this study is to implement multiple assessment methods in order to comprehensively assess the impact of a Project Based Learning (PrBL) application in a construction project management course. The assessment methods include various direct (objective) and indirect (subjective) evaluation methods. These methods included a pre and post…
Descriptors: Active Learning, Student Projects, Construction Management, Student Attitudes
Sanyin Cheng – Journal of Developmental and Physical Disabilities, 2020
This research evaluates the appropriateness of test accommodations among Chinese university students with hearing impairment, using reliability estimates and exploratory factor analysis. Study 1 explores the appropriateness of test directions accommodation for one nonverbal assessment (Group Embedded Figures Test) and two verbal assessments with…
Descriptors: Testing Accommodations, College Students, Deafness, Hearing Impairments
Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019
The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…
Descriptors: Test Items, Translation, Computer Software, Evaluators
Wang, Xiaolin; Svetina, Dubravka; Dai, Shenghai – Journal of Experimental Education, 2019
Recently, interest in test subscore reporting for diagnosis purposes has been growing rapidly. The two simulation studies here examined factors (sample size, number of subscales, correlation between subscales, and three factors affecting subscore reliability: number of items per subscale, item parameter distribution, and data generating model)…
Descriptors: Value Added Models, Scores, Sample Size, Correlation
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Koskey, Kristin L. K.; Makki, Nidaa; Ahmed, Wondimu; Garafolo, Nicholas G.; Visco, Donald P., Jr. – School Science and Mathematics, 2020
Integrating engineering into the K-12 science curriculum continues to be a focus in national reform efforts in science education. Although there is an increasing interest in research in and practice of integrating engineering in K-12 science education, to date only a few studies have focused on the development of an assessment tool to measure…
Descriptors: Middle School Students, Engineering, Design, Science Education
Desstya, Anatri; Prasetyo, Zuhdan Kun; Suyanta; Susila, Ihwan; Irwanto – International Journal of Instruction, 2019
This study aims to report the development of a standardized instrument (reviewed for validity, reliability, and difficulty index) to detect science misconceptions in elementary school teachers. This study used a 4-D model: defining, designing, developing, and disseminating. First, it was prepared with 47 open-ended questions, and then it…
Descriptors: Elementary School Teachers, Misconceptions, Evaluation Methods, Teacher Evaluation