Publication Date
In 2025 | 2 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 8 |
Since 2016 (last 10 years) | 25 |
Since 2006 (last 20 years) | 44 |
Descriptor
Models | 58 |
Test Items | 58 |
Test Reliability | 58 |
Test Validity | 30 |
Test Construction | 25 |
Item Response Theory | 21 |
Item Analysis | 17 |
Goodness of Fit | 16 |
Foreign Countries | 13 |
Psychometrics | 13 |
Difficulty Level | 9 |
Author
Burton, Richard F. | 2 |
Champagne, Zachary M. | 2 |
Farina, Kristy | 2 |
Hambleton, Ronald K. | 2 |
LaVenia, Mark | 2 |
Lee, Won-Chan | 2 |
Schoen, Robert C. | 2 |
Trevisan, Michael S. | 2 |
Wang, Wen-Chung | 2 |
Aditya Shah | 1 |
Ajay Devmane | 1 |
Audience
Practitioners | 2 |
Administrators | 1 |
Assessments and Surveys
Graduate Record Examinations | 2 |
Alberta Grade Twelve Diploma… | 1 |
Center for Epidemiologic… | 1 |
Hidden Figures Test | 1 |
Stages of Concern… | 1 |
Wechsler Adult Intelligence… | 1 |
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since it was developed in 1973, in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Dhyaaldian, Safa Mohammed Abdulridah; Kadhim, Qasim Khlaif; Mutlak, Dhameer A.; Neamah, Nour Raheem; Kareem, Zaidoon Hussein; Hamad, Doaa A.; Tuama, Jassim Hassan; Qasim, Mohammed Saad – International Journal of Language Testing, 2022
A C-Test is a gap-filling test for measuring language competence in the first and second language. C-Tests are usually analyzed with polytomous Rasch models by considering each passage as a super-item or testlet. This strategy helps overcome the local dependence inherent in C-Test gaps. However, there is little research on the best polytomous…
Descriptors: Item Response Theory, Cloze Procedure, Reading Tests, Language Tests
Mateja Ploj Virtic; Andre Du Plessis; Andrej Šorgo – Center for Educational Policy Studies Journal, 2023
In the context of improving the quality of teacher education, the focus of the present work was to adapt the Mentoring for Effective Primary Science Teaching instrument to become more universal and have the potential to be used beyond the elementary science mentoring context. The adapted instrument was renamed the Mentoring for Effective Teaching…
Descriptors: Test Construction, Test Validity, Test Reliability, Measures (Individuals)
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020
Automatic multiple choice question (MCQ) generation from a text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been attracted toward automatic MCQ generation since the late 1990s.…
Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software
Sideridis, Georgios D.; Tsaousis, Ioannis; Al-Sadaawi, Abdullah – Educational and Psychological Measurement, 2019
The purpose of the present study was to apply the methodology developed by Raykov on modeling item-specific variance for the measurement of internal consistency reliability with longitudinal data. Participants were a randomly selected sample of 500 individuals who took a professional qualifications test in Saudi Arabia over four different…
Descriptors: Test Reliability, Test Items, Longitudinal Studies, Foreign Countries
Intasoi, Sasima; Junpeng, Putcharee; Tang, Keow Ngang; Ketchatturat, Jatuphum; Zhang, Yidan; Wilson, Mark – International Journal of Evaluation and Research in Education, 2020
The study aimed to develop and validate an assessment framework of multidimensional scientific competencies for seventh-grade students in the northeastern region of Thailand. A total of 289 samples with three different scientific competency levels were randomly selected to participate as test-takers. The design-based research encompassing four…
Descriptors: Science Tests, Grade 7, Foreign Countries, Science Process Skills
Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018
Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…
Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability
Warsono; Nursuhud, Puji Iman; Darma, Rio Sandhika; Supahar – International Journal of Instruction, 2020
The study was conducted to analyze items measuring high school students' diagram representation ability and to obtain the item characteristic curves. The test instrument grid was compiled based on competencies and indicators of diagram representation, which were then used to write the items. The test instrument consisted of five items and was validated by…
Descriptors: High School Students, Problem Solving, Visual Aids, Scoring
Al-Jarf, Reima – Online Submission, 2023
This article aims to give a comprehensive guide to planning and designing vocabulary tests, which includes identifying the skills to be covered by the test; outlining the course content covered; preparing a table of specifications that shows the skill, content topics, and number of questions allocated to each; and preparing the test instructions. The…
Descriptors: Vocabulary Development, Learning Processes, Test Construction, Course Content
Sarwanto; Fajari, Laksmi Evasufi Widi; Chumdari – International Journal of Instruction, 2021
Critical thinking skills are the 21st-century life skills that are needed by students. However, in elementary schools, there are no instruments that are truly effective and efficient to measure critical thinking skills. This research aims to develop an open-ended question assessment instrument to measure students' critical-thinking skills, to test…
Descriptors: Critical Thinking, Thinking Skills, Teaching Methods, Questioning Techniques
Inal, Ebru; Altintas, Kerim Hakan; Dogan, Nuri – International Journal of Assessment Tools in Education, 2018
The Health Belief Model (HBM) is one of the oldest and most recognized conceptual frameworks of health behavior and can be applied to disaster preparedness efforts, which focus predominantly on human behavior. The study aims to develop and test the psychometric properties of the General Disaster Preparedness Belief (GDPB) scale based on the HBM. A…
Descriptors: Natural Disasters, Emergency Programs, Health Behavior, Models
Tabatabaee-Yazdi, Mona – SAGE Open, 2020
The Hierarchical Diagnostic Classification Model (HDCM) reflects on the sequences of the presentation of the essential materials and attributes to answer the items of a test correctly. In this study, a foreign language reading comprehension test was analyzed employing HDCM and the generalized deterministic-input, noisy and gate (G-DINA) model to…
Descriptors: Diagnostic Tests, Classification, Models, Reading Comprehension