Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since its development in 1973 in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Ö. Emre C. Alagöz; Thorsten Meiser – Educational and Psychological Measurement, 2024
To improve the validity of self-report measures, researchers should control for response style (RS) effects, which can be achieved with IRTree models. A traditional IRTree model considers a response as a combination of distinct decision-making processes, where the substantive trait affects the decision on response direction, while decisions about…
Descriptors: Item Response Theory, Validity, Self Evaluation (Individuals), Decision Making
Luan, Lin; Liang, Jyh-Chong; Chai, Ching Sing; Lin, Tzu-Bin; Dong, Yan – Interactive Learning Environments, 2023
The emergence of new media technologies has empowered individuals to not merely consume but also create, share and critique media contents. Such activities are dependent on new media literacy (NML) necessary for living and working in the participatory culture of the twenty-first century. Although a burgeoning body of research has focused on the…
Descriptors: Foreign Countries, Media Literacy, Test Construction, English (Second Language)
Nájera, Pablo; Sorrel, Miguel A.; Abad, Francisco José – Educational and Psychological Measurement, 2019
Cognitive diagnosis models (CDMs) are latent class multidimensional statistical models that help classify people accurately by using a set of discrete latent variables, commonly referred to as attributes. These models require a Q-matrix that indicates the attributes involved in each item. A potential problem is that the Q-matrix construction…
Descriptors: Matrices, Statistical Analysis, Models, Classification
Karadavut, Tugba – Applied Measurement in Education, 2021
Mixture IRT models address the heterogeneity in a population by extracting latent classes and allowing item parameters to vary between latent classes. Once the latent classes are extracted, they need to be further examined to be characterized. Some approaches have been adopted in the literature for this purpose. These approaches examine either the…
Descriptors: Item Response Theory, Models, Test Items, Maximum Likelihood Statistics
Meng, Yaru; Fu, Hua – Modern Language Journal, 2023
The distinguishing feature of dynamic assessment (DA) is the dialectical integration of assessment and instruction. However, how to design the targeted instruction or mediation has been relatively underexplored. To address this gap, this study proposes the attribute-based mediation model (AMM), an English-as-a-foreign-language listening mediation…
Descriptors: Evaluation Methods, Teaching Methods, Models, English (Second Language)
Anderson, Daniel; Rowley, Brock; Stegenga, Sondra; Irvin, P. Shawn; Rosenberg, Joshua M. – Educational Measurement: Issues and Practice, 2020
Validity evidence based on test content is critical to meaningful interpretation of test scores. Within high-stakes testing and accountability frameworks, content-related validity evidence is typically gathered via alignment studies, with panels of experts providing qualitative judgments on the degree to which test items align with the…
Descriptors: Content Validity, Artificial Intelligence, Test Items, Vocabulary
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to the education system decades ago, it seems that these models have not been employed for the original aims for which they were designed. DCMs have mostly been used in analyzing large-scale non-diagnostic tests, and these models have rarely been used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
Mateja Ploj Virtic; Andre Du Plessis; Andrej Šorgo – Center for Educational Policy Studies Journal, 2023
In the context of improving the quality of teacher education, the focus of the present work was to adapt the Mentoring for Effective Primary Science Teaching instrument to become more universal and have the potential to be used beyond the elementary science mentoring context. The adapted instrument was renamed the Mentoring for Effective Teaching…
Descriptors: Test Construction, Test Validity, Test Reliability, Measures (Individuals)
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Qi Huang; Daniel M. Bolt; Weicong Lyu – Large-scale Assessments in Education, 2024
Large scale international assessments depend on invariance of measurement across countries. An important consideration when observing cross-national differential item functioning (DIF) is whether the DIF actually reflects a source of bias, or might instead be a methodological artifact reflecting item response theory (IRT) model misspecification.…
Descriptors: Test Items, Item Response Theory, Test Bias, Test Validity
Rao, Dhawaleswar; Saha, Sujan Kumar – IEEE Transactions on Learning Technologies, 2020
Automatic multiple choice question (MCQ) generation from a text is a popular research area. MCQs are widely accepted for large-scale assessment in various domains and applications. However, manual generation of MCQs is expensive and time-consuming. Therefore, researchers have been attracted toward automatic MCQ generation since the late 90's.…
Descriptors: Multiple Choice Tests, Test Construction, Automation, Computer Software
Foghahaee, Zahra – Language Teaching Research Quarterly, 2019
Reverse engineering (RE) can play an important role in redesigning tests in L2 English. It can also enrich the aim of teaching the same as raising children through academic achievement. In addition, it can play a key role in helping students understand how valid their test is by using Standard reverse engineering (SRE). This paper is a…
Descriptors: Language Tests, Second Language Learning, Second Language Instruction, English (Second Language)
Torre, Jimmy de la; Akbay, Lokman – Eurasian Journal of Educational Research, 2019
Purpose: Well-designed assessment methodologies and various cognitive diagnosis models (CDMs) to extract diagnostic information about examinees' individual strengths and weaknesses have been developed. Due to this novelty, as well as educational specialists' lack of familiarity with CDMs, their applications are not widespread. This article aims at…
Descriptors: Cognitive Measurement, Models, Computer Software, Testing