Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
Patel, Nirmal; Sharma, Aditya; Shah, Tirth; Lomas, Derek – Journal of Educational Data Mining, 2021
Process Analysis is an emerging approach to discover meaningful knowledge from temporal educational data. The study presented in this paper shows how we used Process Analysis methods on the National Assessment of Educational Progress (NAEP) test data for modeling and predicting student test-taking behavior. Our process-oriented data exploration…
Descriptors: Learning Analytics, National Competency Tests, Evaluation Methods, Prediction
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Wolkowitz, Amanda A.; Davis-Becker, Susan L.; Gerrow, Jack D. – Journal of Applied Testing Technology, 2016
The purpose of this study was to investigate the impact of a cheating prevention strategy employed for a professional credentialing exam that involved releasing over 7,000 active and retired exam items. This study evaluated: 1) If any significant differences existed between examinee performance on released versus non-released items; 2) If item…
Descriptors: Cheating, Test Content, Test Items, Foreign Countries
Wolf, Raffaela – ProQuest LLC, 2013
Preservation of equity properties was examined using four equating methods--IRT True Score, IRT Observed Score, Frequency Estimation, and Chained Equipercentile--in a mixed-format test under a common-item nonequivalent groups (CINEG) design. Equating of mixed-format tests under a CINEG design can be influenced by factors such as attributes of the…
Descriptors: Testing, Item Response Theory, Equated Scores, Test Items
Golovachyova, Viktoriya N.; Menlibekova, Gulbakhyt Zh.; Abayeva, Nella F.; Ten, Tatyana L.; Kogaya, Galina D. – International Journal of Environmental and Science Education, 2016
Computer-based monitoring systems that rely on tests could be the most effective way to evaluate knowledge. The problem of objective knowledge assessment by means of testing takes on a new dimension in the context of new paradigms in education. The analysis of the existing test methods enabled us to conclude that tests with selected…
Descriptors: Expertise, Computer Assisted Testing, Student Evaluation, Knowledge Level
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Haro, Elizabeth K.; Haro, Luis S. – Journal of Chemical Education, 2014
The multiple-choice question (MCQ) is the foundation of knowledge assessment in K-12, higher education, and standardized entrance exams (including the GRE, MCAT, and DAT). However, standard MCQ exams are limited with respect to the types of questions that can be asked when there are only five choices. MCQs offering additional choices more…
Descriptors: Multiple Choice Tests, Coding, Scoring Rubrics, Test Scoring Machines
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
Henning, Grant – English Teaching Forum, 2012
To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…
Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation
Panjaburee, Patcharin; Hwang, Gwo-Jen; Triampo, Wannapong; Shih, Bo-Ying – Computers & Education, 2010
With the popularization of computer and communication technologies, researchers have attempted to develop computer-assisted testing and diagnostic systems to help students improve their learning performance on the Internet. In developing a diagnostic system for detecting students' learning problems, it is difficult for individual teachers to…
Descriptors: Learning Problems, Test Items, Testing, Teaching Methods
Flowers, Claudia; Kim, Do-Hong; Lewis, Preston; Davis, Violeta Carmen – Journal of Special Education Technology, 2011
This study examined the academic performance and preference of students with disabilities for two types of test administration conditions, computer-based testing (CBT) and pencil-and-paper testing (PPT). Data from a large-scale assessment program were used to examine differences between CBT and PPT academic performance for third to eleventh grade…
Descriptors: Testing, Test Items, Effect Size, Computer Assisted Testing
Mullis, Ina V. S.; Bohrnstedt, George W.; Preuschoff, Anna Corinna; de los Reyes, Illiana; Stancavage, Fran; Martin, Michael O. – American Institutes for Research, 2012
National Assessment of Educational Progress (NAEP) has expended considerable effort to ensure high quality in data collection by developing standardized materials and survey operation procedures and using well-trained professional administrators. However, schools are allowed to minimize the disruption associated with pulling students out of…
Descriptors: Testing, National Competency Tests, Program Effectiveness, Scores
von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the author points out a few issues, one being that there are models mislabeled as diagnostic, which deal with linear decompositions of item difficulties rather than estimating multidimensional skill variables. The author discusses the issue that there are many new names for essentially well-known models for multiple simultaneous…
Descriptors: Test Items, Probability, Models, Diagnostic Tests
Hancock, Gregory R. – Measurement: Interdisciplinary Research and Perspectives, 2009
As Rupp and Templin (2008) stated directly, diagnostic classification methods "are confirmatory in nature." Methods, though, are neither inherently confirmatory nor exploratory. Diagnostic classification modeling, with its analytical and computational obstacles eventually yielding as a comprehensive and potent discipline emerges, will…
Descriptors: Structural Equation Models, Test Items, Models, Diagnostic Tests

