Publication Date
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Mohammed Ambusaidi – ProQuest LLC, 2022
There is an increased demand on nursing faculty to provide quality teaching and assessment. Nursing faculty are required to ensure accurate assessment of learning through testing and outcome measurement that are critical elements of the evaluation process. Likewise, nursing faculty should implement a logical evaluation system. However, the…
Descriptors: Nursing Education, College Faculty, Test Construction, Test Validity
Akbay, Lokman; Kilinç, Mustafa – International Journal of Assessment Tools in Education, 2018
Measurement models need to properly delineate the real aspect of examinees' response processes for measurement accuracy purposes. To avoid invalid inferences, fit of examinees' response data to the model is studied through "person-fit" statistics. Misfit between the examinee response data and measurement model may be due to invalid…
Descriptors: Reliability, Goodness of Fit, Cognitive Measurement, Models
Sinharay, Sandip; Jensen, Jens Ledet – Grantee Submission, 2018
In educational and psychological measurement, researchers and/or practitioners are often interested in examining whether the ability of an examinee is the same over two sets of items. Such problems can arise in measurement of change, detection of cheating on unproctored tests, erasure analysis, detection of item preknowledge etc. Traditional…
Descriptors: Test Items, Ability, Mathematics, Item Response Theory
Benton, Tom – Cambridge Assessment, 2018
One of the questions with the longest history in educational assessment is whether it is possible to increase the reliability of a test simply by altering the way in which scores on individual test items are combined to make the overall test score. Most usually, the score available on each item is communicated to the candidate within a question…
Descriptors: Test Items, Scoring, Predictive Validity, Test Reliability
Azevedo, Jose Manuel; Oliveira, Ema P.; Beites, Patrícia Damas – International Journal of Information and Learning Technology, 2019
Purpose: The purpose of this paper is to find appropriate forms of analysis of multiple-choice questions (MCQ) to obtain an assessment method, as fair as possible, for the students. The authors intend to ascertain if it is possible to control the quality of the MCQ contained in a bank of questions, implemented in Moodle, presenting some evidence…
Descriptors: Learning Analytics, Multiple Choice Tests, Test Theory, Item Response Theory
Snow, Stephen; Wilde, Adriana; Denny, Paul; schraefel, m. c. – British Journal of Educational Technology, 2019
Peer-learning that engages students in multiple choice question (MCQ) formulation promotes higher task engagement and deeper learning than simply answering MCQs in summative assessment. Yet presently, the literature detailing deployments of student-authored MCQ software is biased towards accounts from Science, Technology, Engineering, Maths and…
Descriptors: Student Developed Materials, Multiple Choice Tests, Computer Software, Cooperative Learning
Zhan, Peida; Jiao, Hong; Man, Kaiwen; Wang, Lijun – Journal of Educational and Behavioral Statistics, 2019
In this article, we systematically introduce the just another Gibbs sampler (JAGS) software program to fit common Bayesian cognitive diagnosis models (CDMs) including the deterministic inputs, noisy "and" gate model; the deterministic inputs, noisy "or" gate model; the linear logistic model; the reduced reparameterized unified…
Descriptors: Bayesian Statistics, Computer Software, Models, Test Items
Trate, Jaclyn M.; Fisher, Victoria; Blecking, Anja; Geissinger, Peter; Murphy, Kristen L. – Journal of Chemical Education, 2019
Assessment and evaluation tools and instruments are developed to measure many things from content knowledge to misconceptions to student affect. The standard validation processes for these are regularly conducted and provide strong evidence for the validity of the measurements that are made. As part of the suite of validation tools available to…
Descriptors: Test Validity, Multiple Choice Tests, Chemistry, Science Tests
Crisp, Victoria; Johnson, Martin; Constantinou, Filio – Research in Education, 2019
In educational contexts, questioning performs a number of functions. These include facilitating learning in the classroom and the recognition of achievement through examinations and other assessments. Good quality questions are important to ensuring that these functions are achieved. This research focused on educational exams and used views from…
Descriptors: Test Items, Test Construction, Educational Quality, Test Validity
Sideridis, Georgios D.; Tsaousis, Ioannis; Al-Sadaawi, Abdullah – Educational and Psychological Measurement, 2019
The purpose of the present study was to apply the methodology developed by Raykov on modeling item-specific variance for the measurement of internal consistency reliability with longitudinal data. Participants were a randomly selected sample of 500 individuals who took a professional qualifications test in Saudi Arabia over four different…
Descriptors: Test Reliability, Test Items, Longitudinal Studies, Foreign Countries
Setiawan, Risky – European Journal of Educational Research, 2019
The purposes of this research are: 1) to compare two equalizing tests conducted with the Haebara and Stocking-Lord methods; 2) to describe the characteristics of each equalizing test method using the Windows IRTEQ program. This research employs a participatory approach as the data are collected through questionnaires based on the National Examination…
Descriptors: Equated Scores, Evaluation Methods, Evaluation Criteria, Test Items
Eaton, Philip; Frank, Barrett; Johnson, Keith; Willoughby, Shannon – Physical Review Physics Education Research, 2019
While numerous studies have analyzed the conceptions probed by the Force Concept Inventory (FCI), assessments dedicated to electricity and magnetism lack similar analyses. This paper investigated the conceptions explored by the Brief Electricity and Magnetism Assessment (BEMA) and the Conceptual Survey of Electricity and Magnetism (CSEM) using…
Descriptors: Energy, Magnets, Physics, Science Tests
Wang, Wenyi; Song, Lihong; Chen, Ping; Ding, Shuliang – Journal of Educational Measurement, 2019
Most of the existing classification accuracy indices of attribute patterns lose effectiveness when the response data is absent in diagnostic testing. To handle this issue, this article proposes new indices to predict the correct classification rate of a diagnostic test before administering the test under the deterministic noise input…
Descriptors: Cognitive Tests, Classification, Accuracy, Diagnostic Tests
Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019
This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…
Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory
Breakall, Jared; Randles, Christopher; Tasker, Roy – Chemistry Education Research and Practice, 2019
Multiple-choice (MC) exams are common in undergraduate general chemistry courses in the United States and are known for being difficult to construct. With their extensive use in the general chemistry classroom, it is important to ensure that these exams are valid measures of what chemistry students know and can do. One threat to MC exam validity…
Descriptors: Science Instruction, Chemistry, College Science, Undergraduate Study

