Publication Date
| In 2026 | 0 |
| Since 2025 | 3 |
| Since 2022 (last 5 years) | 5 |
| Since 2017 (last 10 years) | 11 |
| Since 2007 (last 20 years) | 19 |
Descriptor
| Models | 30 |
| Scores | 30 |
| Test Reliability | 30 |
| Test Validity | 11 |
| Test Items | 9 |
| Comparative Analysis | 8 |
| Evaluation Methods | 7 |
| Statistical Analysis | 7 |
| Test Construction | 7 |
| Educational Assessment | 5 |
| Error of Measurement | 5 |
Author
| Al-Jarf, Reima | 1 |
| Ali Moghadamzadeh | 1 |
| Amrein-Beardsley, Audrey | 1 |
| Bergquist, Constance | 1 |
| Bob delMas | 1 |
| Bormuth, John R. | 1 |
| Brad Hartlaub | 1 |
| Burton, Richard F. | 1 |
| Catherine Case | 1 |
| Dorans, Neil J. | 1 |
| Douglas Whitaker | 1 |
Publication Type
| Journal Articles | 22 |
| Reports - Research | 18 |
| Reports - Evaluative | 5 |
| Reports - Descriptive | 2 |
| Speeches/Meeting Papers | 2 |
| Collected Works - Proceedings | 1 |
| Dissertations/Theses -… | 1 |
| Reports - General | 1 |
Education Level
| Higher Education | 4 |
| Elementary Education | 3 |
| Postsecondary Education | 3 |
| Grade 3 | 2 |
| Early Childhood Education | 1 |
| Elementary Secondary Education | 1 |
| Grade 1 | 1 |
| Grade 4 | 1 |
| Grade 5 | 1 |
| Grade 6 | 1 |
| Grade 7 | 1 |
Location
| Asia | 1 |
| Australia | 1 |
| Brazil | 1 |
| Connecticut | 1 |
| Denmark | 1 |
| Egypt | 1 |
| Estonia | 1 |
| Florida | 1 |
| Germany | 1 |
| Greece | 1 |
| Hawaii | 1 |
Assessments and Surveys
| ACT Assessment | 1 |
| California Achievement Tests | 1 |
| Myers Briggs Type Indicator | 1 |
| Stages of Concern… | 1 |
| Wechsler Intelligence Scale… | 1 |
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Kent Anderson Seidel – School Leadership Review, 2025
This paper examines one of three central diagnostic tools of the Concerns Based Adoption Model, the Stages of Concern Questionnaire (SoCQ). The SoCQ was developed with a focus on K12 education. It has been used widely since developed in 1973, in early childhood, higher education, medical, business, community, and military settings. The SoCQ…
Descriptors: Questionnaires, Educational Change, Educational Innovation, Intervention
Marzieh Haghayeghi; Ali Moghadamzadeh; Hamdollah Ravand; Mohamad Javadipour; Hossein Kareshki – Journal of Psychoeducational Assessment, 2025
This study aimed to address the need for a comprehensive assessment tool to evaluate the mathematical abilities of first-grade students through cognitive diagnostic assessment (CDA). The primary challenge involved in this endeavor was to delineate the specific cognitive skills and sub-skills pertinent to first-grade mathematics (FG-M) and to…
Descriptors: Test Construction, Cognitive Measurement, Check Lists, Mathematics Tests
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Rubright, Jonathan D. – Educational Measurement: Issues and Practice, 2018
Performance assessments, scenario-based tasks, and other groups of items carry a risk of violating the local item independence assumption made by unidimensional item response theory (IRT) models. Previous studies have identified negative impacts of ignoring such violations, most notably inflated reliability estimates. Still, the influence of this…
Descriptors: Performance Based Assessment, Item Response Theory, Models, Test Reliability
Al-Jarf, Reima – Online Submission, 2023
This article aims to give a comprehensive guide to planning and designing vocabulary tests, which includes identifying the skills to be covered by the test; outlining the course content covered; preparing a table of specifications that shows the skill, content topics and number of questions allocated to each; and preparing the test instructions. The…
Descriptors: Vocabulary Development, Learning Processes, Test Construction, Course Content
Smogorzewska, Joanna; Szumski, Grzegorz; Grygiel, Pawel – Developmental Psychology, 2019
The main aims of this study were to further validate the Children's Social Understanding Scale (CSUS), which is a parent-report measure developed by Tahiroglu and colleagues, and to fill in some gaps in the existing research. Our study included more than 700 Polish parents from a diverse educational background who had children with disabilities,…
Descriptors: Foreign Countries, Children, Theory of Mind, Disabilities
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Haberman, Shelby J.; Liu, Yang; Lee, Yi-Hsuan – ETS Research Report Series, 2019
Distractor analyses are routinely conducted in educational assessments with multiple-choice items. In this research report, we focus on three item response models for distractors: (a) the traditional nominal response (NR) model, (b) a combination of a two-parameter logistic model for item scores and a NR model for selections of incorrect…
Descriptors: Multiple Choice Tests, Scores, Test Reliability, High Stakes Tests
Westrick, Paul A. – Educational Assessment, 2017
Undergraduate grade point average (GPA) is a commonly employed measure in educational research, serving as a criterion or as a predictor depending on the research question. Over the decades, researchers have used a variety of reliability coefficients to estimate the reliability of undergraduate GPA, which suggests that there has been no consensus…
Descriptors: Undergraduate Students, Test Reliability, College Entrance Examinations, Longitudinal Studies
Katz, Daniel S. – Kappa Delta Pi Record, 2016
Including growth models based on student test scores in teacher evaluations effectively holds teachers individually accountable for students improving their test scores. While an attractive policy for state administrators and advocates of education reform, value-added measures have been fraught with problems, and their use in teacher evaluation is…
Descriptors: Teacher Evaluation, Models, Scores, Evaluation Criteria
Dorans, Neil J. – ETS Research Report Series, 2014
Simulations are widely used. Simulations produce numbers that are deductive demonstrations of what a model says will happen. They produce numerical results that are consistent with the premises of the model used to generate the numbers. These simulated numerical results are not empirical data that address aspects of the world that lies outside the…
Descriptors: Simulation, Equated Scores, Scores, Scientific Methodology
Amrein-Beardsley, Audrey; Geiger, Tray – Phi Delta Kappan, 2017
Houston's experience with the Educational Value-Added Assessment System (R) (EVAAS) raises questions that other districts should consider before buying the software and using it for high-stakes decisions. Researchers found that teachers in Houston, all of whom were under the EVAAS gun, but who taught relatively more racial minority students,…
Descriptors: Value Added Models, School Districts, Computer Software, Educational Technology
Peoples, Shelagh – ProQuest LLC, 2012
The purpose of this study was to determine which of three competing models will provide reliable, interpretable, and responsive measures of elementary students' understanding of the nature of science (NOS). The Nature of Science Instrument-Elementary (NOSI-E), a 28-item Rasch-based instrument, was used to assess students' NOS…
Descriptors: Scientific Principles, Science Tests, Elementary School Students, Item Response Theory
Lorence, Jon – Educational Research Quarterly, 2010
The Texas Assessment of Academic Skills (TAAS) test was the major source of data for the Texas educational accountability system from 1994 through 2002. Contrary to critics who claim that TAAS data are invalid and unreliable measures of student performance, structural equation analyses of TAAS reading data based on the 1994 Texas third grade…
Descriptors: Educational Assessment, High Stakes Tests, Reading Tests, Scores
