Publication Date
In 2025 | 1 |
Since 2024 | 5 |
Since 2021 (last 5 years) | 10 |
Since 2016 (last 10 years) | 33 |
Since 2006 (last 20 years) | 80 |
Descriptor
Scaling | 194 |
Test Construction | 194 |
Test Items | 66 |
Item Response Theory | 56 |
Test Validity | 55 |
Test Reliability | 49 |
Scoring | 42 |
Item Analysis | 36 |
Equated Scores | 34 |
Scores | 32 |
Psychometrics | 31 |
Audience
Researchers | 10 |
Administrators | 1 |
Students | 1 |
Location
Asia | 4 |
Australia | 4 |
Canada | 4 |
Italy | 3 |
New York | 3 |
United States | 3 |
Chile | 2 |
Denmark | 2 |
Europe | 2 |
Florida | 2 |
France | 2 |
Laws, Policies, & Programs
Individuals with Disabilities… | 3 |
No Child Left Behind Act 2001 | 3 |
Individuals with Disabilities… | 1 |
Kentucky Education Reform Act… | 1 |
Candra Skrzypek – Psychology in the Schools, 2024
Teachers play a critical role in school mental health. They aid in the identification and referral of students in need of mental health services and are key players in implementing interventions. Nevertheless, teachers often lack the education and training needed to support youths' mental health. Increasing teachers' mental health literacy (MHL)…
Descriptors: Teachers, Mental Health, Student Welfare, Multiple Literacies
Harold Doran; Tetsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025
Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
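The snippet stops before the article's generalized objective function, so the sketch below shows only the classic special case such functions generalize: selecting the unused item with maximum Fisher information under a 2PL model. The item pool and ability estimate are invented for illustration, not taken from the article.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of each item at ability theta."""
    p = p_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

def select_next_item(theta_hat, a, b, administered):
    """Classic CAT rule: pick the unused item with maximum
    information at the current ability estimate."""
    info = item_information(theta_hat, a, b)
    info[list(administered)] = -np.inf  # mask items already given
    return int(np.argmax(info))

# Illustrative 50-item pool: discriminations a, difficulties b.
rng = np.random.default_rng(0)
a = rng.uniform(0.8, 2.0, size=50)
b = rng.normal(0.0, 1.0, size=50)
print(select_next_item(theta_hat=0.3, a=a, b=b, administered={4, 17}))
```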
Hongwen Guo; Matthew S. Johnson; Daniel F. McCaffrey; Lixiong Gu – ETS Research Report Series, 2024
The multistage testing (MST) design has been gaining attention and popularity in educational assessments. For testing programs that have small test-taker samples, it is challenging to calibrate new items to replenish the item pool. In the current research, we used the item pools from an operational MST program to illustrate how research studies…
Descriptors: Test Items, Test Construction, Sample Size, Scaling
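The abstract does not detail the calibration design, but a common ingredient of replenishing an operational pool is to hold the established ability scale fixed and estimate only the new item's parameters. A minimal sketch under that assumption, using a Rasch model and treating provisional abilities as known (both are assumptions here, not claims about the ETS study):

```python
import numpy as np

def calibrate_new_item(thetas, responses, iters=25):
    """Estimate a new item's Rasch difficulty b while holding the
    examinees' abilities (already on the operational scale) fixed.
    Newton-Raphson on the item's log-likelihood."""
    b = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(thetas - b)))
        grad = np.sum(p - responses)   # dL/db
        hess = -np.sum(p * (1.0 - p))  # d2L/db2 (always negative)
        b -= grad / hess               # Newton step
    return b

# Simulated small calibration sample with a known true difficulty.
rng = np.random.default_rng(1)
thetas = rng.normal(size=300)
resp = rng.binomial(1, 1.0 / (1.0 + np.exp(-(thetas - 0.7))))
print(round(calibrate_new_item(thetas, resp), 2))  # close to 0.7
```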
Kyle D. S. Maclean; Tiffany Bayley – INFORMS Transactions on Education, 2024
We introduce a novel type of assessment that allows for efficient grading of higher order thinking skills. In this assessment, a student reviews and corrects a technical memo that has errors in its formulation or process. To overcome the grading challenges imposed by essay-type responses in large undergraduate courses, we provide a Visual Basic…
Descriptors: Undergraduate Students, Thinking Skills, Test Construction, Error Correction
Annamaria Di Fabio; Andrea Svicher – Journal of Psychoeducational Assessment, 2024
The Eco-Generativity Scale (EGS) is a recently developed 28-item scale derived from a 4-factor higher-order model (ecological generativity, social generativity, environmental identity, and agency/pathways). The aim of this study was to develop a short-scale version of the EGS to facilitate its use with university students (N = 779) who will…
Descriptors: Foreign Countries, College Students, Ecology, Likert Scales
Ozberk, Eren Halil; Unsal Ozberk, Elif Bengi; Uluc, Sait; Oktem, Ferhunde – International Journal of Assessment Tools in Education, 2021
The Kaufman Brief Intelligence Test, Second Edition (KBIT-2) is designed to measure verbal and nonverbal abilities in a wide range of individuals from 4 years 0 months to 90 years 11 months of age. This study examines both the advantages of using Mokken Scale Analysis (MSA) in intelligence tests and the hierarchical order of the items in the…
Descriptors: Intelligence Tests, Nonparametric Statistics, Test Items, Test Construction
Livingston, Samuel A. – Educational Testing Service, 2020
This booklet is a conceptual introduction to item response theory (IRT), which many large-scale testing programs use for constructing and scoring their tests. Although IRT is essentially mathematical, the approach here is nonmathematical, to serve as an introduction to the topic for people who want to understand why IRT is used and what…
Descriptors: Item Response Theory, Scoring, Test Items, Scaling
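For readers who want to peek past the booklet's nonmathematical approach, the central object is the item characteristic curve. A minimal sketch under the common two-parameter logistic (2PL) model, with invented parameter values:

```python
import numpy as np

def p_2pl(theta, a, b):
    """Item characteristic curve of the two-parameter logistic model:
    a = discrimination (slope), b = difficulty (location)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)       # points along the ability scale
easy = p_2pl(theta, a=1.0, b=-1.0)  # easy item
hard = p_2pl(theta, a=2.0, b=1.0)   # hard, steeply discriminating item
print(np.round(easy, 2), np.round(hard, 2), sep="\n")
```

Summing such curves across a test's items gives the expected total score at each ability level, which is the bridge IRT provides between item calibration, scoring, and scaling.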
Fidler, James R.; Risk, Nicole M. – Educational Measurement: Issues and Practice, 2019
Credentialing examination developers rely on task (job) analyses for establishing inventories of task and knowledge areas in which competency is required for safe and successful practice in target occupations. There are many ways in which task-related information may be gathered from practitioner ratings, each with its own advantage and…
Descriptors: Job Analysis, Scaling, Licensing Examinations (Professions), Test Construction
Raeder, Henrik Galligani; Andersson, Björn; Olsen, Rolf Vegar – Assessment in Education: Principles, Policy & Practice, 2022
Enabling comparable scores across grades is of interest for policymakers to evaluate educational systems, for researchers to investigate substantive questions, and for teachers to infer student growth. This study implemented a vertical scaling design to numeracy tests given in grades 5 and 8 as part of the Norwegian national testing system. Our…
Descriptors: Numeracy, Foreign Countries, Mathematics Tests, Scaling
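The snippet does not say how the grade 5 and grade 8 forms were placed on one scale; a standard ingredient of vertical scaling designs is a common-item (anchor) linking step. A minimal mean-sigma sketch with invented anchor difficulties, not the Norwegian program's actual procedure:

```python
import numpy as np

# Difficulties of the same anchor items, estimated separately in the
# grade-5 and grade-8 calibrations (invented values for illustration).
b_anchor_g5 = np.array([-0.4, 0.1, 0.6, 1.2])
b_anchor_g8 = np.array([-1.1, -0.7, -0.1, 0.4])

# Mean-sigma linking: choose A, B so that A * b_g8 + B matches the
# grade-5 anchors in mean and spread, then carry all grade-8
# parameters and abilities onto the grade-5 scale.
A = b_anchor_g5.std() / b_anchor_g8.std()
B = b_anchor_g5.mean() - A * b_anchor_g8.mean()
theta_g8_rescaled = A * 0.5 + B  # a grade-8 ability of 0.5, rescaled
print(round(A, 3), round(B, 3), round(theta_g8_rescaled, 3))
```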
Hsueh, JoAnn; Portilla, Ximena; McCormick, Meghan; Balu, Rekha; Najafi, Behnosh – MDRC, 2022
The Measures for Early Success Initiative aims to reimagine the landscape of early learning assessments for the millions of 3- to 5-year-olds enrolled in Pre-K, so that more equitable data can be applied to meaningfully support and strengthen early learning experiences for all young children. This document outlines design parameters for child…
Descriptors: Early Childhood Education, Preschool Children, Student Evaluation, Child Development
Ayanwale, Musa Adekunle; Ndlovu, Mdutshekelwa – Education Sciences, 2021
This study investigated the scalability of a cognitive multiple-choice test through the Mokken package in the R programming language for statistical computing. A 2019 mathematics West African Examinations Council (WAEC) instrument was used to gather data from randomly drawn K-12 participants (N = 2866; Male = 1232; Female = 1634; Mean age = 16.5…
Descriptors: Cognitive Tests, Multiple Choice Tests, Scaling, Test Items
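The mokken R package used in the study reports, among much else, Loevinger's H scalability coefficient, which is compact enough to write out directly. A minimal Python sketch for dichotomous items, with simulated responses standing in for the WAEC data:

```python
import numpy as np
from itertools import combinations

def loevinger_H(X):
    """Overall Loevinger H for a persons x items matrix of 0/1 scores:
    sum of inter-item covariances over their maxima given the margins."""
    p = X.mean(axis=0)
    num = den = 0.0
    for i, j in combinations(range(X.shape[1]), 2):
        num += np.cov(X[:, i], X[:, j], ddof=0)[0, 1]
        den += min(p[i], p[j]) - p[i] * p[j]  # perfect Guttman pattern
    return num / den

# Simulated stand-in data: 200 respondents, 5 dichotomous items.
rng = np.random.default_rng(2)
theta = rng.normal(size=(200, 1))
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
X = (rng.random((200, 5)) < 1.0 / (1.0 + np.exp(-(theta - b)))).astype(int)
print(round(loevinger_H(X), 2))  # >= .3 weak, >= .4 medium, >= .5 strong
```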
Christensen, Rhonda; Knezek, Gerald – Journal of Technology Education, 2022
This article describes the development and validation of an Innovation Attitude Survey (IAS) composed of 16 Likert-type items selected to measure middle school students' attitudes toward innovation and leadership in the advancement of new ideas. The goal of developing the IAS was to identify desirable dispositions that may be related to future…
Descriptors: Attitude Measures, Likert Scales, Test Construction, Test Validity
Browne, Matthew; Rockloff, Matthew; Rawat, Vijay – Sociological Methods & Research, 2018
Development and refinement of self-report measures generally involves selecting a subset of indicators from a larger set. Despite the importance of this task, methods applied to accomplish this are often idiosyncratic and ad hoc, or based on incomplete statistical criteria. We describe a structural equation modeling (SEM)-based technique, based on…
Descriptors: Structural Equation Models, Scaling, Evaluation Criteria, Psychometrics
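The snippet cuts off before the authors' SEM-based criterion, so the following is a deliberately simpler stand-in rather than their technique: ranking indicators by their loadings on the first principal axis of the correlation matrix, a naive baseline for subset selection. The data and loading weights are invented:

```python
import numpy as np

def first_factor_loadings(X):
    """Loadings of each indicator on the first principal axis of
    the correlation matrix (a crude one-factor approximation)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)       # ascending order
    loadings = np.sqrt(eigvals[-1]) * eigvecs[:, -1]
    return loadings * np.sign(loadings.sum())  # fix sign ambiguity

# Simulated one-factor data with decreasing true loadings.
rng = np.random.default_rng(3)
f = rng.normal(size=(500, 1))
w = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
X = f * w + rng.normal(scale=0.6, size=(500, 8))

# Keep the four indicators with the largest loadings.
keep = np.argsort(-np.abs(first_factor_loadings(X)))[:4]
print(sorted(keep.tolist()))  # likely the high-loading items 0-3
```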
Smith, William Zachary; Dickenson, Tammiee S.; Rogers, Bradley David – AERA Online Paper Repository, 2017
Questionnaire refinement and a process for selecting items for elimination are important tools for survey developers. A major obstacle in questionnaire refinement and item elimination lies in one's ability to reconstruct a survey adequately and appropriately. Oftentimes, surveys can be long and strenuous on the respondent,…
Descriptors: Surveys, Psychometrics, Test Construction, Test Reliability
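The paper's refinement process is only hinted at in the snippet; a standard companion statistic for item-elimination decisions (not necessarily the authors' choice) is Cronbach's alpha recomputed with each item deleted. A minimal sketch on simulated data:

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for a persons x items score matrix."""
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)
    total_var = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def alpha_if_deleted(X):
    """Alpha recomputed with each item dropped; items whose removal
    raises alpha are candidates for elimination."""
    return np.array([cronbach_alpha(np.delete(X, j, axis=1))
                     for j in range(X.shape[1])])

rng = np.random.default_rng(4)
f = rng.normal(size=(300, 1))
X = f + rng.normal(scale=1.0, size=(300, 6))
X[:, 5] = rng.normal(size=300)             # one unrelated item
print(np.round(alpha_if_deleted(X), 2))    # dropping item 5 helps most
```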
Kim, Yoon Jeon; Almond, Russell G.; Shute, Valerie J. – International Journal of Testing, 2016
Game-based assessment (GBA) is a specific use of educational games that employs game activities to elicit evidence for educationally valuable skills and knowledge. While this approach can provide individualized and diagnostic information about students, the design and development of assessment mechanics for a GBA is a nontrivial task. In this…
Descriptors: Design, Evidence Based Practice, Test Construction, Physics