Publication Date
| In 2026 | 0 |
| Since 2025 | 8 |
| Since 2022 (last 5 years) | 36 |
| Since 2017 (last 10 years) | 115 |
| Since 2007 (last 20 years) | 378 |
Descriptor
| Test Theory | 1166 |
| Test Items | 262 |
| Test Reliability | 252 |
| Test Construction | 246 |
| Test Validity | 245 |
| Psychometrics | 183 |
| Scores | 176 |
| Item Response Theory | 168 |
| Foreign Countries | 160 |
| Item Analysis | 141 |
| Statistical Analysis | 134 |
Location
| United States | 17 |
| United Kingdom (England) | 15 |
| Canada | 14 |
| Australia | 13 |
| Turkey | 12 |
| Sweden | 8 |
| United Kingdom | 8 |
| Netherlands | 7 |
| Texas | 7 |
| New York | 6 |
| Taiwan | 6 |
Laws, Policies, & Programs
| No Child Left Behind Act 2001 | 4 |
| Elementary and Secondary… | 3 |
| Individuals with Disabilities… | 3 |
Jeong, Heejeong – Language Testing, 2013
Language assessment courses (LACs) are taught by professionals who have majored in the area of language testing (language testers or LTs), but also by others who come from different language-related majors (non-language testers, non-LTs). Different language assessment courses may be developed, depending on who teaches the course and the…
Descriptors: Language Tests, Courses, Teacher Education, Teacher Educators
Lane, Kathleen Lynne; Oakes, Wendy Peia; Carter, Erik W.; Lambert, Warren E.; Jenkins, Abbie B. – Assessment for Effective Intervention, 2013
We reported findings of an exploratory validation study of a revised universal screening instrument: the Student Risk Screening Scale--Internalizing and Externalizing (SRSS-IE) for use with middle school students. Tested initially for use with elementary-age students, the SRSS-IE was adapted to include seven additional items reflecting…
Descriptors: Test Reliability, Test Validity, Screening Tests, Middle School Students
Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013
A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…
Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance
Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012
Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…
Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts
Yorke, Mantz; Orr, Susan; Blair, Bernadette – Studies in Higher Education, 2014
There has long been the suspicion amongst staff in Art & Design that the ratings given to their subject disciplines in the UK's National Student Survey are adversely affected by a combination of circumstances--a "perfect storm". The "perfect storm" proposition is tested by comparing ratings for Art & Design with those…
Descriptors: Student Surveys, National Surveys, Art Education, Design
Gomez, Laura E.; Arias, Benito; Verdugo, Miguel Angel; Navas, Patricia – Journal of Intellectual & Developmental Disability, 2012
Background: Most instruments that assess quality of life have been validated by means of the classical test theory (CTT). However, CTT limitations have resulted in the development of alternative models, such as the Rasch rating scale model (RSM). The main goal of this paper is testing and improving the psychometric properties of the INTEGRAL…
Descriptors: Evidence, Models, Mental Retardation, Quality of Life
Chen, Haiwen; Holland, Paul – Educational Testing Service, 2009
In this paper, we develop a new chained equipercentile equating procedure for the nonequivalent groups with anchor test (NEAT) design under the assumptions of the classical test theory model. This new equating is named chained true score equipercentile equating. We also apply the kernel equating framework to this equating design, resulting in a…
Descriptors: True Scores, Equated Scores, Test Theory, Methods
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
Winchell, Brooke – ProQuest LLC, 2011
The purpose of the study was to (a) examine the psychometric properties of The Assessment, Evaluation, and Programming System for Infants and Children (AEPS Test); (b) provide a process for establishing psychometric properties for other Curriculum Based Assessments (CBAs); and (c) identify and guide evaluation and subsequent revisions of the AEPS…
Descriptors: Curriculum Based Assessment, Psychometrics, Item Response Theory, Test Theory
Wallace, Colin S.; Prather, Edward E.; Duncan, Douglas K. – Astronomy Education Review, 2011
This is the first in a series of five articles describing a national study of general education astronomy students' conceptual and reasoning difficulties with cosmology. In this paper, we describe the process by which we designed four new surveys to assess general education astronomy students' conceptual cosmology knowledge. These surveys focused…
Descriptors: General Education, Astronomy, Surveys, Evolution
Bandalos, Deborah L.; Kopp, Jason P. – Educational Measurement: Issues and Practice, 2012
In this article, we discuss the importance of measurement literacy and some issues encountered in teaching introductory measurement courses. We present results from a survey of introductory measurement instructors, including information about the topics included in such courses and the amount of time spent on each. Topics that were included by the…
Descriptors: Class Activities, Motivation Techniques, Item Analysis, Test Theory
Fang, Jiqian; Power, Mick; Lin, Yueqing; Zhang, Jinxin; Hao, Yuantao; Chatterji, Somnath – Gerontologist, 2012
Purpose of the study: To explore short-form versions of World Health Organization Quality of Life (WHOQOL-OLD) with acceptable psychometric properties, which was developed for older adults by the WHOQOL research group, containing 24 items initially. Design and Methods: We randomly sampled two-thirds of respondents from the data of WHOQOL-OLD field…
Descriptors: Quality of Life, Test Reliability, Correlation, Psychometrics
Shahat, Mohamed A.; Ohle, Annika; Treagust, David F.; Fischer, Hans E. – International Journal of Science and Mathematics Education, 2013
Educators and policymakers envision the future of education in Egypt as enabling learners to acquire scientific inquiry and problem-solving skills. In this article, we describe the validation of a model for problem solving and the design of instruments for evaluating new teaching methods in Egyptian science classes. The instruments were based on…
Descriptors: Foreign Countries, Questionnaires, Problem Solving, Science Instruction
Mahon, Catherine; Lyddy, Fiona; Barnes-Holmes, Dermot – Journal of Applied Behavior Analysis, 2010
The purpose of the current study was to develop and test a computerized matching-to-sample (MTS) protocol to facilitate recombinative generalization of subword units (onsets and rimes) and recognition of novel onset-rime and onset-rime-rime words. In addition, we sought to isolate the key training components necessary for recombinative…
Descriptors: Rhyme, Generalization, Reading Instruction, Behavior
Lyren, Per-Erik – Practical Assessment, Research & Evaluation, 2009
The added value of reporting subscores on a college admission test (SweSAT) was examined in this study. Using a CTT-derived objective method for determining the value of reporting subscores, it was concluded that there is added value in reporting section scores (Verbal/Quantitative) as well as subtest scores. These results differ from a study of…
Descriptors: College Entrance Examinations, Scores, Test Theory, Foreign Countries

