Brown, Matt I.; Heck, Patrick R.; Chabris, Christopher F. – Journal of Autism and Developmental Disorders, 2024
The Social Shapes Test (SST) is a measure of social intelligence that does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…
Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability
Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021
Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…
Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics
Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023
Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies
O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew – Measurement in Physical Education and Exercise Science, 2018
The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…
Descriptors: Test Validity, Test Reliability, Females, Athletes
Phillips, Gary W. – American Institutes for Research, 2014
This paper describes a statistical linking between the 2011 National Assessment of Educational Progress (NAEP) in Grade 4 reading and the 2011 Progress in International Reading Literacy Study (PIRLS) in Grade 4 reading. The primary purpose of the linking study is to obtain a statistical comparison between NAEP (a national assessment) and PIRLS (an…
Descriptors: National Competency Tests, Reading Achievement, Comparative Analysis, Measures (Individuals)
Tourangeau, Karen; Nord, Christine; Lê, Thanh; Wallner-Allen, Kathleen; Vaden-Kiernan, Nancy; Blaker, Lisa; Najarian, Michelle – National Center for Education Statistics, 2018
This manual provides guidance and documentation for users of the longitudinal kindergarten-fourth grade (K-4) data file of the Early Childhood Longitudinal Study, Kindergarten Class of 2010-11 (ECLS-K:2011). It mainly provides information specific to the fourth-grade round of data collection. The first chapter provides an overview of the…
Descriptors: Children, Longitudinal Studies, Surveys, Kindergarten
Heyrman, Lieve; Molenaers, Guy; Desloovere, Kaat; Verheyden, Geert; De Cat, Jos; Monbaliu, Elegast; Feys, Hilde – Research in Developmental Disabilities: A Multidisciplinary Journal, 2011
In this study the psychometric properties of the Trunk Control Measurement Scale (TCMS) in children with cerebral palsy (CP) were examined. Twenty-six children with spastic CP (mean age 11 years 3 months, range 8-15 years; Gross Motor Function Classification System level I n = 11, level II n = 5, level III n = 10) were included in this study. To…
Descriptors: Construct Validity, Cerebral Palsy, Test Validity, Interrater Reliability
Lane, Andrew M.; Nevill, Alan M.; Bowes, Neal; Fox, Kenneth R. – Research Quarterly for Exercise and Sport, 2005
Establishing stability, defined as observing minimal measurement error in a test-retest assessment, is vital to validating psychometric tools. Correlational methods, such as Pearson product-moment, intraclass, and kappa, are tests of association or consistency, whereas stability or reproducibility (regarded here as synonymous) assesses the…
Descriptors: Psychometrics, Multivariate Analysis, Correlation, Test Validity
Roid, Gale; Haladyna, Tom – 1978
The technology of transforming sentences from prose instruction into test questions was examined by comparing two methods of selecting sentences (keyword vs. rare singleton), two types of question words (nouns vs. adjectives), and two foil construction methods (writer's choice vs. algorithmic). Four item writers created items using each…
Descriptors: Algorithms, Cloze Procedure, Comparative Analysis, Criterion Referenced Tests
Haladyna, Tom – 1976
The existence of criterion-referenced (CR) measurement is questioned in this paper. Despite beliefs that differences exist between two alternative forms of measurement, CR and norm-referenced (NR), an analysis of philosophical and psychological descriptions of measurement, as well as a growing number of empirical studies, reveals that the common…
Descriptors: Academic Standards, Achievement Tests, Career Development, Comparative Analysis
Haladyna, Tom; Roid, Gale – 1976
Three approaches to the construction of achievement tests are compared: construct, operational, and empirical. The construct approach is based upon classical test theory and measures an abstract representation of the instructional objectives. The operational approach specifies instructional intent through instructional objectives, facet design,…
Descriptors: Academic Achievement, Achievement Tests, Career Development, Comparative Analysis