Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 10 |
Descriptor
Scores | 22 |
Statistical Analysis | 22 |
Test Theory | 22 |
Test Reliability | 7 |
Item Analysis | 6 |
Item Response Theory | 6 |
Test Items | 6 |
Mathematical Models | 5 |
Computer Assisted Testing | 4 |
Criterion Referenced Tests | 4 |
Mastery Tests | 4 |
More ▼ |
Source
Author
Publication Type
Reports - Research | 14 |
Journal Articles | 10 |
Reports - Evaluative | 5 |
Speeches/Meeting Papers | 4 |
Numerical/Quantitative Data | 2 |
Reports - Descriptive | 2 |
Guides - Non-Classroom | 1 |
Reference Materials -… | 1 |
Education Level
Elementary Education | 2 |
Middle Schools | 2 |
Early Childhood Education | 1 |
Grade 2 | 1 |
Grade 3 | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Grade 6 | 1 |
Grade 7 | 1 |
Grade 8 | 1 |
Higher Education | 1 |
More ▼ |
Audience
Researchers | 2 |
Location
Florida | 1 |
Indonesia | 1 |
Luxembourg | 1 |
Laws, Policies, & Programs
Elementary and Secondary… | 1 |
Assessments and Surveys
California Achievement Tests | 1 |
Stanford Achievement Tests | 1 |
Strengths and Difficulties… | 1 |
Wisconsin Card Sorting Test | 1 |
What Works Clearinghouse Rating
Joshua B. Gilbert – Annenberg Institute for School Reform at Brown University, 2022
This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) when estimating treatment effects when compared to classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores provide generally equivalent bias and false positive…
Descriptors: Item Response Theory, Models, Test Theory, Computation
Kogar, Hakan – International Journal of Assessment Tools in Education, 2018
The aim of this simulation study, determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…
Descriptors: Simulation, Context Effect, Computation, Statistical Analysis
Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018
Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…
Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences
Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015
This study aimed to compare the accuracy of the test scores as results of Test of English Proficiency (TOEP) based on paper and pencil test (PPT) versus computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…
Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)
Lane, Kathleen Lynne; Oakes, Wendy Peia; Cantwell, Emily D.; Menzies, Holly Mariah; Schatschneider, Christopher; Lambert, Warren; Common, Eric Alan – Journal of Emotional and Behavioral Disorders, 2017
We report results of an exploratory validation study of the "Student Risk Screening Scale-Internalizing and Externalizing" (SRSS-IE) applied with the first sample of middle and high school students from nine middle and three high schools from three states. The "Student Risk Screening Scale" (SRSS) was modified to broaden the…
Descriptors: Scores, Psychometrics, Evidence, Middle Schools
Kelcey, Ben; McGinn, Daniel; Hill, Heather – Society for Research on Educational Effectiveness, 2013
Recent policy has charged schools and districts with maintaining highly qualified teachers and differentiating among teachers in terms of their effectiveness (U.S. Department of Education, 2009). This emphasis has driven the development and implementation of teacher quality measures which are increasingly being used to evaluate teachers with…
Descriptors: Teacher Effectiveness, Measures (Individuals), Observation, Teacher Evaluation
Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015
The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…
Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007
There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…
Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis
Jung, Eunju; Liu, Kimy; Ketterlin-Geller, Leanne R.; Tindal, Gerald – Behavioral Research and Teaching, 2008
The purpose of this study was to develop general outcome measures (GOM) in mathematics so that teachers could focus their instruction on needed prerequisite skills. We describe in detail, the manner in which content-related evidence was established and then present a number of statistical analyses conducted to evaluate the technical adequacy of…
Descriptors: Item Analysis, Test Construction, Test Theory, Mathematics Tests
Dawson, Thomas E. – 1997
The basic processes in univariate statistics involve partitioning the sum of squares into two components: explained and within. This paper explains that the same partitioning occurs in measurement analyses, i.e., splitting the sum of squares into reliable and unreliable components. In addition, it is shown how the three types of error inherent in…
Descriptors: Estimation (Mathematics), Measurement Techniques, Scores, Statistical Analysis

Brennan, Robert L. – Educational Measurement: Issues and Practice, 1997
The history of generalizability theory (G theory) is told from the perspective of one researcher's experiences, describing psychometric and scientific perspectives that influenced the development of G theory and its adoption. Work that remains to be done in the field is outlined. (SLD)
Descriptors: Educational Testing, Generalizability Theory, Measurement, Psychometrics

Spencer, Bruce D. – Journal of Educational Measurement, 1983
Because test scores are ordinal not cordinal attributes, the average test score often is a misleading way to summarize the scores of a group of individuals. Similarly, correlation coefficients may be misleading summary measures of association between test scores. Proper, readily interpretable, summary statistics are developed from a theory of…
Descriptors: Correlation, Measurement Techniques, Scores, Statistical Analysis
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores

Balch, William R. – Teaching of Psychology, 1989
Studies the effect of item order on test scores and completion time. Students scored slightly higher when test items were grouped sequentially (relating to text and lectures) than on tests when test items were grouped by text chapter but ordered randomly, or when test items were ordered randomly. Found no differences in completion time. (Author/LS)
Descriptors: Educational Research, Higher Education, Performance, Psychology
Previous Page | Next Page ยป
Pages: 1 | 2