ERIC - Search Results

Showing 1 to 15 of 22 results

Estimating Treatment Effects with the Explanatory Item Response Model. EdWorkingPaper No. 22-677

Gilbert, Joshua B. – Annenberg Institute for School Reform at Brown University, 2022

This simulation study examines the characteristics of the Explanatory Item Response Model (EIRM) for estimating treatment effects, compared with classical test theory (CTT) sum and mean scores and item response theory (IRT)-based theta scores. Results show that the EIRM and IRT theta scores provide generally equivalent bias and false positive…

Descriptors: Item Response Theory, Models, Test Theory, Computation

Effects of Various Simulation Conditions on Latent-Trait Estimates: A Simulation Study

Peer reviewed

Kogar, Hakan – International Journal of Assessment Tools in Education, 2018

The aim of this simulation study was to determine the relationship between true latent scores and estimated latent scores by including various control variables and different statistical models. The study also aimed to compare the statistical models and determine the effects of different distribution types, response formats and sample sizes on latent…

Descriptors: Simulation, Context Effect, Computation, Statistical Analysis

Gender Fairness within the Force Concept Inventory

Peer reviewed

Traxler, Adrienne; Henderson, Rachel; Stewart, John; Stewart, Gay; Papak, Alexis; Lindell, Rebecca – Physical Review Physics Education Research, 2018

Research on the test structure of the Force Concept Inventory (FCI) has largely ignored gender, and research on FCI gender effects (often reported as "gender gaps") has seldom interrogated the structure of the test. These rarely crossed streams of research leave open the possibility that the FCI may not be structurally valid across…

Descriptors: Physics, Science Instruction, Sex Fairness, Gender Differences

The Comparison of Accuracy Scores on the Paper and Pencil Testing vs. Computer-Based Testing

Peer reviewed

Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015

This study aimed to compare the accuracy of test scores on the Test of English Proficiency (TOEP) between the paper and pencil test (PPT) and the computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…

Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)

Psychometric Evidence of SRSS-IE Scores in Middle and High Schools

Peer reviewed

Lane, Kathleen Lynne; Oakes, Wendy Peia; Cantwell, Emily D.; Menzies, Holly Mariah; Schatschneider, Christopher; Lambert, Warren; Common, Eric Alan – Journal of Emotional and Behavioral Disorders, 2017

We report results of an exploratory validation study of the "Student Risk Screening Scale-Internalizing and Externalizing" (SRSS-IE) applied with the first sample of middle and high school students from nine middle and three high schools from three states. The "Student Risk Screening Scale" (SRSS) was modified to broaden the…

Descriptors: Scores, Psychometrics, Evidence, Middle Schools

Measurement of Classroom Teaching Quality with Item Response Theory

Peer reviewed

Kelcey, Ben; McGinn, Daniel; Hill, Heather – Society for Research on Educational Effectiveness, 2013

Recent policy has charged schools and districts with maintaining highly qualified teachers and differentiating among teachers in terms of their effectiveness (U.S. Department of Education, 2009). This emphasis has driven the development and implementation of teacher quality measures which are increasingly being used to evaluate teachers with…

Descriptors: Teacher Effectiveness, Measures (Individuals), Observation, Teacher Evaluation

Florida Center for Reading Research (FCRR) Reading Assessment (FRA): Kindergarten to Grade 2. Technical Manual

Foorman, Barbara R.; Petscher, Yaacov; Schatschneider, Chris – Florida Center for Reading Research, 2015

The grades K-2 Florida Center for Reading Research (FCRR) Reading Assessment (FRA) consists of computer-adaptive alphabetic and oral language screening tasks that provide a Probability of Literacy Success (PLS) linked to grade-level performance (i.e., the 40th percentile) on the word reading (in kindergarten) or reading comprehension (in grades…

Descriptors: Reading Instruction, Reading Tests, Kindergarten, Grade 1

Incomplete Psychometric Equivalence of Scores Obtained on the Manual and the Computer Version of the Wisconsin Card Sorting Test?

Peer reviewed

Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010

The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…

Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores

Subscores Based on Classical Test Theory: To Report or Not to Report

Peer reviewed

Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007

There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…

Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis

Instrument Development Procedures for Mathematics Measures. Technical Report Number 08-02

Jung, Eunju; Liu, Kimy; Ketterlin-Geller, Leanne R.; Tindal, Gerald – Behavioral Research and Teaching, 2008

The purpose of this study was to develop general outcome measures (GOM) in mathematics so that teachers could focus their instruction on needed prerequisite skills. We describe in detail the manner in which content-related evidence was established and then present a number of statistical analyses conducted to evaluate the technical adequacy of…

Descriptors: Item Analysis, Test Construction, Test Theory, Mathematics Tests

Basic Concepts in Classical Test Theory: Relating Variance Partitioning in Substantive Analyses to the Same Process in Measurement Analyses.

Dawson, Thomas E. – 1997

The basic processes in univariate statistics involve partitioning the sum of squares into two components: explained and within. This paper explains that the same partitioning occurs in measurement analyses, i.e., splitting the sum of squares into reliable and unreliable components. In addition, it is shown how the three types of error inherent in…

Descriptors: Estimation (Mathematics), Measurement Techniques, Scores, Statistical Analysis

A Perspective on the History of Generalizability Theory.

Peer reviewed

Brennan, Robert L. – Educational Measurement: Issues and Practice, 1997

The history of generalizability theory (G theory) is told from the perspective of one researcher's experiences, describing psychometric and scientific perspectives that influenced the development of G theory and its adoption. Work that remains to be done in the field is outlined. (SLD)

Descriptors: Educational Testing, Generalizability Theory, Measurement, Psychometrics

On Interpreting Test Scores as Social Indicators: Statistical Considerations.

Peer reviewed

Spencer, Bruce D. – Journal of Educational Measurement, 1983

Because test scores are ordinal, not cardinal, attributes, the average test score often is a misleading way to summarize the scores of a group of individuals. Similarly, correlation coefficients may be misleading summary measures of association between test scores. Proper, readily interpretable, summary statistics are developed from a theory of…

Descriptors: Correlation, Measurement Techniques, Scores, Statistical Analysis

Ordinal Measurement.

Cliff, Norman – 1984

In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…

Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores

Item Order Affects Performance on Multiple-Choice Exams.

Peer reviewed

Balch, William R. – Teaching of Psychology, 1989

Studies the effect of item order on test scores and completion time. Students scored slightly higher when test items were grouped sequentially (relating to text and lectures) than on tests when test items were grouped by text chapter but ordered randomly, or when test items were ordered randomly. Found no differences in completion time. (Author/LS)

Descriptors: Educational Research, Higher Education, Performance, Psychology
