ERIC - Search Results

Showing 436 to 450 of 1,161 results

A Comparison of the Standardization and IRT Methods of Adjusting Pretest Item Statistics Using Realistic Data.

Chang, Shun-Wen; Hanson, Bradley A.; Harris, Deborah J. – 2001

The requirement of large sample sizes for calibrating items based on item response theory (IRT) models is not easily met in many practical pretesting situations. Although classical item statistics could be estimated with much smaller samples, the values may not be comparable across different groups of examinees. This study extended the authors'…

Descriptors: Item Response Theory, Pretests Posttests, Sample Size, Test Items

Classical Test Theory and Item Response Theory: Analytical and Empirical Comparisons.

Hwang, Dae-Yeop – 2002

This study compared classical test theory (CTT) and item response theory (IRT). The behavior of the item and person statistics derived from these two measurement frameworks was examined analytically and empirically using a data set obtained from BILOG (R. Mislevy and D. Bock, 1997). The example was a 15-item test with a sample size of 600…

Descriptors: Comparative Analysis, Measurement Techniques, Scores, Statistical Distributions

Accuracy of Individual Scores Expressed in Percentile Ranks: Classical Test Theory Calculations. CSE Technical Report.

Rogosa, David – 2000

In the reporting of individual student results from standardized tests in educational assessments, the percentile rank of the individual student is a major numerical indicator. This paper develops a formulation and presents calculations to examine the accuracy of the individual percentile rank score. Here, accuracy follows the common-sense…

Descriptors: Comparative Analysis, Elementary Secondary Education, Standardized Tests, Test Results

Establishing the Reliability of Student Proficiency Classifications: The Accuracy of Observed Classifications.

Hoffman, R. Gene; Wise, Lauress L. – 2000

Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is…

Descriptors: Achievement, Classification, Observation, Probability

A General Framework for Using Latent Class Analysis to Test Hierarchical and Nonhierarchical Learning Models.

Peer reviewed

Rindskopf, David – Psychometrika, 1983

Various models have been proposed for analyzing dichotomous test or questionnaire items which were constructed to reflect an assumed underlying structure (e.g., hierarchical). This paper shows that many such models are special cases of latent class analysis and discusses a currently available computer program to analyze them. (Author/JKS)

Descriptors: Computer Programs, Item Analysis, Mathematical Models, Measurement Techniques

The Excitatory State in the Triangular Constant Method.

Peer reviewed

Frijters, J. E. R. – Psychometrika, 1981

The Triangular Constant Method was designed for the measurement of discriminability between sensory stimuli. Its original model assumes a steady excitatory detection state. The purpose of this paper is to elaborate on the consequences of assuming a variable excitatory state and to formulate the concomitant model. (Author)

Descriptors: Data Analysis, Mathematical Models, Measurement Techniques, Perception

A Review of the Beta-Binomial Model and Its Extensions.

Peer reviewed

Wilcox, Rand R. – Journal of Educational Statistics, 1981

Both the binomial and beta-binomial models are applied to various problems occurring in mental test theory. The paper reviews and critiques these models. The emphasis is on the extensions of the models that have been proposed in recent years, and that might not be familiar to many educators. (Author)

Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Test Reliability

Measuring Absolute Amounts of Reading Comprehension Using the Rauding Rescaling Procedure.

Peer reviewed

Carver, Ronald P. – Journal of Reading Behavior, 1985

Describes a procedure for rescaling objective measures of comprehension so that they reflect amounts of accuracy of comprehension on an absolute scale. Suggests there is adequate empirical evidence supporting the validity of the rauding rescaling procedure. (RS)

Descriptors: Elementary Secondary Education, Predictive Validity, Reading Comprehension, Test Theory

A Comparison between Intelligence and Neuropsychological Functioning.

Peer reviewed

D'Amato, Rik Carl; And Others – Journal of School Psychology, 1988

Investigated the overlap between the Wechsler Intelligence Scale for Children - Revised (WISC-R) and the Halstead-Reitan Neuropsychological Battery (HRNB), given their use in diagnosing children's learning problems, using scores for children (N=1,181) on both instruments. Results showed that the primary overlap between measures was attributed to…

Descriptors: Adolescents, Children, Intelligence Tests, Test Items

Historical Views of Invariance: Evidence from the Measurement Theories of Thorndike, Thurstone, and Rasch.

Peer reviewed

Engelhard, George, Jr. – Educational and Psychological Measurement, 1992

A historical perspective is provided of the concept of invariance in measurement theory, describing sample-invariant item calibration and item-invariant measurement of individuals. Invariance as a key measurement concept is illustrated through the measurement theories of E. L. Thorndike, L. L. Thurstone, and G. Rasch. (SLD)

Descriptors: Behavioral Sciences, Educational History, Measurement Techniques, Psychometrics

Coefficient Alpha and Composite Reliability with Interrelated Nonhomogeneous Items.

Peer reviewed

Raykov, Tenko – Applied Psychological Measurement, 1998

Examines the relationship between Cronbach's coefficient alpha and the reliability of a composite of a prespecified set of interrelated nonhomogeneous components through simulation. Shows that alpha can over- or underestimate scale reliability at the population level. Illustrates the bias in terms of structural parameters. (SLD)

Descriptors: Reliability, Simulation, Statistical Bias, Structural Equation Models

Classical, Generalizability, and Multifaceted Rasch Detection of Interrater Variability in Large, Sparse Data Sets.

Peer reviewed

MacMillan, Peter D. – Journal of Experimental Education, 2000

Compared classical test theory (CTT), generalizability theory (GT), and multifaceted Rasch model (MFRM) approaches to detecting and correcting for rater variability using responses of 4,930 high school students graded by 3 raters on 9 scales. The MFRM approach identified far more raters as different than did the CTT analysis. GT and Rasch…

Descriptors: Generalizability Theory, High School Students, High Schools, Interrater Reliability

A Perspective on the History of Generalizability Theory.

Peer reviewed

Brennan, Robert L. – Educational Measurement: Issues and Practice, 1997

The history of generalizability theory (G theory) is told from the perspective of one researcher's experiences, describing psychometric and scientific perspectives that influenced the development of G theory and its adoption. Work that remains to be done in the field is outlined. (SLD)

Descriptors: Educational Testing, Generalizability Theory, Measurement, Psychometrics

A Nominal Response Model Approach for Detecting Answer Copying.

Peer reviewed

Wollack, James A. – Applied Psychological Measurement, 1997

Introduces a new Item Response Theory (IRT) based statistic for detecting answer copying. Compares this omega statistic with the best classical test theory-based statistic under various conditions, and finds omega superior based on Type I error rate and power. (SLD)

Descriptors: Cheating, Identification, Item Response Theory, Power (Statistics)

On the Reliability of Categorically Scored Examinations.

Peer reviewed

Kupermintz, Haggai – Journal of Educational Measurement, 2004

A decision-theoretic approach to the question of reliability in categorically scored examinations is explored. The concepts of true scores and errors are discussed as they deviate from conventional psychometric definitions, and measurement error in categorical scores is cast in terms of misclassifications. A reliability measure based on…

Descriptors: Test Reliability, Error of Measurement, Psychometrics, Test Theory
