ERIC - Search Results

Showing all 13 results

Scale Reliability Evaluation with Heterogeneous Populations

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015

A latent variable modeling approach for scale reliability evaluation in heterogeneous populations is discussed. The method can be used for point and interval estimation of reliability of multicomponent measuring instruments in populations representing mixtures of an unknown number of latent classes or subpopulations. The procedure is helpful also…

Descriptors: Test Reliability, Evaluation Methods, Measurement Techniques, Computation

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

The Role of Multiple-Group Measurement Invariance in Family Psychology Research

Peer reviewed

Kern, Justin L.; McBride, Brent A.; Laxman, Daniel J.; Dyer, W. Justin; Santos, Rosa M.; Jeans, Laurie M. – Grantee Submission, 2016

Measurement invariance (MI) is a property of measurement that is often implicitly assumed, but in many cases, not tested. When the assumption of MI is tested, it generally involves determining if the measurement holds longitudinally or cross-culturally. A growing literature shows that other groupings can, and should, be considered as well.…

Descriptors: Psychology, Measurement, Error of Measurement, Measurement Objectives

A Procedure for Dimensionality Analyses of Response Data from Various Test Designs

Peer reviewed

Zhang, Jinming – Psychometrika, 2013

In some popular test designs (including computerized adaptive testing and multistage testing), many item pairs are not administered to any test takers, which may result in some complications during dimensionality analyses. In this paper, a modified DETECT index is proposed in order to perform dimensionality analyses for response data from such…

Descriptors: Adaptive Testing, Simulation, Computer Assisted Testing, Test Reliability

An Evaluation of a Multiple Matrix Sampling Procedure for a State Assessment Program.

Kohr, Richard L. – 1976

Pennsylvania's Educational Quality Assessment Program provides each participating school with a building level report in which state percentiles are a prominent part. Multiple matrix sampling was being considered as a technique to reduce testing time. However, there was great concern that the error associated with estimating the school mean might…

Descriptors: Educational Assessment, Elementary Secondary Education, Item Sampling, Measurement Techniques

Individual Assessment Accuracy.

Peer reviewed

Rudner, Lawrence M. – Journal of Educational Measurement, 1983

Nine indices for assessing the accuracy of an individual's test score were evaluated using simulated item responses to a commercial and a classroom test. The indices appear capable of identifying relatively high proportions of examinees with spurious total scores. (Author/PN)

Descriptors: Correlation, Item Analysis, Latent Trait Theory, Measurement Techniques

The Impact of Missing Data on Sample Reliability Estimates: Implications for Reliability Reporting Practices

Peer reviewed

Enders, Craig K. – Educational and Psychological Measurement, 2004

A method for incorporating maximum likelihood (ML) estimation into reliability analyses with item-level missing data is outlined. An ML estimate of the covariance matrix is first obtained using the expectation maximization (EM) algorithm, and coefficient alpha is subsequently computed using standard formulae. A simulation study demonstrated that…

Descriptors: Intervals, Simulation, Test Reliability, Computation
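
The abstract above says coefficient alpha is computed from the estimated covariance matrix "using standard formulae". As a rough illustration only, the usual covariance-matrix form of alpha is k/(k-1) × (1 − sum of item variances / sum of all covariance entries). The sketch below assumes a covariance matrix is already in hand; the EM estimation step is not shown, and the function name and the example matrix are hypothetical, not taken from the article.

```python
def coefficient_alpha(cov):
    """Coefficient alpha from a k x k item covariance matrix (list of lists)."""
    k = len(cov)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    # Sum of the item variances (diagonal of the covariance matrix).
    item_variance_sum = sum(cov[i][i] for i in range(k))
    # Variance of the total score equals the sum of every matrix entry.
    total_variance = sum(sum(row) for row in cov)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up 3-item covariance matrix, for illustration only.
example_cov = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
alpha = coefficient_alpha(example_cov)
print(round(alpha, 4))
```

With missing item-level data, the same function could in principle be applied to whichever covariance estimate is chosen; the abstract describes using an ML estimate obtained through the EM algorithm.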

Assessing Clinical Significance: Does it Matter which Method we Use?

Peer reviewed

Atkins, David C.; Bedics, Jamie D.; McGlinchey, Joseph B.; Beauchaine, Theodore P. – Journal of Consulting and Clinical Psychology, 2005

Measures of clinical significance are frequently used to evaluate client change during therapy. Several alternatives to the original method devised by N. S. Jacobson, W. C. Follette, & D. Revenstorf (1984) have been proposed, each purporting to increase accuracy. However, researchers have had little systematic guidance in choosing among…

Descriptors: Psychotherapy, Statistical Significance, Outcomes of Treatment, Behavior Change

Empirical and Simulation Studies of Flexilevel Ability Testing. Research Report No. 75-3.

Betz, Nancy E.; Weiss, David J. – 1975

A 40-item flexilevel test and a 40-item conventional test were compared using data obtained through (1) computer administration of the two tests to three groups of college students, and (2) Monte Carlo simulation of test response patterns. Results indicated the flexilevel score distribution better reflected the underlying normal distribution of…

Descriptors: Ability, College Students, Comparative Analysis, Computer Oriented Programs

Simulation Studies of Two-Stage Ability Testing. Research Report 74-4.

Betz, Nancy E.; Weiss, David J. – 1974

Monte Carlo simulation procedures were used to study the psychometric characteristics of two two-stage adaptive tests and a conventional "peaked" ability test. Results showed that scores yielded by both two-stage tests better reflected the normal distribution of underlying ability. Ability estimates yielded by one of the two-stage tests…

Descriptors: Ability, Academic Ability, Adaptive Testing, Computers

A Nonparametric Procedure for Demonstrating a Non-Chance Fit Among Pairs of Multivariate Responses.

Mandeville, Garrett K.; And Others – 1975

A strategy for comparing two sets of results (one based upon early childhood recollections (ECR) and another upon videotaped (VT) group behavior) from the Perceptual Characteristics Rating Scale was developed. The null distribution of the mean deviation was estimated by randomly matching an ECR response vector with a VT response vector. To…

Descriptors: Comparative Analysis, Correlation, Data Analysis, Goodness of Fit

Analysis of Covariance: Is It the Appropriate Model to Study Change?

Marston, Paul T.; Borich, Gary D. – 1977

The four main approaches to measuring treatment effects in schools (raw gain, residual gain, covariance, and true scores) were compared. A simulation study showed that true score analysis produced a large number of Type I errors. When corrected for this error, this method showed the least power of the four. This outcome was clearly the result of the…

Descriptors: Achievement Gains, Analysis of Covariance, Comparative Analysis, Error of Measurement

Evaluations of Implied Orders as a Basis for Tailored Testing Using Simulations. Technical Report No. 4.

Cliff, Norman; And Others – 1977

TAILOR is a computer program that uses the implied orders concept as the basis for computerized adaptive testing. The basic characteristics of TAILOR, which does not involve pretesting, are reviewed here and two studies of it are reported. One is a Monte Carlo simulation based on the four-parameter Birnbaum model and the other uses a matrix of…

Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Difficulty Level
