Gurdil Ege, Hatice; Demir, Ergul – Eurasian Journal of Educational Research, 2020
Purpose: The present study aims to evaluate how the reliabilities computed using α, Stratified α, Angoff-Feldt, and Feldt-Raju estimators may differ when sample size (500, 1000, and 2000) and item type ratio of dichotomous to polytomous items (2:1, 1:1, 1:2) included in the scale are varied. Research Methods: In this study, Cronbach's α,…
Descriptors: Test Format, Simulation, Test Reliability, Sample Size
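Two of the estimators named in this abstract, Cronbach's α and stratified α, are straightforward to compute from item scores. The sketch below is illustrative only (it is not from the article; the function names and the use of `numpy` are assumptions). Stratified α splits the items into strata, here e.g. dichotomous vs. polytomous items, computes α within each stratum, and pools them weighted by stratum-score variance.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha from a (n_persons, n_items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()       # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

def stratified_alpha(strata):
    """Stratified alpha from a list of (n_persons, n_items) matrices,
    one per stratum (e.g. dichotomous items, polytomous items)."""
    strata = [np.asarray(s, dtype=float) for s in strata]
    totals = [s.sum(axis=1) for s in strata]          # per-stratum total scores
    grand_var = np.sum(totals, axis=0).var(ddof=1)    # variance of overall total
    unreliable = sum(t.var(ddof=1) * (1 - cronbach_alpha(s))
                     for s, t in zip(strata, totals))
    return 1 - unreliable / grand_var
```

With perfectly correlated items both estimators return 1.0; in the simulation design described above they would be applied to generated data at each sample size and item-type ratio.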
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Yao, Lihua – Psychometrika, 2012
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…
Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction

Rudner, Lawrence M. – Journal of Educational Measurement, 1983
Nine indices for assessing the accuracy of an individual's test score were evaluated using simulated item responses to a commercial and a classroom test. The indices appear capable of identifying relatively high proportions of examinees with spurious total scores. (Author/PN)
Descriptors: Correlation, Item Analysis, Latent Trait Theory, Measurement Techniques
Segall, Daniel O. – Journal of Educational and Behavioral Statistics, 2004
A new sharing item response theory (SIRT) model is presented that explicitly models the effects of sharing item content between informants and test takers. This model is used to construct adaptive item selection and scoring rules that provide increased precision and reduced score gains in instances where sharing occurs. The adaptive item selection…
Descriptors: Scoring, Item Analysis, Item Response Theory, Adaptive Testing
Enders, Craig K. – Educational and Psychological Measurement, 2004
A method for incorporating maximum likelihood (ML) estimation into reliability analyses with item-level missing data is outlined. An ML estimate of the covariance matrix is first obtained using the expectation maximization (EM) algorithm, and coefficient alpha is subsequently computed using standard formulae. A simulation study demonstrated that…
Descriptors: Intervals, Simulation, Test Reliability, Computation
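The second step of the two-step procedure this abstract describes, computing coefficient alpha from a covariance matrix via the standard formula, can be sketched as follows. This is a minimal illustration, not code from the article: it assumes the ML estimate of the item covariance matrix has already been obtained (e.g. from an EM routine), and the function name is hypothetical.

```python
import numpy as np

def alpha_from_cov(cov):
    """Coefficient alpha from a k x k item covariance matrix:
    alpha = k/(k-1) * (1 - trace(C) / sum(C))."""
    cov = np.asarray(cov, dtype=float)
    k = cov.shape[0]
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())

# Example: three items with unit variances and covariances of 0.5
cov = np.array([[1.0, 0.5, 0.5],
                [0.5, 1.0, 0.5],
                [0.5, 0.5, 1.0]])
print(alpha_from_cov(cov))  # 0.75
```

Substituting an EM-based ML covariance estimate for a listwise- or pairwise-deletion estimate in this formula is the core of the method the abstract summarizes.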
Epstein, Kenneth I.; Knerr, Claramae S. – 1976
The literature on criterion referenced testing is full of discussions concerning whether classical measurement techniques are appropriate, whether variance is necessary, whether new indices of reliability are needed, and the like. What appears to be lacking, however, is a clear and simple discussion of why the problems occur. This paper suggests…
Descriptors: Career Development, Criterion Referenced Tests, Item Analysis, Item Sampling
Merz, William R.; Grossen, Neal E. – 1978
Six approaches to assessing test item bias were examined: transformed item difficulty, point biserial correlations, chi-square, factor analysis, one parameter item characteristic curve, and three parameter item characteristic curve. Data sets for analysis were generated by a Monte Carlo technique based on the three parameter model; thus, four…
Descriptors: Difficulty Level, Evaluation Methods, Factor Analysis, Item Analysis
Patience, Wayne M.; Reckase, Mark D. – 1979
Simulated tailored tests were used to investigate the relationships between characteristics of the item pool and the computer program, and the reliability and bias of the resulting ability estimates. The computer program was varied to provide for various step sizes (differences in difficulty between successive steps) and different acceptance…
Descriptors: Adaptive Testing, Computer Assisted Testing, Computer Programs, Educational Testing