Showing 1 to 15 of 20 results
Peer reviewed
Cui, Zhongmin; He, Yong – Measurement: Interdisciplinary Research and Perspectives, 2023
Careful consideration is necessary when there is a need to choose an anchor test form from a list of old test forms for equating under the random groups design. The choice of the anchor form potentially affects the accuracy of equated scores on new test forms. Few guidelines, however, can be found in the literature on choosing the anchor form.…
Descriptors: Test Format, Equated Scores, Best Practices, Test Construction
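As a hedged illustration of the equating design named in this abstract (not the authors' procedure), here is a minimal random groups linear equating sketch in Python; the scores, form length, and function name are made up for illustration.

```python
# Hypothetical sketch: linear equating under the random groups design.
# Two randomly equivalent groups take the new form (X) and a candidate
# old/anchor form (Y); X scores are placed on the Y scale by matching
# means and standard deviations. All values below are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Simulated number-correct scores for the two randomly equivalent groups.
x_scores = rng.binomial(n=60, p=0.62, size=2000)   # new form
y_scores = rng.binomial(n=60, p=0.58, size=2000)   # candidate old form

def linear_equate(x, x_group, y_group):
    """Place a score x from the new form onto the old-form scale."""
    slope = y_group.std(ddof=1) / x_group.std(ddof=1)
    return y_group.mean() + slope * (x - x_group.mean())

print(linear_equate(40, x_scores, y_scores))
```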
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
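For readers unfamiliar with the adaptive loop such designs build on, below is a minimal unidimensional sketch (the study itself is multidimensional), assuming a 2PL item bank, maximum-information item selection, EAP scoring on a grid, and a standard-error stopping rule; every parameter here is simulated, not taken from the article.

```python
# Hypothetical one-dimensional CAT sketch: 2PL items, maximum-information
# selection, EAP ability estimation on a grid, and an SE-based stopping rule.
import numpy as np

rng = np.random.default_rng(1)
n_items = 300
a = rng.lognormal(mean=0.0, sigma=0.3, size=n_items)   # discriminations
b = rng.normal(0.0, 1.0, size=n_items)                  # difficulties
theta_true = 0.8

grid = np.linspace(-4, 4, 81)
prior = np.exp(-0.5 * grid**2)          # standard-normal prior (unnormalised)

def p_correct(theta, a_i, b_i):
    return 1.0 / (1.0 + np.exp(-a_i * (theta - b_i)))

administered, responses = [], []
theta_hat, se = 0.0, np.inf
while se > 0.3 and len(administered) < 40:
    # Select the unused item with maximum Fisher information at theta_hat.
    p = p_correct(theta_hat, a, b)
    info = a**2 * p * (1 - p)
    info[administered] = -np.inf
    item = int(np.argmax(info))
    administered.append(item)
    responses.append(rng.random() < p_correct(theta_true, a[item], b[item]))

    # EAP update of the ability estimate and its posterior SD on the grid.
    like = np.ones_like(grid)
    for i, u in zip(administered, responses):
        p_i = p_correct(grid, a[i], b[i])
        like *= p_i if u else (1 - p_i)
    post = like * prior
    post /= post.sum()
    theta_hat = float(np.sum(grid * post))
    se = float(np.sqrt(np.sum((grid - theta_hat) ** 2 * post)))

print(len(administered), round(theta_hat, 2), round(se, 2))
```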
Peer reviewed
Kim, Sooyeon; Livingston, Samuel A. – ETS Research Report Series, 2017
The purpose of this simulation study was to assess the accuracy of a classical test theory (CTT)-based procedure for estimating the alternate-forms reliability of scores on a multistage test (MST) having 3 stages. We generated item difficulty and discrimination parameters for 10 parallel, nonoverlapping forms of the complete 3-stage test and…
Descriptors: Accuracy, Test Theory, Test Reliability, Adaptive Testing
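A hedged sketch of the reliability computation this report builds on, ignoring the multistage routing: simulate two parallel forms, score the same simulated examinees on both, and take the correlation of the two scores as the alternate-forms reliability estimate. Form length and item parameters are assumptions, not the report's values.

```python
# Hypothetical sketch of a CTT-style alternate-forms reliability estimate
# from simulated data (flat forms, no multistage routing).
import numpy as np

rng = np.random.default_rng(2)
n_examinees, n_items = 5000, 45
theta = rng.normal(0, 1, size=n_examinees)

def simulate_form(theta, rng, n_items):
    a = rng.lognormal(0.0, 0.3, size=n_items)
    b = rng.normal(0.0, 1.0, size=n_items)
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
    return (rng.random(p.shape) < p).sum(axis=1)        # number-correct scores

form1 = simulate_form(theta, rng, n_items)
form2 = simulate_form(theta, rng, n_items)
print(np.corrcoef(form1, form2)[0, 1])                   # alternate-forms reliability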
Peer reviewed
Harshman, Jordan; Yezierski, Ellen – Journal of Chemical Education, 2016
Determining the error of measurement is a necessity for researchers engaged in bench chemistry, chemistry education research (CER), and a multitude of other fields. Discussions regarding what constitutes measurement error and how best to measure it have occurred, but critiques of traditional measures have yielded few alternatives.…
Descriptors: Science Instruction, Chemistry, Error of Measurement, Psychometrics
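As one traditional measure such critiques typically target, here is a minimal sketch of Cronbach's alpha and the classical standard error of measurement, SEM = SD * sqrt(1 - alpha), computed on simulated item scores (not the authors' data or instrument).

```python
# Hypothetical sketch: Cronbach's alpha and the classical SEM on simulated
# dichotomous item scores driven by a common latent trait.
import numpy as np

rng = np.random.default_rng(3)
n_examinees, k = 500, 20
theta = rng.normal(0, 1, size=n_examinees)
b = rng.normal(0, 1, size=k)
p = 1 / (1 + np.exp(-(theta[:, None] - b)))
items = (rng.random(p.shape) < p).astype(float)          # 0/1 item scores

total = items.sum(axis=1)
alpha = k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum() / total.var(ddof=1))
sem = total.std(ddof=1) * np.sqrt(1 - alpha)
print(round(alpha, 3), round(sem, 3))
```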
Kern, Justin L.; McBride, Brent A.; Laxman, Daniel J.; Dyer, W. Justin; Santos, Rosa M.; Jeans, Laurie M. – Grantee Submission, 2016
Measurement invariance (MI) is a property of measurement that is often implicitly assumed, but in many cases, not tested. When the assumption of MI is tested, it generally involves determining if the measurement holds longitudinally or cross-culturally. A growing literature shows that other groupings can, and should, be considered as well.…
Descriptors: Psychology, Measurement, Error of Measurement, Measurement Objectives
Peer reviewed
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Topczewski, Anna Marie – ProQuest LLC, 2013
Developmental score scales represent the performance of students along a continuum, where, as students learn more, they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…
Descriptors: Item Response Theory, Scaling, Scores, Student Development
Peer reviewed
Kuo, Bor-Chen; Daud, Muslem; Yang, Chih-Wei – EURASIA Journal of Mathematics, Science & Technology Education, 2015
This paper describes a curriculum-based multidimensional computerized adaptive test that was developed for Indonesian junior high school Biology. In adherence to the different Biology dimensions of the Indonesian curriculum, 300 items were constructed and then administered to 2,238 students. A multidimensional random coefficients multinomial logit model was…
Descriptors: Secondary School Science, Science Education, Science Tests, Computer Assisted Testing
Peer reviewed
Zumrawi, Abdel Azim; Bates, Simon P.; Schroeder, Marianne – Educational Research and Evaluation, 2014
This paper addresses the determination of statistically desirable response rates in students' surveys, with emphasis on assessing the effect of underlying variability in the student evaluation of teaching (SET). We discuss factors affecting the determination of adequate response rates and highlight challenges caused by non-response and lack of…
Descriptors: Inferences, Test Reliability, Response Rates (Questionnaires), Student Evaluation of Teacher Performance
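One common way to frame "statistically desirable" response rates, shown here only as an assumption-laden sketch (the authors' own criteria may differ): the number of responses required for a given margin of error on a proportion, with a finite population correction for small class sizes.

```python
# Hypothetical sketch: minimum survey responses needed in a class of size N
# for a +/- e margin of error on a proportion, using the usual
# normal-approximation sample size with a finite population correction.
import math

def required_responses(N, e=0.05, z=1.96, p=0.5):
    n0 = (z ** 2) * p * (1 - p) / e ** 2          # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / N))      # finite population correction

for N in (30, 60, 120, 500):
    n = required_responses(N)
    print(N, n, f"{n / N:.0%} response rate needed")
```

The output illustrates the point often made in this literature: small classes need very high response rates before SET summaries are statistically defensible.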
Peer reviewed
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
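A minimal sketch of how the regression assumptions discussed in that exchange are commonly checked in practice, using statsmodels and scipy on simulated data (not the article's own examples or recommended workflow).

```python
# Hypothetical sketch: fit an OLS multiple regression on simulated data and
# run two common residual diagnostics (normality, homoscedasticity).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(4)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=1.0, size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()
resid = fit.resid

print("Shapiro-Wilk p =", stats.shapiro(resid).pvalue)      # normality of residuals
print("Breusch-Pagan p =", het_breuschpagan(resid, X)[1])   # homoscedasticity
```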
Peer reviewed
Yao, Lihua – Psychometrika, 2012
Multidimensional computer adaptive testing (MCAT) can provide higher precision and reliability or reduce test length when compared with unidimensional CAT or with the paper-and-pencil test. This study compared five item selection procedures in the MCAT framework for both domain scores and overall scores through simulation by varying the structure…
Descriptors: Item Banks, Test Length, Simulation, Adaptive Testing
Peer reviewed
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
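A hedged sketch of the basic idea behind a multinomial error model (not Lee's parameterization): an examinee's polytomous items fall into score categories with fixed probabilities, so the category counts over repeated administrations are multinomial and induce a distribution of total scores. The probability vector and item count below are illustrative only.

```python
# Hypothetical sketch of a multinomial error model for polytomous scores.
import numpy as np

rng = np.random.default_rng(5)
n_items = 30
points = np.array([0, 1, 2, 3])          # possible points per item
pi = np.array([0.15, 0.35, 0.30, 0.20])  # examinee's category probabilities

# Repeated measurements: category counts per administration are multinomial.
counts = rng.multinomial(n_items, pi, size=10000)
total_scores = counts @ points

print(total_scores.mean(), total_scores.std(ddof=1))  # mean score and error SD
```

A compound multinomial version would draw separate multinomial counts within each content stratum and sum the resulting stratum scores.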
Peer reviewed
Rotou, Ourania; Patsula, Liane; Steffen, Manfred; Rizavi, Saba – ETS Research Report Series, 2007
Traditionally, the fixed-length linear paper-and-pencil (P&P) mode of administration has been the standard method of test delivery. With the advancement of technology, however, the popularity of administering tests using adaptive methods like computerized adaptive testing (CAT) and multistage testing (MST) has grown in the field of measurement…
Descriptors: Comparative Analysis, Test Format, Computer Assisted Testing, Models
Peer reviewed
Shoemaker, David M. – Educational and Psychological Measurement, 1972
Descriptors: Difficulty Level, Error of Measurement, Item Sampling, Simulation
Peer reviewed
Segall, Daniel O. – Psychometrika, 1994
An asymptotic expression for the reliability of a linearly equated test is developed using normal theory. Reliability is expressed as the product of test reliability before equating and an adjustment term that is a function of the sample sizes used to estimate the linear equating transformation. The approach is illustrated. (SLD)
Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size
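As a rough sketch of the setting (not Segall's exact derivation), the standard linear equating function and the general product form described in the abstract can be written as follows; the adjustment function g is a placeholder, not the paper's formula.

```latex
% Sketch only: standard linear equating function plus the general form of the
% result described in the abstract; g(n_X, n_Y) is a placeholder adjustment.
\[
  l_Y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X),
  \qquad
  \rho_{\text{equated}} \approx \rho_{XX'} \cdot g(n_X, n_Y),
\]
% where g(n_X, n_Y) approaches 1 as the samples used to estimate the means
% and standard deviations in the equating transformation grow.
```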