ERIC - Search Results

Showing 1 to 15 of 52 results

The Use of Theory of Linear Mixed-Effects Models to Detect Fraudulent Erasures at an Aggregate Level

Peer reviewed

Peng, Luyao; Sinharay, Sandip – Educational and Psychological Measurement, 2022

Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index of Wollack et al. (2015) to suggest three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on the research of…

Descriptors: Cheating, Identification, Statistical Analysis, Testing

How Days between Tests Impacts Alternate Forms Reliability in Computerized Adaptive Tests

Peer reviewed

Wyse, Adam E. – Educational and Psychological Measurement, 2021

An essential question when computing test-retest and alternate forms reliability coefficients is how many days there should be between tests. This article uses data from reading and math computerized adaptive tests to explore how the number of days between tests impacts alternate forms reliability coefficients. Results suggest that the highest…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Reliability, Reading Tests

Fused SDT/IRT Models for Mixed-Format Exams

Peer reviewed

DeCarlo, Lawrence T. – Educational and Psychological Measurement, 2024

A psychological framework for different types of items commonly used with mixed-format exams is proposed. A choice model based on signal detection theory (SDT) is used for multiple-choice (MC) items, whereas an item response theory (IRT) model is used for open-ended (OE) items. The SDT and IRT models are shown to share a common conceptualization…

Descriptors: Test Format, Multiple Choice Tests, Item Response Theory, Models

Are Speeded Tests Unfair? Modeling the Impact of Time Limits on the Gender Gap in Mathematics

Peer reviewed

Stoevenbelt, Andrea H.; Wicherts, Jelte M.; Flore, Paulette C.; Phillips, Lorraine A. T.; Pietschnig, Jakob; Verschuere, Bruno; Voracek, Martin; Schwabe, Inga – Educational and Psychological Measurement, 2023

When cognitive and educational tests are administered under time limits, tests may become speeded and this may affect the reliability and validity of the resulting test scores. Prior research has shown that time limits may create or enlarge gender gaps in cognitive and academic testing. On average, women complete fewer items than men when a test…

Descriptors: Timed Tests, Gender Differences, Item Response Theory, Correlation

A Polytomous Scoring Approach to Handle Not-Reached Items in Low-Stakes Assessments

Peer reviewed

Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021

In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…

Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

A Mixture IRTree Model for Extreme Response Style: Accounting for Response Process Uncertainty

Peer reviewed

Kim, Nana; Bolt, Daniel M. – Educational and Psychological Measurement, 2021

This paper presents a mixture item response tree (IRTree) model for extreme response style. Unlike traditional applications of single IRTree models, a mixture approach provides a way of representing the mixture of respondents following different underlying response processes (between individuals), as well as the uncertainty present at the…

Descriptors: Item Response Theory, Response Style (Tests), Models, Test Items

What about the "Instruction" in Instructional Sensitivity? Raising a Validity Issue in Research on Instructional Sensitivity

Peer reviewed

Ing, Marsha – Educational and Psychological Measurement, 2018

In instructional sensitivity research, it is important to evaluate the validity argument about the extent to which student performance on the assessment can be used to infer differences in instructional experiences. This study examines whether three different measures of mathematics instruction consistently identify mathematics assessments as…

Descriptors: Validity, Educational Research, Mathematics Instruction, Mathematics Tests

Differential Item Functioning Detection across Two Methods of Defining Group Comparisons: Pairwise and Composite Group Comparisons

Peer reviewed

Sari, Halil Ibrahim; Huggins, Anne Corinne – Educational and Psychological Measurement, 2015

This study compares two methods of defining groups for the detection of differential item functioning (DIF): (a) pairwise comparisons and (b) composite group comparisons. We aim to emphasize and empirically support the notion that the choice of pairwise versus composite group definitions in DIF is a reflection of how one defines fairness in DIF…

Descriptors: Test Bias, Comparative Analysis, Statistical Analysis, College Entrance Examinations

Operationalizing Levels of Academic Mastery Based on Vygotsky's Theory: The Study of Mathematical Knowledge

Peer reviewed

Nezhnov, Peter; Kardanova, Elena; Vasilyeva, Marina; Ludlow, Larry – Educational and Psychological Measurement, 2015

The present study tested the possibility of operationalizing levels of knowledge acquisition based on Vygotsky's theory of cognitive growth. An assessment tool (SAM-Math) was developed to capture a hypothesized hierarchical structure of mathematical knowledge consisting of procedural, conceptual, and functional levels. In Study 1, SAM-Math was…

Descriptors: Knowledge Level, Mathematics, Cognitive Development, Vertical Organization

Examining Student Factors in Sources of Setting Accommodation DIF

Peer reviewed

Lin, Pei-Ying; Lin, Yu-Cheng – Educational and Psychological Measurement, 2014

This exploratory study investigated potential sources of setting accommodation resulting in differential item functioning (DIF) on math and reading assessments for examinees with varied learning characteristics. The examinees were those who participated in large-scale assessments and were tested in either standardized or accommodated testing…

Descriptors: Test Bias, Multivariate Analysis, Testing Accommodations, Mathematics Tests

On the Factorial Structure of the SAT and Implications for Next-Generation College Readiness Assessments

Peer reviewed

Wiley, Edward W.; Shavelson, Richard J.; Kurpius, Amy A. – Educational and Psychological Measurement, 2014

The name "SAT" has become synonymous with college admissions testing; it has been dubbed "the gold standard." Numerous studies on its reliability and predictive validity show that the SAT predicts college performance beyond high school grade point average. Surprisingly, studies of the factorial structure of the current version…

Descriptors: College Readiness, College Admission, College Entrance Examinations, Factor Analysis

Numerical Differentiation Methods for Computing Error Covariance Matrices in Item Response Theory Modeling: An Evaluation and a New Proposal

Peer reviewed

Tian, Wei; Cai, Li; Thissen, David; Xin, Tao – Educational and Psychological Measurement, 2013

In item response theory (IRT) modeling, the item parameter error covariance matrix plays a critical role in statistical inference procedures. When item parameters are estimated using the EM algorithm, the parameter error covariance matrix is not an automatic by-product of item calibration. Cai proposed the use of Supplemented EM algorithm for…

Descriptors: Item Response Theory, Computation, Matrices, Statistical Inference

Dealing with Omitted and Not-Reached Items in Competence Tests: Evaluating Approaches Accounting for Missing Responses in Item Response Theory Models

Peer reviewed

Pohl, Steffi; Gräfe, Linda; Rose, Norman – Educational and Psychological Measurement, 2014

Data from competence tests usually show a number of missing responses on test items due to both omitted and not-reached items. Different approaches for dealing with missing responses exist, and there are no clear guidelines on which of those to use. While classical approaches rely on an ignorable missing data mechanism, the most recently developed…

Descriptors: Test Items, Achievement Tests, Item Response Theory, Models

A Comparison of Three IRT Approaches to Examinee Ability Change Modeling in a Single-Group Anchor Test Design

Peer reviewed

Paek, Insu; Park, Hyun-Jeong; Cai, Li; Chi, Eunlim – Educational and Psychological Measurement, 2014

Typically a longitudinal growth modeling based on item response theory (IRT) requires repeated measures data from a single group with the same test design. If operational or item exposure problems are present, the same test may not be employed to collect data for longitudinal analyses and tests at multiple time points are constructed with unique…

Descriptors: Item Response Theory, Comparative Analysis, Test Items, Equated Scores
