Publication Date
In 2025 | 1 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 12 |
Since 2016 (last 10 years) | 20 |
Since 2006 (last 20 years) | 27 |
Descriptor
Simulation | 27 |
Foreign Countries | 22 |
Achievement Tests | 21 |
International Assessment | 20 |
Secondary School Students | 20 |
Test Items | 12 |
Item Response Theory | 11 |
Evaluation Methods | 10 |
Models | 7 |
Accuracy | 6 |
Data Analysis | 6 |
Author
Rutkowski, David | 3 |
Rutkowski, Leslie | 3 |
David Kaplan | 2 |
Liaw, Yuan-Ling | 2 |
Lüdtke, Oliver | 2 |
Robitzsch, Alexander | 2 |
Abulela, Mohammed A. A. | 1 |
Adams, Raymond J. | 1 |
Andersen, Nico | 1 |
Berezner, Alla | 1 |
Bolsinova, Maria | 1 |
Publication Type
Journal Articles | 22 |
Reports - Research | 19 |
Reports - Descriptive | 3 |
Reports - Evaluative | 3 |
Collected Works - Proceedings | 1 |
Collected Works - Serial | 1 |
Dissertations/Theses -… | 1 |
Education Level
Secondary Education | 21 |
Elementary Secondary Education | 2 |
Elementary Education | 1 |
Grade 6 | 1 |
Higher Education | 1 |
Intermediate Grades | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Postsecondary Education | 1 |
Assessments and Surveys
Program for International… | 27 |
Trends in International… | 3 |
Early Childhood Longitudinal… | 2 |
National Assessment of… | 2 |
Law School Admission Test | 1 |
Julia Mang; Helmut Küchenhoff; Sabine Meinck – Large-scale Assessments in Education, 2024
Stratification is an important design feature of many studies using complex sampling designs and it is often used in large-scale assessment (LSA) studies, such as the "Programme for International Student Assessment" (PISA), for two main reasons. First, stratification variables that achieve a high between and low within strata variance…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Andersen, Nico; Zehner, Fabian; Goldhammer, Frank – Journal of Computer Assisted Learning, 2023
Background: In the context of large-scale educational assessments, the effort required to code open-ended text responses is considerably more expensive and time-consuming than the evaluation of multiple-choice responses because it requires trained personnel and long manual coding sessions. Aim: Our semi-supervised coding method eco (exploring…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Mingya Huang; David Kaplan – Journal of Educational and Behavioral Statistics, 2025
The issue of model uncertainty has been gaining interest in education and the social sciences community over the years, and the dominant methods for handling model uncertainty are based on Bayesian inference, particularly, Bayesian model averaging. However, Bayesian model averaging assumes that the true data-generating model is within the…
Descriptors: Bayesian Statistics, Hierarchical Linear Modeling, Statistical Inference, Predictor Variables
Kaplan, David; Chen, Jianshen; Lyu, Weicong; Yavuz, Sinan – Large-scale Assessments in Education, 2023
The purpose of this paper is to extend and evaluate methods of "Bayesian historical borrowing" applied to longitudinal data with a focus on parameter recovery and predictive performance. Bayesian historical borrowing allows researchers to utilize information from previous data sources and to adjust the extent of borrowing based on the…
Descriptors: Bayesian Statistics, Longitudinal Studies, Children, Surveys
David Kaplan; Jianshen Chen; Weicong Lyu; Sinan Yavuz – Grantee Submission, 2023
The purpose of this paper is to extend and evaluate methods of "Bayesian historical borrowing" applied to longitudinal data with a focus on parameter recovery and predictive performance. Bayesian historical borrowing allows researchers to utilize information from previous data sources and to adjust the extent of borrowing based on the…
Descriptors: Bayesian Statistics, Longitudinal Studies, Children, Surveys
Zhou, Hao; Ma, Xin – Sociological Methods & Research, 2023
Hierarchical linear modeling (HLM) is often used to estimate the effects of socioeconomic status (SES) on academic achievement at different levels of an educational system. However, if a prior academic achievement measure is missing in an HLM model, biased estimates may occur on the effects of student SES and school SES. Phantom effects describe…
Descriptors: Simulation, Hierarchical Linear Modeling, Socioeconomic Status, Institutional Characteristics
Mang, Julia; Küchenhoff, Helmut; Meinck, Sabine; Prenzel, Manfred – Large-scale Assessments in Education, 2021
Background: Standard methods for analysing data from large-scale assessments (LSA) cannot merely be adopted if hierarchical (or multilevel) regression modelling should be applied. Currently various approaches exist; they all follow generally a design-based model of estimation using the pseudo maximum likelihood method and adjusted weights for the…
Descriptors: Sampling, Hierarchical Linear Modeling, Simulation, Scaling
Lundgren, Erik – Journal of Educational Data Mining, 2022
Response process data have the potential to provide a rich description of test-takers' thinking processes. However, retrieving insights from these data presents a challenge for educational assessments and educational data mining as they are complex and not well annotated. The present study addresses this challenge by developing a computational…
Descriptors: Problem Solving, Classification, Accuracy, Foreign Countries
Robitzsch, Alexander; Lüdtke, Oliver – Large-scale Assessments in Education, 2023
One major aim of international large-scale assessments (ILSA) like PISA is to monitor changes in student performance over time. To accomplish this task, a set of common items (i.e., link items) is repeatedly administered in each assessment. Linking methods based on item response theory (IRT) models are used to align the results from the different…
Descriptors: Educational Trends, Trend Analysis, International Assessment, Achievement Tests
Fujimoto, Ken A. – Journal of Educational Measurement, 2020
Multilevel bifactor item response theory (IRT) models are commonly used to account for features of the data that are related to the sampling and measurement processes used to gather those data. These models conventionally make assumptions about the portions of the data structure that represent these features. Unfortunately, when data violate these…
Descriptors: Bayesian Statistics, Item Response Theory, Achievement Tests, Secondary School Students
A Sequential Bayesian Changepoint Detection Procedure for Aberrant Behaviors in Computerized Testing
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
Changepoints are abrupt variations in a sequence of data in statistical inference. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
Abulela, Mohammed A. A.; Rios, Joseph A. – Applied Measurement in Education, 2022
When there are no personal consequences associated with test performance for examinees, rapid guessing (RG) is a concern and can differ between subgroups. To date, the impact of differential RG on item-level measurement invariance has received minimal attention. To that end, a simulation study was conducted to examine the robustness of the…
Descriptors: Comparative Analysis, Robustness (Statistics), Nonparametric Statistics, Item Analysis
Tijmstra, Jesper; Bolsinova, Maria; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2020
Although the root-mean squared deviation (RMSD) is a popular statistical measure for evaluating country-specific item-level misfit (i.e., differential item functioning [DIF]) in international large-scale assessment, this paper shows that its sensitivity to detect misfit may depend strongly on the proficiency distribution of the considered…
Descriptors: Test Items, Goodness of Fit, Probability, Accuracy
Grund, Simon; Lüdtke, Oliver; Robitzsch, Alexander – Journal of Educational and Behavioral Statistics, 2021
Large-scale assessments (LSAs) use Mislevy's "plausible value" (PV) approach to relate student proficiency to noncognitive variables administered in a background questionnaire. This method requires background variables to be completely observed, a requirement that is seldom fulfilled. In this article, we evaluate and compare the…
Descriptors: Data Analysis, Error of Measurement, Research Problems, Statistical Inference
Teig, Nani; Scherer, Ronny; Kjaernsli, Marit – Journal of Research in Science Teaching, 2020
Previous research has demonstrated the potential of examining log-file data from computer-based assessments to understand student interactions with complex inquiry tasks. Rather than solely providing information about what has been achieved or the accuracy of student responses ("product data"), students' log files offer additional…
Descriptors: Science Process Skills, Thinking Skills, Inquiry, Simulation