Publication Date
In 2025 | 0 |
Since 2024 | 3 |
Since 2021 (last 5 years) | 10 |
Since 2016 (last 10 years) | 20 |
Since 2006 (last 20 years) | 39 |
Descriptor
Error of Measurement | 57 |
Scores | 57 |
Test Items | 57 |
Item Response Theory | 22 |
Simulation | 14 |
Test Reliability | 13 |
Foreign Countries | 11 |
Test Bias | 11 |
Comparative Analysis | 10 |
Item Analysis | 10 |
Computation | 9 |
More ▼ |
Source
Author
Custer, Michael | 2 |
Emons, Wilco H. M. | 2 |
Lord, Frederic M. | 2 |
Sijtsma, Klaas | 2 |
Smith, Richard M. | 2 |
Sykes, Robert C. | 2 |
Ahmet Yildirim | 1 |
AlGhamdi, Hannan M. | 1 |
Alci, Devrim | 1 |
Anwyll, Steve | 1 |
Benítez, Isabel | 1 |
More ▼ |
Publication Type
Journal Articles | 38 |
Reports - Research | 38 |
Reports - Evaluative | 12 |
Speeches/Meeting Papers | 7 |
Reports - Descriptive | 4 |
Dissertations/Theses -… | 3 |
Numerical/Quantitative Data | 1 |
Tests/Questionnaires | 1 |
Education Level
Elementary Education | 6 |
Higher Education | 5 |
Postsecondary Education | 5 |
Secondary Education | 4 |
Elementary Secondary Education | 2 |
Grade 5 | 2 |
Grade 8 | 2 |
High Schools | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Grade 3 | 1 |
More ▼ |
Audience
Researchers | 2 |
Location
Indonesia | 2 |
Iran | 2 |
Canada | 1 |
Colorado (Boulder) | 1 |
France | 1 |
Georgia | 1 |
Maryland | 1 |
Netherlands | 1 |
Portugal | 1 |
Saudi Arabia | 1 |
South Africa | 1 |
More ▼ |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Race to the Top | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Gorney, Kylie; Wollack, James A. – Journal of Educational Measurement, 2023
In order to detect a wide range of aberrant behaviors, it can be useful to incorporate information beyond the dichotomous item scores. In this paper, we extend the l[subscript z] and l*[subscript z] person-fit statistics so that unusual behavior in item scores and unusual behavior in item distractors can be used as indicators of aberrance. Through…
Descriptors: Test Items, Scores, Goodness of Fit, Statistics
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
Custer, Michael; Kim, Jongpil – Online Submission, 2023
This study utilizes an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when utilizing the Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Batelle Developmental Inventory, 3rd Edition were used. Each item was scored with a…
Descriptors: Sample Size, Item Response Theory, Test Items, Computation
Ahmet Yildirim; Nizamettin Koç – International Journal of Assessment Tools in Education, 2024
The present research aims to examine whether the questions in the Program for the International Student Assessment (PISA) 2009 reading literacy instrument display differential item functioning (DIF) among the Turkish, French, and American samples based on univariate and multivariate matching techniques before and after the total score, which is…
Descriptors: Test Items, Item Analysis, Correlation, Error of Measurement
Mumba, Brian; Alci, Devrim; Uzun, N. Bilge – Journal on Educational Psychology, 2022
Assessment of measurement invariance is an essential component of construct validity in psychological measurement. However, the procedure for assessing measurement invariance with dichotomous items partially differs from that of invariance testing with continuous items. However, many studies have focused on invariance testing with continuous items…
Descriptors: Mathematics Tests, Test Items, Foreign Countries, Error of Measurement
Matt I. Brown; Patrick R. Heck; Christopher F. Chabris – Journal of Autism and Developmental Disorders, 2024
The Social Shapes Test (SST) is a measure of social intelligence which does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…
Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Rujun Xu; James Soland – International Journal of Testing, 2024
International surveys are increasingly being used to understand nonacademic outcomes like math and science motivation, and to inform education policy changes within countries. Such instruments assume that the measure works consistently across countries, ethnicities, and languages--that is, they assume measurement invariance. While studies have…
Descriptors: Surveys, Statistical Bias, Achievement Tests, Foreign Countries
Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020
A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…
Descriptors: Simulation, Sample Size, Item Analysis, Scores
Pei-Hsuan Chiu – ProQuest LLC, 2018
Evidence of student growth is a primary outcome of interest for educational accountability systems. When three or more years of student test data are available, questions around how students grow and what their predicted growth is can be answered. Given that test scores contain measurement error, this error should be considered in growth and…
Descriptors: Bayesian Statistics, Scores, Error of Measurement, Growth Models
Tsaousis, Ioannis; Sideridis, Georgios D.; AlGhamdi, Hannan M. – Journal of Psychoeducational Assessment, 2021
This study evaluated the psychometric quality of a computerized adaptive testing (CAT) version of the general cognitive ability test (GCAT), using a simulation study protocol put forth by Han, K. T. (2018a). For the needs of the analysis, three different sets of items were generated, providing an item pool of 165 items. Before evaluating the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Cognitive Tests, Cognitive Ability
Oranje, Andreas; Kolstad, Andrew – Journal of Educational and Behavioral Statistics, 2019
The design and psychometric methodology of the National Assessment of Educational Progress (NAEP) is constantly evolving to meet the changing interests and demands stemming from a rapidly shifting educational landscape. NAEP has been built on strong research foundations that include conducting extensive evaluations and comparisons before new…
Descriptors: National Competency Tests, Psychometrics, Statistical Analysis, Computation
van der Lans, Rikkert M.; Maulana, Ridwan; Helms-Lorenz, Michelle; Fernández-García, Carmen-María; Chun, Seyeoung; de Jager, Thelma; Irnidayanti, Yulia; Inda-Caro, Mercedes; Lee, Okhwa; Coetzee, Thys; Fadhilah, Nurul; Jeon, Meae; Moorer, Peter – SAGE Open, 2021
This study examines measurement invariance of student perceptions of teaching quality collected in five countries: Indonesia (n students = 6,331), the Netherlands (n students = 6,738), South Africa (n students = 3,422), South Korea (n students = 6,997) and Spain (n students = 4,676). The administered questionnaire was the My Teacher Questionnaire…
Descriptors: Foreign Countries, Student Attitudes, Student Evaluation of Teacher Performance, Teacher Effectiveness
Lee, HyeSun – Applied Measurement in Education, 2018
The current simulation study examined the effects of Item Parameter Drift (IPD) occurring in a short scale on parameter estimates in multilevel models where scores from a scale were employed as a time-varying predictor to account for outcome scores. Five factors, including three decisions about IPD, were considered for simulation conditions. It…
Descriptors: Test Items, Hierarchical Linear Modeling, Predictor Variables, Scores