Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 11 |
Since 2006 (last 20 years) | 19 |
Descriptor
Accuracy | 19 |
Comparative Analysis | 19 |
Test Format | 19 |
Item Response Theory | 9 |
Test Items | 8 |
Classification | 6 |
Foreign Countries | 6 |
Language Tests | 5 |
Second Language Learning | 5 |
Computer Assisted Testing | 4 |
English (Second Language) | 4 |
More ▼ |
Source
Author
Kalender, Ilker | 2 |
Ahmadi, Alireza | 1 |
Ayan, Cansu | 1 |
Choi, Jiwon | 1 |
Chung, Hyewon | 1 |
Cikrikci, Nukhet | 1 |
DeCarlo, Lawrence T. | 1 |
Dodd, Barbara G. | 1 |
Eberharter, Kathrin | 1 |
Ebner, Viktoria S. | 1 |
Granena, Gisela | 1 |
More ▼ |
Publication Type
Journal Articles | 15 |
Reports - Research | 13 |
Dissertations/Theses -… | 4 |
Reports - Evaluative | 2 |
Numerical/Quantitative Data | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 3 |
Postsecondary Education | 3 |
Secondary Education | 3 |
Elementary Education | 2 |
Junior High Schools | 2 |
Middle Schools | 2 |
Grade 2 | 1 |
Grade 4 | 1 |
Grade 8 | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 1 |
What Works Clearinghouse Rating
Uk Hyun Cho – ProQuest LLC, 2024
The present study investigates the influence of multidimensionality on linking and equating in a unidimensional IRT. Two hypothetical multidimensional scenarios are explored under a nonequivalent group common-item equating design. The first scenario examines test forms designed to measure multiple constructs, while the second scenario examines a…
Descriptors: Item Response Theory, Classification, Correlation, Test Format
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Huang, Hung-Yu – Educational and Psychological Measurement, 2023
The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs)…
Descriptors: Test Items, Classification, Bayesian Statistics, Decision Making
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Cikrikci, Nukhet; Yalcin, Seher; Kalender, Ilker; Gul, Emrah; Ayan, Cansu; Uyumaz, Gizem; Sahin-Kursad, Merve; Kamis, Omer – International Journal of Assessment Tools in Education, 2020
This study tested the applicability of the theoretical Examination for Candidates of Driving License (ECODL) in Turkey as a computerized adaptive test (CAT). Firstly, various simulation conditions were tested for the live CAT through an item response theory-based calibrated item bank. The application of the simulated CAT was based on data from…
Descriptors: Motor Vehicles, Traffic Safety, Computer Assisted Testing, Item Response Theory
Eberharter, Kathrin; Kormos, Judit; Guggenbichler, Elisa; Ebner, Viktoria S.; Suzuki, Shungo; Moser-Frötscher, Doris; Konrad, Eva; Kremmel, Benjamin – Language Testing, 2023
In online environments, listening involves being able to pause or replay the recording as needed. Previous research indicates that control over the listening input could improve the measurement accuracy of listening assessment. Self-pacing also supports the second language (L2) comprehension processes of test-takers with specific learning…
Descriptors: Literacy, Native Language, Second Language Learning, Second Language Instruction
Liu, Chunyan; Kolen, Michael J. – Journal of Educational Measurement, 2018
Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed-format pseudo tests under the…
Descriptors: Comparative Analysis, Accuracy, Models, Sample Size
Kaya, Elif; O'Grady, Stefan; Kalender, Ilker – Language Testing, 2022
Language proficiency testing serves an important function of classifying examinees into different categories of ability. However, misclassification is to some extent inevitable and may have important consequences for stakeholders. Recent research suggests that classification efficacy may be enhanced substantially using computerized adaptive…
Descriptors: Item Response Theory, Test Items, Language Tests, Classification
Granena, Gisela – Studies in Second Language Acquisition, 2019
This study investigated the underlying structure of a set of eight cognitive tests from the two most recent language aptitude test batteries: the LLAMA (Meara, 2005) and the Hi-LAB (Linck et al., 2013) to see whether they had any underlying constructs in common. The study also examined whether any of the observed constructs could predict L2…
Descriptors: Second Language Learning, Intelligence Tests, Memory, Language Aptitude
Su, Yanfang; Willis, Gordon; Salomon, Joshua A. – Field Methods, 2017
Vignette design has been largely neglected in anchoring vignette studies. This study aimed to contribute to the science of vignette design by developing and evaluating vignettes for measuring vision in rural China. Cognitive interviews were conducted among 36 participants in a Chinese middle school. The respondents either directly evaluated vision…
Descriptors: Foreign Countries, Middle School Students, Vignettes, Questioning Techniques
Wu, Yi-Fang – ProQuest LLC, 2015
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and…
Descriptors: Item Response Theory, Test Items, Accuracy, Computation
Ahmadi, Alireza; Sadeghi, Elham – Language Assessment Quarterly, 2016
In the present study we investigated the effect of test format on oral performance in terms of test scores and discourse features (accuracy, fluency, and complexity). Moreover, we explored how the scores obtained on different test formats relate to such features. To this end, 23 Iranian EFL learners participated in three test formats of monologue,…
Descriptors: Oral Language, Comparative Analysis, Language Fluency, Accuracy
Wang, Xinrui – ProQuest LLC, 2013
The computer-adaptive multistage testing (ca-MST) has been developed as an alternative to computerized adaptive testing (CAT), and been increasingly adopted in large-scale assessments. Current research and practice only focus on ca-MST panels for credentialing purposes. The ca-MST test mode, therefore, is designed to gauge a single scale. The…
Descriptors: Computer Assisted Testing, Adaptive Testing, Diagnostic Tests, Comparative Analysis
Kim, Jiseon; Chung, Hyewon; Dodd, Barbara G.; Park, Ryoungsun – Educational and Psychological Measurement, 2012
This study compared various panel designs of the multistage test (MST) using mixed-format tests in the context of classification testing. Simulations varied the design of the first-stage module. The first stage was constructed according to three levels of test information functions (TIFs) with three different TIF centers. Additional computerized…
Descriptors: Test Format, Comparative Analysis, Computer Assisted Testing, Adaptive Testing
Quaiser-Pohl, Claudia; Neuburger, Sarah; Heil, Martin; Jansen, Petra; Schmelter, Andrea – International Journal of Testing, 2014
This article presents a reanalysis of the data of 862 second and fourth graders collected in two previous studies, focusing on the influence of method (psychometric vs. chronometric) and stimulus type on the gender difference in mental-rotation accuracy. The children had to solve mental-rotation tasks with animal pictures, letters, or cube…
Descriptors: Foreign Countries, Gender Differences, Accuracy, Age Differences
Previous Page | Next Page »
Pages: 1 | 2