Publication Date
  In 2025 | 0
  Since 2024 | 1
  Since 2021 (last 5 years) | 2
  Since 2016 (last 10 years) | 8
  Since 2006 (last 20 years) | 11
Descriptor
  Simulation | 21
  Test Items | 21
  Test Validity | 21
  Item Analysis | 11
  Item Response Theory | 8
  Scores | 7
  Computer Assisted Testing | 6
  Test Bias | 6
  Test Reliability | 6
  Adaptive Testing | 5
  Foreign Countries | 5
Author
  Pine, Steven M. | 2
  Weiss, David J. | 2
  Berberoglu, Giray | 1
  Beuchert, A. Kent | 1
  Carroll, Ian A. | 1
  Cliff, Norman | 1
  Eckerly, Carol | 1
  Edwards, Michael C. | 1
  Frankel, Lois | 1
  Gorney, Kylie | 1
  Grossen, Neal E. | 1
Publication Type
  Reports - Research | 14
  Journal Articles | 10
  Dissertations/Theses -… | 3
  Speeches/Meeting Papers | 3
  Reports - Descriptive | 2
  Reports - Evaluative | 2
Education Level
  Elementary Secondary Education | 1
  Grade 10 | 1
  Grade 5 | 1
  Grade 6 | 1
  Grade 7 | 1
  Grade 8 | 1
  Grade 9 | 1
  High Schools | 1
  Higher Education | 1
  Postsecondary Education | 1
  Secondary Education | 1
Location
  Philippines | 1
  South Africa | 1
  Turkey | 1
Assessments and Surveys
  Stanford Binet Intelligence… | 1
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Musa Adekunle Ayanwale; Mdutshekelwa Ndlovu – Journal of Pedagogical Research, 2024
The COVID-19 pandemic has had a significant impact on high-stakes testing, including the national benchmark tests in South Africa. Current linear testing formats have been criticized for their limitations, leading to a shift towards Computerized Adaptive Testing [CAT]. Assessments with CAT are more precise and take less time. Evaluation of CAT…
Descriptors: Adaptive Testing, Benchmarking, National Competency Tests, Computer Assisted Testing
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
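
For readers unfamiliar with the Rasch model discussed above, here is a minimal sketch (not the authors' analysis) of the Rasch item response function and a simulated small-sample calibration matrix; 25 examinees echoes the abstract's small-sample case, while the 40-item pool and all parameter values are arbitrary illustrative assumptions.

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch (1PL) probability of a correct response given ability theta and item difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

rng = np.random.default_rng(0)
n_persons, n_items = 25, 40              # 25 examinees per the abstract; 40 items is an assumed choice
theta = rng.normal(0.0, 1.0, n_persons)  # person abilities
b = rng.normal(0.0, 1.0, n_items)        # item difficulties

# Simulate a 25 x 40 matrix of dichotomous responses under the Rasch model.
p = rasch_prob(theta[:, None], b[None, :])
responses = rng.binomial(1, p)
print(responses.shape, responses.mean())
```
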
Carroll, Ian A. – ProQuest LLC, 2017
Relative to adaptive testing itself, item exposure control is a nascent concept, emerging in the academic literature only in the last two to three decades as a practical issue in high-stakes computerized adaptive tests. This study aims to implement a new item exposure control strategy by incorporating the standard error of the ability estimate into…
Descriptors: Test Items, Computer Assisted Testing, Selection, Adaptive Testing
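
The dissertation's specific strategy is not spelled out in the truncated abstract; purely as generic background, here is a sketch of maximum-information item selection with a simple exposure cap. The 2PL information function, the 20% cap, and every parameter value are assumptions for illustration, not the study's method.

```python
import numpy as np

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta (illustrative model choice)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

def select_item(theta_hat, a, b, exposure_counts, tests_given, max_rate=0.2):
    """Pick the most informative item whose exposure rate is still under an assumed cap."""
    info = item_information(theta_hat, a, b)
    rates = exposure_counts / max(tests_given, 1)
    info[rates >= max_rate] = -np.inf   # mask over-exposed items
    return int(np.argmax(info))

a = np.array([1.0, 1.5, 0.8])
b = np.array([-0.5, 0.0, 1.0])
print(select_item(0.3, a, b, exposure_counts=np.array([5, 1, 0]), tests_given=10))
```
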
Steinkamp, Susan Christa – ProQuest LLC, 2017
The use and interpretation of test scores that rely on accurate estimation of ability via an IRT model depend on the assumption that the IRT model fits the data. Examinees who do not put forth full effort in answering test questions, have prior knowledge of test content, or do not approach a test with the intent of answering…
Descriptors: Test Items, Item Response Theory, Scores, Test Wiseness
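
The dissertation's own approach is truncated above; as one standard way to flag the kinds of aberrant response behavior it describes, here is a sketch of the l_z person-fit statistic. The response pattern and model-implied probabilities are assumed placeholders, not data from the study.

```python
import numpy as np

def lz_person_fit(responses, p):
    """Standardized log-likelihood person-fit statistic l_z for one examinee.
    `responses` are 0/1 item scores; `p` are model-implied probabilities at the
    examinee's ability estimate. Large negative values suggest misfit."""
    responses = np.asarray(responses, dtype=float)
    p = np.asarray(p, dtype=float)
    l0 = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    mean = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    var = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (l0 - mean) / np.sqrt(var)

# Assumed example: an examinee who misses the easiest items despite high predicted success.
p = np.array([0.9, 0.85, 0.8, 0.6, 0.4])
print(round(lz_person_fit([0, 0, 1, 1, 1], p), 2))
```
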
Kalender, Ilker; Berberoglu, Giray – Educational Sciences: Theory and Practice, 2017
Admission into university in Turkey is very competitive and features a number of practical problems regarding not only the test administration process itself, but also concerning the psychometric properties of test scores. Computerized adaptive testing (CAT) is seen as a possible alternative approach to solve these problems. In the first phase of…
Descriptors: Foreign Countries, Computer Assisted Testing, College Admission, Simulation
Stanley, Leanne M.; Edwards, Michael C. – Educational and Psychological Measurement, 2016
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the…
Descriptors: Test Reliability, Goodness of Fit, Scores, Patients
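
To make the distinction concrete, here is a minimal sketch (assumed data, not the article's) of coefficient alpha for a score matrix; a high alpha by itself says nothing about whether a one-factor measurement model actually fits the responses.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

# Assumed example: 200 examinees, 10 items driven by a single latent trait plus noise.
rng = np.random.default_rng(1)
theta = rng.normal(size=(200, 1))
items = theta + rng.normal(scale=1.0, size=(200, 10))
print(round(cronbach_alpha(items), 3))
```
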
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
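
Longford's loss functions are not given in the abstract; purely as an illustration of the trade-off between the two kinds of inappropriate classification it mentions, here is a decision-theoretic sketch in which the probability and both loss values are placeholder assumptions.

```python
def classify_item(p_dif_exceeds_threshold, loss_flag_clean=1.0, loss_keep_dif=5.0):
    """Flag an item as DIF when the expected loss of keeping a DIF item outweighs
    the expected loss of flagging a clean one. All inputs are illustrative assumptions."""
    expected_loss_keep = p_dif_exceeds_threshold * loss_keep_dif
    expected_loss_flag = (1.0 - p_dif_exceeds_threshold) * loss_flag_clean
    return "flag" if expected_loss_keep > expected_loss_flag else "keep"

print(classify_item(0.3))   # expected loss 1.5 for keeping vs 0.7 for flagging -> "flag"
```
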
Wang, Hsuan-Po; Kuo, Bor-Chen; Tsai, Ya-Hsun; Liao, Chen-Huei – Turkish Online Journal of Educational Technology - TOJET, 2012
In the era of globalization, learning Chinese as a foreign language (CFL) has become increasingly popular worldwide. The increasing demand for learning CFL has raised the profile of the Chinese proficiency test (CPT). This study analyzes in depth the inadequacy of current CPTs utilizing the Common European Framework of…
Descriptors: Achievement Tests, Adaptive Testing, Computer Assisted Testing, Global Approach
Pommerich, Mary – Journal of Educational Measurement, 2006
Domain scores have been proposed as a user-friendly way of providing instructional feedback about examinees' skills. Domain performance typically cannot be measured directly; instead, scores must be estimated using available information. Simulation studies suggest that IRT-based methods yield accurate group domain score estimates. Because…
Descriptors: Test Validity, Scores, Simulation, Evaluation Methods
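
As background on what an IRT-based domain score estimate is (not Pommerich's specific method), here is a sketch that treats the estimated domain score as the expected proportion correct over a domain of items at a given ability; the 2PL model and all item parameters are assumptions for illustration.

```python
import numpy as np

def irt_domain_score(theta, a, b):
    """Expected proportion correct over a domain of 2PL items at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return p.mean()

# Assumed parameters for a 50-item domain.
rng = np.random.default_rng(2)
a = rng.uniform(0.8, 2.0, 50)
b = rng.normal(0.0, 1.0, 50)
print(round(irt_domain_score(theta=0.5, a=a, b=b), 3))
```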

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979
Ten item discrimination indices were compared across a variety of item analysis situations, based on the validities of tests constructed by using each index to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)
Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction
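
The ten indices studied are not listed in the abstract; as one familiar example of an item discrimination index, here is a point-biserial sketch that selects the 40 most discriminating items from a 100-item pool. The data are simulated here for illustration and are not the study's.

```python
import numpy as np

def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a dichotomous item and the total score."""
    item_scores = np.asarray(item_scores, dtype=float)
    total_scores = np.asarray(total_scores, dtype=float)
    return np.corrcoef(item_scores, total_scores)[0, 1]

rng = np.random.default_rng(3)
responses = rng.binomial(1, 0.6, size=(500, 100))   # assumed 500 examinees, 100-item pool
totals = responses.sum(axis=1)
discrim = np.array([point_biserial(responses[:, j], totals) for j in range(100)])
best_40 = np.argsort(discrim)[-40:]                 # indices of the 40 most discriminating items
print(best_40[:5])
```
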
Reckase, Mark D. – 1978
Five comparisons were made relative to the quality of estimates of ability parameters and item calibrations obtained from the one-parameter and three-parameter logistic models. The results indicate: (1) The three-parameter model fit the test data better in all cases than did the one-parameter model. For simulation data sets, multi-factor data were…
Descriptors: Comparative Analysis, Goodness of Fit, Item Analysis, Mathematical Models
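
For reference, a minimal sketch of the two logistic models being compared; the parameter values are illustrative assumptions, not Reckase's calibrations.

```python
import numpy as np

def logistic_1pl(theta, b):
    """One-parameter logistic (Rasch) item response function."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def logistic_3pl(theta, a, b, c):
    """Three-parameter logistic model: adds discrimination a and lower asymptote c."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(logistic_1pl(theta, b=0.0))
print(logistic_3pl(theta, a=1.2, b=0.0, c=0.2))
```
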
Swaak, Janine; And Others – 1997
A study was conducted to develop a test that is able to capture knowledge of an intuitive nature, such as that acquired through discovery learning. The proposed test format is called the "what-if test." Test items in this format consist of the presentation of a situation. A change in the situation is introduced, and learners have to…
Descriptors: College Students, Discovery Learning, Educational Assessment, Evaluation Methods
Nandakumar, Ratna; Yu, Feng – 1994
DIMTEST is a statistical test procedure for assessing essential unidimensionality of binary test item responses. The test statistic T used for testing the null hypothesis of essential unidimensionality is a nonparametric statistic. That is, there is no particular parametric distribution assumed for the underlying ability distribution or for the…
Descriptors: Ability, Content Validity, Correlation, Nonparametric Statistics
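
DIMTEST's T statistic itself is involved; the sketch below shows only the conditional-covariance idea behind it: split the items into an assessment subtest (AT) and a partitioning subtest (PT), then check whether covariances among AT items are near zero within groups of examinees matched on the PT score. The data, the split, and the group sizes are all assumptions for illustration.

```python
import numpy as np

def mean_conditional_covariance(at_items, pt_scores):
    """Average pairwise covariance among AT items within groups matched on the PT score.
    Values near zero are consistent with essential unidimensionality; this is only the
    idea behind DIMTEST, not its T statistic."""
    covs = []
    for s in np.unique(pt_scores):
        group = at_items[pt_scores == s]
        if group.shape[0] < 2:
            continue
        c = np.cov(group, rowvar=False)
        covs.append(c[np.triu_indices_from(c, k=1)].mean())
    return float(np.mean(covs))

# Assumed unidimensional data: 1000 examinees, 10 AT items, 20 PT items.
rng = np.random.default_rng(4)
theta = rng.normal(size=(1000, 1))
at = rng.binomial(1, 1 / (1 + np.exp(-(theta - rng.normal(size=10)))))
pt = rng.binomial(1, 1 / (1 + np.exp(-(theta - rng.normal(size=20)))))
print(round(mean_conditional_covariance(at, pt.sum(axis=1)), 4))
```
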