Madeline A. Schellman; Matthew J. Madison – Grantee Submission, 2024
Diagnostic classification models (DCMs) have grown in popularity as stakeholders increasingly desire actionable information related to students' skill competencies. Longitudinal DCMs offer a psychometric framework for providing estimates of students' proficiency status transitions over time. For both cross-sectional and longitudinal DCMs, it is…
Descriptors: Diagnostic Tests, Classification, Models, Psychometrics
Eirini M. Mitropoulou; Leonidas A. Zampetakis; Ioannis Tsaousis – Evaluation Review, 2024
Unfolding item response theory (IRT) models are important alternatives to dominance IRT models in describing the response processes on self-report tests. Their usage is common in personality measures, since they indicate potential differentiations in test score interpretation. This paper aims to gain a better insight into the structure of trait…
Descriptors: Foreign Countries, Adults, Item Response Theory, Personality Traits
Geoffrey Converse – ProQuest LLC, 2021
In educational measurement, Item Response Theory (IRT) provides a means of quantifying student knowledge. Specifically, IRT models the probability of a student answering a particular item correctly as a function of the student's continuous-valued latent abilities [theta] (e.g. add, subtract, multiply, divide) and parameters associated with the…
Descriptors: Item Response Theory, Test Validity, Student Evaluation, Computer Assisted Testing
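For readers unfamiliar with the idea summarized in this abstract, the following is a minimal sketch of a unidimensional two-parameter logistic (2PL) item response function. It is a standard textbook form and not necessarily the model used in the dissertation, which refers to several latent abilities. The function name `p_correct` and the parameter names `a` (discrimination) and `b` (difficulty) are assumptions chosen for illustration.

```python
import math


def p_correct(theta: float, a: float = 1.0, b: float = 0.0) -> float:
    """Probability of a correct response under a 2PL model (illustrative).

    theta: latent ability of the student
    a:     item discrimination (hypothetical parameter name)
    b:     item difficulty (hypothetical parameter name)
    With a = 1 this reduces to the Rasch (one-parameter) model.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# A student whose ability equals the item difficulty has an even chance.
print(p_correct(theta=0.5, b=0.5))  # prints 0.5
```

The probability rises with ability and falls with difficulty, which is the sense in which such a model quantifies student knowledge.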
Fu, Jianbin; Qu, Yanxuan – ETS Research Report Series, 2018
Various subscore estimation methods that use auxiliary information to improve subscore accuracy and stability have been developed. This report provides a review of various subscore estimation methods described in the literature. The methodology of each method is described, then research studies on these subscore estimation methods are summarized.…
Descriptors: Scores, Evaluation Methods, Item Response Theory, Test Items
Gulsah Gurkan – ProQuest LLC, 2021
Secondary analyses of international large-scale assessments (ILSA) commonly characterize relationships between variables of interest using correlations. However, the accuracy of correlation estimates is impaired by artefacts such as measurement error and clustering. Despite advancements in methodology, conventional correlation estimates or…
Descriptors: Secondary School Students, Achievement Tests, International Assessment, Foreign Countries
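To illustrate how measurement error distorts a correlation, the sketch below applies the classical Spearman correction for attenuation. This is a textbook formula offered only as background; the truncated abstract does not say which method the dissertation uses, and the function name `disattenuate` and its argument names are hypothetical.

```python
import math


def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation (illustrative only).

    r_observed: correlation between the two observed scores
    rel_x, rel_y: reliabilities of the two measures, each in (0, 1]
    Returns the estimated correlation between the underlying true scores.
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)


# With imperfect reliability the observed correlation understates
# the true-score correlation, so the corrected value is larger.
corrected = disattenuate(0.4, rel_x=0.8, rel_y=0.8)
print(round(corrected, 3))
```

This correction does not address clustering, the other artefact the abstract mentions.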
Koskey, Kristin L. K.; Makki, Nidaa; Ahmed, Wondimu; Garafolo, Nicholas G.; Visco, Donald P., Jr. – School Science and Mathematics, 2020
Integrating engineering into the K-12 science curriculum continues to be a focus in national reform efforts in science education. Although there is an increasing interest in research in and practice of integrating engineering in K-12 science education, to date only a few studies have focused on the development of an assessment tool to measure…
Descriptors: Middle School Students, Engineering, Design, Science Education
Xiao, Yang; Fritchman, Joseph C.; Bao, Jacqueline Y.; Nie, Ying; Han, Jing; Xiong, Jianwen; Xiao, Hua; Bao, Lei – Physical Review Physics Education Research, 2019
In physics education research (PER), concept inventories (CIs) have become standard instruments for assessing students' learning throughout instruction. To promote widespread use of concept inventories, previous studies have developed an approach to split a full length CI into short versions of CIs. This research extends the existing method to…
Descriptors: Physics, Science Instruction, Energy, Magnets
Bailes, Lauren P.; Nandakumar, Ratna – International Journal of Education Policy and Leadership, 2020
High-quality measurement tools are critical to school improvement efforts. Education researchers frequently employ surveys in order to assess a host of variables associated with school improvement. This article asserts that Rasch modeling techniques enhance the quality of a measurement tool because they comprise elements of both qualitative and…
Descriptors: Surveys, Evaluation Methods, Item Response Theory, Administrator Role
Jorgensen, Maribeth F.; Schweinle, William E. – Professional Counselor, 2018
The 68-item Research Identity Scale (RIS) was informed through qualitative exploration of research identity development in master's-level counseling students and practitioners. Classical psychometric analyses revealed the items had strong validity and reliability and a single factor. A one-parameter Rasch analysis and item review was used to…
Descriptors: Qualitative Research, Counseling Services, Counselor Training, Psychometrics
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity
Goldstein, Harvey – British Educational Research Journal, 2015
A response is made to a paper that urges the use of the Rasch model for educational assessment. This paper argues that the model is inadequate and that claims for its efficacy are exaggerated and technically weak.
Descriptors: Reader Response, Item Response Theory, Educational Assessment, Evaluation Methods
Wedman, Jonathan; Lyrén, Per-Erik – Practical Assessment, Research & Evaluation, 2015
When subscores on a test are reported to the test taker, the appropriateness of reporting them depends on whether they provide useful information above what is provided by the total score. Subscores that fail to do so lack adequate psychometric quality and should not be reported. There are several methods for examining the quality of subscores,…
Descriptors: Evaluation Methods, Psychometrics, Scores, Tests
Eckes, Thomas; Baghaei, Purya – Applied Measurement in Education, 2015
C-tests are gap-filling tests widely used to assess general language proficiency for purposes of placement, screening, or provision of feedback to language learners. C-tests consist of several short texts in which parts of words are missing. We addressed the issue of local dependence in C-tests using an explicit modeling approach based on testlet…
Descriptors: Language Proficiency, Language Tests, Item Response Theory, Test Reliability
Mahmud, Jumailiyah; Sutikno, Muzayanah; Naga, Dali S. – Educational Research and Reviews, 2016
The aim of this study is to determine the difference in variance between maximum likelihood and expected a posteriori estimation methods as a function of the number of test items in an aptitude test. The variance indicates the accuracy of the maximum likelihood and Bayes estimation methods. The test consists of three subtests, each with 40 multiple-choice…
Descriptors: Maximum Likelihood Statistics, Computation, Item Response Theory, Test Items
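The two estimators compared in this abstract can be sketched as follows under a Rasch model with known item difficulties. This is a simplified illustration, not the authors' procedure: the grid search, the standard normal prior, and all function names and item values are assumptions.

```python
import math


def p_correct(theta, b):
    """Rasch probability of a correct response (illustrative)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_likelihood(theta, responses, difficulties):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = p_correct(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


# Ability grid from -4 to 4 in steps of 0.01 (an arbitrary choice).
GRID = [-4.0 + 0.01 * i for i in range(801)]


def mle(responses, difficulties):
    """Maximum likelihood estimate: the grid point with the highest likelihood."""
    return max(GRID, key=lambda t: log_likelihood(t, responses, difficulties))


def eap(responses, difficulties):
    """Expected a posteriori estimate: posterior mean under a N(0, 1) prior."""
    weights = [
        math.exp(log_likelihood(t, responses, difficulties) - 0.5 * t * t)
        for t in GRID
    ]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)


# Hypothetical five-item test: three correct answers, two incorrect.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1, 1, 1, 0, 0]
print("MLE:", mle(responses, difficulties))
print("EAP:", eap(responses, difficulties))
```

In this example the EAP estimate lies between the prior mean of zero and the MLE, because the prior pulls the estimate toward its center. The MLE is undefined (it diverges) for all-correct or all-incorrect patterns, which is one reason Bayesian estimators are used with short tests.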
Kaspar, Roman; Döring, Ottmar; Wittmann, Eveline; Hartig, Johannes; Weyland, Ulrike; Nauerth, Annette; Möllers, Michaela; Rechenbach, Simone; Simon, Julia; Worofka, Iberé – Vocations and Learning, 2016
Valid and reliable standardized assessment of nursing competencies is needed to monitor the quality of vocational education and training (VET) in nursing and evaluate learning outcomes for care work trainees with increasingly heterogeneous learning backgrounds. To date, however, the modeling of professional competencies has not yet evolved into…
Descriptors: Nursing Education, Geriatrics, Video Technology, Computer Assisted Testing