Ute Knoch; Jason Fan – Language Testing, 2024
While several test concordance tables have been published, the research underpinning such tables has rarely been examined in detail. This study aimed to survey the publicly available studies or documentation underpinning the test concordance tables of the providers of four major international language tests, all accepted by the Australian…
Descriptors: Language Tests, English, Test Validity, Item Analysis
Ayfer Sayin; Mark Gierl – Educational Measurement: Issues and Practice, 2024
The purpose of this study is to introduce and evaluate a method for generating reading comprehension items using template-based automatic item generation. To begin, we describe a new model for generating reading comprehension items called the text analysis cognitive model assessing inferential skills across different reading passages. Next, the…
Descriptors: Algorithms, Reading Comprehension, Item Analysis, Man Machine Systems
Spurgeon, Shawn L. – Measurement and Evaluation in Counseling and Development, 2017
Construct irrelevance (CI) and construct underrepresentation (CU) are 2 major threats to validity, yet they are rarely discussed within the counseling literature. This article provides information about the relevance of these threats to internal validity. An illustrative case example will be provided to assist counselors in understanding these…
Descriptors: Construct Validity, Evaluation Criteria, Evaluation Methods, Evaluation Problems
Öztürk-Gübes, Nese; Kelecioglu, Hülya – Educational Sciences: Theory and Practice, 2016
The purpose of this study was to examine the impact of dimensionality, common-item set format, and different scale linking methods on preserving equity property with mixed-format test equating. Item response theory (IRT) true-score equating (TSE) and IRT observed-score equating (OSE) methods were used under common-item nonequivalent groups design.…
Descriptors: Test Format, Item Response Theory, True Scores, Equated Scores
Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013
In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…
Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Croasmun, James T.; Ostrom, Lee – Journal of Adult Education, 2011
Likert scales are useful in social science and attitude research projects. The General Self-Efficacy Exam is a test used to determine whether factors in educational settings affect participants' learning self-efficacy. The original instrument had 10 efficacy items and used a 4-point Likert scale. The Cronbach's alphas for the original test ranged…
Descriptors: Self Efficacy, Social Sciences, Likert Scales, Measures (Individuals)
Stark, Stephen; Chernyshenko, Oleksandr S. – International Journal of Testing, 2011
This article delves into a relatively unexplored area of measurement by focusing on adaptive testing with unidimensional pairwise preference items. The use of such tests is becoming more common in applied non-cognitive assessment because research suggests that this format may help to reduce certain types of rater error and response sets commonly…
Descriptors: Test Length, Simulation, Adaptive Testing, Item Analysis
May, Tom – Journal of Vocational Education and Training, 2013
This article investigates whether the characteristics of feedback identified by the literature can be found within written assessment records for work and competence-based qualifications. It uses a brief literature review and analysis of the formative feedback within 257 learner portfolios. The characteristics identified by the literature are…
Descriptors: Formative Evaluation, Feedback (Response), Qualifications, Competency Based Education
Little, Mary E. – Educational Forum, 2012
The purpose of this article is to define and clarify the process of instructional problem-solving using assessment data within action research (AR) and Response to Intervention (RtI). Similarities between AR and RtI are defined and compared. Lastly, specific resources and examples of the instructional problem-solving process of AR within…
Descriptors: Intervention, Action Research, Problem Solving, Data Analysis
Mo, Lun; Yang, Fang; Hu, Xiangen – Educational Research and Evaluation, 2011
School climate surveys are widely applied in school districts across the nation to collect information about teacher efficacy, principal leadership, school safety, students' activities, and so forth. They enable school administrators to understand and address many issues on campus when used in conjunction with other student and staff data.…
Descriptors: Evidence, Academic Achievement, Questionnaires, Item Response Theory
Pounder, Diana – Journal of Research on Leadership Education, 2012
This article addresses the leadership preparation line of inquiry developed in the past decade by the University Council for Educational Administration/Learning and Teaching in Educational Leadership Special Interest Group Taskforce on Evaluating Leadership Preparation Programs, and it particularly addresses the series of survey instruments…
Descriptors: Administrator Education, Educational Administration, Instructional Leadership, Program Evaluation
Somerset, Anthony – Compare: A Journal of Comparative and International Education, 2011
Educational practitioners rely predominantly on measures of outcome, rather than of inputs or process, in making judgements as to quality. Outcome measures are available from two main sources: (1) the relatively new international assessment systems; and (2) the traditional national examinations systems. The two types of system differ in their…
Descriptors: Testing Programs, Educational Quality, National Competency Tests, Educational Improvement
Gonyea, Robert M.; Miller, Angie – New Directions for Institutional Research, 2011
Correlations between self-reported learning gains and direct, longitudinal measures that ostensibly correspond in content area are generally inadequate. This chapter clarifies that self-reported measures of learning are more properly used and interpreted as evidence of students' perceived learning and affective outcomes. In this context, the…
Descriptors: Evidence, College Students, Institutional Research, Social Desirability
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format