Lotfi Simon Kerzabi – ProQuest LLC, 2021
Monte Carlo methods are an accepted methodology for generating critical values for a maximum test. The same methods are also applicable to evaluating the robustness of the newly created test. A table of critical values was created, and the robustness of the new maximum test was evaluated for five different distributions. Robustness…
Descriptors: Data, Monte Carlo Methods, Testing, Evaluation Research
Developing a High Performance Digital Education Ecosystem: Institutional Self-Assessment Instruments
Volungeviciene, Airina; Brown, Mark; Greenspon, Rasa; Gaebel, Michael; Morrisroe, Alison – European University Association, 2021
Digitally enhanced learning and teaching is widely used across the European Higher Education Area, with general acceptance growing over the years and institutions widely acknowledging the benefits it brings to the student experience. The strategic focus being placed on digitally enhanced learning and teaching has increased, undoubtedly accelerated…
Descriptors: Educational Technology, Technology Uses in Education, Program Evaluation, Self Evaluation (Groups)
Samosa, Resty C. – Online Submission, 2022
Due to the unprecedented COVID-19 incident, basic education institutions have faced different challenges in their teaching-learning activities. Particularly conducting assessments remotely during COVID-19 has posed extraordinary challenges for basic education institutions owing to lack of preparation superimposed with the inherent problems of…
Descriptors: Educational Change, COVID-19, Pandemics, Teaching Methods
Kim, Eun Sook; Kwok, Oi-man; Yoon, Myeongsun – Structural Equation Modeling: A Multidisciplinary Journal, 2012
Testing factorial invariance has recently gained more attention in different social science disciplines. Nevertheless, when examining factorial invariance, it is generally assumed that the observations are independent of each other, which might not always be true. In this study, we examined the impact of testing factorial invariance in multilevel…
Descriptors: Monte Carlo Methods, Testing, Social Science Research, Factor Structure
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014
Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…
Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy
Engelhard, George, Jr.; Perkins, Aminah F. – Measurement: Interdisciplinary Research and Perspectives, 2011
Humphry (this issue) has written a thought-provoking piece on the interpretation of item discrimination parameters as scale units in item response theory. One of the key features of his work is the description of an item response theory (IRT) model that he calls the logistic measurement function that combines aspects of two traditions in IRT that…
Descriptors: Foreign Countries, Social Sciences, Item Response Theory, Testing
Mislevy, Robert J.; Haertel, Geneva; Cheng, Britte H.; Ructtinger, Liliana; DeBarger, Angela; Murray, Elizabeth; Rose, David; Gravel, Jenna; Colker, Alexis M.; Rutstein, Daisy; Vendlinski, Terry – Educational Research and Evaluation, 2013
Standardizing aspects of assessments has long been recognized as a tactic to help make evaluations of examinees fair. It reduces variation in irrelevant aspects of testing procedures that could advantage some examinees and disadvantage others. However, recent attention to making assessment accessible to a more diverse population of students…
Descriptors: Testing Accommodations, Access to Education, Testing, Psychometrics
Rodgers, Joseph Lee; Rodgers, Jacci L. – Journal of Continuing Higher Education, 2011
We propose, develop, and evaluate the black ink-red ink (BIRI) method of testing. This approach uses two different methods within the same test administration setting, one that matches recognition learning and the other that matches recall learning. Students purposively define their own tradeoff between the two approaches. Evaluation of the method…
Descriptors: Testing, Test Anxiety, Recall (Psychology), Recognition (Psychology)
Brock, Rebecca L.; Barry, Robin A.; Lawrence, Erika; Dey, Jodi; Rolffs, Jaci – Assessment, 2012
This study examined the psychometric equivalence of paper-and-pencil and Internet formats of key questionnaires used in couple research. Self-report questionnaires assessing interpersonal constructs (relationship satisfaction, communication/conflict management, partner support, emotional intimacy) and intrapersonal constructs (individual traits,…
Descriptors: Satisfaction, Conflict, Intimacy, Questionnaires
Ogunnaike-Lafe, Yomi; Krohn, Joan – Exchange: The Early Childhood Leaders' Magazine Since 1978, 2010
Assessment is a hotly contested issue in education today. The education policy No Child Left Behind (NCLB) emphasizes standardized testing throughout a child's schooling as a major means of assessment. Even at Head Start an attempt was made at standardized testing using the National Reporting System (NRS). Although research indicates that these…
Descriptors: Early Childhood Education, Testing, Standardized Tests, Learning Processes
Howie, Sarah – Assessment in Education: Principles, Policy & Practice, 2012
The Jomtien conference in 1990 on Education for All is seen by many as a turning point for the introduction of increased monitoring and evaluation of the quality of education systems around the world. Internationally, debates have arisen about the nature and frequency of assessment and its impact on education systems with its intended and…
Descriptors: Test Use, Testing, High Stakes Tests, Measures (Individuals)
Haja, Shajahan; Clarke, David – Mathematics Education Research Journal, 2011
The structure of two-tier testing is such that the first tier consists of a multiple-choice question and the second tier requires justifications for choices of answers made in the first tier. This study aims to evaluate two-tier tasks in "proportion" in terms of students' capacity to write and select justifications and to examine the effect of…
Descriptors: Student Attitudes, Alternative Assessment, Testing, Misconceptions
Lim, David – Journal of Vocational Education and Training, 2009
Operating a quality assurance system in tertiary education is the rule rather than the exception, because of the belief that it will improve quality. However, proving this is not easy. This study examines three ways of providing the evidence: the a priori method, the stepwise backtracking method, and the external evaluation method. The…
Descriptors: Testing, Quality Control, Program Effectiveness, Foreign Countries
Looney, Janet W. – OECD Publishing (NJ1), 2011
A long-held ambition for many educators and assessment experts has been to integrate summative and formative assessments so that data from external assessments used for system monitoring may also be used to shape teaching and learning in classrooms. In turn, classroom-based assessments may provide valuable data for decision makers at school and…
Descriptors: Research and Development, Formative Evaluation, Testing, Summative Evaluation