Publication Date
In 2025 | 3 |
Since 2024 | 4 |
Since 2021 (last 5 years) | 5 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 15 |
Author
Alcock, Lara | 1 |
Alwis, W. A. M. | 1 |
Bhola, Dennison S. | 1 |
Brice, Julie | 1 |
Catherine Mata | 1 |
Coombes, Lee | 1 |
DeMars, Christine E. | 1 |
Elizabeth B. Vaughan | 1 |
Elosua, Paula | 1 |
Hamid Mohammadi | 1 |
Homer, Matthew S. | 1 |
Publication Type
Journal Articles | 13 |
Reports - Research | 10 |
Reports - Evaluative | 5 |
Dissertations/Theses -… | 1 |
Speeches/Meeting Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 16 |
Postsecondary Education | 10 |
Elementary Secondary Education | 1 |
Location
Canada | 1 |
China | 1 |
Georgia (Atlanta) | 1 |
North Carolina | 1 |
Oregon (Portland) | 1 |
Singapore | 1 |
United Kingdom (Leeds) | 1 |
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Catherine Mata; Katharine Meyer; Lindsay Page – Annenberg Institute for School Reform at Brown University, 2024
This article examines the risk of crossover contamination in individual-level randomization, a common concern in experimental research, in the context of a large-enrollment college course. While individual-level randomization is more efficient for assessing program effectiveness, it also increases the potential for control group students to cross…
Descriptors: Chemistry, Science Instruction, Undergraduate Students, Large Group Instruction
Elizabeth B. Vaughan; Saraswathi Tummuru; Jack Barbera – Chemistry Education Research and Practice, 2025
Students' expectations for their laboratory coursework are theorized to have an impact on their learning experiences and behaviors, such as engagement. Before students' expectations and engagement can be explored in different types of undergraduate chemistry laboratory courses, appropriate measures of these constructs must be identified, and…
Descriptors: Undergraduate Students, Organic Chemistry, Chemistry, Science Instruction
Razieh Fathi – ProQuest LLC, 2021
This dissertation describes an experiment to investigate how learners with different levels of background in computer science learn core concepts of computer science, in particular, algorithms. We designed a study to focus on cognitive task analysis for eliciting the empirical mental elements of learning two graph algorithms. Cognitive workload…
Descriptors: Undergraduate Students, Computer Science Education, Algorithms, Cognitive Development
Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016
Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…
Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis
Slepkov, Aaron D.; Shiell, Ralph C. – Physical Review Special Topics - Physics Education Research, 2014
Constructed-response (CR) questions are a mainstay of introductory physics textbooks and exams. However, because of the time, cost, and scoring reliability constraints associated with this format, CR questions are being increasingly replaced by multiple-choice (MC) questions in formal exams. The integrated testlet (IT) is a recently developed…
Descriptors: Science Tests, Physics, Responses, Multiple Choice Tests
Totten, Jeff W. – Journal of Learning in Higher Education, 2014
The original SOCO Scale was reduced to 10 items by Thomas, Soutar, and Ryan (2001). The author conducted a pretest and a posttest in his Personal Selling class during the Fall 2009 semester. Significant differences by gender, student sales experience and family member in the sales field were identified. The author once again pretested the…
Descriptors: Test Construction, Program Validation, Pretests Posttests, Questionnaires
Jones, Ian; Alcock, Lara – Studies in Higher Education, 2014
Peer assessment typically requires students to judge peers' work against assessment criteria. We tested an alternative approach in which students judged pairs of scripts against one another in the absence of assessment criteria. First year mathematics undergraduates (N = 194) sat a written test on conceptual understanding of multivariable…
Descriptors: Peer Evaluation, Evaluation Criteria, Alternative Assessment, Undergraduate Students
Morrison, Keith – Educational Research and Evaluation, 2013
This paper reviews the literature on comparing online and paper course evaluations in higher education and provides a case study of a very large randomised trial on the topic. It presents a mixed but generally optimistic picture of online course evaluations with respect to response rates, what they indicate, and how to increase them. The paper…
Descriptors: Literature Reviews, Course Evaluation, Case Studies, Higher Education
Lew, Magdeleine D. N.; Alwis, W. A. M.; Schmidt, Henk G. – Assessment & Evaluation in Higher Education, 2010
The purpose of the two studies presented here was to evaluate the accuracy of students' self-assessment ability, to examine whether this ability improves over time and to investigate whether self-assessment is more accurate if students believe that it contributes to improving learning. To that end, the accuracy of the self-assessments of 3588…
Descriptors: Self Evaluation (Individuals), Beliefs, Learning Processes, Correlation
Ricketts, Chris; Brice, Julie; Coombes, Lee – Advances in Health Sciences Education, 2010
The purpose of multiple choice tests of medical knowledge is to estimate as accurately as possible a candidate's level of knowledge. However, concern is sometimes expressed that multiple choice tests may also discriminate in undesirable and irrelevant ways, such as between minority ethnic groups or by sex of candidates. There is little literature…
Descriptors: Medical Students, Testing Accommodations, Ethnic Groups, Learning Disabilities
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries
Pell, Godfrey; Homer, Matthew S.; Roberts, Trudie E. – International Journal of Research & Method in Education, 2008
Increasingly, academic institutions are being required to improve the validity of the assessment process; unfortunately, often this is at the expense of reliability. In medical schools (such as Leeds), standardized tests of clinical skills, such as "Objective Structured Clinical Examinations" (OSCEs) are widely used to assess clinical…
Descriptors: Medical Education, Standardized Tests, Clinical Experience, Criterion Referenced Tests
Kong, Xiaojing J.; Wise, Steven L.; Bhola, Dennison S. – Educational and Psychological Measurement, 2007
This study compared four methods for setting item response time thresholds to differentiate rapid-guessing behavior from solution behavior. Thresholds were either (a) common for all test items, (b) based on item surface features such as the amount of reading required, (c) based on visually inspecting response time frequency distributions, or (d)…
Descriptors: Test Items, Reaction Time, Timed Tests, Item Response Theory