Showing 1 to 15 of 30 results
Peer reviewed
Yang Du; Susu Zhang – Journal of Educational and Behavioral Statistics, 2025
Item compromise has long posed challenges in educational measurement, jeopardizing both test validity and test security of continuous tests. Detecting compromised items is therefore crucial to address this concern. The present literature on compromised item detection reveals two notable gaps: First, the majority of existing methods are based upon…
Descriptors: Item Response Theory, Item Analysis, Bayesian Statistics, Educational Assessment
Peer reviewed
Ezi Apino; Edi Istiyono; Heri Retnawati; Widihastuti Widihastuti; Kana Hidayati – Journal of Pedagogical Research, 2024
Assessment of attitudes towards statistics [ATS] is needed to support the success of statistics education in tertiary institutions, so measuring instruments with high accuracy is required. However, existing instruments to measure ATS have not considered the use of technology as an essential variable affecting success in statistics education. The…
Descriptors: Foreign Countries, College Students, College Faculty, Statistics Education
Peer reviewed
Celeste Combrinck; Nelé Loubser – Discover Education, 2025
Written assignments for large classes pose a far more significant challenge in the age of the GenAI revolution. Suggestions such as oral exams and formative assessments are not always feasible with many students in a class. Therefore, we conducted a study in South Africa and involved 280 Honors students to explore the usefulness of Turnitin's AI…
Descriptors: Foreign Countries, Artificial Intelligence, Large Group Instruction, Alternative Assessment
Peer reviewed
Shinta Estri Wahyuningrum; Gilles van Luijtelaar; Augustina Sulastri; Marc P. H. Hendriks; Ridwan Sanjaya; Tom Heskes – SAGE Open, 2024
Visual Reproduction is a condition to measure Visual Spatial Memory as one of the cognitive domains commonly used to measure visuo-spatial memory. Geometric figures serve as stimulus material, and probands have to reproduce the figures from memory through a hand drawing. The scoring of the drawing has subjective elements. This study aims to…
Descriptors: Automation, Scores, Geometry, Visual Aids
Peer reviewed
Penny Smith; Tracey Carlyon – Assessment Matters, 2023
Learning and assessment that drives learner success should be a key tenet of all initial teacher education programmes. Initial teacher education providers in Aotearoa New Zealand must use an assessment framework to ensure that graduating teachers meet the Teaching Council standards. As a part of a review of their assessment practices, academic…
Descriptors: Foreign Countries, Beginning Teachers, Beginning Teacher Induction, Teacher Education
Patrick C. Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Institute, 2024
Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…
Descriptors: Performance Based Assessment, Evaluation Criteria, Evaluation Methods, Test Bias
Jing Lu; Chun Wang; Jiwei Zhang; Xue Wang – Grantee Submission, 2023
Changepoints are abrupt variations in a sequence of data in statistical inference. In educational and psychological assessments, it is pivotal to properly differentiate examinees' aberrant behaviors from solution behavior to ensure test reliability and validity. In this paper, we propose a sequential Bayesian changepoint detection algorithm to…
Descriptors: Bayesian Statistics, Behavior Patterns, Computer Assisted Testing, Accuracy
Peer reviewed
Gu, Lixiong; Ling, Guangming; Qu, Yanxuan – ETS Research Report Series, 2019
Research has found that the "a"-stratified item selection strategy (STR) for computerized adaptive tests (CATs) may lead to insufficient use of high-"a" items at later stages of the tests and thus to reduced measurement precision. A refined approach, unequal item selection across strata (USTR), effectively improves test precision over the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Use, Test Items
Peer reviewed
Singh, Upasana Gitanjali; de Villiers, Mary Ruth – International Review of Research in Open and Distributed Learning, 2017
e-Assessment, in the form of tools and systems that deliver and administer multiple choice questions (MCQs), is used increasingly, raising the need for evaluation and validation of such systems. This research uses literature and a series of six empirical action research studies to develop an evaluation framework of categories and criteria called…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Test Selection, Action Research
Halloran, Jo-Ann – ProQuest LLC, 2013
Government entities set criteria for institutions that have teacher educator programs to use online assessment tools to show continuous ongoing evaluation, and use data from the tools to guide the improvement of courses. The purpose of this qualitative, multi-case study was to discover how Instructional Designers-by-Assignment (IDBA) are using…
Descriptors: Instructional Design, Student Evaluation, Computer Assisted Testing, Evaluation Methods
Peer reviewed
Ramineni, Chaitanya; Williamson, David M. – Assessing Writing, 2013
In this paper, we provide an overview of psychometric procedures and guidelines Educational Testing Service (ETS) uses to evaluate automated essay scoring for operational use. We briefly describe the e-rater system, the procedures and criteria used to evaluate e-rater, implications for a range of potential uses of e-rater, and directions for…
Descriptors: Educational Testing, Guidelines, Scoring, Psychometrics
Peer reviewed
Makransky, Guido; Glas, Cees A. W. – International Journal of Testing, 2013
Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…
Descriptors: Foreign Countries, Cognitive Ability, Adaptive Testing, Feedback (Response)
Peer reviewed
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.; Davey, Tim; Bridgeman, Brent – ETS Research Report Series, 2012
Scoring models for the "e-rater"® system were built and evaluated for the "TOEFL"® exam's independent and integrated writing prompts. Prompt-specific and generic scoring models were built, and evaluation statistics, such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with…
Descriptors: Scoring, Prompting, Evaluators, Computer Software
Peer reviewed
Passos, Valeria Lima; Berger, Martijn P. F.; Tan, Frans E. S. – Journal of Educational and Behavioral Statistics, 2008
During the early stage of computerized adaptive testing (CAT), item selection criteria based on Fisher's information often produce less stable latent trait estimates than the Kullback-Leibler global information criterion. Robustness against early stage instability has been reported for the D-optimality criterion in a polytomous CAT with the…
Descriptors: Computer Assisted Testing, Adaptive Testing, Evaluation Criteria, Item Analysis
Peer reviewed
Yin, Alexander C.; Volkwein, J. Fredericks – New Directions for Institutional Research, 2010
After surveying 1,827 students in their final year at eighty randomly selected two-year and four-year public and private institutions, American Institutes for Research (2006) reported that approximately 30 percent of students in two-year institutions and nearly 20 percent of students in four-year institutions have only basic quantitative…
Descriptors: Standardized Tests, Basic Skills, College Admission, Educational Testing