Showing 1 to 15 of 403 results
Peer reviewed
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
Peer reviewed
Meagan Karvonen; Russell Swinburne Romine; Amy K. Clark – Practical Assessment, Research & Evaluation, 2024
This paper describes methods and findings from student cognitive labs, teacher cognitive labs, and test administration observations as evidence evaluated in a validity argument for a computer-based alternate assessment for students with significant cognitive disabilities. Validity of score interpretations and uses for alternate assessments based…
Descriptors: Students with Disabilities, Intellectual Disability, Severe Disabilities, Student Evaluation
Peer reviewed
Ke-Hai Yuan; Zhiyong Zhang; Lijuan Wang – Grantee Submission, 2024
Mediation analysis plays an important role in understanding causal processes in social and behavioral sciences. While path analysis with composite scores has been criticized for yielding biased parameter estimates when variables contain measurement errors, recent literature has pointed out that the population values of parameters of latent-variable models…
Descriptors: Structural Equation Models, Path Analysis, Weighted Scores, Comparative Testing
Peer reviewed
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Peer reviewed
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Peer reviewed
Mostafa M. Samy; Mohamed A. Metwally; Mahmoud Ashry; Wael M. Elmayyah – Measurement: Interdisciplinary Research and Perspectives, 2025
Gas Turbine Engines (GTE) have the highest power-to-weight ratio among Internal Combustion Engines (ICE). Their modularity and ability to use various types of fuel make them highly recommended for power plants, naval transportation, and, most prominently, aviation. The lack of real GTE data is increasing a recognized need for…
Descriptors: Engines, Power Technology, Data Collection, Data Interpretation
Peer reviewed
Nicole Marx; Wolfgang Mann – Journal of Multilingual and Multicultural Development, 2025
Language assessment is a central aspect of language education not only in the general population but also among heterogeneous, low-incidence populations. One such population is immigrant deaf and hard-of-hearing learners (IDML), who are bimodal-multilingual and whose language development often includes the spoken, written, and/or signed…
Descriptors: Foreign Countries, German, Sign Language, Immigrants
Peer reviewed
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Peer reviewed
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
Peer reviewed
Edward G. J. Stevenson; Jil Molenaar; David-Paul Pertaub; Dessalegn Tekle – Field Methods, 2025
Is it possible to measure wealth and poverty across settings while being faithful to local understandings? The stages of progress method (SoP) attempts to do this by building ladders of wealth in locally relevant terms and using these in comparisons across groups. This approach is potentially useful among pastoralist populations where monetary…
Descriptors: Foreign Countries, Poverty, Social Mobility, Evaluation Methods
Peer reviewed
Rafner, Janet; Biskjaer, Michael Mose; Zana, Blanka; Langsford, Steven; Bergenholtz, Carsten; Rahimi, Seyedahmad; Carugati, Andrea; Noy, Lior; Sherson, Jacob – Creativity Research Journal, 2022
Creativity assessments should be valid, reliable, and scalable to support various stakeholders (e.g., policy-makers, educators, corporations, and the general public) in their decision-making processes. Established initiatives toward scalable creativity assessments have relied on well-studied standardized tests. Although robust in many ways, most…
Descriptors: Creativity, Evaluation Methods, Video Games, Computer Assisted Testing
Peer reviewed
Ole J. Kemi – Advances in Physiology Education, 2025
Students are assessed by coursework and/or exams, all of which are marked by assessors (markers). Student and marker performances are then subject to end-of-session board of examiner handling and analysis. This occurs annually and is the basis for evaluating students but also the wider learning and teaching efficiency of an academic institution.…
Descriptors: Undergraduate Students, Evaluation Methods, Evaluation Criteria, Academic Standards
Peer reviewed
Eko Suhartoyo; Rida Afrilyasanti; Nur Mukminatien – Turkish Online Journal of Distance Education, 2025
In this paper, we investigated the impact of an online classroom-based reading assessment on implementing practices in reading instruction among 30 EFL learners in an intermediate reading course at a public university in East Java, Indonesia. Our study aimed to develop an online classroom-based reading assessment and evaluate its efficacy in…
Descriptors: Student Evaluation, Computer Assisted Testing, Reading Tests, Reading Instruction
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed
Yue Huang; Joshua Wilson – Journal of Computer Assisted Learning, 2025
Background: Automated writing evaluation (AWE) systems, used as formative assessment tools in writing classrooms, are promising for enhancing instruction and improving student performance. Although meta-analytic evidence supports AWE's effectiveness in various contexts, research on its effectiveness in the U.S. K-12 setting has lagged behind its…
Descriptors: Writing Evaluation, Writing Skills, Writing Tests, Writing Instruction