Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of ChatGPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Zhang, Mengxue; Heffernan, Neil; Lan, Andrew – International Educational Data Mining Society, 2023
Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches for automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score…
Descriptors: Scoring, Computer Assisted Testing, Mathematics Instruction, Mathematics Tests
Gregory J. Crowther; Usha Sankar; Leena S. Knight; Deborah L. Myers; Kevin T. Patton; Lekelia D. Jenkins; Thomas A. Knight – Journal of Microbiology & Biology Education, 2023
The biology education literature includes compelling assertions that unfamiliar problems are especially useful for revealing students' true understanding of biology. However, there is only limited evidence that such novel problems have different cognitive requirements than more familiar problems. Here, we sought additional evidence by using…
Descriptors: Science Instruction, Artificial Intelligence, Scoring, Molecular Structure
Selcuk Acar; Denis Dumas; Peter Organisciak; Kelly Berthiaume – Grantee Submission, 2024
Creativity is highly valued in both education and the workforce, but assessing and developing creativity can be difficult without psychometrically robust and affordable tools. The open-ended nature of creativity assessments has made them difficult to score, expensive, often imprecise, and therefore impractical for school- or district-wide use. To…
Descriptors: Thinking Skills, Elementary School Students, Artificial Intelligence, Measurement Techniques
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
van Rijn, Peter W.; Ali, Usama S. – ETS Research Report Series, 2018
A computer program was developed to estimate speed-accuracy response models for dichotomous items. This report describes how the models are estimated and how to specify data and input files. An example using data from a listening section of an international language test is described to illustrate the modeling approach and features of the computer…
Descriptors: Computer Software, Computation, Reaction Time, Timed Tests
Çekiç, Ahmet; Bakla, Arif – International Online Journal of Education and Teaching, 2021
The Internet and the software stores for mobile devices come with a huge number of digital tools for any task, and those intended for digital formative assessment (DFA) have burgeoned exponentially in the last decade. These tools vary in terms of their functionality, pedagogical quality, cost, operating systems and so forth. Teachers and learners…
Descriptors: Formative Evaluation, Futures (of Society), Computer Assisted Testing, Guidance
Aybek, Eren Can; Demirtasli, R. Nukhet – International Journal of Research in Education and Science, 2017
This article aims to provide a theoretical framework for computerized adaptive tests (CAT) and item response theory models for polytomous items. It also aims to introduce simulation and live CAT software to researchers in the field. Computerized adaptive test algorithm, assumptions of item response theory models, nominal response…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Test Items
Mullis, Ina V. S., Ed.; Martin, Michael O., Ed.; von Davier, Matthias, Ed. – International Association for the Evaluation of Educational Achievement, 2021
TIMSS (Trends in International Mathematics and Science Study) is a long-standing international assessment of mathematics and science at the fourth and eighth grades that has been collecting trend data every four years since 1995. About 70 countries use TIMSS trend data for monitoring the effectiveness of their education systems in a global…
Descriptors: Achievement Tests, International Assessment, Science Achievement, Mathematics Achievement
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Ashwell, Tim; Elam, Jesse R. – JALT CALL Journal, 2017
The ultimate aim of our research project was to use the Google Web Speech API to automate scoring of elicited imitation (EI) tests. However, in order to achieve this goal, we had to take a number of preparatory steps. We needed to assess how accurate this speech recognition tool is in recognizing native speakers' production of the test items; we…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Tests
Feng, Mingyu, Ed.; Käser, Tanja, Ed.; Talukdar, Partha, Ed. – International Educational Data Mining Society, 2023
The Indian Institute of Science is proud to host the fully in-person sixteenth iteration of the International Conference on Educational Data Mining (EDM) during July 11-14, 2023. EDM is the annual flagship conference of the International Educational Data Mining Society. The theme of this year's conference is "Educational data mining for…
Descriptors: Information Retrieval, Data Analysis, Computer Assisted Testing, Cheating
OECD Publishing, 2013
The Programme for the International Assessment of Adult Competencies (PIAAC) has been planned as an ongoing program of assessment. The first cycle of the assessment has involved two "rounds." The first round, which is covered by this report, took place over the period of January 2008-October 2013. The main features of the first cycle of…
Descriptors: International Assessment, Adults, Skills, Test Construction
Deng, Nina – ProQuest LLC, 2011
Three decision consistency and accuracy (DC/DA) methods, the Livingston and Lewis (LL) method, LEE method, and the Hambleton and Han (HH) method, were evaluated. The purposes of the study were: (1) to evaluate the accuracy and robustness of these methods, especially when their assumptions were not well satisfied, (2) to investigate the "true"…
Descriptors: Item Response Theory, Test Theory, Computation, Classification
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests