Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone for interpreting PISA test scores. However, an…
Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries
Emma Walland – Research Matters, 2024
GCSE examinations (taken by students aged 16 years in England) are not intended to be speeded (i.e. to be partly a test of how quickly students can answer questions). However, there has been little research exploring this. The aim of this research was to explore the speededness of past GCSE written examinations, using only the data from scored…
Descriptors: Educational Change, Test Items, Item Analysis, Scoring
Kunal Sareen – Innovations in Education and Teaching International, 2024
This study examines the proficiency of ChatGPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…
Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software
Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025
This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…
Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction
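The sentence-level aggregation described in the abstract above can be illustrated with a minimal sketch. It assumes each gap has already been marked correct or incorrect and is tagged with the sentence it belongs to. The function name `sentence_scores` and the sample data are hypothetical, and the sketch covers only the aggregation step, not the authors' IRT analysis.

```python
from collections import defaultdict


def sentence_scores(gap_results):
    """Aggregate per-gap correctness into one polytomous score per sentence.

    gap_results: iterable of (sentence_id, is_correct) pairs, one per gap.
    Returns a dict mapping sentence_id to the number of correct gaps.
    (Hypothetical helper; the data layout is an assumption.)
    """
    totals = defaultdict(int)
    for sentence_id, is_correct in gap_results:
        # Adding 0 for a wrong gap still registers the sentence with score 0.
        totals[sentence_id] += int(is_correct)
    return dict(totals)


# Made-up example: three sentences with 3, 2, and 1 gaps respectively.
responses = [(1, True), (1, False), (1, True), (2, True), (2, True), (3, False)]
scores = sentence_scores(responses)
print(scores)
```

Each sentence then acts as a single item whose score ranges from 0 to its number of gaps, which is the form a polytomous model would take as input.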
Hrubeš, Jan; Tywoniak, Adam; Balouch, Martin; Chvíla, Stanislav; Hrabovský, Jan – Journal of Chemical Education, 2021
Identification and further motivation of gifted students are widely discussed among the science education community. In the context of the educational system of the Czech Republic, competitions serve as one of the main ways to identify those gifted in precollege education. In this article, we report the foundation of a team-based open-book…
Descriptors: Gifted Education, Science Instruction, Teaching Methods, Teamwork
Bronkhorst, Hugo; Roorda, Gerrit; Suhre, Cor; Goedhart, Martin – Research in Mathematics Education, 2022
Logical reasoning as part of critical thinking is becoming more and more important to prepare students for their future life in society, work, and study. This article presents the results of a quasi-experimental study with a pre-test-post-test control group design focusing on the effective use of formalisations to support logical reasoning. The…
Descriptors: Mathematics Instruction, Teaching Methods, Logical Thinking, Critical Thinking
von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023
Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We compare the classification accuracy of convolutional and feed-forward approaches. Our…
Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education
Li, Shuai; Wen, Ting; Li, Xian; Feng, Yali; Lin, Chuan – Language Testing, 2023
This study compared holistic and analytic marking methods for their effects on parameter estimation (of examinees, raters, and items) and rater cognition in assessing speech act production in L2 Chinese. Seventy American learners of Chinese completed an oral Discourse Completion Test assessing requests and refusals. Four first-language (L1)…
Descriptors: Speech Acts, Second Language Learning, Second Language Instruction, Chinese
Coniam, David; Lee, Tony; Milanovic, Michael; Pike, Nigel; Zhao, Wen – Language Education & Assessment, 2022
The calibration of test materials generally involves the interaction between empirical analysis and expert judgement. This paper explores the extent to which scale familiarity might affect expert judgement as a component of test validation in the calibration process. It forms part of a larger study that investigates the alignment of the…
Descriptors: Specialists, Language Tests, Test Validity, College Faculty
Wasis; Kumaidi; Bastari; Mundilarto; Wintarti, Atik – Eurasian Journal of Educational Research, 2018
Purpose: This developmental research study aims to develop a model of polytomous scoring based on weighting for multiple correct items in the subject of physics. Weights were assigned analytically according to question complexity, with penalties imposed on wrong answers. Research Methods: Within the development model, Fenrich's development…
Descriptors: Physics, Science Education, Scoring, Secondary School Students
Jølle, Lennart; Skar, Gustaf B. – Scandinavian Journal of Educational Research, 2020
This paper reports findings from a project called "The National Panel of Raters" (NPR) that took place within a writing test programme in Norway (2010-2016). A recent research project found individual differences between the raters in the NPR. This paper reports results from an exploratory follow-up study where 63 NPR members were…
Descriptors: Foreign Countries, Validity, Scoring, Program Descriptions
Katemba, Caroline V.; Ning, Wei – Online Submission, 2018
This study investigates students' responses to learning new vocabulary through subtitled English movies. The research question is: what are students' responses to enhancing new vocabulary through subtitled English movies? To achieve this objective, the study employed a quantitative method. The data were obtained from a questionnaire.…
Descriptors: Films, Vocabulary Development, Visual Aids, Student Attitudes
Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020
Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…
Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language
Mullis, Ina V. S., Ed.; Martin, Michael O., Ed.; von Davier, Matthias, Ed. – International Association for the Evaluation of Educational Achievement, 2021
TIMSS (Trends in International Mathematics and Science Study) is a long-standing international assessment of mathematics and science at the fourth and eighth grades that has been collecting trend data every four years since 1995. About 70 countries use TIMSS trend data for monitoring the effectiveness of their education systems in a global…
Descriptors: Achievement Tests, International Assessment, Science Achievement, Mathematics Achievement
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis