ERIC - Search Results

Showing 1 to 15 of 103 results

Examination of the Aggregate Scoring Method in a Judgment Concordance Test

Peer reviewed

Deschênes, Marie-France; Dionne, Éric; Dorion, Michelle; Grondin, Julie – Practical Assessment, Research & Evaluation, 2023

The use of the aggregate scoring method for scoring concordance tests requires the weighting of test items to be derived from the performance of a group of experts who take the test under the same conditions as the examinees. However, the average score of experts constituting the reference panel remains a critical issue in the use of these tests.…

Descriptors: Scoring, Tests, Evaluation Methods, Test Items

Automated Marking of Longer Computational Questions in Engineering Subjects

Peer reviewed

Pearson, Christopher; Penna, Nigel – Assessment & Evaluation in Higher Education, 2023

E-assessments are becoming increasingly common and progressively more complex. Consequently, how these longer, more complex questions are designed and marked is imperative. This article uses the NUMBAS e-assessment tool to investigate the best practice for creating longer questions and their mark schemes on surveying modules taken by engineering…

Descriptors: Automation, Scoring, Engineering Education, Foreign Countries

Comparing the Score Interpretation across Modes in PISA: An Investigation of How Item Facets Affect Difficulty

Peer reviewed

Harrison, Scott; Kroehne, Ulf; Goldhammer, Frank; Lüdtke, Oliver; Robitzsch, Alexander – Large-scale Assessments in Education, 2023

Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an…

Descriptors: Scoring, Test Items, Difficulty Level, Foreign Countries

Marginalized Learners in International and Regional Test Data: The Extent of Floor Effects

Peer reviewed

Gustafsson, Martin; Barakat, Bilal Fouad – Comparative Education Review, 2023

International assessments inform education policy debates, yet little is known about their floor effects: To what extent do they fail to differentiate between the lowest performers, and what are the implications of this? TIMSS, SACMEQ, and LLECE data are analyzed to answer this question. In TIMSS, floor effects have been reduced through the…

Descriptors: Achievement Tests, Elementary Secondary Education, International Assessment, Foreign Countries

Collaborative Problem-Solving Design in Large-Scale Assessments: Shedding Lights in Sequential Conversation-Based Measurement

Peer reviewed

Qiwei He – International Journal of Assessment Tools in Education, 2023

Collaborative problem solving (CPS) is inherently an interactive, conjoint, dual-strand process that considers how a student reasons about a problem as well as how s/he interacts with others to regulate social processes and exchange information (OECD, 2013). Measuring CPS skills presents a challenge for obtaining consistent, accurate, and reliable…

Descriptors: Cooperative Learning, Problem Solving, Test Items, International Assessment

Exploring Speededness in Pre-Reform GCSEs (2009 to 2016)

Emma Walland – Research Matters, 2024

GCSE examinations (taken by students aged 16 years in England) are not intended to be speeded (i.e. to be partly a test of how quickly students can answer questions). However, there has been little research exploring this. The aim of this research was to explore the speededness of past GCSE written examinations, using only the data from scored…

Descriptors: Educational Change, Test Items, Item Analysis, Scoring

Assessing the Ethical Capabilities of Chat GPT in Healthcare: A Study on Its Proficiency in Situational Judgement Test

Peer reviewed

Kunal Sareen – Innovations in Education and Teaching International, 2024

This study examines the proficiency of Chat GPT, an AI language model, in answering questions on the Situational Judgement Test (SJT), a widely used assessment tool for evaluating the fundamental competencies of medical graduates in the UK. A total of 252 SJT questions from the "Oxford Assess and Progress: Situational Judgement" Test…

Descriptors: Ethics, Decision Making, Artificial Intelligence, Computer Software

A Class of Cognitive Diagnosis Models for Polytomous Data

Peer reviewed

Gao, Xuliang; Ma, Wenchao; Wang, Daxun; Cai, Yan; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2021

This article proposes a class of cognitive diagnosis models (CDMs) for polytomously scored items with different link functions. Many existing polytomous CDMs can be considered as special cases of the proposed class of polytomous CDMs. Simulation studies were carried out to investigate the feasibility of the proposed CDMs and the performance of…

Descriptors: Cognitive Measurement, Models, Test Items, Scoring

Evaluating Human Scoring Using Generalizability Theory

Peer reviewed

Bimpeh, Yaw; Pointer, William; Smith, Ben Alexander; Harrison, Liz – Applied Measurement in Education, 2020

Many high-stakes examinations in the United Kingdom (UK) use both constructed-response items and selected-response items. We need to evaluate the inter-rater reliability for constructed-response items that are scored by humans. While there are a variety of methods for evaluating rater consistency across ratings in the psychometric literature, we…

Descriptors: Scoring, Generalizability Theory, Interrater Reliability, Foreign Countries

Coefficient [beta] as Extension of KR-21 Reliability for Summed and Scaled Scores for Polytomously-Scored Tests

Peer reviewed

Almehrizi, Rashid S. – Applied Measurement in Education, 2021

KR-21 reliability and its extension (coefficient [alpha]) gives the reliability estimate of test scores under the assumption of tau-equivalent forms. KR-21 reliability gives the reliability estimate for summed scores for dichotomous items when items are randomly sampled from an infinite pool of similar items (randomly parallel forms). The article…

Descriptors: Test Reliability, Scores, Scoring, Computation

A New Scoring Method for Item Response Theory Analysis of C-Tests

Peer reviewed

Farshad Effatpanah; Purya Baghaei; Mona Tabatabaee-Yazdi; Esmat Babaii – Language Testing, 2025

This study aimed to propose a new method for scoring C-Tests as measures of general language proficiency. In this approach, the unit of analysis is sentences rather than gaps or passages. That is, the gaps correctly reformulated in each sentence were aggregated as sentence score, and then each sentence was entered into the analysis as a polytomous…

Descriptors: Item Response Theory, Language Tests, Test Items, Test Construction

Autism Diagnostic Interview-Revised within DSM-5 Framework: Test of Reliability and Validity in Chinese Children

Peer reviewed

Lai, Kelly Y. C.; Yuen, Emily C. W.; Hung, Se Fong; Leung, Patrick W. L. – Journal of Autism and Developmental Disorders, 2022

This study examines the psychometric properties of the Autism Diagnostic Interview-Revised (ADI-R) in the context of DSM-5 in a sample of Chinese children. Using re-mapped ADI-R items and algorithms matched to DSM-5 criteria, and administering to children with autism spectrum disorder (ASD) with and without intellectual disability,…

Descriptors: Autism, Pervasive Developmental Disorders, Diagnostic Tests, Observation

Investigating Invariant Item Ordering Using Mokken Scale Analysis for Dichotomously Scored Items

Peer reviewed

Dirlik, Ezgi Mor – International Journal of Progressive Education, 2020

Mokken models have recently started to become the preferred method of researchers from different fields in studies of nonparametric item response theory (NIRT). Despite increasing application of these models, some features of this type of modelling need further study and explanation. Invariant item ordering (IIO) is one of these areas, which the…

Descriptors: Item Response Theory, Test Items, Nonparametric Statistics, Scoring

Assessment of Transversal Competencies: Current Tools in the Asian Region

Care, Esther; Vista, Alvin; Kim, Helyn – UNESCO Bangkok, 2019

UNESCO's Asia-Pacific Regional Bureau for Education has been working on education quality under the name of 'transversal competencies' (TVC) since 2013. Many of these competencies have been included in national education policy and curricula of countries in the region, but now the importance accorded them is increasingly gaining attention. As…

Descriptors: Foreign Countries, Educational Quality, 21st Century Skills, Competence

Analysis of Two-Tier Question Scoring Methods: A Case Study on the Lawson's Classroom Test of Scientific Reasoning

Peer reviewed

Zhou, Shao-Na; Liu, Qiao-Yi; Koenig, Kathleen; Xiao, Qiu-ye Li-Yang; Bao, Lei – Journal of Baltic Science Education, 2021

The Lawson's Classroom Test of Scientific Reasoning (LCTSR) is a popular instrument that measures the development of students' scientific reasoning skills. The instrument has a two-tier question design, which has led to multiple ways of scoring and interpretation. In this research, a method of pattern analysis was proposed and applied to analyze…

Descriptors: Science Tests, Science Process Skills, Logical Thinking, Multiple Choice Tests
