ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	4
Since 2006 (last 20 years)	8

Descriptor

Evaluation Methods	9
Student Evaluation	9
Foreign Countries	5
Psychometrics	3
Test Bias	3
Test Construction	3
Scoring	2
Simulation	2
Student Diversity	2
Test Items	2
Test Use	2
Testing Accommodations	2
21st Century Skills	1
Academic Achievement	1
Achievement Tests	1
Adolescents	1
Artificial Intelligence	1
Case Studies	1
Classification	1
Clinical Diagnosis	1
Computation	1
Computer Assisted Testing	1
Cooperation	1
Cross Cultural Studies	1
Cultural Differences	1

Source

International Journal of…

Publication Type

Journal Articles	9
Reports - Descriptive	4
Reports - Research	4
Guides - General	1

Education Level

Elementary Education	1
Elementary Secondary Education	1
High Schools	1
Higher Education	1

Location

Australia	1
Belgium	1
Iowa	1
Spain	1
United Kingdom (England)	1
United States	1
Zimbabwe	1

Assessments and Surveys

Program for International…

Showing all 9 results

Using Evidence-Centered Design to Support the Development of Culturally and Linguistically Sensitive Collaborative Problem-Solving Assessments

Peer reviewed

Oliveri, María Elena; Lawless, René; Mislevy, Robert J. – International Journal of Testing, 2019

Collaborative problem solving (CPS) ranks among the top five most critical skills necessary for college graduates to meet workforce demands (Hart Research Associates, 2015). It is also deemed a critical skill for educational success (Beaver, 2013). It thus deserves more prominence in the suite of courses and subjects assessed in K-16. Such…

Descriptors: Cooperation, Problem Solving, Evidence Based Practice, 21st Century Skills

Challenges to the Use of Artificial Neural Networks for Diagnostic Classifications with Student Test Data

Peer reviewed

Briggs, Derek C.; Circi, Ruhan – International Journal of Testing, 2017

Artificial Neural Networks (ANNs) have been proposed as a promising approach for the classification of students into different levels of a psychological attribute hierarchy. Unfortunately, because such classifications typically rely upon internally produced item response patterns that have not been externally validated, the instability of ANN…

Descriptors: Artificial Intelligence, Classification, Student Evaluation, Tests

Item Calibration Samples and the Stability of Achievement Estimates and System Rankings: Another Look at the PISA Model

Peer reviewed

Rutkowski, Leslie; Rutkowski, David; Zhou, Yan – International Journal of Testing, 2016

Using an empirically-based simulation study, we show that typically used methods of choosing an item calibration sample have significant impacts on achievement bias and system rankings. We examine whether recent PISA accommodations, especially for lower performing participants, can mitigate some of this bias. Our findings indicate that standard…

Descriptors: Simulation, International Programs, Adolescents, Student Evaluation

ITC Guidelines for the Large-Scale Assessment of Linguistically and Culturally Diverse Populations

Peer reviewed

International Journal of Testing, 2019

These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…

Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage

Review of Sample Size for Structural Equation Models in Second Language Testing and Learning Research: A Monte Carlo Approach

Peer reviewed

In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2013

The importance of sample size, although widely discussed in the literature on structural equation modeling (SEM), has not been widely recognized among applied SEM researchers. To narrow this gap, we focus on second language testing and learning studies and examine the following: (a) Is the sample size sufficient in terms of precision and power of…

Descriptors: Structural Equation Models, Sample Size, Second Language Instruction, Monte Carlo Methods

Test Reviewing in Spain

Peer reviewed

Muniz, Jose; Fernandez-Hermida, Jose R.; Fonseca-Pedrero, Eduardo; Campillo-Alvarez, Angela; Pena-Suarez, Elsa – International Journal of Testing, 2012

The proper use of psychological tests requires that the measurement instruments have adequate psychometric properties, such as reliability and validity, and that the professionals who use the instruments have the necessary expertise. In this article, we present the first review of tests published in Spain, carried out with an assessment model…

Descriptors: Student Evaluation, Measurement, Foreign Countries, Psychometrics

Exploration of the Validity of Gender Differences in Mathematics Assessment Using Differential Bundle Functioning

Peer reviewed

Ong, Yoke Mooi; Williams, Julian Scott; Lamprianou, Iasonas – International Journal of Testing, 2011

The aims of this study are (a) to examine the sources of differential functioning by gender via differential bundle functioning (DBF) in mathematics assessment and (b) to use DBF to explore whether the differential functioning displayed is construct-relevant or construct-irrelevant. Three qualitatively different areas, namely curriculum domains,…

Descriptors: Test Bias, Gender Differences, Gender Bias, Mathematics Tests

An Exploration of Learning Disabilities in Four Countries: Implications for Test Development and Use in Developing Countries

Peer reviewed

Oakland, Thomas; Mpofu, Elias; Gregoire, Jacques; Faulkner, Michael – International Journal of Testing, 2007

Tests often are used to assist in assessing common childhood disabilities and disorders (e.g., mental retardation). Learning disabilities and difficulties (LD) constitute the plurality, even the majority, of school-related disorders in many countries. However, tests and other assessment methods to assess LD are not available universally and, among…

Descriptors: Foreign Countries, Learning Disabilities, Mental Retardation, Cross Cultural Studies

Design Rationale for a Complex Performance Assessment

Peer reviewed

Williamson, David M.; Bauer, Malcolm; Steinberg, Linda S.; Mislevy, Robert J.; Behrens, John T.; DeMark, Sarah F. – International Journal of Testing, 2004

In computer-based interactive environments meant to support learning, students must bring a wide range of relevant knowledge, skills, and abilities to bear jointly as they solve meaningful problems in a learning domain. To function effectively as an assessment, a computer system must additionally be able to evoke and interpret observable evidence…

Descriptors: Computer Assisted Testing, Psychometrics, Task Analysis, Performance Based Assessment

Author

Mislevy, Robert J.	2
Bauer, Malcolm	1
Behrens, John T.	1
Briggs, Derek C.	1
Campillo-Alvarez, Angela	1
Circi, Ruhan	1
DeMark, Sarah F.	1
Faulkner, Michael	1
Fernandez-Hermida, Jose R.	1
Fonseca-Pedrero, Eduardo	1
Gregoire, Jacques	1
In'nami, Yo	1
Koizumi, Rie	1
Lamprianou, Iasonas	1
Lawless, René	1
Mpofu, Elias	1
Muniz, Jose	1
Oakland, Thomas	1
Oliveri, María Elena	1
Ong, Yoke Mooi	1
Pena-Suarez, Elsa	1
Rutkowski, David	1
Rutkowski, Leslie	1
Steinberg, Linda S.	1
Williams, Julian Scott	1