ERIC - Search Results

Showing all 14 results

Score Comparability between Online Proctored and In-Person Credentialing Exams

Peer reviewed

Jones, Paul; Tong, Ye; Liu, Jinghua; Borglum, Joshua; Primoli, Vince – Journal of Educational Measurement, 2022

This article studied two methods to detect mode effects in two credentialing exams. In Study 1, we used a "modal scale comparison approach," where the same pool of items was calibrated separately, without transformation, within two TC cohorts (TC1 and TC2) and one OP cohort (OP1) matched on their pool-based scale score distributions. The…

Descriptors: Scores, Credentials, Licensing Examinations (Professions), Computer Assisted Testing

The Equivalence of TOEP Forms

Peer reviewed

Madya, Suwarsih; Retnawati, Heri; Purnawan, Ari; Putro, Nur Hidayanto Pancoro Setyo; Apino, Ezi – TEFLIN Journal: A publication on the teaching and learning of English, 2019

This explorative-descriptive study set out to examine the equivalence among Test of English Proficiency (TOEP) forms, developed by the Indonesian Testing Service Centre (ITSC) and co-founded by The Association for The Teaching of English as a Foreign Language in Indonesia (TEFLIN) and The Association of Psychology in Indonesia. Using a…

Descriptors: Language Tests, Language Proficiency, English (Second Language), Second Language Learning

PISA 2015: How Big Is the 'Mode Effect' and What Has Been Done about It?

Peer reviewed

Jerrim, John; Micklewright, John; Heine, Jorg-Henrik; Salzer, Christine; McKeown, Caroline – Oxford Review of Education, 2018

The Programme for International Student Assessment (PISA) is an important cross-national study of 15-year-olds' academic knowledge and skills. Educationalists and public policymakers eagerly await the tri-annual results, with particular interest in whether their country has moved up or slid down the international rankings, as compared to earlier…

Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students

Time Pressure in Scenario-Based Online Construction Safety Quizzes and Its Effect on Students' Performance

Peer reviewed

Jaeger, Martin; Adair, Desmond – European Journal of Engineering Education, 2017

Online quizzes have been shown to be effective learning and assessment approaches. However, if scenario-based online construction safety quizzes do not include time pressure similar to real-world situations, they reflect situations too ideally. The purpose of this paper is to compare engineering students' performance when carrying out an online…

Descriptors: Engineering Education, Quasiexperimental Design, Tests, Academic Achievement

Dividing the Force Concept Inventory into Two Equivalent Half-Length Tests

Peer reviewed

Han, Jing; Bao, Lei; Chen, Li; Cai, Tianfang; Pi, Yuan; Zhou, Shaona; Tu, Yan; Koenig, Kathleen – Physical Review Special Topics - Physics Education Research, 2015

The Force Concept Inventory (FCI) is a 30-question multiple-choice assessment that has been a building block for much of the physics education research done today. In practice, there are often concerns regarding the length of the test and possible test-retest effects. Since many studies in the literature use the mean score of the FCI as the…

Descriptors: Physics, Multiple Choice Tests, Science Instruction, Scores

The Contribution of Constructed Response Items to Large Scale Assessment: Measuring and Understanding Their Impact

Peer reviewed

Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012

This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…

Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics

Investigation of the Factors Affecting the Pre-Test Effect in National Curriculum Science Assessment Development in England

Peer reviewed

Pyle, Katie; Jones, Emily; Williams, Chris; Morrison, Jo – Educational Research, 2009

Background: All national curriculum tests in England are pre-tested as part of the development process. Differences in pupil performance between pre-test and live test are consistently found. This difference has been termed the pre-test effect. Understanding the pre-test effect is essential in the test development and selection processes and in…

Descriptors: Foreign Countries, Pretesting, Context Effect, National Curriculum

Addressing the Inclusion of English Language Learners in the Educational Accountability System: Lessons Learned from Peer Review

Christensen, Laurene L. – ProQuest LLC, 2010

This study investigated the inclusion of English language learners (ELLs) in state standards and assessments, as measured by comments made by peer reviewers in the federal evaluation of states' standards and assessments. As required by the Elementary and Secondary Education Act (ESEA), reauthorized in 2001 as No Child Left Behind (NCLB), states…

Descriptors: Elementary Secondary Education, Federal Legislation, Research Methodology, State Standards

Some Considerations in Maintaining Adaptive Test Item Pools.

Stocking, Martha L. – 1988

The construction of parallel editions of conventional tests for purposes of test security while maintaining score comparability has always been a recognized and difficult problem in psychometrics and test construction. The introduction of new modes of test construction, e.g., adaptive testing, changes the nature of the problem, but does not make…

Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Identification

Comparison of Test Administration Procedures for Placement Decisions in a Mathematics Course.

Peer reviewed

Straetmans, Gerard J. J. M.; Eggen, Theo J. H. M. – Educational Research and Evaluation (An International Journal on Theory and Practice), 1998

Three test administration procedures for making placement decisions in adult education were compared (paper-based, computer-based, and computerized-adaptive tests) with 90 adult-education students. Test performance was not differentially affected by the mode of administration, but the computerized adaptive test always yielded more precise ability…

Descriptors: Ability, Adaptive Testing, Adult Education, Adult Students

Gender and Racial/Ethnic Differences on Performance Assessments in Science.

Peer reviewed

Klein, Stephen P.; And Others – Educational Evaluation and Policy Analysis, 1997

Whether differences in mean scores among gender and racial/ethnic groups on science performance assessments are comparable to those for traditional tests was studied with 2,000 students in grades five, six, and nine. Overall, results suggest that the type of test has little effect on these differences in scores. (SLD)

Descriptors: Comparative Analysis, Cultural Differences, Ethnic Groups, Performance Based Assessment

The Introduction and Comparability of the Computer Adaptive GRE General Test. GRE Board Professional Report No. 88-08aP.

Schaeffer, Gary A.; And Others – 1995

This report summarizes the results from two studies. The first assessed the comparability of scores derived from linear computer-based (CBT) and computer adaptive (CAT) versions of the three Graduate Record Examinations (GRE) General Test measures. A verbal CAT was taken by 1,507, a quantitative CAT by 1,354, and an analytical CAT by 995…

Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Equated Scores

Comparing Student Performance on Different Item Formats Relative to Achievement Levels Cutpoints.

Bay, Luz – 1998

A study was conducted to investigate the difference in student performance on multiple choice (MC) and constructed response (CR) items relative to the achievement levels of the National Assessment of Educational Progress (NAEP). The study included an investigation of how estimates of student performance were affected by item response theory (IRT)…

Descriptors: Academic Achievement, Comparative Analysis, Constructed Response, Cutting Scores

French Immersion Studies, Year 3 (1985-86). Tests of (English) Reading Skills.

York Region Board of Education, Aurora (Ontario). – 1986

To determine whether students enrolled in one Ontario region's early French immersion (FI) programs developed English reading skills comparable to their non-FI peers, a monitoring process was begun in the first FI program year (grade 3) in which formal English instruction is given. The FI cohort and a control group matched for mental abilities and…

Descriptors: Comparative Analysis, Elementary Education, English, Foreign Countries
