ERIC - Search Results

Showing all 13 results

The Trade-Off between Model Fit, Invariance, and Validity: The Case of PISA Science Assessments

Peer reviewed

El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020

In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…

Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests

Of Small Beauties and Large Beasts: The Quality of Distractors on Multiple-Choice Tests Is More Important than Their Quantity

Peer reviewed

Papenberg, Martin; Musch, Jochen – Applied Measurement in Education, 2017

In multiple-choice tests, the quality of distractors may be more important than their number. We therefore examined the joint influence of distractor quality and quantity on test functioning by providing a sample of 5,793 participants with five parallel test sets consisting of items that differed in the number and quality of distractors.…

Descriptors: Multiple Choice Tests, Test Items, Test Validity, Test Reliability

Validating Human and Automated Scoring of Essays against "True" Scores

Peer reviewed

Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018

In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…

Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing

Installing a System of Performance Standards for National Assessments in the Republic of Trinidad and Tobago: Issues and Challenges

Peer reviewed

De Lisle, Jerome – Applied Measurement in Education, 2015

This article explores the challenge of setting performance standards in a non-Western context. The study is centered on standard-setting practice in the national learning assessments of Trinidad and Tobago. Quantitative and qualitative data from annual evaluations between 2005 and 2009 were compiled, analyzed, and deconstructed. In the mixed…

Descriptors: Foreign Countries, National Standards, Educational Assessment, Standard Setting

Negative Keying Effects in the Factor Structure of TIMSS 2011 Motivation Scales and Associations with Reading Achievement

Peer reviewed

Michaelides, Michalis P. – Applied Measurement in Education, 2019

The Student Background survey administered along with achievement tests in studies of the International Association for the Evaluation of Educational Achievement includes scales of student motivation, competence, and attitudes toward mathematics and science. The scales consist of positively and negatively keyed items. The current research…

Descriptors: International Assessment, Achievement Tests, Mathematics Achievement, Mathematics Tests

Applying a Thurstonian, Two-Stage Method in the Standardized Assessment of Writing

Peer reviewed

McGrane, Joshua Aaron; Humphry, Stephen Mark; Heldsinger, Sandra – Applied Measurement in Education, 2018

National standardized assessment programs have increasingly included extended written performances, amplifying the need for reliable, valid, and efficient methods of assessment. This article examines a two-stage method using comparative judgments and calibrated exemplars as a complement and alternative to existing methods of assessing writing.…

Descriptors: Standardized Tests, Foreign Countries, Writing Tests, Writing Evaluation

The Effect of Small Group Discussion on Cutoff Scores during Standard Setting

Peer reviewed

Deunk, Marjolein I.; van Kuijk, Mechteld F.; Bosker, Roel J. – Applied Measurement in Education, 2014

Standard setting methods, like the Bookmark procedure, are used to assist education experts in formulating performance standards. Small group discussion is meant to help these experts in setting more reliable and valid cutoff scores. This study is an analysis of 15 small group discussions during two standards setting trajectories and their effect…

Descriptors: Cutting Scores, Standard Setting, Group Discussion, Reading Tests

A Cross-National Comparison of Reported Effort and Mathematics Performance in TIMSS Advanced

Peer reviewed

Eklöf, Hanna; Pavešic, Barbara Japelj; Grønmo, Liv Sissel – Applied Measurement in Education, 2014

The purpose of the study was to measure students' reported test-taking effort and the relationship between reported effort and performance on the Trends in International Mathematics and Science Study (TIMSS) Advanced mathematics test. This was done in three countries participating in TIMSS Advanced 2008 (Sweden, Norway, and Slovenia), and the…

Descriptors: Mathematics Tests, Cross Cultural Studies, Foreign Countries, Correlation

An Experimental Test of Student Verbal Reports and Teacher Evaluations as a Source of Validity Evidence for Test Development

Peer reviewed

Leighton, Jacqueline P.; Heffernan, Colleen; Cor, M. Kenneth; Gokiert, Rebecca J.; Cui, Ying – Applied Measurement in Education, 2011

The "Standards for Educational and Psychological Testing" indicate that test instructions, and by extension item objectives, presented to examinees should be sufficiently clear and detailed to help ensure that they respond as developers intend them to respond (Standard 3.20; AERA, APA, & NCME, 1999). The present study investigates…

Descriptors: Test Construction, Validity, Evidence, Science Tests

Validity of the Simultaneous Approach to the Development of Equivalent Achievement Tests in English and French

Peer reviewed

Rogers, W. Todd; Lin, Jie; Rinaldi, Christia M. – Applied Measurement in Education, 2011

The evidence gathered in the present study supports the use of the simultaneous development of test items for different languages. The simultaneous approach used in the present study involved writing an item in one language (e.g., French) and, before moving to the development of a second item, translating the item into the second language (e.g.,…

Descriptors: Test Items, Item Analysis, Achievement Tests, French

Measuring Self-Efficacy: Multitrait-Multimethod Comparison of Scaling Procedures.

Peer reviewed

Bong, Mimi; Hocevar, Dennis – Applied Measurement in Education, 2002

Examined convergent and discriminant validity of various self-efficacy measures across two studies, one involving 358 U.S. high school students and another involving 235 Korean female high school students. Across the studies the first-order confirmatory factor analyses provide support for both convergent validity of different self-efficacy…

Descriptors: Comparative Analysis, Foreign Countries, High School Students, High Schools

Investigating Gender Differences in Adolescent Self-Concept: A Look beneath the Surface.

Peer reviewed

Byrne, Barbara M. – Applied Measurement in Education, 1990

The study examined the extent to which the convergent and discriminant validity of 4 self-concept traits, and the method effects associated with 3 measurement scales, were equivalent across gender. Data from 832 students (412 boys and 420 girls) in grades 11 and 12 in suburban Canada were analyzed. (SLD)

Descriptors: Adolescents, Foreign Countries, Grade 11, Grade 12

Perceived Competence Scale for Children: Testing for Factorial Validity and Invariance across Age and Ability.

Peer reviewed

Byrne, Barbara M.; Schneider, Barry H. – Applied Measurement in Education, 1988

For 241 normal and 132 gifted fifth graders and 113 normal and 117 gifted seventh graders in Ottawa (Ontario), exploratory and confirmatory factor analyses investigated the factorial validity of the Perceived Competence Scale for Children (PCSC). Overall, the PCSC demonstrated sound psychometric properties. (SLD)

Descriptors: Ability, Academically Gifted, Age Differences, Competence
