ERIC - Search Results

Publication Date

In 2025	7
Since 2024	15
Since 2021 (last 5 years)	53
Since 2016 (last 10 years)	107
Since 2006 (last 20 years)	156

Descriptor

Difficulty Level	239
Test Items	239
Test Validity	239
Test Reliability	130
Test Construction	109
Foreign Countries	87
Item Analysis	74
Multiple Choice Tests	55
Item Response Theory	54
Psychometrics	38
Scores	31
Elementary School Students	27
Statistical Analysis	27
Achievement Tests	26
Science Tests	24
Higher Education	23
Mathematics Tests	22
Comparative Analysis	21
High School Students	21
Language Tests	21
Reading Tests	21
Test Format	21
Computer Assisted Testing	20
Correlation	18
English (Second Language)	18

Publication Type

Reports - Research	190
Journal Articles	157
Speeches/Meeting Papers	35
Reports - Evaluative	23
Tests/Questionnaires	18
Reports - Descriptive	12
Dissertations/Theses -…	7
Numerical/Quantitative Data	4
Information Analyses	3
Guides - Non-Classroom	2
Opinion Papers	2
Collected Works - Serials	1
Guides - General	1
Non-Print Media	1
Reference Materials - General	1

Education Level

Higher Education	46
Secondary Education	45
Postsecondary Education	41
Elementary Education	39
Middle Schools	22
High Schools	21
Grade 8	13
Junior High Schools	13
Intermediate Grades	9
Grade 4	8
Early Childhood Education	7
Elementary Secondary Education	7
Primary Education	7
Grade 1	6
Grade 3	6
Grade 5	6
Grade 6	6
Grade 7	6
Kindergarten	5
Grade 12	3
Grade 2	3
Grade 9	3
Grade 10	2
Grade 11	1
Preschool Education	1

Audience

Researchers	8
Teachers	2
Practitioners	1

Location

Turkey	12
Indonesia	11
Germany	5
Nigeria	5
Japan	4
Australia	3
California	3
Jordan	3
United Kingdom	3
Alabama	2
Canada	2
China	2
Colorado	2
Florida	2
Georgia	2
Idaho	2
Illinois	2
Iran	2
Nevada	2
New York	2
South Africa	2
South Korea	2
Tennessee	2
Thailand	2
Turkey (Istanbul)	2

Showing 1 to 15 of 239 results

A Chi-Square Statistic for Testing the Equality of Distracters' Plausibility in Multiple-Choice Test Items

Sherwin E. Balbuena – Online Submission, 2024

This study introduces a new chi-square test statistic for testing the equality of response frequencies among distracters in multiple-choice tests. The formula uses the information from the number of correct answers and wrong answers, which becomes the basis of calculating the expected values of response frequencies per distracter. The method was…

Descriptors: Multiple Choice Tests, Statistics, Test Validity, Testing
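
The abstract above is truncated and does not give the author's formula, so the following is only a minimal sketch of the general idea it describes: a standard Pearson goodness-of-fit chi-square that checks whether wrong answers are spread evenly across an item's distracters. The function name `distracter_chi_square` and the sample counts are hypothetical, and the expected frequency (wrong answers divided by the number of distracters) is an assumption, not necessarily the statistic proposed in the study.

```python
def distracter_chi_square(distracter_counts):
    """Pearson goodness-of-fit statistic for equally plausible distracters.

    distracter_counts: number of examinees who chose each wrong option.
    Returns (chi-square statistic, degrees of freedom).
    Assumption: under equal plausibility, each distracter is expected to
    attract the same share of the wrong answers.
    """
    k = len(distracter_counts)
    if k < 2:
        raise ValueError("need at least two distracters")
    total_wrong = sum(distracter_counts)
    if total_wrong == 0:
        raise ValueError("no wrong answers to analyse")

    # Expected frequency per distracter if all are equally attractive.
    expected = total_wrong / k

    # Sum of (observed - expected)^2 / expected over the distracters.
    statistic = sum((obs - expected) ** 2 / expected for obs in distracter_counts)
    return statistic, k - 1


# Hypothetical item: 60 wrong answers spread over three distracters.
stat, df = distracter_chi_square([30, 20, 10])
print(f"chi-square = {stat:.2f}, df = {df}")
```

A large statistic relative to the chi-square distribution with `df` degrees of freedom would suggest that some distracters are noticeably more or less attractive than others.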

Evaluating Methodological Enhancements to the Yes/No Angoff Standard-Setting Method in Language Proficiency Assessment

Peer reviewed

Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024

This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…

Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods

Validation of an Elicited Imitation Test as a Measure of Korean Language Proficiency

Peer reviewed

Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024

This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…

Descriptors: Korean, Test Validity, Test Reliability, Imitation

Empirically Deriving Cut Scores in the Positive Behavioral Interventions and Supports (PBIS) Tiered Fidelity Inventory (TFI) through a Bookmarking Process

Peer reviewed

Jerin Kim; Kent McIntosh – Journal of Positive Behavior Interventions, 2025

We aimed to identify empirically valid cut scores on the positive behavioral interventions and supports (PBIS) Tiered Fidelity Inventory (TFI) through an expert panel process known as bookmarking. The TFI is a measurement tool to evaluate the fidelity of implementation of PBIS. In the bookmark method, experts reviewed all TFI items and item scores…

Descriptors: Positive Behavior Supports, Cutting Scores, Fidelity, Program Evaluation

Improvised Progressive Model Based on Automatic Calibration of Difficulty Level: A Practical Solution of Competitive-Based Examination

Peer reviewed

Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024

Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…

Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction

On the Relationship between Item Stem Formulation and Criterion Validity of Multiple-Component Measuring Instruments

Peer reviewed

Menold, Natalja; Raykov, Tenko – Educational and Psychological Measurement, 2022

The possible dependency of criterion validity on item formulation in a multicomponent measuring instrument is examined. The discussion is concerned with evaluation of the differences in criterion validity between two or more groups (populations/subpopulations) that have been administered instruments with items having differently formulated item…

Descriptors: Test Items, Measures (Individuals), Test Validity, Difficulty Level

Argument-Based Validation of Chulalongkorn University Language Institute (CULI) Test: A Rasch-Based Evidence Investigation

Peer reviewed

Apichat Khamboonruang – Language Testing in Asia, 2025

Chulalongkorn University Language Institute (CULI) test was developed as a local standardised test of English for professional and international communication. To ensure that the CULI test fulfils its intended purposes, this study employed Kane's argument-based validation and Rasch measurement approaches to construct the validity argument for the…

Descriptors: Universities, Second Language Learning, Second Language Instruction, Language Tests

Assessing Lower-Secondary School Students' Critical Thinking Skills in Photosynthesis: A Rasch Model Approach

Peer reviewed

Suwita Suwita; Sulistyo Saputro; Sajidan Sajidan; Sutarno Sutarno – Journal of Baltic Science Education, 2024

The current study uses the Rasch Model to measure lower-secondary school students' critical thinking skills on photosynthesis topics. Critical thinking skills are considered essential in science education, but few valid and practical measurement instruments remain. The current study fills the gap by adapting the instrument from the Watson-Glaser…

Descriptors: Secondary School Students, Critical Thinking, Thinking Skills, Botany

Developing and Validating a Biological System Thinking Test for Middle School Students

Peer reviewed

Ruying Li; Gaofeng Li – International Journal of Science and Mathematics Education, 2025

Systems thinking (ST) is an essential competence for future life and biology learning. Appropriate assessment is critical for collecting sufficient information to develop ST in biology education. This research offers an ST framework based on a comprehensive understanding of biological systems, encompassing four skills across three complexity…

Descriptors: Test Construction, Test Validity, Science Tests, Cognitive Tests

Validity and Reliability Analysis of a Socioscientific Issues-Based Critical Thinking Self-Assessment Instrument Using the Rasch Model

Peer reviewed

Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025

Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of the assessment tool are still questionable, so a more objective evaluation is needed. The objective of this investigation is to assess the self-assessment tools in evaluating students' critical thinking…

Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity

Classroom Assessment That Tailor Instruction and Direct Learning: A Validation Study

Peer reviewed

Wai Kei Chan; Li Zhang; Emily Pey-Tee Oon – International Journal of Assessment Tools in Education, 2023

We report the validity of a test instrument that assesses the arithmetic ability of primary students by (a) describing the theoretical model of arithmetic ability assessment using Wilson's (2004) four building blocks of constructing measures and (b) providing empirical evidence for the validation study. The instrument consists of 21…

Descriptors: Foreign Countries, Elementary School Students, Arithmetic, Grade 3

Design, Development, and Evaluation of the Organic Chemistry Representational Competence Assessment (ORCA)

Peer reviewed

Lyniesha Ward; Fridah Rotich; Jeffrey R. Raker; Regis Komperda; Sachin Nedungadi; Maia Popova – Chemistry Education Research and Practice, 2025

This paper describes the design and evaluation of the Organic chemistry Representational Competence Assessment (ORCA). Grounded in Kozma and Russell's representational competence framework, the ORCA measures the learner's ability to "interpret," "translate," and "use" six commonly used representations of molecular…

Descriptors: Organic Chemistry, Science Tests, Test Construction, Student Evaluation

The Knowledge of Autism Questionnaire-UK: Development and Initial Psychometric Evaluation

Peer reviewed

Sophie Langhorne; Nora Uglik-Marucha; Charlotte Broadhurst; Elena Lieven; Amelia Pearson; Silia Vitoratou; Kathy Leadbitter – Journal of Autism and Developmental Disorders, 2025

Tools to measure autism knowledge are needed to assess levels of understanding within particular groups of people and to evaluate whether awareness-raising campaigns or interventions lead to improvements in understanding. Several such measures are in circulation, but, to our knowledge, there are no psychometrically-validated questionnaires that…

Descriptors: Foreign Countries, Autism Spectrum Disorders, Questionnaires, Psychometrics

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

Preliminary Findings to Support the Internal Consistency and Factor Structure of the Ferrari-Lynch-Vogel Listening Test (FLVLT)

Peer reviewed

Ferrari-Bridgers, Franca – International Journal of Listening, 2023

While many tools exist to assess student content knowledge, there are few that assess whether students display the critical listening skills necessary to interpret the quality of a speaker's message at the college level. The following research provides preliminary evidence for the internal consistency and factor structure of a tool, the…

Descriptors: Factor Structure, Test Validity, Community College Students, Test Reliability

Source

Online Submission	10
Educational and Psychological…	6
ProQuest LLC	6
CBE - Life Sciences Education	4
ETS Research Report Series	4
Grantee Submission	4
International Journal of…	4
Journal of Experimental…	4
Language Assessment Quarterly	4
Language Testing	4
Behavioral Research and…	3
Educational Assessment	3
International Journal of…	3
Journal of Chemical Education	3
Journal of Education and…	3
Journal of Educational…	3
Language Testing in Asia	3
Practical Assessment,…	3
Applied Measurement in…	2
Chemistry Education Research…	2
Educational Measurement:…	2
International Journal of…	2
International Journal of…	2
Journal of Applied Testing…	2
Journal of Research in…	2

Author

Roid, Gale	4
Bejar, Isaac I.	3
Liu, Kimy	3
Paek, Insu	3
Schoen, Robert C.	3
Tindal, Gerald	3
Weiten, Wayne	3
Yang, Xiaotong	3
Alexander, Patricia A.	2
Baghaei, Purya	2
Crisp, Victoria	2
Deane, Paul	2
Haladyna, Tom	2
Ketterlin-Geller, Leanne R.	2
Kolstad, Rosemarie K.	2
Liu, Sicong	2
Mike Stieff	2
Mitchell, Virginia P.	2
Pollock, Steven J.	2
Retnawati, Heri	2
Robertson, David W.	2
Smith, Richard M.	2
Stephanie M. Werner	2
Ward, Phillip	2

Assessments and Surveys

Stanford Achievement Tests	4
National Assessment of…	3
SAT (College Admission Test)	3
Test of English as a Foreign…	3
Trends in International…	3
Comprehensive Tests of Basic…	2
Flesch Kincaid Grade Level…	2
Graduate Record Examinations	2
Hidden Figures Test	2
Raven Progressive Matrices	2
Test of English for…	2
Adult Attachment Interview	1
Advanced Placement…	1
Alabama High School…	1
Armed Services Vocational…	1
Bayley Scales of Infant…	1
California Achievement Tests	1
Child Behavior Checklist	1
Childrens Manifest Anxiety…	1
Defining Issues Test	1
Dynamic Indicators of Basic…	1
Flesch Reading Ease Formula	1
Iowa Tests of Basic Skills	1
Matching Familiar Figures Test	1
Metropolitan Achievement Tests	1