Publication Date
  In 2025: 0
  Since 2024: 1
  Since 2021 (last 5 years): 10
  Since 2016 (last 10 years): 27
  Since 2006 (last 20 years): 37
Descriptor
  Difficulty Level: 53
  Scores: 53
  Test Reliability: 53
  Test Items: 42
  Test Validity: 20
  Foreign Countries: 18
  Test Construction: 16
  Multiple Choice Tests: 14
  Correlation: 12
  Item Analysis: 11
  Achievement Tests: 10
Author
  Gallas, Edwin J.: 2
  Metsämuuronen, Jari: 2
  Pollock, Steven J.: 2
  Ali, Syed Haris: 1
  Alpayar, Cagla: 1
  Asikainen, Mervi A.: 1
  Barak, Moshe: 1
  Bers, Marina Umaschi: 1
  Bob delMas: 1
  Brad Hartlaub: 1
  Bristow, M.: 1
Audience
  Researchers: 2
Location
  Turkey: 5
  Australia: 2
  Canada: 2
  Colorado: 2
  Israel: 2
  South Korea: 2
  United States: 2
  Asia: 1
  Brazil: 1
  California: 1
  China: 1
Laws, Policies, & Programs
  Elementary and Secondary…: 1
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the kinds of tests common in educational achievement testing. These tests are often structured with widely deviating item difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
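For context on the estimators named in this abstract, here is a minimal sketch of coefficient alpha computed from a persons-by-items score matrix. It shows the classical estimator under discussion, not the corrected estimators the article proposes, and the toy data are invented.

```python
# Minimal sketch: coefficient alpha from a persons-by-items score matrix.
# This is the classical estimator discussed above, not the corrected
# estimators proposed in the article.
import numpy as np

def cronbach_alpha(X):
    """X: 2-D array, rows = examinees, columns = items (binary or polytomous)."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]                         # number of items
    item_vars = X.var(axis=0, ddof=1)      # per-item variances
    total_var = X.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: 6 examinees, 4 binary items of widely varying difficulty.
scores = np.array([[1, 1, 1, 0],
                   [1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [1, 1, 1, 1],
                   [1, 0, 0, 0],
                   [0, 0, 0, 0]])
print(round(cronbach_alpha(scores), 3))
```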
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
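As a rough illustration of the kind of score-to-level correlation this abstract describes, the sketch below uses a Spearman rank correlation on invented values; the study's actual data, scoring, and choice of correlation coefficient are not given in this snippet.

```python
# Illustration only: rank correlation between EI test averages and TOPIK levels.
# All values below are hypothetical, not data from the study.
from scipy.stats import spearmanr

topik_level = [1, 1, 2, 2, 3, 4, 5, 6]          # hypothetical TOPIK levels
ei_score    = [42, 48, 55, 53, 61, 70, 78, 85]  # hypothetical EI test averages
rho, p = spearmanr(topik_level, ei_score)
print(f"Spearman rho = {rho:.3f}, p = {p:.4f}")
```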
Chakrabartty, Satyendra Nath – International Journal of Psychology and Educational Studies, 2021
The paper proposes new measures of the difficulty and discrimination values of binary items and of tests consisting of such items, and derives their relationships, including estimation of the test error variance and thereby the test reliability, with definitions based on cosine similarities. The measures use the entire data. The difficulty value of a test and of an item is defined…
Descriptors: Test Items, Difficulty Level, Scores, Test Reliability
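The paper defines its measures through cosine similarities; the sketch below only illustrates the general idea with a classical proportion-correct difficulty value and a cosine similarity between each item's response vector and the total-score vector. The paper's exact definitions are not reproduced here.

```python
# Rough illustration only: proportion-correct difficulty per item and a
# cosine similarity between each item vector and the total-score vector.
import numpy as np

X = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [1, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0]], dtype=float)   # rows = examinees, cols = items

difficulty = X.mean(axis=0)                  # proportion correct per item
total = X.sum(axis=1)                        # total score per examinee

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

item_total_cosine = [cosine(X[:, j], total) for j in range(X.shape[1])]
print(difficulty, [round(c, 3) for c in item_total_cosine])
```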
Thompson, Kathryn N. – ProQuest LLC, 2023
It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…
Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
The Pearson product-moment correlation coefficient between item g and test score X, known as the item-test or item-total correlation ("Rit"), and the item-rest correlation ("Rir") are two of the most widely used classical estimators of item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP, caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
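A minimal sketch of the two estimators named here, "Rit" and "Rir", computed from a small invented score matrix; the abstract's point is that both of these classical correlations tend to underestimate true item discrimination power.

```python
# Minimal sketch: item-test (Rit) and item-rest (Rir) correlations,
# the classical discrimination estimators discussed in the abstract.
import numpy as np

def rit_rir(X):
    X = np.asarray(X, dtype=float)
    total = X.sum(axis=1)
    out = []
    for j in range(X.shape[1]):
        item = X[:, j]
        rest = total - item                   # total score without item j
        rit = np.corrcoef(item, total)[0, 1]  # item-test correlation
        rir = np.corrcoef(item, rest)[0, 1]   # item-rest correlation
        out.append((rit, rir))
    return out

X = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0]])
for j, (rit, rir) in enumerate(rit_rir(X), start=1):
    print(f"item {j}: Rit = {rit:.3f}, Rir = {rir:.3f}")
```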
Rodriguez, Rebekah M.; Silvia, Paul J.; Kaufman, James C.; Reiter-Palmon, Roni; Puryear, Jeb S. – Creativity Research Journal, 2023
The original 90-item Creative Behavior Inventory (CBI) was a landmark self-report scale in creativity research, and the 28-item brief form developed nearly 20 years ago continues to be a popular measure of everyday creativity. Relatively little is known, however, about the psychometric properties of this widely used scale. In the current research,…
Descriptors: Creativity Tests, Creativity, Creative Thinking, Psychometrics
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Slepkov, A. D.; Van Bussel, M. L.; Fitze, K. M.; Burr, W. S. – SAGE Open, 2021
There is a broad literature in multiple-choice test development, both in terms of item-writing guidelines, and psychometric functionality as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed high-stakes standardized assessments, with little attention being paid to the…
Descriptors: Foreign Countries, Undergraduate Students, Student Evaluation, Multiple Choice Tests
Saepuzaman, Duden; Istiyono, Edi; Haryanto – Pegem Journal of Education and Instruction, 2022
HOTS (higher-order thinking skills) are among the skills that need to be developed in the 21st century. This study aims to determine the characteristics of the Fundamental Physics Higher-order Thinking Skill (FundPhysHOTS) test for prospective physics teachers using Item Response Theory (IRT) analysis. The study uses a quantitative approach. 254 prospective physics…
Descriptors: Thinking Skills, Physics, Science Process Skills, Cognitive Tests
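The snippet does not say which IRT model or software the study applied, so the following is only a generic sketch of a two-parameter logistic (2PL) item response function, the kind of model commonly used in such analyses.

```python
# Illustrative 2PL item response function; the model and parameter values
# here are generic assumptions, not results from the study.
import numpy as np

def p_correct_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model.
    theta: examinee ability, a: item discrimination, b: item difficulty."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

abilities = np.linspace(-3, 3, 7)
print([round(p_correct_2pl(t, a=1.2, b=0.5), 3) for t in abilities])
```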
Mohammed Ambusaidi – ProQuest LLC, 2022
There is an increased demand on nursing faculty to provide quality teaching and assessment. Nursing faculty are required to ensure accurate assessment of learning through testing and outcome measurement that are critical elements of the evaluation process. Likewise, nursing faculty should implement a logical evaluation system. However, the…
Descriptors: Nursing Education, College Faculty, Test Construction, Test Validity
Guven Demir, Elif; Öksuz, Yücel – Participatory Educational Research, 2022
This research aimed to investigate animation-based achievement tests according to the item format, psychometric features, students' performance, and gender. The study sample consisted of 52 fifth-grade students in Samsun/Turkey in 2017-2018. Measures of the research were open-ended (OE), animation-based open-ended (AOE), multiple-choice (MC), and…
Descriptors: Animation, Achievement Tests, Test Items, Psychometrics
Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick – ETS Research Report Series, 2018
For a multiple-choice test under development or redesign, it is important to choose the optimal number of options per item so that the test possesses the desired psychometric properties. On the basis of available data for a multiple-choice assessment with 8 options, we evaluated the effects of changing the number of options on test properties…
Descriptors: Multiple Choice Tests, Test Items, Simulation, Test Construction
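One simple way to see why the number of options matters is the blind-guessing success rate of 1/k for a k-option item; the sketch below illustrates only that baseline effect and is not the simulation design used in the report.

```python
# Baseline effect of the option count: the blind-guessing success rate 1/k.
# Illustration only; the report's simulations examine much more than this.
for k in (2, 3, 4, 5, 8):
    print(f"{k} options: chance-level probability of a correct guess = {1 / k:.3f}")
```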
Wang, Xiaolin; Svetina, Dubravka; Dai, Shenghai – Journal of Experimental Education, 2019
Recently, interest in test subscore reporting for diagnosis purposes has been growing rapidly. The two simulation studies here examined factors (sample size, number of subscales, correlation between subscales, and three factors affecting subscore reliability: number of items per subscale, item parameter distribution, and data generating model)…
Descriptors: Value Added Models, Scores, Sample Size, Correlation
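One of the listed factors, the number of items per subscale, affects subscore reliability in a way the Spearman-Brown prophecy formula captures; the sketch below uses an assumed base reliability and is not the simulation model from the study.

```python
# Spearman-Brown prophecy formula: projected reliability when a subscale is
# lengthened by a factor m. A sketch of why "items per subscale" matters;
# the base reliability is an assumed value.
def spearman_brown(reliability, m):
    """Projected reliability of a test m times as long."""
    return (m * reliability) / (1 + (m - 1) * reliability)

base = 0.60                      # assumed reliability of a 5-item subscale
for n_items in (5, 10, 20, 40):
    m = n_items / 5
    print(f"{n_items:>2} items: projected reliability = {spearman_brown(base, m):.3f}")
```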
Yuksel, Ibrahim; Savas, Muhammed Ali – Asian Journal of Education and Training, 2019
This research aims to develop a valid and reliable test to determine prospective teachers' levels of drawing a shape-schema and making a table in the Mathematics and Science Education, Turkish and Social Sciences Education, and Basic Education Departments. In this process, a comprehensive item pool was prepared with the table of…
Descriptors: Preservice Teachers, Item Banks, Test Validity, Foreign Countries
Yao, Yuan – ProQuest LLC, 2019
Under the framework of item response theory (IRT) and generalizability (G-) theory, this study examined the effects of item difficulty on rating reliability and construct validity for both constructed-response (CR) items and essay items on English examinations. The data collected for this study were students' scores and responses on the two…
Descriptors: Foreign Countries, College Students, Second Language Learning, English (Second Language)
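For the G-theory side of the framework mentioned here, a generalizability coefficient for a simple persons-by-items design can be sketched as follows; the variance components are assumed values, not estimates from this dissertation.

```python
# Sketch of a G-theory generalizability coefficient for a persons-by-items
# design. The variance components below are assumed, not estimated from data.
def g_coefficient(var_person, var_residual, n_items):
    """Relative G (generalizability) coefficient for a p x i design."""
    return var_person / (var_person + var_residual / n_items)

for n_items in (5, 10, 20):
    print(f"{n_items:>2} items: G = {g_coefficient(0.5, 1.0, n_items):.3f}")
```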