ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	16
Since 2016 (last 10 years)	39
Since 2006 (last 20 years)	94

Descriptor

Test Reliability	252
Test Theory	252
Test Validity	109
Test Construction	64
Test Items	51
Item Response Theory	44
Scores	44
Error of Measurement	43
Psychometrics	41
Item Analysis	38
Statistical Analysis	38
Foreign Countries	32
Test Interpretation	32
Correlation	29
Mathematical Models	29
Evaluation Methods	27
Criterion Referenced Tests	26
Career Development	25
Measurement Techniques	25
Testing Problems	25
Comparative Analysis	23
Difficulty Level	23
Higher Education	23
Testing	22
Factor Analysis	21

Education Level

Higher Education	27
Postsecondary Education	21
Elementary Education	8
Secondary Education	8
Early Childhood Education	6
Middle Schools	6
Elementary Secondary Education	5
High Schools	5
Adult Education	4
Grade 8	4
Junior High Schools	4
Primary Education	3
Grade 1	2
Grade 2	2
Grade 4	2
Grade 5	2
Grade 6	2
Grade 7	2
Intermediate Grades	2
Preschool Education	2
Grade 12	1
Grade 3	1
Grade 9	1
Kindergarten	1

Audience

Practitioners	10
Researchers	7
Teachers	7
Administrators	2
Students	2

Location

United Kingdom (England)	5
Canada	4
Spain	3
Australia	2
Singapore	2
Texas	2
Turkey	2
Turkey (Ankara)	2
United Kingdom (Great Britain)	2
United States	2
Chile	1
Colorado	1
Egypt	1
Finland (Helsinki)	1
France	1
Germany	1
Indiana	1
Indonesia	1
Italy	1
Japan	1
Jordan	1
Malaysia	1
New York	1
New York (New York)	1
Norway	1

Laws, Policies, & Programs

Elementary and Secondary…

What Works Clearinghouse Rating

Showing 1 to 15 of 252 results

Electronic Assessment Anxiety Scale: Development, Validity and Reliability

Peer reviewed
PDF on ERIC


Osman Tat; Abdullah Faruk Kilic – Turkish Online Journal of Distance Education, 2024

The widespread availability of internet access in daily life has resulted in a greater acceptance of online assessment methods. E-assessment platforms offer various features such as randomizing questions and answers, utilizing extensive question banks, setting time limits, and managing access during online exams. Electronic assessment enables…

Descriptors: Test Construction, Test Validity, Test Reliability, Anxiety

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed
PDF on ERIC


Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

Assessment of Item and Test Parameters: Cosine Similarity Approach

Peer reviewed
PDF on ERIC


Chakrabartty, Satyendra Nath – International Journal of Psychology and Educational Studies, 2021

The paper proposes new measures of the difficulty and discriminating values of binary items and of tests consisting of such items, and finds their relationships, including estimation of test error variance and thereby the test reliability, as per definition, using cosine similarities. The measures use the entire data. Difficulty value of test and item is defined…

Descriptors: Test Items, Difficulty Level, Scores, Test Reliability

Accuracy and Sensitivity of Coefficient Alpha and Its Alternatives with Unidimensional and Contaminated Scales

Peer reviewed


Xiao, Leifeng; Hau, Kit-Tai – Applied Measurement in Education, 2023

We compared coefficient alpha with five alternatives (omega total, omega RT, omega h, GLB, and coefficient H) in two simulation studies. Results showed for unidimensional scales, (a) all indices except omega h performed similarly well for most conditions; (b) alpha is still good; (c) GLB and coefficient H overestimated reliability with small…

Descriptors: Test Theory, Test Reliability, Factor Analysis, Test Length

Evidence for Validity and Reliability of a Research-Based Assessment Instrument on Measurement Uncertainty

Peer reviewed


Gayle Geschwind; Michael Vignal; Marcos D. Caballero; H. J. Lewandowski – Physical Review Physics Education Research, 2024

The Survey of Physics Reasoning on Uncertainty Concepts in Experiments (SPRUCE) was designed to measure students' proficiency with measurement uncertainty concepts and practices across ten different assessment objectives to help facilitate the improvement of laboratory instruction focused on this important topic. To ensure the reliability and…

Descriptors: Measurement, Ambiguity (Context), Scientific Concepts, Physics

Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items

Peer reviewed


Walker, Cindy M.; Göçer Sahin, Sakine – Educational and Psychological Measurement, 2020

The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared…

Descriptors: Test Bias, Interrater Reliability, Responses, Correlation

The Riddle Knowledge Inference Test (R-Kit)

Peer reviewed


Nicolas Rochat; Laurent Lima; Pascal Bressoux – Journal of Psychoeducational Assessment, 2025

Inference is considered an important factor in comprehension models and has been described as a causal factor in predicting comprehension. To date, specific tests for inference are rare and often rely on specific thematic texts. This reliance on thematic inference may raise some concerns as inference is related to prior text-specific knowledge.…

Descriptors: Inferences, Reading Comprehension, Reading Tests, Test Reliability

Programme Evaluation in Action: Theory to Practice from an Asian Educational Context

Peer reviewed


Ser Ming Mark Lee; Wei Cheng Liu – Asia Pacific Journal of Education, 2024

Programme evaluation has developed tremendously over the past 50 years, with a proliferation of evaluation research, an increase in the institutionalization of evaluation, and growth in the professionalization of evaluation. However, existing research and developments are still largely in North America, Europe, Australia, and New Zealand, with…

Descriptors: Foreign Countries, Evaluation Research, Evaluation Methods, Evaluation Criteria

On Misconceptions and the Limited Usefulness of Ordinal Alpha

Peer reviewed


Chalmers, R. Philip – Educational and Psychological Measurement, 2018

This article discusses the theoretical and practical contributions of Zumbo, Gadermann, and Zeisser's family of ordinal reliability statistics. Implications, interpretation, recommendations, and practical applications regarding their ordinal measures, particularly ordinal alpha, are discussed. General misconceptions relating to this family of…

Descriptors: Misconceptions, Test Theory, Test Reliability, Statistics

A Simple Model to Determine the Efficient Duration of Exams

Peer reviewed


Ellis, Jules L. – Educational and Psychological Measurement, 2021

This study develops a theoretical model for the costs of an exam as a function of its duration. Two kinds of costs are distinguished: (1) the costs of measurement errors and (2) the costs of the measurement. Both costs are expressed in time of the student. Based on a classical test theory model, enriched with assumptions on the context, the costs…

Descriptors: Test Length, Models, Error of Measurement, Measurement

The PSI-20: Development of a Viable Short Form Alternative of the Problem Solving Inventory Using Item Response Theory

Peer reviewed


Tyrone B. Pretorius; P. Paul Heppner; Anita Padmanabhanunni; Serena Ann Isaacs – SAGE Open, 2023

In previous studies, problem solving appraisal has been identified as playing a key role in promoting positive psychological well-being. The Problem Solving Inventory is the most widely used measure of problem solving appraisal and consists of 32 items. The length of the instrument, however, may limit its applicability to large-scale surveys…

Descriptors: Problem Solving, Measures (Individuals), Test Construction, Item Response Theory

The Effect of Chance Success on Equalization Error in Test Equation Based on Classical Test Theory

Peer reviewed
PDF on ERIC


Koçak, Duygu – International Journal of Progressive Education, 2020

The aim of this study was to determine the effect of chance success on test equalization. For this purpose, artificially generated data sets with sample sizes of 500 and 1000 were synchronized using linear equalization and equal percentage equalization methods. In the simulated data, a total of four cases were created with no…

Descriptors: Test Theory, Equated Scores, Error of Measurement, Sample Size

A Closed-Form Alternative for Estimating ω Reliability under Unidimensionality

Peer reviewed


Hancock, Gregory R.; An, Ji – Measurement: Interdisciplinary Research and Perspectives, 2020

As an alternative to Cronbach's α for estimating scale reliability, McDonald's ω has attracted increased attention within the methodological community for its less stringent measurement assumptions. Notwithstanding, ω is still seldom used by practitioners, likely due to its unavailability in popular software packages (e.g., SPSS)…

Descriptors: Evaluation, Alternative Assessment, Reliability, Test Reliability

Conditional Standard Error of Measurement: Classical Test Theory, Generalizability Theory and Many-Facet Rasch Measurement with Applications to Writing Assessment

Peer reviewed
PDF on ERIC


Huebner, Alan; Skar, Gustaf B. – Practical Assessment, Research & Evaluation, 2021

Writing assessments often consist of students responding to multiple prompts, which are judged by more than one rater. To establish the reliability of these assessments, there exist different methods to disentangle variation due to prompts and raters, including classical test theory, Many Facet Rasch Measurement (MFRM), and Generalizability Theory…

Descriptors: Error of Measurement, Test Theory, Generalizability Theory, Item Response Theory

Establishing a Physics Concept Inventory Using Computer Marked Free-Response Questions

Peer reviewed
PDF on ERIC


Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023

The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…

Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability



Source

Educational and Psychological…	20
Psychometrika	10
Applied Psychological…	6
Educational Measurement:…	6
Journal of Educational…	6
ProQuest LLC	6
Journal of Experimental…	5
Journal of Educational…	4
Alberta Journal of…	3
Applied Measurement in…	3
Journal of School Psychology	3
Research Papers in Education	3
Assessment & Evaluation in…	2
Assessment for Effective…	2
Educational Research and…	2
Grantee Submission	2
International Journal of…	2
Journal of Educational and…	2
Journal on Educational…	2
Language Testing	2
Online Submission	2
Physical Review Physics…	2
Reading Research and…	2
SAGE Open	2
Advances in Health Sciences…	1

Author

Zimmerman, Donald W.	9
Haladyna, Tom	4
Huynh, Huynh	4
Williams, Richard H.	4
Mislevy, Robert J.	3
Wilcox, Rand R.	3
Bormuth, John R.	2
Brennan, Robert L.	2
Cliff, Norman	2
Crocker, Linda	2
Crowley, Susan L.	2
Feldt, Leonard S.	2
Gonzalez-Tamayo, Eulogio	2
He, Qingping	2
Lane, Kathleen Lynne	2
Mehrens, William A.	2
Oakes, Wendy Peia	2
Petscher, Yaacov	2
Prather, Edward E.	2
Roid, Gale	2
Salmani-Nodoushan, Mohammad…	2
Schulman, Robert S.	2
Truckenmiller, Adrea	2
Wainer, Howard	2

Publication Type

Journal Articles	153
Reports - Research	148
Speeches/Meeting Papers	36
Reports - Evaluative	34
Reports - Descriptive	26
Opinion Papers	15
Information Analyses	13
Dissertations/Theses -…	6
Books	4
Numerical/Quantitative Data	4
Tests/Questionnaires	4
Collected Works - Serials	3
Guides - Non-Classroom	3
Collected Works - General	2
Guides - Classroom - Learner	2
Reference Materials -…	2
Book/Product Reviews	1
Collected Works - Proceedings	1
Dissertations/Theses -…	1
Guides - Classroom - Teacher	1
Reference Materials - General	1

Assessments and Surveys

Childrens Depression Inventory	2
SAT (College Admission Test)	2
Strengths and Difficulties…	2
Test of English as a Foreign…	2
ACT Assessment	1
Armed Services Vocational…	1
California Achievement Tests	1
California Critical Thinking…	1
Center for Epidemiologic…	1
Defining Issues Test	1
Dyadic Adjustment Scale	1
Expressive One Word Picture…	1
Graduate Record Examinations	1
Kaufman Assessment Battery…	1
Learning and Study Strategies…	1
My Class Inventory	1
National Assessment of…	1
Nelson Denny Reading Tests	1
New Jersey College Basic…	1
Preliminary Scholastic…	1
Rosenberg Self Esteem Scale	1
Thematic Apperception Test	1
Watson Glaser Critical…	1
Woodcock Johnson Tests of…	1