ERIC - Search Results

Publication Date

In 2026	0
Since 2025	0
Since 2022 (last 5 years)	1
Since 2017 (last 10 years)	5
Since 2007 (last 20 years)	10

Descriptor

Reliability	45
Testing Problems	45
Evaluation Methods	12
Validity	12
Testing	7
Comparative Analysis	6
Higher Education	6
Research Methodology	6
Scores	6
Educational Testing	5
Error of Measurement	5
Foreign Countries	5
Measurement Techniques	5
Research Problems	5
Second Language Learning	5
Statistical Analysis	5
Student Evaluation	5
Test Construction	5
Test Interpretation	5
College Students	4
Data Analysis	4
Elementary Secondary Education	4
Test Bias	4
Test Reliability	4
Test Validity	4

Publication Type

Journal Articles	23
Reports - Research	19
Speeches/Meeting Papers	7
Reports - Evaluative	5
Reports - Descriptive	4
Opinion Papers	2
Books	1
Dissertations/Theses -…	1
ERIC Digests in Full Text	1
ERIC Publications	1
Information Analyses	1
Legal/Legislative/Regulatory…	1
Reference Materials -…	1
Reports - General	1
Tests/Questionnaires	1

Education Level

Higher Education	4
Postsecondary Education	4
Elementary Secondary Education	2
Elementary Education	1

Audience

Researchers

Location

Canada	1
Indiana	1
Kentucky	1
Maine	1
Minnesota	1
Oklahoma	1
Thailand	1
Thailand (Bangkok)	1
Turkey	1
United Kingdom (England)	1
United Kingdom (Scotland)	1

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Brazelton Neonatal Assessment…	2
Flanders System of…	1
Peabody Picture Vocabulary…	1
Program for International…	1

Showing 1 to 15 of 45 results

The Analysis of Marking Reliability through the Approach of Gauge Repeatability and Reproducibility (GR&R) Study: A Case of English-Speaking Test

Peer reviewed

Pornphan Sureeyatanapas; Panitas Sureeyatanapas; Uthumporn Panitanarak; Jittima Kraisriwattana; Patchanan Sarootyanapat; Daniel O'Connell – Language Testing in Asia, 2024

Ensuring consistent and reliable scoring is paramount in education, especially in performance-based assessments. This study delves into the critical issue of marking consistency, focusing on speaking proficiency tests in English language learning, which often face greater reliability challenges. While existing literature has explored various…

Descriptors: Foreign Countries, Students, English Language Learners, Speech

The Assessment Has Become the Curriculum: Teachers' Views on the Phonics Screening Check in England

Peer reviewed

Carter, Jane – British Educational Research Journal, 2020

The Phonics Screening Check (PSC) was introduced in England in 2012 for Year 1 children (aged 5 and 6). There have been criticisms of the check in relation to its reliability and appropriateness as an assessment for early reading, although advocates of the check see it as a valuable tool in securing progress in early reading. This mixed methods…

Descriptors: Phonics, Teacher Attitudes, Socioeconomic Status, Testing Problems

Applying Assessment Principles during Emergency Remote Teaching: Challenges and Considerations

Peer reviewed

Allehaiby, Wid Hasen; Al-Bahlani, Sara – Arab World English Journal, 2021

One of the main challenges higher educational institutions encounter amid the recent COVID-19 crisis is transferring assessment approaches from the traditional face-to-face form to the online Emergency Remote Teaching approach. A set of language assessment principles, practicality, reliability, validity, authenticity, and washback, which can be…

Descriptors: Barriers, Distance Education, Evaluation Methods, Teaching Methods

Traditional and Alternative Assessments in ELT: Students' and Teachers' Perceptions

Peer reviewed

Phongsirikul, Marissa – rEFLections, 2018

The study aimed to investigate teachers' and students' perceptions towards traditional and alternative types of assessment within a classroom context of an English course provided for English-majoring students at tertiary level. A combination of traditional and alternative assessment tools was implemented in the study. The researcher developed…

Descriptors: Teacher Attitudes, Student Attitudes, Alternative Assessment, Second Language Learning

Investigating the Backwash Effect of Higher Education Exam (YGS) on University Students' Attitudes

Peer reviewed

Polat, Murat – International Journal of Psychology and Educational Studies, 2020

The application of high-stakes tests to choose students for higher education in Turkey has been considered as a reliable and effective way of assessment for so long. However, the application of a multiple-choice test in testing various skills could bring a number of side-effects with itself. This study aimed to investigate the backwash effect of…

Descriptors: Testing Problems, College Students, Student Attitudes, College Entrance Examinations

Online Testing Suffers Setbacks in Multiple States

Davis, Michelle R. – Education Week, 2013

Widespread technical failures and interruptions of recent online testing in a number of states have shaken the confidence of educators and policymakers in high-tech assessment methods and raised serious concerns about schools' technological readiness for the coming common-core online tests. The glitches arose as many districts in the 46 states…

Descriptors: Computer Assisted Testing, Testing Problems, Reliability, Public Schools

Is It Really Possible to Test All Educationally Significant Achievements with High Levels of Reliability?

Peer reviewed

Davis, Andrew – Ethics and Education, 2015

PISA claims that it can extend its reach from its current core subjects of Reading, Science, Maths and problem-solving. Yet given the requirement for high levels of reliability for PISA, especially in the light of its current high stakes character, proposed widening of its subject coverage cannot embrace some important aspects of the social and…

Descriptors: International Assessment, High Stakes Tests, Reliability, Academic Achievement

A Simulation Study of the Situations in Which Reporting Subscores Can Add Value to Licensure Examinations

Feinberg, Richard A. – ProQuest LLC, 2012

Subscores, also known as domain scores, diagnostic scores, or trait scores, can help determine test-takers' relative strengths and weaknesses and appropriately focus remediation. However, subscores often have poor psychometric properties, particularly reliability and distinctiveness (Folske, Gessaroli, & Swanson, 1999; Monaghan, 2006;…

Descriptors: Simulation, Tests, Testing, Scores

Student Evaluation in Higher Education: A Comparison between Computer Assisted Assessment and Traditional Evaluation

Peer reviewed

Ghilay, Yaron; Ghilay, Ruth – Journal of Educational Technology, 2012

The study examined advantages and disadvantages of computerised assessment compared to traditional evaluation. It was based on two samples of college students (n=54) being examined in computerised tests instead of paper-based exams. Students were asked to answer a questionnaire focused on test effectiveness, experience, flexibility and integrity.…

Descriptors: Student Evaluation, Higher Education, Comparative Analysis, Computer Assisted Testing

Individualized Assessment of Differential Abilities.

Weiss, David J. – 1969

Today's psychological measurement depends almost exclusively on the "standardized test." A certain amount of non-standardization, however, exists in the administration of any standardized test, with the amount unknown for any given test score. Time limits on tests pose a bigger problem since another variable is introduced, pressure. Test taking…

Descriptors: Computer Oriented Programs, Individual Testing, Measurement Instruments, Motivation

Reliability of a Group Form of the Peabody Picture Vocabulary Test.

Peer reviewed

Tillinghast, B. S., Jr.; Renzulli, Joseph S. – Journal of Educational Research, 1968

The purpose of this study was to further examine the reliability of the Peabody Picture Vocabulary Test (PPVT), a new instrument to measure hearing vocabulary so that a student's verbal intelligence may be inferred. A group testing procedure was utilized by reproducing the PPVT plates on 35 millimeter transparent slides and projecting them onto a…

Descriptors: Aptitude Tests, Elementary School Students, Evaluation, Group Testing

Test Use and Test Reliability in a Curriculum for Educable Mentally Retarded Children. Working Paper Number 1.

Smith, Leon I.; Greenberg, Sandra – 1973

A discussion of selected applications of new tests developed within the context of a large-scale curriculum for educable mentally retarded (EMR) children, the Social Learning Curriculum (SLC), is presented in this paper which investigates three types of reliability that need to be demonstrated in order to provide a basis of these applications. The…

Descriptors: Curriculum Evaluation, Educational Research, Evaluation Methods, Measurement Techniques

The Use of Bayes' Estimates in the Law of Comparative Judgment.

Peer reviewed

Kaiser, Henry F. – Educational and Psychological Measurement, 1980

The use of Bayes' estimates for proportions in the Law of Comparative Judgment is suggested to avoid sample proportions of zero and one. (Author)

Descriptors: Bayesian Statistics, Comparative Analysis, Reliability, Statistical Analysis

Basic Concepts in Classical Test Theory: Tests Aren't Reliable, the Nature of Alpha, and Reliability Generalization as a Meta-analytic Method.

Helms, LuAnn Sherbeck – 1999

This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…

Descriptors: Effect Size, Meta Analysis, Reliability, Scores

Subject Profiles Viewed in the Light of Reliability

McVey, P. J. – Assessment in Higher Education, 1976

The results of 16 pairs of "equivalent papers" were used to estimate the reliability of the papers and the extent to which each paper correlated with the year's average test grade. Estimates were also made of the work of the grade for each paper as a predictor of true subject grades. It is shown that a "profile" of grades would mislead.…

Descriptors: Grades (Scholastic), Higher Education, Profiles, Reliability


Source

Journal of Educational…	3
Assessing Writing	2
Educational and Psychological…	2
Monographs of the Society for…	2
Applied Measurement in…	1
Arab World English Journal	1
Assessment in Higher Education	1
British Educational Research…	1
British Journal of…	1
Education Week	1
Engl Quart	1
Ethics and Education	1
Evaluation and Program…	1
Generations	1
Intelligence	1
International Journal of…	1
Journal of Educational…	1
Journal of Educational…	1
Journal of Marriage and the…	1
Language Testing in Asia	1
ProQuest LLC	1
Review of Educational Research	1
Scottish Educational Review	1
rEFLections	1

Author

Al-Bahlani, Sara	1
Allehaiby, Wid Hasen	1
Campbell, Mary	1
Carter, Jane	1
Cohen, Patricia	1
Dahl, Theodore	1
Daniel O'Connell	1
Davis, Andrew	1
Davis, Michelle R.	1
Earles, James A.	1
Ebel, Robert L.	1
Erlich, Oded	1
Feinberg, Richard A.	1
Feldt, Leonard S.	1
Follman, John	1
Frary, Robert B.	1
Geron, Scott Miyake	1
Ghilay, Ruth	1
Ghilay, Yaron	1
Greenberg, Sandra	1
Harmon, Michelle G.	1
Helms, LuAnn Sherbeck	1
Hoge, Robert D.	1
Horowitz, Frances Degen	1