ERIC - Search Results

Showing 1 to 15 of 29 results

Rethinking Online Assessment Quality from Pre-Service Teachers Perspectives

Peer reviewed

Öztürk, Mücahit – Open Praxis, 2024

This study examined the problems that pre-service teachers face in the online assessment process and their suggestions for solutions to these problems. The participants were 136 pre-service teachers who have been experiencing online assessment for a long time and who took the Foundations of Open and Distance Learning course. This research is a…

Descriptors: Foreign Countries, Preservice Teacher Education, Preservice Teachers, Distance Education

Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

Peer reviewed

Sinharay, Sandip – Applied Measurement in Education, 2017

Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

The Leading Group Effect: Illusionary Declines in Scholastic Standard Scores of Mid-Range Japanese Junior High School Pupils

Peer reviewed

Mori, Kazuo; Uchida, Akitoshi – Research in Education, 2012

Longitudinal change in the average Z scores for four groups of pupils sorted by quartiles was examined for its stability over three years. The data, collected from 1998 to 2009, was obtained from nine cohorts of Japanese junior high school pupils totaling 1,962 subjects. It showed illusionary declines among the mid-range pupils but improvements…

Descriptors: Foreign Countries, Junior High School Students, Cohort Analysis, Evaluation Problems

The Applicability of Multidimensional Computerized Adaptive Testing for Cognitive Ability Measurement in Organizational Assessment

Peer reviewed

Makransky, Guido; Glas, Cees A. W. – International Journal of Testing, 2013

Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…

Descriptors: Foreign Countries, Cognitive Ability, Adaptive Testing, Feedback (Response)

An NCME Instructional Module on Using Differential Step Functioning to Refine the Analysis of DIF in Polytomous Items

Peer reviewed

Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009

Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…

Descriptors: Test Bias, Test Items, Evaluation Methods, Scores

Assessment Design and Cheating Risk in Online Instruction

Peer reviewed

Harmon, Oskar R.; Lambrinos, James; Buffolino, Judy – Online Journal of Distance Learning Administration, 2010

Many consider online courses to be an inferior alternative to traditional face-to-face (f2f) courses because exam cheating is thought to occur more often in online courses. This study examines how the assessment design in online courses contributes to this perception. Following a literature review, the assessment design in a sample of online…

Descriptors: Electronic Learning, Student Attitudes, Cheating, Online Courses

Playing with the Stakes: A Consideration of an Aspect of the Social Context of a Gatekeeping Writing Assessment

Peer reviewed

Baker, Beverly A. – Assessing Writing, 2010

In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…

Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing

Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use

Peer reviewed

Myford, Carol M.; Wolfe, Edward W. – Journal of Educational Measurement, 2009

In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition…

Descriptors: English Literature, Advanced Placement, Measures (Individuals), Writing (Composition)

Judges' Use of Examinee Performance Data in an Angoff Standard-Setting Exercise for a Medical Licensing Examination: An Experimental Study

Peer reviewed

Clauser, Brian E.; Mee, Janet; Baldwin, Su G.; Margolis, Melissa J.; Dillon, Gerard F. – Journal of Educational Measurement, 2009

Although the Angoff procedure is among the most widely used standard setting procedures for tests comprising multiple-choice items, research has shown that subject matter experts have considerable difficulty accurately making the required judgments in the absence of examinee performance data. Some authors have viewed the need to provide…

Descriptors: Standard Setting (Scoring), Program Effectiveness, Expertise, Health Personnel

Bias in Psychological Assessment: Heterosexism.

Peer reviewed

Chernin, Jeffrey; Holden, Janice Miner; Chandler, Cynthia – Measurement and Evaluation in Counseling and Development, 1997

Explores heterosexist bias in seven widely used assessment instruments. Focuses on bias that is observable in the instruments themselves and in the ancillary materials. Describes three types of bias, how these biases manifest in various instruments, and makes recommendations for mental health practitioners and for professionals who develop…

Descriptors: Evaluation Problems, Homophobia, Homosexuality, Lesbianism

Structural Equation Modeling Methods of Hypothesis Testing of Latent Variable Means.

Peer reviewed

Hancock, Gregory R. – Measurement and Evaluation in Counseling and Development, 1997

Analyzes two methods of testing group differences of a latent variable: group code analysis and structured means analysis. Describes these methods in terms of conceptual representation, unique underlying assumptions, and relative merits and limitations; points toward methodological extensions beyond this two-sample case. (RJM)

Descriptors: Evaluation Problems, Group Dynamics, Group Unity, Models

Door of Hope or Despair: Students' Perception of Distance Education at University of Ghana

Peer reviewed

Oteng-Ababio, M. – Turkish Online Journal of Distance Education, 2011

Distance Education has globally become one of the important solutions for increasing admission into the universities, decongesting campuses and efficient utilization of time and space. To ensure the sustainability of the programmes' noble objectives calls for periodic re-evaluation of its modus operandi including the assessment of the perception…

Descriptors: Foreign Countries, Student Attitudes, Distance Education, Negative Attitudes

Myth of the Master Detective: Reliability of Interpretations for Kaufman's "Intelligent Testing" Approach to the WISC-III.

Peer reviewed

Macmann, Gregg M.; Barnett, David W. – School Psychology Quarterly, 1997

Used computer simulation to examine the reliability of interpretations for Kaufman's "intelligent testing" approach to the Wechsler Intelligence Scale for Children (3rd ed.) (WISC-III). Findings indicate that factor index-score differences and other measures could not be interpreted with confidence. Argues that limitations of IQ testing…

Descriptors: Elementary Secondary Education, Evaluation Problems, Intelligence, Intelligence Quotient

Deafness and WISC-III Item Difficulty: Invariance and Fit.

Peer reviewed

Maller, Susan J. – Journal of School Psychology, 1997

Translated five subtests of the Wechsler Intelligence Scale for Children (3rd ed.)--Picture Completion, Information, Similarities, Vocabulary, and Comprehension--into sign language and administered tests to 110 severely and profoundly deaf children. Results indicate that many of the subtest items differentially measured intelligence for deaf…

Descriptors: Deafness, Elementary Secondary Education, Evaluation Problems, Goodness of Fit
