ERIC - Search Results

Showing 1 to 15 of 16 results

Detecting Careless Responding in Multidimensional Forced-Choice Questionnaires

Peer reviewed

Rebekka Kupffer; Susanne Frick; Eunike Wetzel – Educational and Psychological Measurement, 2024

The multidimensional forced-choice (MFC) format is an alternative to rating scales in which participants rank items according to how well the items describe them. Currently, little is known about how to detect careless responding in MFC data. The aim of this study was to adapt a number of indices used for rating scales to the MFC format and…

Descriptors: Measurement Techniques, Alternative Assessment, Rating Scales, Questionnaires

On the Merits of Longitudinal Multiple Group Modelling: An Alternative to Multilevel Modelling for Intervention Evaluations

Peer reviewed

Little, Todd D.; Bontempo, Daniel; Rioux, Charlie; Tracy, Allison – International Journal of Research & Method in Education, 2022

Multilevel modelling (MLM) is the most frequently used approach for evaluating interventions with clustered data. MLM, however, has some limitations that are associated with numerous obstacles to model estimation and valid inferences. Longitudinal multiple-group (LMG) modelling is a longstanding approach for testing intervention effects using…

Descriptors: Longitudinal Studies, Hierarchical Linear Modeling, Alternative Assessment, Intervention

Assessing Inter-Rater Reliability with Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables

Peer reviewed

Martinková, Patrícia; Bartoš, František; Brabec, Marek – Journal of Educational and Behavioral Statistics, 2023

Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater's or ratee's gender, major, or experience. Identification of such heterogeneity sources in IRR is important for the implementation of policies with the potential to decrease measurement error…

Descriptors: Interrater Reliability, Bayesian Statistics, Statistical Inference, Hierarchical Linear Modeling

Controlling for Measurement Error in Evaluations When Treatment Group Assignment Is Based on Noisy Measures

Peer reviewed

Robert Meyer; Sara Hu; Michael Christian – Society for Research on Educational Effectiveness, 2023

Background: This paper develops a new method to estimate quasi-experimental evaluation models when it is necessary to control for measurement error in predictors and individual assignment to the treatment group is based on these same fallible variables. A major methodological finding of the study is that standard methods of estimating models that…

Descriptors: Error of Measurement, Measurement Techniques, Elementary Secondary Education, Report Cards

Processes and Procedures for Estimating Score Reliability and Precision

Peer reviewed

Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…

Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests

The Perceptive Imperative: Connoisseurship and the Temptation of Rubrics

Peer reviewed

Gottlieb, Derek; Moroye, Christy M. – Journal of Curriculum and Pedagogy, 2016

We examine the reliance on rubrics for educational evaluation and explore whether such tools fulfill their promise. Following Wittgensteinian critical strategies, we explore what "the application of the [rubric] picture looks like" and then evaluate (a) whether those benefits are attributable to rubric use at all, and (b) whether any of…

Descriptors: Scoring Rubrics, Educational Assessment, Student Evaluation, Educational Benefits

The Accuracy of Aggregate Student Growth Percentiles as Indicators of Educator Performance

Peer reviewed

Castellano, Katherine E.; McCaffrey, Daniel F. – Educational Measurement: Issues and Practice, 2017

Mean or median student growth percentiles (MGPs) are a popular measure of educator performance, but they lack rigorous evaluation. This study investigates the error in MGP due to test score measurement error (ME). Using analytic derivations, we find that errors in the commonly used MGP are correlated with average prior latent achievement: Teachers…

Descriptors: Teacher Evaluation, Teacher Effectiveness, Value Added Models, Achievement Gains

Challenging Conventional Wisdom for Multivariate Statistical Models with Small Samples

Peer reviewed

McNeish, Daniel – Review of Educational Research, 2017

In education research, small samples are common because of financial limitations, logistical challenges, or exploratory studies. With small samples, statistical principles on which researchers rely do not hold, leading to trust issues with model estimates and possible replication issues when scaling up. Researchers are generally aware of such…

Descriptors: Models, Statistical Analysis, Sampling, Sample Size

Exploring the Utility of Sequential Analysis in Studying Informal Formative Assessment Practices

Peer reviewed

Furtak, Erin Marie; Ruiz-Primo, Maria Araceli; Bakeman, Roger – Educational Measurement: Issues and Practice, 2017

Formative assessment is a classroom practice that has received much attention in recent years for its established potential at increasing student learning. A frequent analytic approach for determining the quality of formative assessment practices is to develop a coding scheme and determine frequencies with which the codes are observed; however,…

Descriptors: Sequential Approach, Formative Evaluation, Alternative Assessment, Incidence

Classroom Dynamic Assessment: A Critical Examination of Constructs and Practices

Peer reviewed

Davin, Kristin J. – Modern Language Journal, 2016

This article explores the implementation of dynamic assessment (DA) in an elementary school foreign language classroom by considering its theoretical basis and its applicability to second language (L2) teaching, learning, and development. In existing applications of L2 classroom DA, errors serve as a window into learners' instructional needs and…

Descriptors: Alternative Assessment, Elementary School Students, Second Language Learning, Second Language Instruction

Improving the Targeting of Treatment: Evidence from College Remediation

Peer reviewed

Scott-Clayton, Judith; Crosta, Peter M.; Belfield, Clive R. – Educational Evaluation and Policy Analysis, 2014

Remediation is one of the largest single interventions intended to improve outcomes for underprepared college students, yet little is known about the remedial screening process. Using administrative data and a rich predictive model, we find that severe mis-assignments are common using current test-score-cutoff-based policies, with…

Descriptors: Remedial Instruction, Remedial Programs, College Students, Screening Tests

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Using the Response to Intervention Framework with English Language Learners

Elizalde-Utnick, Graciela – Communique, 2008

There is great controversy in the field of learning disabilities (LD) regarding the establishment of criteria for LD identification. The traditional approach to LD identification is to use the IQ-discrepancy. Lyon and colleagues (2001) point out the numerous problems with such an approach, including faulty assumptions about the adequacy of an IQ…

Descriptors: Intervention, Learning Disabilities, Second Language Learning, Intelligence Quotient

E-Assessment within the Bologna Paradigm: Evidence from Portugal

Peer reviewed

Ferrao, Maria – Assessment & Evaluation in Higher Education, 2010

The Bologna Declaration brought reforms into higher education that imply changes in teaching methods, didactic materials and textbooks, infrastructures and laboratories, etc. Statistics and mathematics are disciplines that traditionally have the worst success rates, particularly in non-mathematics core curricula courses. This research project,…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Educational Assessment

Alternative Assessments of the Performance of Schools: Measurement of State Variations in Achievement.

Peer reviewed

Hanushek, Eric A.; Taylor, Lori L. – Journal of Human Resources, 1990

Commonly employed measures of school quality can lead to very misleading results. Especially at the state level, nonrepresentative data such as aggregate Scholastic Aptitude Test scores provide very biased measures of school performance. Far superior are direct estimates of achievement growth. (SK)

Descriptors: Academic Achievement, Alternative Assessment, Educational Assessment, Educational Quality
