ERIC - Search Results

Showing 1 to 15 of 22 results

A Unified Approach to Estimating the Intraclass Correlation Coefficient and Its Bias: An Exploratory Study

Kelvin Terrell Pompey – ProQuest LLC, 2021

Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…

Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation

Scale Reliability Evaluation with Heterogeneous Populations

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015

A latent variable modeling approach for scale reliability evaluation in heterogeneous populations is discussed. The method can be used for point and interval estimation of reliability of multicomponent measuring instruments in populations representing mixtures of an unknown number of latent classes or subpopulations. The procedure is helpful also…

Descriptors: Test Reliability, Evaluation Methods, Measurement Techniques, Computation

Making Sense of Learning Gain in Higher Education

Peer reviewed

Evans, C.; Kandiko Howson, C.; Forsythe, A. – Higher Education Pedagogies, 2018

Internationally, the political appetite for educational measurement capable of capturing a metric of value for money and effectiveness has momentum. While most would agree with the need to assess costs relevant to quality to help support better governmental policy decisions about public spending, poorly understood measurement comes with unintended…

Descriptors: Higher Education, Achievement Gains, Political Issues, Quality Assurance

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Making Do with What We Have: Use Your Bootstraps

Peer reviewed

Calmettes, Guillaume; Drummond, Gordon B.; Vowler, Sarah L. – Advances in Physiology Education, 2012

A jack knife is a pocket knife that is put to many tasks, because it's ready to hand. Often there could be a better tool for the job, such as a screwdriver, a scraper, or a can-opener, but these are not usually pocket items. In statistical terms, the expression implies making do with what's available. Another simile, of an extreme situation, is…

Descriptors: Statistical Analysis, Computation, Population Distribution, Evaluation Methods

Asking Students about Teaching: Student Perception Surveys and Their Implementation. MET Project Policy and Practice Brief

Bill & Melinda Gates Foundation, 2012

No one has a bigger stake in teaching effectiveness than students. Nor are there any better experts on how teaching is experienced by its intended beneficiaries. Only recently have many policymakers and practitioners come to recognize that--when asked the right questions, in the right ways--students can be an important source of information on the…

Descriptors: Student Surveys, Student Attitudes, Feedback (Response), Test Validity

Introduction to the Development of the ISPCAN Child Abuse Screening Tools

Peer reviewed

Runyan, Desmond K.; Dunne, Michael P.; Zolotor, Adam J. – Child Abuse & Neglect: The International Journal, 2009

The "World Report on Children and Violence" (Pinheiro, 2006) was produced at the request of the UN Secretary General and the UN General Assembly. This report recommended improvement in research on child abuse. ISPCAN representatives took this charge and developed 3 new instruments. We describe this background and introduce three new measures…

Descriptors: Child Abuse, Screening Tests, Child Welfare, Test Construction

Estimating a Correlation Coefficient Using a Multiple Matrix Sampling Design.

Estes, Carole; Estes, Gary D. – 1980

Multiple matrix sampling is a sampling design in which both test items and examinees are randomly sampled from their respective populations. This study was designed to develop and assess a method for computing an estimate of a correlation coefficient when a multiple matrix sampling design is used. The examinee populations included 212 third-grade…

Descriptors: Correlation, Elementary Secondary Education, Evaluation Methods, Grade 3

Using Multiple Matrix Sampling to Assess Health Education Knowledge

Austin, Dean A.; Novak, Carl D. – Health Education (Washington D.C.), 1976

This study demonstrates that multiple matrix sampling procedures can be used to collect assessment data efficiently, unobtrusively, and reliably. (MB)

Descriptors: Data Collection, Educational Testing, Evaluation Methods, Item Sampling

Content-Sampling as an Evaluation and Research Technique

Peer reviewed

Poggio, John P.; Glasnapp, Douglas R. – Educational and Psychological Measurement, 1973

Descriptors: Academic Achievement, Evaluation Methods, Formative Evaluation, Item Sampling

An Empirical Investigation of the Applicability of Multiple Matrix Sampling to the Method of Rank Order.

Peer reviewed

Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982

Multiple matrix sampling of items and examinees was applied to an 18-item rank order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest the methodology may viably reduce respondent effort on long rank ordering tasks. (Author/CM)

Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction

An Empirical Examination of a Modified Matrix Sampling Procedure as an Evaluation Tool for Grades 7 to 12 in a Midwestern School District

Peer reviewed

Liang, Xin – Evaluation and Research in Education, 2003

Multiple matrix sampling is a data collection technique that ensures accuracy and efficiency in group performance. It has been widely used in large-scale curriculum evaluation since the 1980s. However, the design does not always fully embrace the dynamics of local evaluation demands. The purpose of this study is to introduce a modified matrix…

Descriptors: Curriculum Evaluation, Item Sampling, Matrices, Statistical Studies

Statistical Risk Assessment: Old Problems and New Applications

Peer reviewed

Gottfredson, Stephen D.; Moriarty, Laura J. – Crime & Delinquency, 2006

Statistically based risk assessment devices are widely used in criminal justice settings. Their promise remains largely unfulfilled, however, because assumptions and premises requisite to their development and application are routinely ignored and/or violated. This article provides a brief review of the most salient of these assumptions and…

Descriptors: Risk, Justice, Criminals, Crime

Language Sample Collection and Analysis: Interview Compared to Freeplay Assessment Contexts.

Peer reviewed

Evans, Julia L.; Craig, Holly K. – Journal of Speech and Hearing Research, 1992

Analysis of spontaneous language samples of 10 children (ages 8-9) with specific language impairments found that interviews were a reliable, valid, and efficient assessment context, eliciting the same profile of behaviors as a freeplay context without altering diagnostic classifications. (Author/JDD)

Descriptors: Data Collection, Discourse Analysis, Educational Diagnosis, Efficiency

Considerations for Creating Multi-Language Personality Norms: A Three-Component Model of Error

Peer reviewed

Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008

With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…

Descriptors: Global Approach, Cultural Differences, Norms, Human Resources
