Showing 1 to 15 of 22 results
Kelvin Terrell Pompey – ProQuest LLC, 2021
Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…
Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation
Peer reviewed
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015
A latent variable modeling approach for scale reliability evaluation in heterogeneous populations is discussed. The method can be used for point and interval estimation of reliability of multicomponent measuring instruments in populations representing mixtures of an unknown number of latent classes or subpopulations. The procedure is helpful also…
Descriptors: Test Reliability, Evaluation Methods, Measurement Techniques, Computation
Peer reviewed
Evans, C.; Kandiko Howson, C.; Forsythe, A. – Higher Education Pedagogies, 2018
Internationally, the political appetite for educational measurement capable of capturing a metric of value for money and effectiveness has momentum. While most would agree with the need to assess costs relevant to quality to help support better governmental policy decisions about public spending, poorly understood measurement comes with unintended…
Descriptors: Higher Education, Achievement Gains, Political Issues, Quality Assurance
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Peer reviewed
Calmettes, Guillaume; Drummond, Gordon B.; Vowler, Sarah L. – Advances in Physiology Education, 2012
A jack knife is a pocket knife that is put to many tasks, because it's ready to hand. Often there could be a better tool for the job, such as a screwdriver, a scraper, or a can-opener, but these are not usually pocket items. In statistical terms, the expression implies making do with what's available. Another simile, of an extreme situation, is…
Descriptors: Statistical Analysis, Computation, Population Distribution, Evaluation Methods
Bill & Melinda Gates Foundation, 2012
No one has a bigger stake in teaching effectiveness than students. Nor are there any better experts on how teaching is experienced by its intended beneficiaries. Only recently have many policymakers and practitioners come to recognize that--when asked the right questions, in the right ways--students can be an important source of information on the…
Descriptors: Student Surveys, Student Attitudes, Feedback (Response), Test Validity
Peer reviewed
Runyan, Desmond K.; Dunne, Michael P.; Zolotor, Adam J. – Child Abuse & Neglect: The International Journal, 2009
The "World Report on Children and Violence" (Pinheiro, 2006) was produced at the request of the UN Secretary General and the UN General Assembly. This report recommended improvement in research on child abuse. ISPCAN representatives took up this charge and developed three new instruments. We describe this background and introduce three new measures…
Descriptors: Child Abuse, Screening Tests, Child Welfare, Test Construction
PDF pending restoration
Estes, Carole; Estes, Gary D. – 1980
Multiple matrix sampling is a sampling design in which both test items and examinees are randomly sampled from their respective populations. This study was designed to develop and assess a method for computing an estimate of a correlation coefficient when a multiple matrix sampling design is used. The examinee populations included 212 third-grade…
Descriptors: Correlation, Elementary Secondary Education, Evaluation Methods, Grade 3
Austin, Dean A.; Novak, Carl D. – Health Education (Washington D.C.), 1976
This study demonstrates that multiple matrix sampling procedures can be used to collect assessment data efficiently, unobtrusively, and reliably. (MB)
Descriptors: Data Collection, Educational Testing, Evaluation Methods, Item Sampling
Peer reviewed
Poggio, John P.; Glasnapp, Douglas R. – Educational and Psychological Measurement, 1973
Descriptors: Academic Achievement, Evaluation Methods, Formative Evaluation, Item Sampling
Peer reviewed
Askegaard, Lewis D.; Umila, Benwardo V. – Journal of Educational Measurement, 1982
Multiple matrix sampling of items and examinees was applied to an 18-item rank order instrument administered to a randomly assigned group and compared to the ordering and ranking of all items by control subjects. High correlations between ranks suggest the methodology may viably reduce respondent effort on long rank ordering tasks. (Author/CM)
Descriptors: Evaluation Methods, Item Sampling, Junior High Schools, Student Reaction
Peer reviewed
Liang, Xin – Evaluation and Research in Education, 2003
Multiple matrix sampling is a data collection technique that ensures accuracy and efficiency in group performance. It has been widely used in large-scale curriculum evaluation since the 1980s. However, the design does not always fully embrace the dynamics of local evaluation demands. The purpose of this study is to introduce a modified matrix…
Descriptors: Curriculum Evaluation, Item Sampling, Matrices, Statistical Studies
Peer reviewed
Gottfredson, Stephen D.; Moriarty, Laura J. – Crime & Delinquency, 2006
Statistically based risk assessment devices are widely used in criminal justice settings. Their promise remains largely unfulfilled, however, because assumptions and premises requisite to their development and application are routinely ignored and/or violated. This article provides a brief review of the most salient of these assumptions and…
Descriptors: Risk, Justice, Criminals, Crime
Peer reviewed
Evans, Julia L.; Craig, Holly K. – Journal of Speech and Hearing Research, 1992
Analysis of spontaneous language samples of 10 children (ages 8-9) with specific language impairments found that interviews were a reliable, valid, and efficient assessment context, eliciting the same profile of behaviors as a freeplay context without altering diagnostic classifications. (Author/JDD)
Descriptors: Data Collection, Discourse Analysis, Educational Diagnosis, Efficiency
Peer reviewed
Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008
With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…
Descriptors: Global Approach, Cultural Differences, Norms, Human Resources