Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
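The internal-consistency estimates this abstract names (split-half with the Spearman-Brown correction, coefficient α) can be sketched in a few lines of numpy. This is an illustrative computation on an examinees × items score matrix, not code from the article:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

def split_half(scores):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    scores = np.asarray(scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)
    even = scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)
```

Both functions return 1.0 when examinees are rank-ordered identically by every item, and fall toward 0 as item-level error grows.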
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
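The "rescaling the standard" idea — applying a common-item linear transformation to the cut score itself — can be illustrated with a hypothetical mean-sigma equating function. The function name and data below are invented for illustration and are not taken from the study:

```python
import numpy as np

def mean_sigma_transform(old_common, new_common):
    """Linear (mean-sigma) equating: map new-form scores onto the
    old form's scale using scores on the common items."""
    old = np.asarray(old_common, dtype=float)
    new = np.asarray(new_common, dtype=float)
    slope = old.std(ddof=1) / new.std(ddof=1)
    intercept = old.mean() - slope * new.mean()
    return lambda x: slope * x + intercept

# Rescale a cut score set on the new form back onto the old form's scale
rescale = mean_sigma_transform([10, 12, 14], [20, 24, 28])
rescaled_cut = rescale(24)
```

With small samples the means and standard deviations of the common items are themselves noisy, which is exactly the scenario the study probes.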
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
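A point at issue in this literature is that several regression assumptions concern the residuals of a fitted model rather than the raw variables. A minimal numpy-only diagnostic sketch, on simulated data with invented names, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=500)

# Ordinary least squares fit
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Normality check: skewness of the residuals should be near zero
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
skewness = float((z ** 3).mean())

# Homoscedasticity check: |residuals| should not track the predictor
spread_vs_x = float(np.corrcoef(np.abs(residuals), x)[0, 1])
```

The key point is that `residuals`, not `x` or `y`, is the object these diagnostics apply to.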
Gardner, John – Oxford Review of Education, 2013
Evidence from recent research suggests that in the UK the public perception of errors in national examinations is that they are simply mistakes; events that are preventable. This perception predominates over the more sophisticated technical view that errors arise from many sources and create an inevitable variability in assessment outcomes. The…
Descriptors: Educational Assessment, Public Opinion, Error of Measurement, Foreign Countries
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012
Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…
Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests
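The consistency question here is naturally framed in generalizability theory. One common summary for a fully crossed persons × tasks design is the generalizability coefficient for relative decisions; the rough numpy sketch below is illustrative only and is not the authors' analysis:

```python
import numpy as np

def g_coefficient(scores):
    """Generalizability coefficient (relative decisions) for a fully
    crossed persons x tasks design with one observation per cell."""
    scores = np.asarray(scores, dtype=float)
    n_p, n_t = scores.shape
    grand = scores.mean()
    ms_p = n_t * ((scores.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
    ms_t = n_p * ((scores.mean(axis=0) - grand) ** 2).sum() / (n_t - 1)
    ss_res = (((scores - grand) ** 2).sum()
              - ms_p * (n_p - 1) - ms_t * (n_t - 1))
    ms_res = ss_res / ((n_p - 1) * (n_t - 1))
    var_p = max(0.0, (ms_p - ms_res) / n_t)   # person variance component
    rel_error = ms_res / n_t                  # relative error variance
    return var_p / (var_p + rel_error)
```

When persons are ordered identically across tasks the coefficient is 1; as the person × task interaction grows, adding tasks is what recovers dependability.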
Leark, Robert A.; Wallace, Denise R.; Fitzgerald, Robert – Assessment, 2004
Test-retest reliability of the Test of Variables of Attention (T.O.V.A.) was investigated in two studies using two different time intervals: 90 min and 1 week (plus or minus 2 days). To investigate the 90-min reliability, 31 school-age children (M = 10 years, SD = 2.66) were administered the T.O.V.A., then readministered the test 90 min afterward.…
Descriptors: Intervals, Reaction Time, Error of Measurement, Test Reliability
Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008
With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…
Descriptors: Global Approach, Cultural Differences, Norms, Human Resources