Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
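The internal-consistency estimates this abstract names (split-half with the Spearman-Brown correction, coefficient α) can be sketched in a few lines of numpy. This is an illustrative computation on an examinees × items score matrix, not code from the article:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for an (examinees x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

def split_half(scores):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    scores = np.asarray(scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)
    even = scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)
```

Both functions return 1.0 when examinees are rank-ordered identically by every item, and fall toward 0 as item-level error grows.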
Dwyer, Andrew C. – Journal of Educational Measurement, 2016
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common-item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common-item equating methodology to standard setting ratings to account for…
Descriptors: Cutting Scores, Equivalency Tests, Test Format, Academic Standards
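The "rescaling the standard" idea — applying a common-item linear transformation to the cut score itself — can be illustrated with a hypothetical mean-sigma equating function. The function name and data below are invented for illustration and are not taken from the study:

```python
import numpy as np

def mean_sigma_transform(old_common, new_common):
    """Linear (mean-sigma) equating: map new-form scores onto the
    old form's scale using scores on the common items."""
    old = np.asarray(old_common, dtype=float)
    new = np.asarray(new_common, dtype=float)
    slope = old.std(ddof=1) / new.std(ddof=1)
    intercept = old.mean() - slope * new.mean()
    return lambda x: slope * x + intercept

# Rescale a cut score set on the new form back onto the old form's scale
rescale = mean_sigma_transform([10, 12, 14], [20, 24, 28])
rescaled_cut = rescale(24)
```

With small samples the means and standard deviations of the common items are themselves noisy, which is exactly the scenario the study probes.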
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
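A point at issue in this literature is that several regression assumptions concern the residuals of a fitted model rather than the raw variables. A minimal numpy-only diagnostic sketch, on simulated data with invented names, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=500)

# Ordinary least squares fit
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Normality check: skewness of the residuals should be near zero
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
skewness = float((z ** 3).mean())

# Homoscedasticity check: |residuals| should not track the predictor
spread_vs_x = float(np.corrcoef(np.abs(residuals), x)[0, 1])
```

The key point is that `residuals`, not `x` or `y`, is the object these diagnostics apply to.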
Gardner, John – Oxford Review of Education, 2013
Evidence from recent research suggests that in the UK the public perception of errors in national examinations is that they are simply mistakes; events that are preventable. This perception predominates over the more sophisticated technical view that errors arise from many sources and create an inevitable variability in assessment outcomes. The…
Descriptors: Educational Assessment, Public Opinion, Error of Measurement, Foreign Countries
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012
Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…
Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests
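The consistency question here is naturally framed in generalizability theory. One common summary for a fully crossed persons × tasks design is the generalizability coefficient for relative decisions; the rough numpy sketch below is illustrative only and is not the authors' analysis:

```python
import numpy as np

def g_coefficient(scores):
    """Generalizability coefficient (relative decisions) for a fully
    crossed persons x tasks design with one observation per cell."""
    scores = np.asarray(scores, dtype=float)
    n_p, n_t = scores.shape
    grand = scores.mean()
    ms_p = n_t * ((scores.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
    ms_t = n_p * ((scores.mean(axis=0) - grand) ** 2).sum() / (n_t - 1)
    ss_res = (((scores - grand) ** 2).sum()
              - ms_p * (n_p - 1) - ms_t * (n_t - 1))
    ms_res = ss_res / ((n_p - 1) * (n_t - 1))
    var_p = max(0.0, (ms_p - ms_res) / n_t)   # person variance component
    rel_error = ms_res / n_t                  # relative error variance
    return var_p / (var_p + rel_error)
```

When persons are ordered identically across tasks the coefficient is 1; as the person × task interaction grows, adding tasks is what recovers dependability.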
Leark, Robert A.; Wallace, Denise R.; Fitzgerald, Robert – Assessment, 2004
Test-retest reliability of the Test of Variables of Attention (T.O.V.A.) was investigated in two studies using two different time intervals: 90 min and 1 week (plus or minus 2 days). To investigate the 90-min reliability, 31 school-age children (M = 10 years, SD = 2.66) were administered the T.O.V.A., then readministered the test 90 min afterward.…
Descriptors: Intervals, Reaction Time, Error of Measurement, Test Reliability
Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008
With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…
Descriptors: Global Approach, Cultural Differences, Norms, Human Resources