Anne Wicks; Robin Berkley – George W. Bush Institute, 2025
Assessments are one of the most important--and often misunderstood--elements of education. In most cases, tests are administered by the state as well as by districts and schools. Assessments at each of these levels have distinct purposes, yield different information, and are part of a powerful, coordinated approach to improving student outcomes.…
Descriptors: Student Evaluation, Testing, Tests, Standardized Tests
Wesolowski, Brian C. – Music Educators Journal, 2020
Validity, reliability, and fairness are three prominent indicators for evaluating the quality of assessment processes. Each of the indicators is most often written about and applied in the context of large-scale assessment. As a result, the technical properties of these indicators make them limited in both their practicality and relevance for…
Descriptors: Music Education, Test Validity, Test Reliability, Student Evaluation
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement--conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional, conditional standard errors, σ(eX|θ) = CSEM; (2) the IRT-based conditional standard errors, σ[subscript irt](eX|θ)=C[subscript…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Center on Standards and Assessments Implementation, 2018
Reliability is a measure of consistency. It is the degree to which student results are the same when they take the same test on different occasions, when different scorers score the same item or task, and when different but equivalent tests are taken at the same time or at different times. Reliability is about making sure that different test forms…
Descriptors: Test Reliability, Test Validity, Student Evaluation, Test Bias
Naumann, Fiona; Moore, Keri; Mildon, Sally; Jones, Philip – Asia-Pacific Journal of Cooperative Education, 2014
This paper aims to develop a valid method to assess the key competencies of the exercise physiology profession acquired through work-integrated learning (WIL). In order to develop a competency-based assessment, the key professional tasks needed to be identified and the test designed so students' competency in different tasks and settings could be…
Descriptors: Exercise Physiology, Competence, Test Construction, Work Experience
Greiff, Samuel; Wustenberg, Sascha; Funke, Joachim – Applied Psychological Measurement, 2012
This article addresses two unsolved measurement issues in dynamic problem solving (DPS) research: (a) unsystematic construction of DPS tests making a comparison of results obtained in different studies difficult and (b) use of time-intensive single tasks leading to severe reliability problems. To solve these issues, the MicroDYN approach is…
Descriptors: Problem Solving, Tests, Measurement, Structural Equation Models
Mahar, Matthew T.; Rowe, David A. – Measurement in Physical Education and Exercise Science, 2008
Accurate measures of youth fitness are needed by researchers and practitioners. Evidence of validity and reliability is essential before results of youth fitness tests can be used to make sound decisions. This article describes a three-stage paradigm for validation research and provides guidance for conducting and understanding norm-referenced…
Descriptors: Test Reliability, Test Validity, Guidelines, Physical Education Teachers

Hepburn, Mary A.; Strickland, Joseph B. – Journal of Social Studies Research, 1979
Describes the development and assessment of evaluation instruments designed to test student political-citizenship knowledge, skills, and attitudes. The tests are a part of the Improving Citizenship Education Project in Fulton County, Georgia. (Author/CK)
Descriptors: Citizenship, Educational Assessment, Elementary Secondary Education, Evaluation Methods

Stalder, Daniel R. – Teaching of Psychology, 2001
Evaluates the use of discrimination indexes (or item-total correlation) for examining the reliability of examinations. States this technique has drawbacks and may cause examination validity to be lower. Discusses the idea of discrimination power and why poor students may answer an item correctly. (CMK)
Descriptors: Academic Failure, Educational Research, Higher Education, Psychology

Callahan, Carolyn M.; Caldwell, Michael S. – Journal for the Education of the Gifted, 1993
This article describes the database of the National Repository for Instruments and Strategies Used in the Identification and Evaluation of Gifted Programs (University of Virginia). The Scale for the Evaluation of Gifted Identification Instruments is applied to the Kaufman Assessment Battery for Children. A sample bibliographic reference from the…
Descriptors: Ability Identification, Bibliographic Databases, Databases, Elementary Secondary Education
Thomsett, Michael C. – Graduating Engineer, 1988
Reviews a number of tests that persons applying for engineering jobs may encounter, including drug, honesty, polygraph, personality, and job skills tests. Discusses some problems with tests. Defines discrimination in hiring. States that job applicants usually will not take a test for Acquired Immune Deficiency Syndrome (AIDS). (CW)
Descriptors: College Science, College Students, Employment Practices, Employment Qualifications

Uduehi, Joseph – Visual Arts Research, 1995
Reiterates the criticism that the Maitland Graves Design Judgment Test is inadequate in measuring aesthetic judgment as defined by Graves, especially in a cross-cultural setting. United States students consistently scored highest. Nonetheless, all students responded favorably to three factors: symmetry, three-dimensionality, and complex design.…
Descriptors: Aesthetic Values, Art Appreciation, Art Education, Art Materials

Sharp, Stephen – Educational Studies, 1996
Summarizes an experiment that expanded the types of variables and approaches used in statistical analyses of subject attainment in secondary education. The experiment compared students' grades in a given subject with their expected grades based on performance in other subjects. Considers applications and limitations to this approach. (MJP)
Descriptors: Academic Achievement, Courses, Educational Administration, Educational Research