Hertz, Norman R.; Chinn, Roberta N. – 2002
Nearly all of the research on standard setting focuses on different standard setting methods rather than the interaction of group members and the instructions given to group members. This study explored the effect of deliberation style and the requirement to reach consensus on the passing score, on rater satisfaction, and on postdecision…
Descriptors: Decision Making, Evaluation Methods, Evaluators, Interaction
Sherman, Lawrence W.; Taylor, Amy N. – 2001
Experimenter bias effects were experimentally manipulated in a sample of 97 school psychologists scoring 3 subscales (Similarity, Vocabulary, and Comprehension) of the Wechsler Intelligence Scale for Children-Third Edition (WISC-III). First-year (n=29), intern (n=42), and experienced (n=26) psychologists were randomly assigned to either a bias or…
Descriptors: Bias, Children, Down Syndrome, Elementary Secondary Education
Bielinski, John; Minnema, Jane; Thurlow, Martha – 2002
A Web-based survey of 25 experts in testing theory and large-scale assessment examined the utility of out-of-level testing for making decisions about students and schools. Survey respondents were given a series of scenarios and asked to judge the degree to which out-of-level testing would affect the reliability and validity of test scores within…
Descriptors: Disabilities, Educational Assessment, Elementary Secondary Education, Gifted
Paradowski, Michal B. – Online Submission, 2002
The paper discusses the key criteria of good language tests: practicality, validity, and reliability.
Descriptors: Language Tests, Criterion Referenced Tests, Test Reliability, Test Validity
Gomberg, Anna; Orlova, Darya; Matthews, Amanda; Narvaez, Darcia – Online Submission, 2004
The "Rating Ethical Content Scale" ("RECS") judges the content of stories for positive content, based on the Four Process model of ethical behavior: ethical sensitivity, ethical judgment, ethical focus, and ethical action (Rest, 1983; Narvaez & Rest, 1995). For example, a story with Ethical Sensitivity has evidence of…
Descriptors: Media Literacy, Ethics, Rating Scales, Test Reliability
Wilkerson, Judy R.; Lang, William Steve – Online Submission, 2005
NCATE (2002) requires the measurement of knowledge, skills, and dispositions as part of its accreditation requirements for teacher education programs (Standard 1) and the use of unit assessment systems to aggregate and analyse data with a view toward program improvement (Standard 2). Data must indicate that candidates meet professional, state, and…
Descriptors: Teacher Education Programs, Program Improvement, Psychological Testing, Program Effectiveness
Cho, Insik J.; Ellinger, Andrea D.; Hezlett, Sarah A. – Online Submission, 2005
The concept of self-directed learning has become increasingly important in educational and work organizations as a result of trends that require learners to become more responsible for their own learning to remain highly skilled and knowledgeable in a competitive marketplace. To assess self-directedness in the Korean context, a relatively new…
Descriptors: Foreign Countries, Test Reliability, Test Validity, Independent Study
O'Neill, Thomas R.; Lunz, Mary E. – 1997
This paper illustrates a method to study rater severity across exam administrations. A multi-facet Rasch model defined the ratings as being dominated by four facets: examinee ability, rater severity, project difficulty, and task difficulty. Ten years of data from administrations of a histotechnology performance assessment were pooled and analyzed…
Descriptors: Ability, Comparative Analysis, Equated Scores, Interrater Reliability
Scheuren, Fritz; Li, Bonnie – 1995
This report provides empirical results of attempts to achieve consistency of estimates between two National Center for Education Statistics (NCES) surveys. These surveys are the 1991-92 Private School Survey (PSS) and the Private School Component of the 1990-91 Schools and Staffing Survey (SASS). Consistency was sought in the numbers of schools,…
Descriptors: Classification, Elementary Secondary Education, Estimation (Mathematics), Least Squares Statistics
Canivez, Gary L. – 1999
The short-term (45-day) stability of the Adjustment Scales for Children and Adolescents (P. McDermott, N. Marston, and D. Stott, 1993) was studied with 51 first and fifth graders, seven of whom were classified as "exceptional/disabled." Significant test-retest reliability coefficients were obtained, and mean differences from test to…
Descriptors: Behavior Patterns, Classification, Disabilities, Elementary Education
Fan, Xitao – 2001
Bootstrap analysis, both for nonparametric statistical inference and for describing sample results stability and replicability, has been gaining prominence among quantitative researchers in educational and psychological research. Procedurally, however, it is often quite a challenge for quantitative researchers to implement bootstrap analysis in…
Descriptors: Computer Software, Educational Research, Heuristics, Nonparametric Statistics
Crislip, Marian A.; Chin-Chance, Selvin – 2001
This paper discusses the use of two theories of item analysis and test construction, their strengths and weaknesses, and applications to the design of the Hawaii State Test of Essential Competencies (HSTEC). Traditional analyses of the data collected from the HSTEC field test were viewed from the perspectives of item difficulty levels and item…
Descriptors: Difficulty Level, Item Response Theory, Psychometrics, Reliability
Taherbhai, Husein; Young, Michael James – 2000
This empirical study used data from the Reading: Basic Understanding section of the New Standards English Language Arts Examination. Data were collected for 3,200 high school students randomly selected from those who took the examination. The resulting sample had 16 raters who scored 200 students each, with each student rated by only 1 rater. The…
Descriptors: Evaluators, High School Students, High Schools, Interrater Reliability
Stapleton, Laura M.; Edmonds, Meaghan – 2003
An exploratory reliability and validity study was conducted of a relatively new response scale developed in the marketing field. Unlike many Likert-type scales, the "unbounded write-in" scale is claimed to produce distributions that more closely approximate normal distributions. This type of scale has been used in large-scale marketing studies.…
Descriptors: Attitude Measures, Focus Groups, Higher Education, Rating Scales
Melnick, Steven A.; Coyle, H. Elizabeth – 2000
This paper reports on the development of an empirically valid and reliable assessment instrument that identifies areas of need in violence prevention skills within the student population. The completed instrument should allow a school district to choose a curriculum that aligns with their identified need at each developmentally-appropriate level…
Descriptors: Elementary Secondary Education, Evaluation Methods, Prevention, Reliability


