Publication Date
| In 2026 | 0 |
| Since 2025 | 621 |
| Since 2022 (last 5 years) | 3121 |
| Since 2017 (last 10 years) | 7362 |
| Since 2007 (last 20 years) | 15000 |
Descriptor
| Test Reliability | 15006 |
| Test Validity | 10245 |
| Reliability | 9748 |
| Foreign Countries | 7119 |
| Test Construction | 4807 |
| Validity | 4189 |
| Measures (Individuals) | 3872 |
| Factor Analysis | 3820 |
| Psychometrics | 3513 |
| Interrater Reliability | 3117 |
| Correlation | 3037 |
Audience
| Researchers | 709 |
| Practitioners | 451 |
| Teachers | 208 |
| Administrators | 122 |
| Policymakers | 66 |
| Counselors | 42 |
| Students | 38 |
| Parents | 11 |
| Community | 7 |
| Support Staff | 6 |
| Media Staff | 5 |
Location
| Turkey | 1319 |
| Australia | 436 |
| Canada | 379 |
| China | 368 |
| United States | 271 |
| United Kingdom | 256 |
| Indonesia | 249 |
| Taiwan | 234 |
| Netherlands | 223 |
| Spain | 216 |
| California | 214 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 8 |
| Meets WWC Standards with or without Reservations | 9 |
| Does not meet standards | 6 |
Wangerin, Paul T. – 1994
This paper addresses problems confronting law school teachers in grading law school exams and assigning letter grades. Using prototypical dialogue and scenarios, the paper examines mathematical and statistical issues that contribute to grading errors. Discussed in relation to real world data and the bar exam are: differential weighting, combining…
Descriptors: Civil Rights, Court Litigation, Educational Malpractice, Error of Measurement
Martin, Michael O., Ed.; Kelly, Dana L., Ed. – 1996
The Third International Mathematics and Science Study (TIMSS) developed and administered tests and questionnaires in three student populations to document the quality of mathematics and science education in 45 participating countries. Study design, instrument development, and research procedures were achieved through a complex collaborative…
Descriptors: Academic Achievement, Comparative Analysis, Data Collection, Elementary Secondary Education
Vansickle, Timothy R. – 1992
The scaling of a new assessment is a significant undertaking. The scaling of a new assessment designed as a multiple-level, criterion-referenced assessment is even more so. A Guttman approach to scaling was used with the Work Keys selected-response assessments, Reading for Information and Applied Mathematics. Assessments in development in the Work…
Descriptors: Criterion Referenced Tests, Employment Qualifications, High School Students, High Schools
Office of Inspector General (ED), San Francisco, CA. Region IX. – 1994
This report of the San Francisco (California) Regional Office of the Inspector General concludes that the U.S. Department of Education should improve its present method of allocating special education funds among the states. It finds that the basis of these allocations, reported numbers of students receiving special education in each state, is…
Descriptors: Accountability, Categorical Aid, Data Collection, Demography
Baxter, Gail P.; And Others – 1994
The degree to which performance assessments meet their dual mandate to evaluate student learning and inform instructional practice is not adequately addressed through traditional concerns for reliability and validity. A possible approach is suggested for examining the cognitive activity students engage in during a performance assessment. The…
Descriptors: Cognitive Processes, Cognitive Psychology, Educational Assessment, Educational Practices
Koretz, Daniel M.; And Others – 1991
Detailed evidence is presented about the extent of generalization from high-stakes tests to other tests and about the instructional effects of high-stakes testing. Data are from grade 3 of a large, high-poverty urban district with large numbers of Black and Hispanic American students. The district's results in 1990 for two tests, designated Test B…
Descriptors: Academic Achievement, Accountability, Achievement Tests, Black Students
Cheung, K. C. – 1993
In the past decade, there has been ample interest in the assessment of cognitive and affective processes and products for the purposes of meaningful learning. Meaningful measurement (MM) has been proposed, in accordance with a humanistic constructivist information-processing perspective. Students' responses to the assessment tasks are…
Descriptors: Affective Behavior, Cognitive Processes, Constructivism (Learning), Educational Assessment
Ross, John A.; Cousins, J. Bradley – 1993
Researchers have used children's self-reports to investigate the conditions under which children seek and give help. Little attention has been given to examining the predictive value of such measures, even though investigators in other domains have found discrepancies between self-reports and observed behavior. Two studies were conducted in which…
Descriptors: Children, Classroom Observation Techniques, Cooperative Learning, Correlation
Hall, William; Saunders, John – 1993
This booklet has been written to help persons interested in assessment of education and training programs in general and competency-based vocational education programs in particular. The following topics are covered in the individual sections: the meaning of the term "assessment"; the importance of assessment; curriculum models; percentages and…
Descriptors: Annotated Bibliographies, Competency Based Education, Criterion Referenced Tests, Evaluation Methods
Pino, Barbara Gonzalez – Texas Papers in Foreign Language Education, 1998
Previous literature on classroom testing of second language speech skills provides several models of both task types and rubrics for rating, and suggestions regarding procedures for testing speaking with large numbers of learners. However, there is no clear, widely disseminated consensus in the profession on the appropriate paradigm to guide the…
Descriptors: College Instruction, Evaluation Criteria, Higher Education, Interrater Reliability
Mohadjer, Leyla; West, Jerry – 1992
The National Household Education Survey (NHES) was conducted for the first time in 1991 as a way to collect data on the early childhood education experiences of young children and participation in adult education. Because the NHES methodology is relatively new, field tests were necessary. A large field test of approximately 15,000 households was…
Descriptors: Adult Education, Black Students, Data Collection, Demography
Myford, Carol M. – 1991
The aesthetic judgments of experts (casting directors and high school drama teachers), theater buffs, and novices were compared as they rated high school students' videotaped performances of Shakespearean monologues. It was hypothesized that theater buffs would represent an intermediate stage on the path to developing expertise in judging acting…
Descriptors: Ability, Acting, Aesthetic Values, Art Criticism
Wolfe, Edward W.; Manalo, Jonathan R. – ETS Research Report Series, 2005
This study examined scores from 133,906 operationally scored Test of English as a Foreign Language™ (TOEFL®) essays to determine whether the choice of composition medium has any impact on score quality for subgroups of test-takers. Results of analyses demonstrate that (a) scores assigned to word-processed essays are slightly more reliable than…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores
Sefer, Jasmina – 1987
The validity and reliability of the Yugoslavian (Beograd) version of the Hungarian adaptation of the Torrance Divergent Capacities Test (HAT-DAT) were tested, with a view toward improving the methodology of scoring the creative abilities test and determining standards for Yugoslavia. The test, based on the work of J. P. Guilford (1977), examines…
Descriptors: Art Products, Childrens Art, Cognitive Ability, Creative Thinking
Veccia, Ellen M.; Schroeder, David H. – 1990
As a measure of musical aptitude, a new 90-item Pitch Discrimination Test was developed, and its internal structure was examined. Each of the three sections of the test measures an individual's aptitude for pitch discrimination in a different frequency range using square-wave tones generated by a personal computer. A total of 1,303 examinees,…
Descriptors: Ability Identification, Adults, Aptitude Tests, Auditory Discrimination