Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 4 |
| Since 2017 (last 10 years) | 6 |
| Since 2007 (last 20 years) | 13 |
Descriptor
| Test Format | 71 |
| Test Reliability | 61 |
| Test Validity | 49 |
| Test Interpretation | 33 |
| Standardized Tests | 32 |
| Testing | 32 |
| Test Content | 29 |
| Test Reviews | 25 |
| Test Construction | 24 |
| Disability Identification | 23 |
| Screening Tests | 21 |
| More ▼ | |
Source
Author
Publication Type
Education Level
| Higher Education | 5 |
| Postsecondary Education | 3 |
| Adult Education | 2 |
| Secondary Education | 2 |
| Elementary Education | 1 |
| Elementary Secondary Education | 1 |
| High Schools | 1 |
| Junior High Schools | 1 |
| Middle Schools | 1 |
Audience
| Practitioners | 7 |
| Teachers | 7 |
| Administrators | 4 |
| Researchers | 2 |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
McCaffrey, Daniel F.; Casabianca, Jodi M.; Ricker-Pedley, Kathryn L.; Lawless, René R.; Wendler, Cathy – ETS Research Report Series, 2022
This document describes a set of best practices for developing, implementing, and maintaining the critical process of scoring constructed-response tasks. These practices address both the use of human raters and automated scoring systems as part of the scoring process and cover the scoring of written, spoken, performance, or multimodal responses.…
Descriptors: Best Practices, Scoring, Test Format, Computer Assisted Testing
Cobern, William W.; Adams, Betty A. J. – International Journal of Assessment Tools in Education, 2020
What follows is a practical guide for establishing the validity of a survey for research purposes. The motivation for providing this guide is our observation that researchers, not necessarily being survey researchers per se, but wanting to use a survey method, lack a concise resource on validity. There is far more to know about surveys and survey…
Descriptors: Surveys, Test Validity, Test Construction, Test Items
NWEA, 2022
This technical report documents the processes and procedures employed by NWEA® to build and support the English MAP® Reading Fluency™ assessments administered during the 2020-2021 school year. It is written for measurement professionals and administrators to help evaluate the quality of MAP Reading Fluency. The seven sections of this report: (1)…
Descriptors: Achievement Tests, Reading Tests, Reading Achievement, Reading Fluency
Koizumi, Rie – Language Assessment Quarterly, 2022
In Japanese secondary schools, speaking assessment in English classrooms is designed, conducted, and scored by teachers. Although the assessment is intended to be used for summative and formative purposes, it is not regularly or adequately practiced. This paper reports the problems (i.e., lack of continuous speaking assessment, limited speaking…
Descriptors: Secondary Schools, English (Second Language), Second Language Instruction, Language Tests
Nebraska Department of Education, 2024
The Nebraska Student-Centered Assessment System (NSCAS) is a statewide assessment system that embodies Nebraska's holistic view of students and helps them prepare for success in postsecondary education, career, and civic life. It uses multiple measures throughout the year to provide educators and decision-makers at all levels with the insights…
Descriptors: Student Evaluation, Evaluation Methods, Elementary School Students, Middle School Students
Magno, Carlo – UNESCO Bangkok, 2020
The COVID-19 pandemic has disrupted education across the globe leading countries to adapt how they administer and manage high-stakes examinations and large-scale learning assessments. This thematic review describes the measures that countries have taken, in terms of policies and practices, when learning assessments are disrupted by emergencies and…
Descriptors: High Stakes Tests, COVID-19, Pandemics, Cross Cultural Studies
Sriram, Rishi – NASPA - Student Affairs Administrators in Higher Education, 2014
When student affairs professionals assess their work, they often employ some type of survey. The use of surveys stems from a desire to objectively measure outcomes, a demand from someone else (e.g., supervisor, accreditation committee) for data, or the feeling that numbers can provide an aura of competence. Although surveys are effective tools for…
Descriptors: Surveys, Test Construction, Student Personnel Services, Test Use
Thomas, Jason E.; Hornsey, Philip E. – Journal of Instructional Research, 2014
Formative Classroom Assessment Techniques (CAT) have been well-established instructional tools in higher education since their exposition in the late 1980s (Angelo & Cross, 1993). A large body of literature exists surrounding the strengths and weaknesses of formative CATs. Simpson-Beck (2011) suggested insufficient quantitative evidence exists…
Descriptors: Classroom Techniques, Nontraditional Education, Adult Education, Formative Evaluation
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Ediger, Marlow – Education, 2010
Data driven decision making emphasizes the importance of the teacher using objective sources of information in developing the social studies curriculum. Too frequently, decisions of teachers have been made based on routine and outdated methods of teaching. Valid and reliable tests used to secure results from pupil learning make for better…
Descriptors: Data, Decision Making, Social Studies, Standardized Tests
Schumacker, Randall E.; Smith, Everett V., Jr. – Educational and Psychological Measurement, 2007
Measurement error is a common theme in classical measurement models used in testing and assessment. In classical measurement models, the definition of measurement error and the subsequent reliability coefficients differ on the basis of the test administration design. Internal consistency reliability specifies error due primarily to poor item…
Descriptors: Measurement Techniques, Error of Measurement, Item Sampling, Item Response Theory
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
van der Linden, Wim J.; Vos, Hans J.; Chang, Lei – 2000
In judgmental standard setting experiments, it may be difficult to specify subjective probabilities that adequately take the properties of the items into account. As a result, these probabilities are not consistent with each other in the sense that they do not refer to the same borderline level of performance. Methods to check standard setting…
Descriptors: Interrater Reliability, Judges, Probability, Standard Setting
Henson, Robin K. – 2000
The purpose of this paper is to highlight some psychometric cautions that should be observed when seeking to develop short form versions of tests. Several points are made: (1) score reliability is impacted directly by the characteristics of the sample and testing conditions; (2) sampling error has a direct influence on reliability and factor…
Descriptors: Factor Structure, Psychometrics, Reliability, Sampling
Peer reviewedSimon, Alan J.; Joiner, Lee M. – Journal of Educational Measurement, 1976
The purpose of this study was to determine whether a Mexican version of the Peabody Picture Vocabulary Test could be improved by directly translating both forms of the American test, then using decision procedures to select the better item of each pair. The reliability of the simple translations suffered. (Author/BW)
Descriptors: Early Childhood Education, Spanish, Test Construction, Test Format

Direct link
