Publication Date
| Date Range | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 5 |
| Since 2022 (last 5 years) | 45 |
| Since 2017 (last 10 years) | 91 |
| Since 2007 (last 20 years) | 144 |
Descriptor
| Descriptor | Count |
| --- | --- |
| Test Format | 418 |
| Test Reliability | 418 |
| Test Validity | 243 |
| Test Construction | 135 |
| Test Items | 119 |
| Higher Education | 88 |
| Multiple Choice Tests | 68 |
| Foreign Countries | 67 |
| Testing | 65 |
| Test Interpretation | 61 |
| Comparative Analysis | 57 |
Audience
| Audience | Count |
| --- | --- |
| Practitioners | 33 |
| Teachers | 23 |
| Administrators | 18 |
| Researchers | 12 |
| Community | 1 |
| Counselors | 1 |
| Policymakers | 1 |
| Students | 1 |
| Support Staff | 1 |
Location
| Location | Count |
| --- | --- |
| New York | 9 |
| Turkey | 8 |
| California | 7 |
| Canada | 6 |
| Japan | 6 |
| Germany | 4 |
| United Kingdom | 4 |
| Georgia | 3 |
| Israel | 3 |
| France | 2 |
| Indonesia | 2 |
Laws, Policies, & Programs
| Law, Policy, or Program | Count |
| --- | --- |
| Individuals with Disabilities… | 1 |
| Job Training Partnership Act… | 1 |
| No Child Left Behind Act 2001 | 1 |
| Pell Grant Program | 1 |
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Macqueen, Susy; Harding, Luke – Language Testing, 2009
In 2002 the University of Cambridge Local Examinations Syndicate (UCLES) implemented a revised version of the Certificate of Proficiency in English (CPE). CPE, which is the highest level of the Main Suite of Cambridge ESOL exams, comprises five modules, "Reading," "Writing," "Use of English," "Listening" and "Speaking," the last of which is the…
Descriptors: Speech Communication, Test Reviews, Examiners, English (Second Language)
Alderson, J. Charles – Language Testing, 2009
In this article, the author reviews the TOEFL iBT, which is the latest version of the TOEFL, whose history stretches back to 1961. The TOEFL iBT was introduced in the USA, Canada, France, Germany and Italy in late 2005. Currently the TOEFL test is offered in two testing formats: (1) Internet-based testing (iBT); and (2) paper-based testing (PBT).…
Descriptors: Oral Language, Writing Tests, Listening Comprehension Tests, Test Reviews
New York State Education Department, 2014
This technical report provides an overview of the New York State Alternate Assessment (NYSAA), including a description of the purpose of the NYSAA, the processes used to develop and implement the NYSAA program, and stakeholder involvement in those processes. The purpose of this report is to document the technical aspects of the 2013-14 NYSAA.…
Descriptors: Alternative Assessment, Educational Assessment, State Departments of Education, Student Evaluation
Marshall, Robert C.; Wright, Heather Harris – American Journal of Speech-Language Pathology, 2007
Purpose: The Kentucky Aphasia Test (KAT) is an objective measure of language functioning for persons with aphasia. This article describes materials, administration, and scoring of the KAT; presents the rationale for development of test items; reports information from a pilot study; and discusses the role of the KAT in aphasia assessment. Method:…
Descriptors: Aphasia, Test Format, Language Tests, Expressive Language
Hoachlander, E. Gareth – Techniques: Making Education and Career Connections, 1998
Discusses state testing, various types of tests, and whether the increased attention to assessment is contributing to improved student learning. Describes uses of standardized multiple-choice, open-ended constructed response, essay, performance event, and portfolio methods. (JOW)
Descriptors: Academic Achievement, Student Evaluation, Test Format, Test Reliability
Peer reviewed: Streiner, David L.; Miller, Harold R. – Journal of Clinical Psychology, 1986
Numerous short forms of the Minnesota Multiphasic Personality Inventory have been proposed in the last 15 years. In each case, the initial enthusiasm has been replaced by questions about the clinical utility of the abbreviated version. Argues that the statistical properties of the test and reduced reliability due to shortening the scales…
Descriptors: Test Construction, Test Format, Test Length, Test Reliability
Peer reviewed: Benson, Philip G.; Dickinson, Terry L. – Educational and Psychological Measurement, 1983
The mixed standard scale is a rating format that allows researchers to count internally inconsistent response patterns. This study investigated the meaning of these counts, using 943 accountants as raters. The counts of internally inconsistent response patterns were not related to reliability as measured by Cronbach's alpha. (Author/BW)
Descriptors: Accountants, Adults, Error Patterns, Rating Scales
Peer reviewed: Torabi-Parizi, Rosa; Campbell, Noma Jo – Elementary School Journal, 1982
Investigates the effects of varying the placement of blanks and the number of options available in multiple-choice items on the reliability of fifth-grade students' scores. Results indicate that scores on three-choice item tests were not less reliable than scores on four-choice item tests. A similar result was found regarding the placement of…
Descriptors: Elementary Education, Elementary School Students, Scores, Test Format
Peer reviewed: Chambers, William V. – Social Behavior and Personality, 1985
Personal construct psychologists have suggested that various psychological functions explain differences in the stability of constructs. Among these functions are constellatory and loose construction. This paper argues that measurement error is a more parsimonious explanation of the differences in construct stability reported in these studies. (Author)
Descriptors: Error of Measurement, Test Construction, Test Format, Test Reliability
Peer reviewed: Grosse, Martin E.; Wright, Benjamin D. – Educational and Psychological Measurement, 1985
A model of examinee behavior was used to generate hypotheses about the operation of true-false scores. Confirmation of hypotheses supported the contention that true-false scores contain an error component that makes these tests less reliable than multiple-choice tests. Examinee response style may invalidate a total true-false score. (Author/DWH)
Descriptors: Objective Tests, Response Style (Tests), Test Format, Test Reliability
Peer reviewed: Wilcox, Rand R. – Educational and Psychological Measurement, 1982
Results in the engineering literature on "k out of n system reliability" can be used to characterize tests based on estimates of the probability of correctly determining whether the examinee knows the correct response. In particular, the minimum number of distractors required for multiple-choice tests can be empirically determined.…
Descriptors: Achievement Tests, Mathematical Models, Multiple Choice Tests, Test Format
Peer reviewed: Thompson, Martie; Kaslow, Nadine J.; Weiss, Bahr; Nolen-Hoeksema, Susan – Psychological Assessment, 1998
The psychometric properties of the Children's Attributional Style Questionnaire-Revised (CASQ) (N. Kaslow and S. Nolen-Hoeksema, 1991) were studied with 1,086 children, 9 to 12 years old. Results indicate that the revised version is somewhat less reliable than the original, but has equivalent criterion-related validity for self-reported depression.…
Descriptors: Attribution Theory, Concurrent Validity, Psychometrics, Racial Differences
Peer reviewed: Helfeldt, John P.; And Others – Journal of Educational Research, 1986
Performances of 64 sixth-grade readers on a traditional and three alternative types of cloze tests were compared. Results confirm and extend the findings of earlier studies investigating cloze alternatives. Advantages of the alternate forms are discussed. (Author/MT)
Descriptors: Cloze Procedure, Grade 6, Intermediate Grades, Reading Comprehension
Peer reviewed: Campbell, Todd; And Others – Educational and Psychological Measurement, 1997
The construct validity of scores from the Bem Sex-Role Inventory was studied using confirmatory factor analysis methods on data from 791 subjects. Measurement characteristics of the long and short forms were studied, with the short form yielding more reliable scores, as has previously been indicated. (Author/SLD)
Descriptors: Adults, Construct Validity, Factor Structure, Scores

