Peer reviewed: Ramsay, James O. – Psychometrika, 1989
An alternative to the Rasch model is introduced. It characterizes strength of response according to the ratio of ability and difficulty parameters rather than their difference. Joint estimation and marginal estimation models are applied to two test data sets. (SLD)
Descriptors: Ability, Bayesian Statistics, College Entrance Examinations, Comparative Analysis
Peer reviewed: Cahan, Sorel – Educational and Psychological Measurement, 1989
Statistical significance and "abnormality" have been used as criteria for the evaluation of intra-individual subtest score differences. Shortcomings of these criteria are identified, and improved estimates of the true score differences are suggested. The applicability of the abnormality criterion to these improved estimates is reviewed.…
Descriptors: Estimation (Mathematics), Evaluation Methods, Individual Differences, Mathematical Models
Peer reviewed: Suzuki, Shinobu; Rancer, Andrew S. – Communication Monographs, 1994
Finds that the two-factor solution of the Argumentativeness Scale and the Verbal Aggressiveness Scale was a reasonable overall fit to samples of both U.S. and Japanese college students; orthogonality of the two constructs (argumentativeness and verbal aggressiveness) held for both samples; and the two scales had satisfactory construct validity for…
Descriptors: Communication Research, Construct Validity, Cross Cultural Studies, Evaluation Methods
Peer reviewed: O'Grady, Kevin E.; Medoff, Deborah R. – Multivariate Behavioral Research, 1991
A procedure for evaluating a variety of rater reliability models is presented. A multivariate linear model is used to describe and assess a set of ratings. Parameters are represented in terms of a factor analytic model, and maximum likelihood methods test the model parameters. Illustrative examples are presented. (SLD)
Descriptors: Comparative Analysis, Correlation, Equations (Mathematics), Estimation (Mathematics)
Peer reviewed: Carroll, John B. – Intelligence, 1995
It is argued that the statements and accusations made by Stephen Jay Gould about the use of factor analysis are incorrect and unjustified and that tests properly designed for the purpose can adequately measure a "general" or "g" factor of intelligence, particularly in view of the developments in testing since "The…
Descriptors: Factor Analysis, Intelligence Tests, Measurement Techniques, Nature Nurture Controversy
Peer reviewed: Bachman, Lyle F. – Language Testing, 2000
Reviews developments in language testing research and practice over the last 20 years, and suggests future directions in the areas of professionalizing the field and validation research. Argues that concerns for ethical conduct must be grounded in valid test use, so that professionalization and validation research are inseparable. (Author/VWL)
Descriptors: Ethics, Language Research, Language Tests, Second Language Instruction
van der Linden, Wim J. – Applied Psychological Measurement, 2006
Traditionally, error in equating observed scores on two versions of a test is defined as the difference between the transformations that equate the quantiles of their distributions in the sample and population of test takers. But it is argued that if the goal of equating is to adjust the scores of test takers on one version of the test to make…
Descriptors: Equated Scores, Evaluation Criteria, Models, Error of Measurement
Hayward, Pamela A. – 1995
This review critiques the use of Lev Vygotsky's concept of the zone of proximal development (ZPD) in quantitative research that focuses on the role communication plays in learning. A study that makes claims in terms of the ZPD should include a pretest, a problem-solving activity, and a posttest. Without these minimal elements, researchers are not…
Descriptors: Communication Research, Communication (Thought Transfer), Learning Processes, Pretests Posttests
Lai, Morris K.; Saka, Thomas – 1993
Two studies investigated factors affecting the scores of Hawaii students taking the verbal subtest of the Scholastic Aptitude Test (SAT). For the past several years, the mean verbal scores of Hawaii students have consistently been among the lowest 10% of all states. The first study addressed the identification of items and types of items that have…
Descriptors: Comparative Analysis, High School Seniors, High Schools, Instructional Effectiveness
Ackerman, Terry A. – 1987
One of the important underlying assumptions of all item response theory (IRT) models is that of local independence. This assumption requires that the response to an item on a test not be influenced by the response to any other items. This assumption is often taken for granted, with little or no scrutiny of the response process required to answer…
Descriptors: Computer Software, Correlation, Estimation (Mathematics), Latent Trait Theory
Goulden, Nancy Rost – 1989
Since speech communication evaluators are beginning to adapt the analytic and holistic instruments and methods used for rating written products to oral products and performance, this research review investigated: (1) what the labels "analytic" and "holistic" mean; (2) the theoretical bases of the two scoring approaches; and (3)…
Descriptors: Comparative Analysis, Higher Education, Holistic Evaluation, Rating Scales
Randhawa, Bikkar S. – 1987
This paper discusses cognitive and noncognitive variables, and their relationship with each other, in learning and in educational evaluation. The nature of noncognitive learning environment variables is examined. The following instruments that have been widely used in assessing classroom environment are described: (1) the Learning Environment…
Descriptors: Attitude Measures, Classroom Environment, Cognitive Measurement, Construct Validity
Mitchell, Karen J. – 1984
The purpose of this research was to develop a model of verbal information processing for use in subsequent analyses of the construct and predictive validity of the current Department of Defense military selection and classification battery, the Armed Services Vocational Aptitude Battery (ASVAB) 8/9/10. The theory and research methods of selected…
Descriptors: Adults, Armed Forces, Cognitive Processes, Models
Lewis, Charles – 1982
The nonparametric approach to test theory discussed here has its roots in the early work of Guttman, Lazarsfeld, and Meredith; and more recently in the work of Cliff and in Tatsuoka and Tatsuoka. Mokken's extensive treatment of this subject concentrated on defining, constructing, and testing unidimensional scales, based on responses to dichotomous…
Descriptors: Computer Oriented Programs, Estimation (Mathematics), Item Analysis, Latent Trait Theory
Norris, Stephen P. – 1988
The problems of validity and fairness involved in multiple-choice critical thinking tests can be lessened by using verbal reports of examinees' thinking during the process of developing such tests in order to retain only those items which rely on critical thinking skills to obtain the correct answer. Multiple-choice testing can lead to unfair…
Descriptors: Critical Thinking, High School Students, High Schools, Multiple Choice Tests

