Wainer, Howard; And Others – 1992
Four researchers at the Educational Testing Service describe what they consider some of the most vexing research problems they face. While these problems are not completely statistical, they all have major statistical components. Following the introduction (section 1), in section 2, "Problems with the Simultaneous Estimation of Many True…
Descriptors: Adaptive Testing, Bayesian Statistics, Educational Research, Estimation (Mathematics)
Hamers, J. H. M., Ed.; Sijtsma, K., Ed.; Ruijssenaars, A. J. J. M., Ed. – 1993
The first part of this volume is concerned with theoretical and conceptual issues concerning learning potential assessment. The second part deals with methodological and measurement issues in learning potential assessment, and the third part is devoted to research projects and practical applications of learning potential tests. The following…
Descriptors: Cross Cultural Studies, Educational Assessment, Elementary Secondary Education, Intelligence Tests
Crowley, Susan L.; And Others – 1993
Issues surrounding accurate assessment of depression in children have received much attention. However, the stability of scores from depression measures has generally been estimated using only classical test score theory, rather than the more powerful generalizability theory. The dependability of scores from the Children's Depression Inventory…
Descriptors: Children, Clinical Diagnosis, Depression (Psychology), Diagnostic Tests
Gonzalez-Tamayo, Eulogio – 1987
The agreement between the Educational Testing Service (ETS) and the Golden Rule Insurance Company of Illinois is interpreted as setting the general principles on which items must be selected to be included in a licensure test. These principles put a limit to the difficulty level of any item, and they also limit the size of the difference in…
Descriptors: Analysis of Variance, Content Validity, Difficulty Level, Item Analysis
North Carolina State Dept. of Public Instruction, Raleigh. Div. of Accountability Services/Research. – 1990
To facilitate the proper technical use of the test scores obtained from the administration of the tests, the curricular and psychometric characteristics of the tests are described in a series of technical manuals. This manual, the eighth in the series, contains a description of the characteristics of the North Carolina Test of Geometry. This paper…
Descriptors: Curriculum Evaluation, Geometry, Mathematics Education, Mathematics Skills
Theunissen, Phiel J. J. M. – 1983
Any systematic approach to the assessment of students' ability implies the use of a model. The more explicit the model is, the more its users know about what they are doing and what the consequences are. The Rasch model is a strong model where measurement is a bonus of the model itself. It is based on four ideas: (1) separation of observable…
Descriptors: Ability Grouping, Difficulty Level, Evaluation Criteria, Item Sampling
Bejar, Isaac I. – 1986
This report summarizes the results of research designed to study the psychometric and technological feasibility of adaptive testing to assess spatial ability. Data were collected from high school students on two types of spatial items: three-dimensional cubes and hidden figure items. The analysis of the three-dimensional cubes focused on the fit of…
Descriptors: Adaptive Testing, Algorithms, Cognitive Measurement, Computer Assisted Testing
Gialluca, Kathleen A.; And Others – 1984
In this study, simulated and actual Air Force test data were used to compare the different procedures for equating mental tests: conventional (equipercentile and linear), Item Response Theory (IRT), and strong true-score theory (STST); data collection designs used were single-group, equivalent-groups, and anchor test. Equating transformations were…
Descriptors: Adults, Cognitive Ability, Cognitive Tests, Comparative Analysis
Forbes, Dean W. – 1986
For many years personalization of achievement testing has been impossible in all but the simplest forms. Recently, item response theory (IRT), or latent trait theory, has emerged as a valuable tool which brings far greater flexibility to the process than had previously been possible. The single parameter Rasch Model, a mathematical model developed…
Descriptors: Achievement Tests, Adaptive Testing, Computer Assisted Testing, Elementary Secondary Education
Torres, Rosalie T.; Harnisch, Delwyn L. – 1983
A review of functional literacy testing in the United States from 1955-82 is provided by summarizing the results of literacy assessment studies and synthesizing the major issues which they have engendered. These reviews are grouped with respect to content, criterion-related, and construct validity. The paper concludes by (1) summarizing some of…
Descriptors: Adults, Criterion Referenced Tests, Educational Diagnosis, Elementary Secondary Education
van der Linden, Wim J. – 1987
The use of Bayesian decision theory to solve problems in test-based decision making is discussed. Four basic decision problems are distinguished: (1) selection; (2) mastery; (3) placement; and (4) classification, the situation where each treatment has its own criterion. Each type of decision can be identified as a specific configuration of one or…
Descriptors: Bayesian Statistics, Classification, Decision Making, Foreign Countries
Schmidt, Hans-Jurgen – 1988
This study assumes that multiple choice test items generally provide the testee with several solutions, one of which is correct and the others of which are wrong. If pupils are unable to answer a question, one would expect that the wrong choices have equal chances of being selected. In many multiple choice items on stoichiometric calculation which…
Descriptors: Behavior Patterns, Chemistry, Computation, Performance
Dickinson, Terry L. – 1985
The general linear model was described, and the influence that measurement errors have on model parameters was discussed. In particular, the assumptions of classical true-score theory were used to develop algebraic relationships between the squared multiple correlation coefficient and the regression coefficients in the infallible and fallible…
Descriptors: Analysis of Covariance, Analysis of Variance, Correlation, Error of Measurement
Meyer, Linda A. – 1985
Following a review of empirical research on teacher feedback to students' wrong responses, this paper describes a paradigm for use with direct instruction materials, applying the feedback model to comprehension tasks from traditional reading and science textbooks. The next section details a classification system for wrong responses grouped into…
Descriptors: Elementary Education, Feedback, Miscue Analysis, Reading Comprehension
Quellmalz, Edys S.; Shaha, Steven – 1982
The potential of a cognitive model task analysis scheme (CMS) that specifies features of test problems shown by research to affect performance is explored. CMS describes the general skill area and the generic task or problem type. It elaborates features of the problem situation and required responses found by research to influence performance.…
Descriptors: Academic Achievement, Cognitive Measurement, Criterion Referenced Tests, Elementary Secondary Education