Publication Date
| In 2025 | 0 |
| Since 2024 | 0 |
| Since 2021 (last 5 years) | 0 |
| Since 2016 (last 10 years) | 1 |
| Since 2006 (last 20 years) | 5 |
Descriptor
| Test Theory | 112 |
| Test Items | 37 |
| Higher Education | 30 |
| Latent Trait Theory | 29 |
| Test Construction | 28 |
| Test Validity | 28 |
| Item Analysis | 26 |
| Mathematical Models | 25 |
| Test Reliability | 20 |
| Statistical Analysis | 18 |
| Scores | 17 |
Author
| Ackerman, Terry A. | 3 |
| Cope, Ronald T. | 3 |
| Engelhard, George, Jr. | 3 |
| Haladyna, Tom | 3 |
| Hutchinson, T. P. | 2 |
| Lockwood, Robert E. | 2 |
| Powell, J. C. | 2 |
| Roid, Gale | 2 |
| Thompson, Bruce | 2 |
| Yen, Wendy M. | 2 |
| Andrich, David | 1 |
Publication Type
| Reports - Research | 112 |
| Speeches/Meeting Papers | 112 |
| Tests/Questionnaires | 3 |
| Information Analyses | 2 |
| Numerical/Quantitative Data | 2 |
| Journal Articles | 1 |
| Reports - Evaluative | 1 |
Education Level
| High Schools | 3 |
| Higher Education | 3 |
| Secondary Education | 2 |
| Adult Education | 1 |
| Postsecondary Education | 1 |
Audience
| Researchers | 44 |
| Practitioners | 2 |
| Administrators | 1 |
Location
| Netherlands | 2 |
| Alabama | 1 |
| Canada | 1 |
| Hawaii | 1 |
| Lithuania | 1 |
| Mexico | 1 |
| New York | 1 |
| Texas | 1 |
| Turkey | 1 |
| United Kingdom (England) | 1 |
| Virginia | 1 |
Selvi, Hüseyin; Özdemir Alici, Devrim – International Journal of Assessment Tools in Education, 2018
This study investigates the impact of different missing-data handling methods on Differential Item Functioning detection methods (the Mantel-Haenszel and Standardization methods based on Classical Test Theory, and the Likelihood Ratio Test method based on Item Response Theory). In this regard, on the data acquired from 1046…
Descriptors: Test Bias, Test Theory, Item Response Theory, Multiple Choice Tests
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Herbst, Patricio; Dimmel, Justin; Erickson, Ander; Ko, Inah; Kosko, Karl W. – North American Chapter of the International Group for the Psychology of Mathematics Education, 2014
We describe the conceptualization, development, and piloting of two instruments--a survey and a scenario-based assessment--designed to assess teachers' recognition of an obligation to the discipline of mathematics and the extent to which teachers justify actions that deviate from what is normative on account of this obligation. We show how we…
Descriptors: Mathematics Teachers, Test Construction, Test Theory, Test Items
Moffett, David W.; Zhou, Yunfang – Online Submission, 2009
The investigators hypothesized that cooperating teachers' evaluations of candidates in clinical practice and field experiences would yield higher scores than those provided by clinical and education division faculty. However, the reasons for the higher scores proved to be much more complex than originally thought. While it was assumed that teachers…
Descriptors: Field Experience Programs, Cooperating Teachers, Student Teacher Supervisors, Clinical Supervision (of Teachers)
Engelhard, George, Jr. – 1988
The purpose of this essay is to describe the principles of educational measurement proposed by B. Wood during the 1920s in his dissertation, written under the direction of E. L. Thorndike, and later published as "Measurement in Higher Education" (1923). These principles were selected because they illustrate one of the earliest and most complete…
Descriptors: Educational History, Educational Testing, Test Theory, Testing Problems
Hwang, Dae-Yeop – 2002
This study compared classical test theory (CTT) and item response theory (IRT). The behavior of the item and person statistics derived from these two measurement frameworks was examined analytically and empirically using a data set obtained from BILOG (R. Mislevy and D. Bock, 1997). The example was a 15-item test with a sample size of 600…
Descriptors: Comparative Analysis, Measurement Techniques, Scores, Statistical Distributions
Hoffman, R. Gene; Wise, Lauress L. – 2000
Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is…
Descriptors: Achievement, Classification, Observation, Probability
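The true-score definition in the abstract above can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the authors' procedure: the function name `simulate_observed_scores`, the normally distributed error, and the numeric values are all hypothetical choices made for illustration. Each simulated administration yields an observed score equal to the true score plus a random error, and the average over many administrations approaches the true score.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Simulate repeated parallel-test scores for one examinee.

    Assumption (not from the source): the error on each administration
    is drawn independently from a normal distribution with mean zero.
    """
    rng = random.Random(seed)
    return [true_score + rng.gauss(0.0, error_sd)
            for _ in range(n_administrations)]


if __name__ == "__main__":
    # Hypothetical examinee with a true score of 50 and error SD of 5.
    scores = simulate_observed_scores(50.0, 5.0, 20000)
    print("single observed score:", round(scores[0], 2))
    print("mean over administrations:", round(statistics.mean(scores), 2))
```

A single observed score can sit several points away from 50, while the mean over many administrations lands close to it, which is the sense in which the true score is an expected value.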
Brown, James Dean; Ross, Jacqueline A. – 1993
This study investigates the Test of English as a Foreign Language (TOEFL), in particular the relative contributions to score dependability (analogous to classical theory reliability) of various numbers of items and subtests as well as the decision dependability at different cut points. Research questions that apply to the overall TOEFL battery and…
Descriptors: English (Second Language), Language Tests, Statistical Analysis, Test Reliability
Stone, Kathy Kees; And Others – 1983
Looking beyond the overall effectiveness of sensory stimulation, this study aimed to identify specific aspects of infant behavior most responsive to early stimulation. Subjects were 65 premature infants with a birth weight of less than 5 pounds, 8 ounces and a gestational age under 37 weeks. Experimental group members had completed a multimodal…
Descriptors: Comparative Analysis, Discriminant Analysis, Infant Behavior, Premature Infants
Andrich, David – 1984
Both the attenuation paradox of traditional test theory and the assumption of local independence in person-item response theory have caused problems in interpretation. This paper demonstrates that the two are related concepts, and, through this demonstration, both are clarified. It is demonstrated that the breakdown of local independence leads to…
Descriptors: Latent Trait Theory, Test Interpretation, Test Items, Test Reliability
Mumford, Michael D.; Mendoza, Jorge L. – 1983
The present paper reviews the techniques commonly used to correct an observed correlation coefficient for the simultaneous influence of attenuation and range restriction effects. It is noted that the procedure which is currently in use may be somewhat biased because it treats range restriction and attenuation as independent restrictive influences.…
Descriptors: Correlation, Measurement Techniques, Psychometrics, Research Problems
DeVito, Anthony J.; And Others – 1983
To assist the clinician or researcher in scale selection, four symposium papers discussed instruments available to measure test anxiety (TA), with special attention given to the newly-developed Test Anxiety Inventory (TAI). Following an integrative summary delivered by the chairperson (DeVito), the first paper (Conetta and Tryon) reviewed the two…
Descriptors: Affective Measures, Higher Education, Psychological Testing, Test Anxiety
Huynh, Huynh; Saunders, Joseph C. – 1979
Comparisons were made among various methods of estimating the reliability of pass-fail decisions based on mastery tests. The reliability indices that are considered are p, the proportion of agreements between two estimates, and kappa, the proportion of agreements corrected for chance. Estimates of these two indices were made on the basis of…
Descriptors: Cutting Scores, Error of Measurement, Mastery Tests, Reliability
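The two indices named in the abstract above, p and kappa, can be sketched as follows. This is a minimal illustration, assuming two sets of pass-fail classifications are available for the same examinees; it does not reproduce the estimation methods compared in the paper. The function name `agreement_indices` and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def agreement_indices(first: Sequence[bool],
                      second: Sequence[bool]) -> Tuple[float, float]:
    """Return (p, kappa) for two sets of pass/fail decisions.

    p is the observed proportion of agreements; kappa corrects p for
    the agreement expected by chance from the marginal pass rates.
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    pass_rate_1 = sum(first) / n
    pass_rate_2 = sum(second) / n
    # Chance agreement: both pass or both fail, assuming independence.
    p_chance = (pass_rate_1 * pass_rate_2
                + (1 - pass_rate_1) * (1 - pass_rate_2))
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


if __name__ == "__main__":
    # Hypothetical decisions for four examinees (True = pass).
    form_a = [True, True, False, False]
    form_b = [True, False, False, False]
    print(agreement_indices(form_a, form_b))
```

In the hypothetical data, three of four decisions agree, but half of that agreement is expected by chance given the marginal pass rates, so kappa comes out lower than p.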
Reckase, Mark D.; McKinley, Robert L. – 1984
The purpose of this paper is to present a generalization of the concept of item difficulty to test items that measure more than one dimension. Three common definitions of item difficulty were considered: the proportion of correct responses for a group of individuals; the probability of a correct response to an item for a specific person; and the…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores

