Publication Date
- In 2025: 0
- Since 2024: 0
- Since 2021 (last 5 years): 0
- Since 2016 (last 10 years): 2
- Since 2006 (last 20 years): 3

Descriptor
- True Scores: 49
- Item Response Theory: 14
- Equated Scores: 13
- Test Items: 13
- Error of Measurement: 11
- Reliability: 8
- Test Reliability: 7
- Latent Trait Theory: 6
- Mathematical Models: 6
- Academic Achievement: 5
- Achievement Tests: 5
Publication Type
- Speeches/Meeting Papers: 49
- Reports - Research: 26
- Reports - Evaluative: 16
- Reports - Descriptive: 5
- Journal Articles: 2
- Numerical/Quantitative Data: 2
- Information Analyses: 1
- Opinion Papers: 1

Education Level
- Higher Education: 1

Audience
- Researchers: 6
- Administrators: 1
- Practitioners: 1

Location
- China: 1
- Illinois: 1
- New York: 1
- Virgin Islands: 1

Laws, Policies, & Programs
- Elementary and Secondary…: 1
Wang, Tianqi; Jing, Xia; Li, Qi; Gao, Jing; Tang, Jie – International Educational Data Mining Society, 2019
Massive Open Online Courses (MOOCs) have grown increasingly popular, attracting large numbers of students worldwide. A popular course may enroll thousands of students, which makes it infeasible for instructors to grade every submission. Peer assessment is thus an…
Descriptors: Peer Evaluation, Accuracy, Grades (Scholastic), Grading
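The aggregation problem these peer-assessment papers address can be illustrated with a minimal sketch. This is a hypothetical one-step reliability weighting, not the authors' model: graders who deviate more from the per-submission median consensus are down-weighted before averaging.

```python
from statistics import median

def aggregate_peer_grades(grades, damping=1.0):
    """Illustrative peer-grade aggregation (a sketch, not the paper's method).
    `grades` maps grader -> {submission: score}. Each grader is weighted by
    the inverse of their mean absolute deviation from the median consensus."""
    submissions = {s for g in grades.values() for s in g}
    consensus = {s: median(g[s] for g in grades.values() if s in g)
                 for s in submissions}
    weight = {}
    for grader, g in grades.items():
        mad = sum(abs(g[s] - consensus[s]) for s in g) / len(g)
        weight[grader] = 1.0 / (damping + mad)
    final = {}
    for s in submissions:
        scored = [(weight[gr], g[s]) for gr, g in grades.items() if s in g]
        total_w = sum(w for w, _ in scored)
        final[s] = sum(w * x for w, x in scored) / total_w
    return final

# Hypothetical data: g3 is an outlier grader and gets down-weighted.
grades = {
    "g1": {"a": 8, "b": 6},
    "g2": {"a": 8, "b": 7},
    "g3": {"a": 2, "b": 10},
}
final = aggregate_peer_grades(grades)
print(final["a"])  # pulled only slightly below 8 by the outlier
```

Published models typically estimate grader reliability and true scores jointly (e.g., by iterating this kind of reweighting to convergence) rather than in a single step.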
Han, Yong; Wu, Wenjun; Ji, Suozhao; Zhang, Lijun; Zhang, Hui – International Educational Data Mining Society, 2019
Peer grading is commonly adopted by instructors as an effective assessment method for MOOCs (Massive Open Online Courses) and SPOCs (Small Private Online Courses). To address the problems caused by the varied skill levels and attitudes of online students, statistical models have been proposed to improve the fairness and accuracy of peer grading.…
Descriptors: Peer Evaluation, Grading, Online Courses, Computer Assisted Testing
Herman, William E.; Nelson, Gena C. – Online Submission, 2009
This study compared college student reported grade point averages (GPA) with actual GPA as recorded at the Registrar's Office to determine the accuracy of student reported GPA. Results indicated that, on average, students reported slightly higher GPA than their actual GPA. Additionally, females were virtually as accurate as males and students with…
Descriptors: Grade Point Average, Research Problems, Statistical Bias, True Scores

Dimitrov, Dimiter M. – 2003
This paper provides formulas for expected true-score measures and reliability of binary items as a function of their Rasch difficulty parameters when the trait distribution is normal or logistic. With the proposed formula, one can evaluate the theoretical values of classical reliability indexes for norm-referenced and criterion-referenced…
Descriptors: Cutting Scores, Item Response Theory, Reliability, True Scores
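The quantity Dimitrov works with can be sketched numerically. Assuming a standard normal trait distribution, the expected (marginal) true score of a Rasch item is the integral of its response function against the normal density; the snippet below approximates it with a trapezoidal rule (an illustrative computation, not Dimitrov's closed-form formulas).

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response to a Rasch item with difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def expected_true_score(b, n=2001, lo=-6.0, hi=6.0):
    """Marginal true score E[P(theta)] under a standard normal trait,
    approximated by the trapezoidal rule on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        theta = lo + i * step
        w = 0.5 if i in (0, n - 1) else 1.0  # trapezoid endpoint weights
        phi = math.exp(-0.5 * theta * theta) / math.sqrt(2 * math.pi)
        total += w * rasch_p(theta, b) * phi
    return total * step

# An average-difficulty item (b = 0) has expected true score 0.5 by symmetry;
# harder items (b > 0) fall below it, easier items (b < 0) above it.
print(expected_true_score(0.0))  # ≈ 0.5
```

Closed-form results like Dimitrov's avoid this numerical integration entirely, which is their practical appeal.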
Stocking, Martha L.; And Others – 1988
A sequence of simulations was carried out to aid in the diagnosis and interpretation of equating differences found between random and matched (nonrandom) samples for four commonly used equating procedures: (1) Tucker linear observed-score equating; (2) Levine equally reliable linear observed-score equating; (3) equipercentile curvilinear…
Descriptors: Equated Scores, Item Response Theory, Sample Size, Simulation
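For orientation, the simplest member of the family compared above is linear observed-score equating. The sketch below shows the random-groups case only; the Tucker and Levine procedures studied by Stocking et al. replace the raw moments with synthetic-population estimates that use common items to adjust for group differences.

```python
from statistics import mean, pstdev

def linear_equate(x_scores, y_scores):
    """Random-groups linear observed-score equating: return a function
    mapping Form X scores onto the Form Y scale by matching the two
    forms' means and standard deviations."""
    mx, my = mean(x_scores), mean(y_scores)
    slope = pstdev(y_scores) / pstdev(x_scores)
    return lambda x: my + slope * (x - mx)

# Hypothetical score samples: Form X is harder (lower mean), so X scores
# map to higher Form Y equivalents.
eq = linear_equate([10, 12, 14, 16, 18], [12, 14, 16, 18, 20])
print(eq(14))  # 16.0: the Form X mean maps to the Form Y mean
```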
Rotou, Ourania; Elmore, Patricia B.; Headrick, Todd C. – 2001
This study investigated the number-correct scoring method based on different theories (classical true-score theory and multidimensional item response theory) when a standardized test requires more than one ability for an examinee to get a correct response. The number-correct scoring procedure that is widely used is the one that is defined in…
Descriptors: Item Response Theory, Scoring, Standardized Tests, Test Items
Dimitrov, Dimiter M. – 2003
This paper provides analytic evaluations of expected (marginal) true-score measures for binary items given their item response theory (IRT) calibration. Under the assumption of normal trait distributions, marginalized true scores, error variance, true score variance, and reliability for norm-referenced and criterion-referenced interpretations are…
Descriptors: Item Response Theory, Reliability, Test Construction, Test Items

Dimitrov, Dimiter M. – 2002
Exact formulas for classical error variance are provided for Rasch measurement with logistic distributions. An approximation formula with the normal ability distribution is also provided. With the proposed formulas, the additive contribution of individual items to the population error variance can be determined without knowledge of the other test…
Descriptors: Ability, Error of Measurement, Item Response Theory, Test Items
Hoffman, R. Gene; Wise, Lauress L. – 2000
Classical test theory is based on the concept of a true score for each examinee, defined as the expected or average score across an infinite number of repeated parallel tests. In most cases, there is only a score from a single administration of the test in question. The difference between this single observed score and the underlying true score is…
Descriptors: Achievement, Classification, Observation, Probability
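The classical model Hoffman and Wise start from can be made concrete with a small simulation (illustrative only): each parallel administration yields an observed score equal to the true score plus random error, and the mean over many forms converges to the true score while the spread of the errors is the standard error of measurement.

```python
import random
import statistics

def simulate_parallel_tests(true_score, sem, n_forms, seed=0):
    """Simulate repeated parallel administrations under classical test theory:
    observed = true + error, with error drawn from Normal(0, sem)."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, sem) for _ in range(n_forms)]

# Hypothetical examinee: true score 50, SEM 3, 10,000 parallel forms.
scores = simulate_parallel_tests(true_score=50.0, sem=3.0, n_forms=10_000)
print(round(statistics.mean(scores), 1))   # close to the true score, 50.0
print(round(statistics.stdev(scores), 1))  # close to the SEM, 3.0
```

In practice only one administration is observed, which is exactly the inferential gap the paper addresses.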
Lang, William Steve – Online Submission, 2005
This paper reports the analysis of results from a pilot effort to create and use a battery of instruments based on INTASC principles as indicators of teacher dispositions. The original conception of the battery was designed around a taxonomy of increasing levels of inference. This means that the intent to measure included multiple instruments in…
Descriptors: Cognitive Tests, True Scores, Teacher Certification, Pilot Projects
Klaas, Alan C. – 1975
Current usage and theory of the standard error of measurement call for a single standard error of measurement figure to be applied across all score levels. The study revealed that scoring variance is not constant across score levels: as scoring ability increases, scoring variance decreases. The assertion that low and high scoring subjects will…
Descriptors: Error of Measurement, Guessing (Tests), Scoring, Statistical Analysis
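A standard illustration of the point (not Klaas's own analysis) is Lord's binomial error model, under which the conditional SEM depends on the raw score rather than being a single constant.

```python
import math

def binomial_csem(x, k):
    """Lord's binomial-error conditional SEM for raw score x on a k-item
    test: sqrt(x * (k - x) / (k - 1)). It peaks at middle scores and
    shrinks to zero at the floor (0) and ceiling (k)."""
    return math.sqrt(x * (k - x) / (k - 1))

# On a hypothetical 40-item test, the conditional SEM is largest in the
# middle of the score range and smallest near the extremes.
k = 40
for x in (4, 20, 36):
    print(x, round(binomial_csem(x, k), 2))
# The single SEM assumed in common usage is, at best, an average of these.
```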
Brennan, Robert L. – 1990
In 1955, R. Levine introduced two linear equating procedures for the common-item non-equivalent populations design. His procedures make the same assumptions about true scores; they differ in terms of the nature of the equating function used. In this paper, two parameterizations of a classical congeneric model are introduced to model the variables…
Descriptors: Equated Scores, Equations (Mathematics), Mathematical Models, Research Design
Cook, Linda L.; And Others – 1983
The purpose of this study was to empirically examine the relationship between violations of the assumption of unidimensionality, as assessed by the factor analysis of item parcel data, and the quality of item response theory (IRT) true-score equating, as measured by score scale stability. The verbal section of the Scholastic Aptitude Test (SAT)…
Descriptors: College Entrance Examinations, Equated Scores, Factor Analysis, Latent Trait Theory
Lyu, C. Felicia; And Others – 1995
A smoothed version of standardization, which merges kernel smoothing with the traditional standardization differential item functioning (DIF) approach, was used to examine DIF for student-produced response (SPR) items on the Scholastic Assessment Test (SAT) I mathematics test at both the item and testlet levels. This nonparametric technique avoids…
Descriptors: Aptitude Tests, Item Bias, Mathematics Tests, Multiple Choice Tests
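The unsmoothed standardization DIF index underlying this study can be sketched as follows (an illustrative implementation with hypothetical counts, not the authors' kernel-smoothed version): condition on total score, take the focal-minus-reference difference in item proportion-correct at each score level, and average using the focal group's score distribution as weights.

```python
def standardization_dif(ref, foc):
    """Standardization DIF index. `ref` and `foc` map total-score level ->
    (n_examinees, n_correct). Conditional proportions correct are compared
    at each score level, weighted by the focal group's score distribution.
    Kernel smoothing replaces these raw conditional proportions with
    smoothed estimates, which helps at sparsely populated score levels."""
    n_focal_total = sum(n for n, _ in foc.values())
    index = 0.0
    for s in foc:
        n_f, c_f = foc[s]
        n_r, c_r = ref.get(s, (0, 0))
        if n_r == 0:
            continue  # no reference examinees at this score level
        index += (n_f / n_focal_total) * (c_f / n_f - c_r / n_r)
    return index

# Hypothetical counts per score level: (examinees, correct responses).
ref = {1: (50, 20), 2: (50, 30), 3: (50, 45)}
foc = {1: (40, 12), 2: (40, 20), 3: (40, 32)}
print(round(standardization_dif(ref, foc), 3))  # -0.1: focal group does worse at every level
```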
Cliff, Norman – 1984
In almost all applications of measurement there is some sort of response by a human subject. Almost always, the response scale is ordinal, but almost always it is treated as if it were an interval measure. Methods for treating data ordinally are currently being developed in three areas: ordinal analysis for questionnaire responses, ordinal…
Descriptors: Multiple Regression Analysis, Questionnaires, Research Problems, Scores