Drewes, Donald W. – Psychological Methods, 2009
A unifying theory of subject-centered scalability is offered that is grounded in structural true score modeling, is conceptually distinct from internal consistency and homogeneity as determined by item correlations, and is empirically confirmable. Scalability holds when item true scores are perfectly correlated but differ in their individual scale…
Descriptors: Rating Scales, Factor Analysis, True Scores, Mathematical Models
Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony; Mallinckrodt, Craig H. – Applied Psychological Measurement, 2010
Longitudinal studies are permeating clinical trials in psychiatry. Therefore, it is of utmost importance to study the psychometric properties of rating scales, frequently used in these trials, within a longitudinal framework. However, intrasubject serial correlation and memory effects are problematic issues often encountered in longitudinal data.…
Descriptors: Psychiatry, Rating Scales, Memory, Psychometrics
DeMars, Christine E. – Journal of Educational and Behavioral Statistics, 2009
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Descriptors: Regression (Statistics), Test Bias, Error of Measurement, True Scores
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Rowan, Anna Habash; Hall, Daria; Haycock, Kati – Education Trust, 2010
Leaders in schools, districts, and states, along with policymakers in Washington, D.C., are focusing new energy on closing long-standing gaps in performance that separate low-income students and students of color from others. It's critically important that their efforts succeed--for students, their families, their communities, and for their…
Descriptors: Elementary Secondary Education, Academic Achievement, National Competency Tests, True Scores
Gierl, Mark J.; Cui, Ying; Zhou, Jiawen – Journal of Educational Measurement, 2009
The attribute hierarchy method (AHM) is a psychometric procedure for classifying examinees' test item responses into a set of structured attribute patterns associated with different components from a cognitive model of task performance. Results from an AHM analysis yield information on examinees' cognitive strengths and weaknesses. Hence, the AHM…
Descriptors: Test Items, True Scores, Psychometrics, Algebra
Kang, Taehoon; Chen, Troy T. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2009
In this report, an alternative item response theory (IRT) observed score equating method was developed. The proposed equating method was illustrated with two real data sets, and the equating results were compared to those of traditional IRT true score and IRT observed score equating methods. Using three loss indices, the new method appeared…
Descriptors: Equated Scores, Item Response Theory, True Scores, Methods
Taft, Casey T.; Watkins, Laura E.; Stafford, Jane; Street, Amy E.; Monson, Candice M. – Journal of Consulting and Clinical Psychology, 2011
Objective: The authors conducted a meta-analysis of empirical studies investigating associations between indices of posttraumatic stress disorder (PTSD) and intimate relationship problems to empirically synthesize this literature. Method: A literature search using PsycINFO, Medline, Published International Literature on Traumatic Stress (PILOTS),…
Descriptors: Aggression, Posttraumatic Stress Disorder, Doctoral Dissertations, Error of Measurement
Haberman, Shelby J.; Qian, Jiahe – Journal of Educational and Behavioral Statistics, 2007
Statistical prediction problems often involve both a direct estimate of a true score and covariates of this true score. Given the criterion of mean squared error, this study determines the best linear predictor of the true score given the direct estimate and the covariates. Results yield an extension of Kelley's formula for estimation of the true…
Descriptors: Prediction, Regression (Statistics), True Scores, Correlation
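For orientation, the classical Kelley formula that this abstract refers to can be sketched as below. This is only the standard single-score version (reliability times the observed score plus one minus reliability times the group mean), not the covariate extension developed in the article; the function name `kelley_estimate` and the sample numbers are illustrative assumptions, not taken from the source.

```python
def kelley_estimate(observed: float, group_mean: float, reliability: float) -> float:
    """Classical Kelley regressed estimate of a true score.

    Shrinks the observed score toward the group mean in proportion to
    the unreliability of the measure. Hypothetical helper for illustration.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return reliability * observed + (1.0 - reliability) * group_mean


if __name__ == "__main__":
    # Assumed example values: observed score 80, group mean 70, reliability 0.75.
    print(kelley_estimate(observed=80.0, group_mean=70.0, reliability=0.75))  # 77.5
```

With perfect reliability the estimate equals the observed score, and with zero reliability it collapses to the group mean.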
Hagge, Sarah Lynn – ProQuest LLC, 2010
Mixed-format tests containing both multiple-choice and constructed-response items are widely used in educational testing. Such tests combine the broad content coverage and efficient scoring of multiple-choice items with the assessment of higher-order thinking skills thought to be provided by constructed-response items. However, the combination of…
Descriptors: Test Format, True Scores, Equated Scores, Psychometrics
Katz, Stanley N. – Chronicle of Higher Education, 2008
How should one think about assessment in general education--or what is sometimes called liberal education--in the pluralistic environment of American higher education? Generalizations about longitudinal collegiate assessment are difficult, not least because of the remarkable range of four-year institutions and the students who attend them.…
Descriptors: Undergraduate Study, General Education, Outcomes of Education, True Scores
Herman, William E.; Nelson, Gena C. – Online Submission, 2009
This study compared college student reported grade point averages (GPA) with actual GPA as recorded at the Registrar's Office to determine the accuracy of student reported GPA. Results indicated that, on average, students reported slightly higher GPA than their actual GPA. Additionally, females were virtually as accurate as males and students with…
Descriptors: Grade Point Average, Research Problems, Statistical Bias, True Scores
MacCann, Robert G. – Educational and Psychological Measurement, 2008
It is shown that the Angoff and bookmarking cut scores are examples of true score equating that in the real world must be applied to observed scores. In the context of defining minimal competency, the percentage "failed" by such methods is a function of the length of the measuring instrument. It is argued that this length is largely…
Descriptors: True Scores, Cutting Scores, Minimum Competencies, Scores
von Davier, Alina A.; Fournier-Zajac, Stephanie; Holland, Paul W. – ETS Research Report Series, 2007
In the nonequivalent groups with anchor test (NEAT) design, there are several ways to use the information provided by the anchor in the equating process. One of the NEAT-design equating methods is the linear observed-score Levine method (Kolen & Brennan, 2004). It is based on a classical test theory model of the true scores on the test forms…
Descriptors: Equated Scores, Statistical Analysis, Test Items, Test Theory
Miller, Angela D.; Murdock, Tamera B. – Contemporary Educational Psychology, 2007
Measures of classroom climate such as classroom goal structures are often assessed through students' perceptions; the aggregated means within classrooms are then sometimes labeled as "classroom characteristics." The validity of these constructs is limited by the reliability of the measure at both the student and classroom level; yet, few studies…
Descriptors: True Scores, Teacher Characteristics, Classroom Environment, Student Attitudes