ERIC - Search Results

Showing all 12 results

Factor Scores in Clustered Data: An Evaluation of Methods to Obtain Level-1 and Level-2 Scores

Strauss, Christian L. L. – ProQuest LLC, 2022

In many psychological and educational applications, it is imperative to obtain valid and reliable score estimates of multilevel processes. For example, in order to assess the quality and characteristics of high impact learning processes, one must compute accurate scores representative of student- and classroom-level constructs. Currently, there…

Descriptors: Scores, Factor Analysis, Models, True Scores

Comparing the Efficacy of Fixed-Effects and MAIHDA Models in Predicting Outcomes for Intersectional Social Strata

Peer reviewed

Ben Van Dusen; Heidi Cian; Jayson Nissen; Lucy Arellano; Adrienne D. Woods – Sociology of Education, 2024

This investigation examines the efficacy of multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) over fixed-effects models when performing intersectional studies. The research questions are as follows: (1) What are typical strata representation rates and outcomes on physics research-based assessments? (2) To what…

Descriptors: Educational Research, Intersectionality, Critical Race Theory, STEM Education

An Extension of IRT-Based Equating to the Dichotomous Testlet Response Theory Model

Peer reviewed

Tao, Wei; Cao, Yi – Applied Measurement in Education, 2016

Current procedures for equating number-correct scores using traditional item response theory (IRT) methods assume local independence. However, when tests are constructed using testlets, one concern is the violation of the local item independence assumption. The testlet response theory (TRT) model is one way to accommodate local item dependence.…

Descriptors: Item Response Theory, Equated Scores, Test Format, Models

On the Relationship between Classical Test Theory and Item Response Theory: From One to the Other and Back

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2016

The frequently neglected and often misunderstood relationship between classical test theory and item response theory is discussed for the unidimensional case with binary measures and no guessing. It is pointed out that popular item response models can be directly obtained from classical test theory-based models by accounting for the discrete…

Descriptors: Test Theory, Item Response Theory, Models, Correlation

Standards-Based Grading: History Adjusted True Score

Peer reviewed

Hooper, Jay; Cowell, Ryan – Educational Assessment, 2014

There has been much research and discussion on the principles of standards-based grading, and there is a growing consensus of best practice. Even so, the actual process of implementing standards-based grading at a school or district level can be a significant challenge. There are very practical questions that remain unclear, such as how the grades…

Descriptors: True Scores, Grading, Academic Standards, Computation

Assessing First- and Second-Order Equity for the Common-Item Nonequivalent Groups Design Using Multidimensional IRT

Andrews, Benjamin James – ProQuest LLC, 2011

The equity properties can be used to assess the quality of an equating. The degree to which expected scores conditional on ability are similar between test forms is referred to as first-order equity. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of…

Descriptors: Test Format, Advanced Placement, Simulation, True Scores

Reliability Generalization: An Examination of the Positive Affect and Negative Affect Schedule

Peer reviewed

Leue, Anja; Lange, Sebastian – Assessment, 2011

The assessment of positive affect (PA) and negative affect (NA) by means of the Positive Affect and Negative Affect Schedule has received a remarkable popularity in the social sciences. Using a meta-analytic tool--namely, reliability generalization (RG)--population reliability scores of both scales have been investigated on the basis of a random…

Descriptors: Social Sciences, True Scores, Generalization, Affective Behavior

Subject-Centered Scalability: The Sine Qua Non of Summated Ratings

Peer reviewed

Drewes, Donald W. – Psychological Methods, 2009

A unifying theory of subject-centered scalability is offered that is grounded in structural true score modeling, is conceptually distinct from internal consistency and homogeneity as determined by item correlations, and is empirically confirmable. Scalability holds when item true scores are perfectly correlated but differ in their individual scale…

Descriptors: Rating Scales, Factor Analysis, True Scores, Mathematical Models

Coping with Memory Effect and Serial Correlation when Estimating Reliability in a Longitudinal Framework

Peer reviewed

Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony; Mallinckrodt, Craig H. – Applied Psychological Measurement, 2010

Longitudinal studies are permeating clinical trials in psychiatry. Therefore, it is of utmost importance to study the psychometric properties of rating scales, frequently used in these trials, within a longitudinal framework. However, intrasubject serial correlation and memory effects are problematic issues often encountered in longitudinal data.…

Descriptors: Psychiatry, Rating Scales, Memory, Psychometrics

Modeling Latent True Scores to Determine the Utility of Aggregate Student Perceptions as Classroom Indicators in HLM: The Case of Classroom Goal Structures

Peer reviewed

Miller, Angela D.; Murdock, Tamera B. – Contemporary Educational Psychology, 2007

Measures of classroom climate such as classroom goal structures are often assessed through students' perceptions; the aggregated means within classrooms are then sometimes labeled as "classroom characteristics." The validity of these constructs is limited by the reliability of the measure at both the student and classroom level; yet, few studies…

Descriptors: True Scores, Teacher Characteristics, Classroom Environment, Student Attitudes

Generating Dichotomous Item Scores with the Four-Parameter Beta Compound Binomial Model

Peer reviewed

Monahan, Patrick O.; Lee, Won-Chan; Ankenmann, Robert D. – Journal of Educational Measurement, 2007

A Monte Carlo simulation technique for generating dichotomous item scores is presented that implements (a) a psychometric model with different explicit assumptions than traditional parametric item response theory (IRT) models, and (b) item characteristic curves without restrictive assumptions concerning mathematical form. The four-parameter beta…

Descriptors: True Scores, Psychometrics, Monte Carlo Methods, Correlation

Construct Validity of "e-rater"® in Scoring TOEFL® Essays. Research Report. ETS RR-07-21

Peer reviewed

Attali, Yigal – ETS Research Report Series, 2007

This study examined the construct validity of the "e-rater"® automated essay scoring engine as an alternative to human scoring in the context of TOEFL® essay writing. Analyses were based on a sample of students who repeated the TOEFL within a short time period. Two "e-rater" scores were investigated in this study, the first…

Descriptors: Construct Validity, Computer Assisted Testing, Scoring, English (Second Language)
