ERIC - Search Results

Showing all 8 results

How Much Can We Reliably Know about What Examinees Know?

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J. – Measurement: Interdisciplinary Research and Perspectives, 2009

In this commentary, the authors discuss some of the issues regarding the use of diagnostic classification models that practitioners should keep in mind. In the authors' experience, these issues are not as well known as they should be. The authors then provide recommendations on diagnostic scoring.

Descriptors: Scoring, Reliability, Validity, Classification

The Choice of Scale for Educational Measurement: An IRT Perspective.

Yen, Wendy M. – 1984

Two of the most popular methods for obtaining equal-interval scales for educational measurement are discussed: Thurstone's method and Item Response Theory (IRT). Between-grade growth on these scales is compared; while unstandardized differences show different trends for the two scales, standardized differences that take standard deviations into…

Descriptors: Academic Achievement, Achievement Tests, Educational Research, Latent Trait Theory

Multistage Complexity in Language Proficiency Assessment: A Framework for Aligning Theoretical Perspectives, Test Development, and Psychometrics

Peer reviewed

Luecht, Richard M. – Foreign Language Annals, 2003

This article contends that the necessary links between constructs and test scores/decisions in language assessment must be established through principled design procedures that align three models: (1) a theoretical construct model; (2) a test development model; and (3) a psychometric scoring model. The theoretical construct model articulates the…

Descriptors: Scoring, Psychometrics, Language Proficiency, Language Tests

Selection of Judges for Standard-Setting.

Peer reviewed

Jaeger, Richard M. – Educational Measurement: Issues and Practice, 1991

Issues concerning the selection of judges for standard setting are discussed. Determining the consistency of judges' recommendations, or their congruity with other expert recommendations, would help in selection. Enough judges must be chosen to allow estimation of recommendations by an entire population of judges. (SLD)

Descriptors: Cutting Scores, Evaluation Methods, Evaluators, Examiners

Validity of Psychological Assessment: Validation of Inferences from Persons' Responses and Performances as Scientific Inquiry into Score Meaning.

Peer reviewed

Messick, Samuel – American Psychologist, 1995

Presents a comprehensive review of validity that includes an empirical evaluation of the actual and potential consequences of score interpretation and use, how those consequences come about, and what determines them. Six distinguishable aspects of construct validity are highlighted as a means of addressing central issues implicit in the notion of…

Descriptors: Concurrent Validity, Construct Validity, Content Validity, Models

Latent Traits or Latent States? The Role of Discrete Models for Ability and Performance.

Haertel, Edward H. – 1992

Classical test theory, item response theory, and generalizability theory all treat the abilities to be measured as continuous variables, and the items of a test as independent probes of underlying continua. These models are well-suited to measuring the broad, diffuse traits of traditional differential psychology, but not for measuring the outcomes…

Descriptors: Ability, Data Analysis, Error of Measurement, Generalizability Theory

Collaborative Modelling and Talk in the Classroom.

Peer reviewed

Wilson, John; Haugh, Brian – Language and Education, 1995

Argues that the method of "collaborative modelling" developed to teach reading skills may be utilized in generating and assessing pupil talk within the classroom. Pupil pairs were given different texts from science, English, and geography and asked to re-present them in another form. Results indicate the value of the talk emerging from…

Descriptors: Case Studies, Class Activities, Classroom Communication, Cooperation

Development of a Process To Assess Higher Order Thinking Skills for College Graduates.

Rock, Donald A. – 1991

Issues in the development of assessments of higher order thinking skills for college graduates are discussed in the order in which they were presented when this series of papers was commissioned. With regard to Issue 1, it is generally agreed that the development of these skills is a desirable goal, but there is little consensus on how they should…

Descriptors: Adult Literacy, Cognitive Measurement, College Graduates, Communication Skills
