Showing all 13 results

Peer reviewed
Storme, Martin; Myszkowski, Nils; Baron, Simon; Bernard, David – Journal of Intelligence, 2019
Assessing job applicants' general mental ability online poses psychometric challenges due to the necessity of having brief but accurate tests. Recent research (Myszkowski & Storme, 2018) suggests that recovering distractor information through Nested Logit Models (NLM; Suh & Bolt, 2010) increases the reliability of ability estimates in…
Descriptors: Intelligence Tests, Item Response Theory, Comparative Analysis, Test Reliability
Peer reviewed
Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016
Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…
Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis
Peer reviewed
Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016
The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…
Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory
Engelhard, George, Jr.; Wind, Stefanie A. – College Board, 2013
The major purpose of this study is to examine the quality of ratings assigned to CR (constructed-response) questions in large-scale assessments from the perspective of Rasch Measurement Theory. Rasch Measurement Theory provides a framework for the examination of rating scale category structure that can yield useful information for interpreting the…
Descriptors: Measurement Techniques, Rating Scales, Test Theory, Scores
Lee, Eunjung; Lee, Won-Chan; Brennan, Robert L. – College Board, 2012
In almost all high-stakes testing programs, test equating is necessary to ensure that test scores across multiple test administrations are equivalent and can be used interchangeably. Test equating becomes even more challenging in mixed-format tests, such as Advanced Placement Program® (AP®) Exams, that contain both multiple-choice and constructed…
Descriptors: Test Construction, Test Interpretation, Test Norms, Test Reliability
Peer reviewed
Arendasy, Martin E.; Sommer, Markus – Intelligence, 2013
Allowing respondents to retake a cognitive ability test has been shown to increase their test scores. Several theoretical models have been proposed to explain this effect, which make distinct assumptions regarding the measurement invariance of psychometric tests across test administration sessions with regard to narrower cognitive abilities and general…
Descriptors: Cognitive Tests, Testing, Repetition, Scores
Peer reviewed
Kubinger, Klaus D. – Educational and Psychological Measurement, 2009
The linear logistic test model (LLTM) breaks down the item parameter of the Rasch model as a linear combination of some hypothesized elementary parameters. Although the original purpose of applying the LLTM was primarily to generate test items with specified item difficulty, there are still many other potential applications, which may be of use…
Descriptors: Models, Test Items, Psychometrics, Item Response Theory
Peer reviewed
Yi, Hyun Sook; Kim, Seonghoon; Brennan, Robert L. – Applied Psychological Measurement, 2007
Large-scale testing programs involving classification decisions typically have multiple forms available and conduct equating to ensure cut-score comparability across forms. A test developer might be interested in the extent to which an examinee who happens to take a particular form would have a consistent classification decision if he or she had…
Descriptors: Classification, Reliability, Indexes, Computation
Peer reviewed
Chang, Lei – Applied Psychological Measurement, 1994
Reliability and validity of 4-point and 6-point scales were assessed using a new model-based approach to fit empirical data from 165 graduate students completing an attitude measure. Results suggest that the issue of four- versus six-point scales may depend on the empirical setting. (SLD)
Descriptors: Attitude Measures, Goodness of Fit, Graduate Students, Graduate Study
Tatsuoka, Kikumi K. – 1991
Constructed-response formats are desired for measuring complex and dynamic response processes that require the examinee to understand the structures of problems and micro-level cognitive tasks. These micro-level tasks and their organized structures are usually unobservable. This study shows that elementary graph theory is useful for organizing…
Descriptors: Adult Literacy, Cognitive Measurement, Cognitive Processes, Constructed Response
Fan, Xitao; And Others – 1994
The hypothesis that faulty classical psychometric and sampling procedures in test construction could generate systematic bias against ethnic groups with smaller representation in the test construction sample was studied empirically. Two test construction models were developed: one with differential representation of ethnic groups (White, African…
Descriptors: Ethnic Groups, Genetics, High School Students, High Schools
Baker, Eva L.; O'Neil, Harold F. – 1985
This paper presents a discussion of outcome assessment that puts into context how measurement has evolved to its present state. Several types of testing and assessment options are considered against a background of validity. Criterion-referenced measurement is discussed extensively in terms of history, field study, identity problems, intellectual…
Descriptors: Criterion Referenced Tests, Educational Assessment, Educational Technology, Elementary Secondary Education
Chang, Lei – 1993
Equivalence in reliability and validity across 4-point and 6-point scales was assessed by fitting different measurement models through confirmatory factor analysis of a multitrait-multimethod covariance matrix. Responses to nine Likert-type items designed to measure perceived quantitative ability, self-perceived usefulness of quantitative…
Descriptors: Ability, Comparative Testing, Education Majors, Graduate Students