ERIC - Search Results

Showing 106 to 120 of 252 results

The Use of Experimental Design in Educational Evaluation

Peer reviewed

Stufflebeam, Daniel L. – Journal of Educational Measurement, 1971

Descriptors: Data Analysis, Educational Experiments, Evaluation Methods, Individual Differences

A Note on the Comparability of Alternative Scoring Methods for the Institutional Functioning Inventory

Peer reviewed

Hartnett, Rodney T. – Journal of Educational Measurement, 1971

Alternative scoring methods yield essentially the same information, including scale intercorrelations and validity. Reasons for preferring the traditional psychometric scoring technique are offered. (Author/AG)

Descriptors: College Environment, Comparative Analysis, Correlation, Item Analysis

Pearson Selection Formulas: Implications for Studies of Predictive Bias and Estimates of Educational Effects in Selected Samples.

Peer reviewed

Linn, Robert L. – Journal of Educational Measurement, 1983

When the precise basis of selection effect on correlation and regression equations is unknown but can be modeled by selection on a variable that is highly but not perfectly related to observed scores, the selection effects can lead to the commonly observed "overprediction" results in studies of predictive bias. (Author/PN)

Descriptors: Bias, Correlation, Higher Education, Prediction

Choice among Essay Topics: Impact on Performance and Validity.

Peer reviewed

Bridgeman, Brent; Morgan, Rick; Wang, Ming-mei – Journal of Educational Measurement, 1997

Test results of 915 high school students taking a history examination with a choice of topics show that students were generally able to pick the topic on which they could get the highest score. Implications for fair scoring when topic choice is allowed are discussed. (SLD)

Descriptors: Essay Tests, High School Students, History, Performance Factors

"Mental Model" Comparison of Automated and Human Scoring.

Peer reviewed

Williamson, David M.; Bejar, Isaac I.; Hone, Anne S. – Journal of Educational Measurement, 1999

Contrasts "mental models" used by automated scoring for the simulation division of the computerized Architect Registration Examination with those used by experienced human graders for 3,613 candidate solutions. Discusses differences in the models used and the potential of automated scoring to enhance the validity evidence of scores. (SLD)

Descriptors: Architects, Comparative Analysis, Computer Assisted Testing, Judges

An Application of Item Response Time: The Effort-Moderated IRT Model

Peer reviewed

Wise, Steven L.; DeMars, Christine E. – Journal of Educational Measurement, 2006

The validity of inferences based on achievement test scores is dependent on the amount of effort that examinees put forth while taking the test. With low-stakes tests, for which this problem is particularly prevalent, there is a consequent need for psychometric models that can take into account differing levels of examinee effort. This article…

Descriptors: Guessing (Tests), Psychometrics, Inferences, Reaction Time

Differential Aptitude Tests

Peer reviewed

Hanna, Gerald S. – Journal of Educational Measurement, 1974

Descriptors: Aptitude Tests, Secondary School Students, Test Interpretation, Test Reliability

A Study of Reliability and Validity Effects of Total and Partial Immediate Feedback in Multiple-Choice Testing

Peer reviewed

Hanna, Gerald S. – Journal of Educational Measurement, 1977

The effects of providing total and partial immediate feedback to pupils in multiple-choice testing were investigated with fifth and sixth grade pupils. The split-half reliability was higher with total feedback than with no feedback. Concurrent validity with a completion test showed all three settings to be nearly identical. (Author/JKS)

Descriptors: Elementary Education, Elementary School Students, Feedback, Forced Choice Technique

Selection Bias: Multiple Meanings.

Peer reviewed

Linn, Robert L. – Journal of Educational Measurement, 1984

The common approach to studies of predictive bias is analyzed within the context of a conceptual model in which predictors and criterion measures are viewed as fallible indicators of idealized qualifications. (Author/PN)

Descriptors: Certification, Models, Predictive Measurement, Predictive Validity

Matched Pair True-False Scoring: Effect on Reliability and Validity

Peer reviewed

Brandenburg, Dale C.; Whitney, Douglas R. – Journal of Educational Measurement, 1972

The primary purpose of this study was to investigate the effect of various scoring methods on the reliability and validity of the Primary Test of Economic Understanding (PTEU). The PTEU was designed to be scored using the matched pair procedure. (Authors)

Descriptors: Grade 3, Objective Tests, Response Style (Tests), Scoring Formulas

The Effects of Selected Poor Item-Writing Practices on Test Difficulty, Reliability and Validity

Peer reviewed

Board, Cynthia; Whitney, Douglas R. – Journal of Educational Measurement, 1972

For the principles studied here, poor item-writing practices serve to obscure (or attenuate) differences between good and poor students. (Authors)

Descriptors: College Students, Item Analysis, Multiple Choice Tests, Test Construction

Development and Evaluation of a Test of Information Storage During Reading

Peer reviewed

Carver, Ronald P.; Darby, Charles A., Jr. – Journal of Educational Measurement, 1971

Discusses a reading test using "chunked" items -- groups of meaningfully related words in which certain groups are changed in meaning from the original passage. (Author)

Descriptors: Information Storage, Multiple Choice Tests, Reading Comprehension, Reading Tests

A Monte Carlo Comparison of Ten Item Discrimination Indices.

Peer reviewed

Beuchert, A. Kent; Mendoza, Jorge L. – Journal of Educational Measurement, 1979

Ten item discrimination indices, across a variety of item analysis situations, were compared, based on the validities of tests constructed by using each of the indices to select 40 items from a 100-item pool. Item score data were generated by a computer program and included a simulation of guessing. (Author/CTM)

Descriptors: Item Analysis, Simulation, Statistical Analysis, Test Construction

Internal Invalidity in Studies Employing Self-Report Instruments: A Suggested Remedy.

Peer reviewed

Howard, George S.; And Others – Journal of Educational Measurement, 1979

Evaluations of experimental interventions which employ self-report measures are subject to contamination known as response-shift bias. Response-shift effects may be attenuated by substituting retrospective pretest ratings for the traditional self-report pretest ratings. This study indicated that the retrospective rating more accurately reflected…

Descriptors: Higher Education, Rating Scales, Response Style (Tests), Self Evaluation

Comparison of Two Procedures for Analyzing Multitrait Multimethod Matrices.

Peer reviewed

Lomax, Richard G.; Algina, James – Journal of Educational Measurement, 1979

Results of using multimethod factor analysis and exploratory factor analysis for the analysis of three multitrait-multimethod matrices are compared. Results suggest that the two methods can give quite different impressions of discriminant validity. In the examples considered, the former procedure tends to support discrimination while the latter…

Descriptors: Comparative Analysis, Factor Analysis, Goodness of Fit, Matrices
