ERIC - Search Results

Showing all 14 results

A Guide for Setting the Cut-Scores to Minimize Weighted Classification Errors in Test Batteries

Peer reviewed

Grabovsky, Irina; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2017

In this article, we extend the methodology of the Cut-Score Operating Function that we introduced previously and apply it to a testing scenario with multiple independent components and different testing policies. We derive analytically the overall classification error rate for a test battery under the policy when several retakes are allowed for…

Descriptors: Cutting Scores, Weighted Scores, Classification, Testing

Investigating Score Dependability in English/Chinese Interpreter Certification Performance Testing: A Generalizability Theory Approach

Peer reviewed

Han, Chao – Language Assessment Quarterly, 2016

As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…

Descriptors: Foreign Countries, Scores, English, Chinese

Robust Structural Equation Modeling with Missing Data and Auxiliary Variables

Peer reviewed

Yuan, Ke-Hai; Zhang, Zhiyong – Psychometrika, 2012

The paper develops a two-stage robust procedure for structural equation modeling (SEM) and an R package "rsem" to facilitate the use of the procedure by applied researchers. In the first stage, M-estimates of the saturated mean vector and covariance matrix of all variables are obtained. Those corresponding to the substantive variables…

Descriptors: Structural Equation Models, Tests, Federal Aid, Psychometrics

Accumulative Equating Error after a Chain of Linear Equatings

Peer reviewed

Guo, Hongwen – Psychometrika, 2010

After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…

Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores

Assessing Short-Term Individual Consistency Using IRT-Based Statistics

Peer reviewed

Ferrando, Pere J. – Psicologica: International Journal of Methodology and Experimental Psychology, 2010

This article proposes a procedure, based on a global statistic, for assessing intra-individual consistency in a test-retest design with a short-term retest interval. The procedure is developed within the framework of parametric item response theory, and the statistic is a likelihood-based measure that can be considered as an extension of the…

Descriptors: Item Response Theory, Intervals, Psychometrics, Testing

Testing for Factorial Invariance of the Modified Leadership Scale for Sports: Using a Japanese Version

Peer reviewed

Kwon, Hyungil Harry; Pyun, Do Young; Han, Siwan; Ogasawara, Etsuko – Asia Pacific Journal of Education, 2011

The objective of this study was to provide empirical evidence to support psychometric properties of a modified four-dimensional model of the Leadership Scale for Sports (LSS). The study tested invariance of all parameters (i.e., factor loadings, error variances, and factor variances-covariances) in the four-dimensional measurement model between…

Descriptors: Feedback (Response), Testing, Athletes, Factor Structure

IQ Scores Should Be Corrected for the Flynn Effect in High-Stakes Decisions

Peer reviewed

Fletcher, Jack M.; Stuebing, Karla K.; Hughes, Lisa C. – Journal of Psychoeducational Assessment, 2010

IQ test scores should be corrected for high-stakes decisions that employ these assessments, including capital offense cases. If scores are not corrected, then diagnostic standards must change with each generation. Arguments against corrections, based on standards of practice, information present and absent in test manuals, and related issues,…

Descriptors: Testing, Mental Retardation, Validity, Intelligence Quotient

A Response to an Article Published in "Educational Research"'s Special Issue on Assessment (June 2009). What Can Be Inferred about Classification Accuracy from Classification Consistency?

Peer reviewed

Bramley, Tom – Educational Research, 2010

Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…

Descriptors: National Curriculum, Educational Research, Testing, Measurement

Same-Form Retest Effects on Credentialing Examinations

Peer reviewed

Raymond, Mark R.; Neustel, Sandra; Anderson, Dan – Educational Measurement: Issues and Practice, 2009

Examinees who take high-stakes assessments are usually given an opportunity to repeat the test if they are unsuccessful on their initial attempt. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign a different test form to repeat examinees. The use of multiple…

Descriptors: Test Results, Test Items, Testing, Aptitude Tests

Who Is Given Tests in What Language by Whom, When, and Where? The Need for Probabilistic Views of Language in the Testing of English Language Learners

Peer reviewed

Solano-Flores, Guillermo – Educational Researcher, 2008

The testing of English language learners (ELLs) is, to a large extent, a random process because of poor implementation and factors that are uncertain or beyond control. Yet current testing practices and policies appear to be based on deterministic views of language and linguistic groups and erroneous assumptions about the capacity of assessment…

Descriptors: Generalizability Theory, Testing, Second Language Learning, Error of Measurement

Language, Dialect, and Register: Sociolinguistics and the Estimation of Measurement Error in the Testing of English Language Learners

Peer reviewed

Solano-Flores, Guillermo – Teachers College Record, 2006

This article examines the intersection of psychometrics and sociolinguistics in the testing of English language learners (ELLs); it discusses language, dialect, and register as sources of measurement error. Research findings show that the dialect of the language in which students are tested (e.g., local or standard English) is as important as…

Descriptors: Second Language Learning, Test Construction, Sociolinguistics, Psychometrics

Latent Trait Estimation: Theory vs. Practice.

Kolakowski, Donald – 1972

Empirical results are presented as regards the implementation of a latent-trait psychometric model by means of conditional maximum likelihood estimation. Items are scored polychotomously into varying numbers of nominal categories and the test and item characteristic curves and information functions are examined. It is concluded that scoring items…

Descriptors: Error of Measurement, Item Analysis, Item Sampling, Measurement Techniques

Multiple Evaluation: A New Testing Paradigm that Exorcizes Guessing

Peer reviewed

Dirkzwager, Arie – International Journal of Testing, 2003

The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…

Descriptors: Psychometrics, Probability, Models, Measurement

Considerations for Creating Multi-Language Personality Norms: A Three-Component Model of Error

Peer reviewed

Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008

With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…

Descriptors: Global Approach, Cultural Differences, Norms, Human Resources
