Showing all 13 results
Lotfi Simon Kerzabi – ProQuest LLC, 2021
Monte Carlo methods are an accepted methodology for generating critical values for a maximum test. The same methods are also applicable to evaluating the robustness of the newly created test. A table of critical values was created, and the robustness of the new maximum test was evaluated for five different distributions. Robustness…
Descriptors: Data, Monte Carlo Methods, Testing, Evaluation Research
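The general procedure described in this abstract (simulating the null distribution of a maximum statistic and taking an upper quantile as the critical value) can be sketched as follows. This is a minimal illustration, not the author's exact test: the choice of combining a pooled-variance and a Welch t statistic, the sample sizes, and the replication count are all assumptions made for the example.

    import numpy as np

    def monte_carlo_critical_value(n_per_group=30, n_reps=10000, alpha=0.05, seed=1):
        """Approximate the null critical value of a maximum test by simulation.
        The maximum test is illustrated here as max(|pooled t|, |Welch t|)
        computed on two samples drawn from the same normal population."""
        rng = np.random.default_rng(seed)
        max_stats = np.empty(n_reps)
        for r in range(n_reps):
            x = rng.standard_normal(n_per_group)
            y = rng.standard_normal(n_per_group)
            # pooled-variance t statistic
            sp2 = (x.var(ddof=1) + y.var(ddof=1)) / 2
            t_pooled = (x.mean() - y.mean()) / np.sqrt(sp2 * 2 / n_per_group)
            # Welch t statistic (unequal variances)
            t_welch = (x.mean() - y.mean()) / np.sqrt(
                x.var(ddof=1) / n_per_group + y.var(ddof=1) / n_per_group
            )
            max_stats[r] = max(abs(t_pooled), abs(t_welch))
        # upper-alpha quantile of the simulated null distribution
        return np.quantile(max_stats, 1 - alpha)

    print(monte_carlo_critical_value())

Repeating the same simulation with non-null or non-normal data gives the rejection rates used to judge robustness, as the abstract describes.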
Peer reviewed
PDF on ERIC (full text available)
Ozsoy, Seyma Nur; Kilmen, Sevilay – International Journal of Assessment Tools in Education, 2023
In this study, Kernel test equating methods were compared under NEAT and NEC designs. In NEAT design, Kernel post-stratification and chain equating methods taking into account optimal and large bandwidths were compared. In the NEC design, gender and/or computer/tablet use was considered as a covariate, and Kernel test equating methods were…
Descriptors: Equated Scores, Testing, Test Items, Statistical Analysis
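The core step shared by the compared methods is kernel continuization of a discrete score distribution followed by equipercentile mapping, e(x) = G^{-1}(F(x)). The sketch below is a simplified illustration under assumed bandwidths and toy score distributions; it omits the mean/variance-preserving adjustment of the full Gaussian kernel method and is not the authors' implementation.

    import numpy as np
    from scipy.stats import norm

    def kernel_cdf(scores, probs, h, x):
        """Gaussian-kernel continuization of a discrete score distribution
        (simplified: the linear shrinkage adjustment is omitted)."""
        return np.sum(probs * norm.cdf((x - scores) / h))

    def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h_x=0.6, h_y=0.6):
        """Map a score x on form X to the form-Y scale via e(x) = G^{-1}(F(x))."""
        p = kernel_cdf(scores_x, probs_x, h_x, x)
        grid = np.linspace(scores_y.min() - 3, scores_y.max() + 3, 2001)
        g = np.array([kernel_cdf(scores_y, probs_y, h_y, v) for v in grid])
        return np.interp(p, g, grid)   # invert G numerically

    # toy example: two 10-item forms with slightly different difficulty
    scores = np.arange(11)
    probs_x = np.ones(11) / 11
    probs_y = np.linspace(1, 2, 11); probs_y /= probs_y.sum()
    print(kernel_equate(6, scores, probs_x, scores, probs_y))

The choice of small ("optimal") versus large bandwidths referred to in the abstract enters through h_x and h_y.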
Peer reviewed
Direct link
Li, Xinru; Dusseldorp, Elise; Meulman, Jacqueline J. – Research Synthesis Methods, 2019
In meta-analytic studies, there are often multiple moderators available (e.g., study characteristics). In such cases, traditional meta-analysis methods often lack sufficient power to investigate interaction effects between moderators, especially higher-order interactions. To overcome this problem, meta-CART was proposed: an approach that applies…
Descriptors: Correlation, Meta Analysis, Identification, Testing
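The basic idea behind meta-CART, applying classification and regression trees to study-level moderators, can be mimicked with a weighted regression tree. The data-generating setup, moderator names, and tree settings below are illustrative assumptions, not the authors' algorithm or data.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    k = 120                                   # number of studies
    # two moderators whose interaction drives the true effect size
    mod_a = rng.integers(0, 2, k)             # e.g., randomized vs. not
    mod_b = rng.integers(0, 2, k)             # e.g., published vs. not
    var_d = rng.uniform(0.02, 0.1, k)         # sampling variance per study
    true_d = 0.2 + 0.4 * (mod_a * mod_b)      # interaction effect
    d = true_d + rng.normal(0, np.sqrt(var_d))

    # weight each study by the inverse of its sampling variance, as in
    # fixed-effect meta-analysis, and let the tree search for subgroups
    X = np.column_stack([mod_a, mod_b])
    tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=20)
    tree.fit(X, d, sample_weight=1.0 / var_d)
    print(tree.predict([[0, 0], [1, 1]]))     # subgroup mean effect sizes

A tree recovers the interaction as a sequence of splits, which is exactly the kind of higher-order moderator effect the abstract says conventional meta-regression struggles to detect.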
Peer reviewed
Direct link
Woods, Carol M. – Applied Psychological Measurement, 2011
Differential item functioning (DIF) occurs when an item on a test, questionnaire, or interview has different measurement properties for one group of people versus another, irrespective of true group-mean differences on the constructs being measured. This article is focused on item response theory based likelihood ratio testing for DIF (IRT-LR or…
Descriptors: Simulation, Item Response Theory, Testing, Questionnaires
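The IRT-LR approach compares a model that constrains the studied item's parameters to be equal across groups with a model that frees them. The final statistic is simple; the sketch below shows only that step, with hypothetical log-likelihood values standing in for two fitted multi-group IRT models.

    from scipy.stats import chi2

    def irt_lr_dif(loglik_constrained, loglik_free, df):
        """Likelihood ratio test for DIF: the constrained model forces the
        studied item's parameters to be equal across groups; the free model
        lets them differ. df = number of freed parameters (e.g., 2 for a
        2PL item: discrimination and difficulty)."""
        lr = 2.0 * (loglik_free - loglik_constrained)
        return lr, chi2.sf(lr, df)

    # hypothetical log-likelihoods from two fitted multi-group IRT models
    lr, p = irt_lr_dif(loglik_constrained=-10423.7, loglik_free=-10418.2, df=2)
    print(f"LR = {lr:.2f}, p = {p:.4f}")

A significant statistic indicates that the item functions differently across groups after accounting for group differences on the latent construct.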
Peer reviewed
Direct link
Woods, Carol M. – Applied Psychological Measurement, 2009
Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…
Descriptors: Test Results, Testing, Item Response Theory, Test Bias
Simpson, J. D. – Audio-Visual Language Journal, 1974
Some basic statistical concepts relevant to the teacher (mean scores, standard deviation, normal and skewed distributions, z scores, item analysis, standard error of measurement, reliability) and their classroom use are explained. (RM)
Descriptors: Error of Measurement, Evaluation Methods, Norm Referenced Tests, Scoring
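The quantities named in this abstract are straightforward to compute. A small sketch follows, with an assumed set of class scores and an assumed reliability coefficient; the standard error of measurement uses the classical test theory formula SEM = SD * sqrt(1 - reliability).

    import numpy as np

    scores = np.array([12, 15, 17, 18, 18, 20, 22, 23, 25, 30])  # assumed class scores
    mean = scores.mean()
    sd = scores.std(ddof=1)                  # sample standard deviation
    z = (scores - mean) / sd                 # z scores
    reliability = 0.85                       # assumed test reliability
    sem = sd * np.sqrt(1 - reliability)      # standard error of measurement
    print(f"mean={mean:.1f}, sd={sd:.2f}, SEM={sem:.2f}")
    print("z scores:", np.round(z, 2))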
Cooper, Terence H. – Journal of Agronomic Education (JAE), 1988
Describes a study used to determine differences in exam reliability, difficulty, and student evaluations. Indicates that when a fourth option was added to the three-option items, the exams became more difficult. Includes methods, results discussion, and tables on student characteristics, whole test analyses, and selected items. (RT)
Descriptors: Agronomy, College Science, Error of Measurement, Evaluation Methods
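The comparison of three-option and four-option exams rests on item difficulty (proportion correct) and an internal-consistency reliability coefficient. The abstract does not state which coefficient was used; the sketch below uses KR-20 on a simulated 0/1 response matrix purely for illustration.

    import numpy as np

    def item_difficulty(responses):
        """Proportion of examinees answering each item correctly
        (0/1 matrix, rows = examinees, columns = items)."""
        return responses.mean(axis=0)

    def kr20(responses):
        """Kuder-Richardson 20 internal-consistency reliability."""
        k = responses.shape[1]
        p = responses.mean(axis=0)
        total_var = responses.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - np.sum(p * (1 - p)) / total_var)

    # simulate exam data with a simple Rasch-like model so reliability is positive
    rng = np.random.default_rng(2)
    ability = rng.normal(size=(200, 1))
    difficulty = rng.normal(size=(1, 40))
    prob = 1 / (1 + np.exp(-(ability - difficulty)))
    resp = (rng.random((200, 40)) < prob).astype(int)
    print(item_difficulty(resp)[:5], kr20(resp))

Adding a fourth, plausible distractor typically lowers the item difficulty values computed this way, consistent with the study's finding that the exams became more difficult.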
Peer reviewed
Direct link
Dirkzwager, Arie – International Journal of Testing, 2003
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…
Descriptors: Psychometrics, Probability, Models, Measurement
Karkee, Thakur B.; Wright, Karen R. – Online Submission, 2004
Different item response theory (IRT) models may be employed for item calibration. Change of testing vendors, for example, may result in the adoption of a different model than that previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to the…
Descriptors: Measures (Individuals), Evaluation Criteria, Testing, Integrity
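The abstract does not name a specific linking procedure; a common choice for placing item parameter estimates from a new calibration onto an existing scale is mean/sigma linking of the difficulty parameters. The sketch below uses illustrative parameter arrays for common (anchor) items, not data from the study.

    import numpy as np

    def mean_sigma_link(b_new, b_old):
        """Mean/sigma linking constants placing new-model item difficulties
        on the old (base) scale: b* = A*b + B, a* = a / A."""
        A = np.std(b_old, ddof=1) / np.std(b_new, ddof=1)
        B = np.mean(b_old) - A * np.mean(b_new)
        return A, B

    # illustrative difficulty estimates for common items under each calibration
    b_old = np.array([-1.2, -0.4, 0.1, 0.7, 1.5])
    b_new = np.array([-1.0, -0.2, 0.3, 0.9, 1.8])
    A, B = mean_sigma_link(b_new, b_old)
    b_rescaled = A * b_new + B          # new difficulties on the old scale
    a_rescaled_example = 1.1 / A        # a discrimination of 1.1 rescaled
    print(A, B, b_rescaled)

Once the linking constants are applied, existing cut scores keep their meaning on the new calibration, which is the scale-continuity concern the abstract raises.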
Olejnik, Stephen F.; Algina, James – 1986
Sampling distributions for ten tests for comparing population variances in a two group design were generated for several combinations of equal and unequal sample sizes, population means, and group variances when distributional forms differed. The ten procedures included: (1) O'Brien's (OB); (2) O'Brien's with adjusted degrees of freedom; (3)…
Descriptors: Error of Measurement, Evaluation Methods, Measurement Techniques, Nonparametric Statistics
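The study's approach of generating sampling distributions for variance-comparison tests under non-normal data can be illustrated with a small Type I error simulation. The Brown-Forsythe (median-centered Levene) test stands in here for the ten procedures listed; the sample sizes and the exponential (skewed) population are assumptions for the example.

    import numpy as np
    from scipy.stats import levene

    def type1_rate(n1=20, n2=40, reps=5000, alpha=0.05, seed=3):
        """Empirical Type I error rate of the Brown-Forsythe test when both
        groups share the same skewed distribution (equal variances)."""
        rng = np.random.default_rng(seed)
        rejections = 0
        for _ in range(reps):
            x = rng.exponential(scale=1.0, size=n1)
            y = rng.exponential(scale=1.0, size=n2)
            _, p = levene(x, y, center='median')
            rejections += p < alpha
        return rejections / reps

    print(type1_rate())

Repeating the simulation across distributions, sample-size ratios, and variance ratios yields the kind of robustness comparison the abstract describes.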
Stewart, E. Elizabeth – 1981
Context effects are defined as influences on test performance associated with the content of successively presented test items or sections. Four types of context effects are identified: (1) direct context effects (practice effects) which occur when performance on items is affected by the examinee having been exposed to similar types of…
Descriptors: Context Effect, Data Collection, Error of Measurement, Evaluation Methods
Peer reviewed
Direct link
Meyer, Kevin D.; Foster, Jeff L. – International Journal of Testing, 2008
With the increasing globalization of human resources practices, a commensurate increase in demand has occurred for multi-language ("global") personality norms for use in selection and development efforts. The combination of data from multiple translations of a personality assessment into a single norm engenders error from multiple sources. This…
Descriptors: Global Approach, Cultural Differences, Norms, Human Resources
PDF pending restoration
Kirsch, Irwin S.; And Others – 1992
A comprehensive assessment of the literacy proficiencies of Job Training Partnership Act (JTPA) and Employment Service/Unemployment Insurance (ES/UI) participants was conducted by the Department of Labor. The survey responses of a sample of 2,501 JTPA applicants and 3,277 ES/UI participants were scored, weighted, analyzed, and used to develop a…
Descriptors: Adult Literacy, Comparative Analysis, Correlation, Data Collection