ERIC - Search Results

Showing all 12 results

Incomplete Psychometric Equivalence of Scores Obtained on the Manual and the Computer Version of the Wisconsin Card Sorting Test?

Peer reviewed

Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010

The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…

Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores

Item Equivalence in English and Chinese Translation of a Cognitive Development Test for Preschoolers

Peer reviewed

He, Wei; Wolfe, Edward W. – International Journal of Testing, 2010

This article reports the results of a study of potential sources of item nonequivalence between English and Chinese language versions of a cognitive development test for preschool-aged children. Items were flagged for potential nonequivalence through statistical and judgment-based procedures, and the relationship between flag status and item…

Descriptors: Preschool Children, Mandarin Chinese, Cognitive Development, Item Analysis

Do Questions Written in the Target Language Make Foreign Language Listening Comprehension Tests More Difficult?

Peer reviewed

Filipi, Anna – Language Testing, 2012

The Assessment of Language Competence (ALC) certificates is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…

Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries

New Estimates of Design Parameters for Clustered Randomization Studies: Findings from North Carolina and Florida. Working Paper 43

Xu, Zeyu; Nichols, Austin – National Center for Analysis of Longitudinal Data in Education Research, 2010

The gold standard in making causal inference on program effects is a randomized trial. Most randomization designs in education randomize classrooms or schools rather than individual students. Such "clustered randomization" designs have one principal drawback: They tend to have limited statistical power or precision. This study aims to…

Descriptors: Test Format, Reading Tests, Norm Referenced Tests, Research Design

Checking the Statistical Equivalence of Nearly Identical Test Editions.

Peer reviewed

Dorans, Neil J.; Lawrence, Ida M. – Applied Measurement in Education, 1990

A procedure for checking the score equivalence of nearly identical editions of a test is described and illustrated with Scholastic Aptitude Test data. The procedure uses the standard error of equating and uses graphical representation of score conversion deviations from the identity function in standard error units. (SLD)

Descriptors: Equated Scores, Grade Equivalent Scores, Scores, Statistical Analysis

Test Equating from Biased Samples, with Application to the Armed Services Vocational Aptitude Battery.

Peer reviewed

Little, Roderick J. A.; Rubin, Donald B. – Journal of Educational and Behavioral Statistics, 1994

Equating a new standard test to an old reference test is considered when samples for equating are not randomly selected from the target population of test takers, identifying two problems from equating from biased samples. An empirical example with data from the Armed Services Vocational Aptitude Battery illustrates the approach. (SLD)

Descriptors: Equated Scores, Military Personnel, Sampling, Statistical Analysis

Determining the Optimal Number of Alternatives to a Multiple-Choice Test Item: An Information Theoretic Perspective.

Peer reviewed

Bruno, James E.; Dirkzwager, A. – Educational and Psychological Measurement, 1995

Determining the optimal number of choices on a multiple-choice test is explored analytically from an information theory perspective. The analysis revealed that, in general, three choices seem optimal. This finding is in agreement with previous statistical and psychometric research. (SLD)

Descriptors: Distractors (Tests), Information Theory, Multiple Choice Tests, Psychometrics

The Equivalence of the MMPI and MMPI-2.

Peer reviewed

Gaston, Michele F.; And Others – Assessment, 1994

Comparability of the Minnesota Multiphasic Personality Inventory (MMPI) and the MMPI-2 was explored by examining T-score means, profile configurations, score distribution, and rank-order correlations on validity scales for 84 undergraduates. Equivalency of the two forms was generally supported. (SLD)

Descriptors: Comparative Analysis, Correlation, Higher Education, Personality Assessment

What Combination of Sampling and Equating Methods Works Best? Revised.

Livingston, Samuel A.; And Others – 1989

Combinations of five methods of equating test forms and two methods of selecting samples of students for equating were compared for accuracy. The two sampling methods were representative sampling from the population and matching samples on the anchor test score. The equating methods were: (1) the Tucker method; (2) the Levine method; (3) the…

Descriptors: Comparative Analysis, Data Collection, Equated Scores, High School Students

The Determination of Empirical Standard Errors of Equating the Scores on SAT-Verbal and SAT-Mathematical.

Angoff, William H. – 1991

An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…

Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students

Rasch and Reading

Samson, Digna M. M. – 1983

The traditional multiple-choice reading comprehension test of English as a second language, used in the Dutch school-leaving examinations, has been criticized for its apparent lack of construct validity. The Dutch National Institute for Educational Measurement has conducted a number of studies to determine whether there is a different skill…

Descriptors: English (Second Language), Foreign Countries, Language Tests, Multiple Choice Tests

Practical Questions about Item Response Models in Large-Scale Assessment Programs.

Legg, Sue M.; Algina, James – 1986

This paper focuses on the questions which arise as test practitioners monitor score scales derived from latent trait theory. Large scale assessment programs are dynamic and constantly challenge the assumptions and limits of latent trait models. Even though testing programs evolve, test scores must remain reliable indicators of progress.…

Descriptors: Difficulty Level, Educational Assessment, Elementary Secondary Education, Equated Scores
