Showing all 12 results
Peer reviewed
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
Peer reviewed
He, Wei; Wolfe, Edward W. – International Journal of Testing, 2010
This article reports the results of a study of potential sources of item nonequivalence between English and Chinese language versions of a cognitive development test for preschool-aged children. Items were flagged for potential nonequivalence through statistical and judgment-based procedures, and the relationship between flag status and item…
Descriptors: Preschool Children, Mandarin Chinese, Cognitive Development, Item Analysis
Peer reviewed
Filipi, Anna – Language Testing, 2012
The Assessment of Language Competence (ALC) certificates is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…
Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries
Xu, Zeyu; Nichols, Austin – National Center for Analysis of Longitudinal Data in Education Research, 2010
The gold standard in making causal inference on program effects is a randomized trial. Most randomization designs in education randomize classrooms or schools rather than individual students. Such "clustered randomization" designs have one principal drawback: They tend to have limited statistical power or precision. This study aims to…
Descriptors: Test Format, Reading Tests, Norm Referenced Tests, Research Design
Peer reviewed
Dorans, Neil J.; Lawrence, Ida M. – Applied Measurement in Education, 1990
A procedure for checking the score equivalence of nearly identical editions of a test is described and illustrated with Scholastic Aptitude Test data. The procedure uses the standard error of equating and uses graphical representation of score conversion deviations from the identity function in standard error units. (SLD)
Descriptors: Equated Scores, Grade Equivalent Scores, Scores, Statistical Analysis
Peer reviewed
Little, Roderick J. A.; Rubin, Donald B. – Journal of Educational and Behavioral Statistics, 1994
Equating a new standard test to an old reference test is considered when samples for equating are not randomly selected from the target population of test takers, identifying two problems from equating from biased samples. An empirical example with data from the Armed Services Vocational Aptitude Battery illustrates the approach. (SLD)
Descriptors: Equated Scores, Military Personnel, Sampling, Statistical Analysis
Peer reviewed
Bruno, James E.; Dirkzwager, A. – Educational and Psychological Measurement, 1995
Determining the optimal number of choices on a multiple-choice test is explored analytically from an information theory perspective. The analysis revealed that, in general, three choices seem optimal. This finding is in agreement with previous statistical and psychometric research. (SLD)
Descriptors: Distractors (Tests), Information Theory, Multiple Choice Tests, Psychometrics
Peer reviewed
Gaston, Michele F.; And Others – Assessment, 1994
Comparability of the Minnesota Multiphasic Personality Inventory (MMPI) and the MMPI-2 was explored by examining T-score means, profile configurations, score distribution, and rank-order correlations on validity scales for 84 undergraduates. Equivalency of the two forms was generally supported. (SLD)
Descriptors: Comparative Analysis, Correlation, Higher Education, Personality Assessment
Livingston, Samuel A.; And Others – 1989
Combinations of five methods of equating test forms and two methods of selecting samples of students for equating were compared for accuracy. The two sampling methods were representative sampling from the population and matching samples on the anchor test score. The equating methods were: (1) the Tucker method; (2) the Levine method; (3) the…
Descriptors: Comparative Analysis, Data Collection, Equated Scores, High School Students
Angoff, William H. – 1991
An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…
Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students
Samson, Digna M. M. – 1983
The traditional multiple-choice reading comprehension test of English as a second language, used in the Dutch school-leaving examinations, has been criticized for its apparent lack of construct validity. The Dutch National Institute for Educational Measurement has conducted a number of studies to determine whether there is a different skill…
Descriptors: English (Second Language), Foreign Countries, Language Tests, Multiple Choice Tests
Legg, Sue M.; Algina, James – 1986
This paper focuses on the questions which arise as test practitioners monitor score scales derived from latent trait theory. Large scale assessment programs are dynamic and constantly challenge the assumptions and limits of latent trait models. Even though testing programs evolve, test scores must remain reliable indicators of progress.…
Descriptors: Difficulty Level, Educational Assessment, Elementary Secondary Education, Equated Scores