Showing all 10 results
Peer reviewed
Wiliam, Dylan – Assessment in Education: Principles, Policy & Practice, 2008
While international comparisons such as those provided by PISA may be meaningful for overall judgements about the performance of educational systems, caution is needed for more fine-grained judgements. In particular, it is argued that using the results of PISA to draw conclusions about the quality of instruction in different systems is…
Descriptors: Test Bias, Test Construction, Comparative Testing, Evaluation
Peer reviewed
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
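For readers unfamiliar with anchor-test equating, the following is a minimal sketch of one simple variant, chained linear equating through a common anchor. The scores are simulated and the method is chosen purely for illustration; it is not one of the specific designs compared in the study above.

```python
# Minimal sketch of chained linear equating in a nonequivalent-groups
# (anchor-test) design. Illustrative only: the data and the choice of
# chained linear equating are assumptions, not the designs compared in
# the study above.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: group 1 takes form X plus anchor V; group 2 takes form Y plus V.
x  = rng.normal(50, 10, 2000)                                    # form X scores (group 1)
v1 = 0.8 * (x - 50) / 10 * 5 + 20 + rng.normal(0, 2, x.size)     # anchor scores, group 1
y  = rng.normal(53, 11, 2000)                                    # form Y scores (group 2)
v2 = 0.8 * (y - 53) / 11 * 5 + 21 + rng.normal(0, 2, y.size)     # anchor scores, group 2

def linear_link(score, mu_from, sd_from, mu_to, sd_to):
    """Match z-scores: map a score from one scale onto another."""
    return mu_to + sd_to * (score - mu_from) / sd_from

def chained_linear_equate(x_score):
    """Equate X to Y by linking X -> V in group 1, then V -> Y in group 2."""
    v_equiv = linear_link(x_score, x.mean(), x.std(), v1.mean(), v1.std())
    return linear_link(v_equiv, v2.mean(), v2.std(), y.mean(), y.std())

print("X score of 60 maps to a Y-scale score of about", round(chained_linear_equate(60), 1))
```

In practice, Tucker, frequency estimation, or IRT-based methods would replace this simple chained link; the point of the sketch is only how the anchor ties the two nonequivalent groups together.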
Peer reviewed
Kato, Kentaro; Moen, Ross E.; Thurlow, Martha L. – Educational Measurement: Issues and Practice, 2009
Large data sets from a state reading assessment for third and fifth graders were analyzed to examine differential item functioning (DIF), differential distractor functioning (DDF), and differential omission frequency (DOF) between students with particular categories of disabilities (speech/language impairments, learning disabilities, and emotional…
Descriptors: Learning Disabilities, Language Impairments, Behavior Disorders, Affective Behavior
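As an illustration of the simplest of these analyses, differential omission frequency, the sketch below compares per-item omission rates between a hypothetical focal and reference group using a 2x2 chi-square test. The data and the test are assumptions made for illustration, not the procedures applied to the state assessment data above.

```python
# Minimal sketch of a differential omission frequency (DOF) check:
# compare per-item omission rates between a focal and a reference group.
# The response matrix and group labels are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)
n_items = 5

# Responses coded 1.0 = answered, np.nan = omitted.
ref = np.where(rng.random((1000, n_items)) < 0.02, np.nan, 1.0)   # reference group
foc = np.where(rng.random((200, n_items)) < 0.05, np.nan, 1.0)    # focal group

for item in range(n_items):
    ref_omit = np.isnan(ref[:, item]).sum()
    foc_omit = np.isnan(foc[:, item]).sum()
    table = [[ref_omit, ref.shape[0] - ref_omit],
             [foc_omit, foc.shape[0] - foc_omit]]
    chi2, p, _, _ = chi2_contingency(table)
    print(f"item {item + 1}: omission {ref_omit / ref.shape[0]:.1%} (ref) "
          f"vs {foc_omit / foc.shape[0]:.1%} (focal), chi2={chi2:.2f}, p={p:.3f}")
```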
Peer reviewed
Coe, Robert – Oxford Review of Education, 2008
The comparability of examinations in different subjects has been a controversial topic for many years and a number of criticisms have been made of statistical approaches to estimating the "difficulties" of achieving particular grades in different subjects. This paper argues that if comparability is understood in terms of a linking…
Descriptors: Test Items, Grades (Scholastic), Foreign Countries, Test Bias
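One of the simplest statistical approaches in this literature is a subject-pairs comparison: among candidates who took two subjects, compare the grades they achieved in each. The sketch below uses invented grade records and is meant only to make that idea concrete; it is not the linking framework the paper argues for.

```python
# Minimal sketch of a subject-pairs comparison, one of the simplest
# statistical approaches to subject "difficulty": for candidates who
# took both subjects, compare their mean grades. The records below are
# hypothetical.
from collections import defaultdict

# (candidate_id, subject, grade points) -- invented records.
records = [
    (1, "Maths", 6), (1, "History", 7),
    (2, "Maths", 4), (2, "History", 5),
    (3, "Maths", 7), (3, "History", 7),
    (4, "Maths", 5), (4, "History", 6),
]

by_candidate = defaultdict(dict)
for cand, subject, grade in records:
    by_candidate[cand][subject] = grade

# Mean grade difference among candidates who took both subjects.
diffs = [g["History"] - g["Maths"]
         for g in by_candidate.values()
         if "Maths" in g and "History" in g]
print("mean grade advantage of History over Maths:", sum(diffs) / len(diffs))
# A positive value is often read as History grades being "easier" to achieve,
# which is exactly the kind of inference the paper scrutinizes.
```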
Peer reviewed
Rafferty, Eileen A.; Treff, August V. – ERS Spectrum, 1994
Addresses issues faced by institutions attempting to design school profiles to meet accountability standards. Reports of high-stakes test results can be skewed by choice of statistic type (percent of students passing versus mean scores), sample bias, geographical transients, and omission errors. Administrators must look beyond "common…
Descriptors: Accountability, Achievement Tests, Comparative Testing, Elementary Secondary Education
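The point about statistic choice is easy to see with a small, invented example: under the same cut score, one school can lead on mean score while the other leads on percent passing.

```python
# Hypothetical illustration of how the choice of summary statistic can
# skew a school profile: School A has the higher mean score, while
# School B has the higher percent passing. All scores are invented.
school_a = [40, 45, 58, 90, 95, 97]   # a few very high scorers pull the mean up
school_b = [62, 63, 64, 65, 66, 67]   # everyone clears the cut score
cut = 60

for name, scores in [("School A", school_a), ("School B", school_b)]:
    mean = sum(scores) / len(scores)
    pct_pass = 100 * sum(s >= cut for s in scores) / len(scores)
    print(f"{name}: mean = {mean:.1f}, percent passing = {pct_pass:.0f}%")
```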
Peer reviewed
Bolt, Sara E.; Ysseldyke, James E. – Applied Measurement in Education, 2006
Although testing accommodations are commonly provided to students with disabilities within large-scale testing programs, research findings on how well accommodations allow for comparable measurement of student knowledge and skill remain inconclusive. The purpose of this study was to examine the extent to which 1 commonly held belief about testing…
Descriptors: Oral Reading, Testing Accommodations, Disabilities, Special Needs Students
Peer reviewed
Meyer, J. Patrick; Huynh, Huynh; Seaman, Michael A. – Journal of Educational Measurement, 2004
Exact nonparametric procedures have been used to identify the level of differential item functioning (DIF) in binary items. This study explored the use of exact DIF procedures with items scored on a Likert scale. The results from an attitude survey suggest that the large-sample Cochran-Mantel-Haenszel (CMH) procedure identifies more items as…
Descriptors: Test Bias, Attitude Measures, Surveys, Predictive Validity
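For context, the large-sample Mantel-Haenszel procedure mentioned here combines 2x2 tables across total-score strata into a common odds ratio and a continuity-corrected chi-square. The sketch below uses made-up counts for a single binary item; the exact (small-sample) procedures the study explores are not shown.

```python
# Minimal sketch of the large-sample Mantel-Haenszel DIF statistic for a
# binary item, stratified by total score. The counts are hypothetical.
import numpy as np

# One 2x2 table per score stratum:
# rows = (reference, focal), columns = (correct, incorrect).
strata = [
    np.array([[30.0, 20.0], [20.0, 30.0]]),
    np.array([[50.0, 25.0], [30.0, 30.0]]),
    np.array([[70.0, 15.0], [45.0, 20.0]]),
]

num_or, den_or = 0.0, 0.0              # pieces of the MH common odds ratio
a_sum, e_sum, v_sum = 0.0, 0.0, 0.0    # pieces of the MH chi-square

for t in strata:
    (a, b), (c, d) = t
    n = t.sum()
    num_or += a * d / n
    den_or += b * c / n
    a_sum += a
    e_sum += (a + b) * (a + c) / n                               # E[A] under no DIF
    v_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n**2 * (n - 1))

alpha_mh = num_or / den_or
chi2_mh = (abs(a_sum - e_sum) - 0.5) ** 2 / v_sum                # continuity-corrected
print(f"MH common odds ratio = {alpha_mh:.2f}, MH chi-square = {chi2_mh:.2f}")
```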
Chang, Yu-Wen; Davison, Mark L. – 1992
Standard errors and bias of unidimensional and multidimensional ability estimates were compared in a factorial simulation design with two item response theory (IRT) approaches, two levels of test correlation (0.42 and 0.63), two sample sizes (500 and 1,000), and a hierarchical test content structure. Bias and standard errors of subtest scores…
Descriptors: Comparative Testing, Computer Simulation, Correlation, Error of Measurement
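The general recipe behind such a simulation (generate responses at a known ability, estimate, repeat, then summarize the estimates) can be sketched compactly. The version below is a deliberately simplified unidimensional 2PL example with invented item parameters and an EAP scorer; it does not reproduce the factorial design or the multidimensional conditions described above.

```python
# Minimal sketch of estimating bias and standard error of IRT ability
# estimates by simulation: generate 2PL responses at a known theta,
# score each replication with an EAP estimator, then summarize.
# Unidimensional only, with made-up item parameters.
import numpy as np

rng = np.random.default_rng(2)
a = rng.uniform(0.8, 2.0, 30)          # discriminations for 30 items
b = rng.normal(0.0, 1.0, 30)           # difficulties
theta_true, n_reps = 0.5, 500

grid = np.linspace(-4, 4, 161)         # quadrature grid for EAP
prior = np.exp(-0.5 * grid**2)         # standard normal prior (unnormalized)

def eap(responses):
    """Expected a posteriori ability estimate under the 2PL model."""
    p = 1.0 / (1.0 + np.exp(-a[:, None] * (grid[None, :] - b[:, None])))
    like = np.prod(np.where(responses[:, None] == 1, p, 1 - p), axis=0)
    post = like * prior
    return np.sum(grid * post) / np.sum(post)

estimates = []
for _ in range(n_reps):
    p_true = 1.0 / (1.0 + np.exp(-a * (theta_true - b)))
    responses = (rng.random(30) < p_true).astype(int)
    estimates.append(eap(responses))

estimates = np.array(estimates)
print(f"bias = {estimates.mean() - theta_true:+.3f}, "
      f"standard error = {estimates.std(ddof=1):.3f}")
```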
Katz, Elinor – 1993
A critical analysis of the literature on survey research is presented, covering personal interviews, telephone interviews, and mail questionnaires. Additional research concerns are explored, and a code of ethics for survey researchers is presented. Focus groups, interviews, long interviews, telephone interviews, and mail surveys are…
Descriptors: Codes of Ethics, Comparative Testing, Confidentiality, Interviews
McManus, Barbara Luger – 1992
This paper discusses whether revisions of the Scholastic Aptitude Test (SAT) and the American College Test (ACT) have created differences substantial enough that a student could conceivably score markedly higher on one test than on the other. The SAT has been revised to meet the needs of an increasingly diverse student…
Descriptors: Ability, Achievement Tests, Aptitude Tests, College Entrance Examinations