Showing all 14 results
Peer reviewed
von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…
Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit
Peer reviewed
van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2019
Lord's (1980) equity theorem claims observed-score equating to be possible only when two test forms are perfectly reliable or strictly parallel. An analysis of its proof reveals use of an incorrect statistical assumption. The assumption does not invalidate the theorem itself though, which can be shown to follow directly from the discrete nature of…
Descriptors: Equated Scores, Testing Problems, Item Response Theory, Evaluation Methods
Peer reviewed
Sinharay, Sandip; Duong, Minh Q.; Wood, Scott W. – Journal of Educational Measurement, 2017
As noted by Fremer and Olson, analysis of answer changes is often used to investigate testing irregularities because the analysis is readily performed and has proven its value in practice. Researchers such as Belov, Sinharay and Johnson, van der Linden and Jeon, van der Linden and Lewis, and Wollack, Cohen, and Eckerly have suggested several…
Descriptors: Identification, Statistics, Change, Tests
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2018
Wollack, Cohen, and Eckerly suggested the "erasure detection index" (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for…
Descriptors: Deception, Identification, Testing Problems, Cheating
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2017
An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…
Descriptors: Test Items, Cheating, Testing Problems, Identification
Peer reviewed
Klufa, Jindrich – Journal on Efficiency and Responsibility in Education and Science, 2016
The paper analyzes differences in the number of points scored across variants of the mathematics test used in the 2015 entrance examinations at the Faculty of Business Administration, University of Economics in Prague. The differences may arise due to the varying difficulty of variants for students, but also…
Descriptors: Foreign Countries, College Students, Business Administration Education, College Entrance Examinations
Peer reviewed
Hanson, Bradley A. – Applied Measurement in Education, 1996
Three statistical tests based on loglinear models are explored for determining whether score distributions differ on two or more test forms administered to samples of examinees from a single population. Examples are presented of applying tests of distribution differences to decide if equating is needed for alternative forms of a test. (SLD)
Descriptors: Equated Scores, Scoring, Statistical Distributions, Test Format
Peer reviewed
Burket, George R. – Journal of Educational Measurement, 1987
This response to the Baglin paper (1986) points out the fallacy in inferring that inappropriate scaling procedures cause apparent discrepancies between medians and means and between means calculated using different units. (LMO)
Descriptors: Norm Referenced Tests, Scaling, Scoring, Statistical Distributions
Peer reviewed
Walberg, Herbert J.; And Others – Review of Educational Research, 1984
This paper demonstrates the variety of positive-skew phenomena and discusses their theoretical, research, and practical implications in education. (PN)
Descriptors: Academic Achievement, Data Analysis, Research Problems, Scores
Peer reviewed
Roberts, Dennis M. – Journal of Educational Measurement, 1987
This study examines a score-difference model for the detection of cheating based on the difference between two scores for an examinee: one based on the appropriate scoring key and another based on an alternative, inappropriate key. It argues that the score-difference method could falsely accuse students of cheating. (Author/JAZ)
Descriptors: Answer Keys, Cheating, Mathematical Models, Multiple Choice Tests
Peer reviewed
Huberty, Carl J. – Educational Researcher, 1987
Two approaches to statistical testing are critically reviewed. A new approach, which is a hybrid of the two, is proposed. The new approach requires the researcher to think about the two types of potential inferential errors and an explicit alternative hypothesis of interest. (VM)
Descriptors: Educational Assessment, Instruction, Multivariate Analysis, Researchers
Peer reviewed
Charters, W. W., Jr.; Pitner, Nancy J. – Educational and Psychological Measurement, 1986
This paper reports on the application of Yukl's Management Behavior Survey in 47 elementary schools. Three problems with the instrument are discussed: (1) lack of response; (2) interrater disagreement; and (3) ceiling effects. The dimensionality of the measure is evaluated through factor analysis. (Author/LMO)
Descriptors: Administrators, Behavior Rating Scales, Elementary Education, Factor Analysis
Peer reviewed
van der Ven, A. H. G. S.; And Others – Applied Psychological Measurement, 1989
A new model is presented that explains reaction time fluctuations in prolonged work tasks. The model extends the so-called Poisson-Erlang model and accounts for long-term trend effects in the reaction time curve. The model is consistent with Spearman's hypothesis that inhibition increases during work and decreases during rest. (TJH)
Descriptors: Elementary Secondary Education, Equations (Mathematics), Foreign Countries, Goodness of Fit
Peer reviewed
Brown, Dianne C. – American Psychologist, 1994
Introduces controversial issue of subgroup norming, in which normative reference data are based on subgroups of population rather than on total group, in employment testing and briefly highlights two articles that address this issue. Controversy over subgroup norming has increased with passage of Civil Rights Act of 1991, which bans any form of…
Descriptors: Employment Practices, Employment Qualifications, Equal Opportunities (Jobs), Minority Groups