ERIC - Search Results

Showing all 14 results

A Robust Method for Detecting Item Misfit in Large-Scale Assessments

Peer reviewed

von Davier, Matthias; Bezirhan, Ummugul – Educational and Psychological Measurement, 2023

Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population…

Descriptors: Robustness (Statistics), Test Items, Item Analysis, Goodness of Fit

Lord's Equity Theorem Revisited

Peer reviewed

van der Linden, Wim J. – Journal of Educational and Behavioral Statistics, 2019

Lord's (1980) equity theorem claims observed-score equating to be possible only when two test forms are perfectly reliable or strictly parallel. An analysis of its proof reveals use of an incorrect statistical assumption. The assumption does not invalidate the theorem itself though, which can be shown to follow directly from the discrete nature of…

Descriptors: Equated Scores, Testing Problems, Item Response Theory, Evaluation Methods

A New Statistic for Detection of Aberrant Answer Changes

Peer reviewed

Sinharay, Sandip; Duong, Minh Q.; Wood, Scott W. – Journal of Educational Measurement, 2017

As noted by Fremer and Olson, analysis of answer changes is often used to investigate testing irregularities because the analysis is readily performed and has proven its value in practice. Researchers such as Belov, Sinharay and Johnson, van der Linden and Jeon, van der Linden and Lewis, and Wollack, Cohen, and Eckerly have suggested several…

Descriptors: Identification, Statistics, Change, Tests

Detecting Fraudulent Erasures at an Aggregate Level

Peer reviewed
PDF on ERIC

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2018

Wollack, Cohen, and Eckerly suggested the "erasure detection index" (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for…

Descriptors: Deception, Identification, Testing Problems, Cheating

Detection of Item Preknowledge Using Likelihood Ratio Test and Score Test

Peer reviewed

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2017

An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…

Descriptors: Test Items, Cheating, Testing Problems, Identification

Comparison of the Test Variants in Entrance Examinations

Peer reviewed
PDF on ERIC

Klufa, Jindrich – Journal on Efficiency and Responsibility in Education and Science, 2016

The paper analyzes differences in the number of points scored on the mathematics test between the test variants used in the entrance examinations at the Faculty of Business Administration at University of Economics in Prague in 2015. The differences may arise due to the varying difficulty of variants for students, but also…

Descriptors: Foreign Countries, College Students, Business Administration Education, College Entrance Examinations

Testing for Differences in Test Score Distributions Using Loglinear Models.

Peer reviewed

Hanson, Bradley A. – Applied Measurement in Education, 1996

Three statistical tests based on loglinear models are explored for determining whether score distributions differ on two or more test forms administered to samples of examinees from a single population. Examples are presented of applying tests of distribution differences to decide if equating is needed for alternative forms of a test. (SLD)

Descriptors: Equated Scores, Scoring, Statistical Distributions, Test Format

Group Scores: A Response to Baglin.

Peer reviewed

Burket, George R. – Journal of Educational Measurement, 1987

This response to the Baglin paper (1986) points out the fallacy in inferring that inappropriate scaling procedures cause apparent discrepancies between medians and means and between means calculated using different units. (LMO)

Descriptors: Norm Referenced Tests, Scaling, Scoring, Statistical Distributions

Exceptional Performance.

Peer reviewed

Walberg, Herbert J.; And Others – Review of Educational Research, 1984

This paper demonstrates the variety of positive-skew phenomena and discusses their theoretical, research, and practical implications in education. (PN)

Descriptors: Academic Achievement, Data Analysis, Research Problems, Scores

Limitations of the Score-Difference Method in Detecting Cheating in Recognition Test Situations.

Peer reviewed

Roberts, Dennis M. – Journal of Educational Measurement, 1987

This study examines a score-difference model for the detection of cheating based on the difference between two scores for an examinee: one based on the appropriate scoring key and another based on an alternative, inappropriate key. It argues that the score-difference method could falsely accuse students of cheating. (Author/JAZ)

Descriptors: Answer Keys, Cheating, Mathematical Models, Multiple Choice Tests
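
The score-difference idea summarized in the Roberts (1987) abstract above can be illustrated with a minimal sketch. This is not the model from the article: the function names, the five-item keys, and the response vector are all hypothetical, and only the basic quantity (score under the appropriate key minus score under an alternative key) is taken from the abstract.

```python
from typing import Sequence


def number_correct(responses: Sequence[str], key: Sequence[str]) -> int:
    """Count the responses that match the given scoring key."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(r == k for r, k in zip(responses, key))


def score_difference(
    responses: Sequence[str],
    appropriate_key: Sequence[str],
    alternative_key: Sequence[str],
) -> int:
    """Score under the appropriate key minus score under the alternative key."""
    return number_correct(responses, appropriate_key) - number_correct(
        responses, alternative_key
    )


# Hypothetical data: a five-item test with two forms whose keys differ.
form_a_key = ["A", "C", "B", "D", "A"]  # key for the form the examinee received
form_b_key = ["B", "C", "D", "A", "A"]  # key for the other form
responses = ["B", "C", "D", "D", "A"]

diff = score_difference(responses, form_a_key, form_b_key)
print(diff)  # 3 correct under form A, 4 under form B, so the difference is -1
```

A negative difference means the responses agree more closely with the alternative key than with the appropriate one. As the abstract cautions, such a difference on its own could falsely accuse a student, so it should not be read as proof of cheating.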

On Statistical Testing.

Peer reviewed

Huberty, Carl J. – Educational Researcher, 1987

Two approaches to statistical testing are critically reviewed. A new approach, which is a hybrid of the two, is proposed. The new approach requires the researcher to think about the two types of potential inferential errors and an explicit alternative hypothesis of interest. (VM)

Descriptors: Educational Assessment, Instruction, Multivariate Analysis, Researchers

The Application of the Management Behavior Survey to the Measurement of Principal Leadership Behaviors.

Peer reviewed

Charters, W. W., Jr.; Pitner, Nancy J. – Educational and Psychological Measurement, 1986

This paper reports on the application of Yukl's Management Behavior Survey in 47 elementary schools. Three problems with the instrument are discussed: (1) lack of response; (2) interrater disagreement; and (3) ceiling effects. The dimensionality of the measure is evaluated through factor analysis. (Author/LMO)

Descriptors: Administrators, Behavior Rating Scales, Elementary Education, Factor Analysis

Inhibition in Prolonged Work Tasks.

Peer reviewed

van der Ven, A. H. G. S.; And Others – Applied Psychological Measurement, 1989

A new model is presented that explains reaction time fluctuations in prolonged work tasks. The model extends the so-called Poisson-Erlang model and accounts for long-term trend effects in the reaction time curve. The model is consistent with Spearman's hypothesis that inhibition increases during work and decreases during rest. (TJH)

Descriptors: Elementary Secondary Education, Equations (Mathematics), Foreign Countries, Goodness of Fit

Subgroup Norming: Legitimate Testing Practice or Reverse Discrimination?

Peer reviewed

Brown, Dianne C. – American Psychologist, 1994

Introduces controversial issue of subgroup norming, in which normative reference data are based on subgroups of population rather than on total group, in employment testing and briefly highlights two articles that address this issue. Controversy over subgroup norming has increased with passage of Civil Rights Act of 1991, which bans any form of…

Descriptors: Employment Practices, Employment Qualifications, Equal Opportunities (Jobs), Minority Groups