Pan, Yiqin; Livne, Oren; Wollack, James A.; Sinharay, Sandip – Educational Measurement: Issues and Practice, 2023
In computerized adaptive testing, overexposure of items in the bank is a serious problem and might result in item compromise. We develop an item selection algorithm that utilizes the entire bank well and reduces the overexposure of items. The algorithm is based on collaborative filtering and selects an item in two stages. In the first stage, a set…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
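The abstract describes a two-stage, collaborative-filtering-based item selection rule but is truncated before the details. As a rough illustration of the general idea only (the function names, the cosine-similarity measure, and the "least-exposed among the most similar" rule are assumptions, not the authors' algorithm), a minimal sketch might look like:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length item-response vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_item(last_item, responses, exposure, administered, k=3):
    """Hypothetical two-stage pick: (1) form a candidate set of the k
    unadministered items most similar to the last administered item,
    based on past response patterns; (2) return the least-exposed
    candidate, spreading usage across the bank."""
    n_items = len(responses[0])
    # Column j = response vector of item j across past examinees.
    cols = [[row[j] for row in responses] for j in range(n_items)]
    pool = [j for j in range(n_items) if j not in administered]
    pool.sort(key=lambda j: cosine(cols[last_item], cols[j]), reverse=True)
    candidates = pool[:k]
    return min(candidates, key=lambda j: exposure[j])
```

For example, with a small 0/1 response matrix and exposure counts, the rule prefers an item that looks like the last one but has been seen less often.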
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2022
Administrative problems such as computer malfunction and power outages occasionally lead to missing item scores, and hence to incomplete data, on credentialing tests such as the United States Medical Licensing Examination. Feinberg compared four approaches for reporting pass-fail decisions to the examinees with incomplete data on credentialing…
Descriptors: Testing Problems, High Stakes Tests, Credentials, Test Items
Gorney, Kylie; Wollack, James A.; Sinharay, Sandip; Eckerly, Carol – Journal of Educational and Behavioral Statistics, 2023
Any time examinees have had access to items and/or answers prior to taking a test, the fairness of the test and validity of test score interpretations are threatened. Therefore, there is a high demand for procedures to detect both compromised items (CI) and examinees with preknowledge (EWP). In this article, we develop a procedure that uses item…
Descriptors: Scores, Test Validity, Test Items, Prior Learning
Sinharay, Sandip – Grantee Submission, 2021
Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices of Levine and Drasgow (1988) and is the most powerful statistic for detecting item…
Descriptors: Robustness (Statistics), Hypothesis Testing, Statistics, Test Items
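The statistic from Drasgow, Levine, and Zickar (1996) is a likelihood ratio over the known set of compromised items. As a hedged sketch of that general form (not the authors' exact statistic): compare the likelihood of the observed scores under a preknowledge model, where success probability on a compromised item is some high constant, against a 2PL model at the examinee's ability. The constant `p_pre` and the function names are illustrative assumptions.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def lr_statistic(scores, params, theta, p_pre=0.9):
    """Log-likelihood ratio on the compromised items:
    preknowledge model (success probability p_pre) versus the 2PL
    at ability theta. Large positive values favor preknowledge."""
    stat = 0.0
    for x, (a, b) in zip(scores, params):
        p0 = p_2pl(theta, a, b)  # null: no preknowledge
        p1 = p_pre               # alternative: item was known in advance
        stat += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
    return stat
```

A low-ability examinee who answers several hard compromised items correctly yields a large positive value; the same examinee answering them incorrectly yields a negative one.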
The Reliability of the Posterior Probability of Skill Attainment in Diagnostic Classification Models
Johnson, Matthew S.; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2020
One common score reported from diagnostic classification assessments is the vector of posterior means of the skill mastery indicators. As with any assessment, it is important to derive and report estimates of the reliability of the reported scores. After reviewing a reliability measure suggested by Templin and Bradshaw, this article suggests three…
Descriptors: Reliability, Probability, Skill Development, Classification
Sinharay, Sandip – Grantee Submission, 2019
Benefiting from item preknowledge (e.g., McLeod, Lewis, & Thissen, 2003) is a major type of fraudulent behavior during educational assessments. This paper suggests a new statistic that can be used for detecting the examinees who may have benefitted from item preknowledge using their response times. The statistic quantifies the difference in…
Descriptors: Test Items, Cheating, Reaction Time, Identification
Sinharay, Sandip; van Rijn, Peter W. – Journal of Educational and Behavioral Statistics, 2020
Response time models (RTMs) are of increasing interest in educational and psychological testing. This article focuses on the lognormal model for response times, which is one of the most popular RTMs. Several existing statistics for testing normality and the fit of factor analysis models are repurposed for testing the fit of the lognormal model. A…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Factor Analysis
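The key observation behind repurposing normality tests here is that the lognormal model holds exactly when log response times are normally distributed. A minimal stdlib-only sketch of that idea (sample skewness of the log times is one simple stand-in for the normality statistics the article studies; the function names are assumptions):

```python
import math
import random

def skewness(xs):
    """Sample skewness of a list of values."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    s3 = sum((x - m) ** 3 for x in xs) / n
    return s3 / s2 ** 1.5

def log_time_skew(times):
    """Skewness of log response times: near 0 when the lognormal
    model fits, since the log times are then normal."""
    return skewness([math.log(t) for t in times])
```

Times simulated from a lognormal distribution give log-time skewness near zero, while, say, exponential times give clearly negative log-time skewness, flagging misfit.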
Moon, Jung Aa; Sinharay, Sandip; Keehner, Madeleine; Katz, Irvin R. – International Journal of Testing, 2020
The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants' tendency to respond to an item was affected by the presence of a grid and variations of…
Descriptors: Computer Assisted Testing, Multiple Choice Tests, Test Wiseness, Psychometrics
Sinharay, Sandip; van Rijn, Peter – Grantee Submission, 2020
Response-time models are of increasing interest in educational and psychological testing. This paper focuses on the lognormal model for response times (van der Linden, 2006), which is one of the most popular response-time models. Several existing statistics for testing normality and the fit of factor-analysis models are repurposed for testing the…
Descriptors: Educational Testing, Psychological Testing, Goodness of Fit, Factor Analysis
Sinharay, Sandip; Jensen, Jens Ledet – Grantee Submission, 2018
In educational and psychological measurement, researchers and practitioners are often interested in examining whether the ability of an examinee is the same over two sets of items. Such problems arise in measurement of change, detection of cheating on unproctored tests, erasure analysis, detection of item preknowledge, and so on. Traditional…
Descriptors: Test Items, Ability, Mathematics, Item Response Theory
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), benefitting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect the examinees who may have…
Descriptors: Scores, Test Items, Reaction Time, Cheating
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties. …
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2017
An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…
Descriptors: Test Items, Cheating, Testing Problems, Identification
Sinharay, Sandip – Journal of Educational Measurement, 2017
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Descriptors: Goodness of Fit, Testing, Test Items, Scores
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns