ERIC - Search Results

Showing 1 to 15 of 16 results

Explaining Performance Decline over the Course of Taking Comprehensive Proficiency Tests: The Roles of Effort and Omission Propensity

Peer reviewed

Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024

In order to ensure content validity by covering a broad range of content domains, the testing times of some educational large-scale assessments last up to a total of two hours or more. Performance decline over the course of taking the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…

Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries

A Statistical Procedure for Testing Unusually Frequent Exactly Matching Responses and Nearly Matching Responses. Research Report. ETS RR-17-23

Peer reviewed

Haberman, Shelby J.; Lee, Yi-Hsuan – ETS Research Report Series, 2017

In investigations of unusual testing behavior, a common question is whether a specific pattern of responses occurs unusually often within a group of examinees. In many current tests, modern communication techniques can permit quite large numbers of examinees to share keys, or common response patterns, to the entire test. To address this issue,…

Descriptors: Student Evaluation, Testing, Item Response Theory, Maximum Likelihood Statistics

Examining Estimates of Intervention Effectiveness Using Sensitivity Analysis

Peer reviewed

An, Chen; Braun, Henry; Walsh, Mary E. – Educational Measurement: Issues and Practice, 2018

Making causal inferences from a quasi-experiment is difficult. Sensitivity analysis approaches to address hidden selection bias thus have gained popularity. This study serves as an introduction to a simple but practical form of sensitivity analysis using Monte Carlo simulation procedures. We examine estimated treatment effects for a school-based…

Descriptors: Statistical Inference, Intervention, Program Effectiveness, Quasiexperimental Design

Determining the Overall Impact of Interruptions during Online Testing

Peer reviewed

Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014

With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…

Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Exact Tests for the Rasch Model via Sequential Importance Sampling

Peer reviewed

Chen, Yuguo; Small, Dylan – Psychometrika, 2005

Rasch proposed an exact conditional inference approach to testing his model but never implemented it because it involves the calculation of a complicated probability. This paper furthers Rasch's approach by (1) providing an efficient Monte Carlo methodology for accurately approximating the required probability and (2) illustrating the usefulness…

Descriptors: Testing Problems, Probability, Methods, Testing

A Two-Stage Procedure for Selecting the Best of Several Binomial Populations; [and] Some Exact Sample Sizes for Comparing the Squared Multiple Correlation Coefficient to a Standard; [and] An Improved Decision-Theoretic Coefficient for Tests. Studies in Measurement and Methodology, Work Unit 3: Technical Adequacy of Tests.

Wilcox, Rand R. – 1979

Three separate papers are included in this report. The first describes a two-stage procedure for choosing from among several instructional programs the one which maximizes the probability of passing the test. The second gives the exact sample sizes required to determine whether a squared multiple correlation coefficient is above or below a known…

Descriptors: Bayesian Statistics, Correlation, Hypothesis Testing, Mathematical Models

Applying Ranking and Selection Techniques to Determine the Length of a Mastery Test.

Peer reviewed

Wilcox, Rand R. – Educational and Psychological Measurement, 1979

A problem of considerable importance in certain educational settings is determining how many items to include on a mastery test. Applying ranking and selection procedures, a solution is given which includes as a special case all existing single-stage, non-Bayesian solutions based on a strong true-score model. (Author/JKS)

Descriptors: Criterion Referenced Tests, Mastery Tests, Nonparametric Statistics, Probability

An Evaluation of Some Models for Culture-Fair Selection.

Petersen, Nancy S.; Novick, Melvin R. – 1975

Models proposed by Cleary, Thorndike, Cole, Linn, Einhorn and Bass, Darlington, and Gross and Su for analyzing bias in the use of tests in a selection strategy are surveyed. Several additional models are also introduced. The purpose is to describe, compare, contrast, and evaluate these models while extracting such useful ideas as may be found in…

Descriptors: Comparative Analysis, Culture Fair Tests, Models, Personnel Selection

The Cloze Procedure Applied to a Probability Concepts Test.

Peer reviewed

Green, D. R.; Tomlinson, M. – Journal of Research in Reading, 1983

Confirms that in cloze testing, it is unnecessary to use standard size spaces and reveals a high correlation between synonymic scoring and verbatim scoring. Indicates also that a specific probability concepts test is comprehensible and readable by the great majority of students for whom it was devised. (FL)

Descriptors: Cloze Procedure, Elementary Secondary Education, Listening Skills, Probability

Effect of Examinee Certainty on Probabilistic Test Scores and a Comparison of Scoring Methods for Probabilistic Responses.

Suhadolnik, Debra; Weiss, David J. – 1983

The present study was an attempt to alleviate some of the difficulties inherent in multiple-choice items by having examinees respond to multiple-choice items in a probabilistic manner. Using this format, examinees are able to respond to each alternative and to provide indications of any partial knowledge they may possess concerning the item. The…

Descriptors: Confidence Testing, Multiple Choice Tests, Probability, Response Style (Tests)

An Alternative Interpretation of Three Stability Models. Measurement and Methodology, Work Unit 2: Technical Adequacy of Tests.

Wilcox, Rand R. – 1978

Two fundamental problems in mental test theory are to estimate true score and to estimate the amount of error when testing an examinee. In this report, three probability models which characterize a single test item in terms of a population of examinees are described. How these models may be modified to characterize a single examinee in terms of an…

Descriptors: Achievement Tests, Comparative Analysis, Error of Measurement, Mathematical Models

Testing with Personal Probabilities: 11-Year-Olds Can Correctly Estimate Their Personal Probabilities.

Peer reviewed

Dirkzwager, A. – Educational and Psychological Measurement, 1996

Testing with personal probabilities eliminates guessing whether the subjects are well calibrated. A probability testing study with 47 Dutch elementary school children who used an interactive computer program shows that even 11-year-olds can estimate their personal probabilities correctly. (SLD)

Descriptors: Computer Assisted Testing, Elementary Education, Elementary School Students, Estimation (Mathematics)

A Two-Parameter Latent Trait Model. Methodology Project.

Choppin, Bruce – 1982

On well-constructed multiple-choice tests, the most serious threat to measurement is not variation in item discrimination, but the guessing behavior that may be adopted by some students. Ways of ameliorating the effects of guessing are discussed, especially for problems in latent trait models. A new item response model, including an item parameter…

Descriptors: Ability, Algorithms, Guessing (Tests), Item Analysis

Utilizing Rasch Analysis to Detect Cheating on Language Examinations.

Madsen, Harold S. – 1987

A study investigated the effectiveness of the Rasch procedure in measuring response appropriateness, especially for the detection of cheating on multiple-choice language tests. The report gives background information on appropriateness measurement and its potential uses, reviews recent research on cheating and its detection, and describes three…

Descriptors: Cheating, English (Second Language), Evaluation Methods, Language Tests
