Showing 1 to 15 of 21 results
Peer reviewed
Karoline A. Sachse; Sebastian Weirich; Nicole Mahler; Camilla Rjosk – International Journal of Testing, 2024
To ensure content validity by covering a broad range of content domains, total testing time in some educational large-scale assessments reaches two hours or more. Performance decline over the course of the test has been extensively documented in the literature. It can occur due to increases in the numbers of: (a)…
Descriptors: Test Wiseness, Test Score Decline, Testing Problems, Foreign Countries
Peer reviewed
Sinharay, Sandip – Educational and Psychological Measurement, 2022
Administrative problems such as computer malfunctions and power outages occasionally lead to missing item scores and hence to incomplete data on mastery tests such as the AP and U.S. Medical Licensing examinations. Investigators are often interested in estimating the passing probabilities of examinees with incomplete data on mastery tests.…
Descriptors: Mastery Tests, Computer Assisted Testing, Probability, Test Wiseness
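The estimation problem this abstract describes can be illustrated with a toy Monte Carlo sketch (not the author's method): assume a Rasch model with known difficulties for the missing items and a point estimate of the examinee's ability, simulate the missing responses, and count how often the completed total clears the cutoff. All numbers below are hypothetical.

```python
import math
import random

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def passing_probability(theta, observed_score, missing_difficulties,
                        cutoff, n_sims=10_000, seed=0):
    """Monte Carlo estimate of P(total score >= cutoff) when some item
    scores are missing: simulate the missing responses under the Rasch
    model at the examinee's estimated ability."""
    rng = random.Random(seed)
    passes = 0
    for _ in range(n_sims):
        simulated = sum(rng.random() < rasch_p(theta, b)
                        for b in missing_difficulties)
        if observed_score + simulated >= cutoff:
            passes += 1
    return passes / n_sims

# Examinee answered 12 scored items correctly; 5 item scores are missing.
p = passing_probability(theta=1.0, observed_score=12,
                        missing_difficulties=[-0.5, 0.0, 0.2, 0.8, 1.5],
                        cutoff=14)
print(round(p, 3))
```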
Peer reviewed
Haberman, Shelby J.; Lee, Yi-Hsuan – ETS Research Report Series, 2017
In investigations of unusual testing behavior, a common question is whether a specific pattern of responses occurs unusually often within a group of examinees. In many current tests, modern communication techniques can permit quite large numbers of examinees to share keys, or common response patterns, to the entire test. To address this issue,…
Descriptors: Student Evaluation, Testing, Item Response Theory, Maximum Likelihood Statistics
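As a back-of-envelope illustration of why repeated identical response patterns are suspicious (a simplification, not the authors' procedure): if each examinee independently produces a given full-test pattern with some small model probability p, the number of exact matches among n examinees is Binomial(n, p), and the upper tail of that distribution can be computed directly.

```python
from math import comb

def pattern_match_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p): the chance that at least k of
    n examinees independently produce the same full response pattern.
    Computed via the complement so only the first k terms are needed."""
    below = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))
    return 1.0 - below

# If a specific full-test pattern has model probability 1e-6 per examinee,
# seeing it three or more times among 2,000 examinees is wildly
# improbable under independent responding.
tail = pattern_match_tail(n=2000, p=1e-6, k=3)
print(f"{tail:.3e}")
```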
Peer reviewed
An, Chen; Braun, Henry; Walsh, Mary E. – Educational Measurement: Issues and Practice, 2018
Making causal inferences from a quasi-experiment is difficult. Sensitivity analysis approaches to address hidden selection bias thus have gained popularity. This study serves as an introduction to a simple but practical form of sensitivity analysis using Monte Carlo simulation procedures. We examine estimated treatment effects for a school-based…
Descriptors: Statistical Inference, Intervention, Program Effectiveness, Quasiexperimental Design
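The flavor of such a Monte Carlo sensitivity analysis can be sketched as follows (a hypothetical simulation, not the study's actual procedure): inject an unobserved confounder of varying strength into simulated data and watch how far the naive treatment-effect estimate drifts from the known truth.

```python
import math
import random
import statistics

def naive_effect(strength, n=20_000, true_effect=0.3, seed=1):
    """Simulate an unobserved confounder u that raises both the chance
    of treatment and the outcome, then return the naive treated-minus-
    control mean difference, which absorbs the hidden selection bias."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                        # hidden confounder
        p_treat = 1 / (1 + math.exp(-strength * u))
        is_treated = rng.random() < p_treat
        y = true_effect * is_treated + strength * u + rng.gauss(0, 1)
        (treated if is_treated else control).append(y)
    return statistics.mean(treated) - statistics.mean(control)

# At strength 0 the naive estimate is unbiased for the true effect of
# 0.3; as the hidden confounder strengthens, the bias grows.
for s in (0.0, 0.5, 1.0):
    print(s, round(naive_effect(s), 2))
```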
Peer reviewed
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
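The mechanism this abstract points to can be illustrated with the standard Kish design effect for single-stage cluster sampling, DEFF = 1 + (m − 1)ρ, where m is the cluster size and ρ the intraclass correlation: ignoring it overstates the effective sample size and understates standard errors. The numbers below are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Kish design effect for cluster sampling: DEFF = 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n, cluster_size, icc):
    """Nominal sample size deflated by the design effect."""
    return n / design_effect(cluster_size, icc)

# 5,000 students sampled as 200 intact classrooms of 25, ICC = 0.20:
deff = design_effect(25, 0.20)                      # 5.8
n_eff = effective_sample_size(5000, 25, 0.20)       # ~862, not 5,000
se_inflation = math.sqrt(deff)                      # SEs ~2.4x larger
print(deff, round(n_eff), round(se_inflation, 2))
```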
Peer reviewed
Chen, Yuguo; Small, Dylan – Psychometrika, 2005
Rasch proposed an exact conditional inference approach to testing his model but never implemented it because it involves the calculation of a complicated probability. This paper furthers Rasch's approach by (1) providing an efficient Monte Carlo methodology for accurately approximating the required probability and (2) illustrating the usefulness…
Descriptors: Testing Problems, Probability, Methods, Testing
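The conditional approach rests on the fact that under the Rasch model the person and item totals are sufficient statistics, so conditional on those margins every 0/1 response matrix is equally likely, and exact inference works with the set of matrices sharing the observed margins. A brute-force enumeration (feasible only for tiny tables, unlike the paper's Monte Carlo methodology) makes the idea concrete:

```python
from itertools import product

def matrices_with_margins(row_sums, col_sums):
    """Enumerate all 0/1 matrices with the given row and column sums.
    Under the Rasch model, each is equally likely conditional on the
    margins, so this set is the exact conditional reference set."""
    n_rows, n_cols = len(row_sums), len(col_sums)
    out = []
    for cells in product((0, 1), repeat=n_rows * n_cols):
        m = [cells[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]
        if [sum(r) for r in m] == list(row_sums) and \
           [sum(c) for c in zip(*m)] == list(col_sums):
            out.append(m)
    return out

# 3 persons x 3 items, person scores (1, 2, 2), item scores (2, 2, 1):
ref = matrices_with_margins((1, 2, 2), (2, 2, 1))
print(len(ref))  # size of the exact conditional reference set
```

Any conditional test statistic is then evaluated against its uniform distribution over this reference set; for realistic table sizes the set is far too large to enumerate, which is what motivates Monte Carlo approximation.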
Koplyay, Janos B.; And Others – 1972
The relationship between true ability (operationally defined as the number of items for which the examinee actually knew the correct answer) and the effects of guessing upon observed test variance was investigated. Three basic hypotheses were treated mathematically: there is no functional relationship between true ability and guessing success;…
Descriptors: Guessing (Tests), Predictor Variables, Probability, Scoring
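The classical blind-guessing model behind such analyses (a textbook sketch, not necessarily the exact model treated in this report) says that an examinee who truly knows T of n k-option items scores T + (n − T)/k in expectation, and the formula score R − W/(k − 1) undoes that inflation:

```python
def expected_observed_score(true_score, n_items, n_options):
    """If an examinee knows true_score items and guesses blindly on the
    rest, each guess succeeds with probability 1/k."""
    return true_score + (n_items - true_score) / n_options

def corrected_score(rights, wrongs, n_options):
    """Classic formula score R - W/(k-1): its expectation recovers the
    true score under the blind-guessing model (omits not penalized)."""
    return rights - wrongs / (n_options - 1)

# 60-item, 4-option test; examinee truly knows 40 items:
ex = expected_observed_score(40, 60, 4)   # 45.0 rights expected
print(ex, corrected_score(45, 15, 4))     # correction recovers 40.0
```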
Peer reviewed
Echternacht, Gary – Educational and Psychological Measurement, 1974
Descriptors: Evaluation Criteria, Probability, Statistical Analysis, Test Bias
Slakter, Malcolm J. – Educational and Psychological Measurement, 1969
Research supported by the U.S. Office of Education, Contract OEC-6-10-239.
Descriptors: Decision Making, Measurement, Objective Tests, Performance
Wilcox, Rand R. – 1979
Three separate papers are included in this report. The first describes a two-stage procedure for choosing from among several instructional programs the one which maximizes the probability of passing the test. The second gives the exact sample sizes required to determine whether a squared multiple correlation coefficient is above or below a known…
Descriptors: Bayesian Statistics, Correlation, Hypothesis Testing, Mathematical Models
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1979
A problem of considerable importance in certain educational settings is determining how many items to include on a mastery test. Applying ranking and selection procedures, a solution is given which includes as a special case all existing single-stage, non-Bayesian solutions based on a strong true-score model. (Author/JKS)
Descriptors: Criterion Referenced Tests, Mastery Tests, Nonparametric Statistics, Probability
Petersen, Nancy S.; Novick, Melvin R. – 1975
Models proposed by Cleary, Thorndike, Cole, Linn, Einhorn and Bass, Darlington, and Gross and Su for analyzing bias in the use of tests in a selection strategy are surveyed. Several additional models are also introduced. The purpose is to describe, compare, contrast, and evaluate these models while extracting such useful ideas as may be found in…
Descriptors: Comparative Analysis, Culture Fair Tests, Models, Personnel Selection
Peer reviewed
Green, D. R.; Tomlinson, M. – Journal of Research in Reading, 1983
Confirms that in cloze testing, it is unnecessary to use standard size spaces and reveals a high correlation between synonymic scoring and verbatim scoring. Indicates also that a specific probability concepts test is comprehensible and readable by the great majority of students for whom it was devised. (FL)
Descriptors: Cloze Procedure, Elementary Secondary Education, Listening Skills, Probability
Suhadolnik, Debra; Weiss, David J. – 1983
The present study was an attempt to alleviate some of the difficulties inherent in multiple-choice items by having examinees respond to multiple-choice items in a probabilistic manner. Using this format, examinees are able to respond to each alternative and to provide indications of any partial knowledge they may possess concerning the item. The…
Descriptors: Confidence Testing, Multiple Choice Tests, Probability, Response Style (Tests)