Showing 1 to 15 of 21 results
Peer reviewed
Uysal, Ibrahim; Sahin-Kürsad, Merve; Kiliç, Abdullah Faruk – Participatory Educational Research, 2022
The aim of the study was to examine whether the common items in mixed-format tests (e.g., multiple-choice and essay items) contain parameter drift in test equating performed with the common-item nonequivalent groups design. In this study, which was carried out using Monte Carlo simulation with a fully crossed design, the factors of test…
Descriptors: Test Items, Test Format, Item Response Theory, Equated Scores
Peer reviewed
Sinharay, Sandip – Applied Measurement in Education, 2017
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristic curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis
Peer reviewed
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Peer reviewed
Watanabe, Yoshinori – Language Testing, 2013
This article describes the National Center Test for University Admissions, a unified national test in Japan, which is taken by 500,000 students every year. It states that implementation of the Center Test began in 1990, with the English component consisting only of the written section until 2005, when the listening section was first implemented…
Descriptors: College Admission, Foreign Countries, College Entrance Examinations, English (Second Language)
Peer reviewed
Camilli, Gregory – Educational Research and Evaluation, 2013
In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…
Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format
Peer reviewed
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology
Peer reviewed
Wollack, James A. – Applied Measurement in Education, 2006
Many of the currently available statistical indexes to detect answer copying lack sufficient power at small α levels or when the amount of copying is relatively small. Furthermore, there is no one index that is uniformly best. Depending on the type or amount of copying, certain indexes are better than others. The purpose of this article was…
Descriptors: Statistical Analysis, Item Analysis, Test Length, Sample Size
Peer reviewed
Conger, Anthony J. – Educational and Psychological Measurement, 1983
A paradoxical phenomenon of decreases in reliability as the number of elements averaged over increases is shown to be possible in multifacet reliability procedures (intraclass correlations or generalizability coefficients). Conditions governing this phenomenon are presented along with implications and cautions. (Author)
Descriptors: Generalizability Theory, Test Construction, Test Items, Test Length
Peer reviewed
Gallucci, Nicholas T. – Educational and Psychological Measurement, 1986
This study evaluated the degree to which 102 undergraduate participants objected to questions on the Minnesota Multiphasic Personality Inventory (MMPI) referring to sex, religion, bladder and bowel functions, family relationships, and unusual thinking, in comparison with their degree of objection to the length of the MMPI and the repetition of questions.…
Descriptors: College Students, Higher Education, Personality Measures, Psychological Evaluation
Peer reviewed
Sindhu, R. S.; Sharma, Reeta – Science Education International, 1999
Finds that the time required to attempt all the test items of each question paper in a four-paper sample was inversely proportional to the percentage of students who attempted all the test items of that paper. Extrapolates results to give guidelines for determining the feasibility of newly-developed exam papers. (WRM)
Descriptors: Science Tests, Secondary Education, Test Construction, Test Length
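The inverse-proportionality finding above lends itself to a simple feasibility check for a newly developed paper. The sketch below is a hypothetical illustration (all numbers are invented, not taken from the article): it estimates the constant of proportionality from calibration papers, then predicts the completion rate for a proposed time allotment.

```python
# Hypothetical illustration of the inverse-proportionality finding:
# time_required * percent_attempting_all ≈ constant across papers.
# All numbers below are invented for illustration.

calibration = [
    (150, 90.0),   # (minutes required, % of students attempting all items)
    (180, 75.0),
    (200, 67.5),
]

# Estimate the proportionality constant k from the calibration papers.
k = sum(t * p for t, p in calibration) / len(calibration)

def expected_completion_rate(minutes_allotted: float) -> float:
    """Predict the % of students expected to attempt all items."""
    return k / minutes_allotted

# A new 160-minute paper is deemed feasible if, say, >= 80% would finish.
rate = expected_completion_rate(160)
feasible = rate >= 80.0
```

With the invented calibration data, k works out to 13,500 minute-percent, so a 160-minute paper predicts roughly an 84% completion rate.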
Peer reviewed
Woodard, John L.; Axelrod, Bradley N. – Psychological Assessment, 1995
Using 308 patients referred for neuropsychological evaluation, 2 regression equations were developed to predict weighted raw score sums for General Memory and Delayed Recall using the Wechsler Memory Scale-Revised (WMS-R) analogs of 5 subtests from the original WMS. The equations may help reduce WMS-R administration time. (SLD)
Descriptors: Equations (Mathematics), Memory, Neuropsychology, Patients
Peer reviewed
Budescu, David – Journal of Educational Measurement, 1985
An important determinant of equating process efficiency is the correlation between the anchor test and components of each form. Use of some monotonic function of this correlation as a measure of equating efficiency is suggested. A model relating anchor test length and test reliability to this measure of efficiency is presented. (Author/DWH)
Descriptors: Correlation, Equated Scores, Mathematical Models, Standardized Tests
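Budescu's model ties anchor-test length and reliability to equating efficiency through the anchor-form correlation. A minimal classical-test-theory sketch, assuming only the standard Spearman-Brown formula and the attenuation ceiling on the correlation between two measures of the same trait (the illustrative reliability values are invented, not from the article):

```python
def spearman_brown(r1: float, m: float) -> float:
    """Reliability of a test lengthened by factor m (Spearman-Brown)."""
    return m * r1 / (1.0 + (m - 1.0) * r1)

def max_anchor_form_correlation(rel_anchor: float, rel_form: float) -> float:
    """Classical-test-theory ceiling on the observed anchor-form
    correlation: sqrt of the product of the two reliabilities."""
    return (rel_anchor * rel_form) ** 0.5

# Invented numbers: a single anchor item has reliability 0.20;
# the full form has reliability 0.90.
rel_20_item_anchor = spearman_brown(0.20, 20)   # reliability of a 20-item anchor
ceiling = max_anchor_form_correlation(rel_20_item_anchor, 0.90)
```

Lengthening the anchor raises its reliability and hence the attainable anchor-form correlation, which is the intuition behind using a monotonic function of that correlation as an efficiency measure.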
Peer reviewed
Wainer, Howard; Kiely, Gerard L. – Journal of Educational Measurement, 1987
The testlet, a bundle of test items, alleviates some problems associated with computerized adaptive testing: context effects, lack of robustness, and item difficulty ordering. While testlets may be linear or hierarchical, the most useful ones are four-level hierarchical units, containing 15 items and partitioning examinees into 16 classes. (GDC)
Descriptors: Adaptive Testing, Computer Assisted Testing, Context Effect, Item Banks
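The 15-item/16-class figures follow from the arithmetic of a complete binary tree: a four-level hierarchical testlet with two-branch (correct/incorrect) routing has 2^4 − 1 = 15 item nodes and 2^4 = 16 terminal classes. A minimal sketch, assuming binary routing; the array layout and function name are illustrative, not from the article:

```python
# A 4-level hierarchical testlet as a complete binary tree stored in an
# array: node i has children 2i+1 (answered incorrectly) and 2i+2
# (answered correctly).

DEPTH = 4
N_ITEMS = 2 ** DEPTH - 1     # 15 items in the testlet
N_CLASSES = 2 ** DEPTH       # 16 examinee classes

def route(responses) -> int:
    """Route an examinee through the tree. responses[i] is 1 (correct)
    or 0 (incorrect) for the item at array index i; each examinee
    actually answers only the DEPTH items on their path. Returns the
    class (leaf index 0..15) reached after DEPTH items."""
    node = 0
    for _ in range(DEPTH):
        node = 2 * node + 1 + responses[node]   # left on 0, right on 1
    return node - N_ITEMS                        # leaves follow the 15 nodes
```

An examinee who misses every routed item lands in class 0; one who answers every routed item correctly lands in class 15.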
Peer reviewed
Cudeck, Robert; And Others – Applied Psychological Measurement, 1979
TAILOR, a computer program which implements an approach to tailored testing, was examined by Monte Carlo methods. The evaluation showed the procedure to be highly reliable and capable of reducing the required number of tests items by about one half. (Author/JKS)
Descriptors: Adaptive Testing, Computer Programs, Feasibility Studies, Item Analysis
Peer reviewed
Wilcox, Rand R. – Educational and Psychological Measurement, 1979
A problem of considerable importance in certain educational settings is determining how many items to include on a mastery test. Applying ranking and selection procedures, a solution is given which includes as a special case all existing single-stage, non-Bayesian solutions based on a strong true-score model. (Author/JKS)
Descriptors: Criterion Referenced Tests, Mastery Tests, Nonparametric Statistics, Probability