Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Braun, Virginia; Clarke, Victoria; Boulton, Elicia; Davey, Louise; McEvoy, Charlotte – International Journal of Social Research Methodology, 2021
Fully "qualitative" surveys, which prioritise qualitative research values, and harness the rich potential of qualitative data, have much to offer qualitative researchers, especially given online delivery options. Yet the method remains underutilised, and there is little in the way of methodological discussion of qualitative surveys.…
Descriptors: Online Surveys, Qualitative Research, Social Science Research, Disclosure
Meriac, John P.; Woehr, David J.; Gorman, C. Allen; Thomas, Amanda L. E. – Journal of Vocational Behavior, 2013
The multidimensional work ethic profile (MWEP) has become one of the most widely used inventories for measuring the work ethic construct. However, its length has been a potential barrier to even more widespread use. We developed a short form of the MWEP, the MWEP-SF. A subset of items from the original measure was identified, using item response…
Descriptors: Work Ethic, Profiles, Measures (Individuals), Test Construction
Gewertz, Catherine – Education Week, 2012
A group that is developing tests for half the states in the nation has dramatically reduced the length of its assessment in a bid to balance the desire for a more meaningful and useful exam with concerns about the amount of time spent on testing. The decision by the Smarter Balanced Assessment Consortium reflects months of conversation among its…
Descriptors: State Standards, Test Length, Questioning Techniques, Test Construction
Tay, Louis; Drasgow, Fritz – Educational and Psychological Measurement, 2012
Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted χ²/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted…
Descriptors: Test Length, Monte Carlo Methods, Goodness of Fit, Item Response Theory
Culpepper, Steven Andrew – Applied Psychological Measurement, 2012
Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…
Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)
Pennsylvania Department of Education, 2010
This handbook describes the responsibilities of district and school assessment coordinators in the administration of the Pennsylvania System of School Assessment (PSSA). This updated guidebook contains the following sections: (1) General Assessment Guidelines for All Assessments; (2) Writing Specific Guidelines; (3) Reading and Mathematics…
Descriptors: Guidelines, Guides, Educational Assessment, Writing Tests
Chang, Yuan-chin Ivan – Psychometrika, 2005
In this paper, we apply sequential one-sided confidence interval estimation procedures with beta-protection to adaptive mastery testing. The procedures of fixed-width and fixed proportional accuracy confidence interval estimation can be viewed as extensions of one-sided confidence interval procedures. It can be shown that the adaptive mastery…
Descriptors: Mastery Tests, Probability, Intervals, Testing

Sanders, Piet F.; Verschoor, Alfred J. – Applied Psychological Measurement, 1998
Presents minimization and maximization models for parallel test construction under constraints. The minimization model constructs weakly and strongly parallel tests of minimum length, while the maximization model constructs weakly and strongly parallel tests with maximum test reliability. (Author/SLD)
Descriptors: Algorithms, Models, Reliability, Test Construction
Henson, Robin K. – 2000
The purpose of this paper is to highlight some psychometric cautions that should be observed when seeking to develop short form versions of tests. Several points are made: (1) score reliability is impacted directly by the characteristics of the sample and testing conditions; (2) sampling error has a direct influence on reliability and factor…
Descriptors: Factor Structure, Psychometrics, Reliability, Sampling

Wainer, Howard; Kiely, Gerard L. – Journal of Educational Measurement, 1987
The testlet, a bundle of test items, alleviates some problems associated with computerized adaptive testing: context effects, lack of robustness, and item difficulty ordering. While testlets may be linear or hierarchical, the most useful ones are four-level hierarchical units, containing 15 items and partitioning examinees into 16 classes. (GDC)
Descriptors: Adaptive Testing, Computer Assisted Testing, Context Effect, Item Banks
Burton, Richard F. – Assessment and Evaluation in Higher Education, 2005
Examiners seeking guidance on multiple-choice and true/false tests are likely to encounter various faulty or questionable ideas. Twelve of these are discussed in detail, having to do mainly with the effects on test reliability of test length, guessing and scoring method (i.e. number-right scoring or negative marking). Some misunderstandings could…
Descriptors: Guessing (Tests), Multiple Choice Tests, Objective Tests, Test Reliability
Wang, Wen-Chung; Chen, Hsueh-Chu – Educational and Psychological Measurement, 2004
As item response theory (IRT) becomes popular in educational and psychological testing, there is a need for reporting IRT-based effect size measures. In this study, we show how the standardized mean difference can be generalized into such a measure. A disattenuation procedure based on the IRT test reliability is proposed to correct the attenuation…
Descriptors: Test Reliability, Rating Scales, Sample Size, Error of Measurement
Cohen, Allan S.; Gregg, Noel; Deng, Meng – Learning Disabilities Research & Practice, 2005
The premise of a great deal of current research guiding policy development has been that accommodations are the catalyst for student performance differences. Rather than accepting this premise, two studies were conducted to investigate the influence of extended time and content knowledge on the performance of ninth-grade students who took a…
Descriptors: Program Effectiveness, Mathematics Tests, Learning Disabilities, Testing Accommodations
Lee, J. Murray; Segel, David – Office of Education, United States Department of the Interior, 1936
In order to make an intelligent advance in any school practice a knowledge of what schools are doing in that practice is almost indispensable, since a transition in procedures must be a growth from the one to the other. This bulletin gives this background of facts concerning the use of tests and examinations by the different subject departments in…
Descriptors: Testing, Teachers, Standardized Tests, Principals