Lahner, Felicitas-Maria; Lörwald, Andrea Carolin; Bauer, Daniel; Nouns, Zineb Miriam; Krebs, René; Guttormsen, Sissel; Fischer, Martin R.; Huwendiek, Sören – Advances in Health Sciences Education, 2018
Multiple true-false (MTF) items are a widely used supplement to the commonly used single-best answer (Type A) multiple choice format. However, an optimal scoring algorithm for MTF items has not yet been established, as existing studies yielded conflicting results. Therefore, this study analyzes two questions: What is the optimal scoring algorithm…
Descriptors: Scoring Formulas, Scoring Rubrics, Objective Tests, Multiple Choice Tests
Murray, Keith B.; Zdravkovic, Srdan – Journal of Education for Business, 2016
Considerable debate continues regarding the efficacy of the website RateMyProfessors.com (RMP). To date, however, virtually no direct, experimental research has been reported which directly bears on questions relating to sampling adequacy or item adequacy in producing what favorable correlations have been reported. The authors compare the data…
Descriptors: Computer Assisted Testing, Computer Software Evaluation, Student Evaluation of Teacher Performance, Item Analysis
Ricketts, Chris; Brice, Julie; Coombes, Lee – Advances in Health Sciences Education, 2010
The purpose of multiple choice tests of medical knowledge is to estimate as accurately as possible a candidate's level of knowledge. However, concern is sometimes expressed that multiple choice tests may also discriminate in undesirable and irrelevant ways, such as between minority ethnic groups or by sex of candidates. There is little literature…
Descriptors: Medical Students, Testing Accommodations, Ethnic Groups, Learning Disabilities
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010
Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…
Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the authors administer the measurement procedure repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
Bethscheider, Janine K. – 1992
Standard and experimental forms of the Johnson O'Connor Research Foundation's Analytical Reasoning test were administered to 1,496 clients of the Foundation (persons seeking information about aptitude for educational and career decisions). The objectives were to develop a new form of the test and to better understand what makes some items more…
Descriptors: Adults, Aptitude Tests, Career Choice, Comparative Testing
Chissom, Brad; Chukabarah, Prince C. O. – 1985
The comparative effects of various sequences of test items were examined for over 900 graduate students enrolled in an educational research course at The University of Alabama, Tuscaloosa. The experiment, which was conducted a total of four times using four separate tests, presented three different arrangements of 50 multiple-choice items: (1)…
Descriptors: Analysis of Variance, Comparative Testing, Difficulty Level, Graduate Students

Downey, Ronald G. – Applied Psychological Measurement, 1979
This research attempted to interrelate several methods of producing option weights (i.e., Guttman internal and external weights and judges' weights) and examined their effects on reliability and on concurrent, predictive, and face validity. It was concluded that option weighting offered limited, if any, improvement over unit weighting. (Author/CTM)
Descriptors: Achievement Tests, Answer Keys, Comparative Testing, High Schools
Silva, Sharron J. – 1985
Test item selection techniques based on traditional item analysis methods were compared to techniques based on item response theory. The consistency of mastery classifications in criterion referenced reading tests was examined. Pretest and posttest data were available for 945 first and second grade students and for 1796 fourth to sixth grade…
Descriptors: Analysis of Variance, Comparative Testing, Criterion Referenced Tests, Elementary Education
Karma, Kai – 1978
Four new versions of an acoustic structuring test were developed, administered, and analyzed in order to produce better tests and to contribute to better understanding of the abilities measured by these tests. The tests consist of tape recordings of patterns of musical notes played on an electric organ or an acoustic guitar. Item analyses and…
Descriptors: Adults, Aptitude, Aptitude Tests, Cognitive Ability
Lado, Robert – 1961
Intended as a comprehensive introduction to the construction and use of foreign language tests, this book utilizes modern linguistic knowledge as a base for scientific language testing. Major attention in testing is focused on such integrated language skills as auditory and reading comprehension, speaking, writing, translation, and over-all…
Descriptors: Achievement Tests, Aptitude Tests, Comparative Testing, Cultural Education