Karpicke, Jeffrey D.; Aue, William R. – Educational Psychology Review, 2015
Van Gog and Sweller (2015) claim that there is no testing effect--no benefit of practicing retrieval--for complex materials. We show that this claim is incorrect on several grounds. First, Van Gog and Sweller's idea of "element interactivity" is not defined in a quantitative, measurable way. As a consequence, the idea is applied…
Descriptors: Testing, Learning, Instructional Materials, Difficulty Level
Plassmann, Sibylle; Zeidler, Beate – Language Learning in Higher Education, 2014
Language testing means taking decisions: about the test taker's results, but also about the test construct and the measures taken in order to ensure quality. This article takes the German test "telc Deutsch C1 Hochschule" as an example to illustrate this decision-making process in an academic context. The test is used for university…
Descriptors: Language Tests, Test Wiseness, Test Construction, Decision Making
Hamzah, Mohd Sahandri Gani; Abdullah, Saifuddin Kumar – Online Submission, 2011
The evaluation of learning is a systematic process involving testing, measurement, and evaluation. In the testing step, a teacher needs to choose the best instrument that can test the minds of students. Testing will produce scores or marks with many variations either in homogeneous or heterogeneous forms that will be used to categorize the scores…
Descriptors: Test Items, Item Analysis, Difficulty Level, Testing
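The item-analysis step this abstract refers to is commonly summarized with the classical difficulty index: the proportion of examinees answering an item correctly (a higher p means an easier item). A minimal sketch, using made-up 0/1 response data:

```python
# Classical item difficulty index: proportion of examinees who
# answered each item correctly (higher p = easier item).
def difficulty_index(responses):
    """responses: list of per-examinee lists of 0/1 item scores."""
    n = len(responses)
    n_items = len(responses[0])
    return [sum(r[i] for r in responses) / n for i in range(n_items)]

# Hypothetical scores for 4 examinees on 3 items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(difficulty_index(scores))  # [0.75, 0.75, 0.25]
```

Items 1 and 2 were answered correctly by three of four examinees, item 3 by only one, so the index would flag item 3 as the hardest.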
Park, Bitnara Jasmine; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011
This technical report describes the process of development and piloting of reading comprehension measures that are appropriate for seventh-grade students as part of an online progress screening and monitoring assessment system, http://easycbm.com. Each measure consists of an original fictional story of approximately 1,600 to 1,900 words with 20…
Descriptors: Reading Comprehension, Reading Tests, Grade 7, Test Construction
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation
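The Rasch model underlying the RID analysis gives the probability of a correct response from the difference between examinee ability (theta) and item difficulty (b); a shift in an item's estimated difficulty therefore translates directly into a changed success probability. A sketch with illustrative parameter values (not taken from the study):

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response for ability theta and
    item difficulty b: exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# An upward drift in item difficulty (e.g., after the item moves to a
# later test position) lowers the success probability for the same
# examinee, which is why position-driven parameter drift matters.
print(rasch_p(0.0, -0.5))  # ~0.62
print(rasch_p(0.0, 0.5))   # ~0.38
```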
Albanese, Mark A. – Journal of Educational Measurement, 1988
Estimates of the effects of use of formula scoring on the individual examinee's score are presented. Results for easy, moderate, and hard tests are examined. Using test characteristics from several studies shows that some examinees would increase scores substantially if they were to answer items omitted under formula directions. (SLD)
Descriptors: Difficulty Level, Guessing (Tests), Scores, Scoring Formulas
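"Formula scoring" here is presumably the classical correction for guessing, score = R − W/(k − 1) for k-option items, under which omitted items are neither rewarded nor penalized. A sketch of the arithmetic (the test figures below are hypothetical, not from the study):

```python
def formula_score(rights, wrongs, k):
    """Classical correction-for-guessing formula score:
    rights - wrongs / (k - 1), where k is the number of options
    per item. Omitted items contribute nothing."""
    return rights - wrongs / (k - 1)

# 40-item, 4-option test: 25 right, 10 wrong, 5 omitted.
print(formula_score(25, 10, 4))  # 21.666...

# If the 5 omits were answered purely at chance (1/4 right on
# average), the *expected* formula score is unchanged:
# (25 + 1.25) - (10 + 3.75) / 3 = 21.666...
# so any better-than-chance guessing raises the expected score,
# which is the gain the abstract describes for omitting examinees.
```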
Firmin, Michael; Hwang, Chi-En; Copella, Margaret; Clark, Sarah – Education, 2004
This study examined learned helplessness and its effect on test taking. Students were given one of two tests; the first began with extremely difficult questions and the other started with easy questions. The researchers hypothesized that those who took the test beginning with difficult questions would become easily frustrated and possibly doubt…
Descriptors: Helplessness, Difficulty Level, Comparative Analysis, Academic Failure
Pettijohn, Terry F., II; Sacco, Matthew F. – Journal of Instructional Psychology, 2007
We conducted 2 studies to investigate undergraduate performance, perceptions, and time required in completing sequentially ordered, randomly ordered, or reverse ordered exams in introductory psychology classes. Study 1 compared the outcomes and perceptions of students (N = 66) on 3 non-comprehensive multiple-choice exams which were sequentially,…
Descriptors: Student Attitudes, Program Effectiveness, Psychology, Test Anxiety
Lecointe, Darius A. – 1995
The purpose of this Item Response Theory study was to investigate how the expected reduction in item information, due to the collapsing of response categories in performance assessment data, was affected by varying testing conditions: item difficulty, item discrimination, inter-rater reliability, and direction of collapsing. The investigation used…
Descriptors: Classification, Computer Simulation, Difficulty Level, Interrater Reliability
Lazarte, Alejandro A. – 1999
Two experiments reproduced in a simulated computerized test-taking situation the effect of two of the main determinants in answering an item in a test: the difficulty of the item and the time available to answer it. A model is proposed for the time to respond or abandon an item and for the probability of abandoning it or answering it correctly. In…
Descriptors: Computer Assisted Testing, Difficulty Level, Higher Education, Probability
Yao, Lihua; Schwarz, Richard D. – Applied Psychological Measurement, 2006
Multidimensional item response theory (IRT) models have been proposed for better understanding the dimensional structure of data or to define diagnostic profiles of student learning. A compensatory multidimensional two-parameter partial credit model (M-2PPC) for constructed-response items is presented that is a generalization of those proposed to…
Descriptors: Models, Item Response Theory, Markov Processes, Monte Carlo Methods
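A compensatory multidimensional partial credit model of the kind the abstract names can be written, in one common parameterization (an assumption here, not necessarily Yao and Schwarz's exact form), with category probabilities proportional to exp of the cumulative sum of (a·theta − b_v) over the steps reached. A minimal sketch:

```python
import math

def m2ppc_probs(theta, a, b):
    """Compensatory multidimensional partial-credit probabilities.
    theta: ability vector; a: discrimination vector; b: step
    difficulties for categories 1..m (empty sum for category 0).
    'Compensatory' because a high ability on one dimension can
    offset a low ability on another via the dot product a.theta."""
    at = sum(ai * ti for ai, ti in zip(a, theta))
    cums = [0.0]                      # category 0: empty sum
    for bv in b:
        cums.append(cums[-1] + (at - bv))
    nums = [math.exp(c) for c in cums]
    z = sum(nums)
    return [n / z for n in nums]

# Two ability dimensions, a 3-category (0/1/2) constructed-response
# item with illustrative parameters.
print(m2ppc_probs([0.5, -0.2], [1.0, 0.8], [-0.3, 0.4]))
```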
Stone, Gregory Ethan; Lunz, Mary E. – 1994
This paper explores the comparability of item calibrations for three types of items: (1) text only; (2) text with photographs; and (3) text plus graphics when items are presented on written tests and computerized adaptive tests. Data are from five different medical technology certification examinations administered nationwide in 1993. The Rasch…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Diagrams
MacWhinney, Brian – 1994
Drawing on recent psychological and neurological research on how individual differences might interact with learning a particular language, the study examines how psycholinguistic research and theory can help in assigning military personnel to language training and to a given language. Using the Defense Language Institute's Defense Language…
Descriptors: Comparative Analysis, Contrastive Linguistics, Difficulty Level, English