DeCarlo, Lawrence T. – Journal of Educational Measurement, 2023
A conceptualization of multiple-choice exams in terms of signal detection theory (SDT) leads to simple measures of item difficulty and item discrimination that are closely related to, but also distinct from, those used in classical item analysis (CIA). The theory defines a "true split," depending on whether or not examinees know an item,…
Descriptors: Multiple Choice Tests, Test Items, Item Analysis, Test Wiseness
Eaton, Philip; Johnson, Keith; Barrett, Frank; Willoughby, Shannon – Physical Review Physics Education Research, 2019
For proper assessment selection, understanding the statistical similarities amongst assessments that measure the same, or very similar, topics is imperative. This study seeks to extend the comparative analysis between the brief electricity and magnetism assessment (BEMA) and the conceptual survey of electricity and magnetism (CSEM) presented by…
Descriptors: Test Theory, Item Response Theory, Comparative Analysis, Energy
Shanmugam, S. Kanageswari Suppiah; Wong, Vincent; Rajoo, Murugan – Malaysian Journal of Learning and Instruction, 2020
Purpose: This study examined the quality of English test items using psychometric and linguistic characteristics among Grade Six pupils. Method: Contrary to the conventional approach of relying only on statistics when investigating item quality, this study adopted a mixed-method approach by employing psychometric analysis and cognitive interviews.…
Descriptors: English (Second Language), Second Language Instruction, Language Tests, Psychometrics
Powers, Donald; Schedl, Mary; Papageorgiou, Spiros – Language Testing, 2017
The aim of this study was to develop, for the benefit of both test takers and test score users, enhanced "TOEFL ITP"® test score reports that go beyond the simple numerical scores that are currently reported. To do so, we applied traditional scale anchoring (proficiency scaling) to item difficulty data in order to develop performance…
Descriptors: English (Second Language), Second Language Learning, Language Proficiency, Scores
Kettler, Ryan J.; Dickenson, Tammiee S.; Bennett, Heather L.; Morgan, Grant B.; Gilmore, Joanna A.; Beddow, Peter A.; Swaffield, Suzanne; Turner, Linda; Herrera, Bill; Turner, Charlene; Palmer, Porter W. – Exceptional Children, 2012
This study was inspired by the final regulations for the No Child Left Behind Act (NCLB) indicating that each state has the option to develop a new assessment for students whose disabilities have kept them from obtaining proficiency. Sets of high school science achievement items were enhanced for the new test. A 3-by-2, within subjects,…
Descriptors: Accessibility (for Disabled), Achievement Tests, Science Achievement, Testing Accommodations
Lee, Young-Sun; Lembke, Erica; Moore, Douglas; Ginsburg, Herbert P.; Pappas, Sandra – Assessment for Effective Intervention, 2012
The present study examined the technical adequacy of curriculum-based measures (CBMs) of early numeracy. Six 1-min early mathematics tasks were administered to 137 kindergarten and first-grade students, along with an omnibus test of early mathematics. The CBM measures included Count Out Loud, Quantity Discrimination, Number Identification, Missing…
Descriptors: Numeracy, Curriculum Based Assessment, Mathematics Tests, Kindergarten
Jung, Eunju; Liu, Kimy; Ketterlin-Geller, Leanne R.; Tindal, Gerald – Behavioral Research and Teaching, 2008
The purpose of this study was to develop general outcome measures (GOM) in mathematics so that teachers could focus their instruction on needed prerequisite skills. We describe in detail the manner in which content-related evidence was established and then present a number of statistical analyses conducted to evaluate the technical adequacy of…
Descriptors: Item Analysis, Test Construction, Test Theory, Mathematics Tests
Reckase, Mark D.; McKinley, Robert L. – 1984
The purpose of this paper is to present a generalization of the concept of item difficulty to test items that measure more than one dimension. Three common definitions of item difficulty were considered: the proportion of correct responses for a group of individuals; the probability of a correct response to an item for a specific person; and the…
Descriptors: Difficulty Level, Item Analysis, Latent Trait Theory, Mathematical Models
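The first definition named in the abstract above, the proportion of correct responses for a group, can be illustrated with a minimal sketch. It is not taken from the paper: the function name `item_difficulty` and the score matrix are hypothetical, and the example covers only the unidimensional, group-level definition, not the multidimensional generalization the paper presents.

```python
def item_difficulty(responses):
    """Return the proportion of correct responses for each item.

    responses: list of examinee rows, each a list of 0/1 item scores
    (1 = correct, 0 = incorrect). All rows are assumed to have equal length.
    """
    if not responses:
        raise ValueError("need at least one examinee")
    n_examinees = len(responses)
    n_items = len(responses[0])
    # Sum each item's column and divide by the number of examinees.
    return [sum(row[j] for row in responses) / n_examinees
            for j in range(n_items)]


# Hypothetical data: 4 examinees by 3 items.
scores = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
p_values = item_difficulty(scores)
```

Under this definition a higher value means an easier item for the group tested, so the index depends on the sample as well as on the item.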

Reynolds, Thomas J. – Educational and Psychological Measurement, 1981
Cliff's Index "c" derived from an item dominance matrix is utilized in a clustering approach, termed extracting Reliable Guttman Orders (ERGO), to isolate Guttman-type item hierarchies. A comparison of factor analysis to the ERGO is made on social distance data involving multiple ethnic groups. (Author/BW)
Descriptors: Cluster Analysis, Difficulty Level, Factor Analysis, Item Analysis

Henning, Grant – Language Testing, 1988
Violations of item unidimensionality on language tests produced distorted estimates of person ability, and violations of person unidimensionality produced distorted estimates of item difficulty. The Bejar Method was sensitive to such distortions. (Author)
Descriptors: Construct Validity, Content Validity, Difficulty Level, Item Analysis

O'Brien, Michael L. – Studies in Educational Evaluation, 1986
A test score can be used for individual instructional diagnosis after determining whether: (1) difficulty of the test items was consistent with the complexity of the content measured; (2) items measuring the same underlying process were about equally difficult; and (3) partial credit scoring would increase the reliability of the diagnosis. (LMO)
Descriptors: Behavioral Objectives, Difficulty Level, Educational Diagnosis, Error Patterns
Hambleton, Ronald K.; Cook, Linda L. – 1978
The purpose of the present research was to study, systematically, the "goodness-of-fit" of the one-, two-, and three-parameter logistic models. We studied, using computer-simulated test data, the effects of four variables: variation in item discrimination parameters, the average value of the pseudo-chance level parameters, test length,…
Descriptors: Career Development, Difficulty Level, Goodness of Fit, Item Analysis
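For readers unfamiliar with the models named in the abstract above, the following sketch writes out the standard three-parameter logistic item response function, with the one- and two-parameter models as special cases. It is an illustration under stated assumptions, not the authors' simulation code: the function name `p_correct` and its argument names are hypothetical, and the scaling constant (often written D = 1.7) that some presentations include is omitted.

```python
import math


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: examinee ability
    b:     item difficulty
    a:     item discrimination (a = 1 and c = 0 gives a one-parameter form)
    c:     pseudo-chance level, the lower asymptote (c = 0 gives the
           two-parameter model)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# When ability equals difficulty, the probability sits halfway between
# the lower asymptote c and 1.
p_2pl = p_correct(theta=0.0, b=0.0, a=2.0)
p_3pl = p_correct(theta=0.0, b=0.0, a=2.0, c=0.2)
```

The two variables the abstract lists first, variation in discrimination and the average pseudo-chance level, correspond to `a` and `c` in this parameterization.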

Lord, Frederic M. – Applied Psychological Measurement, 1977
Under given conditions, conventional testing and computer-generated repeatable testing (CGRT) are equally effective for estimating examinee ability; CGRT is more effective for estimating the mean ability level of a group and less effective for estimating ability differences among individuals. These conclusions are drawn from domain-referenced test…
Descriptors: Career Development, Computer Assisted Testing, Difficulty Level, Group Norms

Linn, Robert L.; Drasgow, Fritz – Educational Measurement: Issues and Practice, 1987
This article discusses the application of the Golden Rule procedure to items of the Scholastic Aptitude Test. Using item response theory, the analyses indicate that the Golden Rule procedures are ineffective in detecting biased items and may undermine the reliability and validity of tests. (Author/JAZ)
Descriptors: College Entrance Examinations, Difficulty Level, Item Analysis, Latent Trait Theory
Seong, Tae-Je; Subkoviak, Michael J. – 1987
The purpose of this research was to reinvestigate the accuracy of three item bias detection procedures: (1) Linn and Harnisch's pseudo-IRT(Z) method; (2) Camilli's chi-square technique; and (3) Angoff's revised transformed item difficulty method. These methods are applied when the minority group sample size is too small to obtain stable estimates…
Descriptors: Blacks, Difficulty Level, Higher Education, Item Analysis