Publication Date

| Period | Count |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 220 |
| Since 2022 (last 5 years) | 1089 |
| Since 2017 (last 10 years) | 2599 |
| Since 2007 (last 20 years) | 4960 |
Audience

| Audience | Count |
| --- | --- |
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location

| Location | Count |
| --- | --- |
| Turkey | 226 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 66 |
What Works Clearinghouse Rating

| Rating | Count |
| --- | --- |
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Peer reviewed: Bielinski, John; Davison, Mark L. – Journal of Educational Measurement, 2001
Used mathematics achievement data from the 1992 National Assessment of Educational Progress, the Third International Mathematics and Science Study, and the National Education Longitudinal Study of 1988 to examine the sex difference by item difficulty interaction. The predicted negative correlation was found for all eight populations and was…
Descriptors: Correlation, Difficulty Level, Interaction, Mathematics Tests
Peer reviewed: Davidson, Fred – System, 2000
Statistical analysis tools in language testing are described, chiefly classical test theory and item response theory. Computer software for statistical analysis is briefly reviewed and divided into three tiers: commonly available software, statistical packages, and specialty software. (Author/VWL)
Descriptors: Computer Software, Language Tests, Second Language Learning, Statistical Analysis
Peer reviewed: Ellis, Barbara B.; Mead, Alan D. – Educational and Psychological Measurement, 2000
Used the differential functioning of items and tests (DFIT) framework to examine the measurement equivalence of a Spanish translation of the Sixteen Personality Factor (16PF) Questionnaire using samples of 309 Anglo American college students and other adults, 280 English-speaking Hispanics, and 244 Spanish-speaking college students. Results show…
Descriptors: Adults, College Students, Higher Education, Hispanic American Students
Peer reviewed: Wang, Wen-Chung – Journal of Applied Measurement, 2000
Proposes a factorial procedure for investigating differential distractor functioning in multiple choice items that models each distractor with a distinct distractibility parameter. Results of a simulation study show that the parameters of the proposed modeling were recovered very well. Analysis of 10 4-choice items from a college entrance…
Descriptors: College Entrance Examinations, Distractors (Tests), Factor Structure, Foreign Countries
Peer reviewed: Reise, Steven P. – Applied Psychological Measurement, 2001
This book contains a series of research articles about computerized adaptive testing (CAT) written for advanced psychometricians. The book is divided into sections on: (1) item selection and examinee scoring in CAT; (2) examples of CAT applications; (3) item banks; (4) determining model fit; and (5) using testlets in CAT. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Goodness of Fit, Item Banks
Wise, Steven L.; Kong, Xiaojing – Applied Measurement in Education, 2005
When low-stakes assessments are administered, the degree to which examinees give their best effort is often unclear, complicating the validity and interpretation of the resulting test scores. This study introduces a new method, based on item response time, for measuring examinee test-taking effort on computer-based test items. This measure, termed…
Descriptors: Psychometrics, Validity, Reaction Time, Test Items
Stalder, Daniel R. – Teaching of Psychology, 2005
Study 1 assessed students' use and perceptions of acronyms at 3 different exam times in 2 sections of Introduction to Psychology. Acronym use consistently predicted higher performance on acronym-related exam items, and I partially discounted 2 possible confounds. Students rated acronyms as helpful in multiple ways, including increasing motivation…
Descriptors: Memory, Psychology, Teaching Methods, Student Motivation
LaDuca, Tony – Educational Measurement: Issues and Practice, 2006
In the Spring 2005 issue, Wang, Schnipke, and Witt provided an informative description of the task inventory approach that centered on four functions of job analysis. The discussion included persuasive arguments for making systematic connections between tasks and KSAs. But several other facets of the discussion were much less persuasive. This…
Descriptors: Criticism, Task Analysis, Job Analysis, Persuasive Discourse
Keenan, Janice M.; Betjemann, Rebecca S. – Scientific Studies of Reading, 2006
We examined the validity of the comprehension component of the Gray Oral Reading Test (GORT; Wiederholt & Bryant, 1992, 2001) by assessing whether reading really is required to answer its questions. The extent to which GORT questions are passage independent was assessed by having participants answer them without reading the passages. Most…
Descriptors: Reading Comprehension, Oral Reading, Reading Tests, Test Items
Feldt, Leonard S. – Educational and Psychological Measurement, 2005
To meet the requirements of the No Child Left Behind Act, school districts and states must compile summary reports of the levels of student achievement in reading and mathematics. The levels are to be described in broad categories: "basic and below," "proficient," or "advanced." Educational units are given considerable latitude in defining the…
Descriptors: Federal Legislation, Academic Achievement, Test Items, Test Validity
Emons, Wilco H. M.; Sijtsma, Klaas; Meijer, Rob R. – Multivariate Behavioral Research, 2004
The person-response function (PRF) relates the probability of an individual's correct answer to the difficulty of items measuring the same latent trait. Local deviations of the observed PRF from the expected PRF indicate person misfit. We discuss two new approaches to investigate person fit. The first approach uses kernel smoothing to estimate…
Descriptors: Probability, Simulation, Item Response Theory, Test Items
Johnson, John A. – Multivariate Behavioral Research, 2004
This study describes the relation between personality items' validities, defined as the items' correlations with acquaintance ratings on the Big 5 personality factors, and other itemmetric properties including ambiguity, syntactic complexity, social desirability, content, and trait indicativity. Five external validity coefficients for each item on…
Descriptors: Personality Measures, Personality Assessment, Social Desirability, Personality Traits
Ferdous, Abdullah A.; Plake, Barbara S. – Applied Measurement in Education, 2005
This study addressed what standard-setting panelists think about when they make item performance estimates for a barely proficient student. This study extended previous studies by considering the factors that influenced panelists' decisions in an Angoff (1971)-based standard-setting study as a function of their item performance estimates.…
Descriptors: Test Items, Standard Setting (Scoring), Decision Making, Student Evaluation
Deak, Gedeon O.; Ray, Shanna D.; Pick, Anne D. – Cognitive Development, 2004
To test preschoolers' ability to flexibly switch between abstract rules differing in difficulty, ninety-three 3-, 4-, and 5-year-olds were instructed to switch from an (easier) shape-sorting to a (harder) function-sorting rule, or vice versa. Children learned one rule, sorted four test sets, then learned the other rule, and sorted four more sets.…
Descriptors: Difficulty Level, Preschool Children, Cognitive Tests, Adaptive Testing
Scheck, Petra; Meeter, Martijn; Nelson, Thomas O. – Journal of Memory and Language, 2004
This research explored the absolute accuracy of judgments of learning (JOLs), wherein absolute accuracy pertains to how well the magnitude of the participant's predictions of recall corresponds to his or her subsequent recall. The Anchoring Hypothesis proposes that the magnitude of JOLs does not change systematically with item difficulty; analogous…
Descriptors: Recall (Psychology), Difficulty Level, Test Items, Predictive Validity