Publication Date
  In 2025: 1
  Since 2024: 1
  Since 2021 (last 5 years): 1
  Since 2016 (last 10 years): 4
  Since 2006 (last 20 years): 12
Descriptor
  Test Interpretation: 50
  Test Items: 50
  Test Construction: 15
  Item Response Theory: 12
  Scoring: 11
  Scores: 8
  Test Validity: 8
  Correlation: 7
  Criterion Referenced Tests: 7
  Foreign Countries: 7
  Item Analysis: 7
Author
  Hambleton, Ronald K.: 3
  Samejima, Fumiko: 3
  Ackerman, Terry A.: 1
  Allen, Nancy L.: 1
  Alvarez, Karina: 1
  Angoff, William H.: 1
  Barbot, B.: 1
  Beaton, Albert E.: 1
  Beller, Michael: 1
  Besag, Frank: 1
  Goecke, B.: 1
Publication Type
  Reports - Evaluative: 50
  Journal Articles: 28
  Speeches/Meeting Papers: 12
  Reports - Research: 4
  Guides - Non-Classroom: 2
  Opinion Papers: 2
  Information Analyses: 1
  Numerical/Quantitative Data: 1
  Tests/Questionnaires: 1
Education Level
  Elementary Secondary Education: 5
  Elementary Education: 2
  Grade 4: 2
  Secondary Education: 2
  Adult Education: 1
  Grade 10: 1
  Grade 6: 1
  Grade 8: 1
  Postsecondary Education: 1
Audience
  Researchers: 1
Laws, Policies, & Programs
  Individuals with Disabilities…: 1
  No Child Left Behind Act 2001: 1
B. Goecke; S. Weiss; B. Barbot – Journal of Creative Behavior, 2025
The present paper questions the content validity of the eight creativity-related self-report scales available in PISA 2022's context questionnaire and provides a set of considerations for researchers interested in using these indexes. Specifically, we point out some threats to the content validity of these scales (e.g., "creative thinking…
Descriptors: Creativity, Creativity Tests, Questionnaires, Content Validity
Jacobson, Erik; Svetina, Dubravka – Applied Measurement in Education, 2019
Contingent argument-based approaches to validity require a unique argument for each use, in contrast to more prescriptive approaches that identify the common kinds of validity evidence researchers should consider for every use. In this article, we evaluate our use of an approach that is both prescriptive "and" argument-based to develop a…
Descriptors: Test Validity, Test Items, Test Construction, Test Interpretation
Vaheoja, Monika; Verhelst, N. D.; Eggen, T.J.H.M. – European Journal of Science and Mathematics Education, 2019
In this article, the authors applied profile analysis to Maths exam data to demonstrate how different exam forms, differing in difficulty and length, can be reported and easily interpreted. The results were presented for different groups of participants and for different institutions in different Maths domains by evaluating the balance. Some…
Descriptors: Feedback (Response), Foreign Countries, Statistical Analysis, Scores
Wise, Steven L. – Educational Measurement: Issues and Practice, 2017
The rise of computer-based testing has brought with it the capability to measure more aspects of a test event than simply the answers selected or constructed by the test taker. One behavior that has drawn much research interest is the time test takers spend responding to individual multiple-choice items. In particular, very short response…
Descriptors: Guessing (Tests), Multiple Choice Tests, Test Items, Reaction Time
Reynolds, Matthew R.; Niileksela, Christopher R. – Journal of Psychoeducational Assessment, 2015
"The Woodcock-Johnson IV Tests of Cognitive Abilities" (WJ IV COG) is an individually administered measure of psychometric intellectual abilities designed for ages 2 to 90+. The measure was published by Houghton Mifflin Harcourt-Riverside in 2014. Frederick Shrank, Kevin McGrew, and Nancy Mather are the authors. Richard Woodcock, the…
Descriptors: Cognitive Tests, Testing, Scoring, Test Interpretation
Dorans, Neil J. – Educational Measurement: Issues and Practice, 2012
Views on testing--its purpose and uses and how its data are analyzed--are related to one's perspective on test takers. Test takers can be viewed as learners, examinees, or contestants. I briefly discuss the perspective of test takers as learners. I maintain that much of psychometrics views test takers as examinees. I discuss test takers as a…
Descriptors: Testing, Test Theory, Item Response Theory, Test Reliability
Penfield, Randall D.; Alvarez, Karina; Lee, Okhee – Applied Measurement in Education, 2009
The assessment of differential item functioning (DIF) in polytomous items addresses between-group differences in measurement properties at the item level, but typically does not inform which score levels may be involved in the DIF effect. The framework of differential step functioning (DSF) addresses this issue by examining between-group…
Descriptors: Test Bias, Classification, Test Items, Criteria
Advantages of the Rasch Measurement Model in Analysing Educational Tests: An Applicator's Reflection
Tormakangas, Kari – Educational Research and Evaluation, 2011
Educational achievement is a very important issue for parents, teachers, and the government. Accurate measurement plays a central role in evaluating achievement fairly, and analysis methods have therefore developed considerably in recent years. Education based on long-term learning processes forms a fruitful base for item tests,…
Descriptors: Test Items, Item Analysis, Learning Processes, Item Response Theory
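For context, the Rasch model this entry reflects on has a standard formulation (stated here for reference, not drawn from the article itself): the probability that person i answers item j correctly depends only on the difference between person ability and item difficulty.

```latex
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{e^{\theta_i - b_j}}{1 + e^{\theta_i - b_j}}
```

Because every item shares the same slope, the raw number-correct score is a sufficient statistic for ability under this model, which is one source of the interpretive advantages practitioners cite.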
Montgomery, Janine Marie; Newton, Brendan; Smith, Christiane – Journal of Psychoeducational Assessment, 2008
The Gilliam Autism Rating Scale-Second Edition (GARS-2) is a screening tool for autism spectrum disorders for individuals between the ages of 3 and 22. It was designed to help differentiate those with autism from those with severe behavioral disorders as well as from those who are typically developing. It is a norm-referenced instrument that…
Descriptors: Autism, Rating Scales, Test Reviews, Norm Referenced Tests

French, Ann W.; Miller, Timothy R. – Journal of Educational Measurement, 1996
A computer simulation study was conducted to determine the feasibility of using logistic regression procedures to detect differential item functioning (DIF) in polytomous items. Results indicate that logistic regression is powerful in detecting most forms of DIF, although it requires large amounts of data manipulation and careful interpretation.…
Descriptors: Computer Simulation, Identification, Item Bias, Test Interpretation
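The logistic-regression DIF procedure evaluated in this study can be sketched for the simpler dichotomous case: fit a compact model predicting the item response from the matching variable (total score), then an augmented model that adds group membership, and compare fits with a likelihood-ratio statistic. This is a minimal illustration with simulated data; `fit_logistic` and `dif_lr_test` are illustrative helpers, not code from the paper.

```python
import numpy as np

def fit_logistic(X, y, iters=500, lr=0.5):
    """Fit logistic regression by gradient ascent; return weights and log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)      # average gradient of the log-likelihood
    p = 1.0 / (1.0 + np.exp(-X @ w))
    eps = 1e-12
    ll = np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return w, ll

def dif_lr_test(total, group, response):
    """Likelihood-ratio DIF statistic: compact model (total score only)
    vs. augmented model (total score + group). ~ chi-square(1) under no DIF."""
    _, ll0 = fit_logistic(total.reshape(-1, 1), response)
    _, ll1 = fit_logistic(np.column_stack([total, group]), response)
    return 2.0 * (ll1 - ll0)

# Simulate one item with uniform DIF: the focal group finds it
# harder than the reference group at equal ability.
rng = np.random.default_rng(0)
n = 2000
theta = rng.normal(size=n)                       # latent ability
group = rng.integers(0, 2, size=n)               # 0 = reference, 1 = focal
total = theta + rng.normal(scale=0.3, size=n)    # noisy proxy for total score
p_true = 1.0 / (1.0 + np.exp(-(theta - 1.0 * group)))
response = (rng.random(n) < p_true).astype(float)

lr_stat = dif_lr_test(total, group, response)
print(round(lr_stat, 1))
```

For a real analysis one would use an established routine (e.g. a maximum-likelihood logit fit with proper convergence checks) and, for polytomous items as in the study, an ordinal rather than binary logistic model; the comparison against a chi-square critical value (3.84 at alpha = .05, df = 1) is the same.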
National Center for Education Statistics, 2007
The purpose of this document is to provide background information that will be useful in interpreting the 2007 results from the Trends in International Mathematics and Science Study (TIMSS) by comparing its design, features, framework, and items with those of the U.S. National Assessment of Educational Progress and another international assessment…
Descriptors: National Competency Tests, Comparative Analysis, Achievement Tests, Test Items
Samejima, Fumiko – 1996
Traditionally, the test score represented by the number of items answered correctly was taken as an indicator of the examinee's ability level. Researchers still tend to think that the number-correct score is a way of ordering individuals with respect to the latent trait. The objective of this study is to depict the benefits of using ability…
Descriptors: Ability, Attitude Measures, Estimation (Mathematics), Models
Samejima, Fumiko – 1997
Latent trait models introduced the concept of the latent trait, or ability, as distinct from the test score. There is a recent tendency to treat the test score as though it were a substitute for ability, largely because the test score is a convenient way to place individuals in order. F. Samejima (1969) has shown that, in general, the amount of…
Descriptors: Ability, Estimation (Mathematics), Item Response Theory, Philosophy

Zumbo, Bruno D.; Pope, Gregory A.; Watson, Jackie E.; Hubley, Anita M. – Educational and Psychological Measurement, 1997
E. Roskam's (1985) conjecture that steeper item characteristic curve (ICC) "a" parameters (slopes) (and higher item total correlations in classical test theory) would be found with more concretely worded test items was tested with results from 925 young adults on the Eysenck Personality Questionnaire (H. Eysenck and S. Eysenck, 1975).…
Descriptors: Correlation, Personality Assessment, Personality Measures, Test Interpretation
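The ICC "a" parameter discussed here is the discrimination (slope) parameter of the two-parameter logistic model (standard form, given for reference):

```latex
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

A larger a_i yields a steeper item characteristic curve around the difficulty b_i and, correspondingly, a higher item-total correlation in classical terms, which is the quantity Roskam's conjecture links to more concretely worded items.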

Wilson, Mark – Applied Psychological Measurement, 1988
A method for detecting and interpreting disturbances of the local-independence assumption among items that share common stimulus material or other features is presented. Dichotomous and polytomous Rasch models are used to analyze structure of the learning outcome superitems. (SLD)
Descriptors: Item Analysis, Latent Trait Theory, Mathematical Models, Test Interpretation