Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 13 |
Descriptor
Item Analysis | 38 |
Test Reliability | 38 |
Test Theory | 38 |
Test Validity | 18 |
Test Construction | 16 |
Test Items | 13 |
Career Development | 10 |
Latent Trait Theory | 9 |
Mathematical Models | 8 |
Test Interpretation | 8 |
Error of Measurement | 7 |
Author
Haladyna, Tom | 2 |
Salmani-Nodoushan, Mohammad… | 2 |
Algina, James | 1 |
Altepeter, Tom | 1 |
Anita Padmanabhanunni | 1 |
Arias, Benito | 1 |
Bashaw, W. L. | 1 |
Beddow, Peter A. | 1 |
Bentler, P. M. | 1 |
Bernknopf, Stanley | 1 |
Bichi, Ado Abdu | 1 |
Education Level
Higher Education | 5 |
Adult Education | 2 |
Elementary Secondary Education | 2 |
Postsecondary Education | 2 |
Early Childhood Education | 1 |
Elementary Education | 1 |
Grade 1 | 1 |
Kindergarten | 1 |
Preschool Education | 1 |
Primary Education | 1 |
Audience
Researchers | 2 |
Practitioners | 1 |
Students | 1 |
Teachers | 1 |
Location
Finland (Helsinki) | 1 |
Singapore | 1 |
South Africa | 1 |
Spain | 1 |
Texas | 1 |
Laws, Policies, & Programs
Elementary and Secondary… | 1 |
Assessments and Surveys
Armed Services Vocational… | 1 |
Dyadic Adjustment Scale | 1 |
Expressive One Word Picture… | 1 |
Tyrone B. Pretorius; P. Paul Heppner; Anita Padmanabhanunni; Serena Ann Isaacs – SAGE Open, 2023
In previous studies, problem solving appraisal has been identified as playing a key role in promoting positive psychological well-being. The Problem Solving Inventory is the most widely used measure of problem solving appraisal and consists of 32 items. The length of the instrument, however, may limit its applicability to large-scale surveys…
Descriptors: Problem Solving, Measures (Individuals), Test Construction, Item Response Theory
Kim, Peter – Language Teaching Research Quarterly, 2021
Foreign language aptitude is defined as one's potential to learn a second language. A language learner with higher aptitude is predicted to learn more, faster, and reach a higher level of proficiency. If this is the case, one way to validate the construct of aptitude and its measure is to conduct a validation study in which measures of aptitude is…
Descriptors: Morphology (Languages), Syntax, Second Language Learning, Second Language Instruction
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions, and the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014
A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…
Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing
Rantanen, Pekka – Assessment & Evaluation in Higher Education, 2013
A multilevel analysis approach was used to analyse students' evaluation of teaching (SET). The low value of inter-rater reliability stresses that any solid conclusions on teaching cannot be made on the basis of single feedbacks. To assess a teacher's general teaching effectiveness, one needs to evaluate four randomly chosen course implementations.…
Descriptors: Test Reliability, Feedback (Response), Generalizability Theory, Student Evaluation of Teacher Performance
Gomez, Laura E.; Arias, Benito; Verdugo, Miguel Angel; Navas, Patricia – Journal of Intellectual & Developmental Disability, 2012
Background: Most instruments that assess quality of life have been validated by means of the classical test theory (CTT). However, CTT limitations have resulted in the development of alternative models, such as the Rasch rating scale model (RSM). The main goal of this paper is testing and improving the psychometric properties of the INTEGRAL…
Descriptors: Evidence, Models, Mental Retardation, Quality of Life
Beddow, Peter A. – International Journal of Disability, Development and Education, 2012
In the arena of educational testing, accessibility refers to the degree to which students are given the opportunity to participate in and engage a test. Accessibility theory is a model for examining the interactions between the test-taker and the test itself and defining how they may decrease some students' access to the test event, ultimately…
Descriptors: Test Results, Test Items, Educational Testing, Scores
Lee, Young-Sun; Lembke, Erica; Moore, Douglas; Ginsburg, Herbert P.; Pappas, Sandra – Assessment for Effective Intervention, 2012
The present study examined the technical adequacy of curriculum-based measures (CBMs) of early numeracy. Six 1-min early mathematics tasks were administered to 137 kindergarten and first-grade students, along with an omnibus test of early mathematics. The CBM measures included Count Out Loud, Quantity Discrimination, Number Identification, Missing…
Descriptors: Numeracy, Curriculum Based Assessment, Mathematics Tests, Kindergarten
Salmani-Nodoushan, Mohammad Ali – Journal on Educational Psychology, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for any…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory
Salmani-Nodoushan, Mohammad Ali – Online Submission, 2009
A good test is one that has at least three qualities: reliability, or the precision with which a test measures what it is supposed to measure; validity, i.e., if the test really measures what it is supposed to measure; and practicality, or if the test, no matter how sound theoretically, is practicable in reality. These are the sine qua non for…
Descriptors: Generalizability Theory, Testing, Language Tests, Item Response Theory
Ricketts, Chris; Brice, Julie; Coombes, Lee – Advances in Health Sciences Education, 2010
The purpose of multiple choice tests of medical knowledge is to estimate as accurately as possible a candidate's level of knowledge. However, concern is sometimes expressed that multiple choice tests may also discriminate in undesirable and irrelevant ways, such as between minority ethnic groups or by sex of candidates. There is little literature…
Descriptors: Medical Students, Testing Accommodations, Ethnic Groups, Learning Disabilities
Cox, Judith Ellen – ProQuest LLC, 2010
Emotional intelligence in relationships can be developed and enhanced through the use of an assessment instrument within a mentoring or counseling relationship. The Relationship Skills Map (RSM) has been created for this purpose. This study concerns the validation of the Relationship Skills Map. Participants in this study included members of a…
Descriptors: Stress Management, Graduate Students, Emotional Intelligence, Time Management
Elosua, Paula; Iliescu, Dragos – International Journal of Testing, 2012
Psychometric practice does not always converge with the advances of psychometric theory. In order to investigate this gap, the authors focus on the 10 most used psychological tests in Europe, as identified by recent surveys. The article analyzes test manuals published in 6 different European countries for these 10 most used tests. A total of 32…
Descriptors: Psychological Testing, Personality Measures, Error of Measurement, Foreign Countries

Cudeck, Robert – Journal of Educational Measurement, 1980
Methods for evaluating the consistency of responses to test items were compared. When a researcher is unwilling to make the assumptions of classical test theory, has only a small number of items, or is in a tailored testing context, Cliff's dominance indices may be useful. (Author/CTM)
Descriptors: Error Patterns, Item Analysis, Test Items, Test Reliability

Wilcox, Rand R. – Journal of Educational Statistics, 1981
Both the binomial and beta-binomial models are applied to various problems occurring in mental test theory. The paper reviews and critiques these models. The emphasis is on the extensions of the models that have been proposed in recent years, and that might not be familiar to many educators. (Author)
Descriptors: Error of Measurement, Item Analysis, Mathematical Models, Test Reliability