Koziol, Natalie A.; Goodrich, J. Marc; Yoon, HyeonJin – Educational and Psychological Measurement, 2022
Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A…
Descriptors: Regression (Statistics), Item Analysis, Validity, Testing Accommodations
Kilic, Abdullah Faruk; Dogan, Nuri – International Journal of Assessment Tools in Education, 2021
Weighted least squares (WLS), weighted least squares mean-and-variance-adjusted (WLSMV), unweighted least squares mean-and-variance-adjusted (ULSMV), maximum likelihood (ML), robust maximum likelihood (MLR) and Bayesian estimation methods were compared in mixed item response type data via Monte Carlo simulation. The percentage of polytomous items,…
Descriptors: Factor Analysis, Computation, Least Squares Statistics, Maximum Likelihood Statistics
Lee, Jihyun; Paek, Insu – Journal of Psychoeducational Assessment, 2014
Likert-type rating scales are still the most widely used method when measuring psychoeducational constructs. The present study investigates a long-standing issue of identifying the optimal number of response categories. A special emphasis is given to categorical data, which were generated by the Item Response Theory (IRT) Graded-Response Modeling…
Descriptors: Likert Scales, Responses, Item Response Theory, Classification
Kastner, Rebecca M.; Sellbom, Martin; Lilienfeld, Scott O. – Psychological Assessment, 2012
The Psychopathic Personality Inventory (PPI) has shown promising construct validity as a measure of psychopathy. Because of its relative efficiency, a short-form version of the PPI (PPI-SF) was developed and has proven useful in many psychopathy studies. The validity of the PPI-SF, however, has not been thoroughly examined, and no studies have…
Descriptors: Personality Measures, Psychopathology, Psychometrics, Comparative Analysis
Thalmayer, Amber Gayle; Saucier, Gerard; Eigenhuis, Annemarie – Psychological Assessment, 2011
A general consensus on the Big Five model of personality attributes has been highly generative for the field of personality psychology. Many important psychological and life outcome correlates with Big Five trait dimensions have been established. But researchers must choose between multiple Big Five inventories when conducting a study and are…
Descriptors: Test Validity, Personality Measures, Test Length, Undergraduate Students
Dikici, Ayhan; Soh, Kaycheng – Online Submission, 2015
Many measurement tools on creativity are available in the literature. One of these scales is the Creativity Fostering Teacher Behaviour Index (CFTIndex), originally developed for Singaporean teachers. It was then translated into Turkish and trialled on teachers in Nigde province with acceptable reliability and factorial validity. The main purpose of…
Descriptors: Creativity, Teacher Behavior, Comparative Analysis, Turkish
Yang, Sophie Xin; Jowett, Sophia – Measurement in Physical Education and Exercise Science, 2013
The Coach-Athlete Relationship Questionnaire was developed to effectively measure affective, cognitive, and behavioral aspects, represented by the interpersonal constructs of closeness, commitment, and complementarity, of the quality of the relationship within the context of sport coaching. The current study sought to determine the internal…
Descriptors: Foreign Countries, Athletes, Athletic Coaches, Interpersonal Relationship
Evans, Josiah Jeremiah – ProQuest LLC, 2010
In measurement research, data simulations are a commonly used analytical technique. While simulation designs have many benefits, it is unclear if these artificially generated datasets are able to accurately capture real examinee item response behaviors. This potential lack of comparability may have important implications for administration of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Educational Testing, Admission (School)

Silverstein, A. B. – Perceptual and Motor Skills, 1983
Formulas for estimating the validity of random short forms were applied to the standardization data for the Wechsler Adult Intelligence Scale-Revised, the Minnesota Multiphasic Personality Inventory, and the Marlowe-Crowne Social Desirability Scale. These formulas demonstrated how much "better than random" the best short forms of these…
Descriptors: Comparative Analysis, Intelligence Tests, Measures (Individuals), Test Format

Christiansen, Neil D.; And Others – Educational and Psychological Measurement, 1996
The usefulness of examining the structural validity of scores on multidimensional measures using nested hierarchical model comparisons was evaluated in 2 studies using the Social Problem Solving Inventory (SPSI) with samples of 464 and 216 undergraduates. Results support the conceptual model of the SPSI. (SLD)
Descriptors: Comparative Analysis, Construct Validity, Higher Education, Interpersonal Relationship

Prewett, Peter N. – Psychological Assessment, 1995
The concurrent validity of 2 brief intelligence tests, the Matrix Analogies Test-Short Form (MAT) and the Kaufman Brief Intelligence Test (K-BIT), with the Wechsler Intelligence Scale for Children-Third Edition (WISC-III) was examined using a sample of 50 urban students. The MAT and K-BIT appeared equally useful as screening tests. (SLD)
Descriptors: Children, Comparative Analysis, Concurrent Validity, Correlation
Frick, Theodore W. – 1991
Expert systems can be used to aid decisionmaking. A computerized adaptive test is one kind of expert system, although not commonly recognized as such. A new approach, termed EXSPRT, was devised that combines expert systems reasoning and sequential probability ratio test stopping rules. Two versions of EXSPRT were developed, one with random…
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Expert Systems
Oosterhof, Albert C.; Coats, Pamela K. – 1981
Instructors who develop classroom examinations that require students to provide a numerical response to a mathematical problem are often very concerned about the appropriateness of the multiple-choice format. The present study augments previous research relevant to this concern by comparing the difficulty and reliability of multiple-choice and…
Descriptors: Comparative Analysis, Difficulty Level, Grading, Higher Education
Eignor, Daniel R.; Hambleton, Ronald K. – 1979
The purpose of the investigation was to obtain some relationships among (1) test lengths, (2) shape of domain-score distributions, (3) advancement scores, and (4) several criterion-referenced test score reliability and validity indices. The study was conducted using computer simulation methods. The values of variables under study were set to be…
Descriptors: Comparative Analysis, Computer Assisted Testing, Criterion Referenced Tests, Cutting Scores
Wainer, Howard; And Others – 1990
The initial development of a testlet-based algebra test was previously reported (Wainer and Lewis, 1990). This account provides the details of this excursion into the use of hierarchical testlets and validity-based scoring. A pretest of two 15-item hierarchical testlets was carried out in which examinees' performance on a 4-item subset of each…
Descriptors: Adaptive Testing, Algebra, Comparative Analysis, Computer Assisted Testing