Publication Date
| In 2026 | 0 |
| Since 2025 | 0 |
| Since 2022 (last 5 years) | 5 |
| Since 2017 (last 10 years) | 21 |
| Since 2007 (last 20 years) | 54 |
Descriptor
| Models | 139 |
| Test Validity | 139 |
| Test Reliability | 48 |
| Test Construction | 47 |
| Testing | 37 |
| Computer Assisted Testing | 29 |
| Evaluation Methods | 25 |
| Testing Problems | 22 |
| Test Items | 21 |
| Student Evaluation | 19 |
| Factor Analysis | 18 |
| More ▼ | |
Source
Author
Publication Type
Education Level
Audience
| Researchers | 4 |
| Practitioners | 3 |
Location
| Malaysia | 3 |
| Japan | 2 |
| United Kingdom (England) | 2 |
| Arkansas | 1 |
| Asia | 1 |
| Australia | 1 |
| California | 1 |
| Canada | 1 |
| China | 1 |
| China (Beijing) | 1 |
| Georgia | 1 |
| More ▼ | |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Yan Jin; Jason Fan – Language Assessment Quarterly, 2023
In language assessment, AI technology has been incorporated in task design, assessment delivery, automated scoring of performance-based tasks, score reporting, and provision of feedback. AI technology is also used for collecting and analyzing performance data in language assessment validation. Research has been conducted to investigate the…
Descriptors: Language Tests, Artificial Intelligence, Computer Assisted Testing, Test Format
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024
Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…
Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction
Pere J. Ferrando; Fabia Morales-Vives; Ana Hernández-Dorado – Educational and Psychological Measurement, 2024
In recent years, some models for binary and graded format responses have been proposed to assess unipolar variables or "quasi-traits." These studies have mainly focused on clinical variables that have traditionally been treated as bipolar traits. In the present study, we have made a proposal for unipolar traits measured with continuous…
Descriptors: Item Analysis, Goodness of Fit, Accuracy, Test Validity
Ackerman, Debra J. – ETS Research Report Series, 2020
Over the past 8 years, U.S. kindergarten classrooms have been impacted by policies mandating or recommending the administration of a specific kindergarten entry assessment (KEA) in the initial months of school as well as the increasing reliance on digital technology in the form of mobile apps, touchscreen devices, and online data platforms. Using…
Descriptors: Kindergarten, School Readiness, Computer Assisted Testing, Preschool Teachers
Ketabi, Somaye; Alavi, Seyyed Mohammed; Ravand, Hamdollah – International Journal of Language Testing, 2021
Although Diagnostic Classification Models (DCMs) were introduced to education system decades ago, it seems that these models were not employed for the original aims upon which they had been designed. Using DCMs has been mostly common in analyzing large-scale non-diagnostic tests and these models have been rarely used in developing Cognitive…
Descriptors: Diagnostic Tests, Test Construction, Goodness of Fit, Classification
Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022
Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…
Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research
von Davier, Matthias; Khorramdel, Lale; He, Qiwei; Shin, Hyo Jeong; Chen, Haiwen – Journal of Educational and Behavioral Statistics, 2019
International large-scale assessments (ILSAs) transitioned from paper-based assessments to computer-based assessments (CBAs) facilitating the use of new item types and more effective data collection tools. This allows implementation of more complex test designs and to collect process and response time (RT) data. These new data types can be used to…
Descriptors: International Assessment, Computer Assisted Testing, Psychometrics, Item Response Theory
Carlson, Tiffany; Crepeau-Hobson, Franci – Communique, 2021
When the coronavirus pandemic was declared a public health crisis in March 2020, school psychologists were forced into situations where face-to-face interaction with their students was discouraged and in some cases, prohibited. Consequently, the traditional practice of school psychology abruptly ended. Individualized Education Plans (IEP) and…
Descriptors: Cognitive Tests, Ethics, Decision Making, Models
Kaufman, Alan S. – Journal of Intelligence, 2021
U.S. Supreme Court justices and other federal judges are, effectively, appointed for life, with no built-in check on their cognitive functioning as they approach old age. There is about a century of research on aging and intelligence that shows the vulnerability of processing speed, fluid reasoning, visual-spatial processing, and working memory to…
Descriptors: Judges, Federal Government, Aging (Individuals), Decision Making
Ligon, Kateryna V.; Rowell, R. Kevin; Stoltz, Kevin B. – Journal of Leadership Education, 2019
The basis of this study is Kelley's (1992) two-dimensional model, which measures five follower types. Previous investigations did not support the validity of Kelley's model. Although the model is utilized in research, the validity and reliability of the Kelley Followership Questionnaire (KFQ) is still in question. In this study, the KFQ validity…
Descriptors: Questionnaires, Critical Thinking, Correlation, Test Validity
Esfandiari, Mohammad Reza; Riasati, Mohammad Javad; Vaezian, Helia; Rahimi, Forough – Language Testing in Asia, 2018
Background: Validity is a notable concept in language testing which has concerned many researchers and scholars in the field of language testing due to its importance in decision making process. Tests' results always introduce consequences to test takers' lives which emphasizes the need to ensure their validity. Detecting and delineating the…
Descriptors: Computer Assisted Testing, Test Validity, Language Tests, English (Second Language)
Becker, Kirk A.; Bergstrom, Betty A. – Practical Assessment, Research & Evaluation, 2013
The need for increased exam security, improved test formats, more flexible scheduling, better measurement, and more efficient administrative processes has caused testing agencies to consider converting the administration of their exams from paper-and-pencil to computer-based testing (CBT). Many decisions must be made in order to provide an optimal…
Descriptors: Testing, Models, Testing Programs, Program Administration
Examining the Structure of Vocational Interests in Turkey in the Context of the Personal Globe Model
Vardarli, Bade; Özyürek, Ragip; Wilkins-Yel, Kerrie G.; Tracey, Terence J. G. – International Journal for Educational and Vocational Guidance, 2017
The structural validity of the Personal Globe Inventory-Short (PGI-S: Tracey in J Vocat Behavi 76:1-15, 2010) was examined in a Turkish sample of high school and university students. The PGI-S measures eight basic interest scales, Holland's ("Making vocational choice," Prentice-Hall, Englewood Cliffs, 1997) six types, Prediger's ("J…
Descriptors: Foreign Countries, Vocational Interests, Interest Inventories, Models
Lewis, Todd F. – Measurement and Evaluation in Counseling and Development, 2017
American Educational Research Association (AERA) standards stipulate that researchers show evidence of the internal structure of instruments. Confirmatory factor analysis (CFA) is one structural equation modeling procedure designed to assess construct validity of assessments that has broad applicability for counselors interested in instrument…
Descriptors: Educational Research, Factor Analysis, Structural Equation Models, Construct Validity

Peer reviewed
Direct link
