Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
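The scoring idea in the abstract can be illustrated with a toy sketch (hypothetical data and thresholds, not the authors' MNN): a multilabel model emits one probability per trait, and the decision threshold is tuned toward the metric of interest, so lowering it trades accuracy for recall.

```python
# Toy multilabel scoring: per-trait probabilities become 0/1 mastery
# decisions via a threshold chosen for the target performance metric.

def classify(probs, threshold):
    """Turn per-trait probabilities into 0/1 mastery decisions."""
    return [1 if p >= threshold else 0 for p in probs]

def recall(pred, truth):
    positives = sum(truth)
    hits = sum(p and t for p, t in zip(pred, truth))
    return hits / positives if positives else 1.0

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

# Hypothetical probabilities for one test taker on a 16-trait assessment.
probs = [0.9, 0.2, 0.6, 0.8, 0.4, 0.7, 0.1, 0.55,
         0.35, 0.85, 0.45, 0.95, 0.05, 0.65, 0.3, 0.75]
truth = [1, 0, 1, 1, 1, 1, 0, 1,
         0, 1, 0, 1, 0, 1, 0, 1]

strict = classify(probs, 0.5)   # balanced threshold
lenient = classify(probs, 0.3)  # recall-oriented threshold

print(recall(lenient, truth), recall(strict, truth))  # prints 1.0 0.9
```

The same responses score differently depending on which metric the threshold is tuned for: the lenient threshold recovers every mastered trait (higher recall) at the cost of two false positives (lower accuracy).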
Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Grantee Submission, 2024
The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-In-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this paper, we lay the foundation for DIRTy…
Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction
Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Educational Measurement: Issues and Practice, 2024
The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…
Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction
Demir, Yusuf; Ertas, Abdullah – Reading Matrix: An International Online Journal, 2014
Coursebook evaluation helps practitioners decide on the most appropriate coursebook to adopt. Moreover, the evaluation process makes it possible to predict the potential strengths and weaknesses of a given coursebook. The checklist method is probably the most widely adopted way of judging coursebooks, and there are plenty of ELT coursebook evaluation…
Descriptors: Check Lists, Course Evaluation, Instructional Material Evaluation, Media Selection
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities that work with youth who are neglected, delinquent, or at risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
Eliasson, Ann-Christin – Physical & Occupational Therapy in Pediatrics, 2012
Assessments used for both clinical practice and research should show evidence of validity and reliability for the target group of people. It is easy to agree with this statement, but it is not always easy to choose the right assessment for the right purpose. Recently there have been increasing numbers of studies which investigate further the…
Descriptors: Psychometrics, Test Construction, Test Reliability, Test Validity
Sijtsma, Klaas – Psychometrika, 2012
I address two issues that were inspired by my work on the Dutch Committee on Tests and Testing (COTAN). The first issue is the understanding that test constructors and researchers using tests have of psychometric knowledge. I argue that this understanding is important for a field, like psychometrics, for which the dissemination of…
Descriptors: Foreign Countries, Psychometrics, Knowledge Level, Test Construction
Sanders, Karen – ProQuest LLC, 2010
The infusion of online education has brought about significant changes within the traditional, brick-and-mortar schoolhouse. With an increasing number of students enrolling in online courses, administrators must identify more quality online educators to meet the needs of the students in their care. Hence, the focus of this study is to answer the…
Descriptors: Electronic Learning, Research Needs, Teacher Effectiveness, Teacher Selection
Herman, Joan L.; Osmundson, Ellen; Dietel, Ronald – Assessment and Accountability Comprehensive Center, 2010
This report describes the purposes of benchmark assessments and provides recommendations for selecting and using benchmark assessments--addressing validity, alignment, reliability, fairness and bias and accessibility, instructional sensitivity, utility, and reporting issues. We also present recommendations on building capacity to support schools'…
Descriptors: Multiple Choice Tests, Test Items, Benchmarking, Educational Assessment
Hamilton, Jack A.; Mitchell, Anita M. – Career Education Quarterly, 1979
Describes a process for evaluating career education activities. Discusses the selection of the evaluation sample, selection and development of instruments, data reduction and analysis, comparison standards, and preparation of an evaluation handbook. (JOW)
Descriptors: Career Education, Data Analysis, Data Collection, Evaluation Methods
Fremer, John – 1974
This paper attempts to provide practical guidance to those individuals responsible for selecting or developing instruments for assessment programs. The question of what to measure in an assessment program is addressed at a global and a specific level. Once a developer has identified the areas to be assessed, it is necessary to consider the…
Descriptors: Educational Assessment, Evaluation Methods, Test Construction, Test Selection
Almond, Russell; Steinberg, Linda; Mislevy, Robert – 2001
This paper describes a four-process model for the operation of a generic assessment: Activity Selection, Presentation, Response Processing (Evidence Identification), and Summary Scoring (Evidence Accumulation). It discusses the relationships between the functions and responsibilities of these processes and the objects in the Instructional…
Descriptors: Chinese, Evaluation Methods, Language Proficiency, Models
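The four-process model named in the abstract can be sketched as a simple loop (an assumed structure for illustration, not the authors' code): Activity Selection picks the next task, Presentation delivers it, Response Processing identifies evidence in the response, and Summary Scoring accumulates that evidence into an estimate.

```python
# Toy four-process assessment cycle: select, present, identify evidence,
# accumulate evidence. Responses are simulated for the sake of the example.

def activity_selection(pool, administered):
    """Pick the next unadministered task (here: simply the next in the pool)."""
    remaining = [t for t in pool if t["id"] not in administered]
    return remaining[0] if remaining else None

def presentation(task):
    """Deliver the task; a canned response stands in for the test taker."""
    return task["simulated_response"]

def response_processing(task, response):
    """Evidence identification: score the raw response against the key."""
    return 1 if response == task["key"] else 0

def summary_scoring(evidence):
    """Evidence accumulation: here, a simple proportion-correct estimate."""
    return sum(evidence) / len(evidence)

pool = [
    {"id": "t1", "key": "B", "simulated_response": "B"},
    {"id": "t2", "key": "D", "simulated_response": "A"},
    {"id": "t3", "key": "C", "simulated_response": "C"},
]

administered, evidence = set(), []
while (task := activity_selection(pool, administered)) is not None:
    response = presentation(task)
    evidence.append(response_processing(task, response))
    administered.add(task["id"])

print(round(summary_scoring(evidence), 2))  # prints 0.67 (2 of 3 correct)
```

In an adaptive assessment, Activity Selection would consult the accumulated evidence when choosing the next task; here it simply walks the pool in order to keep the cycle visible.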
Shermis, Mark D.; DiVesta, Francis J. – Rowman & Littlefield Publishers, Inc., 2011
"Classroom Assessment in Action" clarifies the multi-faceted roles of measurement and assessment and their applications in a classroom setting. Comprehensive in scope, Shermis and Di Vesta explain basic measurement concepts and show students how to interpret the results of standardized tests. From these basic concepts, the authors then…
Descriptors: Student Evaluation, Standardized Tests, Scores, Measurement
Osburn, H. G.; Shoemaker, David M. – 1968
A computer program generating question series for achievement examinations was presented and the relative reliability of computer-generated and instructor-selected items was investigated. To provide validity for examinations generated by an original computer program, representative processes of construction and sampling were operationally defined.…
Descriptors: Achievement Tests, Evaluation Methods, Measurement Techniques, Test Construction
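The idea of computer-generated items drawn from an operationally defined domain can be sketched as follows (a hypothetical item form, not Osburn and Shoemaker's program): an item form is a template, and each generated item fills it with values sampled from the domain the form specifies, so parallel forms are random draws from the same domain.

```python
# Toy item generation by domain sampling: the domain is "a + b with
# a, b in 10..99", and a form is an n-item random draw from it.
import random

def generate_item(rng):
    """Sample one addition item from the operationally defined domain."""
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    return {"stem": f"{a} + {b} = ?", "key": a + b}

def generate_form(rng, n_items):
    """Draw an n-item form from the item domain."""
    return [generate_item(rng) for _ in range(n_items)]

rng = random.Random(7)  # fixed seed so a form is reproducible
for item in generate_form(rng, 5):
    print(item["stem"])
```

Because every item is a draw from the same defined domain, the reliability comparison in the abstract reduces to comparing forms sampled this way against forms of instructor-selected items.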

Guion, Robert M. – Personnel Psychology, 1978
"Content validity" has been widely but unwisely hailed as a solution to many problems in employee selection. The author argues that sampling from content domains cannot logically be substituted for criterion related validity. He suggests that evaluations of scores be based on the principle of construct validation. (Editor/RK)
Descriptors: Criterion Referenced Tests, Evaluation Methods, Personnel Evaluation, Personnel Selection