Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 4 |
Since 2006 (last 20 years) | 28 |
Descriptor
Evaluation Methods | 111 |
Measurement Techniques | 111 |
Testing | 111 |
Student Evaluation | 29 |
Measurement | 21 |
Evaluation Criteria | 20 |
Models | 20 |
Test Construction | 18 |
Program Evaluation | 17 |
Psychometrics | 17 |
Statistical Analysis | 17 |
Author
Siegel, Arthur I. | 3 |
Horst, Donald P. | 2 |
Potter, Norman R. | 2 |
Algina, James | 1 |
Allen, Paul M. | 1 |
Ames, Russell | 1 |
Anderson, Scarvia B. | 1 |
Andres De Los Reyes | 1 |
Angoff, William H. | 1 |
Asparouhov, Tihomir | 1 |
Ayala, Armando | 1 |
Education Level
Elementary Secondary Education | 3 |
Higher Education | 3 |
Early Childhood Education | 2 |
Postsecondary Education | 2 |
Adult Education | 1 |
Elementary Education | 1 |
Grade 3 | 1 |
Kindergarten | 1 |
Audience
Practitioners | 7 |
Researchers | 5 |
Teachers | 4 |
Policymakers | 1 |
Students | 1 |
Location
United Kingdom | 3 |
United States | 2 |
Australia | 1 |
California | 1 |
Canada | 1 |
Hawaii | 1 |
Hong Kong | 1 |
Maryland (Baltimore) | 1 |
Netherlands | 1 |
Pennsylvania (Philadelphia) | 1 |
South Africa | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Bilingual Education Act 1968 | 1 |
Elementary and Secondary… | 1 |
Assessments and Surveys
SAT (College Admission Test) | 2 |
Early Childhood Longitudinal… | 1 |
Marlowe Crowne Social… | 1 |
Medical College Admission Test | 1 |
Minnesota Multiphasic… | 1 |
National Assessment of… | 1 |
Sixteen Personality Factor… | 1 |
Lu, Jie; Schmidt, Matthew; Lee, Minyoung; Huang, Rui – Educational Technology Research and Development, 2022
This paper presents a systematic literature review characterizing the methodological properties of usability studies conducted on educational and learning technologies in the past 20 years. PRISMA guidelines were followed to identify, select, and review relevant research and report results. Our rigorous review focused on (1) categories of…
Descriptors: Usability, Research Methodology, Educational Technology, Evaluation Methods
Castellano, Katherine E.; McCaffrey, Daniel F. – Journal of Educational Measurement, 2020
Testing programs are often interested in using a student growth measure. This article presents analytic derivations of the accuracy of common student growth measures on both the raw scale of the test and the percentile rank scale in terms of the proportional reduction in mean squared error and the squared correlation between the estimator and…
Descriptors: Student Evaluation, Accuracy, Testing, Student Development
Andres De Los Reyes; Mo Wang; Matthew D. Lerner; Bridget A. Makol; Olivia M. Fitzpatrick; John R. Weisz – Grantee Submission, 2022
Researchers strategically assess youth mental health by soliciting reports from multiple informants. Typically, these informants (e.g., parents, teachers, youth themselves) vary in the social contexts where they observe youth. Decades of research reveal that the most common data conditions produced with this approach consist of discrepancies…
Descriptors: Mental Health, Measurement Techniques, Evaluation Methods, Research
Rambo-Hernandez, Karen E.; Warne, Russell T. – TEACHING Exceptional Children, 2015
Out-of-level testing is an underused strategy for addressing the needs of students who score in the extremes, and when used wisely, it could provide educators with a much more accurate picture of what students know. Out-of-level testing has been shown to be an effective assessment strategy with high-achieving students; however, out-of-level…
Descriptors: Testing, Student Evaluation, High Achievement, Evaluation Methods
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2017
Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the…
Descriptors: Measurement Techniques, Student Evaluation, Evaluation Methods, Testing
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
Lovett, Steve; Johnson, Jennie – Journal of the Scholarship of Teaching and Learning, 2012
The measurement of student learning is becoming increasingly important in U.S. higher education. One way to measure learning is through longitudinal testing, but this becomes especially difficult when applied to cumulative learning within programs in situations of low persistence. In particular, many Hispanic Serving Institutions (HSIs) find…
Descriptors: Testing, Student Evaluation, Evaluation Methods, Measurement Techniques
Mislevy, Robert J.; Haertel, Geneva; Cheng, Britte H.; Ructtinger, Liliana; DeBarger, Angela; Murray, Elizabeth; Rose, David; Gravel, Jenna; Colker, Alexis M.; Rutstein, Daisy; Vendlinski, Terry – Educational Research and Evaluation, 2013
Standardizing aspects of assessments has long been recognized as a tactic to help make evaluations of examinees fair. It reduces variation in irrelevant aspects of testing procedures that could advantage some examinees and disadvantage others. However, recent attention to making assessment accessible to a more diverse population of students…
Descriptors: Testing Accommodations, Access to Education, Testing, Psychometrics
Harris, Phillip; Smith, Bruce M.; Harris, Joan – Rowman & Littlefield Publishers, Inc., 2011
Pundits, politicians, and business leaders continually make claims for what standardized tests can do, and those claims go largely unchallenged because they are in line with popular assumptions about what these tests can do, what the scores mean, and the psychology of human motivation. But most of what these opinion leaders say--and the…
Descriptors: Accountability, Academic Achievement, Evaluation Methods, Autobiographies
Ogunnaike-Lafe, Yomi; Krohn, Joan – Exchange: The Early Childhood Leaders' Magazine Since 1978, 2010
Assessment is a hotly contested issue in education today. The education policy No Child Left Behind (NCLB) emphasizes standardized testing throughout a child's schooling as a major means of assessment. Even at Head Start an attempt was made at standardized testing using the National Reporting System (NRS). Although research indicates that these…
Descriptors: Early Childhood Education, Testing, Standardized Tests, Learning Processes
von Davier, Matthias – Measurement: Interdisciplinary Research and Perspectives, 2009
In this commentary, the author points out a few issues, one being that there are models mislabeled as diagnostic, which deal with linear decompositions of item difficulties rather than estimating multidimensional skill variables. The author discusses the issue that there are many new names for essentially well-known models for multiple simultaneous…
Descriptors: Test Items, Probability, Models, Diagnostic Tests
Hancock, Gregory R. – Measurement: Interdisciplinary Research and Perspectives, 2009
As Rupp and Templin (2008) stated directly, diagnostic classification methods "are confirmatory in nature." Methods, though, are neither inherently confirmatory nor exploratory. Diagnostic classification modeling, with its analytical and computational obstacles eventually yielding as a comprehensive and potent discipline emerges, will…
Descriptors: Structural Equation Models, Test Items, Models, Diagnostic Tests
Lim, David – Journal of Vocational Education and Training, 2009
Operating a quality assurance system in tertiary education is the rule rather than the exception, because of the belief that it will improve quality. However, proving this is not easy. This study examines three ways of providing the evidence: the "a priori" method, the stepwise backtracking method, and the external evaluation method. The…
Descriptors: Testing, Quality Control, Program Effectiveness, Foreign Countries
Puhan, Gautam – Applied Measurement in Education, 2009
The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…
Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory
Have Cognitive Diagnostic Models Delivered Their Goods? Some Substantial and Methodological Concerns
Wilhelm, Oliver; Robitzsch, Alexander – Measurement: Interdisciplinary Research and Perspectives, 2009
The paper by Rupp and Templin (2008) is an excellent work on the characteristics and features of cognitive diagnostic models (CDM). In this article, the authors comment on some substantial and methodological aspects of this focus paper. They organize their comments by going through issues associated with the terms "cognitive,"…
Descriptors: Research Methodology, Test Items, Models, Diagnostic Tests