Publication Date
In 2025 | 0 |
Since 2024 | 2 |
Since 2021 (last 5 years) | 4 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 28 |
Descriptor
Evaluation Methods | 116 |
Testing Problems | 116 |
Test Validity | 98 |
Student Evaluation | 36 |
Test Reliability | 35 |
Test Construction | 31 |
Educational Assessment | 24 |
Measurement Techniques | 21 |
Elementary Secondary Education | 19 |
Test Use | 19 |
Educational Testing | 18 |
More ▼ |
Source
Author
Bielinski, John | 2 |
Minnema, Jane | 2 |
Norris, Stephen P. | 2 |
Phillips, Gary W. | 2 |
Soar, Robert S. | 2 |
Thurlow, Martha | 2 |
Afflerbach, Peter P. | 1 |
Aiken, Lewis R. | 1 |
Al-Bahlani, Sara | 1 |
Ali Panahi | 1 |
Allehaiby, Wid Hasen | 1 |
More ▼ |
Publication Type
Education Level
Elementary Secondary Education | 15 |
Higher Education | 5 |
Postsecondary Education | 3 |
Elementary Education | 2 |
Audience
Researchers | 8 |
Practitioners | 5 |
Location
United Kingdom (England) | 4 |
Sweden | 2 |
United Kingdom | 2 |
United Kingdom (Wales) | 2 |
United States | 2 |
California | 1 |
China | 1 |
Denmark | 1 |
Germany | 1 |
Netherlands | 1 |
Poland | 1 |
More ▼ |
Laws, Policies, & Programs
Elementary and Secondary… | 2 |
Education Consolidation… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement
Ke-Hai Yuan; Zhiyong Zhang; Lijuan Wang – Grantee Submission, 2024
Mediation analysis plays an important role in understanding causal processes in social and behavioral sciences. While path analysis with composite scores was criticized to yield biased parameter estimates when variables contain measurement errors, recent literature has pointed out that the population values of parameters of latent-variable models…
Descriptors: Structural Equation Models, Path Analysis, Weighted Scores, Comparative Testing
James Dean Brown; Ali Panahi; Hassan Mohebbi – Language Teaching Research Quarterly, 2023
Panahi and Mohebbi review James Dean Brown's 50-years of research in language testing, curriculum development and research statistics with reference to an impressionistic framework for analysis containing two components with their subcomponents: Annotations (i.e., briefing and implications) and main concepts and themes (i.e., testing and teaching…
Descriptors: Second Language Learning, Second Language Instruction, Language Tests, Curriculum Development
Allehaiby, Wid Hasen; Al-Bahlani, Sara – Arab World English Journal, 2021
One of the main challenges higher educational institutions encounter amid the recent COVID-19 crisis is transferring assessment approaches from the traditional face-to-face form to the online Emergency Remote Teaching approach. A set of language assessment principles, practicality, reliability, validity, authenticity, and washback, which can be…
Descriptors: Barriers, Distance Education, Evaluation Methods, Teaching Methods
Zumbo, Bruno D.; Hubley, Anita M. – Assessment in Education: Principles, Policy & Practice, 2016
Ultimately, measures in research, testing, assessment and evaluation are used, or have implications, for ranking, intervention, feedback, decision-making or policy purposes. Explicit recognition of this fact brings the often-ignored and sometimes maligned concept of consequences to the fore. Given that measures have personal and social…
Descriptors: Testing Programs, Testing Problems, Measurement Techniques, Student Evaluation
Li, Xiangdong – Interpreter and Translator Trainer, 2018
T&I programmes aim to produce self-reliant graduates capable of judging their own work against agreed criteria and increasing their levels of competence over the course of their careers. An essential element to achieve this goal is student-centred assessment, i.e. 'assessment as learning' (AaL) (also known as 'sustainable assessment'), which…
Descriptors: Translation, Teaching Methods, Testing Problems, Validity
Brown, Nathaniel J.; Afflerbach, Peter P.; Croninger, Robert G. – Educational Psychology Review, 2014
National policy and standards documents, including the National Assessment of Educational Progress frameworks, the "Common Core State Standards" and the "Next Generation Science Standards," assert the need to assess critical-analytic thinking (CAT) across subject areas. However, assessment of CAT poses several challenges for…
Descriptors: Critical Thinking, Thinking Skills, National Standards, National Competency Tests
Popham, W. James – Phi Delta Kappan, 2014
The tests we use to evaluate student achievement may well be sound measures of what students know, but they are faulty indicators at best of how well they have been taught. A remedy to this this situation of judging teachers by the performance of their students on high-stakes tests may be in hand already. We should look to the methods successfully…
Descriptors: High Stakes Tests, Academic Achievement, Teacher Evaluation, Evaluation Methods
Ghilay, Yaron; Ghilay, Ruth – Journal of Educational Technology, 2012
The study examined advantages and disadvantages of computerised assessment compared to traditional evaluation. It was based on two samples of college students (n=54) being examined in computerised tests instead of paper-based exams. Students were asked to answer a questionnaire focused on test effectiveness, experience, flexibility and integrity.…
Descriptors: Student Evaluation, Higher Education, Comparative Analysis, Computer Assisted Testing
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Guarino, Cassandra M.; Reckase, Mark D.; Stacy, Brian W.; Wooldridge, Jeffrey M. – Education Policy Center at Michigan State University, 2014
We study the properties of two specification tests that have been applied to a variety of estimators in the context of value-added measures (VAMs) of teacher and school quality: the Hausman test for choosing between random and fixed effects and a test for feedback (sometimes called a "falsification test"). We discuss theoretical…
Descriptors: Achievement Gains, Evaluation Methods, Teacher Effectiveness, Educational Quality
Henning, Grant – English Teaching Forum, 2012
To some extent, good testing procedure, like good language use, can be achieved through avoidance of errors. Almost any language-instruction program requires the preparation and administration of tests, and it is only to the extent that certain common testing mistakes have been avoided that such tests can be said to be worthwhile selection,…
Descriptors: Testing, English (Second Language), Testing Problems, Student Evaluation
Geisinger, Kurt F. – International Journal of Testing, 2012
This article sets the stage for the description of a variety of approaches to test reviewing worldwide. It describes the importance of test reviewing as a protection of the public and of society and also the benefits of this activity for test users, who must choose measures to use in particular situations with particular clients at a particular…
Descriptors: Test Reviews, Evaluation Methods, Evaluation Criteria, Global Approach
Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010
Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…
Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics
Makransky, Guido; Glas, Cees A. W. – International Journal of Testing, 2013
Cognitive ability tests are widely used in organizations around the world because they have high predictive validity in selection contexts. Although these tests typically measure several subdomains, testing is usually carried out for a single subdomain at a time. This can be ineffective when the subdomains assessed are highly correlated. This…
Descriptors: Foreign Countries, Cognitive Ability, Adaptive Testing, Feedback (Response)