Showing 1 to 15 of 116 results
Peer reviewed
Alonzo, Alicia C.; Ke, Li – Measurement: Interdisciplinary Research and Perspectives, 2016
A new vision of science learning described in the "Next Generation Science Standards"--particularly the science and engineering practices and their integration with content--poses significant challenges for large-scale assessment. This article explores what might be learned from advances in large-scale science assessment and…
Descriptors: Science Achievement, Science Tests, Group Testing, Accountability
Peer reviewed
Simui, Francis; Chibale, Henry; Namangala, Boniface – Open Praxis, 2017
This paper focuses on the management of distance education examinations in a low-resourced North-Eastern region of Zambia. The study applies a hermeneutic phenomenology approach to generate and make sense of the data, drawing its insights from the lived experiences of two invigilators and 66 students who were purposively selected. Meaning…
Descriptors: Distance Education, Phenomenology, Testing Programs, Testing
Osness, Bonnie J. – ProQuest LLC, 2018
This qualitative case study examined the process Northcentral Technical College (NTC), a two-year institution, navigated to design and implement a standardized prior learning assessment (PLA) program. With the projected labor market shortage, PLA is one strategy colleges have been recommended to use to provide the opportunity for students to apply…
Descriptors: Prior Learning, Standardized Tests, Test Construction, Technical Institutes
Peer reviewed
Becker, Kirk A.; Bergstrom, Betty A. – Practical Assessment, Research & Evaluation, 2013
The need for increased exam security, improved test formats, more flexible scheduling, better measurement, and more efficient administrative processes has caused testing agencies to consider converting the administration of their exams from paper-and-pencil to computer-based testing (CBT). Many decisions must be made in order to provide an optimal…
Descriptors: Testing, Models, Testing Programs, Program Administration
Peer reviewed
Griffith, Stafford Alexander – Quality Assurance in Education: An International Perspective, 2017
Purpose: The purpose of this paper is to show how higher education institutions in the Caribbean may benefit from the quality assurance measures implemented by the Caribbean Examinations Council (CXC). Design/methodology/approach: The paper uses an outcomes model of quality assurance to analyse the measures implemented by the CXC to assure quality…
Descriptors: Higher Education, Quality Assurance, Testing Programs, Educational Quality
Educational Testing Service, 2011
Choosing whether to test via computer is the most difficult and consequential decision the designers of a testing program can make. The decision is difficult because of the wide range of choices available. Designers can choose where and how often the test is made available, how the test items look and function, how those items are combined into…
Descriptors: Test Items, Testing Programs, Testing, Computer Assisted Testing
Peer reviewed
Assouline, Susan G.; Lupkowski-Shoplik, Ann – Journal of Psychoeducational Assessment, 2012
The Talent Search model, founded at Johns Hopkins University by Dr. Julian C. Stanley, is fundamentally an above-level testing program. This simplistic description belies the enduring impact that the Talent Search model has had on the lives of hundreds of thousands of gifted students as well as their parents and teachers. In this article, we…
Descriptors: Testing Programs, Academically Gifted, Elementary Secondary Education, Talent
Peer reviewed
Tindal, Gerald; Nese, Joseph F. T.; Stevens, Joseph J. – Educational Assessment, 2017
For the past decade, the accountability model associated with No Child Left Behind (NCLB) emphasized proficiency on end-of-year tests; under the Every Student Succeeds Act (ESSA), proficiency within statewide testing programs, though now integrated with other measures of student learning, nevertheless remains a primary metric for…
Descriptors: Testing Programs, Middle School Students, Models, State Standards
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
It is a well-known problem in testing the fit of models to multinomial data that the full underlying contingency table will inevitably be sparse for tests of reasonable length and for realistic sample sizes. Under such conditions, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G²…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
Peer reviewed
Chen, Qian – International Journal of Science and Mathematics Education, 2014
In this study, the Trends in International Mathematics and Science Study 2007 data were used to build mathematics achievement models of fourth graders in two East Asian school systems: Hong Kong and Singapore. In each school system, eight variables at student level and nine variables at school/class level were incorporated to build an achievement…
Descriptors: Foreign Countries, Mathematics Achievement, Grade 4, Mathematics Tests
Peer reviewed
Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests. Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)
Peer reviewed
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016
Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample sizes, full-information test statistics…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
Peer reviewed
Sabatini, John; O'Reilly, Tenaha; Deane, Paul – ETS Research Report Series, 2013
This report describes the foundation and rationale for a framework designed to measure reading literacy. The aim of the effort is to build an assessment system that reflects current theoretical conceptions of reading and is developmentally sensitive across a prekindergarten to 12th grade student range. The assessment framework is intended to…
Descriptors: Reading Tests, Literacy, Models, Testing Programs
Peer reviewed
Rock, Donald A. – ETS Research Report Series, 2012
This paper provides a history of ETS's role in developing assessment instruments and psychometric procedures for measuring change in large-scale national assessments funded by the Longitudinal Studies branch of the National Center for Education Statistics. It documents the innovations developed during more than 30 years of working with…
Descriptors: Models, Educational Change, Longitudinal Studies, Educational Development
Peer reviewed
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests