Showing 1 to 15 of 38 results
Peer reviewed
Direct link
Akour, Mutasem; Sabah, Saed; Hammouri, Hind – Journal of Psychoeducational Assessment, 2015
The purpose of this study was to apply two types of Differential Item Functioning (DIF), net and global DIF, as well as the framework of Differential Step Functioning (DSF) to real testing data to investigate measurement invariance related to test language. Data from the Program for International Student Assessment (PISA)-2006 polytomously scored…
Descriptors: Test Bias, Science Tests, Test Items, Scoring
Peer reviewed
Direct link
Topçu, Mustafa Sami; Arikan, Serkan; Erbilgin, Evrim – Australian Educational Researcher, 2015
The OECD's Programme for International Student Assessment (PISA) enables participating countries to monitor 15-year-old students' progress in reading, mathematics, and science literacy. The present study investigates persistent factors that contribute to the science performance of Turkish students in PISA 2006 and PISA 2009. Additionally, the study…
Descriptors: Foreign Countries, Science Achievement, Science Tests, Testing Programs
Peer reviewed
Direct link
Alonzo, Alicia C.; Ke, Li – Measurement: Interdisciplinary Research and Perspectives, 2016
A new vision of science learning described in the "Next Generation Science Standards"--particularly the science and engineering practices and their integration with content--poses significant challenges for large-scale assessment. This article explores what might be learned from advances in large-scale science assessment and…
Descriptors: Science Achievement, Science Tests, Group Testing, Accountability
Peer reviewed
PDF on ERIC Download full text
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016
Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample sizes, full-information test statistics…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
Mullis, Ina V. S., Ed.; Martin, Michael O., Ed. – International Association for the Evaluation of Educational Achievement, 2014
It is critical for countries to ensure that capable secondary school students receive further preparation in advanced mathematics and science, so that they are ready to enter challenging university-level studies that prepare them for careers in science, technology, engineering, and mathematics (STEM) fields. This group of students will become the…
Descriptors: Mathematics Tests, Science Tests, Educational Assessment, Secondary School Students
Peer reviewed
Direct link
Chen, Xinnian; Graesser, Donnasue; Sah, Megha – Advances in Physiology Education, 2015
Laboratory courses serve as important gateways to science, technology, engineering, and mathematics education. One of the challenges in assessing laboratory learning is to conduct meaningful and standardized practical exams, especially for large multisection laboratory courses. Laboratory practical exams in life sciences courses are frequently…
Descriptors: Laboratory Experiments, Standardized Tests, Testing Programs, Testing Problems
Cresswell, John; Schwantner, Ursula; Waters, Charlotte – OECD Publishing, 2015
This report reviews the major international and regional large-scale educational assessments, including international surveys, school-based surveys and household-based surveys. The report compares and contrasts the cognitive and contextual data collection instruments and implementation methods used by the different assessments in order to identify…
Descriptors: International Assessment, Educational Assessment, Data Collection, Comparative Analysis
Peer reviewed
Direct link
Li, Ying; Jiao, Hong; Lissitz, Robert W. – Journal of Applied Testing Technology, 2012
This study investigated the application of multidimensional item response theory (IRT) models to validate test structure and dimensionality. Multiple content areas or domains within a single subject often exist in large-scale achievement tests (a minimal sketch of the multidimensional case follows this entry). Such areas or domains may cause multidimensionality or local item dependence, which both violate the…
Descriptors: Achievement Tests, Science Tests, Item Response Theory, Measures (Individuals)
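The Li, Jiao, and Lissitz abstract turns on the idea that distinct content domains can induce multidimensionality that a unidimensional IRT model cannot absorb. The sketch below uses a compensatory two-dimensional 2PL; the loadings, intercepts, and ability profiles are invented for illustration and are not taken from the article, which does not report its model specification in this snippet.

```python
import numpy as np

# Compensatory multidimensional 2PL: each item loads on two hypothetical
# content-domain dimensions (all values below are made up).
A = np.array([[1.2, 0.1],   # loads mostly on domain 1
              [0.2, 1.1],   # loads mostly on domain 2
              [0.8, 0.7]])  # mixed loading
D = np.array([0.3, -0.2, 0.0])  # item intercepts

def p_correct(theta: np.ndarray) -> np.ndarray:
    """Probability of a correct response on each item for a 2-D ability vector."""
    return 1.0 / (1.0 + np.exp(-(A @ theta + D)))

# Two examinees with mirror-image domain profiles get different expected
# item scores - structure a single latent dimension cannot represent.
print(p_correct(np.array([1.0, -1.0])))
print(p_correct(np.array([-1.0, 1.0])))
```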
Peer reviewed
Direct link
Quellmalz, Edys S.; Timms, Michael J.; Silberglitt, Matt D.; Buckley, Barbara C. – Journal of Research in Science Teaching, 2012
This article reports on the collaboration of six states to study how simulation-based science assessments can become transformative components of multi-level, balanced state science assessment systems. The project studied the psychometric quality, feasibility, and utility of simulation-based science assessments designed to serve formative purposes…
Descriptors: State Programs, Educational Assessment, Simulated Environment, Grade 6
Peer reviewed
PDF on ERIC Download full text
Pellegrino, James W.; Quellmalz, Edys S. – Journal of Research on Technology in Education, 2011
This paper considers uses of technology in educational assessment from the perspective of innovation and support for teaching and learning. It examines assessment cases drawn from contexts that include large-scale testing programs as well as classroom-based programs, and attempts that have been made to harness the power of technology to provide…
Descriptors: Testing Programs, Student Evaluation, Educational Assessment, Testing
Peer reviewed
Direct link
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method, in which panelists envision a response probability (RP) criterion and move through a booklet of ordered items based on that criterion (a minimal sketch of how an RP criterion implies a cut score follows this entry). This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
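Wyse's abstract describes the Bookmark method's core mechanic: the bookmarked position in a booklet of ordered items, read against a response probability (RP) criterion, implies a cut score. As a rough illustration only (the article does not specify an IRT model; the 2PL form, the RP67 value, and the item parameters below are assumptions), the RP criterion maps a bookmarked item to a theta-scale cut score like this:

```python
import math

def theta_at_rp(a: float, b: float, rp: float) -> float:
    """Ability at which a 2PL item is answered correctly with probability rp."""
    return b + math.log(rp / (1.0 - rp)) / a

# Hypothetical booklet of (discrimination, difficulty) pairs, ordered by the
# ability needed to reach the RP criterion on each item.
booklet = [(1.2, -0.8), (0.9, -0.2), (1.1, 0.4), (1.4, 1.1)]
RP = 0.67  # the RP67 criterion commonly associated with the Bookmark method

# A panelist placing the bookmark on the third item implies a cut score at
# the ability where that item is mastered with probability RP.
bookmark_index = 2
cut_theta = theta_at_rp(*booklet[bookmark_index], RP)
print(f"Implied cut score on the theta scale: {cut_theta:.2f}")
```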
Peer reviewed
PDF on ERIC Download full text
National Center for Education Statistics, 2013
The 2011 NAEP-TIMSS linking study conducted by the National Center for Education Statistics (NCES) was designed to predict Trends in International Mathematics and Science Study (TIMSS) scores for the U.S. states that participated in the 2011 National Assessment of Educational Progress (NAEP) mathematics and science assessment of eighth-grade students.…
Descriptors: Grade 8, Research Methodology, Research Design, Trend Analysis
Meyers, Jason L.; Murphy, Stephen; Goodman, Joshua; Turhan, Ahmet – Pearson, 2012
Operational testing programs employing item response theory (IRT) applications benefit from the property of item parameter invariance, whereby item parameter estimates obtained from one sample can be applied to other samples when the underlying assumptions are satisfied (a minimal sketch of this reuse follows this entry). In theory, this feature allows for applications such as computer-adaptive…
Descriptors: Equated Scores, Test Items, Test Format, Item Response Theory
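The Meyers et al. abstract appeals to item parameter invariance: parameters calibrated on one sample can, when the IRT assumptions hold, be reused to score examinees from another sample without recalibration. A minimal sketch of that reuse, assuming a 2PL model with made-up item parameters and EAP scoring (none of which come from the Pearson report), might look like:

```python
import numpy as np

# Item parameters assumed to come from an earlier calibration sample;
# the values are invented for illustration.
A = np.array([1.0, 1.3, 0.8, 1.1])    # discriminations
B = np.array([-1.0, -0.2, 0.5, 1.2])  # difficulties

def p_correct(theta: float) -> np.ndarray:
    """2PL probability of a correct response on each item at ability theta."""
    return 1.0 / (1.0 + np.exp(-A * (theta - B)))

def eap_theta(responses: np.ndarray, nodes: int = 81) -> float:
    """EAP ability estimate for a new examinee that reuses the fixed item
    parameters instead of recalibrating - the practical payoff of
    parameter invariance when the model assumptions are satisfied."""
    grid = np.linspace(-4.0, 4.0, nodes)
    prior = np.exp(-0.5 * grid ** 2)  # standard normal prior (unnormalized)
    likelihood = np.array([
        np.prod(np.where(responses == 1, p_correct(t), 1.0 - p_correct(t)))
        for t in grid
    ])
    posterior = prior * likelihood
    return float(np.sum(grid * posterior) / np.sum(posterior))

print(round(eap_theta(np.array([1, 1, 0, 1])), 2))
```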
Di Giacomo, F. Tony; Fishbein, Bethany G.; Buckley, Vanessa W. – College Board, 2013
Many articles and reports have reviewed, researched, and commented on international assessments from the perspective of exploring what is relevant for the United States' education systems. Researchers make claims about whether the top-performing systems have transferable practices or policies that could be applied to the United States. However,…
Descriptors: Comparative Testing, International Assessment, Relevance (Education), Testing Programs
Peer reviewed
Smith, P. Sean; And Others – Science Education, 1992
Describes the impact of North Carolina's testing program in chemistry on curriculum and instruction from teachers' perspectives. A random sample of 100 teachers received a questionnaire yielding a usable sample of 48, of which 8 were interviewed. Results suggest testing is making the chemistry curriculum more uniform across the state. It is not…
Descriptors: Chemistry, Educational Research, Science Curriculum, Science Education