Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method to detect aberrant responses for educational assessments. There are many studies that investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
Wu, Tong – ProQuest LLC, 2023
This three-article dissertation aims to address three methodological challenges to ensure comparability in educational research, including scale linking, test equating, and propensity score (PS) weighting. The first study intends to improve test scale comparability by evaluating the effect of six missing data handling approaches, including…
Descriptors: Educational Research, Comparative Analysis, Equated Scores, Weighted Scores
Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…
Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis
Wasis; Kumaidi; Bastari; Mundilarto; Wintarti, Atik – Eurasian Journal of Educational Research, 2018
Purpose: This is a developmental research study that aims to develop a model of polytomous scoring based on weighting for multiple correct items in the subject of physics. Weighting was analytically applied based on question complexity and imposed penalties on wrong answers. Research Methods: Within the development model, Fenrich's development…
Descriptors: Physics, Science Education, Scoring, Secondary School Students
Christie, Fiona – Journal of Education and Work, 2017
Which are the best and worst universities in the UK for getting a job when you graduate? This question attracts readers of the employability rankings in national league tables. This study critically reviews the employability measure used in the rankings and its subsequent reporting in public news and commentary sources, such as national and local…
Descriptors: Employment Potential, Classification, Research Reports, Universities
Köhler, Hannah, Ed.; Weber, Sabine, Ed.; Brese, Falk, Ed.; Schulz, Wolfram, Ed.; Carstens, Ralph, Ed. – International Association for the Evaluation of Educational Achievement, 2018
The IEA's International Civic and Citizenship Education Study (ICCS) investigates the ways in which young people are prepared to undertake their roles as citizens in a range of countries in the second decade of the 21st century. ICCS 2016 is the second cycle of a study initiated in 2009. The ICCS 2016 user guide describes the content and format of…
Descriptors: Guides, Citizenship Education, Citizen Participation, Citizenship Responsibility
Palacio, Marcela; Gaviria, Sandra; Brown, James Dean – PROFILE: Issues in Teachers' Professional Development, 2016
Frustrations with traditional testing led a group of teachers at the English for adults program at Universidad EAFIT (Colombia) to design tests aligned with the institutional teaching philosophy and classroom practices. This article reports on a study of an item-by-item evaluation of a series of English exams for validity and reliability in an…
Descriptors: Foreign Countries, English (Second Language), Second Language Learning, Second Language Instruction
Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015
Equating test forms is an essential activity in standardized testing, with increased importance with the accountability systems in existence through the mandate of Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…
Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Arikan, Serkan; van de Vijver, Fons J. R.; Yagmur, Kutlay – EURASIA Journal of Mathematics, Science & Technology Education, 2016
Large-scale studies, such as the Trends in International Mathematics and Science Study (TIMSS), provide data to understand cross-national differences and similarities. In this study, we aimed to identify factors predicting mathematics achievement of Turkish students by comparing to Australian students. First, construct equivalence and item bias…
Descriptors: Foreign Countries, Comparative Education, Mathematics Achievement, Performance Factors
van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012
While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…
Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making
Attali, Yigal; Bridgeman, Brent; Trapani, Catherine – Journal of Technology, Learning, and Assessment, 2010
A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Test Scoring Machines
Martinez, Rebecca S.; Missall, Kristen N.; Graney, Suzanne Bamonto; Aricak, O. Tolga; Clarke, Ben – Assessment for Effective Intervention, 2009
The current study examines the technical adequacy of four Early Numeracy Curriculum-Based Measurement (EN-CBM) screening tasks: "Oral Counting" (OC), "Number Identification" (NI), "Quantity Discrimination" (QD), and "Missing Number" (MN). Results from 59 kindergarten students assessed in the fall and spring reveal moderate to high test-retest and…
Descriptors: Curriculum Based Assessment, Numeracy, Predictive Validity, Kindergarten
OECD Publishing (NJ1), 2012
The "PISA 2009 Technical Report" describes the methodology underlying the PISA 2009 survey. It examines additional features related to the implementation of the project at a level of detail that allows researchers to understand and replicate its analyses. The reader will find a wealth of information on the test and sample design,…
Descriptors: Quality Control, Research Reports, Research Methodology, Evaluation Criteria