Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 5
Since 2006 (last 20 years): 11

Author
Xu, Xueli: 3
von Davier, Matthias: 3
Allen, Nancy L.: 2
Mislevy, Robert J.: 2
Romberg, Thomas A.: 2
Yamamoto, Kentaro: 2
Beaton, Albert E.: 1
Carlson, James E.: 1
Chang, Wanchen: 1
Chun Wang: 1
Dodd, Barbara G.: 1

Publication Type
Journal Articles: 13
Reports - Evaluative: 13
Reports - Research: 11
Reports - Descriptive: 2
Speeches/Meeting Papers: 2
Tests/Questionnaires: 2
Numerical/Quantitative Data: 1

Education Level
Grade 4: 5
Grade 12: 3
Grade 8: 3
Elementary Education: 2
Elementary Secondary Education: 2
High Schools: 2
Intermediate Grades: 2
Secondary Education: 2
Higher Education: 1
Junior High Schools: 1
Middle Schools: 1

Audience
Policymakers: 1
Teachers: 1

Location
Sweden: 1
United States: 1

Assessments and Surveys
National Assessment of Educational Progress: 26
SAT (College Admission Test): 1

Patel, Nirmal; Sharma, Aditya; Shah, Tirth; Lomas, Derek – Journal of Educational Data Mining, 2021
Process Analysis is an emerging approach for discovering meaningful knowledge in temporal educational data. The study presented in this paper shows how we used Process Analysis methods on National Assessment of Educational Progress (NAEP) test data to model and predict student test-taking behavior. Our process-oriented data exploration…
Descriptors: Learning Analytics, National Competency Tests, Evaluation Methods, Prediction
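
A minimal sketch of the directly-follows counting at the core of such process analysis; the event fields and action names below are invented for illustration, not NAEP's actual log format:

```python
# Hypothetical process-oriented exploration of test-taking logs.
# Field names (student_id, action, timestamp) are assumptions.
from collections import Counter, defaultdict

events = [
    ("s1", "Enter Item", 0.0), ("s1", "Click Choice B", 12.4),
    ("s1", "Next", 14.0), ("s2", "Enter Item", 0.0),
    ("s2", "Next", 1.1),  # this student skipped the item almost immediately
]

# Group actions into one time-ordered trace per student.
traces = defaultdict(list)
for student, action, t in sorted(events, key=lambda e: (e[0], e[2])):
    traces[student].append(action)

# Count directly-follows pairs, the basic relation behind process mining.
follows = Counter()
for seq in traces.values():
    follows.update(zip(seq, seq[1:]))

for (a, b), n in follows.most_common():
    print(f"{a} -> {b}: {n}")
```
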
Jing Lu; Chun Wang; Ningzhong Shi – Grantee Submission, 2023
In high-stakes, large-scale, standardized tests with time limits, examinees are likely to engage in one of three types of behavior (e.g., van der Linden & Guo, 2008; Wang & Xu, 2015): solution behavior, rapid guessing behavior, and cheating behavior. Examinees often do not solve all items due to various…
Descriptors: High Stakes Tests, Standardized Tests, Guessing (Tests), Cheating
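
The rapid-guessing component of such behavior models is typically operationalized with response-time thresholds. A minimal sketch, with invented data and a placeholder threshold rule (10% of an item's median time; not the specific method of this paper):

```python
# Flag suspiciously fast responses as possible rapid guessing.
import statistics

response_times = {  # item -> examinee response times in seconds (invented)
    "item1": [45.0, 52.1, 2.0, 61.3, 1.5],
    "item2": [30.2, 3.1, 28.7, 33.0, 29.5],
}

for item, times in response_times.items():
    threshold = 0.10 * statistics.median(times)  # placeholder convention
    flags = [t < threshold for t in times]
    print(item, "rapid-guess flags:", flags)
```
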
Wang, Ting; Li, Min; Thummaphan, Phonraphee; Ruiz-Primo, Maria Araceli – International Journal of Testing, 2017
Contextualized items have been widely used in science testing. Despite the common use of item contexts, how a chosen context influences the reliability and validity of score inferences remains unclear. We focused on sequential cues of contextual information, referring to the order of events or descriptions presented in item contexts. We…
Descriptors: Science Tests, Cues, Difficulty Level, Test Items
Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017
This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement of about 5% in the root mean squared error of ability estimates for two designed multiple-choice tests and…
Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics
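
The comparison can be reproduced in spirit with a small simulation: generate 2PL responses, then compare ability recovery from a rescaled sum score against a likelihood-based estimate. Everything below (item parameters, rescaling, grid search) is an invented approximation, so the article's ~5% figure should not be expected exactly:

```python
import math, random

random.seed(1)
a = [1.2, 0.8, 1.5, 1.0, 0.9] * 8      # discriminations for 40 items
b = [random.gauss(0, 1) for _ in a]    # difficulties

def p(theta, ai, bi):                  # 2PL correct-response probability
    return 1.0 / (1.0 + math.exp(-ai * (theta - bi)))

grid = [g / 10 for g in range(-40, 41)]  # crude grid for ML estimation

sum_err = ml_err = 0.0
n = 500
for _ in range(n):
    theta = random.gauss(0, 1)
    x = [int(random.random() < p(theta, ai, bi)) for ai, bi in zip(a, b)]
    # Likelihood-based estimate: maximize the log-likelihood over the grid.
    ml = max(grid, key=lambda g: sum(
        math.log(p(g, ai, bi) if xi else 1 - p(g, ai, bi))
        for xi, ai, bi in zip(x, a, b)))
    # Sum-score estimate: linearly rescale the raw score onto the theta scale.
    ss = (sum(x) / len(x) - 0.5) * 4
    sum_err += (ss - theta) ** 2
    ml_err += (ml - theta) ** 2

print("RMSE sum score:", round(math.sqrt(sum_err / n), 3))
print("RMSE grid ML:  ", round(math.sqrt(ml_err / n), 3))
```
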
Rakes, Christopher R.; Ronau, Robert N. – International Journal of Research in Education and Science, 2019
The present study examined the ability of content domain (algebra, geometry, rational number, probability) to classify mathematics misconceptions. The study was conducted with 1,133 students in 53 algebra and geometry classes taught by 17 teachers from three high schools and one middle school across three school districts in a Midwestern state…
Descriptors: Mathematics Instruction, Secondary School Teachers, Middle School Teachers, Misconceptions
Whittaker, Tiffany A.; Chang, Wanchen; Dodd, Barbara G. – Applied Psychological Measurement, 2012
When tests consist of multiple-choice and constructed-response items, researchers are confronted with the question of which item response theory (IRT) model combination will appropriately represent the data collected from these mixed-format tests. This simulation study examined the performance of six model selection criteria, including the…
Descriptors: Item Response Theory, Models, Selection, Criteria
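
Such criteria trade off model fit against complexity in closed form. A sketch of the two most familiar ones, with invented log-likelihoods and parameter counts (the paper evaluates six criteria; this shows only the basic arithmetic):

```python
import math

def aic(loglik, k):        # Akaike information criterion
    return -2 * loglik + 2 * k

def bic(loglik, k, n):     # Bayesian information criterion
    return -2 * loglik + k * math.log(n)

candidates = {  # model combination -> (maximized log-likelihood, #parameters)
    "3PL + graded response":            (-10240.0, 95),
    "3PL + generalized partial credit": (-10250.0, 95),
    "2PL + generalized partial credit": (-10310.0, 65),
}
n_examinees = 1000
for name, (ll, k) in candidates.items():
    print(f"{name}: AIC={aic(ll, k):.1f}  BIC={bic(ll, k, n_examinees):.1f}")
```
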
Ip, Edward H. – Applied Psychological Measurement, 2010
The testlet response model is designed for handling items that are clustered, such as those embedded within the same reading passage. Although the testlet is a powerful tool for handling item clusters in educational and psychological testing, the interpretations of its item parameters, the conditional correlation between item pairs, and the…
Descriptors: Item Response Theory, Models, Test Items, Correlation
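
For reference, the two-parameter testlet response model is commonly written as below, where gamma is person i's random effect for the testlet d(j) containing item j (a standard formulation, e.g., Bradlow, Wainer, and Wang, 1999; not quoted from this article):

```latex
P(y_{ij} = 1 \mid \theta_i, \gamma_{i\,d(j)})
  = \frac{\exp\{a_j(\theta_i - b_j - \gamma_{i\,d(j)})\}}
         {1 + \exp\{a_j(\theta_i - b_j - \gamma_{i\,d(j)})\}}
```
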
National Assessment Governing Board, 2012
Since 1973, the National Assessment of Educational Progress (NAEP) has gathered information about student achievement in mathematics. Results of these periodic assessments, produced in print and web-based formats, provide valuable information to a wide variety of audiences. They inform citizens about the nature of students' comprehension of the…
Descriptors: Academic Achievement, Mathematics Achievement, National Competency Tests, Grade 4
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Xu and von Davier (2006) demonstrated the feasibility of using the general diagnostic model (GDM) to analyze National Assessment of Educational Progress (NAEP) proficiency data. Their work showed that the GDM analysis not only led to conclusions for gender and race groups similar to those published in the NAEP Report Card, but also allowed…
Descriptors: National Competency Tests, Models, Data Analysis, Reading Tests
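
For dichotomous items, the general diagnostic model takes a logistic form; as a reference point (standard in the GDM literature, e.g., von Davier, 2005, rather than quoted from this report), with q_{ik} the Q-matrix entry, gamma_{ik} a slope, and a_k the level of skill k:

```latex
P(X_i = 1 \mid a) =
  \frac{\exp\bigl(\beta_i + \sum_{k} q_{ik}\,\gamma_{ik}\,a_k\bigr)}
       {1 + \exp\bigl(\beta_i + \sum_{k} q_{ik}\,\gamma_{ik}\,a_k\bigr)}
```
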
Xu, Xueli; von Davier, Matthias – ETS Research Report Series, 2008
Three strategies for linking two consecutive assessments are investigated and compared by analyzing reading data for the National Assessment of Educational Progress (NAEP) using the general diagnostic model. These strategies are compared in terms of marginal and joint expectations of skills, joint probabilities of skill patterns, and item…
Descriptors: National Competency Tests, Probability, Reading Achievement, Test Items
Longford, Nicholas T. – 1994
This study is a critical evaluation of the roles of coding and scoring of missing responses to multiple-choice items in educational tests. The focus is on tests in which the test-takers have little or no motivation; in such tests, omitted and not-reached responses (as classified by the currently adopted operational rules) are quite frequent. Data from the…
Descriptors: Algorithms, Classification, Coding, Models
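
The operational rules in question typically distinguish interior blanks (omitted) from a trailing run of blanks (not reached). A minimal sketch of that classification; the response encoding (None = no answer) is an assumption:

```python
def classify_missing(responses):
    """Label each position 'answered', 'omitted', or 'not_reached'."""
    # Everything blank after the last answered item counts as not reached.
    last = max((i for i, r in enumerate(responses) if r is not None), default=-1)
    labels = []
    for i, r in enumerate(responses):
        if r is not None:
            labels.append("answered")
        elif i < last:
            labels.append("omitted")
        else:
            labels.append("not_reached")
    return labels

print(classify_missing(["B", None, "C", None, None]))
# ['answered', 'omitted', 'answered', 'not_reached', 'not_reached']
```
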
Yamamoto, Kentaro; Kulick, Edward – 1992
Test items are designed to be representative of the subject areas they measure and to reflect the importance of specific domains or item types within those subject areas. Content validity is achieved through the content specifications and the number of items from each content domain included in the test design. However, largely due to the normal…
Descriptors: Content Validity, Elementary Secondary Education, Field Tests, Mathematical Models
Rudner, Lawrence M.; And Others – 1995
Fit statistics provide a direct measure of assessment accuracy by analyzing the fit of measurement models to an individual's (or group's) response pattern. Students who lose interest during the assessment, for example, will miss exercises that are within their abilities. Such students will respond correctly to some more difficult items and…
Descriptors: Difficulty Level, Educational Assessment, Goodness of Fit, Measurement Techniques
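
One widely used person-fit measure of this kind is the standardized log-likelihood statistic l_z (Drasgow, Levine, & Williams, 1985); the excerpt does not say which statistics this paper uses, so treat the sketch as generic. Item probabilities are assumed known rather than estimated:

```python
import math

def lz(x, p):
    """x: 0/1 responses; p: model-implied probabilities of a correct answer."""
    l0 = sum(xi * math.log(pi) + (1 - xi) * math.log(1 - pi)
             for xi, pi in zip(x, p))
    mean = sum(pi * math.log(pi) + (1 - pi) * math.log(1 - pi) for pi in p)
    var = sum(pi * (1 - pi) * math.log(pi / (1 - pi)) ** 2 for pi in p)
    return (l0 - mean) / math.sqrt(var)

# Aberrant pattern: failing easy items (high p) while passing hard ones,
# the signature of a student who lost interest partway through.
print(lz([0, 0, 1, 1], [0.9, 0.8, 0.3, 0.2]))  # strongly negative
```
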
Romberg, Thomas A.; And Others – 1982
The purpose of this report is to describe the development of a pool of mathematical problem-solving situations and a set of items for each situation which provides information about students' qualitatively different levels of reasoning ability as applied to that situation. For each problem-solving situation, a set of "structured…
Descriptors: Elementary Secondary Education, Evaluation Methods, Mathematics Achievement, Models

Beaton, Albert E.; Allen, Nancy L. – Journal of Educational Statistics, 1992
The National Assessment of Educational Progress (NAEP) makes possible comparison of groups of students and provides information about what these groups know and can do. The scale anchoring techniques described in this chapter address the latter purpose. The direct method and the smoothing method of scale anchoring are discussed. (SLD)
Descriptors: Comparative Testing, Educational Assessment, Elementary Secondary Education, Knowledge Level
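
The direct method's core idea can be paraphrased as a threshold rule: an item anchors at a scale point if students at that point mostly answer it correctly while students at the next lower point mostly do not. The 0.80/0.50 cutoffs and the data below are placeholders, not NAEP's published criteria:

```python
def anchors_at(p_here, p_below, hi=0.80, lo=0.50):
    """Direct-method-style check of whether an item anchors at a point."""
    return p_here >= hi and p_below < lo

# item -> proportion correct among students scoring near each anchor point
p_correct = {
    "item_17": {250: 0.41, 300: 0.86},
    "item_23": {250: 0.62, 300: 0.78},
}
for item, by_point in p_correct.items():
    print(item, "anchors at 300:", anchors_at(by_point[300], by_point[250]))
```
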