Showing 3,871 to 3,885 of 9,533 results
Adams, Ray; Berezner, Alla; Jakubowski, Maciej – OECD Publishing (NJ1), 2010
This paper uses an approximate average percent-correct methodology to compare the ranks that would be obtained for PISA 2006 countries if the rankings had been derived from items judged by each country to be of highest priority for inclusion. The results reported show a remarkable consistency in the country rank orderings across different sets of…
Descriptors: Science Tests, Preferences, Test Items, Scores
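The comparison the abstract describes can be sketched in a few lines: rank countries by average percent correct on two different item sets, then check whether the orderings agree (a minimal sketch; the countries and figures below are invented for illustration, not PISA data):

```python
# Rank countries by average percent correct on two item sets and
# compare the resulting orderings (all figures here are made up).
pct_correct = {  # country -> (percent correct on set A, on set B)
    "Country A": (72.0, 70.5),
    "Country B": (65.0, 66.1),
    "Country C": (80.3, 79.9),
}

def ranking(item_set):
    # Countries ordered from highest to lowest percent correct.
    return sorted(pct_correct, key=lambda c: -pct_correct[c][item_set])

rank_a = ranking(0)
rank_b = ranking(1)
consistent = rank_a == rank_b  # True when the orderings coincide
```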
Peer reviewed
Buckendahl, Chad W.; Ferdous, Abdullah A.; Gerrow, Jack – Practical Assessment, Research & Evaluation, 2010
Many testing programs face the practical challenge of having limited resources to conduct comprehensive standard setting studies. Some researchers have suggested that replicating a group's recommended cut score on a full-length test may be possible by using a subset of the items. However, these studies were based on simulated data. This study…
Descriptors: Cutting Scores, Test Items, Standard Setting (Scoring), Methods
Peer reviewed
Sinharay, Sandip; Haberman, Shelby J.; Zwick, Rebecca – Measurement: Interdisciplinary Research and Perspectives, 2010
Several researchers (e.g., Klein, Hamilton, McCaffrey, & Stecher, 2000; Koretz & Barron, 1998; Linn, 2000) have asserted that test-based accountability, a crucial component of U.S. education policy, has resulted in score inflation. This inference has relied on comparisons with performance on other tests such as the National Assessment of…
Descriptors: Audits (Verification), Test Items, Scores, Measurement
Peer reviewed
Briggs, Derek C. – Measurement: Interdisciplinary Research and Perspectives, 2010
The use of large-scale assessments for making high stakes inferences about students and the schools in which they are situated is premised on the assumption that tests are sensitive to good instruction. An increase in the quality of classroom instruction should cause, on the average, an increase in test scores. In work with a number of colleagues…
Descriptors: Measurement, High Stakes Tests, Inferences, Scores
Peer reviewed
Raker, Jeffrey R.; Towns, Marcy H. – Chemistry Education Research and Practice, 2010
Investigations of the problem types used in college-level general chemistry examinations have been reported in this Journal and were first reported in the "Journal of Chemical Education" in 1924. This study extends the findings from general chemistry to the problems of four college-level organic chemistry courses. Three problem…
Descriptors: Benchmarking, Organic Chemistry, Science Instruction, College Science
Peer reviewed
Ip, Edward H. – Applied Psychological Measurement, 2010
The testlet response model is designed for handling items that are clustered, such as those embedded within the same reading passage. Although the testlet is a powerful tool for handling item clusters in educational and psychological testing, the interpretations of its item parameters, the conditional correlation between item pairs, and the…
Descriptors: Item Response Theory, Models, Test Items, Correlation
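The item-clustering idea behind the testlet response model can be sketched as a two-parameter logistic model with an added person-by-testlet effect (a minimal sketch in the spirit of the Bradlow, Wainer, and Wang formulation; the parameter values below are invented):

```python
import math

def p_correct(theta, a, b, gamma):
    """Probability of a correct response under a 2PL testlet model.

    theta: examinee ability; a: item discrimination; b: item difficulty;
    gamma: person-specific testlet effect shared by all items in the same
    testlet (e.g., the same reading passage), which induces the conditional
    correlation between item pairs within a cluster.
    """
    z = a * (theta - b - gamma)
    return 1.0 / (1.0 + math.exp(-z))

# With no testlet effect the model reduces to the ordinary 2PL:
baseline = p_correct(theta=0.0, a=1.0, b=0.0, gamma=0.0)      # 0.5
# A positive shared gamma lowers the probability for every item
# in that testlet for this examinee:
with_testlet = p_correct(theta=0.0, a=1.0, b=0.0, gamma=0.5)
```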
Peer reviewed
Kim, Sooyeon; Livingston, Samuel A. – Journal of Educational Measurement, 2010
Score equating based on small samples of examinees is often inaccurate for the examinee populations. We conducted a series of resampling studies to investigate the accuracy of five methods of equating in a common-item design. The methods were chained equipercentile equating of smoothed distributions, chained linear equating, chained mean equating,…
Descriptors: Equated Scores, Test Items, Item Sampling, Item Response Theory
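Of the five methods, chained linear equating is the simplest to illustrate: link the new form X to the anchor V using group 1, then link the anchor to the reference form Y using group 2 (a minimal sketch; the score vectors below are invented, whereas the study itself resampled operational data):

```python
import statistics

def linear_link(scores_from, scores_to, score):
    # Linear equating: map a score so that means and SDs match.
    m_from, s_from = statistics.mean(scores_from), statistics.pstdev(scores_from)
    m_to, s_to = statistics.mean(scores_to), statistics.pstdev(scores_to)
    return m_to + (s_to / s_from) * (score - m_from)

def chained_linear(x_group1, v_group1, y_group2, v_group2, x):
    # Chain: X -> V estimated in group 1, then V -> Y estimated in group 2.
    v = linear_link(x_group1, v_group1, x)
    return linear_link(v_group2, y_group2, v)

# Toy data chosen so the arithmetic is easy to follow:
x_group1 = [0, 10]   # new-form scores, group 1
v_group1 = [2, 6]    # anchor scores, group 1
y_group2 = [10, 30]  # reference-form scores, group 2
v_group2 = [3, 7]    # anchor scores, group 2
equated = chained_linear(x_group1, v_group1, y_group2, v_group2, x=10)  # 25.0
```

Chained mean equating is the special case that matches means only (drop the SD ratio).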
Peer reviewed
Revuelta, Javier – Psychometrika, 2010
A comprehensive analysis of difficulty for multiple-choice items requires information at different levels: the test, the items, and the alternatives. This paper introduces a new parameterization of the nominal categories model (NCM) for analyzing difficulty at these three levels. The new parameterization is referred to as the NE-NCM and is…
Descriptors: Classification, Short Term Memory, Multiple Choice Tests, Test Items
Peer reviewed
Haberman, Shelby J.; Sinharay, Sandip – Psychometrika, 2010
Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…
Descriptors: Item Response Theory, Psychometrics, Statistical Analysis, Scores
Peer reviewed
Hooker, Giles; Finkelman, Matthew – Psychometrika, 2010
Hooker, Finkelman, and Schwartzman ("Psychometrika," 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to…
Descriptors: Models, Scores, Item Response Theory, Psychometrics
Altun, Halis; Korkmaz, Özgen – Online Submission, 2012
The aim of this study is to adapt the Cooperative Learning Attitude Scale into Turkish and determine engineering students' attitudes towards cooperative learning. The study is based on a descriptive survey model. The study group consists of 466 engineering students. The validity of the scale is confirmed through exploratory factor analysis…
Descriptors: Foreign Countries, Cooperative Learning, Attitude Measures, Engineering Education
Ho, Siew Yin; Lowrie, Tom – Mathematics Education Research Group of Australasia, 2012
This study describes Singapore students' (N = 607) performance on a recently developed Mathematics Processing Instrument (MPI). The MPI comprised tasks sourced from Australia's NAPLAN and Singapore's PSLE. In addition, the MPI had a corresponding question which encouraged students to describe how they solved the respective tasks. In particular,…
Descriptors: Foreign Countries, Academic Achievement, National Competency Tests, Mathematics Tests
Louisiana Department of Education, 2012
"Louisiana Believes” embraces the principle that all children can achieve at high levels, as evidenced in Louisiana's recent adoption of the Common Core State Standards (CCSS). "Louisiana Believes" also promotes the idea that Louisiana's educators should be empowered to make decisions to support the success of their students. In…
Descriptors: Student Evaluation, Achievement Tests, Grade 8, English
Peer reviewed
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2012
This was a study of differential item functioning (DIF) for grades 4, 7, and 10 reading and mathematics items from state criterion-referenced tests. The tests were composed of multiple-choice and constructed-response items. Gender DIF was investigated using POLYSIBTEST and a Rasch procedure. The Rasch procedure flagged more items for DIF than did…
Descriptors: Test Bias, Gender Differences, Reading Tests, Mathematics Tests
Peer reviewed
Pae, Hye K. – Educational Research and Evaluation, 2012
The aim of this study was to apply Rasch modeling to an examination of the psychometric properties of the "Pearson Test of English Academic" (PTE Academic). Analyzed were 140 test-takers' scores derived from the PTE Academic database. The mean age of the participants was 26.45 (SD = 5.82), ranging from 17 to 46. Conformity of the participants'…
Descriptors: Reliability, Second Language Learning, Field Tests, Psychometrics