Showing 3,031 to 3,045 of 9,552 results
Peer reviewed
PDF on ERIC
Liu, Yan; Zumbo, Bruno D.; Gustafson, Paul; Huang, Yi; Kroc, Edward; Wu, Amery D. – Practical Assessment, Research & Evaluation, 2016
A variety of differential item functioning (DIF) methods have been proposed and used to ensure that a test is fair to all test takers in a target population in situations where, for example, a test is translated into other languages. However, once a method flags an item as DIF, it is difficult to conclude that the grouping variable (e.g.,…
Descriptors: Test Items, Test Bias, Probability, Scores
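As background for entries like the one above, one classic DIF statistic (not necessarily the method this article examines, which concerns interpreting already-flagged items) is the Mantel-Haenszel common odds ratio, computed over 2x2 tables of group by correctness within matched score strata. A minimal sketch with made-up counts:

```python
# Sketch of the Mantel-Haenszel common odds ratio, a classic DIF index
# (illustrative only; the tables below are invented for the example).
def mh_odds_ratio(strata):
    """strata: list of 2x2 tables (a, b, c, d) per matched score level,
    where a = reference correct, b = reference incorrect,
          c = focal correct,     d = focal incorrect."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two score strata with identical odds in both groups -> odds ratio 1 (no DIF)
tables = [(20, 10, 40, 20), (30, 5, 15, 2.5)]
print(mh_odds_ratio(tables))  # 1.0
```

An odds ratio far from 1 (conventionally flagged via the ETS delta scale) indicates the item favors one group after matching on ability.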
Peer reviewed
PDF on ERIC
Demir, Mevhibe Kobak; Gür, Hülya – Educational Research and Reviews, 2016
This study aimed to develop a valid and reliable perception scale to determine pre-service teachers' perceptions of the use of WebQuest in mathematics teaching. The study was conducted with 115 junior and senior pre-service teachers at Balikesir University's Faculty of Education, Computer Education and Instructional…
Descriptors: Foreign Countries, Attitude Measures, Likert Scales, Test Construction
Peer reviewed
Direct link
Ciftci, S. Koza; Karadag, Engin – Africa Education Review, 2016
The aim of this study was to evaluate students' perceptions of the quality of mathematics education and to develop a reliable and valid measurement tool. The research was conducted with 638 (first study) and 407 (second study) secondary school students in Eskisehir, Turkey. Item discrimination, structural validity (exploratory factor analysis and…
Descriptors: Educational Quality, Mathematics Education, Test Construction, Foreign Countries
Peer reviewed
Direct link
Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D. – Educational and Psychological Measurement, 2013
The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R^2 and…
Descriptors: Item Analysis, Test Items, Effect Size, Statistical Analysis
Peer reviewed
Direct link
de la Torre, Jimmy; Lee, Young-Sun – Journal of Educational Measurement, 2013
This article used the Wald test to evaluate the item-level fit of a saturated cognitive diagnosis model (CDM) relative to the fits of the reduced models it subsumes. A simulation study was carried out to examine the Type I error and power of the Wald test in the context of the G-DINA model. Results show that when the sample size is small and a…
Descriptors: Statistical Analysis, Test Items, Goodness of Fit, Error of Measurement
Peer reviewed
Direct link
Grunert, Megan L.; Raker, Jeffrey R.; Murphy, Kristen L.; Holme, Thomas A. – Journal of Chemical Education, 2013
The concept of assigning partial credit on multiple-choice test items is considered for items from ACS Exams. Because the items on these exams, particularly the quantitative items, use common student errors to define incorrect answers, it is possible to assign partial credits to some of these incorrect responses. To do so, however, it becomes…
Descriptors: Multiple Choice Tests, Scoring, Scoring Rubrics, Science Tests
Peer reviewed
Direct link
Dodonova, Yulia A.; Dodonov, Yury S. – Intelligence, 2013
Using more complex items than those commonly employed within the information-processing approach, but still easier than those used in intelligence tests, this study analyzed how the association between processing speed and accuracy level changes as the difficulty of the items increases. The study involved measuring cognitive ability using Raven's…
Descriptors: Difficulty Level, Intelligence Tests, Cognitive Ability, Accuracy
Peer reviewed
Direct link
Hua, Jing; Gu, Guixiong; Meng, Wei; Wu, Zhuochun – Research in Developmental Disabilities: A Multidisciplinary Journal, 2013
The aim of this paper was to examine the validity and reliability of age band 1 of the Movement Assessment Battery for Children-Second Edition (MABC-2) in preparation for its standardization in mainland China. Interrater and test-retest reliability of the MABC-2 was estimated using Intraclass Correlation Coefficient (ICC). Cronbach's alpha for…
Descriptors: Factor Analysis, Test Items, Foreign Countries, Psychometrics
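The entry above reports Cronbach's alpha among its reliability indices. The standard formula is alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); a minimal sketch (the ICC computation is omitted, and the data below are invented):

```python
import numpy as np

# Minimal sketch of Cronbach's alpha, the internal-consistency index the
# study reports (example data are made up for illustration).
def cronbach_alpha(scores):
    """scores: (n_persons, k_items) array of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Perfectly parallel items -> alpha == 1
data = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(cronbach_alpha(data), 3))  # 1.0
```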
Peer reviewed
Direct link
Hong, Eunsook; Peng, Yun; O'Neil, Harold F., Jr.; Wu, Junbin – Journal of Creative Behavior, 2013
The study examined the effects of gender and item content of domain-general and domain-specific creative-thinking tests on four subscale scores of creative-thinking (fluency, flexibility, originality, and elaboration). Chinese tenth-grade students (234 males and 244 females) participated in the study. Domain-general creative thinking was measured…
Descriptors: Creative Thinking, Creativity Tests, Gender Differences, Test Items
Peer reviewed
Direct link
Luecht, Richard M. – Journal of Applied Testing Technology, 2013
Assessment engineering is a new way to design and implement scalable, sustainable and ideally lower-cost solutions to the complexities of designing and developing tests. It represents a merger of sorts between cognitive task modeling and engineering design principles--a merger that requires some new thinking about the nature of score scales, item…
Descriptors: Engineering, Test Construction, Test Items, Models
Peer reviewed
Direct link
Kim, Jihye; Oshima, T. C. – Educational and Psychological Measurement, 2013
In a typical differential item functioning (DIF) analysis, a significance test is conducted for each item. As a test consists of multiple items, such multiple testing may increase the possibility of making a Type I error at least once. The goal of this study was to investigate how to control a Type I error rate and power using adjustment…
Descriptors: Test Bias, Test Items, Statistical Analysis, Error of Measurement
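The multiple-testing problem described above is usually handled with p-value adjustments. Two common procedures (illustrative; the study's exact adjustment methods may differ) are Bonferroni and Benjamini-Hochberg:

```python
# Sketch of two standard p-value adjustments for multiple DIF tests
# (Bonferroni controls familywise error; Benjamini-Hochberg controls
# the false discovery rate). Example p-values are made up.
def bonferroni(pvals):
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    prev = 1.0
    for rank in range(m, 0, -1):           # walk from the largest p downward
        i = order[rank - 1]
        prev = min(prev, pvals[i] * m / rank)  # enforce monotone adjusted p's
        adj[i] = prev
    return adj

ps = [0.001, 0.02, 0.03, 0.6]
print(bonferroni(ps))          # [0.004, 0.08, 0.12, 1.0]
print(benjamini_hochberg(ps))
```

Bonferroni is the more conservative of the two, which is exactly the power trade-off such studies quantify.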
Peer reviewed
Direct link
Han, Kyung T. – Applied Psychological Measurement, 2013
Most computerized adaptive testing (CAT) programs do not allow test takers to review and change their responses because it could seriously deteriorate the efficiency of measurement and make tests vulnerable to manipulative test-taking strategies. Several modified testing methods have been developed that provide restricted review options while…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Testing
Store, Davie – ProQuest LLC, 2013
Although some research has been carried out on certain types of context effects under the nonequivalent anchor test (NEAT) design, the impact of these effects on actual scores is less well understood. In addition, the issue of the impact of item context effects on scores has not been investigated extensively when item…
Descriptors: Test Items, Equated Scores, Accuracy, Item Response Theory
Peer reviewed
Direct link
Maydeu-Olivares, Alberto; Montano, Rosa – Psychometrika, 2013
We investigate the performance of three statistics, R_1, R_2 (Glas in "Psychometrika" 53:525-546, 1988), and M_2 (Maydeu-Olivares & Joe in "J. Am. Stat. Assoc." 100:1009-1020, 2005, "Psychometrika" 71:713-732, 2006) to assess the overall fit of a one-parameter logistic model…
Descriptors: Foreign Countries, Item Response Theory, Statistics, Data Analysis
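The one-parameter logistic (Rasch) model whose fit the entry above assesses gives the probability of a correct response as a function of the difference between ability theta and item difficulty b. A minimal sketch (parameter values are invented):

```python
import math

# The one-parameter logistic (Rasch) item response function:
# P(correct | theta, b) = 1 / (1 + exp(-(theta - b))).
def p_correct(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(p_correct(0.0, 0.0))            # 0.5: ability equals difficulty
print(round(p_correct(2.0, 0.0), 3))  # higher ability -> higher probability
```

Fit statistics such as R_1, R_2, and M_2 compare response frequencies observed in the data against the frequencies this function predicts.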
Peer reviewed
Direct link
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas – Psychometrika, 2013
Most dichotomous item response models share the assumption of latent monotonicity, which states that the probability of a positive response to an item is a nondecreasing function of a latent variable intended to be measured. Latent monotonicity cannot be evaluated directly, but it implies manifest monotonicity across a variety of observed scores,…
Descriptors: Item Response Theory, Statistical Inference, Probability, Psychometrics
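One observable implication of latent monotonicity mentioned in the entry above is manifest monotonicity: the proportion answering an item correctly should be nondecreasing in the rest score (the total on the remaining items). A minimal check of that property (the response matrix is invented; this is a naive version of the tests the article formalizes):

```python
import numpy as np

# Sketch of a manifest-monotonicity check: item proportion correct should
# never decrease across rest-score groups. Data are made up for illustration.
def manifest_monotone(responses, item):
    """responses: (n_persons, k_items) 0/1 matrix. True if the item's
    proportion correct is nondecreasing across rest-score groups."""
    responses = np.asarray(responses)
    rest = responses.sum(axis=1) - responses[:, item]   # score on other items
    props = [responses[rest == r, item].mean() for r in np.unique(rest)]
    return all(a <= b for a, b in zip(props, props[1:]))

data = np.array([[0, 0, 0],
                 [1, 0, 0],
                 [0, 1, 1],
                 [1, 1, 1]])
print(manifest_monotone(data, 0))  # True
```

In practice such checks require grouping sparse rest scores and accounting for sampling error, which is where the order-constrained inference discussed in the article comes in.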