Peer reviewed
Zenisky, April L.; Hambleton, Ronald K.; Robin, Frederic – Educational and Psychological Measurement, 2003
Studied a two-stage methodology for evaluating differential item functioning (DIF) in large-scale assessment data, using a sample of 60,000 students. Findings illustrate the merit of iterative approaches to DIF detection, since items identified at one stage were not necessarily the same as those identified at the…
Descriptors: Item Bias, Large Scale Assessment, Research Methodology, Test Items
Peer reviewed
Gelin, Michaela N.; Zumbo, Bruno D. – Educational and Psychological Measurement, 2003
Investigated potentially biased scale items on the Center for Epidemiological Studies Depression scale (CES-D; Radloff, 1977) in a sample of 600 adults. Overall, results indicate that the scoring method has an effect on differential item functioning (DIF), and that DIF is a property of the item, scoring method, and purpose of the assessment. (SLD)
Descriptors: Depression (Psychology), Item Bias, Scoring, Test Items
Peer reviewed
Gierl, Mark J.; Bolt, Daniel M. – International Journal of Testing, 2001
Presents an overview of nonparametric regression as it applies to differential item functioning analysis, then provides three examples illustrating how nonparametric regression can be applied to multilingual, multicultural data to study group differences. (SLD)
Descriptors: Groups, Item Bias, Nonparametric Statistics, Regression (Statistics)
Peer reviewed
Harmon, Lenore W.; Borgen, Fred H. – Journal of Career Assessment, 1995
Data from over 50,000 people in 50 occupational groups were used to revise the Strong Interest Inventory. New General Reference Samples containing over 18,000 people were used to construct scales, and nearly every scale was revised. (SK)
Descriptors: Evaluation Criteria, Interest Inventories, Measures (Individuals), Occupations
Peer reviewed
Engelhard, George, Jr.; Davis, Melodee; Hansche, Linda – Applied Measurement in Education, 1999
Examined whether reviewers on item-review committees can accurately identify test items that exhibit a variety of flaws. Results with 39 reviewers of a 75-item test show that reviewers exhibit fairly high accuracy rates overall, with statistically significant differences in judgmental accuracy among reviewers. (SLD)
Descriptors: Decision Making, Judges, Review (Reexamination), Test Construction
Peer reviewed
Lee, Guemin; Frisbie, David A. – Applied Measurement in Education, 1999
Studied the appropriateness and implications of using a generalizability theory approach to estimating the reliability of scores from tests composed of testlets. Analyses of data from two national standardization samples suggest that manipulating the number of passages is a more productive way to obtain efficient measurement than manipulating the…
Descriptors: Generalizability Theory, Models, National Surveys, Reliability
Peer reviewed
Nering, Michael L.; Meijer, Rob R. – Applied Psychological Measurement, 1998
Compared the person-response function (PRF) method for identifying examinees who respond to test items in a manner divergent from the underlying test model to the "l(z)" index of Drasgow and others (1985). Although performance of the "l(z)" index was superior in most cases, the PRF was useful in some conditions. (SLD)
Descriptors: Comparative Analysis, Item Response Theory, Models, Responses
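For reference, a minimal sketch of the "l(z)" index mentioned in the entry above, in the standard notation of Drasgow et al. (1985): with response pattern u and item response functions P_i(θ),

$$ l_0 = \sum_i \big[ u_i \ln P_i(\theta) + (1-u_i)\ln(1-P_i(\theta)) \big], \qquad l_z = \frac{l_0 - E(l_0)}{\sqrt{\mathrm{Var}(l_0)}} $$

where $E(l_0)=\sum_i [P_i \ln P_i + (1-P_i)\ln(1-P_i)]$ and $\mathrm{Var}(l_0)=\sum_i P_i(1-P_i)[\ln(P_i/(1-P_i))]^2$. Large negative values of l_z flag response patterns that are improbable under the fitted model.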
Peer reviewed
van der Linden, Wim J.; Adema, Jos J. – Journal of Educational Measurement, 1998
Proposes an algorithm for the assembly of multiple test forms in which the multiple-form problem is reduced to a series of computationally less intensive two-form problems. Illustrates how the method can be implemented using 0-1 linear programming and gives two examples. (SLD)
Descriptors: Algorithms, Linear Programming, Test Construction, Test Format
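A minimal 0-1 linear programming sketch in the spirit of the entry above, using the PuLP library with hypothetical item data; it illustrates the constraint structure of simultaneous form assembly, not the authors' algorithm for reducing the multiple-form problem to two-form problems.

```python
# Two-form assembly as a 0-1 linear program (illustrative sketch).
import random
from pulp import LpProblem, LpMaximize, LpVariable, lpSum

random.seed(0)
n_items, n_forms, form_length = 30, 2, 10
# Hypothetical Fisher information of each item at a target ability point.
info = [random.uniform(0.1, 1.0) for _ in range(n_items)]

prob = LpProblem("parallel_form_assembly", LpMaximize)
x = {(i, f): LpVariable(f"x_{i}_{f}", cat="Binary")
     for i in range(n_items) for f in range(n_forms)}

# Objective: maximize summed information across both forms.
prob += lpSum(info[i] * x[i, f] for i in range(n_items) for f in range(n_forms))

for f in range(n_forms):  # fixed test length per form
    prob += lpSum(x[i, f] for i in range(n_items)) == form_length
for i in range(n_items):  # no item may appear on more than one form
    prob += lpSum(x[i, f] for f in range(n_forms)) <= 1

prob.solve()
for f in range(n_forms):
    print(f"Form {f}:", [i for i in range(n_items) if x[i, f].value() == 1])
```

Each binary variable x[i, f] indicates whether item i is assigned to form f; the overlap constraint keeps the assembled forms disjoint while the shared objective keeps them comparable.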
Peer reviewed
Camilli, Gregory; Congdon, Peter – Journal of Educational and Behavioral Statistics, 1999
Demonstrates a method for studying differential item functioning (DIF) that can be used with dichotomous or polytomous items and that is valid for data that follow a partial credit Item Response Theory model. A simulation study shows that positively biased Type I error rates are in accord with results from previous studies. (SLD)
Descriptors: Estimation (Mathematics), Item Bias, Item Response Theory, Test Items
Peer reviewed
Carlstedt, Berit; Gustafsson, Jan-Eric; Ullstadius, Eva – Intelligence, 2000
Studied whether a change of test item sequencing, intended to increase test complexity, would cause increased involvement of general intelligence using a sample of Swedish military recruits who received heterogeneous (n=1,778) or homogeneous (n=363) tests. Items presented homogeneously showed higher general intelligence ("G") loadings.…
Descriptors: Foreign Countries, Intelligence, Military Personnel, Test Construction
Peer reviewed
Armstrong, Ronald D.; Jones, Douglas H.; Wang, Zhaobo – Journal of Educational and Behavioral Statistics, 1998
Generating a test from an item bank using a criterion based on classical test theory parameters poses considerable problems. A mathematical model is formulated that maximizes the reliability coefficient alpha, subject to logical constraints on the choice of items. Theorems ensuring appropriate application of the Lagrangian relaxation techniques are…
Descriptors: Item Banks, Mathematical Models, Reliability, Test Construction
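For context, the objective being maximized is Cronbach's coefficient alpha, which for a k-item test with item variances σ_i² and total-score variance σ_X² is

$$ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right) $$

Because σ_X² depends on the covariances among whichever items are selected, the objective is nonlinear in the 0-1 selection variables, which is what motivates relaxation techniques.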
Peer reviewed
Linacre, John Michael – Journal of Outcome Measurement, 1998
Simulation studies indicate that, for responses to complete tests, construction of Rasch measures from observational data, followed by principal components factor analysis of Rasch residuals, provides an effective means of identifying multidimensionality. The most diagnostically useful residual form was found to be the standardized residual. (SLD)
Descriptors: Factor Analysis, Identification, Item Response Theory, Simulation
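A minimal sketch of the diagnostic described in the entry above: standardized residuals from a Rasch model, followed by principal components analysis of those residuals. The data and calibration here are simulated stand-ins, not Linacre's procedure verbatim.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 500, 20
theta = rng.normal(size=n_persons)   # person measures (assumed pre-calibrated)
b = rng.normal(size=n_items)         # item difficulties
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))   # Rasch P(x=1)
x = (rng.uniform(size=p.shape) < p).astype(float)          # simulated responses

# Standardized residuals: (observed - expected) / sqrt(model variance).
z = (x - p) / np.sqrt(p * (1.0 - p))

# Principal components of the residual correlation matrix.
eigvals = np.linalg.eigvalsh(np.corrcoef(z, rowvar=False))[::-1]
print("Largest residual eigenvalues:", np.round(eigvals[:3], 2))
```

Under a unidimensional model the residual eigenvalues stay near 1; a distinctly larger first eigenvalue suggests a secondary dimension.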
Peer reviewed
Douglas, Jeff; Kim, Hae Rim; Habing, Brian; Gao, Furong – Journal of Educational and Behavioral Statistics, 1998
The local dependence of item pairs is investigated through a conditional covariance function estimation procedure. The conditioning variable used is obtained by a monotonic transformation of total score on the remaining items. Conditional covariance functions are estimated by using kernel smoothing. Several models of local dependence are…
Descriptors: Analysis of Covariance, Estimation (Mathematics), Models, Scores
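A minimal sketch of kernel-smoothed conditional covariance for one item pair, conditioning on the rest score, in the spirit of the entry above; the data, Gaussian kernel, and bandwidth are illustrative choices, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
x = (rng.uniform(size=(1000, 15)) < 0.6).astype(float)  # hypothetical 0/1 data
i, j, h = 0, 1, 1.0                                     # item pair, bandwidth
rest = x.sum(axis=1) - x[:, i] - x[:, j]                # rest score

def ksmooth(y, s, grid, h):
    """Nadaraya-Watson estimate of E[y | rest = grid] with a Gaussian kernel."""
    w = np.exp(-0.5 * ((grid[:, None] - s[None, :]) / h) ** 2)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

grid = np.linspace(rest.min(), rest.max(), 25)
# Cov(Xi, Xj | rest) = E[XiXj | rest] - E[Xi | rest] * E[Xj | rest]
ccov = (ksmooth(x[:, i] * x[:, j], rest, grid, h)
        - ksmooth(x[:, i], rest, grid, h) * ksmooth(x[:, j], rest, grid, h))
print(np.round(ccov, 3))
```

Under local independence the estimated conditional covariance hovers near zero across the rest-score range; systematic departures point to locally dependent item pairs.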
Peer reviewed
Smith, Richard M.; Schumacker, Randall E.; Bush, M. Joan – Journal of Outcome Measurement, 1998
Studied the use of item mean squares to evaluate fit to the Rasch model, along with the transformed versions of the item fit statistics. Simulations demonstrate that the critical value of the mean square used to detect misfit is affected by the type of mean square and by the number of persons in the calibration. (SLD)
Descriptors: Goodness of Fit, Item Response Theory, Simulation, Test Items
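For reference, the two item mean squares in common Rasch usage: with observed response x_ni, model expectation P_ni, variance W_ni = P_ni(1 − P_ni), and standardized residual z_ni = (x_ni − P_ni)/√W_ni,

$$ \mathrm{Outfit\,MS}_i = \frac{1}{N}\sum_{n=1}^{N} z_{ni}^2, \qquad \mathrm{Infit\,MS}_i = \frac{\sum_n W_{ni}\, z_{ni}^2}{\sum_n W_{ni}} $$

Both have expectation 1 under model-data fit, which is why, as the entry notes, usable critical values shift with sample size and with the choice of mean square.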
Peer reviewed
Kamata, Akihito – Journal of Educational Measurement, 2001
Presents the hierarchical generalized linear model (HGLM) as an explicit two-level formulation of a multilevel item response model. Shows that the HGLM is equivalent to the Rasch model, and that a characteristic of the HGLM is that person ability can be expressed as a latent regression model with person-characteristic variables. Shows that the…
Descriptors: Item Analysis, Item Response Theory, Regression (Statistics), Test Items
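One common way to write the two-level formulation Kamata describes (notation here is illustrative): at level 1, with item-indicator dummies X_qij for person j,

$$ \log\frac{p_{ij}}{1-p_{ij}} = \beta_{0j} + \sum_{q=1}^{k-1}\beta_{qj}X_{qij} $$

and at level 2, β_0j = γ_00 + u_0j with u_0j ~ N(0, τ), while the β_qj = γ_q0 are fixed. Substituting yields the Rasch form p_ij = 1/(1 + exp(−(θ_j − b_i))), with person ability θ_j = γ_00 + u_0j, which is what allows ability to be modeled as a latent regression on person-characteristic variables.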