Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 4 |
Descriptor
Item Response Theory | 18 |
State Programs | 18 |
Testing Programs | 18 |
Test Construction | 5 |
Achievement Tests | 4 |
Cutting Scores | 4 |
Error of Measurement | 4 |
High Schools | 4 |
Test Items | 4 |
Comparative Analysis | 3 |
Elementary Secondary Education | 3 |
Source
Educational and Psychological… | 3 |
Applied Measurement in… | 1 |
Journal of Educational… | 1 |
Journal of Research in… | 1 |
Author
Baghi, Heibatollah | 2 |
Fan, Xitao | 2 |
Lee, Guemin | 2 |
Lewis, Daniel M. | 2 |
Ackerman, Terry | 1 |
Bolt, Daniel | 1 |
Buckley, Barbara C. | 1 |
Carvajal, Jorge | 1 |
Childs, Ruth A. | 1 |
Chin-Chance, Selvin | 1 |
Crislip, Marian A. | 1 |
Publication Type
Reports - Research | 11 |
Speeches/Meeting Papers | 9 |
Journal Articles | 6 |
Reports - Evaluative | 4 |
Reports - Descriptive | 2 |
Information Analyses | 1 |
Numerical/Quantitative Data | 1 |
Assessments and Surveys
North Carolina End of Course… | 1 |
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Quellmalz, Edys S.; Timms, Michael J.; Silberglitt, Matt D.; Buckley, Barbara C. – Journal of Research in Science Teaching, 2012
This article reports on the collaboration of six states to study how simulation-based science assessments can become transformative components of multi-level, balanced state science assessment systems. The project studied the psychometric quality, feasibility, and utility of simulation-based science assessments designed to serve formative purposes…
Descriptors: State Programs, Educational Assessment, Simulated Environment, Grade 6
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008
The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…
Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores
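The bookmark entries above rest on a simple mapping: under the Rasch model with the common RP67 response-probability criterion, a bookmarked item's difficulty translates directly into a theta cut score. A minimal sketch of that mapping; the difficulty value is illustrative, not taken from the study:

```python
import math

def rasch_prob(theta, b):
    """Rasch model probability of a correct response to an item
    with difficulty b by an examinee with ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def bookmark_cut(b_item, rp=2.0 / 3.0):
    """Theta at which the bookmarked item is answered correctly with
    probability rp (the RP67 convention): theta = b + ln(rp / (1 - rp))."""
    return b_item + math.log(rp / (1.0 - rp))

# Hypothetical bookmarked item with difficulty b = 0.5:
theta_cut = bookmark_cut(0.5)
```

The generalizability-theory analysis in the study then treats cut scores like this one as observations in a variance-components design to estimate their standard errors.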

Fan, Xitao – Educational and Psychological Measurement, 1998
This study empirically examined the behavior of item and person statistics derived from item response theory and classical test theory, using data from a large-scale statewide assessment. Findings show that the person and item statistics from the two measurement frameworks are quite comparable. (SLD)
Descriptors: Item Response Theory, State Programs, Statistical Analysis, Test Items
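As a rough illustration of the classical-test-theory side of the comparison Fan describes, the two standard classical item statistics can be computed directly from a response matrix. The response data below are made up:

```python
def item_difficulty(responses):
    """Classical item difficulty: the proportion of examinees
    answering the item correctly (the p-value)."""
    return sum(responses) / len(responses)

def point_biserial(item, total):
    """Classical item discrimination: the correlation between the 0/1
    item score and each examinee's total test score."""
    n = len(item)
    mean_i = sum(item) / n
    mean_t = sum(total) / n
    cov = sum((i - mean_i) * (t - mean_t) for i, t in zip(item, total)) / n
    sd_i = (sum((i - mean_i) ** 2 for i in item) / n) ** 0.5
    sd_t = (sum((t - mean_t) ** 2 for t in total) / n) ** 0.5
    return cov / (sd_i * sd_t)
```

Studies like Fan's compare these values against their IRT counterparts (b and a parameters) estimated from the same data.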
Holweger, Nancy; Weston, Timothy – 1998
This study compares logistic discriminant function analysis for differential item functioning (DIF) with an item response theory-based DIF detection technique, rather than the Mantel-Haenszel procedure. In this study, the area between the two item characteristic curves, also called the item characteristic curve method, is…
Descriptors: Item Bias, Item Response Theory, Performance Based Assessment, State Programs
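The area-between-curves DIF index named above can be approximated numerically: fit the item separately in the two groups, then integrate the unsigned gap between the two item characteristic curves. This is a generic sketch under a two-parameter logistic model with illustrative parameters, not the study's exact procedure:

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def area_between_iccs(a_ref, b_ref, a_foc, b_foc, lo=-4.0, hi=4.0, n=2000):
    """Unsigned area between the reference- and focal-group ICCs,
    approximated with the midpoint rule over [lo, hi]; larger
    values indicate more DIF."""
    step = (hi - lo) / n
    return sum(
        abs(icc_2pl(lo + (k + 0.5) * step, a_ref, b_ref)
            - icc_2pl(lo + (k + 0.5) * step, a_foc, b_foc)) * step
        for k in range(n)
    )
```

When only the difficulties differ (equal slopes), the area over the full theta range equals the difficulty shift itself, which makes the index easy to sanity-check.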
Crislip, Marian A.; Chin-Chance, Selvin – 2001
This paper discusses the use of two theories of item analysis and test construction, their strengths and weaknesses, and applications to the design of the Hawaii State Test of Essential Competencies (HSTEC). Traditional analyses of the data collected from the HSTEC field test were viewed from the perspectives of item difficulty levels and item…
Descriptors: Difficulty Level, Item Response Theory, Psychometrics, Reliability
Fan, Xitao; Ping, Yin – 1999
This study empirically investigated the potential negative effect of item response theory (IRT) model-data misfit on the degree of invariance of: (1) IRT item parameter estimates (item difficulty and discrimination); and (2) IRT person ability parameter estimates. A large-scale statewide assessment program test database was used, for which the…
Descriptors: Estimation (Mathematics), Goodness of Fit, High Schools, Item Response Theory
Gyagenda, Ismail S.; Engelhard, George, Jr. – 1998
The purpose of this study was to describe the Rasch model for measurement and apply the model to examine the relationship between raters, domains of written compositions, and student writing ability. Twenty raters were randomly selected from a group of 87 operational raters contracted to rate essays as part of the 1993 field test of the Georgia…
Descriptors: Difficulty Level, Essay Tests, Evaluators, High School Students

Williams, Valerie S. L.; Pommerich, Mary; Thissen, David – Journal of Educational Measurement, 1998
Created a developmental scale for the North Carolina End-of-Grade Mathematics Tests using a subset of identical test forms administered to adjacent grade levels with Thurstone scaling and Item Response Theory methods. Discusses differences in patterns produced. (Author/SLD)
Descriptors: Achievement Tests, Child Development, Comparative Analysis, Elementary Secondary Education
Baghi, Heibatollah – 1990
The Maryland Functional Testing Program (MFTP) uses the Rasch model as the statistical framework for the analysis of test items and scores. This paper is designed to assist the reader in developing an understanding of the fit statistics in the Rasch model. Background materials on application of the Rasch model in statistical analysis of the MFTP…
Descriptors: Computer Assisted Testing, Computer Software, Equated Scores, Error of Measurement
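The Rasch fit statistics Baghi's paper explains are conventionally the infit and outfit mean squares, built from squared standardized residuals between observed 0/1 responses and model probabilities. A minimal sketch, with made-up responses and abilities:

```python
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_fit(responses, thetas, b):
    """Infit (information-weighted) and outfit (unweighted) mean-square
    fit statistics for one Rasch item; values near 1.0 indicate good fit."""
    ps = [rasch_p(t, b) for t in thetas]
    sq = [(x - p) ** 2 for x, p in zip(responses, ps)]  # squared residuals
    var = [p * (1.0 - p) for p in ps]                   # binomial variances
    outfit = sum(s / v for s, v in zip(sq, var)) / len(sq)
    infit = sum(sq) / sum(var)
    return infit, outfit
```

Outfit is sensitive to unexpected responses far from an examinee's ability level; infit down-weights those extremes, which is why operational programs usually report both.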
Emenogu, Barnabas; Childs, Ruth A. – 2003
This study investigated the possible impacts of language and curriculum differences on the performance of test items by subpopulations of students. Focusing on Measurement and Geometry items completed by students in French- and English-language schools in Ontario made it possible to explore the differences and to compare the item response theory…
Descriptors: Curriculum, English, Foreign Countries, French
Lee, Guemin; Lewis, Daniel M. – 2001
The Bookmark Standard Setting Procedure (Lewis, Mitzel, and Green, 1996) is an item-response-theory-based standard setting method that has been widely implemented by state testing programs. The primary purposes of this study were to (1) estimate standard errors for cutscores that result from Bookmark standard settings under a generalizability…
Descriptors: Cutting Scores, Elementary School Students, Elementary Secondary Education, Error of Measurement
Sinclair, Norma; Pecheone, Raymond L. – 1991
The impact of test item multidimensionality was examined as it affected fitting items on a teacher licensure test to an item response theory (IRT) model. Test item data from the 1990 study of G. W. Guiton and G. Delandshere are used. The Connecticut Elementary Certification Test (CONNECT) is a licensure examination designed to assess the subject…
Descriptors: Beginning Teachers, Elementary Education, Elementary School Teachers, Higher Education

Baghi, Heibatollah; Ferrara, Steven F. – 1989
Use of item response theory (IRT), the delta plot method, and Mantel-Haenszel techniques to assess differential item functioning (DIF) across racial and gender groups associated with the Maryland Test of Citizenship Skills (MTCS) is described. The objective of this research was to determine the effect of sample size on results from these three…
Descriptors: Black Students, Citizenship Education, Comparative Analysis, Comparative Testing
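Of the three methods compared in the entry above, the Mantel-Haenszel statistic is the most directly computable: examinees are stratified by total score, and a common odds ratio is pooled across strata. A minimal sketch using hypothetical stratum counts, with the conventional ETS delta transform; the -2.35 scaling is the standard ETS convention:

```python
import math

def mantel_haenszel(strata):
    """Mantel-Haenszel common odds ratio across score strata.
    Each stratum is a tuple (ref_right, ref_wrong, focal_right, focal_wrong);
    a value of 1.0 means no DIF."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def mh_delta(alpha):
    """ETS D-DIF scale: -2.35 * ln(alpha); values near 0 indicate
    negligible DIF."""
    return -2.35 * math.log(alpha)
```

Sample-size effects of the kind the study investigates show up here directly: sparse strata make the pooled numerator and denominator unstable.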