Showing 1 to 15 of 21 results
Peer reviewed
Zu, Jiyun; Puhan, Gautam – Journal of Educational Measurement, 2014
Preequating is in demand because it reduces score reporting time. In this article, we evaluated an observed-score preequating method: the empirical item characteristic curve (EICC) method, which makes preequating without item response theory (IRT) possible. EICC preequating results were compared with a criterion equating and with IRT true-score…
Descriptors: Item Response Theory, Equated Scores, Item Analysis, Item Sampling
Peer reviewed
Fitzpatrick, Anne R. – Educational Measurement: Issues and Practice, 2008
Examined in this study were the effects of reducing anchor test length on student proficiency rates for 12 multiple-choice tests administered in an annual, large-scale, high-stakes assessment. The anchor tests contained 15, 10, or 5 items. Five content-representative samples of items were drawn at each anchor test length from a…
Descriptors: Test Length, Multiple Choice Tests, Item Sampling, Student Evaluation
Peer reviewed
Waller, Niels G. – Applied Psychological Measurement, 2008
Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower-bound reliability estimates when (a) subjects have been sampled from a single population and when…
Descriptors: Test Items, Reliability, Scores, Psychometrics
OECD Publishing (NJ1), 2012
The "PISA 2009 Technical Report" describes the methodology underlying the PISA 2009 survey. It examines additional features related to the implementation of the project at a level of detail that allows researchers to understand and replicate its analyses. The reader will find a wealth of information on the test and sample design,…
Descriptors: Quality Control, Research Reports, Research Methodology, Evaluation Criteria
Peer reviewed
Rudd, Andy; Johnson, R. Burke – Studies in Educational Evaluation, 2008
As a result of the federal No Child Left Behind Act (NCLB) of 2002, the field of education has seen a heavy emphasis on the use of "scientifically based research" for designing and testing the effectiveness of new and existing educational programs. According to NCLB, when addressing basic cause and effect questions scientifically based…
Descriptors: Quasiexperimental Design, Scientific Research, Educational Research, Federal Legislation
Peer reviewed
Webster, Jeffrey Dean – International Journal of Aging and Human Development, 2007
This study examined the psychosocial correlates and psychometric properties of the Self-Assessed Wisdom Scale (SAWS) (Webster, 2003a). Seventy-three men and 98 women ranging in age from 17 to 92 years (mean age = 42.77) completed an expanded, 40-item version of the SAWS, the Loyola Generativity Scale, and the Experiences in Close Relationships Scale…
Descriptors: Measures (Individuals), Psychometrics, Construct Validity, Correlation
Peer reviewed
Huitzing, Hiddo A.; Veldkamp, Bernard P.; Verschoor, Angela J. – Journal of Educational Measurement, 2005
Several techniques exist to automatically put together a test meeting a number of specifications. In an item bank, the items are stored with their characteristics. A test is constructed by selecting a set of items that fulfills the specifications set by the test assembler. Test assembly problems are often formulated in terms of a model consisting…
Descriptors: Testing Programs, Programming, Mathematics, Item Sampling
Berger, Martijn P. F. – 1989
The problem of obtaining designs that result in the most precise parameter estimates is encountered in at least two situations where item response theory (IRT) models are used. In so-called two-stage testing procedures, certain designs that match difficulty levels of the test items with the ability of the examinees may be located. Such designs…
Descriptors: Difficulty Level, Efficiency, Equations (Mathematics), Heuristics
Peer reviewed
Taylor, Annette Kujawski – College Student Journal, 2005
This research examined 2 elements of multiple-choice test construction: balancing the key and the optimal number of options. In Experiment 1 the 3 conditions included a balanced key, overrepresentation of a and b responses, and overrepresentation of c and d responses. The results showed that error patterns were independent of the key, reflecting…
Descriptors: Comparative Analysis, Test Items, Multiple Choice Tests, Test Construction
Peer reviewed
Handel, Richard W.; Arnau, Randolph C.; Archer, Robert P.; Dandy, Kristina L. – Assessment, 2006
The Minnesota Multiphasic Personality Inventory--Adolescent (MMPI-A) and Minnesota Multiphasic Personality Inventory--2 (MMPI-2) True Response Inconsistency (TRIN) scales are measures of acquiescence and nonacquiescence included among the standard validity scales on these instruments. The goals of this study were to evaluate the effectiveness of…
Descriptors: Adolescents, Protocol Analysis, Effect Size, Personality Measures
van den Brink, Wulfert – Evaluation in Education: International Progress, 1982
Binomial models for domain-referenced testing are compared, emphasizing the assumptions underlying the beta-binomial model. Advantages and disadvantages are discussed. A proposed item sampling model is presented which takes the effect of guessing into account. (Author/CM)
Descriptors: Comparative Analysis, Criterion Referenced Tests, Item Sampling, Measurement Techniques
Linn, Robert – 1978
A series of studies on conceptual and design problems in competency-based measurement is described. The concept of validity within the context of criterion-referenced measurement is reviewed. The authors believe validation should be viewed as a process rather than an end product. It is the process of marshalling evidence to support…
Descriptors: Criterion Referenced Tests, Item Analysis, Item Sampling, Test Bias
Upp, Caroline M.; Barcikowski, Robert S. – 1981
Demands for more complete information on educational programs have emanated from national, state and local sources. Their focus is on the processes that are occurring in individual classrooms. The information that is collected to provide insight into educational programs is customarily summative in nature, answering, for example, questions…
Descriptors: Academic Achievement, Attitude Measures, Cognitive Measurement, Evaluation Methods
Peer reviewed
Cliff, Norman; Donoghue, John R. – Psychometrika, 1992
A test theory using only ordinal assumptions is presented, based on the idea that the test items are a sample from a universe of items. The sum across items of the ordinal relations for a pair of persons on the universe items is analogous to a true score. (SLD)
Descriptors: Equations (Mathematics), Estimation (Mathematics), Item Response Theory, Item Sampling
Stake, Bernadine Evans; And Others – 1983
During the last 2 years (1980-82), selected schools in the Broward County School District in Florida participated in the National Sex Equity Demonstration Project (NSEDP) to create a model for demonstration of curricular materials, educational practices, and program arrangements that feature gender-fair instruction and associated educational…
Descriptors: Administrator Attitudes, Demonstration Programs, Elementary Secondary Education, Evaluation Methods
Pages: 1  |  2