Laws, Policies, & Programs: No Child Left Behind Act 2001
Showing all 13 results
Peer reviewed
PDF on ERIC: Download full text
Tim Jacobbe; Bob delMas; Brad Hartlaub; Jeff Haberstroh; Catherine Case; Steven Foti; Douglas Whitaker – Numeracy, 2023
The development of assessments as part of the funded LOCUS project is described. The assessments measure students' conceptual understanding of statistics as outlined in the GAISE PreK-12 Framework. Results are reported from a large-scale administration to 3,430 students in grades 6 through 12 in the United States. Items were designed to assess…
Descriptors: Statistics Education, Common Core State Standards, Student Evaluation, Elementary School Students
Peer reviewed
Direct link
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Peer reviewed
PDF on ERIC: Download full text
Wang, Zhen; Yao, Lihua – ETS Research Report Series, 2013
The current study used simulated data to investigate the properties of a newly proposed method (Yao's rater model) for modeling rater severity and its distribution under different conditions. Our study examined the effects of rater severity, distributions of rater severity, the difference between item response theory (IRT) models with rater effect…
Descriptors: Test Format, Test Items, Responses, Computation
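The rater-severity setup this abstract describes can be sketched in a few lines. This is an illustrative facets-style simulation, not Yao's actual rater model: the sample sizes, distributions, and the additive logit below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_examinees, n_items, n_raters = 500, 10, 5

# Assumed parameters: abilities, item difficulties, rater severities
theta = rng.normal(0.0, 1.0, n_examinees)   # examinee ability
b = rng.normal(0.0, 1.0, n_items)           # item difficulty
r = rng.normal(0.0, 0.5, n_raters)          # rater severity

# Facets-style logit: ability minus difficulty minus rater severity
logit = theta[:, None, None] - b[None, :, None] - r[None, None, :]
p = 1.0 / (1.0 + np.exp(-logit))            # P(score = 1)

# Simulate a dichotomous rating for every examinee x item x rater cell
scores = rng.binomial(1, p)

# More severe raters award lower average scores
mean_by_rater = scores.mean(axis=(0, 1))
```

Varying the spread of `r` (or drawing it from a non-normal distribution) is one way to mimic the "distributions of rater severity" conditions the study examines.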
Peer reviewed
Direct link
Huang, Hung-Yu; Wang, Wen-Chung – Educational and Psychological Measurement, 2013
Both testlet design and hierarchical latent traits are fairly common in educational and psychological measurements. This study aimed to develop a new class of higher order testlet response models that consider both local item dependence within testlets and a hierarchy of latent traits. Due to high dimensionality, the authors adopted the Bayesian…
Descriptors: Item Response Theory, Models, Bayesian Statistics, Computation
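The local item dependence within testlets that this abstract refers to can be illustrated with a toy 2PL testlet simulation. This is a sketch of the general testlet-response idea, not the authors' higher-order Bayesian model; all item parameters and the testlet-effect variance are made up.

```python
import numpy as np

rng = np.random.default_rng(2)

# Items within the same testlet share a person-specific testlet effect
# gamma, which induces extra (local) dependence beyond the common trait.
n_persons = 1000
testlets = [0, 0, 0, 1, 1, 1]                    # item -> testlet assignment
a = np.array([1.0, 1.2, 0.8, 1.1, 0.9, 1.0])    # discriminations (assumed)
b = np.array([-0.5, 0.0, 0.5, -0.2, 0.3, 0.8])  # difficulties (assumed)

theta = rng.normal(0, 1, n_persons)              # latent trait
gamma = rng.normal(0, 0.7, (n_persons, 2))       # testlet effects

# P(correct) = logistic(a * (theta + gamma_of_item's_testlet - b))
eta = a * (theta[:, None] + gamma[:, testlets] - b)
p = 1 / (1 + np.exp(-eta))
x = rng.binomial(1, p)
```

Because of the shared `gamma`, item pairs inside a testlet correlate more strongly than pairs from different testlets; ignoring that structure is exactly the local-dependence problem testlet models address.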
Goldhaber, Dan; Chaplin, Duncan – Center for Education Data & Research, 2012
In a provocative and influential paper, Jesse Rothstein (2010) finds that standard value added models (VAMs) suggest implausible future teacher effects on past student achievement, a finding that obviously cannot be viewed as causal. This is the basis of a falsification test (the Rothstein falsification test) that appears to indicate bias in VAM…
Descriptors: School Effectiveness, Teacher Effectiveness, Achievement Gains, Statistical Bias
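The logic of the falsification test mentioned here can be shown with a toy example: if students are tracked to next year's teachers on the basis of current achievement, the future teacher "predicts" the current score even though causation is impossible. This is an illustration of the idea only, not Rothstein's or Goldhaber and Chaplin's actual specification.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 3000
score_now = rng.normal(0, 1, n)   # this year's achievement

# Dynamic tracking (assumed): bottom third to teacher 0, middle to 1,
# top to 2, based on the current score
future_teacher = np.digitize(score_now, np.quantile(score_now, [1/3, 2/3]))

# Mean of the *past* score by *future* teacher assignment: under tracking
# these differ sharply, so a naive VAM would "find" future teacher effects
group_means = [score_now[future_teacher == t].mean() for t in range(3)]

# Under random assignment the same comparison shows no relationship
random_teacher = rng.integers(0, 3, n)
random_means = [score_now[random_teacher == t].mean() for t in range(3)]
```

A significant spread in `group_means` is the kind of implausible "effect on the past" that the falsification test treats as evidence of sorting bias.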
Peoples, Shelagh – ProQuest LLC, 2012
The purpose of this study was to determine which of three competing models would provide reliable, interpretable, and responsive measures of elementary students' understanding of the nature of science (NOS). The Nature of Science Instrument-Elementary (NOSI-E), a 28-item Rasch-based instrument, was used to assess students' NOS…
Descriptors: Scientific Principles, Science Tests, Elementary School Students, Item Response Theory
Peer reviewed
Direct link
Lee, Won-Chan – Applied Psychological Measurement, 2007
This article introduces a multinomial error model, which models an examinee's test scores obtained over repeated measurements of an assessment that consists of polytomously scored items. A compound multinomial error model is also introduced for situations in which items are stratified according to content categories and/or prespecified numbers of…
Descriptors: Simulation, Error of Measurement, Scoring, Test Items
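The multinomial error model described here has a simple generative reading: over a test of fixed length, the counts of each polytomous score category follow a multinomial with the examinee's true category probabilities, and the compound version draws separate multinomials per content stratum. The sketch below illustrates that reading; the item counts, score points, and probabilities are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# An examinee answers polytomous items scored 0, 1, or 2
n_items = 40
true_probs = np.array([0.2, 0.5, 0.3])   # assumed P(score 0), P(1), P(2)
score_points = np.array([0, 1, 2])

# One simulated administration: category counts, then the total score
counts = rng.multinomial(n_items, true_probs)
total = int(counts @ score_points)

# Compound version: items stratified by content category, each stratum
# with its own (assumed) category probabilities and fixed item count
strata = {"algebra": (25, np.array([0.3, 0.4, 0.3])),
          "geometry": (15, np.array([0.1, 0.6, 0.3]))}
compound_total = sum(int(rng.multinomial(n, p) @ score_points)
                     for n, p in strata.values())
```

Repeating the draw many times gives the score distribution over "repeated measurements," which is what the model uses to characterize measurement error.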
Peer reviewed
Direct link
Young, John W. – Educational Assessment, 2009
In this article, I specify a conceptual framework for test validity research on content assessments taken by English language learners (ELLs) in U.S. schools in grades K-12. This framework is modeled after one previously delineated by Willingham et al. (1988), which was developed to guide research on students with disabilities. In this framework…
Descriptors: Test Validity, Evaluation Research, Achievement Tests, Elementary Secondary Education
Peer reviewed
Lord, Frederic M. – Journal of Educational Measurement, 1977
A variety of practical applications of item characteristic curve test theory are discussed. Among these applications are tailored testing, two stage testing, determining whether two tests measure the same latent trait, and measuring item bias towards minority or other groups. (Author/JKS)
Descriptors: Computer Programs, Latent Trait Theory, Mastery Tests, Measurement
Peer reviewed
Bijnen, Emanuel J.; Poortinga, Ype H. – Journal of Cross-Cultural Psychology, 1988
The impressively high factor congruence coefficients observed in cross-cultural studies with the Eysenck Personality Questionnaire (EPQ) cannot be taken as sufficient evidence for the "similarity" or "essential identity" of these factors in the cultures concerned. Cross-cultural comparisons of factor scores on the EPQ are…
Descriptors: Congruence (Psychology), Cross Cultural Studies, Cultural Context, Cultural Differences
Secolsky, Charles, Ed.; Denison, D. Brian, Ed. – Routledge, Taylor & Francis Group, 2011
Increased demands for colleges and universities to engage in outcomes assessment for accountability purposes have accelerated the need to bridge the gap between higher education practice and the fields of measurement, assessment, and evaluation. The "Handbook on Measurement, Assessment, and Evaluation in Higher Education" provides higher…
Descriptors: Generalizability Theory, Higher Education, Institutional Advancement, Teacher Effectiveness
Herman, Joan L. – 1982
A formative evaluation minimum competency test model is examined. The model systematically uses assessment information to support and facilitate program improvement. In terms of the model, four inter-related qualities are essential for a sound testing program. The content validity perspective looks at how well the district has defined competency…
Descriptors: Criterion Referenced Tests, Cutting Scores, Evaluation Criteria, Formative Evaluation
Hall, William; Saunders, John – 1993
This booklet has been written to help persons interested in assessment of education and training programs in general and competency-based vocational education programs in particular. The following topics are covered in the individual sections: the meaning of the term "assessment"; the importance of assessment; curriculum models; percentages and…
Descriptors: Annotated Bibliographies, Competency Based Education, Criterion Referenced Tests, Evaluation Methods