ERIC - Search Results

Showing 1 to 15 of 25 results

Limited-Information Goodness-of-Fit Testing of Diagnostic Classification Item Response Theory Models. CRESST Report 840

Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014

It is a well-known problem in testing the fit of models to multinomial data that the full underlying contingency table will inevitably be sparse for tests of reasonable length and for realistic sample sizes. Under such conditions, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G[superscript…

Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics

Limited-Information Goodness-of-Fit Testing of Diagnostic Classification Item Response Models

Peer reviewed

Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016

Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics…

Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics

National Standardised Testing and the Diluting of English as a Second Language (ESL) in Australia

Peer reviewed

Creagh, Sue – English Teaching: Practice and Critique, 2014

The Australian field of English as a Second Language (ESL) teaching is globally respected for its research and practice achievements over a period of some 30 years. However, this essential field of pedagogy is being diluted in the current Australian reform agenda which is firmly founded on a traditional vision of English as first language, and…

Descriptors: Foreign Countries, Standardized Tests, English (Second Language), Second Language Learning

Ethnolinguistic Diversity within Australian Schools: Call for a Participant Perspective in Teacher Learning

Peer reviewed

Liyanage, Indika; Singh, Parlo; Walker, Tony – International Journal of Pedagogies and Learning, 2016

Enactment of policy on diversity and learning in Australian schools is evident in "diversity talk" in daily discourses of school teachers. From policy documents to daily staffroom conversations, there is extensive use in contemporary Western educational discourse of ethnolinguistic categories. The categorization of students to groups on…

Descriptors: Linguistics, Ethnic Groups, Multilingualism, Foreign Countries

The Long-Term Sustainability of Different Item Response Theory Scaling Methods

Peer reviewed

Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011

This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…

Descriptors: Item Response Theory, Scaling, Sustainability, Classification

The Potential Impact of Not Being Able to Create Parallel Tests on Expected Classification Accuracy

Peer reviewed

Wyse, Adam E. – Applied Psychological Measurement, 2011

In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…

Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics

A Cross-Validation of easyCBM Mathematics Cut Scores in Washington State: 2009-2010 Test. Technical Report #1105

Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011

In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM® mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…

Descriptors: Testing Programs, Mathematics Tests, Prediction, Measurement Techniques

Motivational Responses to Fitness Testing by Award Status and Gender

Peer reviewed

Domangue, Elizabeth; Solmon, Melinda – Research Quarterly for Exercise and Sport, 2010

Fitness testing is a prominent element in many physical education programs, but there has been limited investigation concerning motivation constructs associated with the testing. This study investigated the relationships among physical education students' award status and gender to achievement goals, intrinsic motivation, and intentions. After…

Descriptors: Physical Education, Testing Programs, Recognition (Achievement), Testing

Cross-Validation of easyCBM Reading Cut Scores in Oregon: 2009-2010. Technical Report #1108

Park, Bitnara Jasmine; Irvin, P. Shawn; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011

This technical report presents results from a cross-validation study designed to identify optimal cut scores when using easyCBM® reading tests in Oregon. The cross-validation study analyzes data from the 2009-2010 academic year for easyCBM® reading measures. A sample of approximately 2,000 students per grade, randomly split into two groups of…

Descriptors: Testing Programs, Reading Tests, Prediction, Measurement Techniques

A Proposed Framework of Test Administration Methods

Peer reviewed

Thompson, Nathan A. – Journal of Applied Testing Technology, 2008

The widespread application of personal computers to educational and psychological testing has substantially increased the number of test administration methodologies available to testing programs. Many of these mediums are referred to by their acronyms, such as CAT, CBT, CCT, and LOFT. The similarities between the acronyms and the methods…

Descriptors: Testing Programs, Psychological Testing, Classification, Educational Testing

Unintended Consequences of High-Stakes Testing. Information Capsule. Volume 1008

Blazer, Christie – Research Services, Miami-Dade County Public Schools, 2011

High-stakes testing is one of the most controversial issues in American education. Advocates contend that these tests encourage students to work harder, provide teachers with a stronger understanding of students' strengths and weaknesses, and allow educators to target failing schools for extra help. Critics claim that they narrow and distort the…

Descriptors: High Stakes Tests, Program Effectiveness, Dropout Rate, Testing Programs

Determining Sufficient Measurement Opportunities when Using Multiple Cut Scores

Peer reviewed

Norman, Rebecca L.; Buckendahl, Chad W. – Educational Measurement: Issues and Practice, 2008

Many educational testing programs report examinee performance at more than two levels of proficiency. Whether these assessments have the capacity to support these multiple inferences, though, is a topic that has not been widely discussed. This study proposes a method for evaluating the minimum number of measurement opportunities for reporting…

Descriptors: Testing Programs, Student Evaluation, Educational Testing, Mathematics Achievement

Principles of Work Sample Testing. Volume I: A Non-Empirical Taxonomy of Test Uses; Volume II: Evaluation of Personnel Testing Programs; Volume III: Construction and Evaluation of Work Sample Tests; Volume IV: Generalizability.

Guion, Robert M.; Ironson, Gail H. – 1979

Challenges to classical psychometric theory are examined in the context of a broader range of fundamental, derived, and intuitive measurements in psychology; the challenges include content-referenced testing, latent trait theory, and generalizability theory. A taxonomy of psychological measurement is developed, based on: (1) purposes of…

Descriptors: Classification, Latent Trait Theory, Measurement Objectives, Program Evaluation

Word Processing for Item Banking and Test Production. Final Report.

Boyd, Joseph L. – 1982

This report describes the sequence of activities that took place as the Examination Division of the New Jersey Department of Civil Service introduced a word processing system for a test item bank and for production of camera-ready test copy. The equipment selection, installation and orientation procedures are discussed. Keyboard and CRT terminals,…

Descriptors: Classification, Computer Assisted Testing, Item Banks, Occupational Tests

Vertically Articulated Performance Standards: Logic, Procedures, and Likely Classification Accuracy

Peer reviewed

Ferrara, Steve; Johnson, Eugene; Chen, Wen-Hung – Applied Measurement in Education, 2005

Psychometricians continue to develop and evaluate methods for linking test scores, both horizontally and vertically. This article describes a social moderation process for articulating (i.e., linking) performance standards across grade levels for an operational state assessment program. The researchers used generated data to evaluate the likely…

Descriptors: Grade 2, Grade 3, Scores, Error of Measurement
