Showing 1 to 15 of 25 results
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2014
It is a well-known problem in testing the fit of models to multinomial data that the full underlying contingency table will inevitably be sparse for tests of reasonable length and for realistic sample sizes. Under such conditions, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G[superscript…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
Peer reviewed
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen – Grantee Submission, 2016
Despite the growing popularity of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010) in educational and psychological measurement, methods for testing their absolute goodness-of-fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics…
Descriptors: Goodness of Fit, Item Response Theory, Classification, Maximum Likelihood Statistics
Peer reviewed
Creagh, Sue – English Teaching: Practice and Critique, 2014
The Australian field of English as a Second Language (ESL) teaching is globally respected for its research and practice achievements over a period of some 30 years. However, this essential field of pedagogy is being diluted in the current Australian reform agenda which is firmly founded on a traditional vision of English as first language, and…
Descriptors: Foreign Countries, Standardized Tests, English (Second Language), Second Language Learning
Peer reviewed
Liyanage, Indika; Singh, Parlo; Walker, Tony – International Journal of Pedagogies and Learning, 2016
Enactment of policy on diversity and learning in Australian schools is evident in "diversity talk" in daily discourses of school teachers. From policy documents to daily staffroom conversations, there is extensive use in contemporary Western educational discourse of ethnolinguistic categories. The categorization of students to groups on…
Descriptors: Linguistics, Ethnic Groups, Multilingualism, Foreign Countries
Peer reviewed
Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011
This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…
Descriptors: Item Response Theory, Scaling, Sustainability, Classification
Peer reviewed
Wyse, Adam E. – Applied Psychological Measurement, 2011
In many practical testing situations, alternate test forms from the same testing program are not strictly parallel to each other and instead the test forms exhibit small psychometric differences. This article investigates the potential practical impact that these small psychometric differences can have on expected classification accuracy. Ten…
Descriptors: Test Format, Test Construction, Testing Programs, Psychometrics
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011
In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM® mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…
Descriptors: Testing Programs, Mathematics Tests, Prediction, Measurement Techniques
Peer reviewed
Domangue, Elizabeth; Solmon, Melinda – Research Quarterly for Exercise and Sport, 2010
Fitness testing is a prominent element in many physical education programs, but there has been limited investigation concerning motivation constructs associated with the testing. This study investigated the relationships among physical education students' award status and gender to achievement goals, intrinsic motivation, and intentions. After…
Descriptors: Physical Education, Testing Programs, Recognition (Achievement), Testing
Park, Bitnara Jasmine; Irvin, P. Shawn; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011
This technical report presents results from a cross-validation study designed to identify optimal cut scores when using easyCBM® reading tests in Oregon. The cross-validation study analyzes data from the 2009-2010 academic year for easyCBM® reading measures. A sample of approximately 2,000 students per grade, randomly split into two groups of…
Descriptors: Testing Programs, Reading Tests, Prediction, Measurement Techniques
Peer reviewed
Thompson, Nathan A. – Journal of Applied Testing Technology, 2008
The widespread application of personal computers to educational and psychological testing has substantially increased the number of test administration methodologies available to testing programs. Many of these mediums are referred to by their acronyms, such as CAT, CBT, CCT, and LOFT. The similarities between the acronyms and the methods…
Descriptors: Testing Programs, Psychological Testing, Classification, Educational Testing
Blazer, Christie – Research Services, Miami-Dade County Public Schools, 2011
High-stakes testing is one of the most controversial issues in American education. Advocates contend that these tests encourage students to work harder, provide teachers with a stronger understanding of students' strengths and weaknesses, and allow educators to target failing schools for extra help. Critics claim that they narrow and distort the…
Descriptors: High Stakes Tests, Program Effectiveness, Dropout Rate, Testing Programs
Peer reviewed
Norman, Rebecca L.; Buckendahl, Chad W. – Educational Measurement: Issues and Practice, 2008
Many educational testing programs report examinee performance at more than two levels of proficiency. Whether these assessments have the capacity to support these multiple inferences, though, is a topic that has not been widely discussed. This study proposes a method for evaluating the minimum number of measurement opportunities for reporting…
Descriptors: Testing Programs, Student Evaluation, Educational Testing, Mathematics Achievement
Guion, Robert M.; Ironson, Gail H. – 1979
Challenges to classical psychometric theory are examined in the context of a broader range of fundamental, derived, and intuitive measurements in psychology; the challenges include content-referenced testing, latent trait theory, and generalizability theory. A taxonomy of psychological measurement is developed, based on: (1) purposes of…
Descriptors: Classification, Latent Trait Theory, Measurement Objectives, Program Evaluation
Boyd, Joseph L. – 1982
This report describes the sequence of activities that took place as the Examination Division of the New Jersey Department of Civil Service introduced a word processing system for a test item bank and for production of camera-ready test copy. The equipment selection, installation and orientation procedures are discussed. Keyboard and CRT terminals,…
Descriptors: Classification, Computer Assisted Testing, Item Banks, Occupational Tests
Peer reviewed
Ferrara, Steve; Johnson, Eugene; Chen, Wen-Hung – Applied Measurement in Education, 2005
Psychometricians continue to develop and evaluate methods for linking test scores, both horizontally and vertically. This article describes a social moderation process for articulating (i.e., linking) performance standards across grade levels for an operational state assessment program. The researchers used generated data to evaluate the likely…
Descriptors: Grade 2, Grade 3, Scores, Error of Measurement