Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 3 |
Since 2006 (last 20 years) | 18 |
Descriptor
Comparative Analysis | 30 |
Scores | 30 |
Test Theory | 30 |
Item Response Theory | 8 |
Correlation | 7 |
Test Items | 7 |
Foreign Countries | 6 |
Test Reliability | 6 |
Testing | 6 |
Language Tests | 5 |
Computation | 4 |
More ▼ |
Source
Author
Publication Type
Reports - Research | 23 |
Journal Articles | 20 |
Speeches/Meeting Papers | 6 |
Reports - Evaluative | 4 |
Reports - Descriptive | 2 |
Dissertations/Theses -… | 1 |
Dissertations/Theses -… | 1 |
Opinion Papers | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 6 |
Postsecondary Education | 4 |
Elementary Education | 1 |
Grade 4 | 1 |
Grade 5 | 1 |
Grade 6 | 1 |
Audience
Researchers | 2 |
Policymakers | 1 |
Practitioners | 1 |
Laws, Policies, & Programs
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
ACT Assessment | 1 |
Childrens Depression Inventory | 1 |
SAT (College Admission Test) | 1 |
What Works Clearinghouse Rating
Polat, Murat; Turhan, Nihan S.; Toraman, Cetin – Pegem Journal of Education and Instruction, 2022
Testing English writing skills could be multi-dimensional; thus, the study aimed to compare students' writing scores calculated according to Classical Test Theory (CTT) and Multi-Facet Rasch Model (MFRM). The research was carried out in 2019 with 100 university students studying at a foreign language preparatory class and four experienced…
Descriptors: Comparative Analysis, Test Theory, Item Response Theory, Student Evaluation
Polat, Murat – International Online Journal of Education and Teaching, 2022
Foreign language testing is a multi-dimensional phenomenon and obtaining objective and error-free scores on learners' language skills is often problematic. While assessing foreign language performance on high-stakes tests, using different testing approaches including Classical Test Theory (CTT), Generalizability Theory (GT) and/or Item Response…
Descriptors: Second Language Learning, Second Language Instruction, Item Response Theory, Language Tests
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
Retnawati, Heri – Turkish Online Journal of Educational Technology - TOJET, 2015
This study aimed to compare the accuracy of the test scores as results of Test of English Proficiency (TOEP) based on paper and pencil test (PPT) versus computer-based test (CBT). Using the participants' responses to the PPT documented from 2008-2010 and data of CBT TOEP documented in 2013-2014 on the sets of 1A, 2A, and 3A for the Listening and…
Descriptors: Scores, Accuracy, Computer Assisted Testing, English (Second Language)
Culpepper, Steven Andrew – Applied Psychological Measurement, 2013
A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…
Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement
Sinharay, Sandip; Haberman, Shelby J. – International Journal of Testing, 2014
Recently there has been an increasing level of interest in subtest scores, or subscores, for their potential diagnostic value. Haberman (2008) suggested a method to determine if a subscore has added value over the total score. Researchers have often been interested in the performance of subgroups--for example, those based on gender or…
Descriptors: Scores, Achievement Tests, Language Tests, English (Second Language)
Woodruff, David; Traynor, Anne; Cui, Zhongmin; Fang, Yu – ACT, Inc., 2013
Professional standards for educational testing recommend that both the overall standard error of measurement and the conditional standard error of measurement (CSEM) be computed on the score scale used to report scores to examinees. Several methods have been developed to compute scale score CSEMs. This paper compares three methods, based on…
Descriptors: Comparative Analysis, Error of Measurement, Scores, Scaling
Bailey, Janelle M.; Johnson, Bruce; Prather, Edward E.; Slater, Timothy F. – International Journal of Science Education, 2012
Concept inventories (CIs)--typically multiple-choice instruments that focus on a single or small subset of closely related topics--have been used in science education for more than a decade. This paper describes the development and validation of a new CI for astronomy, the "Star Properties Concept Inventory" (SPCI). Questions cover the areas of…
Descriptors: Educational Strategies, Validity, Testing, Astronomy
Sussman, Joshua; Beaujean, A. Alexander; Worrell, Frank C.; Watson, Stevie – Measurement and Evaluation in Counseling and Development, 2013
Item response models (IRMs) were used to analyze Cross Racial Identity Scale (CRIS) scores. Rasch analysis scores were compared with classical test theory (CTT) scores. The partial credit model demonstrated a high goodness of fit and correlations between Rasch and CTT scores ranged from 0.91 to 0.99. CRIS scores are supported by both methods.…
Descriptors: Item Response Theory, Test Theory, Measures (Individuals), Racial Identification
Xu, Ting; Stone, Clement A. – Educational and Psychological Measurement, 2012
It has been argued that item response theory trait estimates should be used in analyses rather than number right (NR) or summated scale (SS) scores. Thissen and Orlando postulated that IRT scaling tends to produce trait estimates that are linearly related to the underlying trait being measured. Therefore, IRT trait estimates can be more useful…
Descriptors: Educational Research, Monte Carlo Methods, Measures (Individuals), Item Response Theory
Bretz, Stacey Lowery; Linenberger, Kimberly J. – Biochemistry and Molecular Biology Education, 2012
Enzyme function is central to student understanding of multiple topics within the biochemistry curriculum. In particular, students must understand how enzymes and substrates interact with one another. This manuscript describes the development of a 15-item Enzyme-Substrate Interactions Concept Inventory (ESICI) that measures student understanding…
Descriptors: Biochemistry, Science Education, Science Instruction, Scientific Concepts
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010
Will subscores provide additional information than what is provided by the total score? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…
Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods
Green-Gibson, Andrea – ProQuest LLC, 2011
This mixed, causal-comparative study was an investigation of culture infusion methods and AYP of two different public schools in Chicago, a school that infuses African culture and a school that does not. The purpose of the study was to identify if there was a significant causative relationship between culture infusion methods and Adequate Yearly…
Descriptors: Urban Schools, Public Schools, Correlation, Academic Achievement
Darrah, Marjorie; Fuller, Edgar; Miller, David – Journal of Computers in Mathematics and Science Teaching, 2010
This paper discusses a possible solution to a problem frequently encountered by educators seeking to use computer-based or multiple choice-based exams for mathematics. These assessment methodologies force a discrete grading system on students and do not allow for the possibility of partial credit. The research presented in this paper investigates…
Descriptors: College Students, College Mathematics, Calculus, Computer Assisted Testing
Wiberg, Marie; Sundstrom, Anna – Practical Assessment, Research & Evaluation, 2009
A common problem in predictive validity studies in the educational and psychological fields, e.g. in educational and employment selection, is restriction in range of the predictor variables. There are several methods for correcting correlations for restriction of range. The aim of this paper was to examine the usefulness of two approaches to…
Descriptors: Predictive Validity, Predictor Variables, Correlation, Mathematics
Previous Page | Next Page ยป
Pages: 1 | 2