A. R. Georgeson – Structural Equation Modeling: A Multidisciplinary Journal, 2025
There is increasing interest in using factor scores in structural equation models and there have been numerous methodological papers on the topic. Nevertheless, sum scores, which are computed from adding up item responses, continue to be ubiquitous in practice. It is therefore important to compare simulation results involving factor scores to…
Descriptors: Structural Equation Models, Scores, Factor Analysis, Statistical Bias
Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022
The generalized S-X²-test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X²-test…
Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory
Puhan, Gautam; Kim, Sooyeon – Journal of Educational Measurement, 2022
As a result of the COVID-19 pandemic, at-home testing has become a popular delivery mode in many testing programs. When programs offer at-home testing to expand their service, the score comparability between test takers testing remotely and those testing in a test center is critical. This article summarizes statistical procedures that could be…
Descriptors: Scores, Scoring, Comparative Analysis, Testing
Metsämuuronen, Jari – International Journal of Educational Methodology, 2020
The Pearson product-moment correlation coefficient between item g and test score X, known as the item-test or item-total correlation ("Rit"), and the item-rest correlation ("Rir") are two of the most widely used classical estimators of item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…
Descriptors: Correlation, Test Items, Scores, Difficulty Level
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggested the use of Bayes factors (e.g., Kass & Raftery, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist…
Descriptors: Cheating, Deception, Statistical Analysis, Bayesian Statistics
MacGregor, Philip C.; O'Reilly, Frances L.; Matt, John – Journal of Education and Training Studies, 2017
This study examined the following question: What is the relationship, if any, between COMPASS placement scores and student success in the first online course during the student's first semester? Discriminant function analysis was used to examine the relationship. This study used existing data from new students, who took the COMPASS placement…
Descriptors: Scores, Student Placement, Testing, Two Year Colleges
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Al-Hilawani, Yasser A. – Educational Studies, 2018
The purpose of this study was to examine the relationship between metacognition as measured in real-life situations and IQ scores as reflected by performance on the Raven Standard Progressive Matrices Scale. The study also reports on whether there were significant differences in performance on the metacognitive…
Descriptors: Intelligence Quotient, Metacognition, Correlation, Tests
Chang, Todd P.; Schrager, Sheree M.; Rake, Alyssa J.; Chan, Michael W.; Pham, Phung K.; Christman, Grant – Advances in Health Sciences Education, 2017
Multimedia in assessing clinical decision-making skills (CDMS) has been poorly studied, particularly in comparison to traditional text-based assessments. The literature suggests multimedia is more difficult for trainees. We hypothesize that pediatric residents score lower in diagnostic skill when clinical vignettes use multimedia rather than text…
Descriptors: Medical Students, Pediatrics, Multimedia Materials, Clinical Diagnosis
Davis, Laurie Laughlin; Kong, Xiaojing; McBride, Yuanyuan; Morrison, Kristin M. – Applied Measurement in Education, 2017
The definition of what it means to take a test online continues to evolve with the inclusion of a broader range of item types and a wide array of devices used by students to access test content. To assure the validity and reliability of test scores for all students, device comparability research should be conducted to evaluate the impact of…
Descriptors: Educational Technology, Technology Uses in Education, High School Students, Tests
Ayodele, Alicia Nicole – ProQuest LLC, 2017
Within polytomous items, differential item functioning (DIF) can take on various forms due to the number of response categories. The lack of invariance at this level is referred to as differential step functioning (DSF). The most common DSF methods in the literature are the adjacent category log odds ratio (AC-LOR) estimator and cumulative…
Descriptors: Statistical Analysis, Test Bias, Test Items, Scores
Öz, Hüseyin; Özturan, Tuba – Journal of Language and Linguistic Studies, 2018
This article reports the findings of a study that sought to investigate whether computer-based vs. paper-based test-delivery mode has an impact on the reliability and validity of an achievement test for a pedagogical content knowledge course in an English teacher education program. A total of 97 university students enrolled in the English as a…
Descriptors: Computer Assisted Testing, Testing, Test Format, Teaching Methods
Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M. – Journal of Psychoeducational Assessment, 2018
We investigated the classification accuracy of learning disability (LD) identification methods premised on the identification of an intraindividual pattern of processing strengths and weaknesses (PSW) method using multiple indicators for all latent constructs. Known LD status was derived from latent scores; values at the observed level identified…
Descriptors: Accuracy, Learning Disabilities, Classification, Identification
Hsiao, Yu-Yu; Kwok, Oi-Man; Lai, Mark H. C. – Educational and Psychological Measurement, 2018
Path models with observed composites based on multiple items (e.g., mean or sum score of the items) are commonly used to test interaction effects. Under this practice, researchers generally assume that the observed composites are measured without errors. In this study, we reviewed and evaluated two alternative methods within the structural…
Descriptors: Error of Measurement, Testing, Scores, Models
Luce, Christine; Kirnan, Jean P. – Journal of the Scholarship of Teaching and Learning, 2016
Contradictory results have been reported regarding the accuracy of various methods used to assess student learning in higher education. The current study examined student learning outcomes across a multi-section and multi-instructor psychology research course with both indirect and direct assessments in a sample of 67 undergraduate students. The…
Descriptors: Undergraduate Students, Psychology, Methods Courses, Student Evaluation