Publication Date
In 2025 | 7 |
Since 2024 | 44 |
Since 2021 (last 5 years) | 209 |
Since 2016 (last 10 years) | 857 |
Since 2006 (last 20 years) | 2335 |
Descriptor
Comparative Analysis | 2335 |
Hypothesis Testing | 952 |
Foreign Countries | 815 |
Computer Assisted Testing | 630 |
Statistical Analysis | 574 |
Scores | 498 |
Correlation | 336 |
Teaching Methods | 330 |
Testing | 318 |
Academic Achievement | 289 |
Questionnaires | 269 |
Author
Dodd, Barbara G. | 9 |
Paas, Fred | 8 |
Sinharay, Sandip | 8 |
Kim, Sooyeon | 7 |
Attali, Yigal | 6 |
Chang, Hua-Hua | 5 |
Cirino, Paul T. | 4 |
Coniam, David | 4 |
DeBoer, George E. | 4 |
Peterson, Paul E. | 4 |
Puhan, Gautam | 4 |
Location
United States | 71 |
Australia | 60 |
Germany | 60 |
United Kingdom | 46 |
China | 45 |
Turkey | 45 |
Canada | 42 |
Nigeria | 40 |
Texas | 36 |
Netherlands | 35 |
California | 32 |
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 4 |
Meets WWC Standards with or without Reservations | 4 |
Does not meet standards | 1 |
Falk, Carl F.; Feuerstahler, Leah M. – Educational and Psychological Measurement, 2022
Large-scale assessments often use a computer adaptive test (CAT) for selection of items and for scoring respondents. Such tests often assume a parametric form for the relationship between item responses and the underlying construct. Although semi- and nonparametric response functions could be used, there is scant research on their performance in a…
Descriptors: Item Response Theory, Adaptive Testing, Computer Assisted Testing, Nonparametric Statistics
Suthathip Thirakunkovit – Language Testing in Asia, 2025
Establishing a cut score is a crucial aspect of the test development process since the selected cut score has the potential to impact students' performance outcomes and shape instructional strategies within the classroom. Therefore, it is vital for those involved in test development to set a cut score that is both fair and justifiable. This cut…
Descriptors: Cutting Scores, Culture Fair Tests, Language Tests, Test Construction
Melchor Sánchez-Mendiola; Abigail P. Manzano-Patiño; Manuel García-Minjares; Enrique Buzo Casanova; Careli J. Herrera Penilla; Katyna Goytia-Rodríguez; Adrián Martínez-González – Educational Assessment, Evaluation and Accountability, 2023
COVID-19 has disrupted higher education globally, and there is scarce information about the "learning loss" in university students throughout this crisis. The goal of the study was to compare scores in a large-scale knowledge diagnostic exam applied to students admitted to the university, before and during the pandemic. Research design…
Descriptors: College Freshmen, Diagnostic Tests, Scores, Achievement Gains
Kim, Ahyoung Alicia; Yumsek, Meltem; Kemp, Jason A.; Chapman, Mark; Cook, H. Gary – Language Testing, 2023
English learners (ELs) comprise approximately 10% of kindergarten to Grade 12 students in US public schools, with about 15% of ELs identified as having disabilities. English language proficiency (ELP) assessments must adhere to universal design principles and incorporate universal tools, designed to increase accessibility for all ELs, including…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Students with Disabilities
Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023
Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…
Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level
Ritschard, Gilbert – Sociological Methods & Research, 2023
This study reviews and compares indicators that can serve to characterize numerically the nature of state sequences. It also introduces several new indicators. Alongside basic measures such as the length, the number of visited distinct states, and the number of state changes, we shall consider composite measures such as turbulence and the…
Descriptors: Comparative Analysis, Measurement Techniques, Hypothesis Testing, Foreign Countries
Olsho, Alexis; Smith, Trevor I.; Eaton, Philip; Zimmerman, Charlotte; Boudreaux, Andrew; White Brahmia, Suzanne – Physical Review Physics Education Research, 2023
We developed the Physics Inventory of Quantitative Literacy (PIQL) to assess students' quantitative reasoning in introductory physics contexts. The PIQL includes several "multiple-choice-multiple-response" (MCMR) items (i.e., multiple-choice questions for which more than one response may be selected) as well as traditional single-response…
Descriptors: Multiple Choice Tests, Science Tests, Physics, Measures (Individuals)
Kim, Sooyeon; Walker, Michael – ETS Research Report Series, 2021
In this investigation, we used real data to assess potential differential effects associated with taking a test in a test center (TC) versus testing at home using remote proctoring (RP). We used a pseudo-equivalent groups (PEG) approach to examine group equivalence at the item level and the total score level. If our assumption holds that the PEG…
Descriptors: Testing, Distance Education, Comparative Analysis, Test Items
Ioana-Elena Oana; Carsten Q. Schneider – Sociological Methods & Research, 2024
The robustness of qualitative comparative analysis (QCA) results features high on the agenda of methodologists and practitioners. This article aims at advancing this debate on several fronts. First, in line with the extant literature, we take a comprehensive view on robustness arguing that decisions on calibration, consistency, and frequency…
Descriptors: Robustness (Statistics), Qualitative Research, Comparative Analysis, Decision Making
Panachanok Chanwaiwit; Lalida Wiboonwachara – rEFLections, 2025
Chiang Mai Rajabhat University Test of English Proficiency (CMRU-TEP) is a required English proficiency test for all CMRU students before graduation. Despite its meticulous design, there is an opportunity for students to improve their scores through focused efforts and targeted support. This study employs an explanatory sequential mixed-methods…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Language Proficiency
Marinho, Nathalie L.; Witmer, Sara E.; Jess, Nicole; Roschmann, Sarina – Language Assessment Quarterly, 2023
The use of accommodations is often recommended to remove barriers to academic testing among English Learners (ELs). However, it is unclear whether accommodations are particularly effective at improving ELs' test scores. A growing foundation of empirical work has explored this topic. We conducted a meta-analysis that examined several possible…
Descriptors: English Language Learners, Testing Accommodations, Barriers, Scores
Soland, James; Kuhfeld, Megan; Rios, Joseph – Large-scale Assessments in Education, 2021
Low examinee effort is a major threat to valid uses of many test scores. Fortunately, several methods have been developed to detect noneffortful item responses, most of which use response times. To accurately identify noneffortful responses, one must set response time thresholds separating those responses from effortful ones. While other studies…
Descriptors: Reaction Time, Measurement, Response Style (Tests), Reading Tests
Shunji Wang; Katerina M. Marcoulides; Jiashan Tang; Ke-Hai Yuan – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A necessary step in applying bi-factor models is to evaluate the need for domain factors with a general factor in place. The conventional null hypothesis testing (NHT) was commonly used for such a purpose. However, the conventional NHT meets challenges when the domain loadings are weak or the sample size is insufficient. This article proposes…
Descriptors: Hypothesis Testing, Error of Measurement, Comparative Analysis, Monte Carlo Methods
Philippe Goldammer; Peter Lucas Stöckli; Yannik Andrea Escher; Hubert Annen; Klaus Jonas – Educational and Psychological Measurement, 2024
Indirect indices for faking detection in questionnaires make use of a respondent's deviant or unlikely response pattern over the course of the questionnaire to identify them as a faker. Compared with established direct faking indices (i.e., lying and social desirability scales), indirect indices have at least two advantages: First, they cannot be…
Descriptors: Identification, Deception, Psychological Testing, Validity
Tan, Teck Kiang – Practical Assessment, Research & Evaluation, 2023
Researchers often have hypotheses concerning the state of affairs in the population from which they sampled their data to compare group means. The classical frequentist approach provides one way of carrying out hypothesis testing using ANOVA to state the null hypothesis that there is no difference in the means and proceed with multiple comparisons…
Descriptors: Comparative Analysis, Hypothesis Testing, Statistical Analysis, Guidelines