Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 4
Since 2006 (last 20 years): 8
Descriptor
Statistical Analysis: 12
Test Format: 12
Scoring: 10
Test Items: 7
Comparative Analysis: 4
Test Interpretation: 4
Computer Assisted Testing: 3
Foreign Countries: 3
Adaptive Testing: 2
College Entrance Examinations: 2
College Students: 2
Author
Alcaraz-Mármol, Gema: 1
Ali, Usama S.: 1
Angoff, William H.: 1
Bailey, Kathleen M., Ed.: 1
Benjamin, Roger: 1
Boyer, Michelle: 1
Chang, Hua-Hua: 1
Fisher, Robert: 1
Floyd, Harlee S.: 1
Hou, Xiaodong: 1
Kieftenbeld, Vincent: 1
Publication Type
Reports - Research: 8
Journal Articles: 6
Collected Works - Proceedings: 1
Dissertations/Theses -…: 1
Reports - Descriptive: 1
Reports - Evaluative: 1
Speeches/Meeting Papers: 1
Education Level
Higher Education: 2
Early Childhood Education: 1
Elementary Education: 1
Elementary Secondary Education: 1
High Schools: 1
Kindergarten: 1
Postsecondary Education: 1
Primary Education: 1
Secondary Education: 1
Location
Estonia: 1
Italy: 1
Maryland: 1
Spain: 1
United States: 1
Assessments and Surveys
SAT (College Admission Test): 2
Graduate Record Examinations: 1
Tingir, Seyfullah – ProQuest LLC, 2019
Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use a Bayesian network as a scoring model. However, adjusting the conditional probability tables (CPT parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…
Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability
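The Tingir (2019) entry describes scoring with a Bayesian network whose conditional probability tables link a latent skill to observed responses. As a rough illustration of that general idea only (not the dissertation's actual model), the minimal Python sketch below scores a single binary latent skill from three dichotomous items; the prior, item names, and CPT values are all hypothetical.

    # Minimal sketch: one latent skill (master / non_master) with conditionally
    # independent observed items. All probabilities below are illustrative.
    prior = {"master": 0.5, "non_master": 0.5}

    # CPT: P(item answered correctly | skill state).
    cpt = {
        "item1": {"master": 0.85, "non_master": 0.30},
        "item2": {"master": 0.75, "non_master": 0.20},
        "item3": {"master": 0.90, "non_master": 0.40},
    }

    def posterior(responses):
        """Posterior over the latent skill given 0/1 item responses."""
        scores = {}
        for state, p_state in prior.items():
            likelihood = p_state
            for item, correct in responses.items():
                p_correct = cpt[item][state]
                likelihood *= p_correct if correct else (1.0 - p_correct)
            scores[state] = likelihood
        total = sum(scores.values())
        return {state: value / total for state, value in scores.items()}

    print(posterior({"item1": 1, "item2": 1, "item3": 0}))

Calibrating the CPT values against observed response data, the challenge the abstract refers to, would sit on top of a model like this, for example via an EM or MCMC routine.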
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
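The Kieftenbeld and Boyer (2017) snippet concerns comparing several automated raters to human raters across items. One common ingredient in such comparisons is an agreement index such as quadratic weighted kappa computed per rater and item; the sketch below, with entirely hypothetical scores and rater names, shows that computation in plain Python and is not the evaluation procedure the article itself proposes.

    from itertools import product

    def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
        """Agreement between two raters on an ordinal 0..n_categories-1 scale."""
        n = len(rater_a)
        observed = [[0.0] * n_categories for _ in range(n_categories)]
        for a, b in zip(rater_a, rater_b):
            observed[a][b] += 1.0 / n
        hist_a = [sum(row) for row in observed]
        hist_b = [sum(observed[i][j] for i in range(n_categories))
                  for j in range(n_categories)]
        num = den = 0.0
        for i, j in product(range(n_categories), repeat=2):
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_a[i] * hist_b[j]
        return 1.0 - num / den

    # Hypothetical scores from one human rater and two scoring engines on one item.
    human = [0, 1, 2, 3, 2, 1, 0, 3, 2, 1]
    engine_a = [0, 1, 2, 3, 2, 2, 0, 3, 1, 1]
    engine_b = [1, 1, 3, 3, 2, 1, 0, 2, 2, 0]
    for name, scores in [("engine_a", engine_a), ("engine_b", engine_b)]:
        print(name, round(quadratic_weighted_kappa(human, scores, 4), 3))

Repeating this over many items and aggregating the per-item values is one way rankings of raters arise, which is where the ranking-procedure issues mentioned in the abstract come in.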
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
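The Morgan, Moore, and Floyd (2018) abstract refers to using simulation studies to inform decisions based on an instrument. Purely as an illustration of the general Monte Carlo logic (not the authors' design), the sketch below simulates examinees with known mastery status, generates item responses, and estimates how often a simple cut score classifies them correctly; every number in it is made up.

    import random

    random.seed(1)

    def classification_accuracy(n_examinees=5000, n_items=20, cut_score=12):
        """Share of simulated examinees whose true mastery status is recovered
        by a total-score cut. Response probabilities are purely illustrative:
        masters answer each item correctly with p = 0.85, non-masters with p = 0.35."""
        correct_decisions = 0
        for _ in range(n_examinees):
            is_master = random.random() < 0.5
            p = 0.85 if is_master else 0.35
            total = sum(random.random() < p for _ in range(n_items))
            correct_decisions += ((total >= cut_score) == is_master)
        return correct_decisions / n_examinees

    print(round(classification_accuracy(), 3))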
Säre, Egle; Luik, Piret; Fisher, Robert – European Early Childhood Education Research Journal, 2016
The purpose of this study was to design an instrument for five- to six-year-old children to help measure their verbal reasoning skills and assess the validity and reliability of the resulting instrument. For this purpose, the researchers have created the Younger Children Verbal Reasoning Test (YCVR-test) and a control instrument, which have been…
Descriptors: Educational Researchers, Verbal Ability, Thinking Skills, Verbal Tests
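The Säre, Luik, and Fisher (2016) snippet mentions assessing the reliability of the new instrument. A standard internal-consistency index for such a test is Cronbach's alpha; the sketch below computes it from a small, entirely hypothetical matrix of 0/1 item scores and is only meant to show the formula, not the authors' analysis.

    def cronbach_alpha(scores):
        """Cronbach's alpha; `scores` holds one row of item scores per examinee."""
        n_items = len(scores[0])

        def variance(values):
            mean = sum(values) / len(values)
            return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

        item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
        total_var = variance([sum(row) for row in scores])
        return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)

    # Hypothetical responses from six children on four reasoning items.
    responses = [
        [1, 1, 1, 0],
        [1, 0, 1, 1],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 0, 1],
    ]
    print(round(cronbach_alpha(responses), 3))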
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may offer similar advantages, and verifying this hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
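The Ali and Chang (2014) snippet is cut off before the suitability index is defined, so no attempt is made to reproduce it here. For context, the sketch below shows the standard ingredient that adaptive item selection is usually built on, Fisher information under a 2PL model, with a hypothetical four-item bank; it illustrates adaptive testing generally rather than the report's SI-based pretesting method.

    import math

    def p_correct(theta, a, b):
        """2PL probability of a correct response at ability theta."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def item_information(theta, a, b):
        """Fisher information of a 2PL item at ability theta."""
        p = p_correct(theta, a, b)
        return a * a * p * (1.0 - p)

    # Hypothetical item bank: item id -> (discrimination a, difficulty b).
    bank = {"i1": (1.2, -1.0), "i2": (0.8, 0.0), "i3": (1.5, 0.5), "i4": (1.0, 1.5)}

    def select_next_item(theta_hat, administered):
        """Pick the unused item with maximum information at the current estimate."""
        unused = {k: v for k, v in bank.items() if k not in administered}
        return max(unused, key=lambda k: item_information(theta_hat, *unused[k]))

    print(select_next_item(theta_hat=0.3, administered={"i1"}))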
Alcaraz-Mármol, Gema – International Journal of English Studies, 2015
Despite the importance given to L2 vocabulary acquisition over the last two decades, considerable deficiencies are found in L2 students' vocabulary size. One of the aspects that may influence vocabulary learning is word frequency. However, scholars warn that frequency may lead to wrong conclusions if the way words are distributed is ignored.…
Descriptors: Second Language Learning, Age Differences, Vocabulary Development, Achievement Gains
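The Alcaraz-Mármol (2015) abstract warns that raw frequency misleads when the way words are distributed across a corpus is ignored. One widely used dispersion measure is Juilland's D; the sketch below computes it for two hypothetical words with the same total frequency but very different distributions, assuming equally sized corpus sections (an illustration of the dispersion idea, not the study's own analysis).

    import math

    def juilland_d(counts_per_section):
        """Juilland's dispersion D for a word's counts across equally sized sections.
        Values near 1 mean the word is spread evenly; near 0, concentrated."""
        n = len(counts_per_section)
        mean = sum(counts_per_section) / n
        if mean == 0:
            return 0.0
        sd = math.sqrt(sum((c - mean) ** 2 for c in counts_per_section) / n)
        return 1.0 - (sd / mean) / math.sqrt(n - 1)

    # Same total frequency (40 tokens), very different dispersion.
    print(round(juilland_d([10, 10, 10, 10]), 3))  # evenly spread -> 1.0
    print(round(juilland_d([40, 0, 0, 0]), 3))     # bursty -> 0.0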
Wolf, Raffaela; Zahner, Doris; Kostoris, Fiorella; Benjamin, Roger – Council for Aid to Education, 2014
The measurement of higher-order competencies within a tertiary education system across countries presents methodological challenges due to differences in educational systems, socio-economic factors, and perceptions as to which constructs should be assessed (Blömeke, Zlatkin-Troitschanskaia, Kuhn, & Fege, 2013). According to Hart Research…
Descriptors: Case Studies, International Assessment, Performance Based Assessment, Critical Thinking
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Kingston, Neal M.; McKinley, Robert L. – 1988
Confirmatory multidimensional item response theory (CMIRT) was used to assess the structure of the Graduate Record Examinations General Test, for which much information about factorial structure already exists, using a sample of 1,001 psychology majors who took the test in 1984 or 1985. Results supported previous findings that, for this population, there…
Descriptors: College Students, Factor Analysis, Higher Education, Item Analysis
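The Kingston and McKinley (1988) snippet refers to confirmatory multidimensional item response theory. For readers unfamiliar with the model family, the compensatory multidimensional 2PL response function is sketched below with hypothetical two-dimensional parameters; the study's confirmatory specification and estimation are not reproduced here.

    import math

    def mirt_p_correct(theta, a, d):
        """Compensatory multidimensional 2PL: probability of a correct response
        for ability vector theta, discrimination vector a, and intercept d."""
        logit = sum(ai * ti for ai, ti in zip(a, theta)) + d
        return 1.0 / (1.0 + math.exp(-logit))

    # Hypothetical item loading mostly on the first of two dimensions.
    print(round(mirt_p_correct(theta=[0.5, -0.2], a=[1.1, 0.4], d=0.3), 3))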
Angoff, William H. – 1991
An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…
Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students
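The Angoff (1991) snippet concerns the standard error of equating at the mean. In the simplest case, mean equating with independent random-groups samples, a common large-sample approximation is sqrt(sd_x^2/n_x + sd_y^2/n_y); the sketch below evaluates it for hypothetical summary statistics, and the design it assumes is not necessarily the one the paper examines.

    import math

    def se_mean_equating(sd_x, n_x, sd_y, n_y):
        """Large-sample standard error of the mean-equating function
        e(x) = x - mean_x + mean_y under a random-groups design."""
        return math.sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)

    # Hypothetical forms X and Y, roughly 2,000 examinees per form.
    print(round(se_mean_equating(sd_x=9.8, n_x=2000, sd_y=10.1, n_y=2000), 3))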
Lawrence, Ida M.; Schmidt, Amy Elizabeth – College Entrance Examination Board, 2001
The SAT® I: Reasoning Test is administered seven times a year. Primarily for security purposes, several different test forms are given at each administration. How is it possible to compare scores obtained from different test forms and from different test administrations? The purpose of this paper is to provide an overview of the statistical…
Descriptors: Scores, Comparative Analysis, Standardized Tests, College Entrance Examinations
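The Lawrence and Schmidt (2001) overview concerns placing scores from different SAT forms on a common scale. One classical approach is linear equating, which matches the mean and standard deviation of the new form to those of the reference form; the sketch below uses made-up summary statistics, and the SAT program's actual procedures are more elaborate than this.

    def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
        """Linear equating: map a form X raw score onto the form Y scale by
        matching the two forms' means and standard deviations."""
        return sd_y / sd_x * (x - mean_x) + mean_y

    # Hypothetical raw-score summary statistics for two forms of the same test.
    mean_x, sd_x = 48.2, 9.8
    mean_y, sd_y = 50.1, 10.3
    print(round(linear_equate(52, mean_x, sd_x, mean_y, sd_y), 2))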
Bailey, Kathleen M., Ed.; And Others – 1987
This collection of 10 selected conference papers reports the results of language testing research. Titles and authors are: "Computerized Adaptive Language Testing: A Spanish Placement Exam" (Jerry W. Larson); "Utilizing Rasch Analysis to Detect Cheating on Language Examinations" (Harold S. Madsen); "Scalar Analysis of…
Descriptors: Adaptive Testing, Audiolingual Skills, Cheating, Computer Assisted Testing