Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 0
Since 2016 (last 10 years): 4
Since 2006 (last 20 years): 8
Descriptor
Statistical Analysis: 12
Test Format: 12
Scoring: 10
Test Items: 7
Comparative Analysis: 4
Test Interpretation: 4
Computer Assisted Testing: 3
Foreign Countries: 3
Adaptive Testing: 2
College Entrance Examinations: 2
College Students: 2
Author
Alcaraz-Mármol, Gema: 1
Ali, Usama S.: 1
Angoff, William H.: 1
Bailey, Kathleen M., Ed.: 1
Benjamin, Roger: 1
Boyer, Michelle: 1
Chang, Hua-Hua: 1
Fisher, Robert: 1
Floyd, Harlee S.: 1
Hou, Xiaodong: 1
Kieftenbeld, Vincent: 1
Publication Type
Reports - Research: 8
Journal Articles: 6
Collected Works - Proceedings: 1
Dissertations/Theses -…: 1
Reports - Descriptive: 1
Reports - Evaluative: 1
Speeches/Meeting Papers: 1
Education Level
Higher Education: 2
Early Childhood Education: 1
Elementary Education: 1
Elementary Secondary Education: 1
High Schools: 1
Kindergarten: 1
Postsecondary Education: 1
Primary Education: 1
Secondary Education: 1
Location
Estonia: 1
Italy: 1
Maryland: 1
Spain: 1
United States: 1
Assessments and Surveys
SAT (College Admission Test): 2
Graduate Record Examinations: 1
Tingir, Seyfullah – ProQuest LLC, 2019
Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use a Bayesian network as a scoring model. However, adjusting the conditional probability tables (CPT parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…
Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability
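The Tingir (2019) entry describes scoring with a Bayesian network whose conditional probability tables link a latent skill to observed responses. As a rough illustration of that general idea only (not the dissertation's actual model), the minimal Python sketch below scores a single binary latent skill from three dichotomous items; the prior, item names, and CPT values are all hypothetical.

    # Minimal sketch: one latent skill (master / non_master) with conditionally
    # independent observed items. All probabilities below are illustrative.
    prior = {"master": 0.5, "non_master": 0.5}

    # CPT: P(item answered correctly | skill state).
    cpt = {
        "item1": {"master": 0.85, "non_master": 0.30},
        "item2": {"master": 0.75, "non_master": 0.20},
        "item3": {"master": 0.90, "non_master": 0.40},
    }

    def posterior(responses):
        """Posterior over the latent skill given 0/1 item responses."""
        scores = {}
        for state, p_state in prior.items():
            likelihood = p_state
            for item, correct in responses.items():
                p_correct = cpt[item][state]
                likelihood *= p_correct if correct else (1.0 - p_correct)
            scores[state] = likelihood
        total = sum(scores.values())
        return {state: value / total for state, value in scores.items()}

    print(posterior({"item1": 1, "item2": 1, "item3": 0}))

Calibrating the CPT values against observed response data, the challenge the abstract refers to, would sit on top of a model like this, for example via an EM or MCMC routine.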
Kieftenbeld, Vincent; Boyer, Michelle – Applied Measurement in Education, 2017
Automated scoring systems are typically evaluated by comparing the performance of a single automated rater item-by-item to human raters. This presents a challenge when the performance of multiple raters needs to be compared across multiple items. Rankings could depend on specifics of the ranking procedure; observed differences could be due to…
Descriptors: Automation, Scoring, Comparative Analysis, Test Items
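The Kieftenbeld and Boyer (2017) snippet concerns comparing several automated raters to human raters across items. One common ingredient in such comparisons is an agreement index such as quadratic weighted kappa computed per rater and item; the sketch below, with entirely hypothetical scores and rater names, shows that computation in plain Python and is not the evaluation procedure the article itself proposes.

    from itertools import product

    def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
        """Agreement between two raters on an ordinal 0..n_categories-1 scale."""
        n = len(rater_a)
        observed = [[0.0] * n_categories for _ in range(n_categories)]
        for a, b in zip(rater_a, rater_b):
            observed[a][b] += 1.0 / n
        hist_a = [sum(row) for row in observed]
        hist_b = [sum(observed[i][j] for i in range(n_categories))
                  for j in range(n_categories)]
        num = den = 0.0
        for i, j in product(range(n_categories), repeat=2):
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_a[i] * hist_b[j]
        return 1.0 - num / den

    # Hypothetical scores from one human rater and two scoring engines on one item.
    human = [0, 1, 2, 3, 2, 1, 0, 3, 2, 1]
    engine_a = [0, 1, 2, 3, 2, 2, 0, 3, 1, 1]
    engine_b = [1, 1, 3, 3, 2, 1, 0, 2, 2, 0]
    for name, scores in [("engine_a", engine_a), ("engine_b", engine_b)]:
        print(name, round(quadratic_weighted_kappa(human, scores, 4), 3))

Repeating this over many items and aggregating the per-item values is one way rankings of raters arise, which is where the ranking-procedure issues mentioned in the abstract come in.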
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
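The Morgan, Moore, and Floyd (2018) abstract refers to using simulation studies to inform decisions based on an instrument. Purely as an illustration of the general Monte Carlo logic (not the authors' design), the sketch below simulates examinees with known mastery status, generates item responses, and estimates how often a simple cut score classifies them correctly; every number in it is made up.

    import random

    random.seed(1)

    def classification_accuracy(n_examinees=5000, n_items=20, cut_score=12):
        """Share of simulated examinees whose true mastery status is recovered
        by a total-score cut. Response probabilities are purely illustrative:
        masters answer each item correctly with p = 0.85, non-masters with p = 0.35."""
        correct_decisions = 0
        for _ in range(n_examinees):
            is_master = random.random() < 0.5
            p = 0.85 if is_master else 0.35
            total = sum(random.random() < p for _ in range(n_items))
            correct_decisions += ((total >= cut_score) == is_master)
        return correct_decisions / n_examinees

    print(round(classification_accuracy(), 3))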
Säre, Egle; Luik, Piret; Fisher, Robert – European Early Childhood Education Research Journal, 2016
The purpose of this study was to design an instrument for five- to six-year-old children to help measure their verbal reasoning skills and assess the validity and reliability of the resulting instrument. For this purpose, the researchers have created the Younger Children Verbal Reasoning Test (YCVR-test) and a control instrument, which have been…
Descriptors: Educational Researchers, Verbal Ability, Thinking Skills, Verbal Tests
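The Säre, Luik, and Fisher (2016) snippet mentions assessing the reliability of the new instrument. A standard internal-consistency index for such a test is Cronbach's alpha; the sketch below computes it from a small, entirely hypothetical matrix of 0/1 item scores and is only meant to show the formula, not the authors' analysis.

    def cronbach_alpha(scores):
        """Cronbach's alpha; `scores` holds one row of item scores per examinee."""
        n_items = len(scores[0])

        def variance(values):
            mean = sum(values) / len(values)
            return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

        item_vars = [variance([row[i] for row in scores]) for i in range(n_items)]
        total_var = variance([sum(row) for row in scores])
        return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)

    # Hypothetical responses from six children on four reasoning items.
    responses = [
        [1, 1, 1, 0],
        [1, 0, 1, 1],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 0, 1],
    ]
    print(round(cronbach_alpha(responses), 3))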
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may offer similar advantages, and verifying this hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
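The Ali and Chang (2014) snippet is cut off before the suitability index is defined, so no attempt is made to reproduce it here. For context, the sketch below shows the standard ingredient that adaptive item selection is usually built on, Fisher information under a 2PL model, with a hypothetical four-item bank; it illustrates adaptive testing generally rather than the report's SI-based pretesting method.

    import math

    def p_correct(theta, a, b):
        """2PL probability of a correct response at ability theta."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def item_information(theta, a, b):
        """Fisher information of a 2PL item at ability theta."""
        p = p_correct(theta, a, b)
        return a * a * p * (1.0 - p)

    # Hypothetical item bank: item id -> (discrimination a, difficulty b).
    bank = {"i1": (1.2, -1.0), "i2": (0.8, 0.0), "i3": (1.5, 0.5), "i4": (1.0, 1.5)}

    def select_next_item(theta_hat, administered):
        """Pick the unused item with maximum information at the current estimate."""
        unused = {k: v for k, v in bank.items() if k not in administered}
        return max(unused, key=lambda k: item_information(theta_hat, *unused[k]))

    print(select_next_item(theta_hat=0.3, administered={"i1"}))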
Alcaraz-Mármol, Gema – International Journal of English Studies, 2015
Despite the importance given to L2 vocabulary acquisition over the last two decades, considerable deficiencies are found in L2 students' vocabulary size. One of the aspects that may influence vocabulary learning is word frequency. However, scholars warn that frequency may lead to wrong conclusions if the way words are distributed is ignored.…
Descriptors: Second Language Learning, Age Differences, Vocabulary Development, Achievement Gains
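The Alcaraz-Mármol (2015) abstract warns that raw frequency misleads when the way words are distributed across a corpus is ignored. One widely used dispersion measure is Juilland's D; the sketch below computes it for two hypothetical words with the same total frequency but very different distributions, assuming equally sized corpus sections (an illustration of the dispersion idea, not the study's own analysis).

    import math

    def juilland_d(counts_per_section):
        """Juilland's dispersion D for a word's counts across equally sized sections.
        Values near 1 mean the word is spread evenly; near 0, concentrated."""
        n = len(counts_per_section)
        mean = sum(counts_per_section) / n
        if mean == 0:
            return 0.0
        sd = math.sqrt(sum((c - mean) ** 2 for c in counts_per_section) / n)
        return 1.0 - (sd / mean) / math.sqrt(n - 1)

    # Same total frequency (40 tokens), very different dispersion.
    print(round(juilland_d([10, 10, 10, 10]), 3))  # evenly spread -> 1.0
    print(round(juilland_d([40, 0, 0, 0]), 3))     # bursty -> 0.0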
Wolf, Raffaela; Zahner, Doris; Kostoris, Fiorella; Benjamin, Roger – Council for Aid to Education, 2014
The measurement of higher-order competencies within a tertiary education system across countries presents methodological challenges due to differences in educational systems, socio-economic factors, and perceptions as to which constructs should be assessed (Blömeke, Zlatkin-Troitschanskaia, Kuhn, & Fege, 2013). According to Hart Research…
Descriptors: Case Studies, International Assessment, Performance Based Assessment, Critical Thinking
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Kingston, Neal M.; McKinley, Robert L. – 1988
Confirmatory multidimensional item response theory (CMIRT) was used to assess the structure of the Graduate Record Examinations General Test, for which much information about factorial structure already exists, using a sample of 1,001 psychology majors who took the test in 1984 or 1985. Results supported previous findings that, for this population, there…
Descriptors: College Students, Factor Analysis, Higher Education, Item Analysis
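The Kingston and McKinley (1988) snippet refers to confirmatory multidimensional item response theory. For readers unfamiliar with the model family, the compensatory multidimensional 2PL response function is sketched below with hypothetical two-dimensional parameters; the study's confirmatory specification and estimation are not reproduced here.

    import math

    def mirt_p_correct(theta, a, d):
        """Compensatory multidimensional 2PL: probability of a correct response
        for ability vector theta, discrimination vector a, and intercept d."""
        logit = sum(ai * ti for ai, ti in zip(a, theta)) + d
        return 1.0 / (1.0 + math.exp(-logit))

    # Hypothetical item loading mostly on the first of two dimensions.
    print(round(mirt_p_correct(theta=[0.5, -0.2], a=[1.1, 0.4], d=0.3), 3))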
Angoff, William H. – 1991
An attempt was made to evaluate the standard error of equating (at the mean of the scores) in an ongoing testing program. The interest in estimating the empirical standard error of equating is occasioned by some discomfort with the error normally reported for test scores. Data used for this evaluation came from the Admissions Testing Program of…
Descriptors: College Entrance Examinations, Equated Scores, Error of Measurement, High School Students
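The Angoff (1991) snippet concerns the standard error of equating at the mean. In the simplest case, mean equating with independent random-groups samples, a common large-sample approximation is sqrt(sd_x^2/n_x + sd_y^2/n_y); the sketch below evaluates it for hypothetical summary statistics, and the design it assumes is not necessarily the one the paper examines.

    import math

    def se_mean_equating(sd_x, n_x, sd_y, n_y):
        """Large-sample standard error of the mean-equating function
        e(x) = x - mean_x + mean_y under a random-groups design."""
        return math.sqrt(sd_x ** 2 / n_x + sd_y ** 2 / n_y)

    # Hypothetical forms X and Y, roughly 2,000 examinees per form.
    print(round(se_mean_equating(sd_x=9.8, n_x=2000, sd_y=10.1, n_y=2000), 3))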
Lawrence, Ida M.; Schmidt, Amy Elizabeth – College Entrance Examination Board, 2001
The SAT® I: Reasoning Test is administered seven times a year. Primarily for security purposes, several different test forms are given at each administration. How is it possible to compare scores obtained from different test forms and from different test administrations? The purpose of this paper is to provide an overview of the statistical…
Descriptors: Scores, Comparative Analysis, Standardized Tests, College Entrance Examinations
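The Lawrence and Schmidt (2001) overview concerns placing scores from different SAT forms on a common scale. One classical approach is linear equating, which matches the mean and standard deviation of the new form to those of the reference form; the sketch below uses made-up summary statistics, and the SAT program's actual procedures are more elaborate than this.

    def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
        """Linear equating: map a form X raw score onto the form Y scale by
        matching the two forms' means and standard deviations."""
        return sd_y / sd_x * (x - mean_x) + mean_y

    # Hypothetical raw-score summary statistics for two forms of the same test.
    mean_x, sd_x = 48.2, 9.8
    mean_y, sd_y = 50.1, 10.3
    print(round(linear_equate(52, mean_x, sd_x, mean_y, sd_y), 2))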
Bailey, Kathleen M., Ed.; And Others – 1987
This collection of 10 selected conference papers reports the results of language testing research. Titles and authors are: "Computerized Adaptive Language Testing: A Spanish Placement Exam" (Jerry W. Larson); "Utilizing Rasch Analysis to Detect Cheating on Language Examinations" (Harold S. Madsen); "Scalar Analysis of…
Descriptors: Adaptive Testing, Audiolingual Skills, Cheating, Computer Assisted Testing