ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	15

Source

Educational Testing Service

Publication Type

Reports - Research	8
Reports - Evaluative	4
Numerical/Quantitative Data	2
Information Analyses	1
Opinion Papers	1
Reports - Descriptive	1

Education Level

Higher Education	4
Elementary Secondary Education	3
Postsecondary Education	3
Elementary Education	2
High Schools	2
Secondary Education	2
Grade 4	1
Grade 5	1
Grade 8	1
Intermediate Grades	1

Location

Chile	1
China	1
Colombia	1
Egypt	1
Georgia	1
Germany	1
Japan	1
Kentucky	1
Ohio	1
South Carolina	1
South Korea	1
Texas	1
West Virginia	1

Assessments and Surveys

SAT (College Admission Test)	2
ACT Assessment	1
Marlowe Crowne Social…	1
National Assessment of…	1
Test of English as a Foreign…	1
Test of English for…	1

Showing all 15 results

Does Linking Mixed-Format Tests Using a Multiple-Choice Anchor Produce Comparable Results for Male and Female Subgroups? Research Report. ETS RR-11-44

Kim, Sooyeon; Walker, Michael E. – Educational Testing Service, 2011

This study examines the use of subpopulation invariance indices to evaluate the appropriateness of using a multiple-choice (MC) item anchor in mixed-format tests, which include both MC and constructed-response (CR) items. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using an MC-only anchor set for 4…

Descriptors: Test Format, Multiple Choice Tests, Test Items, Gender Differences

Use of e-rater[R] in Scoring of the TOEFL iBT[R] Writing Test. Research Report. ETS RR-11-25

Haberman, Shelby J. – Educational Testing Service, 2011

Alternative approaches are discussed for use of e-rater[R] to score the TOEFL iBT[R] Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…

Descriptors: Writing Tests, Scoring, Essays, Language Tests

How Does the Knowledge of Subgroup Membership of Examinees Affect the Prediction of True Subscores? Research Report. ETS RR-11-43

Haberman, Shelby J.; Sinharay, Sandip – Educational Testing Service, 2011

Subscores are reported for several operational assessments. Haberman (2008) suggested a method based on classical test theory to determine if the true subscore is predicted better by the corresponding subscore or the total score. Researchers are often interested in learning how different subgroups perform on subtests. Stricker (1993) and…

Descriptors: True Scores, Test Theory, Prediction, Group Membership

Measurement of New Attributes for Chile's Admissions System to Higher Education. Research Report. ETS RR-11-18

Santelices, Maria Veronica; Ugarte, Juan Jose; Flotts, Paulina; Radovic, Darinka; Kyllonen, Patrick – Educational Testing Service, 2011

This paper presents the development and initial validation of new measures of critical thinking and noncognitive attributes that were designed to supplement existing standardized tests used in the admissions system for higher education in Chile. The importance of various facets of this process, including the establishment of technical rigor and…

Descriptors: Foreign Countries, College Entrance Examinations, Test Construction, Test Validity

Unfair Treatment vs. Confirmation Bias? Comments on Santelices and Wilson. Research Report. ETS RR-10-20

Dorans, Neil J. – Educational Testing Service, 2010

Santelices and Wilson (2010) claimed to have addressed technical criticisms of Freedle (2003) presented in Dorans (2004a) and elsewhere. Santelices and Wilson's abstract claimed that their study confirmed that SAT[R] verbal items do function differently for African American and White subgroups. In this commentary, I demonstrate that the…

Descriptors: College Entrance Examinations, Verbal Tests, Test Bias, Test Items

The Evaluation of Bias of the Weighted Random Effects Model Estimators. Research Report. ETS RR-11-13

Jia, Yue; Stokes, Lynne; Harris, Ian; Wang, Yan – Educational Testing Service, 2011

Estimation of parameters of random effects models from samples collected via complex multistage designs is considered. One way to reduce estimation bias due to unequal probabilities of selection is to incorporate sampling weights. Many researchers have proposed various weighting methods (Korn & Graubard, 2003; Pfeffermann, Skinner,…

Descriptors: Computation, Statistical Bias, Sampling, Statistical Analysis

Evaluating Empirical Relationships among Prediction, Measurement, and Scaling Invariance. Research Report. ETS RR-11-06

Moses, Tim – Educational Testing Service, 2011

The purpose of this study was to consider the relationships of prediction, measurement, and scaling invariance when these invariances were simultaneously evaluated in psychometric test data. An approach was developed to evaluate prediction, measurement, and scaling invariance based on linear and nonlinear prediction, measurement, and scaling…

Descriptors: Prediction, Measurement, Scaling, Tests

The TOEIC[R] Speaking and Writing Tests: Relations to Test-Taker Perceptions of Proficiency in English. Research Report. ETS RR-09-18

Powers, Donald E.; Kim, Hae-Jin; Yu, Feng; Weng, Vincent Z.; VanWinkle, Waverely – Educational Testing Service, 2009

To facilitate the interpretation of test scores from the new TOEIC[R] (Test of English for International Communications[TM]) speaking and writing tests as measures of English-language proficiency, we administered a self-assessment inventory to TOEIC examinees in Japan and Korea, to gather their perceptions of their ability to perform a variety of…

Descriptors: English for Special Purposes, Language Tests, Writing Tests, Speech Tests

Examining the Factor Structure of a State Standards-Based Science Assessment for Students with Learning Disabilities. Research Report. ETS RR-11-38

Steinberg, Jonathan; Cline, Frederick; Sawaki, Yasuyo – Educational Testing Service, 2011

This study examined the scores on a state standards-based Grade 5 Science assessment obtained by a group of students without learning disabilities who took the standard form of the test and by three groups of students with learning disabilities: one taking the standard form of the test without accommodations or modifications, a second taking the…

Descriptors: Learning Disabilities, State Standards, Educational Improvement, Science Tests

A Concurrent Validity Study of the 2008 "HSTW" Assessment Scores

Young, John W.; Cline, Fred – Educational Testing Service, 2009

"High Schools That Work" (HSTW) is a school improvement initiative that was inaugurated by the Southern Regional Education Board (SREB) in 1987. The main purpose of this concurrent validity study is to evaluate one or more measures by investigating their relationship to other commonly used and established measures given at or about the…

Descriptors: Validity, Educational Improvement, Improvement Programs, High Schools

A General Procedure to Assess the Internal Structure of a Noncognitive Measure--The Student360 Insight Program (S360) Time Management Scale. Research Report. ETS RR-11-42

Ling, Guangming; Rijmen, Frank – Educational Testing Service, 2011

The factorial structure of the Time Management (TM) scale of the Student 360: Insight Program (S360) was evaluated based on a national sample. A general procedure with a variety of methods was introduced and implemented, including the computation of descriptive statistics, exploratory factor analysis (EFA), and confirmatory factor analysis (CFA).…

Descriptors: Time Management, Measures (Individuals), Statistical Analysis, Factor Analysis

When Can Subscores Be Expected to Have Added Value? Results from Operational and Simulated Data. Research Report. ETS RR-10-16

Sinharay, Sandip – Educational Testing Service, 2010

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman (2008) suggested a method based on classical test theory to determine whether subscores have added value over total scores. This paper provides a literature review and reports when subscores were found to have added value for…

Descriptors: Scores, Correlation, Reliability, Item Response Theory

Three Multidimensional Models for Testlet-Based Tests: Formal Relations and an Empirical Comparison. Research Report. ETS RR-09-37

Rijmen, Frank – Educational Testing Service, 2009

Three multidimensional item response theory (IRT) models for testlet-based tests are described. In the bifactor model (Gibbons & Hedeker, 1992), each item measures a general dimension in addition to a testlet-specific dimension. The testlet model (Bradlow, Wainer, & Wang, 1999) is a bifactor model in which the loadings on the specific dimensions…

Descriptors: Item Response Theory, Models, Graphs, Comparative Analysis

Writing Assessment and Cognition. Research Report. ETS RR-11-14

Deane, Paul – Educational Testing Service, 2011

This paper presents a socio-cognitive framework for connecting writing pedagogy and writing assessment with modern social and cognitive theories of writing. It focuses on providing a general framework that highlights the connections between writing competency and other literacy skills; identifies key connections between literacy instruction,…

Descriptors: Writing (Composition), Writing Evaluation, Writing Tests, Cognitive Ability

Test Takers' Attitudes about the TOEFL iBT[TM]. TOEFL iBT Research Report. RR-10-2

Stricker, Lawrence J.; Attali, Yigal – Educational Testing Service, 2010

The principal aims of this study, a conceptual replication of an earlier investigation of the TOEFL[R] computer-based test, or TOEFL CBT, in Buenos Aires, Cairo, and Frankfurt, were to assess test takers' reported acceptance of the TOEFL Internet-based test, or TOEFL iBT[TM], and its associations with possible determinants of this acceptance and…

Descriptors: Computer Attitudes, Questionnaires, Comparative Analysis, Foreign Countries

Descriptor

Correlation	15
Language Tests	5
Scores	5
Factor Analysis	4
Test Validity	4
College Entrance Examinations	3
Foreign Countries	3
Science Tests	3
Scoring	3
Test Items	3
Test Theory	3
Writing Tests	3
Accuracy	2
Achievement Tests	2
Comparative Analysis	2
Computer Assisted Testing	2
Critical Thinking	2
Educational Improvement	2
English (Second Language)	2
Evaluation Methods	2
Factor Structure	2
Gender Differences	2
Item Response Theory	2
Licensing Examinations…	2
Mathematics Tests	2

Author

Haberman, Shelby J.	2
Rijmen, Frank	2
Sinharay, Sandip	2
Attali, Yigal	1
Cline, Fred	1
Cline, Frederick	1
Deane, Paul	1
Dorans, Neil J.	1
Flotts, Paulina	1
Harris, Ian	1
Jia, Yue	1
Kim, Hae-Jin	1
Kim, Sooyeon	1
Kyllonen, Patrick	1
Ling, Guangming	1
Moses, Tim	1
Powers, Donald E.	1
Radovic, Darinka	1
Santelices, Maria Veronica	1
Sawaki, Yasuyo	1
Steinberg, Jonathan	1
Stokes, Lynne	1
Stricker, Lawrence J.	1
Ugarte, Juan Jose	1
VanWinkle, Waverely	1