Publication Date
  In 2025: 0
  Since 2024: 0
  Since 2021 (last 5 years): 5
  Since 2016 (last 10 years): 9
  Since 2006 (last 20 years): 17
Descriptor
  Computer Assisted Testing: 18
  Measurement: 18
  Scoring: 18
  Essay Tests: 6
  Essays: 6
  Test Items: 6
  Evaluation Methods: 5
  Psychometrics: 5
  Automation: 4
  Comparative Analysis: 4
  Educational Technology: 4
Publication Type
  Journal Articles: 15
  Reports - Research: 7
  Reports - Evaluative: 6
  Reports - Descriptive: 2
  Collected Works - Proceedings: 1
  Opinion Papers: 1
  Reference Materials -…: 1
Education Level
  Elementary Secondary Education: 7
  Higher Education: 7
  Postsecondary Education: 7
  Secondary Education: 2
  Early Childhood Education: 1
  Elementary Education: 1
  High Schools: 1
  Junior High Schools: 1
  Middle Schools: 1
Laws, Policies, & Programs
  Elementary and Secondary…: 1
Assessments and Surveys
  Graduate Record Examinations: 1
  Test of English as a Foreign…: 1
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
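The distinction this abstract draws can be made concrete with a small example. Below is a minimal Python sketch, using invented scores, that contrasts an agreement statistic (Cohen's kappa, suited to interchangeable measures) with a prediction-accuracy statistic (RMSE, suited to predicting a designated target); neither the data nor the choice of statistics is taken from the report itself.

```python
# Toy illustration (not from the report): two measures of the same construct.
from collections import Counter
import math

measure_a = [1, 2, 2, 3, 4, 3, 2, 1, 4, 3]  # invented scores on a 1-4 scale
measure_b = [1, 2, 3, 3, 4, 2, 2, 1, 4, 4]

def cohens_kappa(x, y):
    """Agreement statistic: chance-corrected exact agreement,
    appropriate when the two measures are meant to be interchangeable."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n
    px, py = Counter(x), Counter(y)
    expected = sum((px[c] / n) * (py[c] / n) for c in px.keys() | py.keys())
    return (observed - expected) / (1 - expected)

def rmse(target, predictor):
    """Prediction-accuracy statistic: error when one variable is the target
    and the other is used to predict it."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(target, predictor)) / len(target))

print(cohens_kappa(measure_a, measure_b))  # 0.6
print(rmse(measure_a, measure_b))          # ~0.55
```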
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Clements, Douglas H.; Banse, Holland; Sarama, Julie; Tatsuoka, Curtis; Joswick, Candace; Hudyma, Aaron; Van Dine, Douglas W.; Tatsuoka, Kikumi K. – Mathematical Thinking and Learning: An International Journal, 2022
Researchers often develop instruments using correctness scores (and a variety of theories and techniques, such as Item Response Theory) for validation and scoring. Less frequently, observations of children's strategies are incorporated into the design, development, and application of assessments. We conducted individual interviews of 833…
Descriptors: Item Response Theory, Computer Assisted Testing, Test Items, Mathematics Tests
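As background on the modeling this abstract mentions, here is a minimal sketch of a two-parameter logistic (2PL) item response function in Python. The parameter values are invented for illustration and are not drawn from the study.

```python
# Minimal 2PL item response function: the probability of a correct response
# as a function of ability (theta), discrimination (a), and difficulty (b).
import math

def p_correct(theta, a, b):
    """Probability that an examinee at ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An easy item vs. a hard one, at average ability (illustrative parameters):
print(p_correct(theta=0.0, a=1.2, b=-1.0))  # ~0.77
print(p_correct(theta=0.0, a=1.2, b=1.5))   # ~0.14
```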
Goecke, Benjamin; Schmitz, Florian; Wilhelm, Oliver – Journal of Intelligence, 2021
Performance in elementary cognitive tasks is moderately correlated with fluid intelligence and working memory capacity. These correlations are higher for more complex tasks, presumably due to increased demands on working memory capacity. In accordance with the binding hypothesis, which states that working memory capacity reflects the limit of a…
Descriptors: Intelligence, Cognitive Processes, Short Term Memory, Reaction Time
Gawliczek, Piotr; Krykun, Viktoriia; Tarasenko, Nataliya; Tyshchenko, Maksym; Shapran, Oleksandr – Advanced Education, 2021
The article deals with an innovative, cutting-edge solution within the language testing realm, namely computer adaptive language testing (CALT) in accordance with the NATO Standardization Agreement 6001 (NATO STANAG 6001) requirements for further implementation in foreign language training of personnel of the Armed Forces of Ukraine (AF of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Language Tests, Second Language Instruction
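For readers unfamiliar with how adaptive testing of this kind operates, the sketch below shows the core select-respond-update loop of a computer adaptive test. The item pool, difficulty values, and the crude fixed-step ability update are all invented for illustration; operational CALT systems use IRT-based estimation rather than this rule.

```python
# Illustrative CAT loop: after each response, update the ability estimate and
# administer the unanswered item whose difficulty is closest to it.
item_pool = {"i1": -1.5, "i2": -0.5, "i3": 0.0, "i4": 0.8, "i5": 1.6}  # difficulties

def next_item(ability, answered):
    candidates = {k: d for k, d in item_pool.items() if k not in answered}
    return min(candidates, key=lambda k: abs(candidates[k] - ability))

ability, answered = 0.0, set()
for _ in range(3):
    item = next_item(ability, answered)
    answered.add(item)
    correct = True  # stand-in for the examinee's actual response
    ability += 0.5 if correct else -0.5  # crude step update, not an IRT estimator

print(answered, ability)
```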
Kirsch, Irwin; Lennon, Mary Louise – Large-scale Assessments in Education, 2017
As the largest and most innovative international assessment of adults, PIAAC marks an inflection point in the evolution of large-scale comparative assessments. PIAAC grew from the foundation laid by surveys that preceded it, and introduced innovations that have shifted the way we conceive and implement large-scale assessments. As the first fully…
Descriptors: International Assessment, Adults, Measurement, Surveys
Behizadeh, Nadia; Lynch, Tom Liam – Berkeley Review of Education, 2017
For the last century, the quality of large-scale assessment in the United States has been undermined by narrow educational theory and hindered by limitations in technology. As a result, poor assessment practices have encouraged low-level instructional practices that disparately affect students from the most disadvantaged communities and schools.…
Descriptors: Equal Education, Measurement, Educational Theories, Evaluation Methods
Wolf, Mikyung Kim; Guzman-Orth, Danielle; Lopez, Alexis; Castellano, Katherine; Himelfarb, Igor; Tsutagawa, Fred S. – Educational Assessment, 2016
This article investigates ways to improve the assessment of English learner students' English language proficiency given the current movement of creating next-generation English language proficiency assessments in the Common Core era. In particular, this article discusses the integration of scaffolding strategies, which are prevalently utilized as…
Descriptors: English Language Learners, Scaffolding (Teaching Technique), Language Tests, Language Proficiency
Ramineni, Chaitanya; Williamson, David M. – Assessing Writing, 2013
In this paper, we provide an overview of psychometric procedures and guidelines Educational Testing Service (ETS) uses to evaluate automated essay scoring for operational use. We briefly describe the e-rater system, the procedures and criteria used to evaluate e-rater, implications for a range of potential uses of e-rater, and directions for…
Descriptors: Educational Testing, Guidelines, Scoring, Psychometrics
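One agreement measure commonly reported when evaluating automated essay scoring against human raters is quadratically weighted kappa; the sketch below, with invented scores, shows how it is computed. This is offered as general background, not as a reproduction of the specific ETS procedures the paper describes.

```python
# Quadratically weighted kappa for ordinal essay scores (toy data).
import numpy as np

def quadratic_weighted_kappa(human, machine, min_s, max_s):
    """Chance-corrected agreement that penalizes large score disagreements
    more heavily than near-misses, via squared-distance weights."""
    k = max_s - min_s + 1
    observed = np.zeros((k, k))
    for h, m in zip(human, machine):
        observed[h - min_s, m - min_s] += 1
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    weights = np.array([[(i - j) ** 2 for j in range(k)]
                        for i in range(k)]) / (k - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

human   = [2, 3, 4, 3, 5, 2, 4, 3]  # invented human scores on a 1-6 scale
machine = [2, 3, 3, 3, 5, 2, 4, 4]  # invented machine scores
print(quadratic_weighted_kappa(human, machine, 1, 6))
```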
Deane, Paul – Assessing Writing, 2013
This paper examines the construct measured by automated essay scoring (AES) systems. AES systems measure features of the text structure, linguistic structure, and conventional print form of essays; as such, the systems primarily measure text production skills. In the current state of the art, AES systems provide little direct evidence about such matters…
Descriptors: Scoring, Essays, Text Structure, Writing (Composition)
Ramineni, Chaitanya – Assessing Writing, 2013
In this paper, I describe the design and evaluation of automated essay scoring (AES) models for an institution's writing placement program. Information was gathered on admitted student writing performance at a science and technology research university in the northeastern United States. Under timed conditions, first-year students (N = 879) were…
Descriptors: Validity, Comparative Analysis, Internet, Student Placement
Condon, William – Assessing Writing, 2013
Automated Essay Scoring (AES) has garnered a great deal of attention from the rhetoric and composition/writing studies community since the Educational Testing Service began using e-rater[R] and the "Criterion"[R] Online Writing Evaluation Service as products in scoring writing tests, and most of the responses have been negative. While the…
Descriptors: Measurement, Psychometrics, Evaluation Methods, Educational Testing
Pommerich, Mary – Educational Measurement: Issues and Practice, 2012
Neil Dorans has made a career of advocating for the examinee. He continues to do so in his NCME career award address, providing a thought-provoking commentary on some current trends in educational measurement that could potentially affect the integrity of test scores. Concerns expressed in the address call attention to a conundrum that faces…
Descriptors: Testing, Scores, Measurement, Test Construction
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Quinlan, Thomas; Higgins, Derrick; Wolff, Susanne – Educational Testing Service, 2009
This report evaluates the construct coverage of the e-rater[R] scoring engine. The matter of construct coverage depends on whether one defines writing skill in terms of process or product. Originally, the e-rater engine consisted of a large set of components with a proven ability to predict human holistic scores. By organizing these capabilities…
Descriptors: Guides, Writing Skills, Factor Analysis, Writing Tests