Publication Date
  In 2025: 0
  Since 2024: 0
  Since 2021 (last 5 years): 5
  Since 2016 (last 10 years): 9
  Since 2006 (last 20 years): 17
Descriptor
  Computer Assisted Testing: 18
  Measurement: 18
  Scoring: 18
  Essay Tests: 6
  Essays: 6
  Test Items: 6
  Evaluation Methods: 5
  Psychometrics: 5
  Automation: 4
  Comparative Analysis: 4
  Educational Technology: 4
Publication Type
  Journal Articles: 15
  Reports - Research: 7
  Reports - Evaluative: 6
  Reports - Descriptive: 2
  Collected Works - Proceedings: 1
  Opinion Papers: 1
  Reference Materials -…: 1
Education Level
  Elementary Secondary Education: 7
  Higher Education: 7
  Postsecondary Education: 7
  Secondary Education: 2
  Early Childhood Education: 1
  Elementary Education: 1
  High Schools: 1
  Junior High Schools: 1
  Middle Schools: 1
Laws, Policies, & Programs
  Elementary and Secondary…: 1
Assessments and Surveys
  Graduate Record Examinations: 1
  Test of English as a Foreign…: 1
Wang, Wei; Dorans, Neil J. – ETS Research Report Series, 2021
Agreement statistics and measures of prediction accuracy are often used to assess the quality of two measures of a construct. Agreement statistics are appropriate for measures that are supposed to be interchangeable, whereas prediction accuracy statistics are appropriate for situations where one variable is the target and the other variables are…
Descriptors: Classification, Scaling, Prediction, Accuracy
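The distinction this abstract draws can be made concrete with a small example. Below is a minimal Python sketch, using invented scores, that contrasts an agreement statistic (Cohen's kappa, suited to interchangeable measures) with a prediction-accuracy statistic (RMSE, suited to predicting a designated target); neither the data nor the choice of statistics is taken from the report itself.

```python
# Toy illustration (not from the report): two measures of the same construct.
from collections import Counter
import math

measure_a = [1, 2, 2, 3, 4, 3, 2, 1, 4, 3]  # invented scores on a 1-4 scale
measure_b = [1, 2, 3, 3, 4, 2, 2, 1, 4, 4]

def cohens_kappa(x, y):
    """Agreement statistic: chance-corrected exact agreement,
    appropriate when the two measures are meant to be interchangeable."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n
    px, py = Counter(x), Counter(y)
    expected = sum((px[c] / n) * (py[c] / n) for c in px.keys() | py.keys())
    return (observed - expected) / (1 - expected)

def rmse(target, predictor):
    """Prediction-accuracy statistic: error when one variable is the target
    and the other is used to predict it."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(target, predictor)) / len(target))

print(cohens_kappa(measure_a, measure_b))  # 0.6
print(rmse(measure_a, measure_b))          # ~0.55
```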
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Clements, Douglas H.; Banse, Holland; Sarama, Julie; Tatsuoka, Curtis; Joswick, Candace; Hudyma, Aaron; Van Dine, Douglas W.; Tatsuoka, Kikumi K. – Mathematical Thinking and Learning: An International Journal, 2022
Researchers often develop instruments using correctness scores (and a variety of theories and techniques, such as Item Response Theory) for validation and scoring. Less frequently, observations of children's strategies are incorporated into the design, development, and application of assessments. We conducted individual interviews of 833…
Descriptors: Item Response Theory, Computer Assisted Testing, Test Items, Mathematics Tests
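As background on the modeling this abstract mentions, here is a minimal sketch of a two-parameter logistic (2PL) item response function in Python. The parameter values are invented for illustration and are not drawn from the study.

```python
# Minimal 2PL item response function: the probability of a correct response
# as a function of ability (theta), discrimination (a), and difficulty (b).
import math

def p_correct(theta, a, b):
    """Probability that an examinee at ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# An easy item vs. a hard one, at average ability (illustrative parameters):
print(p_correct(theta=0.0, a=1.2, b=-1.0))  # ~0.77
print(p_correct(theta=0.0, a=1.2, b=1.5))   # ~0.14
```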
Goecke, Benjamin; Schmitz, Florian; Wilhelm, Oliver – Journal of Intelligence, 2021
Performance in elementary cognitive tasks is moderately correlated with fluid intelligence and working memory capacity. These correlations are higher for more complex tasks, presumably due to increased demands on working memory capacity. In accordance with the binding hypothesis, which states that working memory capacity reflects the limit of a…
Descriptors: Intelligence, Cognitive Processes, Short Term Memory, Reaction Time
Gawliczek, Piotr; Krykun, Viktoriia; Tarasenko, Nataliya; Tyshchenko, Maksym; Shapran, Oleksandr – Advanced Education, 2021
The article deals with an innovative, cutting-edge solution within the language testing realm, namely computer adaptive language testing (CALT) in accordance with the NATO Standardization Agreement 6001 (NATO STANAG 6001) requirements for further implementation in foreign language training of personnel of the Armed Forces of Ukraine (AF of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Language Tests, Second Language Instruction
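For readers unfamiliar with how adaptive testing of this kind operates, the sketch below shows the core select-respond-update loop of a computer adaptive test. The item pool, difficulty values, and the crude fixed-step ability update are all invented for illustration; operational CALT systems use IRT-based estimation rather than this rule.

```python
# Illustrative CAT loop: after each response, update the ability estimate and
# administer the unanswered item whose difficulty is closest to it.
item_pool = {"i1": -1.5, "i2": -0.5, "i3": 0.0, "i4": 0.8, "i5": 1.6}  # difficulties

def next_item(ability, answered):
    candidates = {k: d for k, d in item_pool.items() if k not in answered}
    return min(candidates, key=lambda k: abs(candidates[k] - ability))

ability, answered = 0.0, set()
for _ in range(3):
    item = next_item(ability, answered)
    answered.add(item)
    correct = True  # stand-in for the examinee's actual response
    ability += 0.5 if correct else -0.5  # crude step update, not an IRT estimator

print(answered, ability)
```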
Kirsch, Irwin; Lennon, Mary Louise – Large-scale Assessments in Education, 2017
As the largest and most innovative international assessment of adults, PIAAC marks an inflection point in the evolution of large-scale comparative assessments. PIAAC grew from the foundation laid by surveys that preceded it, and introduced innovations that have shifted the way we conceive and implement large-scale assessments. As the first fully…
Descriptors: International Assessment, Adults, Measurement, Surveys
Behizadeh, Nadia; Lynch, Tom Liam – Berkeley Review of Education, 2017
For the last century, the quality of large-scale assessment in the United States has been undermined by narrow educational theory and hindered by limitations in technology. As a result, poor assessment practices have encouraged low-level instructional practices that disparately affect students from the most disadvantaged communities and schools.…
Descriptors: Equal Education, Measurement, Educational Theories, Evaluation Methods
Wolf, Mikyung Kim; Guzman-Orth, Danielle; Lopez, Alexis; Castellano, Katherine; Himelfarb, Igor; Tsutagawa, Fred S. – Educational Assessment, 2016
This article investigates ways to improve the assessment of English learner students' English language proficiency given the current movement of creating next-generation English language proficiency assessments in the Common Core era. In particular, this article discusses the integration of scaffolding strategies, which are prevalently utilized as…
Descriptors: English Language Learners, Scaffolding (Teaching Technique), Language Tests, Language Proficiency
Ramineni, Chaitanya; Williamson, David M. – Assessing Writing, 2013
In this paper, we provide an overview of psychometric procedures and guidelines Educational Testing Service (ETS) uses to evaluate automated essay scoring for operational use. We briefly describe the e-rater system, the procedures and criteria used to evaluate e-rater, implications for a range of potential uses of e-rater, and directions for…
Descriptors: Educational Testing, Guidelines, Scoring, Psychometrics
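One agreement measure commonly reported when evaluating automated essay scoring against human raters is quadratically weighted kappa; the sketch below, with invented scores, shows how it is computed. This is offered as general background, not as a reproduction of the specific ETS procedures the paper describes.

```python
# Quadratically weighted kappa for ordinal essay scores (toy data).
import numpy as np

def quadratic_weighted_kappa(human, machine, min_s, max_s):
    """Chance-corrected agreement that penalizes large score disagreements
    more heavily than near-misses, via squared-distance weights."""
    k = max_s - min_s + 1
    observed = np.zeros((k, k))
    for h, m in zip(human, machine):
        observed[h - min_s, m - min_s] += 1
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    weights = np.array([[(i - j) ** 2 for j in range(k)]
                        for i in range(k)]) / (k - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

human   = [2, 3, 4, 3, 5, 2, 4, 3]  # invented human scores on a 1-6 scale
machine = [2, 3, 3, 3, 5, 2, 4, 4]  # invented machine scores
print(quadratic_weighted_kappa(human, machine, 1, 6))
```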
Deane, Paul – Assessing Writing, 2013
This paper examines the construct measured by automated essay scoring (AES) systems. AES systems measure features of the text structure, linguistic structure, and conventional print form of essays; as such, the systems primarily measure text production skills. In the current state of the art, AES systems provide little direct evidence about such matters…
Descriptors: Scoring, Essays, Text Structure, Writing (Composition)
Ramineni, Chaitanya – Assessing Writing, 2013
In this paper, I describe the design and evaluation of automated essay scoring (AES) models for an institution's writing placement program. Information was gathered on admitted student writing performance at a science and technology research university in the northeastern United States. Under timed conditions, first-year students (N = 879) were…
Descriptors: Validity, Comparative Analysis, Internet, Student Placement
Condon, William – Assessing Writing, 2013
Automated Essay Scoring (AES) has garnered a great deal of attention from the rhetoric and composition/writing studies community since the Educational Testing Service began using e-rater[R] and the "Criterion"[R] Online Writing Evaluation Service as products in scoring writing tests, and most of the responses have been negative. While the…
Descriptors: Measurement, Psychometrics, Evaluation Methods, Educational Testing
Pommerich, Mary – Educational Measurement: Issues and Practice, 2012
Neil Dorans has made a career of advocating for the examinee. He continues to do so in his NCME career award address, providing a thought-provoking commentary on some current trends in educational measurement that could potentially affect the integrity of test scores. Concerns expressed in the address call attention to a conundrum that faces…
Descriptors: Testing, Scores, Measurement, Test Construction
Lissitz, Robert W.; Hou, Xiaodong; Slater, Sharon Cadman – Journal of Applied Testing Technology, 2012
This article investigates several questions regarding the impact of different item formats on measurement characteristics. Constructed response (CR) items and multiple choice (MC) items obviously differ in their formats and in the resources needed to score them. As such, they have been the subject of considerable discussion regarding the impact of…
Descriptors: Computer Assisted Testing, Scoring, Evaluation Problems, Psychometrics
Quinlan, Thomas; Higgins, Derrick; Wolff, Susanne – Educational Testing Service, 2009
This report evaluates the construct coverage of the e-rater[R] scoring engine. The matter of construct coverage depends on whether one defines writing skill in terms of process or product. Originally, the e-rater engine consisted of a large set of components with a proven ability to predict human holistic scores. By organizing these capabilities…
Descriptors: Guides, Writing Skills, Factor Analysis, Writing Tests