Publication Date
In 2025: 1
Since 2024: 1
Since 2021 (last 5 years): 3
Since 2016 (last 10 years): 7
Since 2006 (last 20 years): 11
Descriptor
Computer Assisted Testing: 14
Interrater Reliability: 14
Test Validity: 14
Test Reliability: 9
Scoring: 7
Correlation: 6
Language Tests: 5
English (Second Language): 4
Foreign Countries: 4
Test Construction: 4
Difficulty Level: 3
Author
Anna-Maria Fall: 2
Beula M. Magimairaj: 2
Greg Roberts: 2
Philip Capin: 2
Ronald B. Gillam: 2
Sandra L. Gillam: 2
Sharon Vaughn: 2
Bejar, Isaac I.: 1
Ben-Simon, Anat: 1
Bennett, Randy Elliot: 1
Bobek, Becky L.: 1
Publication Type
Reports - Research: 11
Journal Articles: 9
Tests/Questionnaires: 2
Dissertations/Theses -…: 1
Reports - Descriptive: 1
Reports - Evaluative: 1
Speeches/Meeting Papers: 1
Audience
Researchers: 1
Laws, Policies, & Programs
Pell Grant Program: 1
Assessments and Surveys
Test of English as a Foreign…: 4
ACT Assessment: 1
Graduate Record Examinations: 1
Strengths and Difficulties…: 1
Tahereh Firoozi; Hamid Mohammadi; Mark J. Gierl – Journal of Educational Measurement, 2025
The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two sentence embedding models were evaluated within the AES system: multilingual BERT (mBERT) and language-agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were…
Descriptors: College Students, Slavic Languages, German, Italian
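The abstract above names the two multilingual sentence-embedding models but not the downstream scorer. A minimal sketch of that kind of pipeline, assuming the publicly available LaBSE checkpoint and an ordinary ridge regressor (both illustrative choices, not the system evaluated in the study):

```python
# Minimal sketch, assuming the public LaBSE checkpoint and a ridge regressor;
# these are illustrative choices, not the AES system evaluated in the study.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import Ridge

essays = [
    "Der Klimawandel betrifft uns alle ...",        # German (toy text)
    "Il cambiamento climatico riguarda tutti ...",  # Italian (toy text)
    "Zmena klimatu se tyka nas vsech ...",          # Czech (toy text)
]
human_scores = [4.0, 3.0, 2.0]  # hypothetical rater scores

encoder = SentenceTransformer("sentence-transformers/LaBSE")  # language-agnostic embeddings
X = encoder.encode(essays)                                    # one fixed-length vector per essay

scorer = Ridge(alpha=1.0).fit(X, human_scores)
new_essay = encoder.encode(["Das Thema dieses Aufsatzes ist ..."])
print(scorer.predict(new_essay))  # machine score for an unseen essay
```

Because LaBSE maps all supported languages into one embedding space, the same regressor can in principle score essays from several languages at once; mBERT would slot in the same way.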
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online-administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the near absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online-administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and the near absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written in response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation of each essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
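The design described above, many raters averaged into a true-score proxy and a machine scorer evaluated by cross-validation, can be sketched as follows. The feature matrix and the linear model are placeholders; the study's actual AES features are not given in the abstract.

```python
# Sketch of the evaluation design: ~15 rater scores per essay averaged as a
# true-score proxy, and a machine scorer assessed by k-fold cross-validation.
# The features and the linear model are placeholders, not the study's AES system.
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)
n_essays, n_raters = 250, 15
rater_scores = rng.integers(1, 7, size=(n_essays, n_raters)).astype(float)  # simulated ratings
true_score_proxy = rater_scores.mean(axis=1)   # averaging many raters approximates the true score

X = rng.normal(size=(n_essays, 20))            # placeholder essay features
machine_scores = cross_val_predict(
    LinearRegression(), X, true_score_proxy,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
print(pearsonr(machine_scores, true_score_proxy))  # agreement with the true-score proxy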
Smolinsky, Lawrence; Marx, Brian D.; Olafsson, Gestur; Ma, Yanxia A. – Journal of Educational Computing Research, 2020
Computer-based testing is an expanding use of technology that offers advantages to teachers and students. We studied Calculus II classes for science, technology, engineering, and mathematics majors using different testing modes. Three sections with 324 students in total used paper-and-pencil testing, computer-based testing, or both. Computer tests gave…
Descriptors: Test Format, Computer Assisted Testing, Paper (Material), Calculus
Edward Paul Getman – Online Submission, 2020
Despite calls for engaging assessments targeting young language learners (YLLs) between 8 and 13 years old, what makes assessment tasks engaging and how such task characteristics affect measurement quality have not been well studied empirically. Furthermore, there has been a dearth of validity research about technology-enhanced speaking tests for…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Learner Engagement
Razi, Salim – SAGE Open, 2015
Similarity reports from plagiarism detectors should be approached with caution, as they may not be sufficient to support allegations of plagiarism. This study developed a 50-item rubric to simplify and standardize the evaluation of academic papers. In the spring semester of the 2011-2012 academic year, 161 freshmen's papers at the English Language Teaching…
Descriptors: Foreign Countries, Scoring Rubrics, Writing Evaluation, Writing (Composition)
Boström, Petra; Johnels, Jakob Åsberg; Thorson, Maria; Broberg, Malin – Journal of Mental Health Research in Intellectual Disabilities, 2016
Few studies have explored the subjective mental health of adolescents with intellectual disabilities, while proxy ratings indicate an overrepresentation of mental health problems. The present study reports on the design and an initial empirical evaluation of the Well-being in Special Education Questionnaire (WellSEQ). Questions, response scales,…
Descriptors: Mental Health, Peer Relationship, Family Environment, Educational Environment
Haberman, Shelby J. – Educational Testing Service, 2011
Alternative approaches are discussed for using e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternative criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Descriptors: Writing Tests, Scoring, Essays, Language Tests
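The two criteria differ only in what the scoring model is trained to predict. A small sketch of the first criterion, constructing an examinee-level target as the expected (average) rater score over that examinee's two essays; the column names and toy values are hypothetical.

```python
# Sketch of the 1st criterion above: the predicted variable is the expected
# (average) rater score over each examinee's two essays. Column names and
# values are hypothetical.
import pandas as pd

ratings = pd.DataFrame({
    "examinee":    ["A", "A", "B", "B"],
    "essay":       [1, 2, 1, 2],
    "rater_score": [3.5, 4.0, 2.5, 3.0],
})

# Expected rater score per examinee, averaged over both essay responses;
# this, rather than a single-essay score, becomes the regression target.
examinee_target = ratings.groupby("examinee")["rater_score"].mean()
print(examinee_target)
```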
Zechner, Klaus; Bejar, Isaac I.; Hemat, Ramin – ETS Research Report Series, 2007
The increasing availability and performance of computer-based testing have prompted more research on the automatic assessment of language and speaking proficiency. In this investigation, we evaluated the feasibility of using an off-the-shelf speech-recognition system for scoring responses to speaking prompts from the 2002 LanguEdge field test. We first…
Descriptors: Role, Computer Assisted Testing, Language Proficiency, Oral Language
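The abstract does not detail the scoring features, but time-stamped recognizer output typically supports simple delivery measures such as speaking rate and pausing. A minimal sketch, assuming a hypothetical (word, start, end) token structure rather than any particular recognizer's API:

```python
# Minimal sketch of fluency features from time-stamped ASR output. The Token
# structure is a hypothetical stand-in for recognizer output; the features are
# illustrative, not the scoring model used in the study.
from dataclasses import dataclass

@dataclass
class Token:
    word: str
    start: float  # seconds
    end: float

tokens = [Token("the", 0.0, 0.2), Token("library", 0.2, 0.8),
          Token("is", 1.6, 1.7), Token("closed", 1.7, 2.3)]

duration = tokens[-1].end - tokens[0].start                     # total response time
words_per_second = len(tokens) / duration                       # speaking rate
long_pauses = [b.start - a.end for a, b in zip(tokens, tokens[1:])
               if b.start - a.end > 0.5]                        # silent gaps > 0.5 s

print(f"rate = {words_per_second:.2f} words/s, long pauses = {len(long_pauses)}")
```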
Bennett, Randy Elliot; Rock, Donald A. – 1993
Formulating-Hypotheses (F-H) items present a situation and ask the examinee to generate as many explanations for it as possible. This study examined the generalizability, validity, and examinee perceptions of a computer-delivered version of the task. Eight F-H questions were administered to 192 graduate students. Half of the items restricted…
Descriptors: Computer Assisted Testing, Difficulty Level, Generalizability Theory, Graduate Students
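For readers unfamiliar with the generalizability analysis mentioned above, a compact sketch of a fully crossed person × item G-study (192 examinees × 8 F-H items) follows; the score matrix is simulated, not the study's data.

```python
# Sketch of a person x item generalizability analysis: variance components from
# a fully crossed 192 x 8 score matrix and the relative G coefficient for an
# 8-item form. The scores are simulated, not the study's data.
import numpy as np

rng = np.random.default_rng(1)
n_p, n_i = 192, 8
scores = (rng.normal(5.0, 1.0, size=(n_p, 1))      # person effects
          + rng.normal(0.0, 0.5, size=(1, n_i))    # item effects
          + rng.normal(0.0, 1.0, size=(n_p, n_i))) # residual

grand = scores.mean()
ms_p = n_i * ((scores.mean(axis=1) - grand) ** 2).sum() / (n_p - 1)
resid = scores - scores.mean(axis=1, keepdims=True) - scores.mean(axis=0, keepdims=True) + grand
ms_res = (resid ** 2).sum() / ((n_p - 1) * (n_i - 1))

var_person = (ms_p - ms_res) / n_i                 # universe-score variance
g_relative = var_person / (var_person + ms_res / n_i)
print(f"relative G coefficient (8 items): {g_relative:.2f}")
```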
Clariana, Roy B.; Koul, Ravinder; Salehi, Roya – International Journal of Instructional Media, 2006
This investigation seeks to confirm a computer-based approach that can be used to score concept maps (Poindexter & Clariana, 2004) and then describes the concurrent criterion-related validity of these scores. Participants enrolled in two graduate courses (n=24) were asked to read about and research online the structure and function of the heart…
Descriptors: Semantics, Human Body, Test Validity, Anatomy
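The abstract does not spell out the Poindexter and Clariana (2004) scoring procedure; a minimal sketch, assuming a simple link-overlap rule, shows the general shape of computer-based concept-map scoring and the concurrent-validity check that follows.

```python
# Minimal sketch, assuming a link-overlap scoring rule: each concept map is a set
# of (concept, concept) links scored by overlap with a referent map. This is an
# illustrative stand-in for the Poindexter & Clariana (2004) procedure, which the
# abstract does not detail.
from scipy.stats import pearsonr

referent = {("heart", "aorta"), ("heart", "ventricle"),
            ("ventricle", "valve"), ("aorta", "artery")}

student_maps = [
    {("heart", "aorta"), ("heart", "ventricle")},
    {("heart", "aorta"), ("ventricle", "valve"), ("aorta", "artery")},
    {("heart", "lungs")},
]
criterion_scores = [70.0, 85.0, 40.0]  # hypothetical scores on an established measure

map_scores = [len(m & referent) / len(referent) for m in student_maps]
print(pearsonr(map_scores, criterion_scores))  # concurrent criterion-related validity
```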
Bobek, Becky L.; Gore, Paul A. – American College Testing (ACT), Inc., 2004
This research report describes changes made to the Inventory of Work-Relevant Values when it was revised for online use as a part of the Internet version of DISCOVER. Users will see the following differences between the online and CD-ROM versions of the inventory: 22 items rather than 61, simplified presentation, and the contribution of all items…
Descriptors: Interrater Reliability, Field Tests, Internet, Test Construction
Carlson, Sybil B.; Camp, Roberta – 1985
This paper reports on Educational Testing Service research studies investigating the parameters critical to reliability and validity in both direct and indirect assessment of the writing ability of higher education applicants. The studies involved: (1) formulating an operational definition of writing competence; (2) designing and pretesting writing…
Descriptors: College Entrance Examinations, Computer Assisted Testing, English (Second Language), Essay Tests