Publication Date
In 2025 | 1 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 9 |
Since 2016 (last 10 years) | 22 |
Since 2006 (last 20 years) | 31 |
Descriptor
Scoring | 72 |
Test Validity | 50 |
Test Reliability | 34 |
Test Construction | 33 |
Testing | 18 |
Validity | 14 |
English (Second Language) | 13 |
Test Interpretation | 13 |
Foreign Countries | 12 |
Interrater Reliability | 12 |
Correlation | 11 |
Audience
Practitioners | 4 |
Researchers | 3 |
Teachers | 1 |
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Conti, Gary J. – Journal of Education and Learning, 2023
The use of personality inventories has been limited because of their cost and length. To overcome these limitations, this study created the Personality Identity Estimator (PIE), an easy-to-use inventory to estimate personality types that can be used at no cost. PIE is a categorical inventory containing 12 items with 3 items for each of the 4…
Descriptors: Personality Measures, Personality Traits, Validity, Reliability
Beheshti, Shima; Safa, Mohammad Ahmadi – Iranian Journal of Language Teaching Research, 2023
The indefinite nature of test fairness and different interpretations and definitions of the concept have stirred a lot of controversy over the years, necessitating the reconceptualization of the concept. On this basis, this study aimed to explore the empirical validity of Kunnan's (2008) Test Fairness Framework (TFF) and revisit the established…
Descriptors: Test Bias, Equal Education, Grounded Theory, Test Construction
Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Journal of Experimental Education, 2022
In this study, we examined the scoring and generalizability assumptions of an explicit instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Education, Classroom Observation Techniques, Validity
Li, Xu; Ouyang, Fan; Liu, Jianwen; Wei, Chengkun; Chen, Wenzhi – Journal of Educational Computing Research, 2023
The computer-supported writing assessment (CSWA) has been widely used to reduce instructor workload and provide real-time feedback. Interpretability of CSWA draws extensive attention because it can benefit the validity, transparency, and knowledge-aware feedback of academic writing assessments. This study proposes a novel assessment tool,…
Descriptors: Computer Assisted Testing, Writing Evaluation, Feedback (Response), Natural Language Processing
Saban-Dülger, Nur Seda; Turan, Figen; Özcebe, Esra – Journal of Speech, Language, and Hearing Research, 2022
Purpose: Language sampling analysis (LSA) plays an important role in evaluating language skills; hence, the study aimed to develop new assessment measures for the LSA in Turkish as alternatives to mean length of utterance (MLU) and the Language Assessment, Remediation and Screening Procedure. With this aim, Developmental Sentence Scoring (DSS) and…
Descriptors: Syntax, Turkish, Speech Communication, Correlation
Yik, Brandon J.; Dood, Amber J.; Cruz-Ramirez de Arellano, Daniel; Fields, Kimberly B.; Raker, Jeffrey R. – Chemistry Education Research and Practice, 2021
Acid-base chemistry is a key reaction motif taught in postsecondary organic chemistry courses. More specifically, concepts from the Lewis acid-base model are broadly applicable to understanding mechanistic ideas such as electron density, nucleophilicity, and electrophilicity; thus, the Lewis model is fundamental to explaining an array of reaction…
Descriptors: Artificial Intelligence, Models, Formative Evaluation, Organic Chemistry
Evaluating an Explicit Instruction Teacher Observation Protocol through a Validity Argument Approach
Johnson, Evelyn S.; Zheng, Yuzhu; Crawford, Angela R.; Moylan, Laura A. – Grantee Submission, 2020
In this study, we examined the scoring and generalizability assumptions of an Explicit Instruction (EI) special education teacher observation protocol using many-faceted Rasch measurement (MFRM). Video observations of classroom instruction from 48 special education teachers across four states were collected. External raters (n = 20) were trained…
Descriptors: Direct Instruction, Teacher Evaluation, Classroom Observation Techniques, Validity
Feranchak, Bret; Deiger, Megan – AERA Online Paper Repository, 2017
Increasingly, content area projects and programs at the K-12 level, such as in mathematics, involve a programmatic component or project emphasis on developing "teacher leadership". However, there is no consistent definition or framework for this construct and even fewer validated tools for measuring it. This paper describes our efforts in…
Descriptors: Teacher Leadership, Mathematics Instruction, Guidelines, Elementary Secondary Education
International Journal of Testing, 2018
The second edition of the International Test Commission Guidelines for Translating and Adapting Tests was prepared between 2005 and 2015 to improve upon the first edition, and to respond to advances in testing technology and practices. The 18 guidelines are organized into six categories to facilitate their use: pre-condition (3), test development…
Descriptors: Translation, Test Construction, Testing, Scoring
Bell, Courtney A.; Jones, Nathan D.; Qi, Yi; Lewis, Jennifer M. – Educational Assessment, 2018
All 50 states use observations to evaluate practicing teachers, but we know little about how administrators actually reason when they use those observation protocols. Drawing on think-aloud and stimulated recall data, this study describes the types of strategies and warrants practicing administrators used when rating with their district's…
Descriptors: Administrators, Observation, Validity, Logical Thinking
Fairbairn, Judith; Spiby, Richard – European Journal of Special Needs Education, 2019
Language test developers have a responsibility to ensure that their tests are accessible to test takers of various backgrounds and characteristics and also that they have the opportunity to perform to the best of their ability. This principle is widely recognised by educational and language testing associations in guidelines for the production and…
Descriptors: Testing, Language Tests, Test Construction, Testing Accommodations
Wang, Qiao – Education and Information Technologies, 2022
This study searched for open-source semantic similarity tools and evaluated their effectiveness in automated content scoring of fact-based essays written by English-as-a-Foreign-Language (EFL) learners. Fifty writing samples under a fact-based writing task from an academic English course in a Japanese university were collected and a gold standard…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Scoring
Davis, Larry; Norris, John – ETS Research Report Series, 2021
The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The "TOEFL® Essentials"™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…
Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests