Publication Date
In 2025 | 6 |
Since 2024 | 18 |
Descriptor
Test Format | 18 |
Test Validity | 18 |
Test Reliability | 10 |
Foreign Countries | 7 |
Language Tests | 5 |
Test Construction | 5 |
Test Items | 5 |
Testing | 5 |
Item Response Theory | 4 |
Multiple Choice Tests | 4 |
Psychometrics | 4 |
Author
Ali Khodi | 1 |
Amir Hossein Farrokhi | 1 |
Amit Sevak | 1 |
Bin Tan | 1 |
Celeste Combrinck | 1 |
Chandralekha Singh | 1 |
Cornelia Eva Neuert | 1 |
Dadan Rosana | 1 |
Dana Murano | 1 |
Daniel Fishtein | 1 |
David Soares | 1 |
Publication Type
Journal Articles | 16 |
Reports - Research | 10 |
Information Analyses | 3 |
Reports - Evaluative | 3 |
Guides - Classroom - Teacher | 2 |
Collected Works - General | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 6 |
Postsecondary Education | 6 |
Elementary Education | 3 |
Grade 6 | 2 |
Intermediate Grades | 2 |
Middle Schools | 2 |
Secondary Education | 2 |
Early Childhood Education | 1 |
Grade 11 | 1 |
Grade 2 | 1 |
Grade 3 | 1 |
Audience
Teachers | 2 |
Assessments and Surveys
ACT Assessment | 1 |
International English… | 1 |
Test of Gross Motor… | 1 |
Bin Tan; Nour Armoush; Elisabetta Mazzullo; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2025
This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly…
Descriptors: Artificial Intelligence, Test Items, Automation, Test Format
Natalja Menold; Vera Toepoel – Sociological Methods & Research, 2024
Research on mixed devices in web surveys is in its infancy. Using a randomized experiment, we investigated device effects (desktop PC, tablet and mobile phone) for six response formats and four different numbers of scale points. N = 5,077 members of an online access panel participated in the experiment. An exact test of measurement invariance and…
Descriptors: Online Surveys, Handheld Devices, Telecommunications, Test Reliability
Xueliang Chen; Vahid Aryadoust; Wenxin Zhang – Language Testing, 2025
The growing diversity among test takers in second or foreign language (L2) assessments puts fairness front and center. This systematic review aimed to examine how fairness in L2 assessments was evaluated through differential item functioning (DIF) analysis. A total of 83 articles from 27 journals were included in a systematic…
Descriptors: Second Language Learning, Language Tests, Test Items, Item Analysis
Cornelia Eva Neuert – Sociological Methods & Research, 2024
The quality of data in surveys is affected by response burden and questionnaire length. With an increasing number of questions, respondents can become bored, tired, and annoyed and may take shortcuts to reduce the effort needed to complete the survey. In this article, direct evidence is presented on how the position of items within a web…
Descriptors: Online Surveys, Test Items, Test Format, Test Construction
Kevin Woods; Tee McCaldin; Kerry Brown; Rob Buck; Nicola Fairhall; Emma Forshaw; David Soares – Assessment in Education: Principles, Policy & Practice, 2024
In England, Wales and Northern Ireland, the General Certificate of Secondary Education (GCSE) has been for the last 35 years the most common qualification by which students' attainment at age 16 has been measured. The range and balance of processes by which the GCSEs' programmes of study have been assessed have varied over the decades, to include…
Descriptors: Foreign Countries, Secondary School Students, Grade 11, Educational Certificates
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Jeff Allen; Jay Thomas; Stacy Dreyer; Scott Johanningmeier; Dana Murano; Ty Cruce; Xin Li; Edgar Sanchez – ACT Education Corp., 2025
This report describes the process of developing and validating the enhanced ACT. The report describes the changes made to the test content and the processes by which these design decisions were implemented. The authors describe how they shared the overall scope of the enhancements, including the initial blueprints, with external expert panels,…
Descriptors: College Entrance Examinations, Testing, Change, Test Construction
Wim J. van der Linden; Luping Niu; Seung W. Choi – Journal of Educational and Behavioral Statistics, 2024
A test battery with two different levels of adaptation is presented: a within-subtest level for the selection of the items in the subtests and a between-subtest level to move from one subtest to the next. The battery runs on a two-level model consisting of a regular response model for each of the subtests extended with a second level for the joint…
Descriptors: Adaptive Testing, Test Construction, Test Format, Test Reliability
Celeste Combrinck – SAGE Open, 2024
We have less time and focus than ever before, while the demand for attention is increasing. Therefore, it is no surprise that when answering questionnaires, we often choose to strongly agree or be neutral, producing problematic and unusable data. The current study investigated forced-choice (ipsative) format compared to the same questions on a…
Descriptors: Likert Scales, Test Format, Surveys, Design
Muhammed Parviz; Masoud Azizi – Discover Education, 2025
This article offers a critical review of the Ministry of Science, Research, and Technology English Proficiency Test (MSRT), a high-stakes exam required for postgraduate graduation, scholarships, and certain employment positions in Iran. Despite its widespread use, the design and implementation of the MSRT raise concerns about its validity and…
Descriptors: Language Tests, Language Proficiency, English (Second Language), Second Language Learning
Laura A. Outhwaite; Pirjo Aunio; Jaimie Ka Yu Leung; Jo Van Herwegen – Educational Psychology Review, 2024
Successful early mathematical development is vital to children's later education, employment, and wellbeing outcomes. However, established measurement tools are infrequently used to (i) assess children's mathematical skills and (ii) identify children with or at risk of mathematical learning difficulties. In response, this pre-registered systematic…
Descriptors: Mathematics Tests, Screening Tests, Mathematics Skills, At Risk Students
Ali Khodi; Logendra Stanley Ponniah; Amir Hossein Farrokhi; Fateme Sadeghi – Language Testing in Asia, 2024
The current article evaluates a national English language proficiency test known as the "MSRT test", which is used to determine the eligibility of candidates for admission to and completion of higher education programs in Iran. Students in all majors take this standardized, high-stakes, criterion-referenced test to determine if they have…
Descriptors: Foreign Countries, Language Tests, Reading Tests, Language Proficiency
Stefan O'Grady – International Journal of Listening, 2025
Language assessment is increasingly computer-mediated. This development presents opportunities with new task formats and equally a need for renewed scrutiny of established conventions. Recent recommendations to increase integrated skills assessment in lecture comprehension tests are premised on empirical research that demonstrates enhanced construct…
Descriptors: Language Tests, Lecture Method, Listening Comprehension Tests, Multiple Choice Tests
Nathan Gavigan; Sarahjane Belton; Una Britton; Shane Dalton; Johann Issartel – European Physical Education Review, 2024
Although there is a plethora of tools available to assess children's movement competence (MC), the literature suggests that many have significant limitations (e.g. not being practical for use in many 'real-world' settings). The FMS² assessment tool has recently been developed as a targeted solution to many of the existing barriers…
Descriptors: Test Validity, Test Format, Children, Evaluation
Fitria Lafifa; Dadan Rosana – Turkish Online Journal of Distance Education, 2024
This research aims to develop a multiple-choice closed-ended test to assess and evaluate students' digital literacy skills. The sample in this study consisted of students at MTsN 1 Blitar City who were selected using a purposive sampling technique. The test was also validated by experts, namely 2 Doctors of Physics and Science from Yogyakarta State…
Descriptors: Educational Innovation, Student Evaluation, Digital Literacy, Multiple Choice Tests