Publication Date
In 2025 | 31 |
Since 2024 | 122 |
Since 2021 (last 5 years) | 345 |
Since 2016 (last 10 years) | 710 |
Since 2006 (last 20 years) | 1190 |
Descriptor
Test Construction | 2673 |
Test Items | 2673 |
Test Validity | 687 |
Test Reliability | 550 |
Foreign Countries | 533 |
Item Analysis | 478 |
Difficulty Level | 423 |
Multiple Choice Tests | 407 |
Item Response Theory | 391 |
Computer Assisted Testing | 380 |
Test Format | 360 |
More ▼ |
Source
Author
Publication Type
Education Level
Audience
Practitioners | 155 |
Teachers | 114 |
Researchers | 99 |
Administrators | 31 |
Students | 17 |
Policymakers | 6 |
Parents | 4 |
Counselors | 3 |
Support Staff | 3 |
Location
Turkey | 63 |
Australia | 53 |
Canada | 30 |
Florida | 26 |
Indonesia | 26 |
Germany | 24 |
United Kingdom | 22 |
United Kingdom (England) | 20 |
China | 18 |
Oregon | 16 |
Japan | 15 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1 |
Meets WWC Standards with or without Reservations | 1 |
Does not meet standards | 1 |
Yanyan Fu – Educational Measurement: Issues and Practice, 2024
The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequentially, there is more margin for error in this process, and any template errors can be cascaded to the generated items.…
Descriptors: Error Correction, Automation, Test Items, Test Construction
Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024
Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…
Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items
Becker, Benjamin; Weirich, Sebastian; Goldhammer, Frank; Debeer, Dries – Journal of Educational Measurement, 2023
When designing or modifying a test, an important challenge is controlling its speededness. To achieve this, van der Linden (2011a, 2011b) proposed using a lognormal response time model, more specifically the two-parameter lognormal model, and automated test assembly (ATA) via mixed integer linear programming. However, this approach has a severe…
Descriptors: Test Construction, Automation, Models, Test Items
Miguel A. García-Pérez – Educational and Psychological Measurement, 2024
A recurring question regarding Likert items is whether the discrete steps that this response format allows represent constant increments along the underlying continuum. This question appears unsolvable because Likert responses carry no direct information to this effect. Yet, any item administered in Likert format can identically be administered…
Descriptors: Likert Scales, Test Construction, Test Items, Item Analysis
Po-Chun Huang; Ying-Hong Chan; Ching-Yu Yang; Hung-Yuan Chen; Yao-Chung Fan – IEEE Transactions on Learning Technologies, 2024
Question generation (QG) task plays a crucial role in adaptive learning. While significant QG performance advancements are reported, the existing QG studies are still far from practical usage. One point that needs strengthening is to consider the generation of question group, which remains untouched. For forming a question group, intrafactors…
Descriptors: Automation, Test Items, Computer Assisted Testing, Test Construction
Mahmood Ul Hassan; Frank Miller – Journal of Educational Measurement, 2024
Multidimensional achievement tests are recently gaining more importance in educational and psychological measurements. For example, multidimensional diagnostic tests can help students to determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for…
Descriptors: Multidimensional Scaling, Achievement Tests, Test Items, Test Construction
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Christoph Ableitinger; Christian Dorner – International Journal of Mathematical Education in Science and Technology, 2025
The number of complaints university lecturers make about a lack of knowledge, especially first-year students' procedural knowledge, has increased recently. Due to missing adequate empirical evidence, a survey of procedural knowledge among students of Austrian high schools in their final year was conducted. For this purpose, test items for…
Descriptors: Knowledge Level, Cognitive Processes, High School Seniors, Foreign Countries
Chan Zhang; Shuaiying Cao; Minglei Wang; Jiangyan Wang; Lirui He – Field Methods, 2025
Previous research on grid questions has mostly focused on their comparability with the item-by-item method and the use of shading to help respondents navigate through a grid. This study extends prior work by examining whether lexical similarity among grid items affects how respondents answer the questions in an experiment where we manipulated…
Descriptors: Foreign Countries, Surveys, Test Construction, Design
Anna Planas-Lladó; Xavier Úcar – American Journal of Evaluation, 2024
Empowerment is a concept that has become increasingly used over recent years. However, little research has been undertaken into how empowerment can be evaluated, particularly in the case of young people. The aim of this article is to present an inventory of dimensions and indicators of youth empowerment. The article describes the various phases in…
Descriptors: Youth, Empowerment, Test Construction, Test Validity
A Method for Generating Course Test Questions Based on Natural Language Processing and Deep Learning
Hei-Chia Wang; Yu-Hung Chiang; I-Fan Chen – Education and Information Technologies, 2024
Assessment is viewed as an important means to understand learners' performance in the learning process. A good assessment method is based on high-quality examination questions. However, generating high-quality examination questions manually by teachers is a time-consuming task, and it is not easy for students to obtain question banks. To solve…
Descriptors: Natural Language Processing, Test Construction, Test Items, Models
Haokun Liu – International Journal of Multilingualism, 2025
Globally, countries or regions across from east to west like Hong Kong, Macao, Taiwan, Singapore, the United Kingdom, and the United States have incorporated language item questions in their censuses. The assessment of such design advantages and disadvantages is crucial for academic investigation. Despite ongoing discussions, there is a noticeable…
Descriptors: Language Usage, Demography, Surveys, Questionnaires
Kaja Haugen; Cecilie Hamnes Carlsen; Christine Möller-Omrani – Language Awareness, 2025
This article presents the process of constructing and validating a test of metalinguistic awareness (MLA) for young school children (age 8-10). The test was developed between 2021 and 2023 as part of the MetaLearn research project, financed by The Research Council of Norway. The research team defines MLA as using metalinguistic knowledge at a…
Descriptors: Language Tests, Test Construction, Elementary School Students, Metalinguistics
Harold Doran; Testsuhiro Yamada; Ted Diaz; Emre Gonulates; Vanessa Culver – Journal of Educational Measurement, 2025
Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Algorithms
Fu, Yanyan; Choe, Edison M.; Lim, Hwanggyu; Choi, Jaehwa – Educational Measurement: Issues and Practice, 2022
This case study applied the "weak theory" of Automatic Item Generation (AIG) to generate isomorphic item instances (i.e., unique but psychometrically equivalent items) for a large-scale assessment. Three representative instances were selected from each item template (i.e., model) and pilot-tested. In addition, a new analytical framework,…
Descriptors: Test Items, Measurement, Psychometrics, Test Construction