Showing 1 to 15 of 51 results
Peer reviewed
Enrico Gandolfi; Richard E. Ferdig – Educational Technology Research and Development, 2025
Augmented Reality (AR) is increasingly being adopted in education to foster engagement and interest in a variety of subjects and content areas. However, there is a scarcity of instruments to measure the instructional impact of this innovation. This article addresses this gap in two unique ways. First, it presents validation results of the…
Descriptors: Simulated Environment, Measures (Individuals), Rating Scales, Item Response Theory
Peer reviewed
Viola Merhof; Caroline M. Böhm; Thorsten Meiser – Educational and Psychological Measurement, 2024
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person…
Descriptors: Item Response Theory, Test Interpretation, Test Reliability, Test Validity
Peer reviewed
Barno S. Abdullaeva; Diyorjon Abdullaev; Nurislom I. Khursanov; Khurshida B. Kadirova; Laylo Djuraeva – International Journal of Language Testing, 2024
Cloze tests are commonly used in language testing as a quick measure of overall language ability or reading comprehension. A problem for the analysis of cloze tests with item response theory models is that cloze test items are locally dependent. This leads to the violation of the conditional or local independence assumption of IRT models. In this…
Descriptors: Cloze Procedure, Language Tests, Test Items, Correlation
Peer reviewed
Abdullah Alamer; Ahmed Al Khateeb; Abdulrahman Alshabeb – Language Assessment Quarterly, 2025
This study introduces the first Arabic Vocabulary Levels Test (Arabic-VLT), created for foreign learners of Arabic. We present compelling evidence to substantiate its validity and reliability. The Arabic-VLT was developed according to five levels, beginning with the most frequently used words (Level 1) to the least frequently used ones (Level 5),…
Descriptors: Arabic, Vocabulary Development, Test Construction, Second Language Learning
Peer reviewed
Angelica Garzon Umerenkova; Jesus de la Fuente Arias – Electronic Journal of Research in Educational Psychology, 2024
Introduction: Self-regulation is the ability to adequately plan and manage one's own behavior in a flexible manner. It is a predictor of well-being, health, and academic performance, among other outcomes. The psychometric characterization of the Self-Regulation Questionnaire-Abbreviated (CAR-abr.) composed of 17 items is presented. A versatile instrument,…
Descriptors: Self Control, Self Management, Questionnaires, Psychometrics
Peer reviewed
Liu, Tour; Sun, Yicong; Li, Zhen; Xin, Tao – Measurement: Interdisciplinary Research and Perspectives, 2019
Aberrant response has an important impact on item parameter estimation, individuals' evaluation, and other statistical analysis. There are various types of aberrant response behaviors in educational and psychological tests, like sleeping, guessing, and plodding. Random response is the most common one. The purpose of this research was to clarify…
Descriptors: Test Reliability, Test Validity, Item Response Theory, Differences
Peer reviewed
Stoevenbelt, Andrea H.; Wicherts, Jelte M.; Flore, Paulette C.; Phillips, Lorraine A. T.; Pietschnig, Jakob; Verschuere, Bruno; Voracek, Martin; Schwabe, Inga – Educational and Psychological Measurement, 2023
When cognitive and educational tests are administered under time limits, tests may become speeded and this may affect the reliability and validity of the resulting test scores. Prior research has shown that time limits may create or enlarge gender gaps in cognitive and academic testing. On average, women complete fewer items than men when a test…
Descriptors: Timed Tests, Gender Differences, Item Response Theory, Correlation
Peer reviewed
Hartono, Wahyu; Hadi, Samsul; Rosnawati, Raden; Retnawati, Heri – Pegem Journal of Education and Instruction, 2023
Researchers design diagnostic assessments to measure students' knowledge structures and processing skills to provide information about their cognitive attribute. The purpose of this study is to determine the instrument's validity and score reliability, as well as to investigate the use of classical test theory to identify item characteristics. The…
Descriptors: Diagnostic Tests, Test Validity, Item Response Theory, Content Validity
Peer reviewed
de Ruiter, Laura E.; Bers, Marina U. – Computer Science Education, 2022
Background and Context: Despite the increasing implementation of coding in early curricula, there are few valid and reliable assessments of coding abilities for young children. This impedes studying learning outcomes and the development and evaluation of curricula. Objective: Developing and validating a new instrument for assessing young…
Descriptors: Programming Languages, Computer Software, Coding, Computer Science Education
Peer reviewed
Nielsen, Tine – Cogent Education, 2020
Academic self-efficacy is mostly construed as specific: task-specific, course-specific, or domain-specific. Previous research in the Danish university context has shown that the self-efficacy subscale in the Motivated Strategies for Learning Questionnaire is not a single scale, but consists of two separate course- and activity-specific scales; the…
Descriptors: Academic Achievement, Self Efficacy, Test Wiseness, Construct Validity
Peer reviewed
Mehren, Rainer; Rempfler, Armin; Buchholz, Janine; Hartig, Johannes; Ulrich-Riedhammer, Eva M. – Journal of Research in Science Teaching, 2018
Constituting a metacognitive strategy, system competence or systems thinking can only assume its assigned key function as a basic concept for the school subject of geography in Germany after a theoretical and empirical foundation has been established. A measurement instrument is required which is suitable both for supporting students and for the…
Descriptors: Models, Metacognition, Competence, Geography
Peer reviewed
Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020
Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores
Peer reviewed
Bichi, Ado Abdu; Talib, Rohaya – International Journal of Evaluation and Research in Education, 2018
Testing in an educational system performs a number of functions, and the results from a test can be used to make a number of decisions in education. It is therefore well accepted in the education literature that testing is an important element of education. To effectively utilize the tests in educational policies and quality assurance its validity and…
Descriptors: Item Response Theory, Test Items, Test Construction, Decision Making
Peer reviewed
Longabach, Tanya; Peyton, Vicki – Language Testing, 2018
K-12 English language proficiency tests that assess multiple content domains (e.g., listening, speaking, reading, writing) often have subsections based on these content domains; scores assigned to these subsections are commonly known as subscores. Testing programs face increasing customer demands for the reporting of subscores in addition to the…
Descriptors: Comparative Analysis, Test Reliability, Second Language Learning, Language Proficiency
Peer reviewed
Lee, Minji K.; Sweeney, Kevin; Melican, Gerald J. – Educational Assessment, 2017
This study investigates the relationships among factor correlations, inter-item correlations, and the reliability estimates of subscores, providing a guideline with respect to psychometric properties of useful subscores. In addition, it compares subscore estimation methods with respect to reliability and distinctness. The subscore estimation…
Descriptors: Scores, Test Construction, Test Reliability, Test Validity