Publication Date
In 2025 | 3 |
Since 2024 | 18 |
Descriptor
Test Length | 18 |
Foreign Countries | 9 |
Test Construction | 7 |
Test Validity | 6 |
Accuracy | 5 |
Computer Assisted Testing | 5 |
Item Response Theory | 5 |
Test Reliability | 5 |
High Stakes Tests | 4 |
Language Tests | 4 |
Sample Size | 4 |
More ▼ |
Source
Author
Allan S. Cohen | 1 |
Chenchen Ma | 1 |
Chia-Ying Chu | 1 |
Chieh-An Chen | 1 |
Chun Wang | 1 |
Dubravka Svetina Valdivia | 1 |
Ebru Dogruöz | 1 |
Felipe Espinosa Parra | 1 |
Félix González-Carrasco | 1 |
Gongjun Xu | 1 |
Hakyung Sung | 1 |
More ▼ |
Publication Type
Journal Articles | 16 |
Reports - Research | 16 |
Dissertations/Theses -… | 1 |
Reports - Descriptive | 1 |
Tests/Questionnaires | 1 |
Education Level
Audience
Location
China | 4 |
Australia | 1 |
California | 1 |
Canada | 1 |
Germany | 1 |
Ireland | 1 |
Japan | 1 |
New Zealand | 1 |
Norway | 1 |
Poland | 1 |
Singapore | 1 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
Peabody Picture Vocabulary… | 1 |
Program for International… | 1 |
Trends in International… | 1 |
Woodcock Johnson Tests of… | 1 |
What Works Clearinghouse Rating
Dubravka Svetina Valdivia; Shenghai Dai – Journal of Experimental Education, 2024
Applications of polytomous IRT models in applied fields (e.g., health, education, psychology) are abound. However, little is known about the impact of the number of categories and sample size requirements for precise parameter recovery. In a simulation study, we investigated the impact of the number of response categories and required sample size…
Descriptors: Item Response Theory, Sample Size, Models, Classification
Félix González-Carrasco; Felipe Espinosa Parra; Izaskun Álvarez-Aguado; Sebastián Ponce Olguín; Vanessa Vega Córdova; Miguel Roselló-Peñaloza – British Journal of Learning Disabilities, 2025
Background: The study focuses on the need to optimise assessment scales for support needs in individuals with intellectual and developmental disabilities. Current scales are often lengthy and redundant, leading to exhaustion and response burden. The goal is to use machine learning techniques, specifically item-reduction methods and selection…
Descriptors: Artificial Intelligence, Intellectual Disability, Developmental Disabilities, Individual Needs
Chia-Ying Chu; Pei-Hua Chen; Yi-Shin Tsai; Chieh-An Chen; Yi-Chih Chan; Yan-Jhe Ciou – Journal of Deaf Studies and Deaf Education, 2024
This study investigated the impact of language sample length on mean length of utterance (MLU) and aimed to determine the minimum number of utterances required for a reliable MLU. Conversations were collected from Mandarin-speaking, hard-of-hearing and typical-hearing children aged 16-81 months. The MLUs were calculated using sample sizes ranging…
Descriptors: Foreign Countries, Mandarin Chinese, Young Children, Language Acquisition
Ebru Dogruöz; Hülya Kelecioglu – International Journal of Assessment Tools in Education, 2024
In this research, multistage adaptive tests (MST) were compared according to sample size, panel pattern and module length for top-down and bottom-up test assembly methods. Within the scope of the research, data from PISA 2015 were used and simulation studies were conducted according to the parameters estimated from these data. Analysis results for…
Descriptors: Adaptive Testing, Test Construction, Foreign Countries, Achievement Tests
Handan Narin Kiziltan; Hatice Cigdem Bulut – International Journal of Assessment Tools in Education, 2024
Mental imagery is a vital cognitive skill that significantly influences how reality is perceived while creating art. Its multifaceted nature reveals various dimensions of creative expression, amplifying the inherent complexities of measuring it. This study aimed to shorten the Mental Imagery Scale in Artistic Creativity (MISAC) via the Ant Colony…
Descriptors: Foreign Countries, Undergraduate Students, Art Education, Imagery
Jing Ma – ProQuest LLC, 2024
This study investigated the impact of scoring polytomous items later on measurement precision, classification accuracy, and test security in mixed-format adaptive testing. Utilizing the shadow test approach, a simulation study was conducted across various test designs, lengths, number and location of polytomous item. Results showed that while…
Descriptors: Scoring, Adaptive Testing, Test Items, Classification
Tom Benton – Research Matters, 2024
Educational assessment is used throughout the world for a range of different formative and summative purposes. Wherever an assessment is developed, whether by a teacher creating a quiz for their class, or by a testing company creating a high stakes assessment, it is necessary to decide how long the test should be. Specifically, how many questions…
Descriptors: Foreign Countries, High Stakes Tests, Test Length, Test Construction
Yi-Jui I. Chen; Yi-Jhen Wu; Yi-Hsin Chen; Robin Irey – Journal of Psychoeducational Assessment, 2025
A short form of the 60-item computer-based orthographic processing assessment (long-form COPA or COPA-LF) was developed. The COPA-LF consists of five skills, including rapid perception, access, differentiation, correction, and arrangement. Thirty items from the COPA-LF were selected for the short-form COPA (COPA-SF) based on cognitive diagnostic…
Descriptors: Computer Assisted Testing, Test Length, Test Validity, Orthographic Symbols
Hakyung Sung; Sooyeon Cho; Kristopher Kyle – Language Assessment Quarterly, 2024
Lexical diversity (LD) is an important indicator of second language lexical development. Much research has investigated LD indices, with a focus on learners of English. However, further research is needed in languages that are typologically distinct from English, such as Korean. In this study, we evaluated the reliability and validity of LD…
Descriptors: Second Language Learning, Korean, Persuasive Discourse, Language Tests
Niclas Larson – Journal of the International Society for Teacher Education, 2024
This paper reports on a revision of the assessment model from the first mathematics course for pre-service teachers (PSTs) aiming for grades 5-10, at a Norwegian university. The weight of the final written exam was reduced and a new, mastery-based testing model, with weekly small tests, was introduced. Results from this study show that the PSTs…
Descriptors: High Stakes Tests, Test Length, Mathematics Tests, Preservice Teachers
Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…
Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
Ying Xu; Xiaodong Li; Jin Chen – Language Testing, 2025
This article provides a detailed review of the Computer-based English Listening Speaking Test (CELST) used in Guangdong, China, as part of the National Matriculation English Test (NMET) to assess students' English proficiency. The CELST measures listening and speaking skills as outlined in the "English Curriculum for Senior Middle…
Descriptors: Computer Assisted Testing, English (Second Language), Language Tests, Listening Comprehension Tests
Nikola Ebenbeck; Markus Gebhardt – Journal of Special Education Technology, 2024
Technologies that enable individualization for students have significant potential in special education. Computerized Adaptive Testing (CAT) refers to digital assessments that automatically adjust their difficulty level based on students' abilities, allowing for personalized, efficient, and accurate measurement. This article examines whether CAT…
Descriptors: Computer Assisted Testing, Students with Disabilities, Special Education, Grade 3
Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…
Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification
Previous Page | Next Page »
Pages: 1 | 2