Joo, Seang-Hwane; Lee, Philseok; Stark, Stephen – Journal of Educational Measurement, 2018
This research derived information functions and proposed new scalar information indices to examine the quality of multidimensional forced choice (MFC) items based on the RANK model. We also explored how GGUM-RANK information, latent trait recovery, and reliability varied across three MFC formats: pairs (two response alternatives), triplets (three…
Descriptors: Item Response Theory, Models, Item Analysis, Reliability
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R. – Reading and Writing: An Interdisciplinary Journal, 2015
A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is…
Descriptors: Computer Assisted Testing, Vocabulary, Item Response Theory, Reliability
Wang, Chun – Journal of Educational and Behavioral Statistics, 2014
Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…
Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Csapó, Beno; Molnár, Gyöngyvér; Nagy, József – Journal of Educational Psychology, 2014
This study explores the potential of using online tests for the assessment of school readiness and for monitoring early reasoning. Four tests of a face-to-face-administered school readiness test battery (speech sound discrimination, relational reasoning, counting and basic numeracy, and deductive reasoning) and a paper-and-pencil inductive…
Descriptors: Computer Assisted Testing, School Readiness, Thinking Skills, Abstract Reasoning
Rogers, Angela – Mathematics Education Research Group of Australasia, 2013
As we move into the 21st century, educationalists are exploring the myriad of possibilities associated with Computer Based Assessment (CBA). At first glance this mode of assessment seems to provide many exciting opportunities in the mathematics domain, yet one must question the validity of CBA and whether our school systems, students and teachers…
Descriptors: Mathematics Tests, Student Evaluation, Computer Assisted Testing, Test Validity
Thomas, Michael L. – Assessment, 2011
Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…
Descriptors: Item Response Theory, Psychological Evaluation, Reliability, Error of Measurement
Ketelaar, Marjolijn; Wassenberg-Severijnen, Jeltje – Physical & Occupational Therapy in Pediatrics, 2010
During the past 30 years many pediatric assessment and outcome measures have been developed. Based on Rasch analysis, the Pediatric Evaluation of Disability Inventory (PEDI) was designed to measure functional status by asking parents about both the skills of their children and the performance of daily tasks in three functionally important domains…
Descriptors: Cues, Behavior Problems, Independent Living, Patients
Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010
Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…
Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing
Coniam, David – Journal of Educational Technology Systems, 2011
This article details an investigation into the onscreen marking (OSM) of Liberal Studies (LS) in Hong Kong--where paper-based marking (PBM) of public examinations is being phased out and wholly superseded by OSM. The study involved 14 markers who had previously rated Liberal Studies scripts on screen in the 2009 Hong Kong Advanced Level…
Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Comparative Analysis
Alonzo, Julie; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2009
We present scaling outcomes for mathematics assessments used in the fall to screen students at risk of failing to learn the knowledge and skills described in the National Council of Teachers of Mathematics (NCTM) Focal Point Standards. At each grade level, the assessment consisted of a 48-item test with three 16-item sub-test sets aligned to the…
Descriptors: At Risk Students, Mathematics Teachers, National Standards, Item Response Theory
Wise, Steven L.; Kong, Xiaojing – Applied Measurement in Education, 2005
When low-stakes assessments are administered, the degree to which examinees give their best effort is often unclear, complicating the validity and interpretation of the resulting test scores. This study introduces a new method, based on item response time, for measuring examinee test-taking effort on computer-based test items. This measure, termed…
Descriptors: Psychometrics, Validity, Reaction Time, Test Items
Scherbaum, Charles A.; Cohen-Charash, Yochi; Kern, Michael J. – Educational and Psychological Measurement, 2006
General self-efficacy (GSE), individuals' belief in their ability to perform well in a variety of situations, has been the subject of increasing research attention. However, the psychometric properties (e.g., reliability, validity) associated with the scores on GSE measures have been criticized, which has hindered efforts to further establish the…
Descriptors: Self Efficacy, Measures (Individuals), Psychometrics, Reliability
Liu, Xiufeng – 1994
Problems of validity and reliability of concept mapping are addressed by using item-response theory (IRT) models for scoring. In this study, the overall structure of students' concept maps is defined by the number of links, the number of hierarchies, the number of cross-links, and the number of examples. The study was conducted with 92 students…
Descriptors: Alternative Assessment, Computer Assisted Testing, Concept Mapping, Correlation