ERIC - Search Results

Showing all 15 results

Development of Information Functions and Indices for the GGUM-RANK Multidimensional Forced Choice IRT Model

Peer reviewed

Joo, Seang-Hwane; Lee, Philseok; Stark, Stephen – Journal of Educational Measurement, 2018

This research derived information functions and proposed new scalar information indices to examine the quality of multidimensional forced choice (MFC) items based on the RANK model. We also explored how GGUM-RANK information, latent trait recovery, and reliability varied across three MFC formats: pairs (two response alternatives), triplets (three…

Descriptors: Item Response Theory, Models, Item Analysis, Reliability

The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments

Peer reviewed

Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018

Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…

Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring

Improving the Reliability of Student Scores from Speeded Assessments: An Illustration of Conditional Item Response Theory Using a Computer-Administered Measure of Vocabulary

Peer reviewed

Petscher, Yaacov; Mitchell, Alison M.; Foorman, Barbara R. – Reading and Writing: An Interdisciplinary Journal, 2015

A growing body of literature suggests that response latency, the amount of time it takes an individual to respond to an item, may be an important factor to consider when using assessment data to estimate the ability of an individual. Considering that tests of passage and list fluency are being adapted to a computer administration format, it is…

Descriptors: Computer Assisted Testing, Vocabulary, Item Response Theory, Reliability

Improving Measurement Precision of Hierarchical Latent Traits Using Adaptive Testing

Peer reviewed

Wang, Chun – Journal of Educational and Behavioral Statistics, 2014

Many latent traits in social sciences display a hierarchical structure, such as intelligence, cognitive ability, or personality. Usually a second-order factor is linearly related to a group of first-order factors (also called domain abilities in cognitive ability measures), and the first-order factors directly govern the actual item responses.…

Descriptors: Measurement, Accuracy, Item Response Theory, Adaptive Testing

Investigating the Application of Automated Writing Evaluation to Chinese Undergraduate English Majors: A Case Study of "WriteToLearn"

Peer reviewed

Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016

This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…

Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning

Computer-Based Assessment of School Readiness and Early Reasoning

Peer reviewed

Csapó, Beno; Molnár, Gyöngyvér; Nagy, József – Journal of Educational Psychology, 2014

This study explores the potential of using online tests for the assessment of school readiness and for monitoring early reasoning. Four tests of a face-to-face-administered school readiness test battery (speech sound discrimination, relational reasoning, counting and basic numeracy, and deductive reasoning) and a paper-and-pencil inductive…

Descriptors: Computer Assisted Testing, School Readiness, Thinking Skills, Abstract Reasoning

Entering the "New Frontier" of Mathematics Assessment: Designing and Trialling the PVAT-O (Online)

Rogers, Angela – Mathematics Education Research Group of Australasia, 2013

As we move into the 21st century, educationalists are exploring the myriad of possibilities associated with Computer Based Assessment (CBA). At first glance this mode of assessment seems to provide many exciting opportunities in the mathematics domain, yet one must question the validity of CBA and whether our school systems, students and teachers…

Descriptors: Mathematics Tests, Student Evaluation, Computer Assisted Testing, Test Validity

The Value of Item Response Theory in Clinical Assessment: A Review

Peer reviewed

Thomas, Michael L. – Assessment, 2011

Item response theory (IRT) and related latent variable models represent modern psychometric theory, the successor to classical test theory in psychological assessment. Although IRT has become prevalent in the measurement of ability and achievement, its contributions to clinical domains have been less extensive. Applications of IRT to clinical…

Descriptors: Item Response Theory, Psychological Evaluation, Reliability, Error of Measurement

Developments in Measuring Functional Activities: Where Do We Go with the PEDI-CAT?

Peer reviewed

Ketelaar, Marjolijn; Wassenberg-Severijnen, Jeltje – Physical & Occupational Therapy in Pediatrics, 2010

During the past 30 years many pediatric assessment and outcome measures have been developed. Based on Rasch analysis, the Pediatric Evaluation of Disability Inventory (PEDI) was designed to measure functional status by asking parents about both the skills of their children and the performance of daily tasks in three functionally important domains…

Descriptors: Cues, Behavior Problems, Independent Living, Patients

A Monte Carlo Simulation Investigating the Validity and Reliability of Ability Estimation in Item Response Theory with Speeded Computer Adaptive Tests

Peer reviewed

Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010

Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…

Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing

The Comparability of Marking on Screen and on Paper: The Case of Liberal Studies in Hong Kong

Peer reviewed

Coniam, David – Journal of Educational Technology Systems, 2011

This article details an investigation into the onscreen marking (OSM) of Liberal Studies (LS) in Hong Kong--where paper-based marking (PBM) of public examinations is being phased out and wholly superseded by OSM. The study involved 14 markers who had previously rated Liberal Studies scripts on screen in the 2009 Hong Kong Advanced Level…

Descriptors: Foreign Countries, Computer Assisted Testing, Educational Technology, Comparative Analysis

IRT Analysis of General Outcome Measures in Grades 1-8. Technical Report # 0916

Alonzo, Julie; Anderson, Daniel; Tindal, Gerald – Behavioral Research and Teaching, 2009

We present scaling outcomes for mathematics assessments used in the fall to screen students at risk of failing to learn the knowledge and skills described in the National Council of Teachers of Mathematics (NCTM) Focal Point Standards. At each grade level, the assessment consisted of a 48-item test with three 16-item sub-test sets aligned to the…

Descriptors: At Risk Students, Mathematics Teachers, National Standards, Item Response Theory

Response Time Effort: A New Measure of Examinee Motivation in Computer-Based Tests

Peer reviewed

Wise, Steven L.; Kong, Xiaojing – Applied Measurement in Education, 2005

When low-stakes assessments are administered, the degree to which examinees give their best effort is often unclear, complicating the validity and interpretation of the resulting test scores. This study introduces a new method, based on item response time, for measuring examinee test-taking effort on computer-based test items. This measure, termed…

Descriptors: Psychometrics, Validity, Reaction Time, Test Items

Measuring General Self-Efficacy: A Comparison of Three Measures Using Item Response Theory

Peer reviewed

Scherbaum, Charles A.; Cohen-Charash, Yochi; Kern, Michael J. – Educational and Psychological Measurement, 2006

General self-efficacy (GSE), individuals' belief in their ability to perform well in a variety of situations, has been the subject of increasing research attention. However, the psychometric properties (e.g., reliability, validity) associated with the scores on GSE measures have been criticized, which has hindered efforts to further establish the…

Descriptors: Self Efficacy, Measures (Individuals), Psychometrics, Reliability

The Validity and Reliability of Concept Mapping as an Alternative Science Assessment when Item Response Theory Is Used for Scoring.

Liu, Xiufeng – 1994

Problems of validity and reliability of concept mapping are addressed by using item-response theory (IRT) models for scoring. In this study, the overall structure of students' concept maps is defined by the number of links, the number of hierarchies, the number of cross-links, and the number of examples. The study was conducted with 92 students…

Descriptors: Alternative Assessment, Computer Assisted Testing, Concept Mapping, Correlation
