ERIC - Search Results

Showing all 13 results

The Effect of Anchor Test Construction on Scale Drift

Peer reviewed

Antal, Judit; Proctor, Thomas P.; Melican, Gerald J. – Applied Measurement in Education, 2014

In common-item equating the anchor block is generally built to represent a miniature form of the total test in terms of content and statistical specifications. The statistical properties frequently reflect equal mean and spread of item difficulty. Sinharay and Holland (2007) suggested that the requirement for equal spread of difficulty may be too…

Descriptors: Test Items, Equated Scores, Difficulty Level, Item Response Theory

A Comparison of Exposure Control Procedures in CAT Systems Based on Different Measurement Models for Testlets

Peer reviewed

Boyd, Aimee M.; Dodd, Barbara; Fitzpatrick, Steven – Applied Measurement in Education, 2013

This study compared several exposure control procedures for CAT systems based on the three-parameter logistic testlet response theory model (Wang, Bradlow, & Wainer, 2002) and Masters' (1982) partial credit model when applied to a pool consisting entirely of testlets. The exposure control procedures studied were the modified within 0.10 logits…

Descriptors: Computer Assisted Testing, Item Response Theory, Test Construction, Models

Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016

Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis

Evidence-Centered Assessment Design and the Advanced Placement Program[R]: A Psychometrician's Perspective

Peer reviewed

Brennan, Robert L. – Applied Measurement in Education, 2010

This paper provides an overview of evidence-centered assessment design (ECD) and some general information about the Advanced Placement (AP[R]) Program. Then the papers in this special issue are discussed, as they relate to the use of ECD in the revision of various AP tests. This paper concludes with some observations about the need to validate…

Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

Item Position and Item Difficulty Change in an IRT-Based Common Item Equating Design

Peer reviewed

Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009

In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…

Descriptors: Test Items, Test Content, Testing Programs, Simulation

A Qualitative Investigation of Panelists' Experiences of Standard Setting Using Two Variations of the Bookmark Method

Peer reviewed

Hein, Serge F.; Skaggs, Gary E. – Applied Measurement in Education, 2009

Only a small number of qualitative studies have investigated panelists' experiences during standard-setting activities or the thought processes associated with panelists' actions. This qualitative study involved an examination of the experiences of 11 panelists who participated in a prior, one-day standard-setting meeting in which either the…

Descriptors: Focus Groups, Standard Setting, Cutting Scores, Cognitive Processes

The Influence of Changes in Assessment Design on the Psychometric Quality of Scores.

Peer reviewed

Wolfe, Edward W.; Gitomer, Drew H. – Applied Measurement in Education, 2001

This study attempted to improve the measurement quality of a complex performance assessment through principled assessment design, using the example of the National Board for Professional Teaching Standards Early Childhood/Generalist examination. All indexes examined improved after revisions were made. Results show the importance of attention to assessment…

Descriptors: Change, Performance Based Assessment, Psychometrics, Scores

Connotatively Inconsistent Test Items.

Peer reviewed

Chang, Lei – Applied Measurement in Education, 1995

A test item is defined as connotatively consistent (CC) or connotatively inconsistent (CI) when its connotation agrees with or contradicts that of the majority of items on a test. CC and CI items were examined in the Life Orientation Test and were shown to measure correlated but distinct traits. (SLD)

Descriptors: Attitude Measures, College Students, Higher Education, Personality Measures

Defending a State Graduation Test: "GI Forum v. Texas Education Agency." Measurement Perspectives from an External Evaluator.

Peer reviewed

Mehrens, William A. – Applied Measurement in Education, 2000

Presents the conclusions of an independent measurement expert that the Texas Assessment of Academic Skills (TAAS) was constructed according to acceptable professional standards and tests curricular material that the Texas Board of Education considers important for graduates to have mastered. Also supports the validity and reliability of the TAAS and…

Descriptors: Curriculum, Psychometrics, Reliability, Standards

Constructing a Computerized Adaptive Test for University Applicants with Disabilities

Peer reviewed

Moshinsky, Avital; Kazin, Cathrael – Applied Measurement in Education, 2005

In recent years, there has been a large increase in the number of university applicants requesting special accommodations for university entrance exams. The Israeli National Institute for Testing and Evaluation (NITE) administers a Psychometric Entrance Test (comparable to the Scholastic Assessment Test in the United States) to assist universities…

Descriptors: Foreign Countries, Psychometrics, Disabilities, Testing Accommodations

Comparability of Bilingual Versions of Assessments: Sources of Incomparability of English and French Versions of Canada's National Achievement Tests

Peer reviewed

Ercikan, Kadriye; Gierl, Mark J.; McCreith, Tanya; Puhan, Gautam; Koh, Kim – Applied Measurement in Education, 2004

This research examined the degree of comparability and sources of incomparability of English and French versions of reading, mathematics, and science tests that were administered as part of a survey of achievement in Canada. The results point to substantial psychometric differences between the 2 language versions. Approximately 18% to 36% of the…

Descriptors: Foreign Countries, Psychometrics, Science Tests, French

The Potential of Criterion-Referenced Tests with Projected Norms.

Peer reviewed

Behuniak, Peter; Tucker, Charlene – Applied Measurement in Education, 1992

Psychometrically linking a state criterion-referenced test (CRT) and a norm-referenced test (NRT) to yield NRT information through the CRT was studied with samples of 1,500 to 3,000 elementary school students per subject and grade level in Connecticut. A CRT/NRT link can create a focused and coherent assessment system. (SLD)

Descriptors: Content Analysis, Criterion Referenced Tests, Educational Assessment, Elementary Education

Computerized-Adaptive and Self-Adapted Music-Listening Tests: Psychometric Features and Motivational Benefits.

Peer reviewed

Vispoel, Walter P.; Coffman, Don D. – Applied Measurement in Education, 1994

Computerized-adaptive (CAT) and self-adapted (SAT) music listening tests were compared for efficiency, reliability, validity, and motivational benefits with 53 junior high school students. Results demonstrate trade-offs, with greater potential motivational benefits for SAT and greater efficiency for CAT. SAT elicited more favorable responses from…

Descriptors: Adaptive Testing, Computer Assisted Testing, Efficiency, Item Response Theory
