Showing all 11 results
Peer reviewed
Taylor, Catherine S.; Lee, Yoonsun – Applied Measurement in Education, 2010
Item response theory (IRT) methods are generally used to create score scales for large-scale tests. Research has shown that IRT scales are stable across groups and over time. Most studies have focused on items that are dichotomously scored. Now Rasch and other IRT models are used to create scales for tests that include polytomously scored items.…
Descriptors: Measures (Individuals), Item Response Theory, Robustness (Statistics), Item Analysis
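The contrast between dichotomously and polytomously scored items can be made concrete. The sketch below is a minimal illustration, not taken from the article: it evaluates a two-parameter logistic model (the Rasch model when a = 1) for a dichotomous item and a generalized partial credit model for a polytomous item, with hypothetical parameter values.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic model: probability of a correct response
    to a dichotomously scored item (reduces to the Rasch model when a = 1)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_gpcm(theta, a, steps):
    """Generalized partial credit model: probabilities of each score
    category 0..m for a polytomously scored item.
    `steps` holds the step difficulties b_1..b_m (hypothetical values)."""
    numerators = [1.0]          # category 0 has an empty sum, exp(0) = 1
    running = 0.0
    for b_j in steps:
        running += a * (theta - b_j)
        numerators.append(math.exp(running))
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical item parameters, for illustration only
print(round(p_2pl(theta=0.5, a=1.2, b=0.0), 3))                   # dichotomous item
print([round(p, 3) for p in p_gpcm(0.5, 1.0, [-0.5, 0.4, 1.1])])  # 4-category item
```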
Peer reviewed
Bolt, Sara E.; Ysseldyke, James E. – Applied Measurement in Education, 2006
Although testing accommodations are commonly provided to students with disabilities within large-scale testing programs, research findings on how well accommodations allow for comparable measurement of student knowledge and skill remain inconclusive. The purpose of this study was to examine the extent to which 1 commonly held belief about testing…
Descriptors: Oral Reading, Testing Accommodations, Disabilities, Special Needs Students
Peer reviewed
Hambleton, Ronald K.; Rogers, H. Jane – Applied Measurement in Education, 1989
Item Response Theory and Mantel-Haenszel approaches for investigating differential item performance were compared to assess the level of agreement of the approaches in identifying potentially biased items. Subjects were 2,000 White and 2,000 Native American high school students. The Mantel-Haenszel method provides an acceptable approximation of…
Descriptors: American Indians, Comparative Testing, High School Students, High Schools
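For readers unfamiliar with the Mantel-Haenszel approach compared here, the sketch below shows the standard common odds ratio across matched score levels and its ETS delta-scale transformation (MH D-DIF). It is a generic illustration with hypothetical counts, not the article's analysis.

```python
import math

def mantel_haenszel_dif(tables):
    """Mantel-Haenszel common odds ratio across matched total-score levels.
    Each table is (ref_correct, ref_wrong, focal_correct, focal_wrong).
    Returns (alpha_MH, MH D-DIF on the ETS delta scale)."""
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        t = a + b + c + d
        num += a * d / t      # reference-right, focal-wrong
        den += b * c / t      # reference-wrong, focal-right
    alpha = num / den
    mh_d_dif = -2.35 * math.log(alpha)   # delta-scale transformation
    return alpha, mh_d_dif

# Hypothetical counts at three matched total-score levels
tables = [(40, 10, 30, 20), (55, 15, 45, 25), (60, 5, 50, 15)]
print(mantel_haenszel_dif(tables))
```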
Peer reviewed
Schaefer, Lyn; And Others – Applied Measurement in Education, 1992
Studied methods for structuring a performance domain for a certification test in emergency nursing based on task frequency ratings from 659 emergency nurses or task similarity ratings from 21 experts. A 125-item job analysis survey was used. Similarity judgment results were more easily interpreted and were adequately modeled by multivariate analysis. (SLD)
Descriptors: Certification, Comparative Testing, Job Analysis, Licensing Examinations (Professions)
Peer reviewed
Frary, Robert B. – Applied Measurement in Education, 1991
The use of the "none-of-the-above" option (NOTA) in 20 college-level multiple-choice tests was evaluated for classes with 100 or more students. Eight academic disciplines were represented, and 295 NOTA and 724 regular test items were used. It appears that the NOTA can be compatible with good classroom measurement. (TJH)
Descriptors: College Students, Comparative Testing, Difficulty Level, Discriminant Analysis
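The NOTA comparison rests on classical item statistics. As a minimal sketch with hypothetical data (not Frary's), the code below computes an item's difficulty (proportion correct) and its point-biserial discrimination against total score.

```python
from statistics import mean, pstdev

def item_statistics(item_scores, total_scores):
    """Classical item statistics: difficulty (proportion correct) and
    point-biserial discrimination, i.e. the Pearson correlation of the
    0/1 item score with the total test score."""
    p = mean(item_scores)                      # item difficulty
    mt = mean(total_scores)
    cov = mean((x - p) * (y - mt) for x, y in zip(item_scores, total_scores))
    r_pb = cov / (pstdev(item_scores) * pstdev(total_scores))
    return p, r_pb

# Hypothetical data: one item's 0/1 scores and the examinees' total scores
item = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
totals = [18, 9, 15, 20, 11, 17, 16, 8, 19, 14]
print(item_statistics(item, totals))
```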
Peer reviewed
Stone, Clement A.; Lane, Suzanne – Applied Measurement in Education, 1991
A model-testing approach for evaluating the stability of item response theory item parameter estimates (IPEs) in a pretest-posttest design is illustrated. Nineteen items from the Head Start Measures Battery were used. A moderately high degree of stability in the IPEs for 5,510 children assessed on 2 occasions was found. (TJH)
Descriptors: Comparative Testing, Compensatory Education, Computer Assisted Testing, Early Childhood Education
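Stability of item parameter estimates across occasions is often summarized more simply than by the model-testing approach described here: rescale the second set of difficulty estimates onto the first metric (mean-sigma linking) and examine agreement. The sketch below uses hypothetical values and is not the Head Start Measures Battery analysis.

```python
from statistics import mean, pstdev

def mean_sigma_link(b_time1, b_time2):
    """Mean-sigma linking: rescale time-2 difficulty estimates onto the
    time-1 metric, then summarize agreement with a correlation and RMSD."""
    A = pstdev(b_time1) / pstdev(b_time2)
    B = mean(b_time1) - A * mean(b_time2)
    b2_linked = [A * b + B for b in b_time2]

    m1, m2 = mean(b_time1), mean(b2_linked)
    cov = mean((x - m1) * (y - m2) for x, y in zip(b_time1, b2_linked))
    r = cov / (pstdev(b_time1) * pstdev(b2_linked))
    rmsd = mean((x - y) ** 2 for x, y in zip(b_time1, b2_linked)) ** 0.5
    return r, rmsd

# Hypothetical pretest/posttest difficulty estimates for a few items
b1 = [-1.2, -0.4, 0.1, 0.7, 1.3]
b2 = [-1.0, -0.3, 0.2, 0.9, 1.5]
print(mean_sigma_link(b1, b2))
```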
Peer reviewed
Haladyna, Thomas A. – Applied Measurement in Education, 1992
Several multiple-choice item formats are examined in the current climate of test reform. The reform movement is discussed as it affects use of the following formats: (1) complex multiple-choice; (2) alternate choice; (3) true-false; (4) multiple true-false; and (5) the context dependent item set. (SLD)
Descriptors: Cognitive Psychology, Comparative Testing, Context Effect, Educational Change
Peer reviewed
Royer, James M.; Carlo, Maria S. – Applied Measurement in Education, 1991
Measures of linguistic competence for limited-English-proficient students are discussed. The results for 134 students in grades 3 through 6 from a study of the reliability and validity of the Sentence Verification Technique tests as measures of listening and reading comprehension performance in native languages and English are reported. (TJH)
Descriptors: Bilingual Education, Comparative Testing, Elementary Education, Elementary School Students
Peer reviewed
Rogers, W. Todd; Bateson, David J. – Applied Measurement in Education, 1991
The influence of test wiseness on the performance of 736 high school seniors in British Columbia on provincial school leaving examinations in English, algebra, geography, history, biology, and chemistry was studied. The performance of many students on the multiple-choice sections was spuriously enhanced by test wiseness. (TJH)
Descriptors: Comparative Testing, Foreign Countries, Grade 12, Graduation Requirements
Peer reviewed
Forsyth, Robert A.; And Others – Applied Measurement in Education, 1992
Eighth grade teachers in three local school districts helped customize two standardized norm-referenced tests for ninth graders to investigate effects of deleting some items and adding locally constructed items. Results indicate that percentile ranks for the customized tests could be very different from those for the complete test. (SLD)
Descriptors: Adaptive Testing, Comparative Testing, Elementary Secondary Education, Grade 9
Peer reviewed
Davey, Beth; Macready, George B. – Applied Measurement in Education, 1990
The usefulness of latent class modeling in addressing several measurement issues is demonstrated via a study of 74 good and 74 poor readers in grades 5 and 6. Procedures were particularly useful for assessing the hierarchical relation among skills and for exploring issues related to item domains. (SLD)
Descriptors: Comparative Testing, Elementary School Students, Grade 5, Grade 6
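As background for the latent class modeling mentioned above, the sketch below evaluates an unconstrained two-class model for dichotomous items and the posterior class membership for a response pattern; the class weights and item probabilities are hypothetical, not the study's estimates.

```python
def pattern_probability(x, class_weights, item_probs):
    """Unconstrained latent class model for dichotomous items:
    P(x) = sum_c pi_c * prod_j p_cj^x_j * (1 - p_cj)^(1 - x_j).
    Returns the pattern probability and the posterior class probabilities."""
    joint = []
    for pi_c, p_c in zip(class_weights, item_probs):
        lik = pi_c
        for x_j, p_cj in zip(x, p_c):
            lik *= p_cj if x_j == 1 else (1.0 - p_cj)
        joint.append(lik)
    total = sum(joint)
    posteriors = [j / total for j in joint]     # P(class | x)
    return total, posteriors

# Hypothetical two-class solution for a four-item skill cluster:
# a "master" class with high item-success probabilities and a "non-master" class
weights = [0.6, 0.4]
probs = [[0.9, 0.85, 0.8, 0.75],   # masters
         [0.2, 0.25, 0.3, 0.15]]   # non-masters
print(pattern_probability([1, 1, 0, 1], weights, probs))
```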