Showing 1 to 15 of 28 results
Peer reviewed
Haixiang Zhang – Structural Equation Modeling: A Multidisciplinary Journal, 2025
Mediation analysis is an important statistical tool in many research fields, where the joint significance test is widely utilized for examining mediation effects. Nevertheless, the limitation of this mediation testing method stems from its conservative Type I error, which reduces its statistical power and imposes certain constraints on its…
Descriptors: Structural Equation Models, Statistical Significance, Robustness (Statistics), Comparative Testing
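The joint significance ("both paths") test discussed in this abstract is simple to state: declare the mediated effect a·b significant only when both the exposure→mediator path a and the mediator→outcome path b are individually significant. A minimal sketch under that definition (the function name and interface are illustrative, not taken from the paper):

```python
from statistics import NormalDist

def joint_significance_test(a, se_a, b, se_b, alpha=0.05):
    """Joint significance test for a mediated effect a*b.

    a, b      : estimated path coefficients
    se_a, se_b: their standard errors
    Returns True only when BOTH paths are individually
    significant at level alpha (two-sided z tests).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(a / se_a) > z_crit and abs(b / se_b) > z_crit

# Both paths clearly nonzero -> mediation flagged
print(joint_significance_test(0.5, 0.1, 0.4, 0.1))
# First path not significant (z = 1) -> no mediation flagged
print(joint_significance_test(0.1, 0.1, 0.4, 0.1))
```

The conservatism the abstract refers to follows from this construction: under the complete null (a = b = 0) the chance that both z tests reject is roughly α² rather than α, so the test's actual Type I error sits below the nominal level in most null configurations.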
Peer reviewed
Wiliam, Dylan – Assessment in Education: Principles, Policy & Practice, 2008
While international comparisons such as those provided by PISA may be meaningful in terms of overall judgements about the performance of educational systems, caution is needed for more fine-grained judgements. In particular, it is argued that using the results of PISA to draw conclusions about the quality of instruction in different systems is…
Descriptors: Test Bias, Test Construction, Comparative Testing, Evaluation
Peer reviewed
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Peer reviewed
Kim, Do-Hong; Huynh, Huynh – Educational and Psychological Measurement, 2008
The current study compared student performance between paper-and-pencil testing (PPT) and computer-based testing (CBT) on a large-scale statewide end-of-course English examination. Analyses were conducted at both the item and test levels. The overall results suggest that scores obtained from PPT and CBT were comparable. However, at the content…
Descriptors: Reading Comprehension, Computer Assisted Testing, Factor Analysis, Comparative Testing
Peer reviewed
Wallach, P. M.; Crespo, L. M.; Holtzman, K. Z.; Galbraith, R. M.; Swanson, D. B. – Advances in Health Sciences Education, 2006
Purpose: In conjunction with curricular changes, a process to develop integrated examinations was implemented. Pre-established guidelines were provided favoring vignettes, clinically relevant material, and application of knowledge rather than simple recall. Questions were read aloud in a committee including all course directors, and a reviewer…
Descriptors: Test Items, Rating Scales, Examiners, Guidelines
Clauser, Brian E.; And Others – 1991
Item bias has been a major concern for test developers during recent years. The Mantel-Haenszel statistic has been among the preferred methods for identifying biased items. The statistic's performance in identifying uniform bias in simulated data modeled by producing various levels of difference in the (item difficulty) b-parameter for reference…
Descriptors: Comparative Testing, Difficulty Level, Item Bias, Item Response Theory
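The Mantel-Haenszel statistic mentioned here pools one 2×2 (group × correct/incorrect) table per matched ability stratum into a common odds-ratio estimate for the studied item; an estimate far from 1.0 flags potential uniform DIF. A minimal sketch of that estimator (variable names are illustrative; operational DIF screening would also compute the accompanying MH chi-square test):

```python
def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio across 2x2 tables.

    Each table is a tuple for one matched score group:
        (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
    Values near 1.0 suggest no uniform DIF on the item;
    values well above or below 1.0 favor one group.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Two strata where both groups answer at the same rate -> ratio near 1.0
balanced = [(30, 10, 30, 10), (20, 20, 20, 20)]
print(mantel_haenszel_odds_ratio(balanced))

# Reference group outperforms matched focal examinees -> ratio above 1.0
favors_ref = [(35, 5, 25, 15), (30, 10, 20, 20)]
print(mantel_haenszel_odds_ratio(favors_ref))
```

Stratifying on total score before pooling is what lets the statistic separate genuine group differences in ability from item-level bias, which is the "uniform bias" scenario the simulation in this entry manipulates through the b-parameter.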
Peer reviewed
Stocking, Martha L.; And Others – Applied Psychological Measurement, 1993
A method of automatically selecting items for inclusion in a test with constraints on item content and statistical properties was applied to real data. Tests constructed manually from the same data and constraints were compared to tests constructed automatically. Results show areas in which automated assembly can improve test construction. (SLD)
Descriptors: Algorithms, Automation, Comparative Testing, Computer Assisted Testing
Ang, Cheng; Miller, M. David – 1993
The power of the procedure of W. Stout to detect deviations from essential unidimensionality in two-dimensional data was investigated for minor, moderate, and large deviations from unidimensionality using criteria for deviations from unidimensionality based on prior research. Test lengths of 20 and 40 items and sample sizes of 700 and 1,500 were…
Descriptors: Ability, Comparative Testing, Correlation, Item Response Theory
Peer reviewed
Squires, David; Trevisan, Michael S.; Canney, George F. – Studies in Educational Evaluation, 2006
The Idaho Comprehensive Literacy Assessment (ICLA) is a faculty-developed, state-wide, high-stakes assessment of pre-service teachers' knowledge and application of research based literacy practices. The literacy faculty control all aspects of the test, including construction, refinement, administration, scoring and reporting. The test development…
Descriptors: Test Construction, Comparative Testing, Investigations, Test Reliability
Barr, James E.; Rasor, Richard A.; Grill, Cathie – 2002
This document addresses how well ARC's computerized placement tests (Compass) assist individuals in reaching informed decisions about enrolling in selected courses, including English composition, reading, mathematics, and ESL. The document addresses the question of whether Compass scores add any relevant information in the decision-making process…
Descriptors: Academic Standards, Cognitive Processes, Community Colleges, Comparative Testing
Peer reviewed
Wainer, Howard; And Others – Journal of Educational Measurement, 1992
Computer simulations were run to measure the relationship between testlet validity and factors of item pool size and testlet length for both adaptive and linearly constructed testlets. Making a testlet adaptive yields only modest increases in aggregate validity because of the peakedness of the typical proficiency distribution. (Author/SLD)
Descriptors: Adaptive Testing, Comparative Testing, Computer Assisted Testing, Computer Simulation
Peer reviewed
Crehan, Kevin D.; And Others – Educational and Psychological Measurement, 1993
Studies with 220 college students found that multiple-choice test items with three options are more difficult than those with four options, and items with a none-of-these option are more difficult than those without it. Neither format manipulation affected item discrimination. Implications for test construction are discussed. (SLD)
Descriptors: College Students, Comparative Testing, Difficulty Level, Distractors (Tests)
Nandakumar, Ratna – 1992
The performance of the following four methodologies for assessing unidimensionality was examined: (1) DIMTEST; (2) the approach of P. W. Holland and P. R. Rosenbaum; (3) linear factor analysis; and (4) non-linear factor analysis. Each method is examined and compared with other methods using simulated data sets and real data sets. Seven data sets,…
Descriptors: Ability, Comparative Testing, Correlation, Equations (Mathematics)
Wiggins, Grant – Executive Educator, 1994
Instead of relying on standardized test scores and interdistrict comparisons, school systems must develop a more powerful, timely, and local approach to accountability that is truly client-centered and focused on results. Accountability requires giving successful teachers the freedom and opportunity to take effective ideas beyond their own…
Descriptors: Accountability, Comparative Testing, Elementary Secondary Education, Feedback
Babcock, Judith L.; And Others – 1992
This study used multiple methods to assess basic community needs and attributes of community atmosphere (cohesion, religious involvement, and recreational activities) in two psychometric studies. Part 1 revised self-report community assessment measures, developed multi-item scales for each construct, and tested reliabilities and factor structures…
Descriptors: Community Needs, Community Organizations, Community Programs, Comparative Testing