ERIC - Search Results

Showing all 9 results

Comparison of the Results of the Generalizability Theory with the Inter-Rater Agreement Coefficients

Peer reviewed

Eser, Mehmet Taha; Aksu, Gökhan – International Journal of Curriculum and Instruction, 2022

The agreement between raters is examined within the scope of the concept of "inter-rater reliability". Although there are clear definitions of the concepts of agreement between raters and reliability between raters, there is no clear information about the conditions under which agreement and reliability level methods are appropriate to…

Descriptors: Generalizability Theory, Interrater Reliability, Evaluation Methods, Test Theory

Validation of a Coupled, Multiple Response Assessment for Upper-Division Thermal Physics

Peer reviewed

Rainey, Katherine D.; Vignal, Michael; Wilcox, Bethany R. – Physical Review Physics Education Research, 2022

No assessment instruments are currently available for upper-division thermal physics, though several introductory assessments exist. Notably missing from these introductory assessments are items targeting statistical mechanics. This leaves a gap in the content that can be assessed by upper-division thermal physics faculty.…

Descriptors: Physics, Science Instruction, Thermodynamics, College Science

Using Rasch Measurement to Score, Evaluate, and Improve Examinations in an Anatomy Course

Peer reviewed

Royal, Kenneth D.; Gilliland, Kurt O.; Kernick, Edward T. – Anatomical Sciences Education, 2014

Any examination that involves moderate to high stakes implications for examinees should be psychometrically sound and legally defensible. Currently, there are two broad and competing families of test theories that are used to score examination data. The majority of instructors outside the high-stakes testing arena rely on classical test theory…

Descriptors: Item Response Theory, Scoring, Evaluation Methods, Anatomy

Item Response Theory in Educational Assessment.

Mislevy, Robert J. – 1988

Large-scale educational assessments differ from familiar educational measurements by attempting to provide information about the levels and natures of skills in populations rather than in individuals. That the distinct purposes of assessment require different methodologies than individual measurement was recognized by the development of…

Descriptors: Educational Assessment, Evaluation Methods, Item Analysis, Latent Trait Theory

Assessment of Foundation Knowledge: Are Students Confident in Their Ability?

Peer reviewed

Fenna, Doug S. – European Journal of Engineering Education, 2004

Multiple-choice testing (MCT) has several advantages which are becoming more relevant in the current financial climate. In particular, multiple-choice tests can be machine marked. As an objective testing method, MCT is particularly relevant to engineering and other factual courses, but MCTs are not widely used in engineering because students can benefit from…

Descriptors: Guessing (Tests), Testing, Multiple Choice Tests, Engineering Education

A Theory-Based Comparison of the Reliabilities of Fixed-Length and Trials-to-Criterion Scoring of Physical Education Skills Tests.

Peer reviewed

Feldt, Leonard S.; Spray, Judith A. – Research Quarterly for Exercise and Sport, 1983

The reliabilities of two types of measurement plans were compared across six hypothetical distributions of true scores or abilities. The measurement plans were: (1) fixed-length, where the number of trials for all examinees is set in advance; and (2) trials-to-criterion, where examinees must keep trying until they complete a given number of trials…

Descriptors: Criterion Referenced Tests, Evaluation Methods, Higher Education, Measurement Techniques

Using Cognitive Science to Assign Test Weights.

Peer reviewed

Bhaskar, R.; Dillard, Jesse F. – Instructional Science, 1983

Description of an objective method for assigning weights to questions on examinations includes discussions of classical test theory, knowledge organization, and how task analysis can be used to identify knowledge elements required to solve specific problems, rank them, and assign objective weights to exam questions using a Pareto distribution (7…

Descriptors: Accounting, Epistemology, Evaluation Methods, Item Analysis

Equating Minimum-Competency Tests: Comparison of Methods.

Peer reviewed

Hills, John R.; And Others – Journal of Educational Measurement, 1988

Five methods of equating minimum-competency tests were compared using the Florida Statewide Student Assessment Test, Part II, for 1984 and 1986. Four of five methods yielded essentially comparable results for the highest scoring 84% of the students. Different lengths of anchor items were compared, using the concurrent item response theory equating…

Descriptors: Comparative Analysis, Equated Scores, Evaluation Methods, Graduation Requirements

Bibliography of Papers on Latent Trait Assessment.

Cohen, Allan S., Comp. – 1979

This partially annotated bibliography of journal articles, dissertations, convention papers, research reports, and a few books and unpublished manuscripts provides a comprehensive coverage of work on latent trait theory and practice. Documents are arranged alphabetically by author. The period covered ranges from the early 1950's to the present.…

Descriptors: Attitude Measures, Career Development, Computer Assisted Testing, Computer Programs