ERIC - Search Results

Showing 1 to 15 of 20 results

Scoring Graphical Responses in TIMSS 2019 Using Artificial Neural Networks

Peer reviewed

von Davier, Matthias; Tyack, Lillian; Khorramdel, Lale – Educational and Psychological Measurement, 2023

Automated scoring of free drawings or images as responses has yet to be used in large-scale assessments of student achievement. In this study, we propose artificial neural networks to classify these types of graphical responses from a TIMSS 2019 item. We are comparing classification accuracy of convolutional and feed-forward approaches. Our…

Descriptors: Scoring, Networks, Artificial Intelligence, Elementary Secondary Education

Validation of Automated Scoring of Oral Reading

Peer reviewed

Balogh, Jennifer; Bernstein, Jared; Cheng, Jian; Van Moere, Alistair; Townshend, Brent; Suzuki, Masanori – Educational and Psychological Measurement, 2012

A two-part experiment is presented that validates a new measurement tool for scoring oral reading ability. Data collected by the U.S. government in a large-scale literacy assessment of adults were analyzed by a system called VersaReader that uses automatic speech recognition and speech processing technologies to score oral reading fluency. In the…

Descriptors: Reading Fluency, Measures (Individuals), Scoring, Reading Ability

Automated Scoring of Teachers' Open-Ended Responses to Video Prompts: Bringing the Classroom-Video-Analysis Assessment to Scale

Peer reviewed
PDF on ERIC

Kersting, Nicole B.; Sherin, Bruce L.; Stigler, James W. – Educational and Psychological Measurement, 2014

In this study, we explored the potential for machine scoring of short written responses to the Classroom-Video-Analysis (CVA) assessment, which is designed to measure teachers' usable mathematics teaching knowledge. We created naïve Bayes classifiers for CVA scales assessing three different topic areas and compared computer-generated scores to…

Descriptors: Scoring, Automation, Video Technology, Teacher Evaluation

Reducing the Cognitive Complexity Associated with Standard Setting: A Comparison of the Single-Passage Bookmark and Yes/No Methods

Peer reviewed

Skaggs, Gary; Hein, Serge F. – Educational and Psychological Measurement, 2011

Judgmental standard setting methods have been criticized for the cognitive complexity of the judgment task that panelists are asked to complete. This study compared two methods designed to reduce this complexity: the yes/no method and the single-passage bookmark method. Two mock standard setting panel meetings were convened, one for each method,…

Descriptors: Standard Setting (Scoring), Methods, Cutting Scores, Experienced Teachers

DIF Trees: Using Classification Trees to Detect Differential Item Functioning

Peer reviewed

Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010

A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)

Rasch Modeling of the Self-Deception Scale of the Balanced Inventory of Desirable Responding

Peer reviewed

Cervellione, Kelly L.; Lee, Young-Sun; Bonanno, George A. – Educational and Psychological Measurement, 2009

Self-deception has become a construct of great interest in individual differences research because it has been associated with levels of resilience and mental health. The Balanced Inventory of Desirable Responding (BIDR) is a self-report measure used for quantifying self-deception. In this study we used Rasch modeling to examine the properties of…

Descriptors: Personality Measures, Personality Traits, Deception, Item Response Theory

A Comparison of Three Methods of Scoring True-False Tests.

Peer reviewed

Hsu, Louis M. – Educational and Psychological Measurement, 1979

Though the Paired-Item-Score (Eakin and Long) (EJ 174 780) method of scoring true-false tests has certain advantages over the traditional scoring methods (percentage right and right minus wrong), these advantages are attained at the cost of a larger risk of misranking the examinees. (Author/BW)

Descriptors: Comparative Analysis, Guessing (Tests), Objective Tests, Probability

An Empirical Comparison of Cutoff Score Methods for Content-Related and Criterion-Related Validity Settings.

Peer reviewed

Woehr, David J.; And Others – Educational and Psychological Measurement, 1991

Methods for setting cutoff scores based on criterion performance, normative comparison, and absolute judgment were compared for scores on a multiple-choice psychology examination for 121 undergraduates and 251 undergraduates as a comparison group. All methods fell within the standard error of measurement. Implications of differences for decision…

Descriptors: Comparative Analysis, Concurrent Validity, Content Validity, Cutting Scores

A Comparative Test of Magnitude Estimation and Pair-Comparison Treatment of Complete Ranks for Scaling a Small Number of Equal-Interval Frequency Response Anchors.

Peer reviewed

Schriesheim, Chester A.; Gardiner, Claudia C. – Educational and Psychological Measurement, 1992

Whether previously noted differences in 2 sets of recommended 5-point equal-interval response anchors could have been caused by scaling too many stimuli at once was studied for scores of 110 college students. A comparison of Magnitude Estimation (MET) and Thurstone Case III illustrates the advantages of MET. (SLD)

Descriptors: College Students, Comparative Analysis, Estimation (Mathematics), Higher Education

An Empirical Investigation Comparing the Effectiveness of Four Scoring Strategies on the Kuder Occupational Interest Survey Form DD

Peer reviewed

Olejnik, Stephen; Porter, Andrew C. – Educational and Psychological Measurement, 1975

The four scoring strategies compared were: lambda coefficients, chi-square weights, and two applications of multiple discriminant analysis. No significant differences were found when applied to the Kuder Occupational Interest Survey. (RC)

Descriptors: Analysis of Variance, Comparative Analysis, Discriminant Analysis, Interest Inventories

The Effect of Double Standardized Scoring on the Semantic Differential

Peer reviewed

Haynes, Jack R. – Educational and Psychological Measurement, 1975

Descriptors: Classification, Comparative Analysis, Factor Analysis, Factor Structure

A Preliminary Investigation of Two Procedures for Setting Examination Standards

Peer reviewed

Andrew, Barbara J.; Hecht, James T. – Educational and Psychological Measurement, 1976

Results suggest that different groups of judges do set similar examination standards when using the same procedure, and that the average of individual judgments does not differ significantly from group consensus judgments. Significant differences were found, however, between the standards set by the two procedures employed. (RC)

Descriptors: Comparative Analysis, Cutting Scores, Multiple Choice Tests, Pass Fail Grading

On Bounds for the Average Correlation Between Subtest Scores in Ipsatively Scored Tests

Peer reviewed

Gleser, Leon Jay – Educational and Psychological Measurement, 1972

The paper is concerned with the effect that ipsative scoring has on a commonly used index of between-subtest correlation. (Author)

Descriptors: Comparative Analysis, Forced Choice Technique, Mathematical Applications, Measurement Techniques

An Investigation of Procedures for Developing and Validating the Classroom Attitude Measure Toward Educational Inquiry

Peer reviewed

Stauffer, A. J. – Educational and Psychological Measurement, 1974

Descriptors: Attitude Change, Attitude Measures, Comparative Analysis, Educational Research

The Validation of the Scoring Criterion Problem in Multiplicative Two-Dimensional Classification Tasks.

Peer reviewed

Kingma, Johannes; TenVergert, Elisabeth M. – Educational and Psychological Measurement, 1987

Two studies investigated the functional equivalence of three different scoring systems used to assess the child's ability to understand and carry out multiplicative classification tasks. All three scoring criteria produced reliable and homogeneous tests. Their factor matrices were similar, and the corresponding factor structures were invariant…

Descriptors: Classification, Cognitive Measurement, Comparative Analysis, Developmental Tasks
