ERIC - Search Results

Showing all 12 results

Development of an Indicator for Screening the Dependent Personality Disorder Using Factors of the Dimensional Clinical Personality Inventory 2 in a Brazilian Community Sample

Peer reviewed

André Pereira Gonçalves; Lucas de Francisco Carvalho – International Journal of Testing, 2024

We aimed to verify the capacity of the Dimensional Clinical Personality Inventory 2 (IDCP-2) factors to identify people with high levels of dependent personality disorder (DPD) traits in a Brazilian sample extracted from the general population. Participants were 1469 adults who responded to factors from the IDCP-2, the Personality Inventory for…

Descriptors: Mental Disorders, Screening Tests, Personality Measures, Foreign Countries

Investigating Separate and Concurrent Approaches for Item Parameter Drift in 3PL Item Response Theory Equating

Peer reviewed

Arce-Ferrer, Alvaro J.; Bulut, Okan – International Journal of Testing, 2017

This study examines separate and concurrent approaches to combine the detection of item parameter drift (IPD) and the estimation of scale transformation coefficients in the context of the common item nonequivalent groups design with the three-parameter item response theory equating. The study uses real and synthetic data sets to compare the two…

Descriptors: Item Response Theory, Equated Scores, Identification, Computation

Identifying and Evaluating External Validity Evidence for Passing Scores

Peer reviewed

Davis-Becker, Susan L.; Buckendahl, Chad W. – International Journal of Testing, 2013

A critical component of the standard setting process is collecting evidence to evaluate the recommended cut scores and their use for making decisions and classifying students based on test performance. Kane (1994, 2001) proposed a framework by which practitioners can identify and evaluate evidence of the results of the standard setting from (1)…

Descriptors: Standard Setting (Scoring), Evidence, Validity, Cutting Scores

Standard Setting to an International Reference Framework: Implications for Theory and Practice

Peer reviewed

Lim, Gad S.; Geranpayeh, Ardeshir; Khalifa, Hanan; Buckendahl, Chad W. – International Journal of Testing, 2013

Standard setting theory has largely developed with reference to a typical situation, determining a level or levels of performance for one exam for one context. However, standard setting is now being used with international reference frameworks, where some parameters and assumptions of classical standard setting do not hold. We consider the…

Descriptors: Standard Setting (Scoring), Validity, Models, Language Tests

The Effect of Data Format on Integration of Performance Data into Angoff Judgments

Peer reviewed

Clauser, Brian E.; Mee, Janet; Margolis, Melissa J. – International Journal of Testing, 2013

This study investigated the extent to which the performance data format impacted data use in Angoff standard setting exercises. Judges from two standard settings (a total of five panels) were randomly assigned to one of two groups. The full-data group received two types of data: (1) the proportion of examinees selecting each option and (2) plots…

Descriptors: Standard Setting (Scoring), Cutting Scores, Validity, Reliability

Enhancing the Interpretability of the Overall Results of an International Test of English-Language Proficiency

Peer reviewed

Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015

The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…

Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning

Evaluating the Bookmark Standard Setting Method: The Impact of Random Item Ordering

Peer reviewed

Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011

Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…

Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)

Applying Rasch Model and Generalizability Theory to Study Modified-Angoff Cut Scores

Peer reviewed

Arce, Alvaro J.; Wang, Ze – International Journal of Testing, 2012

The traditional approach to scale modified-Angoff cut scores transfers the raw cuts to an existing raw-to-scale score conversion table. Under the traditional approach, cut scores and conversion table raw scores are not only seen as interchangeable but also as originating from a common scaling process. In this article, we propose an alternative…

Descriptors: Generalizability Theory, Item Response Theory, Cutting Scores, Scaling

Evaluating Panelists' Standard Setting Perceptions in a Developing Nation

Peer reviewed

Ferdous, Abdullah A.; Buckendahl, Chad W. – International Journal of Testing, 2013

Considerable research about standard setting has revolved around a U.S.-centric policy context. That is, over the past decade, conclusions about thought processes and the interaction of education policy and panelists' judgments have been based on assumptions of comparable policy settings. However, whether these assumptions generalize to other…

Descriptors: Standard Setting (Scoring), Cognitive Processes, Mathematics Tests, Language Tests

Impact of Inclusion or Exclusion of Repeaters on Test Equating

Peer reviewed

Puhan, Gautam – International Journal of Testing, 2011

This study examined the effect of including or excluding repeaters on the equating process and results. New forms of two tests were equated to their respective old forms using either all examinees or only the first timer examinees in the new form sample. Results showed that for both tests used in this study, including or excluding repeaters in the…

Descriptors: Equated Scores, Educational Testing, Student Evaluation, Sample Size

Scoring Guide Alignment: Combining Scorer Judgments with Item Parameter Estimates to Set Cut Scores

Peer reviewed

Childs, Ruth A.; Jaciw, Andrew P.; Saunders, Kelsey – International Journal of Testing, 2007

Many approaches to standard-setting use item calibration and student score estimation results to structure panelists' tasks. However, this requires collecting standard-setting judgments after the item analysis results are available. The Scoring Guide Alignment approach collects standard-setting judgments during the scoring sessions from teachers…

Descriptors: Testing Programs, Scoring, Item Analysis, Test Items

A Study of the Criterion Validity of the Mattis Dementia Rating Scale.

Peer reviewed

Fernandez, Alberto Luis; Scheffel, Debora L. – International Journal of Testing, 2003

Evaluated the criterion validity of the Mattis Dementia Rating Scale (S. Mattis, 1988) with a concurrent study to obtain a cut-off score for an Argentinean population by administering a battery of tests to 60 memory disorder patients. Findings demonstrate high convergent validity with another measure and show an appropriate cut score for use with…

Descriptors: Adults, Cognitive Tests, Cutting Scores, Dementia
