Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 14 |
Descriptor
Statistical Analysis | 23 |
Test Construction | 23 |
Test Format | 23 |
Test Items | 14 |
Test Validity | 8 |
Comparative Analysis | 6 |
Equated Scores | 6 |
Foreign Countries | 5 |
Multiple Choice Tests | 5 |
Scores | 5 |
Second Language Learning | 5 |
Author
Abramzon, Andrea | 1 |
Ali, Usama S. | 1 |
Allalouf, Avi | 1 |
Balch, William R. | 1 |
Bande, Rhodora A. | 1 |
Beglar, David | 1 |
Bendulo, Hermabeth O. | 1 |
Benjamin, Roger | 1 |
Bruno, James E. | 1 |
Bulut, Okan | 1 |
Cormier, Damien C. | 1 |
Publication Type
Reports - Research | 18 |
Journal Articles | 16 |
Speeches/Meeting Papers | 3 |
Dissertations/Theses -… | 2 |
Reports - Evaluative | 2 |
Numerical/Quantitative Data | 1 |
Reports - Descriptive | 1 |
Education Level
Higher Education | 5 |
Postsecondary Education | 4 |
Elementary Education | 2 |
High Schools | 1 |
Secondary Education | 1 |
Location
Czech Republic | 1 |
Florida | 1 |
Italy | 1 |
Japan | 1 |
Philippines | 1 |
United States | 1 |
Assessments and Surveys
SAT (College Admission Test) | 3 |
Cognitive Assessment System | 1 |
Flesch Kincaid Grade Level… | 1 |
Kaufman Assessment Battery… | 1 |
Test of English for… | 1 |
Wechsler Intelligence Scale… | 1 |
Woodcock Johnson Tests of… | 1 |
Jiajing Huang – ProQuest LLC, 2022
The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…
Descriptors: Item Response Theory, Test Format, Test Items, Test Construction
Debeer, Dries; Ali, Usama S.; van Rijn, Peter W. – Journal of Educational Measurement, 2017
Test assembly is the process of selecting items from an item pool to form one or more new test forms. Often new test forms are constructed to be parallel with an existing (or an ideal) test. Within the context of item response theory, the test information function (TIF) or the test characteristic curve (TCC) are commonly used as statistical…
Descriptors: Test Format, Test Construction, Statistical Analysis, Comparative Analysis
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Bendulo, Hermabeth O.; Tibus, Erlinda D.; Bande, Rhodora A.; Oyzon, Voltaire Q.; Milla, Norberto E.; Macalinao, Myrna L. – International Journal of Evaluation and Research in Education, 2017
Testing or evaluation in an educational context is primarily used to measure or evaluate and authenticate the academic readiness, learning advancement, acquisition of skills, or instructional needs of learners. This study tried to determine whether the varied combinations of arrangements of options and letter cases in a Multiple-Choice Test (MCT)…
Descriptors: Test Format, Multiple Choice Tests, Test Construction, Eye Movements
Cormier, Damien C.; Bulut, Okan; Singh, Deepak; Kennedy, Kathleen E.; Wang, Kun; Heudes, Alethea; Lekwa, Adam J. – Journal of Psychoeducational Assessment, 2018
The selection and interpretation of individually administered norm-referenced cognitive tests that are administered to culturally and linguistically diverse (CLD) students continue to be an important consideration within the psychoeducational assessment process. Understanding test directions during the assessment of cognitive abilities is…
Descriptors: Intelligence Tests, Cognitive Ability, High Stakes Tests, Children
Vodicková, Katerina; Kostelecká, Yvona – Center for Educational Policy Studies Journal, 2016
Mastering a second language, in this case Czech, is crucial for pupils whose first language differs from the language of schooling, so that they can engage more successfully in the educational process. In order to adjust language teaching to pupils' needs, it is necessary to identify which language skills or individual competences set out within…
Descriptors: Diagnostic Tests, Slavic Languages, High Stakes Tests, Foreign Countries
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
McLean, Stuart; Kramer, Brandon; Beglar, David – Language Teaching Research, 2015
An important gap in the field of second language vocabulary assessment concerns the lack of validated tests measuring aural vocabulary knowledge. The primary purpose of this study is to introduce and provide preliminary validity evidence for the Listening Vocabulary Levels Test (LVLT), which has been designed as a diagnostic tool to measure…
Descriptors: Test Construction, Test Validity, English (Second Language), Second Language Learning
Wolf, Raffaela; Zahner, Doris; Kostoris, Fiorella; Benjamin, Roger – Council for Aid to Education, 2014
The measurement of higher-order competencies within a tertiary education system across countries presents methodological challenges due to differences in educational systems, socio-economic factors, and perceptions as to which constructs should be assessed (Blömeke, Zlatkin-Troitschanskaia, Kuhn, & Fege, 2013). According to Hart Research…
Descriptors: Case Studies, International Assessment, Performance Based Assessment, Critical Thinking
Tian, Feng – ProQuest LLC, 2011
There has been a steady increase in the use of mixed-format tests, that is, tests consisting of both multiple-choice items and constructed-response items in both classroom and large-scale assessments. This calls for appropriate equating methods for such tests. As Item Response Theory (IRT) has rapidly become mainstream as the theoretical basis for…
Descriptors: Item Response Theory, Comparative Analysis, Equated Scores, Statistical Analysis
Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010
The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…
Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity
Graf, Edith Aurora – ETS Research Report Series, 2008
Quantitative item models are item structures that may be expressed in terms of mathematical variables and constraints. An item model may be developed as a computer program from which large numbers of items are automatically generated. Item models can be used to produce large numbers of items for use in traditional, large-scale assessments. But…
Descriptors: Test Items, Models, Diagnostic Tests, Statistical Analysis
Allalouf, Avi; Abramzon, Andrea – Language Assessment Quarterly, 2008
Differential item functioning (DIF) analysis can be used to great advantage in second language (L2) assessments. This study examined the differences in performance on L2 test items between groups from different first language backgrounds and suggested ways of improving L2 assessments. The study examined DIF on L2 (Hebrew) test items for two…
Descriptors: Test Items, Test Format, Second Language Learning, Test Construction
Liu, Jinghua; Zhu, Xiaowen – ETS Research Report Series, 2008
The purpose of this paper is to explore methods to approximate population invariance without conducting multiple linkings for subpopulations. Under the single group or equivalent groups design, no linking needs to be performed for the parallel-linear system linking functions. The unequated raw score information can be used as an approximation. For…
Descriptors: Raw Scores, Test Format, Comparative Analysis, Test Construction
Dorans, Neil J.; Lawrence, Ida M. – Applied Measurement in Education, 1990
A procedure for checking the score equivalence of nearly identical editions of a test is described and illustrated with Scholastic Aptitude Test data. The procedure uses the standard error of equating and uses graphical representation of score conversion deviations from the identity function in standard error units. (SLD)
Descriptors: Equated Scores, Grade Equivalent Scores, Scores, Statistical Analysis