ERIC - Search Results

Showing 1 to 15 of 22 results

The Dutch Review Process for Evaluating the Quality of Psychological Tests: History, Procedure, and Results

Peer reviewed

Evers, Arne; Sijtsma, Klaas; Lucassen, Wouter; Meijer, Rob R. – International Journal of Testing, 2010

This article describes the 2009 revision of the Dutch Rating System for Test Quality and presents the results of test ratings from almost 30 years. The rating system evaluates the quality of a test on seven criteria: theoretical basis, quality of the testing materials, comprehensiveness of the manual, norms, reliability, construct validity, and…

Descriptors: Rating Scales, Documentation, Educational Quality, Educational Testing

A Monte Carlo Simulation Investigating the Validity and Reliability of Ability Estimation in Item Response Theory with Speeded Computer Adaptive Tests

Peer reviewed

Schmitt, T. A.; Sass, D. A.; Sullivan, J. R.; Walker, C. M. – International Journal of Testing, 2010

Imposed time limits on computer adaptive tests (CATs) can result in examinees having difficulty completing all items, thus compromising the validity and reliability of ability estimates. In this study, the effects of speededness were explored in a simulated CAT environment by varying examinee response patterns to end-of-test items. Expectedly,…

Descriptors: Monte Carlo Methods, Simulation, Computer Assisted Testing, Adaptive Testing

Do Examinees Have Similar Test-Taking Effort? A High-Stakes Question for Low-Stakes Testing

Peer reviewed

Barry, Carol L.; Horst, S. Jeanne; Finney, Sara J.; Brown, Allison R.; Kopp, Jason P. – International Journal of Testing, 2010

Given the prevalence of low-stakes testing internationally (e.g., NAEP, TIMSS, PIRLS), it is crucial to try to better understand examinee motivation in these contexts. In the current study, mixture modeling results supported three different profiles of test-taking effort over the course of five tests. Classes 1 and 2 had varying levels of effort…

Descriptors: Testing, Comparative Analysis, Accountability, College Students

Evaluating the Bookmark Standard Setting Method: The Impact of Random Item Ordering

Peer reviewed

Davis-Becker, Susan L.; Buckendahl, Chad W.; Gerrow, Jack – International Journal of Testing, 2011

Throughout the world, cut scores are an important aspect of a high-stakes testing program because they are a key operational component of the interpretation of test scores. One method for setting standards that is prevalent in educational testing programs--the Bookmark method--is intended to be a less cognitively complex alternative to methods…

Descriptors: Standard Setting (Scoring), Cutting Scores, Educational Testing, Licensing Examinations (Professions)

The "International Journal of Testing": A Content Review

Peer reviewed

Zenisky, April L.; Crotts, Katrina M. – International Journal of Testing, 2010

The "International Journal of Testing" (IJT) is the journal of the International Test Commission. It is intended to support the dissemination of scholarly research on tests and test use worldwide. The purpose of this article is to reflect on what has been published in IJT over its nine volumes to date, with a focus on the extent to which…

Descriptors: Test Use, Testing, Evaluation, Tests

Can Structural Equation Models in Second Language Testing and Learning Research Be Successfully Replicated?

Peer reviewed

In'nami, Yo; Koizumi, Rie – International Journal of Testing, 2010

Because structural equation models are widely used in testing and assessment, investigation into the accuracy of such models may help raise awareness of the value of reanalysis or replication. We focused on second language testing and learning studies and examined: (a) To what extent is information necessary for replication provided by authors?…

Descriptors: Structural Equation Models, Second Language Learning, Second Languages, Testing

Evaluating the Invariance of Cognitive Profile Patterns Derived from Profile Analysis via Multidimensional Scaling (PAMS): A Bootstrapping Approach

Peer reviewed

Kim, Se-Kang – International Journal of Testing, 2010

The aim of the current study is to validate the invariance of major profile patterns derived from multidimensional scaling (MDS) by bootstrapping. Profile Analysis via Multidimensional Scaling (PAMS) was employed to obtain profiles and bootstrapping was used to construct the sampling distributions of the profile coordinates and the empirical…

Descriptors: Intervals, Multidimensional Scaling, Profiles, Evaluation

The Relation between Practice Effects, Test-Taker Characteristics and Degree of g-Saturation

Peer reviewed

Reeve, Charlie L.; Lam, Holly – International Journal of Testing, 2007

Prior research regarding practice effects on ability tests has focused primarily on the differences in average score gains across different types of preparation (e.g., test retaking vs. focused item practice vs. coaching). In contrast, there has been little concerted effort towards understanding the significant variance in score gains across…

Descriptors: Achievement Gains, Test Wiseness, Tests, Testing

The Multidimensionality of Verbal Analogy Items

Peer reviewed

Ullstadius, Eva; Carlstedt, Berit; Gustafsson, Jan-Eric – International Journal of Testing, 2008

The influence of general and verbal ability on each of 72 verbal analogy test items was investigated with new factor analytical techniques. The analogy items together with the Computerized Swedish Enlistment Battery (CAT-SEB) were given randomly to two samples of 18-year-old male conscripts (n = 8566 and n = 5289). Thirty-two of the 72 items had…

Descriptors: Test Items, Verbal Ability, Factor Analysis, Swedish

Differential Item Functioning Analysis Using Rasch Item Information Functions

Peer reviewed

Wyse, Adam E.; Mapuranga, Raymond – International Journal of Testing, 2009

Differential item functioning (DIF) analysis is a statistical technique used for ensuring the equity and fairness of educational assessments. This study formulates a new DIF analysis method using the information similarity index (ISI). ISI compares item information functions when data fits the Rasch model. Through simulations and an international…

Descriptors: Test Bias, Evaluation Methods, Test Items, Educational Assessment

Correcting for Person Misfit in Aggregated Score Reporting

Peer reviewed

Brown, Richard S.; Villarreal, Julio C. – International Journal of Testing, 2007

There has been considerable research regarding the extent to which psychometrically sound assessments sometimes yield individual score estimates that are inconsistent with the response patterns of the individual. It has been suggested that individual response patterns may differ from expectations for a number of reasons, including subject motivation,…

Descriptors: Psychometrics, Test Bias, Testing, Simulation

A Review of "Integrity[TM]"

Peer reviewed

Veldkamp, Bernard P. – International Journal of Testing, 2008

Integrity[TM], an online application for testing both the statistical integrity of the test and the academic integrity of the examinees, was evaluated for this review. Program features and the program output are described. An overview of the statistics in Integrity[TM] is provided, and the application is illustrated with a small simulation study.…

Descriptors: Simulation, Integrity, Statistics, Computer Assisted Testing

Improving Test Quality in the Netherlands: Results of 18 Years of Test Ratings.

Peer reviewed

Evers, Arne – International Journal of Testing, 2001

Describes the Dutch rating system for test quality, which evaluates a test for seven criteria, and analyses the results of test ratings from the past 18 years. Results show a steady increase in test quality in the Netherlands that can be attributed to use of better tests and declining use of tests of less quality after evaluation. (SLD)

Descriptors: Criteria, Educational Testing, Evaluation Methods, Foreign Countries

Factorial Invariance Testing and Latent Mean Differences for the Self-Description Questionnaire II (Short Version) with Indigenous and Non-Indigenous Australian Secondary School Students

Peer reviewed

Bodkin-Andrews, Gawaian H.; Ha, My Trinh; Craven, Rhonda G.; Yeung, Alexander Seesing – International Journal of Testing, 2010

This investigation reports on the cross-cultural equivalence testing of the Self-Description Questionnaire II (short version; SDQII-S) for Indigenous and non-Indigenous Australian secondary student samples. A variety of statistical analysis techniques were employed to assess the psychometric properties of the SDQII-S for both the Indigenous and…

Descriptors: Indigenous Populations, Disadvantaged, Testing, Measures (Individuals)

Linguistic Influences on Differential Item Functioning for Second Language Learners on the National Assessment of Educational Progress

Peer reviewed

Mahoney, Kate – International Journal of Testing, 2008

Education policy in many countries has undergone changes regarding the testing of English Language Learners (ELLs), who by definition are not yet proficient in the language of the test. As policies mandate the inclusion of ELLs in large-scale testing, many question the validity of achievement test scores because the degree to which the test score…

Descriptors: Test Items, Linguistics, Testing, Second Language Learning
