ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	4
Since 2006 (last 20 years)	9

Descriptor

Test Length	16
Test Reliability	16
Scores	8
Test Items	7
Test Validity	7
Item Response Theory	5
Computer Assisted Testing	4
Test Format	4
Error of Measurement	3
Estimation (Mathematics)	3
Language Proficiency	3
Language Tests	3
Psychometrics	3
Simulation	3
Testing	3
Achievement Tests	2
Difficulty Level	2
Guessing (Tests)	2
Individual Testing	2
Item Analysis	2
Models	2
Quality Assurance	2
Second Language Learning	2
Second Languages	2
Standardized Tests	2
More ▼

Source

Applied Measurement in…	2
Applied Psychological…	2
Journal of Educational…	2
Language Testing	2
Assessment & Evaluation in…	1
Educational Research and…	1
International Educational…	1
Journal of Psychoeducational…	1
Montgomery County Public…	1
Psychological Assessment	1
School Psychology Review	1
More ▼

Publication Type

Reports - Evaluative	16
Journal Articles	13
Numerical/Quantitative Data	1
Speeches/Meeting Papers	1

Education Level

Early Childhood Education	1
Elementary Education	1
Elementary Secondary Education	1
Grade 1	1
Grade 2	1
Grade 3	1
Grade 4	1
Grade 5	1
Grade 6	1
Grade 7	1
Grade 8	1
Grade 9	1
High Schools	1
Intermediate Grades	1
Junior High Schools	1
Kindergarten	1
Middle Schools	1
Primary Education	1
Secondary Education	1
More ▼

Audience

Location

Maryland	1
Netherlands	1

Laws, Policies, & Programs

Assessments and Surveys

ACTFL Oral Proficiency…	1
Armed Forces Qualification…	1
Developmental Indicators for…	1
Measures of Academic Progress	1
Peabody Picture Vocabulary…	1
Wechsler Adult Intelligence…	1

What Works Clearinghouse Rating

Showing 1 to 15 of 16 results Save | Export

A Comparison of Automated Scale Short Form Selection Strategies

Peer reviewed
PDF on ERIC

Download full text

Raborn, Anthony W.; Leite, Walter L.; Marcoulides, Katerina M. – International Educational Data Mining Society, 2019

Short forms of psychometric scales have been commonly used in educational and psychological research to reduce the burden of test administration. However, it is challenging to select items for a short form that preserve the validity and reliability of the scores of the original scale. This paper presents and evaluates multiple automated methods…

Descriptors: Psychometrics, Measures (Individuals), Mathematics, Heuristics

Test Review: TestDaF

Peer reviewed

Direct link

Norris, John; Drackert, Anastasia – Language Testing, 2018

The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…

Descriptors: German, Second Language Learning, Language Tests, Language Proficiency

ACTFL Oral Proficiency Interview -- Computer (OPIc)

Peer reviewed

Direct link

Isbell, Dan; Winke, Paula – Language Testing, 2019

The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…

Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning

Student Outcomes on MAP Growth: Comparison of Virtual and In-Person Administrations

Download full text

James, Syretta R.; Liu, Shihching Jessica; Maina, Nyambura; Wade, Julie; Wang, Helen; Wilson, Heather; Wolanin, Natalie – Montgomery County Public Schools, 2021

The impact of the COVID-19 pandemic continues to overwhelm the functioning and outcomes of educational systems throughout the nation. The public education system is under particular scrutiny given that students, families, and educators are under considerable stress to maintain academic progress. Since the beginning of the crisis, school-systems…

Descriptors: Achievement Tests, COVID-19, Pandemics, Public Schools

Test Review: C. Mardell & D. S. Goldenberg. "Speed Developmental Indicators for the Assessment of Learning-Fourth Edition" ("Speed DIAL-4")

Peer reviewed

Direct link

Doskey, Elena M.; Lagunas, Brenda; SooHoo, Michelle; Lomax, Amanda; Bullick, Stephanie – Journal of Psychoeducational Assessment, 2013

The Speed DIAL-4 was developed from the Developmental Indicators for the Assessment of Learning, Fourth Edition (DIAL-4), a screening designed to identify children between the ages of 2 years, 6 months through 5 years, 11 months "who are in need of intervention or diagnostic assessment in the following areas: motor, concepts, language,…

Descriptors: Screening Tests, Young Children, Test Length, Scoring

Comparing the Performance of Five Multidimensional CAT Selection Procedures with Different Stopping Rules

Peer reviewed

Direct link

Yao, Lihua – Applied Psychological Measurement, 2013

Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection

Relating Unidimensional IRT Parameters to a Multidimensional Response Space: A Review of Two Alternative Projection IRT Models for Scoring Subscales

Peer reviewed

Direct link

Kahraman, Nilufer; Thompson, Tony – Journal of Educational Measurement, 2011

A practical concern for many existing tests is that subscore test lengths are too short to provide reliable and meaningful measurement. A possible method of improving the subscale reliability and validity would be to make use of collateral information provided by items from other subscales of the same test. To this end, the purpose of this article…

Descriptors: Test Length, Test Items, Alignment (Education), Models

Ongoing Issues in Test Fairness

Peer reviewed

Direct link

Camilli, Gregory – Educational Research and Evaluation, 2013

In the attempt to identify or prevent unfair tests, both quantitative analyses and logical evaluation are often used. For the most part, fairness evaluation is a pragmatic attempt at determining whether procedural or substantive due process has been accorded to either a group of test takers or an individual. In both the individual and comparative…

Descriptors: Alternative Assessment, Test Bias, Test Content, Test Format

Estimating the Consistency and Accuracy of Classifications Based on Test Scores.

Peer reviewed

Livingston, Samuel A.; Lewis, Charles – Journal of Educational Measurement, 1995

A method is presented for estimating the accuracy and consistency of classifications based on test scores. The reliability of the score is used to estimate effective test length in terms of discrete items. The true-score distribution is estimated by fitting a four-parameter beta model. (SLD)

Descriptors: Classification, Estimation (Mathematics), Scores, Statistical Distributions

Estimating the Reliability of a Test Containing Multiple Item Formats.

Peer reviewed

Qualls, Audrey L. – Applied Measurement in Education, 1995

Classically parallel, tau-equivalently parallel, and congenerically parallel models representing various degrees of part-test parallelism and their appropriateness for tests composed of multiple item formats are discussed. An appropriate reliability estimate for a test with multiple item formats is presented and illustrated. (SLD)

Descriptors: Achievement Tests, Estimation (Mathematics), Measurement Techniques, Test Format

Corrected Estimates of WAIS-R Short Form Reliability and Standard Error of Measurement.

Peer reviewed

Axelrod, Bradley N.; And Others – Psychological Assessment, 1996

The calculations of D. Schretlen, R. H. B. Benedict, and J. H. Bobholz for the reliabilities of a short form of the Wechsler Adult Intelligence Scale--Revised (WAIS-R) (1994) consistently overestimated the values. More accurate values are provided for the WAIS--R and a seven-subtest short form. (SLD)

Descriptors: Error Correction, Error of Measurement, Estimation (Mathematics), Intelligence Tests

Influence of Test and Person Characteristics on Nonparametric Appropriateness Measurement.

Peer reviewed

Meijer, Rob R.; And Others – Applied Psychological Measurement, 1994

The power of the nonparametric person-fit statistic, U3, is investigated through simulations as a function of item characteristics, test characteristics, person characteristics, and the group to which examinees belong. Results suggest conditions under which relatively short tests can be used for person-fit analysis. (SLD)

Descriptors: Difficulty Level, Group Membership, Item Response Theory, Nonparametric Statistics

Test Review: The Revised PPVT.

Peer reviewed

Kipps, Debi; Hanson, Dave – School Psychology Review, 1983

The Peabody Picture Vocabulary Test-Revised (Dunn and Dunn) is described as a convenient, quick test, possessing improvements over the original. It measures a subject's receptive (hearing) vocabulary for Standard American English. However, the validity information for the test is less than adequate, since no validity studies are presented for it.…

Descriptors: Auditory Tests, Individual Testing, Scores, Test Length

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Direct link

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Multiple Choice and True/False Tests: Reliability Measures and Some Implications of Negative Marking

Peer reviewed

Direct link

Burton, Richard F. – Assessment & Evaluation in Higher Education, 2004

The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this…

Descriptors: Multiple Choice Tests, Error of Measurement, Test Reliability, Test Items

Previous Page | Next Page »

Pages: 1 | 2

Axelrod, Bradley N.	1
Bullick, Stephanie	1
Burton, Richard F.	1
Camilli, Gregory	1
Doskey, Elena M.	1
Drackert, Anastasia	1
Hanson, Dave	1
Isbell, Dan	1
James, Syretta R.	1
Kahraman, Nilufer	1
Kipps, Debi	1
Lagunas, Brenda	1
Leite, Walter L.	1
Lewis, Charles	1
Liu, Shihching Jessica	1
Livingston, Samuel A.	1
Lomax, Amanda	1
Maina, Nyambura	1
Marcoulides, Katerina M.	1
Meijer, Rob R.	1
Norris, John	1
Qualls, Audrey L.	1
Raborn, Anthony W.	1
SooHoo, Michelle	1
More ▼