Birnbaum, Michael H. – Psychological Review, 2011
This article contrasts 2 approaches to analyzing transitivity of preference and other behavioral properties in choice data. The approach of Regenwetter, Dana, and Davis-Stober (2011) assumes that on each choice, a decision maker samples randomly from a mixture of preference orders to determine whether "A" is preferred to "B." In contrast, Birnbaum…
Descriptors: Evidence, Testing, Computation, Probability
Guo, Hongwen – Psychometrika, 2010
After many equatings have been conducted in a testing program, equating errors can accumulate to a degree that is not negligible compared to the standard error of measurement. In this paper, the author investigates the asymptotic accumulative standard error of equating (ASEE) for linear equating methods, including chained linear, Tucker, and…
Descriptors: Testing Programs, Testing, Error of Measurement, Equated Scores
Park, Bitnara Jasmine; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2011
This technical report describes the process of development and piloting of reading comprehension measures that are appropriate for seventh-grade students as part of an online progress screening and monitoring assessment system, http://easycbm.com. Each measure consists of an original fictional story of approximately 1,600 to 1,900 words with 20…
Descriptors: Reading Comprehension, Reading Tests, Grade 7, Test Construction
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Woods, Carol M. – Applied Psychological Measurement, 2009
Differential item functioning (DIF) occurs when items on a test or questionnaire have different measurement properties for one group of people versus another, irrespective of group-mean differences on the construct. Methods for testing DIF require matching members of different groups on an estimate of the construct. Preferably, the estimate is…
Descriptors: Test Results, Testing, Item Response Theory, Test Bias
Briggs, Derek C. – National Association for College Admission Counseling, 2009
This discussion paper represents one of the National Association for College Admission Counseling's (NACAC's) first post-Testing Commission steps in advancing the knowledge base and dialogue about test preparation. It describes various types of test preparation programs and summarizes the existing academic research on the effects of test…
Descriptors: Testing, Standardized Tests, School Counselors, College Admission
Solano-Flores, Guillermo – Educational Researcher, 2008
The testing of English language learners (ELLs) is, to a large extent, a random process because of poor implementation and factors that are uncertain or beyond control. Yet current testing practices and policies appear to be based on deterministic views of language and linguistic groups and erroneous assumptions about the capacity of assessment…
Descriptors: Generalizability Theory, Testing, Second Language Learning, Error of Measurement

Cronbach, Lee J.; And Others – Educational and Psychological Measurement, 1997
Through the standard error, rather than a reliability coefficient, generalizability theory provides an indicator of the uncertainty attached to school and individual scores on performance assessments. Recommendations are made to apply generalizability theory to current performance assessments, emphasizing practices that differ from usual…
Descriptors: Academic Achievement, Error of Measurement, Generalizability Theory, Performance Based Assessment
Karkee, Thakur B.; Wright, Karen R. – Online Submission, 2004
Different item response theory (IRT) models may be employed for item calibration. Change of testing vendors, for example, may result in the adoption of a different model than that previously used with a testing program. To provide scale continuity and preserve cut score integrity, item parameter estimates from the new model must be linked to the…
Descriptors: Measures (Individuals), Evaluation Criteria, Testing, Integrity
Stewart, E. Elizabeth – 1981
Context effects are defined as influences on test performance associated with the content of successively presented test items or sections. Four types of context effects are identified: (1) direct context effects (practice effects), which occur when performance on items is affected by the examinee having been exposed to similar types of…
Descriptors: Context Effect, Data Collection, Error of Measurement, Evaluation Methods

Altepeter, Tom – School Psychology Review, 1983
A critical review of the Expressive One-Word Picture Vocabulary Test (Gardner) is offered. The reviewer concludes that the instrument cannot be recommended in its present form. Further research on the manual and on theoretical issues (particularly test-retest stability) is strongly recommended. (Author/PN)
Descriptors: Error of Measurement, Intelligence Tests, Item Analysis, Pictorial Stimuli
Chang, Shun-Wen – Educational and Psychological Measurement, 2006
This study evaluates the effects of employing the linear, normalizing, and arcsine transformation methods for constructing scale scores on the Basic Competence Test (BCTEST). Tests in three subject areas (Chinese, English, and Mathematics) were studied using the data of test administrations from 2001 to 2003. The resulting scale scores for each…
Descriptors: Standardized Tests, Achievement Tests, Test Theory, True Scores

Kirsch, Irwin S.; And Others – 1992
A comprehensive assessment of the literacy proficiencies of Job Training Partnership Act (JTPA) and Employment Service/Unemployment Insurance (ES/UI) participants was conducted by the Department of Labor. The survey responses of a sample of 2,501 JTPA applicants and 3,277 ES/UI participants were scored, weighted, analyzed, and used to develop a…
Descriptors: Adult Literacy, Comparative Analysis, Correlation, Data Collection