Papageorgiou, Spiros; Morgan, Rick; Becker, Valerie – International Journal of Testing, 2015
The purpose of this study was to enhance the meaning of the scores of an English-language test by developing performance levels and descriptors for reporting overall test performance. The levels and descriptors were intended to accompany the total scale scores of TOEFL Junior® Standard, an international test of English as a second/foreign…
Descriptors: Language Proficiency, Language Tests, English (Second Language), Second Language Learning
Traynor, Anne – Educational Assessment, 2017
Variation in test performance among examinees from different regions or national jurisdictions is often partially attributed to differences in the degree of content correspondence between local school or training program curricula, and the test of interest. This posited relationship between test-curriculum correspondence, or "alignment,"…
Descriptors: Test Items, Test Construction, Alignment (Education), Curriculum
Wolf, Raffaela; Zahner, Doris; Kostoris, Fiorella; Benjamin, Roger – Council for Aid to Education, 2014
The measurement of higher-order competencies within a tertiary education system across countries presents methodological challenges due to differences in educational systems, socio-economic factors, and perceptions as to which constructs should be assessed (Blömeke, Zlatkin-Troitschanskaia, Kuhn, & Fege, 2013). According to Hart Research…
Descriptors: Case Studies, International Assessment, Performance Based Assessment, Critical Thinking
Popp, Jerome A. – 1975
In this paper it is argued that the problem of construct validation in the construction of instruments and indicators is an important problem for educational researchers and practitioners; moreover, it is claimed that the popular notion of operational definition is a misleading idea which has obscured the problem of construct validity in…
Descriptors: Evaluation Methods, Statistical Analysis, Statistical Significance, Test Construction
Fritz, Kentner V.; Cornish, Richard D. – Counseling Center Reports, 1971
The MERMAC computer program is offered to the University of Wisconsin faculty for use in scoring and analyzing classroom tests. The characteristics of a good test are discussed; examples are given of the output of the MERMAC program; and the results are used to show how the quality of a test may be improved. Although the MERMAC Program is for…
Descriptors: Computer Programs, Evaluation, Higher Education, Scoring
Woodson, M. I. Charles E.
It has been argued that item variance and test variance are not necessary characteristics for criterion-referenced tests, although they are necessary for norm-referenced tests. This position is in error because it considers sample statistics as the criteria for evaluating items and tests. Within a particular sample, an item or test may have no…
Descriptors: Criterion Referenced Tests, Evaluation Criteria, Item Analysis, Item Sampling

Carver, Ronald P. – Journal of Reading Behavior, 1978
Although it makes a great deal of sense to attempt to design reading research so as to be able to generalize beyond the particular reading passages used in an experiment, the suggestion that tests of statistical significance are a necessary part of making valid generalizations is nonsense. (HOD)
Descriptors: Generalization, Reading Research, Research Design, Research Methodology
Davidson, Fred – 1995
This study examined initial evidence of changes in fit to a unidimensional model for some language tests at multiple ability levels. Seven data sets were analyzed using the first phase of exploratory factor analysis: principal component eigenvalue extraction. Each data set is analyzed at varying n-sizes: whole group; random subsample; and five…
Descriptors: Difficulty Level, Language Aptitude, Language Proficiency, Language Skills
Levine, Michael V. – 1976
It is shown that empirical mental test P-P plots are approximately equal to theoretical item-item curves, at least for long tests administered to many people. This result is important because it leads to (1) a distribution-free method for estimating points on item-item curves; (2) a general method for defining estimates of item parameters; and…
Descriptors: Item Analysis, Latent Trait Theory, Mathematical Applications, Mathematical Models

Briggs, Peter F.; And Others – Journal of Clinical Psychology, 1972
The Minnesota-Briggs History Record (M-B) is a self-administered history inventory. This monograph summarizes studies of the M-B and describes the development of seven scales based upon M-B items. (Authors)
Descriptors: Biographical Inventories, Item Analysis, Measurement Instruments, Personality Assessment
Harris, Chester W. – 1975
Achievement tests which are specifically linked to an instructional program and have been developed in relation to an objectives base and/or to an item generation rule are considered, as well as student response data. Three types of studies are outlined and the kind of procedures thought useful illustrated. As various methods for examining…
Descriptors: Achievement Tests, Instructional Programs, Item Banks, Item Sampling
Towne, Douglas C. – 1971
A technique for displaying and analyzing Osgood's Semantic Differential data in three-dimensional semantic space is described. The technique employs a square board, with equidistant drilled holes, in which are placed dowels of various lengths combined with labels of different shapes. Studies have found 3 major factors (Evaluation, Activity, and…
Descriptors: Data Analysis, Evaluation Methods, Models, Rating Scales
Michigan State Dept. of Education, Lansing. – 1971
This report describes the development of the 1969-70 Michigan Educational Assessment measures used in assessing the levels and distribution of educational performance for Michigan's districts, schools, and pupils. The report has four sections. The first section contains a brief description of the 1969-70 assessment program, including a statement…
Descriptors: Achievement Tests, Attitude Measures, Educational Testing, Measurement Instruments

Murstein, Bernard I. – Journal of Marriage and the Family, 1976
The author discusses a common error in marriage research, i.e., use of a control group. Single scores which are correlated require no randomized control group. Two arrays of scores which are correlated generally require a randomized control group. (Author/HMV)
Descriptors: Data Analysis, Marriage, Research Methodology, Research Problems

Downing, Steven M.; Haladyna, Thomas M. – Applied Measurement in Education, 1997
An ideal process is outlined for test item development and the study of item responses to ensure that tests are sound. Qualitative and quantitative methods are used to assess the item-level validity evidence for high-stakes examinations. A checklist for assessment is provided. (SLD)
Descriptors: High Stakes Tests, Item Response Theory, Qualitative Research, Quality Control