ERIC - Search Results

Showing 1 to 15 of 21 results

Screening Test Items for Differential Item Functioning

Peer reviewed

Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2014

A method for medical screening is adapted to differential item functioning (DIF). Its essential elements are explicit declarations of the level of DIF that is acceptable and of the loss function that quantifies the consequences of the two kinds of inappropriate classification of an item. Instead of a single level and a single function, sets of…

Descriptors: Test Items, Test Bias, Simulation, Hypothesis Testing

Educational Measurement Issues and Implications of High Stakes Decision Making in Final Examinations in Secondary Education in the Netherlands

Peer reviewed

van Rijn, P. W.; Beguin, A. A.; Verstralen, H. H. F. M. – Assessment in Education: Principles, Policy & Practice, 2012

While measurement precision is relatively easy to establish for single tests and assessments, it is much more difficult to determine for decision making with multiple tests on different subjects. This latter is the situation in the system of final examinations for secondary education in the Netherlands and is used as an example in this paper. This…

Descriptors: Secondary Education, Tests, Foreign Countries, Decision Making

Defending the Quality of Links between Scores from Different Tests and Exams

Peer reviewed

Cresswell, Mike – Measurement: Interdisciplinary Research and Perspectives, 2010

Paul Newton (2010), with his characteristic concern about theory, has set out two different ways of thinking about the basis upon which equivalences of one sort or another are established between test score scales. His reason for doing this is a desire to establish "the defensibility of linkages lower on the continuum than concordance."…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

Conceptualizing Comparability

Peer reviewed

Newton, Paul E. – Measurement: Interdisciplinary Research and Perspectives, 2010

This article presents the author's rejoinder to thinking about linking from issue 8(1). Particularly within the more embracing linking frameworks, e.g., Holland & Dorans (2006) and Holland (2007), there appears to be a major disjunction between (1) classification discourse: the supposed basis for classification, that is, the underlying theory…

Descriptors: Foreign Countries, Measurement Techniques, Psychometrics, Comparative Analysis

What Constitutes Legitimate Causal Linking?

Peer reviewed

Baird, Jo-Anne – Measurement: Interdisciplinary Research and Perspectives, 2010

Newton's article (2010) makes three main contributions to the literature. First, it is transatlantic, bringing together literatures that have been dealing with similar problems, using sometimes different methods and certainly with distinctive educational, cultural perspectives. He points out that neither of these literatures has all of the…

Descriptors: Foreign Countries, Predictive Validity, Standards, Ethics

What Dictates the Meaning of Test Linking? A Reaction to "Thinking about Linking"

Peer reviewed

von Davier, Alina A. – Measurement: Interdisciplinary Research and Perspectives, 2010

The article "Thinking About Linking" by Newton (2010) presents a novel philosophical perspective on the way that educational assessments should be linked. Newton starts by describing the linking framework as it was characterized in various publications and identifies a cross-cultural dimension in the definitions and uses of test…

Descriptors: Foreign Countries, Educational Assessment, Student Evaluation, Evaluation Criteria

On the Direct Measurement of Face Validity: A Comment on Nevo.

Peer reviewed

Secolsky, Charles – Journal of Educational Measurement, 1987

For measuring the face validity of a test, Nevo suggested that test takers and nonprofessional users rate items on a five point scale. This article questions the ability of those raters and the credibility of the aggregated judgment as evidence of the validity of the test. (JAZ)

Descriptors: Content Validity, Measurement Techniques, Rating Scales, Test Items

On Interpreting Test Scores as Social Indicators: Statistical Considerations.

Peer reviewed

Spencer, Bruce D. – Journal of Educational Measurement, 1983

Because test scores are ordinal, not cardinal, attributes, the average test score often is a misleading way to summarize the scores of a group of individuals. Similarly, correlation coefficients may be misleading summary measures of association between test scores. Proper, readily interpretable, summary statistics are developed from a theory of…

Descriptors: Correlation, Measurement Techniques, Scores, Statistical Analysis

Toward A Theory of Construct Definition.

Peer reviewed

Stenner, A. Jackson; And Others – Journal of Educational Measurement, 1983

In an attempt to restore the symmetry and balance between the study of person and item variation, this paper presents a novel methodology, construct specification equations, which allows one to ascertain from the lawful behavior of items what an instrument is measuring. (Author/PN)

Descriptors: Measurement Objectives, Measurement Techniques, Research Methodology, Test Construction

Reflections on Stephen Jay Gould's "The Mismeasure of Man" (1981): A Retrospective Review. Book Review.

Peer reviewed

Carroll, John B. – Intelligence, 1995

It is argued that the statements and accusations made by Stephen Jay Gould about the use of factor analysis are incorrect and unjustified and that tests properly designed for the purpose can adequately measure a "general" or "g" factor of intelligence, particularly in view of the developments in testing since "The…

Descriptors: Factor Analysis, Intelligence Tests, Measurement Techniques, Nature Nurture Controversy

Passing Score and Length of a Mastery Test.

van der Linden, Wim J. – Evaluation in Education: International Progress, 1982

In mastery testing a linear relationship between an optimal passing score and test length is presented with a new optimization criterion. The usual indifference zone approach, a binomial error model, decision errors, and corrections for guessing are discussed. Related results in sequential testing and the latent class approach are included. (CM)

Descriptors: Cutting Scores, Educational Testing, Mastery Tests, Mathematical Models

Empiricism and Values: Two Faces of Educational Change.

Peer reviewed

Airasian, Peter W. – International Journal of Educational Research, 1997

This issue presents examinations of educational testing, large-scale alternative assessment, small-scale alternative assessment, and educational measurement. These discussions go beyond technical issues to provide a conceptual perspective and a view of underlying histories, theories, applications, and the uncertainties associated with these…

Descriptors: Alternative Assessment, Educational Assessment, Educational Change, Educational Testing

What Counts as Evidence of Educational Achievement? The Role of Constructs in the Pursuit of Equity in Assessment

Peer reviewed

Wiliam, Dylan – Review of Research in Education, 2010

The idea that validity should be considered a property of inferences, rather than of assessments, has developed slowly over the past century. In early writings about the validity of educational assessments, validity was defined as a property of an assessment. The most common definition was that an assessment was valid to the extent that it…

Descriptors: Educational Assessment, Validity, Inferences, Construct Validity

An Investigation of Two Procedures for Smoothing Test Norms.

Jones, Patricia B.; Sabers, Darrell L. – 1984

Several techniques have been developed for creating continuous smooth distributions of test norms. This paper describes two studies that explore the behavior of cubic splines in order to determine their appropriateness for use in test norming. The first study uses data from the Curriculum Referenced Tests of Mastery (CRTM) and employs two…

Descriptors: Equated Scores, Goodness of Fit, Measurement Techniques, Norm Referenced Tests

Depression in Children: The Children's Depression Inventory.

Crowley, Susan L.; And Others – 1993

Issues surrounding accurate assessment of depression in children have received much attention. However, the stability of scores from depression measures has generally been estimated using only classical test score theory, rather than the more powerful generalizability theory. The dependability of scores from the Children's Depression Inventory…

Descriptors: Children, Clinical Diagnosis, Depression (Psychology), Diagnostic Tests
