NotesFAQContact Us
Collection
Advanced
Search Tips
Showing all 11 results Save | Export
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Sekercioglu, Güçlü – International Online Journal of Education and Teaching, 2018
An empirical evidence for independent samples of a population regarding measurement invariance implies that factor structure of a measurement tool is equal across these samples; in other words, it measures the intended psychological trait within the same structure. In this case, the evidence of construct validity would be strengthened within the…
Descriptors: Factor Analysis, Error of Measurement, Factor Structure, Construct Validity
Peer reviewed Peer reviewed
Direct linkDirect link
Tijmstra, Jesper; Hessen, David J.; van der Heijden, Peter G. M.; Sijtsma, Klaas – Psychometrika, 2013
Most dichotomous item response models share the assumption of latent monotonicity, which states that the probability of a positive response to an item is a nondecreasing function of a latent variable intended to be measured. Latent monotonicity cannot be evaluated directly, but it implies manifest monotonicity across a variety of observed scores,…
Descriptors: Item Response Theory, Statistical Inference, Probability, Psychometrics
Peer reviewed Peer reviewed
Direct linkDirect link
Draxler, Clemens – Psychometrika, 2010
This paper is concerned with supplementing statistical tests for the Rasch model so that additionally to the probability of the error of the first kind (Type I probability) the probability of the error of the second kind (Type II probability) can be controlled at a predetermined level by basing the test on the appropriate number of observations.…
Descriptors: Statistical Analysis, Probability, Sample Size, Error of Measurement
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Ferrando, Pere J. – Psicologica: International Journal of Methodology and Experimental Psychology, 2010
This article proposes a procedure, based on a global statistic, for assessing intra-individual consistency in a test-retest design with a short-term retest interval. The procedure is developed within the framework of parametric item response theory, and the statistic is a likelihood-based measure that can be considered as an extension of the…
Descriptors: Item Response Theory, Intervals, Psychometrics, Testing
Peer reviewed Peer reviewed
Direct linkDirect link
Warrens, Matthijs J. – Psychometrika, 2008
We discuss properties that association coefficients may have in general, e.g., zero value under statistical independence, and we examine coefficients for 2x2 tables with respect to these properties. Furthermore, we study a family of coefficients that are linear transformations of the observed proportion of agreement given the marginal…
Descriptors: Probability, Error of Measurement, Psychometrics, Measurement Techniques
Peer reviewed Peer reviewed
Direct linkDirect link
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Hutchison, Dougal – Oxford Review of Education, 2008
There is a degree of instability in any measurement, so that if it is repeated, it is possible that a different result may be obtained. Such instability, generally described as "measurement error", may affect the conclusions drawn from an investigation, and methods exist for allowing it. It is less widely known that different disciplines, and…
Descriptors: Measurement Techniques, Data Analysis, Error of Measurement, Test Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Solano-Flores, Guillermo – Educational Researcher, 2008
The testing of English language learners (ELLs) is, to a large extent, a random process because of poor implementation and factors that are uncertain or beyond control. Yet current testing practices and policies appear to be based on deterministic views of language and linguistic groups and erroneous assumptions about the capacity of assessment…
Descriptors: Generalizability Theory, Testing, Second Language Learning, Error of Measurement
Peer reviewed Peer reviewed
Shrivastav, Rahul; Sapienza, Christine M.; Nandur, Vuday – Journal of Speech, Language, and Hearing Research, 2005
Rating scales are commonly used to study voice quality. However, recent research has demonstrated that perceptual measures of voice quality obtained using rating scales suffer from poor interjudge agreement and reliability, especially in the midrange of the scale. These findings, along with those obtained using multidimensional scaling (MDS), have…
Descriptors: Psychometrics, Probability, Rating Scales, Interrater Reliability
Peer reviewed Peer reviewed
Direct linkDirect link
Dirkzwager, Arie – International Journal of Testing, 2003
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal…
Descriptors: Psychometrics, Probability, Models, Measurement
Peer reviewed Peer reviewed
Andrich, David – Applied Psychological Measurement, 1989
A probabilistic item response theory (IRT) model is developed for pair-comparison design in which the unfolding principle governing the choice process uses a discriminant process analogous to Thurstone's Law of Comparative Judgment. A simulation study demonstrates the feasibility of estimation, and two examples illustrate the implications for…
Descriptors: Algorithms, Computer Simulation, Discrimination Learning, Equations (Mathematics)