ERIC - Search Results

Showing 1 to 15 of 23 results

Cluster-Robust Variance Estimators for Binary Observations in Heterogeneous Groups and Their Application to Psychometric Analyses of Repeated Measures

Sarah Marie Marquis – ProQuest LLC, 2020

This dissertation is composed of a study of estimation methods in classical and test theories and the elaboration and application of a cluster-robust variance estimator. Variance estimators derived from generalized estimating equations are known to be robust to most covariance structures and are therefore well suited for psychometric analysis of…

Descriptors: Multivariate Analysis, Robustness (Statistics), Computation, Test Theory

A General Method for Adjusting Test Score Distributions to Account for Rescoring and Retesting

Peer reviewed

Sophie Litschwartz – Society for Research on Educational Effectiveness, 2021

Background/Context: Pass/fail standardized exams frequently selectively rescore failing exams and retest failing examinees. This practice distorts the test score distribution and can confuse those who do analysis on these distributions. In 2011, the Wall Street Journal showed large discontinuities in the New York City Regent test score…

Descriptors: Standardized Tests, Pass Fail Grading, Scoring Rubrics, Scoring Formulas

The Relationship between CTT and IRT Approaches in Analyzing Item Characteristics

Peer reviewed

Abedalaziz, Nabeel; Leng, Chin Hai – Malaysian Online Journal of Educational Sciences, 2013

Most of the tests and inventories used by counseling psychologists have been developed using CTT; IRT derives from what is called latent trait theory. A number of important differences exist between CTT- versus IRT-based approaches to both test development and evaluation, as well as the process of scoring the response profiles of individual…

Descriptors: Test Theory, Item Response Theory, Difficulty Level, Models

Development of Nonword and Irregular Word Lists for Australian Grade 3 Students Using Rasch Analysis

Peer reviewed

Callinan, Sarah; Cunningham, Everarda; Theiler, Stephen – Australian Journal of Learning Difficulties, 2014

Many tests used in educational settings to identify learning difficulties endeavour to pick up only the lowest performers. Yet these tests are generally developed within a Classical Test Theory (CTT) paradigm that assumes that data do not have significant skew. Rasch analysis is more tolerant of skew and was used to validate two newly developed…

Descriptors: Foreign Countries, Reading Tests, Item Response Theory, Elementary School Students

Making Do with What We Have: Use Your Bootstraps

Peer reviewed

Calmettes, Guillaume; Drummond, Gordon B.; Vowler, Sarah L. – Advances in Physiology Education, 2012

A jack knife is a pocket knife that is put to many tasks, because it's ready to hand. Often there could be a better tool for the job, such as a screwdriver, a scraper, or a can-opener, but these are not usually pocket items. In statistical terms, the expression implies making do with what's available. Another simile, of an extreme situation, is…

Descriptors: Statistical Analysis, Computation, Population Distribution, Evaluation Methods
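
For readers unfamiliar with the resampling idea this entry's title refers to, a percentile bootstrap confidence interval for a mean can be sketched as follows. This is a minimal illustration, not the article's own procedure; the function name `bootstrap_mean_ci`, the sample values, and the choice of 2000 resamples are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Hypothetical measurements, invented for the example.
data = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1]
low, high = bootstrap_mean_ci(data)
```

The interval is read directly from the sorted resample means, so no distributional assumption about the data is needed.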

Classical Test Theory and Item Response Theory: Analytical and Empirical Comparisons.

Hwang, Dae-Yeop – 2002

This study compared classical test theory (CTT) and item response theory (IRT). The behavior of the item and person statistics derived from these two measurement frameworks was examined analytically and empirically using a data set obtained from BILOG (R. Mislevy and D. Bock, 1997). The example was a 15-item test with a sample size of 600…

Descriptors: Comparative Analysis, Measurement Techniques, Scores, Statistical Distributions

Tests of Significance of Correlation Coefficients in the Absence of Bivariate Normal Populations.

Peer reviewed

Zimmerman, Donald W. – Journal of Experimental Education, 1986

A computer program randomly sampled ordered pairs of scores from known populations that departed from bivariate normal form and calculated correlation coefficients from sample values. Hypotheses were tested (1) that population correlations are zero using the t statistic; and (2) that population correlations have non-zero values using the r to z…

Descriptors: Correlation, Hypothesis Testing, Sampling, Statistical Distributions
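
The two tests named in this abstract are standard textbook procedures, and their test statistics can be sketched as below. This is an illustration only and does not reproduce the study's simulation program; the function names are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for H0: rho = 0, referred to t with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def fisher_z_statistic(r, rho0, n):
    """Approximately standard normal statistic for H0: rho = rho0,
    using the Fisher r-to-z transformation z = atanh(r)."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

# Example values chosen for illustration.
t_value = t_statistic(0.5, 27)
z_value = fisher_z_statistic(0.5, 0.0, 28)
```

Both statistics are derived under bivariate normality, which is the assumption the study relaxes.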

Estimation of Reliability Coefficients Using the Test Information Function and Its Modifications.

Peer reviewed

Samejima, Fumiko – Applied Psychological Measurement, 1994

The reliability coefficient is predicted from the test information function (TIF) or two modified TIF formulas and a specific trait distribution. Examples illustrate the variability of the reliability coefficient across different trait distributions, and results are compared with empirical reliability coefficients. (SLD)

Descriptors: Adaptive Testing, Error of Measurement, Estimation (Mathematics), Reliability

The Choice of Scale for Educational Measurement: An IRT Perspective.

Peer reviewed

Yen, Wendy M. – Journal of Educational Measurement, 1986

Two methods of constructing equal-interval scales for educational achievement are discussed: Thurstone's absolute scaling method and Item Response Theory. Alternative criteria for choosing a scale are contrasted. It is argued that clearer criteria are needed for judging the appropriateness and usefulness of alternative scaling procedures.…

Descriptors: Achievement Tests, Latent Trait Theory, Mathematical Models, Scaling

Some Formulas for Use with Bayesian Ability Estimates.

Mislevy, Robert J. – 1993

Relationships between Bayesian ability estimates and the parameters of a normal population distribution are derived in the context of classical test theory. Analogies are provided for use as approximations in work with item response theory (IRT). The following issues are addressed: (1) the relationship between the distribution of the latent…

Descriptors: Ability, Bayesian Statistics, Computer Software, Estimation (Mathematics)

The Use of Confidence Intervals When Interpreting Test Scores. EREAPA Publication Series No. 93-4.

Wheeler, Patricia H. – 1993

A person's obtained score on a test provides an estimate of the individual's "true" score on that test. The obtained score is considered to have two parts, the true component and the error component. Classical test theory assumes that obtained scores for an individual over multiple administrations of the same test will lie symmetrically…

Descriptors: Cutting Scores, Error of Measurement, Scores, Statistical Distributions
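
The kind of interval this abstract describes is usually built from the standard error of measurement, SEM = SD × sqrt(1 − reliability), under the classical assumption of normally distributed error. The sketch below uses that textbook formula and is not taken from the publication; the function name and the example values are assumptions.

```python
import math
from statistics import NormalDist

def score_confidence_interval(obtained, sd, reliability, level=0.95):
    """Symmetric interval around an obtained score, based on the SEM."""
    sem = sd * math.sqrt(1.0 - reliability)          # standard error of measurement
    z = NormalDist().inv_cdf(0.5 + level / 2.0)      # e.g. about 1.96 for 95%
    return obtained - z * sem, obtained + z * sem

# Hypothetical test: SD of 15 and reliability of 0.91, obtained score of 100.
lower, upper = score_confidence_interval(100, 15, 0.91)
```

With these values the SEM is 4.5, so the 95% interval spans roughly 9 points on either side of the obtained score.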

Reliability of Composite Measurements Based on the m Highest of n Equivalent Components.

Peer reviewed

Huynh, Huynh – Journal of Educational Statistics, 1986

Under the assumptions of classical measurement theory and the condition of normality, a formula is derived for the reliability of composite scores. The formula represents an extension of the Spearman-Brown formula to the case of truncated data. (Author/JAZ)

Descriptors: Computer Simulation, Error of Measurement, Expectancy Tables, Scoring Formulas
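
For reference, the classical Spearman-Brown formula that this article extends gives the reliability of a composite of k parallel components as k·ρ / (1 + (k − 1)·ρ). The sketch below shows only that classical formula; the article's extension to the m highest of n components is not reproduced here, and the function name is hypothetical.

```python
def spearman_brown(rho, k):
    """Reliability of a composite of k parallel components,
    each with reliability rho (classical Spearman-Brown formula)."""
    return k * rho / (1 + (k - 1) * rho)

# Doubling the length of a test whose reliability is 0.6.
doubled = spearman_brown(0.6, 2)
```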

Confirmatory Measurement Model Comparisons Using Latent Means.

Peer reviewed

Millsap, Roger E.; Everson, Howard – Multivariate Behavioral Research, 1991

Use of confirmatory factor analysis (CFA) with nonzero latent means in testing six different measurement models from classical test theory is discussed. Implications of the six models for observed mean and covariance structures are described, and three examples of the use of CFA in testing the models are presented. (SLD)

Descriptors: Comparative Analysis, Equations (Mathematics), Goodness of Fit, Mathematical Models

Dimensions of National Test Performance: A Two-Level Approach

Peer reviewed

Aberg-Bengtsson, Lisbeth; Erickson, Gudrun – Educational Research and Evaluation, 2006

The research project presented in this article was set in the Swedish school context and carried out on a set of compulsory national tests for English, Swedish, and mathematics used at the end of compulsory school. The aims were: (a) to gain a deeper knowledge of the internal structure of the tests and (b) to separate individual performance from…

Descriptors: Individual Testing, Factor Analysis, Structural Equation Models, Foreign Countries

On the Relative Power of the Paired Samples t Test and Wilcoxon's Signed-Ranks Test.

Blair, R. Clifford; Higgins, James J. – 1985

Monte Carlo methods were employed to assess the relative power of the paired samples t test and Wilcoxon's signed-ranks test under ten population shapes. Results of the study indicated that: (1) each of the two statistics was more powerful than the other in given situations; (2) the power advantages of the t test under normal theory were small;…

Descriptors: Estimation (Mathematics), Literature Reviews, Measurement Techniques, Monte Carlo Methods
