Kim, Eun Sook; Kwok, Oi-man; Yoon, Myeongsun – Structural Equation Modeling: A Multidisciplinary Journal, 2012
Testing factorial invariance has recently gained more attention in different social science disciplines. Nevertheless, when examining factorial invariance, it is generally assumed that the observations are independent of each other, which might not always be true. In this study, we examined the impact of testing factorial invariance in multilevel…
Descriptors: Monte Carlo Methods, Testing, Social Science Research, Factor Structure
Kim, Eun Sook; Yoon, Myeongsun; Lee, Taehun – Educational and Psychological Measurement, 2012
Multiple-indicators multiple-causes (MIMIC) modeling is often used to test a latent group mean difference while assuming the equivalence of factor loadings and intercepts over groups. However, this study demonstrated that MIMIC was insensitive to the presence of factor loading noninvariance, which implies that factor loading invariance should be…
Descriptors: Test Items, Simulation, Testing, Statistical Analysis
Iamarino, Danielle L. – Current Issues in Education, 2014
This paper explores the methodology and application of an assessment philosophy known as standards-based grading, via a critical comparison of standards-based grading to other assessment philosophies commonly employed at the elementary, secondary, and post-secondary levels of education. Evidenced by examples of increased student engagement and…
Descriptors: Grading, Evaluation Methods, Evaluation Criteria, Evaluation Research
Attali, Yigal – Applied Psychological Measurement, 2011
Recently, Attali and Powers investigated the usefulness of providing immediate feedback on the correctness of answers to constructed response questions and the opportunity to revise incorrect answers. This article introduces an item response theory (IRT) model for scoring revised responses to questions when several attempts are allowed. The model…
Descriptors: Feedback (Response), Item Response Theory, Models, Error Correction
Haardorfer, Regine; Gagne, Phill – Focus on Autism and Other Developmental Disabilities, 2010
Some researchers have argued for the use of or have attempted to make use of randomization tests in single-subject research. To address this tide of interest, the authors of this article describe randomization tests, discuss the theoretical rationale for applying them to single-subject research, and provide an overview of the methodological…
Descriptors: Research Design, Researchers, Evaluation Methods, Research Methodology
Hathcoat, John D.; Penn, Jeremy D. – Research & Practice in Assessment, 2012
Critics of standardized testing have recommended replacing standardized tests with more authentic assessment measures, such as classroom assignments, projects, or portfolios rated by a panel of raters using common rubrics. Little research has examined the consistency of scores across multiple authentic assignments or the implications of this…
Descriptors: Generalizability Theory, Performance Based Assessment, Writing Across the Curriculum, Standardized Tests
Forero, Carlos G.; Maydeu-Olivares, Alberto – Psychological Methods, 2009
The performance of parameter estimates and standard errors in estimating F. Samejima's graded response model was examined across 324 conditions. Full information maximum likelihood (FIML) was compared with a 3-stage estimator for categorical item factor analysis (CIFA) when the unweighted least squares method was used in CIFA's third stage. CIFA…
Descriptors: Factor Analysis, Least Squares Statistics, Computation, Item Response Theory
Athy, Jeremy; Friedrich, Jeff; Delany, Eileen – Science & Education, 2008
Egon Brunswik (1903-1955) first made an interesting distinction between perception and explicit reasoning, arguing that perception included quick estimates of an object's size, nearly always resulting in good approximations in uncertain environments, whereas explicit reasoning, while better at achieving exact estimates, could often fail by wide…
Descriptors: Psychology, Logical Thinking, Perception, Psychological Studies
Williams, Jason; MacKinnon, David P. – Structural Equation Modeling: A Multidisciplinary Journal, 2008
Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional "z" test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a…
Descriptors: Intervals, Testing, Predictor Variables, Effect Size
Chan, Wai; Chan, Daniel W.-L. – Psychological Methods, 2004
The standard Pearson correlation coefficient is a biased estimator of the true population correlation, ρ, when the predictor and the criterion are range restricted. To correct the bias, the correlation corrected for range restriction, r_c, has been recommended, and a standard formula based on asymptotic results for estimating its standard…
Descriptors: Computation, Intervals, Sample Size, Monte Carlo Methods
Goldberg, Gail Lynn; Kapinus, Barbara – Applied Measurement in Education, 1993
Using responses of 123 elementary school teachers, a battery of performance-assessment tasks designed to generate responses to reading tests was evaluated from task development and scoring perspectives. More than one dozen types of errors were identified. Practical outcomes of the study and improvement of task development and scoring are…
Descriptors: Educational Assessment, Educational Practices, Elementary Education, Elementary School Teachers