Showing 1 to 15 of 26 results
Peer reviewed
Raborn, Anthony W.; Leite, Walter L.; Marcoulides, Katerina M. – International Educational Data Mining Society, 2019
Short forms of psychometric scales have been commonly used in educational and psychological research to reduce the burden of test administration. However, it is challenging to select items for a short form that preserve the validity and reliability of the scores of the original scale. This paper presents and evaluates multiple automated methods…
Descriptors: Psychometrics, Measures (Individuals), Mathematics, Heuristics
Peer reviewed
Wang, Shiyu; Lin, Haiyan; Chang, Hua-Hua; Douglas, Jeff – Journal of Educational Measurement, 2016
Computerized adaptive testing (CAT) and multistage testing (MST) have become two of the most popular modes in large-scale computer-based sequential testing. Though most designs of CAT and MST exhibit strength and weakness in recent large-scale implementations, there is no simple answer to the question of which design is better because different…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Format, Sequential Approach
Peer reviewed
Yao, Lihua – Applied Psychological Measurement, 2013
Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…
Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection
Peer reviewed
Williams, Matt N.; Gomez Grajales, Carlos Alberto; Kurkiewicz, Dason – Practical Assessment, Research & Evaluation, 2013
In 2002, an article entitled "Four assumptions of multiple regression that researchers should always test" by Osborne and Waters was published in "PARE." This article has gone on to be viewed more than 275,000 times (as of August 2013), and it is one of the first results displayed in a Google search for "regression…
Descriptors: Multiple Regression Analysis, Misconceptions, Reader Response, Predictor Variables
Peer reviewed
Gekara, Victor Oyaro; Bloor, Michael; Sampson, Helen – Journal of Vocational Education and Training, 2011
Vocational education and training (VET) concerns the cultivation and development of specific skills and competencies, in addition to broad underpinning knowledge relating to paid employment. VET assessment is, therefore, designed to determine the extent to which a trainee has effectively acquired the knowledge, skills, and competencies required by…
Descriptors: Marine Education, Occupational Safety and Health, Computer Assisted Testing, Vocational Education
Peer reviewed
Maydeu-Olivares, Alberto; Coffman, Donna L.; Hartmann, Wolfgang M. – Psychological Methods, 2007
The point estimate of sample coefficient alpha may provide a misleading impression of the reliability of the test score. Because sample coefficient alpha is consistently biased downward, it is more likely to yield a misleading impression of poor reliability. The magnitude of the bias is greatest precisely when the variability of sample alpha is…
Descriptors: Intervals, Scores, Sample Size, Simulation
Peer reviewed
Borsboom, Denny; Romeijn, Jan-Willem; Wicherts, Jelte M. – Psychological Methods, 2008
This article shows that measurement invariance (defined in terms of an invariant measurement model in different groups) is generally inconsistent with selection invariance (defined in terms of equal sensitivity and specificity across groups). In particular, when a unidimensional measurement instrument is used and group differences are present in…
Descriptors: Test Items, Minority Groups, Measurement, Scores
Peer reviewed
Zimmerman, Donald W.; And Others – Educational and Psychological Measurement, 1993
Coefficient alpha was examined through computer simulation as an estimate of test reliability under violation of two assumptions. Coefficient alpha underestimated reliability under violation of the assumption of essential tau-equivalence of subtest scores and overestimated it under violation of the assumption of uncorrelated subtest error scores.…
Descriptors: Computer Simulation, Estimation (Mathematics), Mathematical Models, Robustness (Statistics)
Peer reviewed
Gierl, Mark J.; Gotzmann, Andrea; Boughton, Keith A. – Applied Measurement in Education, 2004
Differential item functioning (DIF) analyses are used to identify items that operate differently between two groups, after controlling for ability. The Simultaneous Item Bias Test (SIBTEST) is a popular DIF detection method that matches examinees on a true score estimate of ability. However, in some testing situations, like test translation and…
Descriptors: True Scores, Simulation, Test Bias, Student Evaluation
Stansfield, Charles W. – 1990
The simulated oral proficiency interview (SOPI) is a semi-direct speaking test that models the format of the oral proficiency interview (OPI). The OPI is a method of assessing general speaking proficiency in a second language. The SOPI is a tape-recorded test consisting of six parts: simple personal background questions posed in a simulated…
Descriptors: Comparative Analysis, Interviews, Language Proficiency, Language Tests
Peer reviewed
Segall, Daniel O. – Psychometrika, 1994
An asymptotic expression for the reliability of a linearly equated test is developed using normal theory. Reliability is expressed as the product of test reliability before equating and an adjustment term that is a function of the sample sizes used to estimate the linear equating transformation. The approach is illustrated. (SLD)
Descriptors: Equated Scores, Error of Measurement, Estimation (Mathematics), Sample Size
Yen, Wendy M. – 1982
Test scores that are not perfectly reliable cannot be strictly equated unless they are strictly parallel. This fact implies that tau equivalence can be lost if an equipercentile equating is applied to observed scores that are not strictly parallel. Thirty-six simulated data sets are produced to simulate equating tests with different difficulties…
Descriptors: Difficulty Level, Equated Scores, Latent Trait Theory, Methods
Peer reviewed
Meijer, Rob R.; And Others – Applied Psychological Measurement, 1994
The power of the nonparametric person-fit statistic, U3, is investigated through simulations as a function of item characteristics, test characteristics, person characteristics, and the group to which examinees belong. Results suggest conditions under which relatively short tests can be used for person-fit analysis. (SLD)
Descriptors: Difficulty Level, Group Membership, Item Response Theory, Nonparametric Statistics
Peer reviewed
Renner, Richard R.; Greenwood, Gordon E. – Assessment and Evaluation in Higher Education, 1985
Fictitious student evaluations of a faculty member's teaching performance are presented to the reader in an exercise in interpreting such information. Evaluator comments reveal a widespread divergence of views. (MSE)
Descriptors: College Faculty, Evaluation Criteria, Evaluation Methods, Higher Education
Peer reviewed
Kistner, Emily O.; Muller, Keith E. – Psychometrika, 2004
Intraclass correlation and Cronbach's alpha are widely used to describe reliability of tests and measurements. Even with Gaussian data, exact distributions are known only for compound symmetric covariance (equal variances and equal correlations). Recently, large sample Gaussian approximations were derived for the distribution functions. New exact…
Descriptors: Correlation, Test Reliability, Test Results, Probability