Showing 1 to 15 of 33 results
Peer reviewed
Slade, Teri; Gross, Douglas P.; Niwa, Laura; McKillop, Ashley B.; Guptill, Christine – International Journal of Social Research Methodology, 2021
There is increasing concern among researchers about collecting data on sex and gender variables, yet many researchers are unsure of how to deal meaningfully with these variables. Drawing on literature that tests the psychometric properties of sex and gender demographic questions, we present considerations for collecting sex and gender demographic…
Descriptors: Demography, Sex, Gender Issues, Research Methodology
Peer reviewed
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
Peer reviewed
Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017
Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…
Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy
Peer reviewed
Flanagan, Dawn P.; Schneider, W. Joel – International Journal of School & Educational Psychology, 2016
When education works, it creates productive, innovative citizens eager to contribute to a well-functioning democracy. In contrast, educational failure has lifelong consequences, with some individuals experiencing decades of preventable hardship. Dawn Flanagan and Joel Schneider write in this response that, like Kranzler, Floyd, Benson, Zabowski,…
Descriptors: Learning Disabilities, Identification, Diagnostic Tests, Criticism
Peer reviewed
Bradshaw, Jenny; Wheater, Rebecca – Research Papers in Education, 2013
This review examined a range of approaches internationally to the reporting of assessment results for individual students, with a particular focus on how results are represented, the level of detail reported and the steps taken to quantify, report and explain error and uncertainty in the results' reports or certificates given to students in a…
Descriptors: Test Reliability, Error of Measurement, High Stakes Tests, Foreign Countries
Peer reviewed
Kaplan, David; Depaoli, Sarah – Structural Equation Modeling: A Multidisciplinary Journal, 2011
This article examines the problem of specification error in 2 models for categorical latent variables: the latent class model and the latent Markov model. Specification error in the latent class model focuses on the impact of incorrectly specifying the number of latent classes of the categorical latent variable on measures of model adequacy as…
Descriptors: Markov Processes, Longitudinal Studies, Probability, Item Response Theory
Peer reviewed
Ludtke, Oliver; Marsh, Herbert W.; Robitzsch, Alexander; Trautwein, Ulrich – Psychological Methods, 2011
In multilevel modeling, group-level variables (L2) for assessing contextual effects are frequently generated by aggregating variables from a lower level (L1). A major problem of contextual analyses in the social sciences is that there is no error-free measurement of constructs. In the present article, 2 types of error occurring in multilevel data…
Descriptors: Simulation, Educational Psychology, Social Sciences, Measurement
Peer reviewed
Yang, Manshu; Chow, Sy-Miin – Psychometrika, 2010
Facial electromyography (EMG) is a useful physiological measure for detecting subtle affective changes in real time. A time series of EMG data contains bursts of electrical activity that increase in magnitude when the pertinent facial muscles are activated. Whereas previous methods for detecting EMG activation are often based on deterministic or…
Descriptors: Test Bias, Error of Measurement, Human Body, Diagnostic Tests
Micceri, Theodore; Parasher, Pradnya; Waugh, Gordon W.; Herreid, Charlene – Online Submission, 2009
An extensive review of the research literature and a study comparing over 36,000 survey responses with archival true scores indicated that one should expect a minimum of at least three percent random error for the least ambiguous of self-report measures. The Gulliver Effect occurs when a small proportion of error in a sizable subpopulation exerts…
Descriptors: Error of Measurement, Minority Groups, Measurement, Computation
Peer reviewed
Vaughn, Brandon K.; Wang, Qiu – Educational and Psychological Measurement, 2010
A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…
Descriptors: Test Bias, Classification, Nonparametric Statistics, Regression (Statistics)
Peer reviewed
Van Ravesteyn, Nicolien T.; Dallmeijer, Annet J.; Scholtes, Vanessa A.; Roorda, Leo D.; Becher, Jules G. – Developmental Medicine & Child Neurology, 2010
Aim: The objective of this study was to assess the reliability of a mobility questionnaire (MobQues) that was developed to measure the mobility limitations of children with cerebral palsy (CP) as rated by their parents. A clinical version of the questionnaire, consisting of 47 items (MobQues47), is available, as well as a research version with 28…
Descriptors: Cerebral Palsy, Interrater Reliability, Questionnaires, Classification
Peer reviewed
Sijtsma, Klaas – International Journal of Testing, 2009
This article reviews three topics from test theory that continue to raise discussion and controversy and capture test theorists' and constructors' interest. The first topic concerns the discussion of the methodology of investigating and establishing construct validity; the second topic concerns reliability and its misuse, alternative definitions…
Descriptors: Construct Validity, Reliability, Classification, Test Theory
Peer reviewed
Bramley, Tom – Educational Research, 2010
Background: A recent article published in "Educational Research" on the reliability of results in National Curriculum testing in England (Newton, "The reliability of results from national curriculum testing in England," "Educational Research" 51, no. 2: 181-212, 2009) suggested that: (1) classification accuracy can be…
Descriptors: National Curriculum, Educational Research, Testing, Measurement
Group of Eight (NJ1), 2012
The current main world university rankings broadly group the leading research universities of nations. Australia's Go8 universities are generally within the top 250 ranked universities, with several institutions in the top 50-100 on some measures. This recognition is commendable, however imperfect the individual rankings may be. Use is made of…
Descriptors: Evaluation Methods, Foreign Countries, Public Policy, Research Universities
Peer reviewed
Monbaliu, E.; Ortibus, E.; Roelens, F.; Desloovere, K.; Deklerck, J.; Prinzie, P.; De Cock, P.; Feys, H. – Developmental Medicine & Child Neurology, 2010
Aim: This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Method: Three raters independently scored videotapes of 10 patients (five males, five females;…
Descriptors: Content Validity, Cerebral Palsy, Validity, Interrater Reliability