NotesFAQContact Us
Collection
Advanced
Search Tips
Audience
Researchers2
Laws, Policies, & Programs
What Works Clearinghouse Rating
Showing all 14 results Save | Export
Peer reviewed Peer reviewed
Direct linkDirect link
Stefanie A. Wind; Yangmeng Xu – Educational Assessment, 2024
We explored three approaches to resolving or re-scoring constructed-response items in mixed-format assessments: rater agreement, person fit, and targeted double scoring (TDS). We used a simulation study to consider how the three approaches impact the psychometric properties of student achievement estimates, with an emphasis on person fit. We found…
Descriptors: Interrater Reliability, Error of Measurement, Evaluation Methods, Examiners
Peer reviewed Peer reviewed
Direct linkDirect link
Gorgun, Guher; Bulut, Okan – Educational and Psychological Measurement, 2021
In low-stakes assessments, some students may not reach the end of the test and leave some items unanswered due to various reasons (e.g., lack of test-taking motivation, poor time management, and test speededness). Not-reached items are often treated as incorrect or not-administered in the scoring process. However, when the proportion of…
Descriptors: Scoring, Test Items, Response Style (Tests), Mathematics Tests
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Grantee Submission, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Beula M. Magimairaj; Philip Capin; Sandra L. Gillam; Sharon Vaughn; Greg Roberts; Anna-Maria Fall; Ronald B. Gillam – Language, Speech, and Hearing Services in Schools, 2022
Purpose: Our aim was to evaluate the psychometric properties of the online administered format of the Test of Narrative Language--Second Edition (TNL-2; Gillam & Pearson, 2017), given the importance of assessing children's narrative ability and considerable absence of psychometric studies of spoken language assessments administered online.…
Descriptors: Computer Assisted Testing, Language Tests, Story Telling, Language Impairments
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Peer reviewed Peer reviewed
Direct linkDirect link
Puhan, Gautam – Journal of Educational Measurement, 2012
Tucker and chained linear equatings were evaluated in two testing scenarios. In Scenario 1, referred to as rater comparability scoring and equating, the anchor-to-total correlation is often very high for the new form but moderate for the reference form. This may adversely affect the results of Tucker equating, especially if the new and reference…
Descriptors: Testing, Scoring, Equated Scores, Statistical Analysis
Peer reviewed Peer reviewed
Direct linkDirect link
Casabianca, Jodi M.; McCaffrey, Daniel F.; Gitomer, Drew H.; Bell, Courtney A.; Hamre, Bridget K.; Pianta, Robert C. – Educational and Psychological Measurement, 2013
Classroom observation of teachers is a significant part of educational measurement; measurements of teacher practice are being used in teacher evaluation systems across the country. This research investigated whether observations made live in the classroom and from video recording of the same lessons yielded similar inferences about teaching.…
Descriptors: Secondary School Mathematics, Mathematics Instruction, Classroom Observation Techniques, Algebra
Peer reviewed Peer reviewed
Direct linkDirect link
Gagnon, Robert; Lubarsky, Stuart; Lambert, Carole; Charlin, Bernard – Advances in Health Sciences Education, 2011
The Script Concordance Test (SCT) uses a panel-based, aggregate scoring method that aims to capture the variability of responses of experienced practitioners to particular clinical situations. The use of this type of scoring method is a key determinant of the tool's discriminatory power, but deviant answers could potentially diminish the…
Descriptors: Expertise, Oncology, Scoring, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Collins, Rebecca L.; Martino, Steven C.; Elliott, Marc N. – Developmental Psychology, 2011
Longitudinal research has demonstrated a link between exposure to sexual content in media and subsequent changes in adolescent sexual behavior, including initiation of intercourse and various noncoital sexual activities. Based on a reanalysis of one of the data sets involved, Steinberg and Monahan (2011) have challenged these findings. However,…
Descriptors: Sexuality, Mass Media Effects, Adolescents, Evaluation Methods
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Carnoy, Martin – National Education Policy Center, 2015
Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading…
Descriptors: Criticism, Reputation, Test Validity, Error of Measurement
Peer reviewed Peer reviewed
Direct linkDirect link
Schlotz, Wolff; Yim, Ilona S.; Zoccola, Peggy M.; Jansen, Lars; Schulz, Peter – Psychological Assessment, 2011
There is accumulating evidence that individual differences in stress reactivity contribute to the risk for stress-related disease. However, the assessment of stress reactivity remains challenging, and there is a relative lack of questionnaires reliably assessing this construct. We here present the Perceived Stress Reactivity Scale (PSRS), a…
Descriptors: Stress Variables, Self Efficacy, Factor Structure, Infants
Suddick, David E.; And Others – 1985
The Test of Standard Written English (TSWE) is a 50-item multiple choice instrument designed to assess the ability of college students to use English. In this study, based upon a sample of 45 students, the TSWE was revalidated with writing samples. The coefficient of 0.54 was most impressive given that the TSWE scores were restricted to those…
Descriptors: Correlation, Error of Measurement, Essay Tests, Higher Education
Shale, Doug – 1986
This study is an attempt at a cohesive characterization of the concept of essay reliability. As such, it takes as a basic premise that previous and current practices in reporting reliability estimates for essay tests have certain shortcomings. The study provides an analysis of these shortcomings--partly to encourage a fuller understanding of the…
Descriptors: Analysis of Variance, Correlation, Error of Measurement, Essay Tests
Peer reviewed Peer reviewed
PDF on ERIC Download full text
Xi, Xiaoming; Mollaun, Pam – ETS Research Report Series, 2006
This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information in three aspects of candidates' performance: delivery, language use, and topic development. G studies were used to investigate the dependability of the analytic scores, the distinctness of the…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Oral Language