Showing 1 to 15 of 47 results
Peer reviewed
PDF on ERIC
Sayin, Ayfer; Sata, Mehmet – International Journal of Assessment Tools in Education, 2022
The aim of the present study was to examine Turkish teacher candidates' competency levels in writing different types of test items by utilizing Rasch analysis. In addition, the effect of the expertise of the raters scoring the items written by the teacher candidates was examined within the scope of the study. 84 Turkish teacher candidates…
Descriptors: Foreign Countries, Item Response Theory, Evaluators, Expertise
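As background for the Rasch analysis named in the entry above: the dichotomous Rasch model expresses the probability that person n answers item i correctly as a function of person ability \(\theta_n\) and item difficulty \(b_i\). When rater effects are also modeled, as in studies of rater expertise, a many-facet extension adds a rater severity term \(c_j\). A sketch of both forms (the notation here is standard, not taken from the study itself):

```latex
% Dichotomous Rasch model
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

% Many-facet extension with rater severity c_j
P(X_{nij} = 1 \mid \theta_n, b_i, c_j) = \frac{\exp(\theta_n - b_i - c_j)}{1 + \exp(\theta_n - b_i - c_j)}
```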
Peer reviewed
Direct link
Lee, Won-Chan; Kim, Stella Y.; Choi, Jiwon; Kang, Yujin – Journal of Educational Measurement, 2020
This article considers psychometric properties of composite raw scores and transformed scale scores on mixed-format tests that consist of a mixture of multiple-choice and free-response items. Test scores on several mixed-format tests are evaluated with respect to conditional and overall standard errors of measurement, score reliability, and…
Descriptors: Raw Scores, Item Response Theory, Test Format, Multiple Choice Tests
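The standard error of measurement (SEM) evaluated in the entry above relates score variability to reliability. The overall SEM follows from the classical test theory identity below; a conditional SEM instead estimates this quantity separately at each score level, which is what makes it informative for mixed-format tests (general formula, not specific to the study):

```latex
\mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}
```

where \(\sigma_X\) is the standard deviation of observed scores and \(\rho_{XX'}\) is the score reliability.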
Peer reviewed
Direct link
Kim, Stella Y.; Lee, Won-Chan – Applied Measurement in Education, 2019
This study explores classification consistency and accuracy for mixed-format tests using real and simulated data. In particular, the current study compares six methods of estimating classification consistency and accuracy for seven mixed-format tests. The relative performance of the estimation methods is evaluated using simulated data. Study…
Descriptors: Classification, Reliability, Accuracy, Test Format
Tingir, Seyfullah – ProQuest LLC, 2019
Educators use various statistical techniques to explain relationships between latent and observable variables. One way to model these relationships is to use Bayesian networks as a scoring model. However, adjusting the conditional probability tables (CPT-parameters) to fit a set of observations is still a challenge when using Bayesian networks. A…
Descriptors: Bayesian Statistics, Statistical Analysis, Scoring, Probability
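To illustrate the kind of scoring model the entry above describes: a minimal Bayesian-network score update uses a conditional probability table (CPT) to revise belief in a latent "mastery" state after observing one item response. The prior and CPT values below are hypothetical, and this single-node sketch is far simpler than the networks such studies calibrate; it only shows the Bayes-rule mechanics.

```python
# Prior belief over the latent variable (hypothetical values)
prior = {"master": 0.5, "non_master": 0.5}

# CPT: P(correct response | latent state). These are the
# parameters that calibration procedures adjust to fit data.
cpt_correct = {"master": 0.85, "non_master": 0.25}

def posterior_given_correct(prior, cpt):
    """Return P(state | correct response) via Bayes' rule."""
    joint = {s: prior[s] * cpt[s] for s in prior}   # P(state, correct)
    evidence = sum(joint.values())                  # P(correct)
    return {s: joint[s] / evidence for s in joint}

post = posterior_given_correct(prior, cpt_correct)
# Observing a correct response shifts belief toward mastery.
```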
Peer reviewed
Direct link
Hao, Tao; Wang, Zhe; Ardasheva, Yuliya – Journal of Research on Educational Effectiveness, 2021
This meta-analysis reviewed research between 2012 and 2018 focused on technology-assisted second language (L2) vocabulary learning for English as a foreign language (EFL) learners. A total of 45 studies of 2,374 preschool-to-college EFL students contributed effect sizes to this meta-analysis. Compared with traditional instructional methods, the…
Descriptors: Vocabulary Development, Second Language Learning, Second Language Instruction, English (Second Language)
Peer reviewed
Direct link
Kim, Sooyeon; Moses, Tim – International Journal of Testing, 2013
The major purpose of this study is to assess the conditions under which single scoring for constructed-response (CR) items is as effective as double scoring in the licensure testing context. We used both empirical datasets of five mixed-format licensure tests collected in actual operational settings and simulated datasets that allowed for the…
Descriptors: Scoring, Test Format, Licensing Examinations (Professions), Test Items
Peer reviewed
Direct link
Yarnell, Jordy B.; Pfeiffer, Steven I. – Journal of Psychoeducational Assessment, 2015
The present study examined the psychometric equivalence of administering a computer-based version of the Gifted Rating Scale (GRS) compared with the traditional paper-and-pencil GRS-School Form (GRS-S). The GRS-S is a teacher-completed rating scale used in gifted assessment. The GRS-Electronic Form provides an alternative method of administering…
Descriptors: Gifted, Psychometrics, Rating Scales, Computer Assisted Testing
Peer reviewed
PDF on ERIC
Moses, Tim – ETS Research Report Series, 2013
The purpose of this report is to review ETS psychometric contributions that focus on test scores. Two major sections review contributions based on assessing test scores' measurement characteristics and other contributions about using test scores as predictors in correlational and regression relationships. An additional section reviews additional…
Descriptors: Psychometrics, Scores, Correlation, Regression (Statistics)
Peer reviewed
Direct link
Irwin, Brian; Hepplestone, Stuart – Assessment & Evaluation in Higher Education, 2012
There have been calls in the literature for changes to assessment practices in higher education, to increase flexibility and give learners more control over the assessment process. This article explores the possibilities of allowing student choice in the format used to present their work, as a starting point for changing assessment, based on…
Descriptors: Student Evaluation, College Students, Selection, Computer Assisted Testing
Peer reviewed
Direct link
Hoffman, Lesa; Templin, Jonathan; Rice, Mabel L. – Journal of Speech, Language, and Hearing Research, 2012
Purpose: The present work describes how vocabulary ability as assessed by 3 different forms of the Peabody Picture Vocabulary Test (PPVT; Dunn & Dunn, 1997) can be placed on a common latent metric through item response theory (IRT) modeling, by which valid comparisons of ability between samples or over time can then be made. Method: Responses…
Descriptors: Item Response Theory, Test Format, Vocabulary, Comparative Analysis
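The common-metric IRT linking described in the entry above typically rests on the indeterminacy of the latent scale: under a two-parameter logistic (2PL) model, a linear transformation of \(\theta\) leaves response probabilities unchanged provided the item parameters are transformed correspondingly. A sketch of the standard relations (general IRT linking formulas, not parameters from the study):

```latex
% 2PL model
P(X_i = 1 \mid \theta) = \frac{1}{1 + \exp\!\left[-a_i(\theta - b_i)\right]}

% Linear scale transformation theta* = A*theta + B preserves P if
a_i^{*} = a_i / A, \qquad b_i^{*} = A\, b_i + B
```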
Peer reviewed
Direct link
Weatherly, Jeffrey N.; Derenne, Adam; Terrell, Heather K. – Psychological Record, 2011
Several measures of delay discounting have been shown to be reliable over periods of up to 3 months. In the present study, 115 participants completed a fill-in-the-blank (FITB) delay-discounting task on sets of 5 different commodities, 12 weeks apart. Results showed that discounting rates were not well described by a hyperbolic function but were…
Descriptors: Delay of Gratification, Reliability, Test Format, Measures (Individuals)
Peer reviewed
Direct link
Lee, HyeSun; Winke, Paula – Language Testing, 2013
We adapted three practice College Scholastic Ability Tests (CSAT) of English listening, each with five-option items, to create four- and three-option versions by asking 73 Korean speakers or learners of English to eliminate the least plausible options in two rounds. Two hundred and sixty-four Korean high school English-language learners formed…
Descriptors: Academic Ability, Stakeholders, Reliability, Listening Comprehension Tests
Peer reviewed
Direct link
Kolen, Michael J.; Lee, Won-Chan – Educational Measurement: Issues and Practice, 2011
This paper illustrates that the psychometric properties of scores and scales that are used with mixed-format educational tests can impact the use and interpretation of the scores that are reported to examinees. Psychometric properties that include reliability and conditional standard errors of measurement are considered in this paper. The focus is…
Descriptors: Test Use, Test Format, Error of Measurement, Raw Scores
Peer reviewed
Direct link
Steinmetz, Jean-Paul; Brunner, Martin; Loarer, Even; Houssemand, Claude – Psychological Assessment, 2010
The Wisconsin Card Sorting Test (WCST) assesses executive and frontal lobe function and can be administered manually or by computer. Despite the widespread application of the 2 versions, the psychometric equivalence of their scores has rarely been evaluated and only a limited set of criteria has been considered. The present experimental study (N =…
Descriptors: Computer Assisted Testing, Psychometrics, Test Theory, Scores
Peer reviewed
Direct link
Vassar, Matt – Social Indicators Research, 2008
The purpose of the present study was to meta-analytically investigate the score reliability for the Satisfaction With Life Scale. Four-hundred and sixteen articles using the measure were located through electronic database searches and then separated to identify studies which had calculated reliability estimates from their own data. Sixty-two…
Descriptors: Test Format, Life Satisfaction, Reliability, Measures (Individuals)