Showing 1 to 15 of 21 results
Peer reviewed
Laprise, Shari L. – College Teaching, 2012
Successful exam composition can be a difficult task. Exams should not only assess student comprehension but also serve as learning tools in their own right. In a biotechnology course delivered to nonmajors at a business college, objective multiple-choice test questions often require students to choose the exception or "not true" choice. Anecdotal student…
Descriptors: Feedback (Response), Test Items, Multiple Choice Tests, Biotechnology
Peer reviewed
Stone, Gregory Ethan; Koskey, Kristin L. K.; Sondergeld, Toni A. – Educational and Psychological Measurement, 2011
Typical validation studies on standard setting models, most notably the Angoff and modified Angoff models, have ignored construct development, a critical aspect associated with all conceptualizations of measurement processes. Stone compared the Angoff and objective standard setting (OSS) models and found that Angoff failed to define a legitimate…
Descriptors: Cutting Scores, Standard Setting (Scoring), Models, Construct Validity
Peer reviewed
Hatcher, Donald L. – New Directions for Institutional Research, 2011
In this article, after describing an approach to teaching critical thinking (CT) that was in place at Baker University from 1990 to 2008, the author recounts the experience of assessing CT with three standardized exams and shows why the choice of a standardized CT test can be problematic and the results misleading. These results can be…
Descriptors: Test Results, Essay Tests, Critical Thinking, Thinking Skills
Peer reviewed
Ricketts, Chris; Brice, Julie; Coombes, Lee – Advances in Health Sciences Education, 2010
The purpose of multiple choice tests of medical knowledge is to estimate as accurately as possible a candidate's level of knowledge. However, concern is sometimes expressed that multiple choice tests may also discriminate in undesirable and irrelevant ways, such as between minority ethnic groups or by sex of candidates. There is little literature…
Descriptors: Medical Students, Testing Accommodations, Ethnic Groups, Learning Disabilities
Peer reviewed
Kim, Sooyeon; Walker, Michael E.; McHale, Frederick – Journal of Educational Measurement, 2010
In this study we examined variations of the nonequivalent groups equating design for tests containing both multiple-choice (MC) and constructed-response (CR) items to determine which design was most effective in producing equivalent scores across the two tests to be equated. Using data from a large-scale exam, this study investigated the use of…
Descriptors: Measures (Individuals), Scoring, Equated Scores, Test Bias
Peer reviewed
Wheeler, Sharon; Twist, Craig – European Physical Education Review, 2010
Body mass index (BMI) is increasingly recognized as an inadequate measure for determining obesity in children. Therefore, the aim of this study was to investigate other indirect methods of body fat assessment that could potentially be used in place of BMI. Twenty-four children (boys: 13.8 ± 0.8 yr; girls: 13.3 ± 0.5…
Descriptors: Obesity, Body Composition, Measurement Techniques, Comparative Testing
Peer reviewed
PDF full text on ERIC
Attali, Yigal; Bridgeman, Brent; Trapani, Catherine – Journal of Technology, Learning, and Assessment, 2010
A generic approach in automated essay scoring produces scores that have the same meaning across all prompts, existing or new, of a writing assessment. This is accomplished by using a single set of linguistic indicators (or features), a consistent way of combining and weighting these features into essay scores, and a focus on features that are not…
Descriptors: Writing Evaluation, Writing Tests, Scoring, Test Scoring Machines
Setzer, J. Carl; He, Yi – GED Testing Service, 2009
Reliability Analysis for the Internationally Administered 2002 Series GED (General Educational Development) Tests. Reliability refers to the consistency, or stability, of test scores when the measurement procedure is administered repeatedly to groups of examinees (American Educational Research Association [AERA], American Psychological…
Descriptors: Educational Research, Error of Measurement, Scores, Test Reliability
McGlynn, Angela Provitera – Education Digest: Essential Readings Condensed for Quick Review, 2008
A new report, "The Proficiency Illusion," released last year by the Thomas B. Fordham Institute states that the tests that states use to measure academic progress under the No Child Left Behind Act (NCLB) are creating a false impression of success, especially in reading and especially in the early grades. The report is a collaboration…
Descriptors: Federal Legislation, Academic Achievement, Rating Scales, Achievement Tests
Peer reviewed
Donnellan, M. Brent – Educational and Psychological Measurement, 2008
The properties of the achievement goal inventories developed by Grant and Dweck (2003) and Elliot and McGregor (2001) were evaluated in two studies with a total of 780 participants. A four-factor specification for the Grant and Dweck inventory did not closely replicate results published in their original report. In contrast, the structure of the…
Descriptors: Academic Achievement, Psychometrics, Program Validation, Achievement Rating
Peer reviewed
Kim, Do-Hong; Huynh, Huynh – Educational and Psychological Measurement, 2008
The current study compared student performance between paper-and-pencil testing (PPT) and computer-based testing (CBT) on a large-scale statewide end-of-course English examination. Analyses were conducted at both the item and test levels. The overall results suggest that scores obtained from PPT and CBT were comparable. However, at the content…
Descriptors: Reading Comprehension, Computer Assisted Testing, Factor Analysis, Comparative Testing
Peer reviewed
Ferdous, Abdullah A.; Plake, Barbara S. – Educational and Psychological Measurement, 2007
In an Angoff standard setting procedure, judges estimate the probability that a hypothetical randomly selected minimally competent candidate will answer correctly each item in the test. In many cases, these item performance estimates are made twice, with information shared with the panelists between estimates. Especially for long tests, this…
Descriptors: Test Items, Probability, Item Analysis, Standard Setting (Scoring)
Peer reviewed
Maguire, Phil; Devereux, Barry; Costello, Fintan; Cater, Arthur – Journal of Experimental Psychology: Learning, Memory, and Cognition, 2007
The competition among relations in nominals (CARIN) theory of conceptual combination (C. L. Gagne & E. J. Shoben, 1997) proposes that people interpret nominal compounds by selecting a relation from a pool of competing alternatives and that relation availability is influenced by the frequency with which relations have been previously associated…
Descriptors: Competition, Program Validation, Item Analysis, Human Relations
Peer reviewed
van de Velden, Michel; Bijmolt, Tammo H. A. – Psychometrika, 2006
A method is presented for generalized canonical correlation analysis of two or more matrices with missing rows. The method is a combination of Carroll's (1968) method and the missing data approach of the OVERALS technique (Van der Burg, 1988). In a simulation study we assess the performance of the method and compare it to an existing procedure…
Descriptors: Multivariate Analysis, Matrices, Simulation, Comparative Testing
Peer reviewed
Meyer, J. Patrick; Huynh, Huynh; Seaman, Michael A. – Journal of Educational Measurement, 2004
Exact nonparametric procedures have been used to identify the level of differential item functioning (DIF) in binary items. This study explored the use of exact DIF procedures with items scored on a Likert scale. The results from an attitude survey suggest that the large-sample Cochran-Mantel-Haenszel (CMH) procedure identifies more items as…
Descriptors: Test Bias, Attitude Measures, Surveys, Predictive Validity