Harbaugh, Allen G.; Liu, Min – AERA Online Paper Repository, 2017
This research examines the effects of single-value response style contamination on measures of model fit and model convergence issues. A simulation study examines the effects resulting from percentage of contamination, number of manifest variables, number of reverse-coded items, magnitude of standardized factor loadings, response scale granularity, and…
Descriptors: Goodness of Fit, Sample Size, Statistical Analysis, Test Format
Jonick, Christine; Schneider, Jennifer; Boylan, Daniel – Accounting Education, 2017
The purpose of the research is to examine the effect of different response formats on student performance on introductory accounting exam questions. The study analyzes 1104 accounting students' responses to quantitative questions presented in two formats: multiple-choice and fill-in. Findings indicate that response format impacts student…
Descriptors: Introductory Courses, Accounting, Test Format, Multiple Choice Tests
Hardcastle, Joseph; Herrmann-Abell, Cari F.; DeBoer, George E. – Grantee Submission, 2017
Can student performance on computer-based tests (CBT) and paper-and-pencil tests (PPT) be considered equivalent measures of student knowledge? States and school districts are grappling with this question, and although studies addressing this question are growing, additional research is needed. We report on the performance of students who took…
Descriptors: Academic Achievement, Computer Assisted Testing, Comparative Analysis, Student Evaluation
Aizawa, Kazumi; Iso, Tatsuo; Nadasdy, Paul – Research-publishing.net, 2017
Testing learners' English proficiency is central to university English classes in Japan. This study developed and implemented a set of parallel online receptive aural and visual vocabulary tests that would predict learners' English proficiency. The tests shared the same target words and choices--the main difference was the presentation of the…
Descriptors: Receptive Language, English (Second Language), Second Language Learning, Word Frequency
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Aizawa, Kazumi; Iso, Tatsuo – Research-publishing.net, 2013
The present study aims to demonstrate how the estimation of vocabulary size might be affected by two neglected factors in vocabulary size tests. The first factor is randomization of question sequence, as opposed to the traditional high-to-low frequency sequencing. The second factor is learners' confidence in choosing the correct meaning for a…
Descriptors: Vocabulary, Computer Assisted Testing, Scores, Multiple Choice Tests
Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012
The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…
Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests
Hendrickson, Amy; Patterson, Brian; Ewing, Maureen – College Board, 2010
The psychometric considerations and challenges associated with including constructed response items on tests are discussed along with how these issues affect the form assembly specifications for mixed-format exams. Reliability and validity, security and fairness, pretesting, content and skills coverage, test length and timing, weights, statistical…
Descriptors: Multiple Choice Tests, Test Format, Test Construction, Test Validity
Kingston, Neal M.; McKinley, Robert L. – 1988
Confirmatory multidimensional item response theory (CMIRT) was used to assess the structure of the Graduate Record Examination General Test, about which much information about factorial structure exists, using a sample of 1,001 psychology majors taking the test in 1984 or 1985. Results supported previous findings that, for this population, there…
Descriptors: College Students, Factor Analysis, Higher Education, Item Analysis
McCall, Chester H., Jr.; Gardner, Suzanne – 1984
The Research Services of the National Education Association (NEA) conducted a nationwide teacher opinion poll (TOP) based upon a stratified disproportionate two-stage cluster sample of classroom teachers. This research study was conducted to test the hypothesis that the order of presentation of items would make no difference in the conclusions…
Descriptors: Attitude Measures, Elementary Secondary Education, National Surveys, Statistical Analysis
Livingston, Samuel A.; And Others – 1989
Combinations of five methods of equating test forms and two methods of selecting samples of students for equating were compared for accuracy. The two sampling methods were representative sampling from the population and matching samples on the anchor test score. The equating methods were: (1) the Tucker method; (2) the Levine method; (3) the…
Descriptors: Comparative Analysis, Data Collection, Equated Scores, High School Students
Lancaster, Diana M.; And Others – 1987
Difficulty and discrimination ability were compared between multiple choice and short answer items in midterm and final examinations for the internal medicine course at Louisiana State University School of Dentistry. The examinations were administered to 67 sophomore dental students in that course. Additionally, the impact of the source of the…
Descriptors: Dental Schools, Dentistry, Difficulty Level, Discriminant Analysis
Gu, Yongqi; And Others – 1995
Based on personal experience, this paper examines the ambiguities of the Likert-type 5-point scale in learning strategy elicitation. Four parallel questionnaires consisting of the same batch of 20 items taken from the Oxford scale (1990) were administered among a group of 120 tertiary level, non-English majors in China. Questionnaire 1 used the…
Descriptors: Ambiguity, College Students, English (Second Language), Foreign Countries
Samson, Digna M. M. – 1983
The traditional multiple-choice reading comprehension test of English as a second language, used in the Dutch school-leaving examinations, has been criticized for its apparent lack of construct validity. The Dutch National Institute for Educational Measurement has conducted a number of studies to determine whether there is a different skill…
Descriptors: English (Second Language), Foreign Countries, Language Tests, Multiple Choice Tests
Legg, Sue M.; Algina, James – 1986
This paper focuses on the questions which arise as test practitioners monitor score scales derived from latent trait theory. Large scale assessment programs are dynamic and constantly challenge the assumptions and limits of latent trait models. Even though testing programs evolve, test scores must remain reliable indicators of progress.…
Descriptors: Difficulty Level, Educational Assessment, Elementary Secondary Education, Equated Scores