Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 8 |
Since 2006 (last 20 years) | 16 |
Descriptor
Difficulty Level | 19 |
Statistical Analysis | 19 |
Test Format | 19 |
Test Items | 15 |
Item Response Theory | 10 |
Comparative Analysis | 7 |
Foreign Countries | 6 |
Item Analysis | 6 |
Multiple Choice Tests | 6 |
Equated Scores | 5 |
Correlation | 4 |
More ▼ |
Source
Author
Abshire, Elizabeth | 1 |
Algina, James | 1 |
Alpayar, Cagla | 1 |
Batty, Aaron Olaf | 1 |
Binici, Salih | 1 |
Blankenbiller, Margaret | 1 |
Boone, William J. | 1 |
Borowski, Andreas | 1 |
Brownell, Sara E. | 1 |
Burton, Nancy W. | 1 |
Chang, Mei-Lin | 1 |
More ▼ |
Publication Type
Reports - Research | 17 |
Journal Articles | 15 |
Speeches/Meeting Papers | 3 |
Reports - Evaluative | 2 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 6 |
Postsecondary Education | 6 |
Secondary Education | 3 |
Elementary Secondary Education | 2 |
Middle Schools | 2 |
Elementary Education | 1 |
Grade 7 | 1 |
Grade 8 | 1 |
Grade 9 | 1 |
High Schools | 1 |
Junior High Schools | 1 |
More ▼ |
Audience
Researchers | 1 |
Location
Australia | 2 |
Germany | 2 |
Japan | 1 |
Turkey (Ankara) | 1 |
Laws, Policies, & Programs
Assessments and Surveys
Defining Issues Test | 1 |
National Assessment of… | 1 |
Test of English as a Foreign… | 1 |
What Works Clearinghouse Rating
Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024
The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Postratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…
Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests
Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020
The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…
Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level
Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018
Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…
Descriptors: Simulation, Decision Making, Test Construction, Validity
Alpayar, Cagla; Gulleroglu, H. Deniz – Educational Research and Reviews, 2017
The aim of this research is to determine whether students' test performance and approaches to test questions change based on the type of mathematics questions (visual or verbal) administered to them. This research is based on a mixed-design model. The quantitative data are gathered from 297 seventh grade students, attending seven different middle…
Descriptors: Foreign Countries, Middle School Students, Grade 7, Student Evaluation
Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016
The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…
Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory
Schwichow, Martin; Christoph, Simon; Boone, William J.; Härtig, Hendrik – International Journal of Science Education, 2016
The so-called control-of-variables strategy (CVS) incorporates the important scientific reasoning skills of designing controlled experiments and interpreting experimental outcomes. As CVS is a prominent component of science standards appropriate assessment instruments are required to measure these scientific reasoning skills and to evaluate the…
Descriptors: Thinking Skills, Science Instruction, Science Experiments, Science Tests
Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013
Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…
Descriptors: Sample Size, Test Length, Correlation, Test Format
Chen, Jing; Sheehan, Kathleen M. – ETS Research Report Series, 2015
The "TOEFL"® family of assessments includes the "TOEFL"® Primary"™, "TOEFL Junior"®, and "TOEFL iBT"® tests. The linguistic complexity of stimulus passages in the reading sections of the TOEFL family of assessments is expected to differ across the test levels. This study evaluates the linguistic…
Descriptors: Language Tests, Second Language Learning, English (Second Language), Reading Comprehension
Wright, Christian D.; Eddy, Sarah L.; Wenderoth, Mary Pat; Abshire, Elizabeth; Blankenbiller, Margaret; Brownell, Sara E. – CBE - Life Sciences Education, 2016
Recent reform efforts in undergraduate biology have recommended transforming course exams to test at more cognitively challenging levels, which may mean including more cognitively challenging and more constructed-response questions on assessments. However, changing the characteristics of exams could result in bias against historically underserved…
Descriptors: Introductory Courses, Biology, Undergraduate Students, Higher Education
Batty, Aaron Olaf – Language Testing, 2015
The rise in the affordability of quality video production equipment has resulted in increased interest in video-mediated tests of foreign language listening comprehension. Although research on such tests has continued fairly steadily since the early 1980s, studies have relied on analyses of raw scores, despite the growing prevalence of item…
Descriptors: Listening Comprehension Tests, Comparative Analysis, Video Technology, Audio Equipment
Kirschner, Sophie; Borowski, Andreas; Fischer, Hans E.; Gess-Newsome, Julie; von Aufschnaiter, Claudia – International Journal of Science Education, 2016
Teachers' professional knowledge is assumed to be a key variable for effective teaching. As teacher education has the goal to enhance professional knowledge of current and future teachers, this knowledge should be described and assessed. Nevertheless, only a limited number of studies quantitatively measures physics teachers' professional…
Descriptors: Evaluation Methods, Tests, Test Format, Science Instruction
Hudson, Ross D.; Treagust, David F. – Research in Science & Technological Education, 2013
Background: This study developed from observations of apparent achievement differences between male and female chemistry performances in a state university entrance examination. Male students performed more strongly than female students, especially in higher scores. Apart from the gender of the students, two other important factors that might…
Descriptors: Chemistry, College Entrance Examinations, State Universities, Gender Differences
Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012
The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…
Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests
Filipi, Anna – Language Testing, 2012
The Assessment of Language Competence (ALC) certificates is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…
Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries
Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008
Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…
Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level
Previous Page | Next Page »
Pages: 1 | 2