ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	8
Since 2006 (last 20 years)	16

Descriptor

Difficulty Level	19
Statistical Analysis	19
Test Format	19
Test Items	15
Item Response Theory	10
Comparative Analysis	7
Foreign Countries	6
Item Analysis	6
Multiple Choice Tests	6
Equated Scores	5
Correlation	4
Language Tests	4
Test Reliability	4
English (Second Language)	3
Gender Differences	3
Goodness of Fit	3
Higher Education	3
Listening Comprehension Tests	3
Reading Tests	3
Sample Size	3
Science Instruction	3
Second Language Learning	3
Test Validity	3
Academic Achievement	2
College Entrance Examinations	2

Source

ETS Research Report Series	2
International Journal of…	2
Journal of Psychoeducational…	2
Language Testing	2
Applied Measurement in…	1
CBE - Life Sciences Education	1
Educational Research and…	1
Educational and Psychological…	1
Journal of Interactive Online…	1
Pearson	1
Practical Assessment,…	1
Research in Science &…	1

Publication Type

Reports - Research	17
Journal Articles	15
Speeches/Meeting Papers	3
Reports - Evaluative	2
Tests/Questionnaires	1

Education Level

Higher Education	6
Postsecondary Education	6
Secondary Education	3
Elementary Secondary Education	2
Middle Schools	2
Elementary Education	1
Grade 7	1
Grade 8	1
Grade 9	1
High Schools	1
Junior High Schools	1

Audience

Researchers

Location

Australia	2
Germany	2
Japan	1
Turkey (Ankara)	1

Assessments and Surveys

Defining Issues Test	1
National Assessment of…	1
Test of English as a Foreign…	1

Showing 1 to 15 of 19 results

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed

Laukaityte, Inga; Wiberg, Marie – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

Subscore Equating and Profile Reporting

Peer reviewed

Lim, Euijin; Lee, Won-Chan – Applied Measurement in Education, 2020

The purpose of this study is to address the necessity of subscore equating and to evaluate the performance of various equating methods for subtests. Assuming the random groups design and number-correct scoring, this paper analyzed real data and simulated data with four study factors including test dimensionality, subtest length, form difference in…

Descriptors: Equated Scores, Test Length, Test Format, Difficulty Level

On Using Simulations to Inform Decision Making during Instrument Development

Peer reviewed

Morgan, Grant B.; Moore, Courtney A.; Floyd, Harlee S. – Journal of Psychoeducational Assessment, 2018

Although content validity--how well each item of an instrument represents the construct being measured--is foundational in the development of an instrument, statistical validity is also important to the decisions that are made based on the instrument. The primary purpose of this study is to demonstrate how simulation studies can be used to assist…

Descriptors: Simulation, Decision Making, Test Construction, Validity

Examination of Test and Item Statistics from Visual and Verbal Mathematics Questions

Peer reviewed

Alpayar, Cagla; Gulleroglu, H. Deniz – Educational Research and Reviews, 2017

The aim of this research is to determine whether students' test performance and approaches to test questions change based on the type of mathematics questions (visual or verbal) administered to them. This research is based on a mixed-design model. The quantitative data are gathered from 297 seventh grade students, attending seven different middle…

Descriptors: Foreign Countries, Middle School Students, Grade 7, Student Evaluation

Examining the Teachers' Sense of Efficacy Scale at the Item Level with Rasch Measurement Model

Peer reviewed

Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016

The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…

Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory

The Impact of Sub-Skills and Item Content on Students' Skills with Regard to the Control-of-Variables Strategy

Peer reviewed

Schwichow, Martin; Christoph, Simon; Boone, William J.; Härtig, Hendrik – International Journal of Science Education, 2016

The so-called control-of-variables strategy (CVS) incorporates the important scientific reasoning skills of designing controlled experiments and interpreting experimental outcomes. As CVS is a prominent component of science standards, appropriate assessment instruments are required to measure these scientific reasoning skills and to evaluate the…

Descriptors: Thinking Skills, Science Instruction, Science Experiments, Science Tests

An Investigation of Sample Size Splitting on ATFIND and DIMTEST

Peer reviewed

Socha, Alan; DeMars, Christine E. – Educational and Psychological Measurement, 2013

Modeling multidimensional test data with a unidimensional model can result in serious statistical errors, such as bias in item parameter estimates. Many methods exist for assessing the dimensionality of a test. The current study focused on DIMTEST. Using simulated data, the effects of sample size splitting for use with the ATFIND procedure for…

Descriptors: Sample Size, Test Length, Correlation, Test Format

Analyzing and Comparing Reading Stimulus Materials across the "TOEFL"® Family of Assessments. "TOEFL iBT"® Research Report. TOEFL iBT-26. ETS Research Report No. RR-15-08

Peer reviewed

Chen, Jing; Sheehan, Kathleen M. – ETS Research Report Series, 2015

The "TOEFL"® family of assessments includes the "TOEFL® Primary"™, "TOEFL Junior"®, and "TOEFL iBT"® tests. The linguistic complexity of stimulus passages in the reading sections of the TOEFL family of assessments is expected to differ across the test levels. This study evaluates the linguistic…

Descriptors: Language Tests, Second Language Learning, English (Second Language), Reading Comprehension

Cognitive Difficulty and Format of Exams Predicts Gender and Socioeconomic Gaps in Exam Performance of Students in Introductory Biology Courses

Peer reviewed

Wright, Christian D.; Eddy, Sarah L.; Wenderoth, Mary Pat; Abshire, Elizabeth; Blankenbiller, Margaret; Brownell, Sara E. – CBE - Life Sciences Education, 2016

Recent reform efforts in undergraduate biology have recommended transforming course exams to test at more cognitively challenging levels, which may mean including more cognitively challenging and more constructed-response questions on assessments. However, changing the characteristics of exams could result in bias against historically underserved…

Descriptors: Introductory Courses, Biology, Undergraduate Students, Higher Education

A Comparison of Video- and Audio-Mediated Listening Tests with Many-Facet Rasch Modeling and Differential Distractor Functioning

Peer reviewed

Batty, Aaron Olaf – Language Testing, 2015

The rise in the affordability of quality video production equipment has resulted in increased interest in video-mediated tests of foreign language listening comprehension. Although research on such tests has continued fairly steadily since the early 1980s, studies have relied on analyses of raw scores, despite the growing prevalence of item…

Descriptors: Listening Comprehension Tests, Comparative Analysis, Video Technology, Audio Equipment

Developing and Evaluating a Paper-and-Pencil Test to Assess Components of Physics Teachers' Pedagogical Content Knowledge

Peer reviewed

Kirschner, Sophie; Borowski, Andreas; Fischer, Hans E.; Gess-Newsome, Julie; von Aufschnaiter, Claudia – International Journal of Science Education, 2016

Teachers' professional knowledge is assumed to be a key variable for effective teaching. As teacher education has the goal to enhance professional knowledge of current and future teachers, this knowledge should be described and assessed. Nevertheless, only a limited number of studies quantitatively measure physics teachers' professional…

Descriptors: Evaluation Methods, Tests, Test Format, Science Instruction

Which Form of Assessment Provides the Best Information about Student Performance in Chemistry Examinations?

Peer reviewed

Hudson, Ross D.; Treagust, David F. – Research in Science & Technological Education, 2013

Background: This study developed from observations of apparent achievement differences between male and female chemistry performances in a state university entrance examination. Male students performed more strongly than female students, especially in higher scores. Apart from the gender of the students, two other important factors that might…

Descriptors: Chemistry, College Entrance Examinations, State Universities, Gender Differences

Population Invariance of Vertical Scaling Results

Powers, Sonya; Turhan, Ahmet; Binici, Salih – Pearson, 2012

The population sensitivity of vertical scaling results was evaluated for a state reading assessment spanning grades 3-10 and a state mathematics test spanning grades 3-8. Subpopulations considered included males and females. The 3-parameter logistic model was used to calibrate math and reading items and a common item design was used to construct…

Descriptors: Scaling, Equated Scores, Standardized Tests, Reading Tests

Do Questions Written in the Target Language Make Foreign Language Listening Comprehension Tests More Difficult?

Peer reviewed

Filipi, Anna – Language Testing, 2012

The Assessment of Language Competence (ALC) certificates is an annual, international testing program developed by the Australian Council for Educational Research to test the listening and reading comprehension skills of lower to middle year levels of secondary school. The tests are developed for three levels in French, German, Italian and…

Descriptors: Listening Comprehension Tests, Item Response Theory, Statistical Analysis, Foreign Countries

Examining an Alternative to Score Equating: A Randomly Equivalent Forms Approach. Research Report. ETS RR-08-14

Peer reviewed

Liao, Chi-Wen; Livingston, Samuel A. – ETS Research Report Series, 2008

Randomly equivalent forms (REF) of tests in listening and reading for nonnative speakers of English were created by stratified random assignment of items to forms, stratifying on item content and predicted difficulty. The study included 50 replications of the procedure for each test. Each replication generated 2 REFs. The equivalence of those 2…

Descriptors: Equated Scores, Item Analysis, Test Items, Difficulty Level

Author

Abshire, Elizabeth	1
Algina, James	1
Alpayar, Cagla	1
Batty, Aaron Olaf	1
Binici, Salih	1
Blankenbiller, Margaret	1
Boone, William J.	1
Borowski, Andreas	1
Brownell, Sara E.	1
Burton, Nancy W.	1
Chang, Mei-Lin	1
Chen, Jing	1
Christoph, Simon	1
DeMars, Christine E.	1
Eddy, Sarah L.	1
Engelhard, George, Jr.	1
Filipi, Anna	1
Fischer, Hans E.	1
Floyd, Harlee S.	1
Gess-Newsome, Julie	1
Gulleroglu, H. Deniz	1
Hudson, Ross D.	1
Härtig, Hendrik	1
Inga Laukaityte	1