Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015
An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…
Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring
Tindal, Gerald; Nese, Joseph F. T.; Stevens, Joseph J. – Educational Assessment, 2017
For the past decade, the accountability model associated with No Child Left Behind (NCLB) emphasized proficiency on end of year tests; with Every Student Succeeds Act (ESSA) the emphasis on proficiency within statewide testing programs, though now integrated with other measures of student learning, nevertheless remains a primary metric for…
Descriptors: Testing Programs, Middle School Students, Models, State Standards
Jiang, Feng; McComas, William F. – International Journal of Science Education, 2015
Gauging the effectiveness of specific teaching strategies remains a major topic of interest in science education. Inquiry teaching among others has been supported by extensive research and recommended by the National Science Education Standards. However, most of the empirical evidence in support was collected in research settings rather than in…
Descriptors: Inquiry, Active Learning, Science Instruction, Science Achievement
Benítez, Isabel; Padilla, José-Luis – Journal of Mixed Methods Research, 2014
Differential item functioning (DIF) can undermine the validity of cross-lingual comparisons. While a lot of efficient statistics for detecting DIF are available, few general findings have been found to explain DIF results. The objective of the article was to study DIF sources by using a mixed method design. The design involves a quantitative phase…
Descriptors: Foreign Countries, Mixed Methods Research, Test Bias, Cross Cultural Studies
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Unamma, Anthony Odera – Open Praxis, 2013
This research work was aimed at determining the degree of community members' interference in the conduct of university distance learning examination in South Eastern Nigeria. It was also aimed at finding out the factors responsible for the community members' interference, the ways by which interference is effected, the consequences and the…
Descriptors: Foreign Countries, Distance Education, Community Involvement, Testing Problems
McBee, Matthew T.; Peters, Scott J.; Waterman, Craig – Gifted Child Quarterly, 2014
Best practice in gifted and talented identification procedures involves making decisions on the basis of multiple measures. However, very little research has investigated the impact of different methods of combining multiple measures. This article examines the consequences of the conjunctive ("and"), disjunctive/complementary…
Descriptors: Best Practices, Ability Identification, Academically Gifted, Correlation
Guo, Hongwen; Liu, Jinghua; Dorans, Neil; Feigenbaum, Miriam – ETS Research Report Series, 2011
Maintaining score stability is crucial for an ongoing testing program that administers several tests per year over many years. One way to stall the drift of the score scale is to use an equating design with multiple links. In this study, we use the operational and experimental SAT® data collected from 44 administrations to investigate the effect…
Descriptors: Equated Scores, College Entrance Examinations, Reliability, Testing Programs
Creagh, Sue – TESOL in Context, 2014
Teachers are now experiencing the age of quantitative test-driven assessment, in which there is little weight accorded to teacher-based judgement about student progress. In the Australian context, the NAPLaN test has become a driving force in school and teacher accountability. The language of NAPLaN is one of bands and numerical scores and…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Student Evaluation
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Mrazik, Martin; Janzen, Troy M.; Dombrowski, Stefan C.; Barford, Sean W.; Krawchuk, Lindsey L. – Canadian Journal of School Psychology, 2012
A total of 19 graduate students enrolled in a graduate course conducted 6 consecutive administrations of the Wechsler Intelligence Scale for Children, 4th edition (WISC-IV, Canadian version). Test protocols were examined to obtain data describing the frequency of examiner errors, including administration and scoring errors. Results identified 511…
Descriptors: Intelligence Tests, Intelligence, Statistical Analysis, Scoring
Scheetz, James P. – 1976
When performing large scale evaluations (e.g., on a state-wide or national level) it may not be possible to administer all items in the item universe to all respondents in the subject population. One method which has been proposed to sample both items and respondents is multiple matrix sampling (MMS) in which a sample of the items is administered…
Descriptors: Item Sampling, Statistical Analysis, Testing Programs
Saez, Leilani; Park, Bitnara; Nese, Joseph F. T.; Jamgochian, Elisa; Lai, Cheng-Fei; Anderson, Daniel; Kamata, Akihito; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2010
In this series of studies, we investigated the technical adequacy of three curriculum-based measures used as benchmarks and for monitoring progress in three critical reading-related skills: fluency, reading comprehension, and vocabulary. In particular, we examined the following easyCBM measurement across grades 3-7 at fall, winter, and spring…
Descriptors: Elementary School Students, Middle School Students, Vocabulary, Reading Comprehension

Harris, Deborah J.; Kolen, Michael J. – Educational and Psychological Measurement, 1988
Three methods of estimating point-biserial correlation coefficient standard errors were compared: (1) assuming normality; (2) not assuming normality; and (3) bootstrapping. Although errors estimated assuming normality were biased, such estimates were less variable and easier to compute, suggesting that this might be the method of choice in some…
Descriptors: Error of Measurement, Estimation (Mathematics), Item Analysis, Statistical Analysis
Ajuonuma, Juliet O. – African Higher Education Review, 2008
This study was designed to carry out a survey of the implementation of continuous assessment (CA) in Nigerian universities. Two research questions and one hypothesis were formulated to guide the study. The sample for the study consisted of 1,340 respondents. A 24-item self-report instrument was used for the study. The data generated were analyzed…
Descriptors: Foreign Countries, Program Implementation, Testing Programs, Test Items