Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 6 |
Descriptor
Comparative Analysis | 7 |
Error of Measurement | 7 |
Testing Programs | 7 |
Equated Scores | 3 |
Sampling | 3 |
Computation | 2 |
Context Effect | 2 |
Hierarchical Linear Modeling | 2 |
International Programs | 2 |
Item Response Theory | 2 |
Language Tests | 2 |
More ▼ |
Source
Journal of Educational and… | 2 |
ETS Research Report Series | 1 |
Grantee Submission | 1 |
Language Testing | 1 |
ProQuest LLC | 1 |
Author
Cai, Li | 2 |
Yang, Ji Seung | 2 |
Guo, Hongwen | 1 |
Gutierrez Arvizu, Maria Nelly | 1 |
Isbell, Daniel | 1 |
Jamieson, Joan | 1 |
Kolen, Michael J. | 1 |
LaFlair, Geoffrey T. | 1 |
Lee, Yi-Hsuan | 1 |
May, L. D. Nicolas | 1 |
Qian, Jiahe | 1 |
More ▼ |
Publication Type
Reports - Research | 5 |
Journal Articles | 4 |
Dissertations/Theses -… | 1 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Education Level
Higher Education | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
Program for International… | 2 |
What Works Clearinghouse Rating
LaFlair, Geoffrey T.; Isbell, Daniel; May, L. D. Nicolas; Gutierrez Arvizu, Maria Nelly; Jamieson, Joan – Language Testing, 2017
Language programs need multiple test forms for secure administrations and effective placement decisions, but can they have confidence that scores on alternate test forms have the same meaning? In large-scale testing programs, various equating methods are available to ensure the comparability of forms. The choice of equating method is informed by…
Descriptors: Language Tests, Equated Scores, Testing Programs, Comparative Analysis
Yang, Ji Seung; Cai, Li – Journal of Educational and Behavioral Statistics, 2014
The main purpose of this study is to improve estimation efficiency in obtaining maximum marginal likelihood estimates of contextual effects in the framework of nonlinear multilevel latent variable model by adopting the Metropolis-Hastings Robbins-Monro algorithm (MH-RM). Results indicate that the MH-RM algorithm can produce estimates and standard…
Descriptors: Computation, Hierarchical Linear Modeling, Mathematics, Context Effect
Yang, Ji Seung; Cai, Li – Grantee Submission, 2014
The main purpose of this study is to improve estimation efficiency in obtaining maximum marginal likelihood estimates of contextual effects in the framework of nonlinear multilevel latent variable model by adopting the Metropolis-Hastings Robbins-Monro algorithm (MH-RM; Cai, 2008, 2010a, 2010b). Results indicate that the MH-RM algorithm can…
Descriptors: Computation, Hierarchical Linear Modeling, Mathematics, Context Effect
Wang, Lin; Qian, Jiahe; Lee, Yi-Hsuan – ETS Research Report Series, 2013
The purpose of this study was to evaluate the combined effects of reduced equating sample size and shortened anchor test length on item response theory (IRT)-based linking and equating results. Data from two independent operational forms of a large-scale testing program were used to establish the baseline results for evaluating the results from…
Descriptors: Test Construction, Item Response Theory, Testing Programs, Simulation
Guo, Hongwen; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2011
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
Descriptors: Testing Programs, Measurement, Item Analysis, Error of Measurement
Wang, Huan – ProQuest LLC, 2010
Multiple uses of the same assessment may present challenges for both the design and use of an assessment. Little advice, however, has been given to assessment developers as to how to understand the phenomena of multiple assessment use and meet the challenges these present. Particularly problematic is the case in which an assessment is used for…
Descriptors: Test Use, Testing Programs, Program Effectiveness, Test Construction
Kolen, Michael J. – 1984
Large sample standard errors for the Tucker method of linear equating under the common item nonrandom groups design are derived under normality assumptions as well as under less restrictive assumptions. Standard errors of Tucker equating are estimated using the bootstrap method described by Efron. The results from different methods are compared…
Descriptors: Certification, Comparative Analysis, Equated Scores, Error of Measurement