Publication Date
| In 2026 | 0 |
| Since 2025 | 200 |
| Since 2022 (last 5 years) | 1070 |
| Since 2017 (last 10 years) | 2580 |
| Since 2007 (last 20 years) | 4941 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Bailey, Alison L.; Huang, Becky H.; Shin, Hye Won; Farnsworth, Tim; Butler, Frances A. – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2007
Within an evidentiary framework for operationally defining academic English language proficiency (AELP), linguistic analyses of standards, classroom discourse, and textbooks have led to specifications for assessment of AELP. The test development process described here is novel due to the emphasis on using linguistic profiles to inform the …
Descriptors: Grade 5, Textbooks, Psychometrics, Profiles
Lievens, Filip; Sackett, Paul R. – Journal of Applied Psychology, 2007
This study used principles underlying item generation theory to posit competing perspectives about which features of situational judgment tests might enhance or impede consistent measurement across repeat test administrations. This led to 3 alternate-form development approaches (random assignment, incident isomorphism, and item isomorphism). The…
Descriptors: Validity, High Stakes Tests, Test Construction, Testing
Sharifi Ashtiani, Nahid; Babaii, Esmat – Studies in Educational Evaluation, 2007
For decades traditional methods of testing have been criticized for saying relatively little reliably about students' ability as well as causing anxiety, which can negatively affect students' recall of learned information. The reform movement with its innovative approaches focusing on learner-centered education perceives assessment as an…
Descriptors: Teaching Methods, Program Effectiveness, Grade 11, Test Construction
Al-A'ali, Mansoor – Educational Technology & Society, 2007
Computer adaptive testing is the study of scoring tests and questions based on assumptions concerning the mathematical relationship between examinees' ability and the examinees' responses. Adaptive student tests, which are based on item response theory (IRT), have many advantages over conventional tests. We use the least square method, a…
Descriptors: Educational Testing, Higher Education, Elementary Secondary Education, Student Evaluation
Schedl, Mary; And Others – 1995
The Test of English as a Foreign Language (TOEFL) program is exploring a change in Section 3 of the TOEFL test that would replace the vocabulary subpart with additional reading comprehension questions. This study investigated the proposed revision in terms of the length and timing that would be necessary to address concerns of test speededness of…
Descriptors: Adult Students, English (Second Language), Language Tests, Psychometrics
McPeek, W. Miles; Wild, Cheryl L. – 1992
The use of the Mantel-Haenszel statistic was investigated as a methodology for identifying differentially functioning items on the NTE Programs Core Battery. Retrospective analyses of the data collected over a 3-year period are reported for Black/White, Hispanic/White, and female/male comparisons in 50 samples ranging from 88 to 23,773 teacher…
Descriptors: Beginning Teachers, Blacks, Elementary Secondary Education, Evaluation Methods
Wainer, Howard; And Others – 1991
When an examination consists, in whole or in part, of constructed response items, it is a common practice to allow the examinee to choose among a variety of questions. This procedure is usually adopted so that the limited number of items that can be completed in the allotted time does not unfairly affect the examinee. This results in the de facto…
Descriptors: Adaptive Testing, Chemistry, Comparative Analysis, Computer Assisted Testing
O'Neill, Kathleen A.; And Others – 1993
The purpose of this study was to identify differentially functioning items on operational administrations of the Graduate Management Admission Test (GMAT) through the use of the Mantel-Haenszel statistic. Retrospective analyses of data collected over 3 years are reported for black/white and female/male comparisons for the Verbal and Quantitative…
Descriptors: Black Students, Classification, College Entrance Examinations, Difficulty Level
Dolecki, Yolanda; And Others – 1992
This report describes how an advisory committee of health occupations education instructors from Missouri schools where tests were given statistically analyzed test items for health services assistant for reliability and validity and then revised the test items. Seventy-five sets of test item booklets were printed. Four test booklets and a…
Descriptors: Allied Health Occupations Education, Allied Health Personnel, Criterion Referenced Tests, Health Services
Pang, Xiao L.; And Others – 1994
The function of Mantel-Haenszel (MH) and logistic regression (LR) statistics with real data in detecting gender-based differentially functioning items (DIF) was investigated when sample size and criterion variable varied. The data base consisted of the item responses of a population of 183,356 Caucasians to the Math test of the ACT Assessment…
Descriptors: College Entrance Examinations, Foreign Countries, Identification, Item Bias
Bergstrom, Betty; And Others – 1994
Examinee response times from a computerized adaptive test taken by 204 examinees taking a certification examination were analyzed using a hierarchical linear model. Two equations were posed: a within-person model and a between-person model. Variance within persons was eight times greater than variance between persons. Several variables…
Descriptors: Adaptive Testing, Adults, Certification, Computer Assisted Testing
Sykes, Robert C.; And Others – 1996
The presence of multiple readings of a student response to a constructed-response item in a large-scale assessment requires a procedure for combining the ratings to obtain an item score. An alternative to the averaged item ratings that are usually used is the summing of ratings for each item. This study evaluated the effect of summing as opposed…
Descriptors: Constructed Response, High Schools, Item Response Theory, Mathematics Education
Kehoe, Jerard – 1995
This digest describes some basics of the construction of multiple-choice tests. As a rule, the test maker should strive for test item stems (introductory questions or incomplete statements at the beginning of each item that are followed by the options) that are clear and parsimonious, answers that are unequivocal and chosen by the students who do…
Descriptors: Culture Fair Tests, Distractors (Tests), Educational Assessment, Item Bias
Shorey, Leonard – 1991
Tests in social studies and integrated science given in Saint Vincent, Saint Lucia, Grenada, and Dominica were analyzed by the Organization for Co-operation in Overseas Development (OCOD) Comprehensive Teacher Training Program (CTTP) for discrimination, difficulty, and reliability, as well as other characteristics. There were 767 examinees for the…
Descriptors: Difficulty Level, Elementary Secondary Education, Evaluation Methods, Foreign Countries
Burstein, Jill C.; Kaplan, Randy M. – 1995
There is a considerable interest at Educational Testing Service (ETS) to include performance-based, natural language constructed-response items on standardized tests. Such items can be developed, but the projected time and costs required to have these items scored by human graders would be prohibitive. In order for ETS to include these types of…
Descriptors: Computer Assisted Testing, Constructed Response, Cost Effectiveness, Hypothesis Testing

