Publication Date
| In 2026 | 0 |
| Since 2025 | 200 |
| Since 2022 (last 5 years) | 1070 |
| Since 2017 (last 10 years) | 2580 |
| Since 2007 (last 20 years) | 4941 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
[Peer reviewed] Gustafsson, Jan-Eric; Holmberg, Lena M. – Scandinavian Journal of Educational Research, 1992
To determine whether or not there are systematic differences in the psychometric properties of items in the vocabulary test of the Swedish Scholastic Aptitude Test, data from test administrations from 1984 through 1988 (over 50,000 students) were analyzed. The systematic relationships between word characteristics and psychometric properties are…
Descriptors: Adults, College Entrance Examinations, Foreign Countries, Higher Education
[Peer reviewed] Wetter, Martha W.; And Others – Psychological Assessment, 1992
Effects of random responding and malingering on Minnesota Multiphasic Personality Inventory-2 (MMPI-2) validity scales were studied with 173 graduate and undergraduate University of Kentucky (Lexington) students. Inconsistent responding and malingering produced significant elevations on the validity scales, with the dissimulation scale appearing…
Descriptors: Graduate Students, Higher Education, Personality Measures, Rating Scales
[Peer reviewed] Reckase, Mark D.; McKinley, Robert L. – Applied Psychological Measurement, 1991
The concept of item discrimination is generalized to the case in which more than one ability is required to determine the correct response to an item, using the conceptual framework of item response theory and the definition of multidimensional item difficulty previously developed by M. Reckase (1985). (SLD)
Descriptors: Ability, Definitions, Difficulty Level, Equations (Mathematics)
[Peer reviewed] Kolstad, Rosemarie K.; Kolstad, Robert A. – Clearing House, 1994
Argues that multiple-choice tests can be effective only if the items are written in a format suitable for testing the mastery of specific instructional objectives. Proposes the use of nonrestrictive test items and cites examples of such items. (FL)
Descriptors: Elementary Secondary Education, Multiple Choice Tests, Student Evaluation, Test Construction
[Peer reviewed] Crehan, Kevin; Haladyna, Thomas M. – Journal of Experimental Education, 1991
Two item-writing rules were tested: phrasing stems as questions versus partial sentences; and using the "none-of-the-above" option instead of a specific content option. Results with 228 college students do not support the use of either stem type and provide limited evidence to caution against the "none-of-the-above" option.…
Descriptors: College Students, Higher Education, Multiple Choice Tests, Test Construction
[Peer reviewed] Shohamy, Elana – Annual Review of Applied Linguistics, 1990
Reviews studies and tests that show how discourse analysis has contributed to the theory, research, and development of language testing, covering the relations among discourse analysis and competence and testing theory; research on language tests and tasks; and task development. A 60-citation unannotated bibliography is included. (CB)
Descriptors: Communicative Competence (Languages), Discourse Analysis, Language Research, Language Tests
Collison, Michele N-K – Chronicle of Higher Education, 1990
Although the American College Testing Program (ACT) items were somewhat changed in 1989-90 and are not directly comparable with the previous year's scores, some see stability in the new scores. Improved minority group performance is attributed in part to greater participation in college-preparatory classes. (MSE)
Descriptors: College Entrance Examinations, Higher Education, Minority Groups, Scores
[Peer reviewed] Hanson, Bradley A.; And Others – Applied Psychological Measurement, 1993
The delta method was used to derive standard errors (SEs) of the Levine observed score and Levine true score linear test equating methods using data from two test forms. SEs derived without the normality assumption and bootstrap SEs were very close. The situation with skewed score distributions is also discussed. (SLD)
Descriptors: Equated Scores, Equations (Mathematics), Error of Measurement, Sampling
[Peer reviewed] Kim, Seock-Ho; Cohen, Allan S. – Applied Psychological Measurement, 1998
Investigated Type I error rates of the likelihood-ratio test for the detection of differential item functioning (DIF) using Monte Carlo simulations under the graded-response model. Type I error rates were within theoretically expected values for all six combinations of sample sizes and ability-matching conditions at each of the nominal alpha…
Descriptors: Ability, Item Bias, Item Response Theory, Monte Carlo Methods
[Peer reviewed] O'Neill, Thomas; Lunz, Mary E.; Thiede, Keith – Journal of Applied Measurement, 2000
Studied item exposure in a computerized adaptive test when the item selection algorithm presents examinees with questions they were asked in a previous test administration. Results with 178 repeat examinees on a medical technologists' test indicate that the combined use of an adaptive algorithm to select items and latent trait theory to estimate…
Descriptors: Adaptive Testing, Algorithms, Computer Assisted Testing, Item Response Theory
[Peer reviewed] Dassa, Clement; Lambert, Jean; Blais, Regis; Potvin, Diane; Gauthier, Natalie – Canadian Journal of Program Evaluation/La Revue canadienne d'evaluation de programme, 1997
Whether a middle alternative in the response choices to a questionnaire influences the reliability and validity of survey responses was studied with 1,390 physicians, nurses, and midwives. Including a neutral option had little effect on overall reliability and validity, but allowed better coherence when items were considered globally. (SLD)
Descriptors: Attitude Measures, Nurses, Obstetrics, Opinions
[Peer reviewed] Bennett, Randy Elliot; Morley, Mary; Quardt, Dennis – Applied Psychological Measurement, 2000
Describes three open-ended response types that could broaden the conception of mathematical problem solving used in computerized admissions tests: (1) mathematical expression (ME); (2) generating examples (GE); and (3) graphical modeling (GM). Illustrates how combining ME, GE, and GM can form extended constructed response problems. (SLD)
Descriptors: Adaptive Testing, Computer Assisted Testing, Constructed Response, Mathematics Tests
[Peer reviewed] Rushton, J. Philippe; Skuy, Mervyn – Intelligence, 2000
Administered untimed Raven's Standard Progressive Matrices (SPM) to 173 African and 136 White college students in South Africa. In comparison with the 1993 U.S. normative sample, African students scored at the 14th percentile, and White students at the 61st percentile. Differences were greater on SPM items with the highest item-total correlations,…
Descriptors: Black Students, College Students, Correlation, Foreign Countries
[Peer reviewed] Reise, Steven P.; Flannery, Wm. Peter – Applied Measurement in Education, 1996
Statistical and theoretical issues that arise from assessing person-fit on measures of typical performance are discussed, including the frequent attenuation of detection of person-misfit, the need for methods of identifying sources of response aberrancy, and person-fit measures as moderators of trait-criterion relations. (SLD)
Descriptors: Item Response Theory, Measurement Techniques, Performance, Responses
[Peer reviewed] Kim, Mikyung – Language Testing, 2001
Investigates differential item functioning (DIF) across two different broad language groupings, Asian and European, in a speaking test in which the test takers' responses were rated polytomously. Data were collected from 1038 nonnative speakers of English from France, Hong Kong, Japan, Spain, Switzerland, and Thailand who took the SPEAK test in…
Descriptors: English (Second Language), Foreign Countries, Item Analysis, Language Tests


