Peer reviewed: Martin, Joseph J. – Chemical Engineering Education, 1979
This problem requires the student to calculate the volume of air breathed by a bicyclist every hour in Denver, Colorado, and New York City. (BB)
Descriptors: Chemistry, College Science, Computation, Energy
Peer reviewed: Black, Thomas R. – Journal of Research in Science Teaching, 1980
Explores variables influencing the cognitive emphasis of teachers' examinations. Examination questions from Nigerian secondary school science teachers were analyzed according to Bloom's Taxonomy. The influence of the following variables on levels of questions was investigated: teachers' educational backgrounds, subjects taught, grade level taught,…
Descriptors: Difficulty Level, Science Education, Science Tests, Secondary Education
Peer reviewed: Wilcox, Rand R. – Applied Psychological Measurement, 1980
This paper discusses how certain recent technical advances might be extended to examine proficiency tests which are conceptualized as representing a variety of skills with one or more items per skill. In contrast to previous analyses, errors at the item level are included. (Author/BW)
Descriptors: Mastery Tests, Minimum Competencies, Minimum Competency Testing, Sampling
Peer reviewed: Huynh, Huynh; Saunders, Joseph C. – Journal of Educational Measurement, 1980
Single administration (beta-binomial) estimates for the raw agreement index p and the corrected-for-chance kappa index in mastery testing are compared with those based on two test administrations in terms of estimation bias and sampling variability. Bias is about 2.5 percent for p and 10 percent for kappa. (Author/RL)
Descriptors: Comparative Analysis, Error of Measurement, Mastery Tests, Mathematical Models
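The raw agreement index p and the chance-corrected kappa compared in the Huynh and Saunders entry can be illustrated with the standard two-administration computation. This is a minimal sketch (not their single-administration beta-binomial estimator), using hypothetical 0/1 mastery classifications:

```python
def agreement_indices(class_a, class_b):
    """Return (p, kappa) for two parallel mastery classifications.

    class_a, class_b: lists of 0/1 mastery decisions for the same examinees
    from two test administrations.
    """
    n = len(class_a)
    # Raw agreement: proportion of examinees classified identically.
    p = sum(a == b for a, b in zip(class_a, class_b)) / n
    # Chance agreement from the marginal mastery rates of each administration.
    pa = sum(class_a) / n
    pb = sum(class_b) / n
    pe = pa * pb + (1 - pa) * (1 - pb)
    # Kappa corrects raw agreement for agreement expected by chance alone.
    kappa = (p - pe) / (1 - pe)
    return p, kappa
```

For example, classifications [1, 1, 0, 0] and [1, 1, 0, 1] agree on three of four examinees, so p = 0.75 while kappa = 0.5, showing how the chance correction lowers the index.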
Peer reviewed: Jaeger, Richard M. – Journal of Educational Measurement, 1981
Five indices are discussed that should logically discriminate between situations in which: (1) the linear equating method (LEM) adequately adjusts for difference between score distributions of two approximately parallel test forms; or (2) a method more complex than the linear equating method is needed. (RL)
Descriptors: College Entrance Examinations, Comparative Analysis, Difficulty Level, Equated Scores
Peer reviewed: Hoste, R. – British Journal of Educational Psychology, 1981
In this paper, a proposal is made by which a content validity coefficient can be calculated. An example of the use of the coefficient is given, demonstrating that, in a CSE biology examination offering a choice of questions, different question combinations yielded different levels of content validity. (Author)
Descriptors: Achievement Tests, Biology, Content Analysis, Item Sampling
Peer reviewed: Warren, Gordon – Journal of Research in Science Teaching, 1979
The paper documents experiments designed to compare essay and multiple-choice tests as means of testing science learning. Results indicate it is easier to score high on multiple-choice tests; some students substitute quantity for quality on essay tests; and essay tests reveal weaknesses hidden by multiple-choice tests. (RE)
Descriptors: Educational Research, Educational Theories, Evaluation Methods, Instructional Materials
Peer reviewed: Knifong, J. Dan – Journal for Research in Mathematics Education, 1980
The computational difficulty of the word problem sections of eight standardized achievement tests was analyzed with respect to the variety of computational procedures and the number of digits per whole number computation. Analysis reveals considerable variation among the current tests in terms of computational procedure and difficulty. (Author/MK)
Descriptors: Computation, Difficulty Level, Educational Research, Elementary Education
Peer reviewed: Holliday, William G.; Partridge, Louise A. – Journal of Research in Science Teaching, 1979
Investigates two evaluative hypotheses related to test item sequence and the performance of the students who take the tests. (SA)
Descriptors: Educational Research, Elementary Education, Evaluation, Measurement
Peer reviewed: Johanson, George A. – Evaluation Practice, 1997
A discussion of differential item functioning (DIF) in the context of attitude assessment is followed by examples involving the detection of DIF on an attitudes-toward-science scale completed by 1,550 elementary school students and the finding of no DIF in a workshop evaluation completed by 1,682 adults. (SLD)
Descriptors: Adults, Attitude Measures, Attitudes, Elementary Education
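Differential item functioning of the kind discussed in the Johanson entry is commonly screened with the Mantel-Haenszel common odds ratio across matched score strata. The article does not specify its detection method, so the following is an illustrative sketch only, using hypothetical 2x2 tables:

```python
def mantel_haenszel_odds(strata):
    """Mantel-Haenszel common odds ratio for one item across score strata.

    strata: list of tuples (a, b, c, d) per matched ability level, where
    a/b are reference-group correct/incorrect counts and c/d are
    focal-group correct/incorrect counts.
    A ratio near 1.0 suggests no DIF; values far from 1.0 flag the item.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den
```

With a single stratum (10, 5, 8, 4) the ratio is exactly 1.0 (proportionally identical performance), whereas (10, 5, 4, 8) yields 4.0, signaling the item favors the reference group at that ability level.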
Peer reviewed: Zenisky, April L.; Sireci, Stephen G. – Applied Measurement in Education, 2002
Reviews and illustrates some of the current technological developments in computer-based testing, focusing on novel item formats and automated scoring methodologies. The review shows a number of innovations being researched and implemented. (SLD)
Descriptors: Educational Innovation, Educational Technology, Elementary Secondary Education, Large Scale Assessment
Peer reviewed: Ban, Jae-Chun; Hanson, Bradley A.; Yi, Qing; Harris, Deborah J. – Journal of Educational Measurement, 2002
Compared three online pretest calibration scaling methods through simulation: (1) marginal maximum likelihood with one expectation maximization (EM) cycle (OEM) method; (2) marginal maximum likelihood with multiple EM cycles (MEM); and (3) M. Stocking's method B. MEM produced the smallest average total error in parameter estimation; OEM yielded…
Descriptors: Computer Assisted Testing, Error of Measurement, Maximum Likelihood Statistics, Online Systems
Peer reviewed: Bline, Dennis; Lowe, Dana R.; Meixner, Wilda F.; Nouri, Hossein – Journal of Business Communication, 2003
Presents the results of an investigation about the effect of question order randomization on the psychometric properties of two frequently used oral and written apprehension instruments: McCroskey's oral communication apprehension scale and Daly and Miller's writing apprehension scale. Shows that the measurement properties of these instruments…
Descriptors: Communication Apprehension, Communication Research, Higher Education, Questionnaires
Peer reviewed: Hynan, Linda S.; Foster, Barbara M. – Teaching of Psychology, 1997
Describes a project used in a sophomore-level psychological testing and measurement course. Students worked through the different phases of developing a test focused on item writing, reliability, and validity. Responses from both students and instructors have been consistently positive. (MJP)
Descriptors: Higher Education, Item Analysis, Item Response Theory, Psychological Testing
Peer reviewed: Berger, Martijn P. F.; Veerkamp, Wim J. J. – Journal of Educational and Behavioral Statistics, 1997
Some alternative criteria for item selection in adaptive testing are proposed that take into account uncertainty in the ability estimates. A simulation study shows that the likelihood weighted information criterion is a good alternative to the maximum information criterion. Another good alternative uses a Bayesian expected a posteriori estimator.…
Descriptors: Ability, Adaptive Testing, Bayesian Statistics, Computer Assisted Testing
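The Berger and Veerkamp entry evaluates alternatives to the maximum-information criterion for adaptive item selection. A minimal sketch of that baseline rule under a 2PL model follows (the weighted and Bayesian variants they study are not shown); the item parameters (a, b) are hypothetical:

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta.

    a: discrimination, b: difficulty. For the 2PL model,
    I(theta) = a^2 * P(theta) * (1 - P(theta)).
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item(theta, items):
    """Maximum-information rule: pick the (a, b) item that is most
    informative at the current ability estimate theta."""
    return max(items, key=lambda ab: item_information(theta, *ab))
```

At theta = 0, an item with difficulty b = 0 is answered correctly with probability 0.5, where 2PL information peaks, so it is chosen over an off-target item with b = 2. The criteria the article proposes instead average this information over the uncertainty in theta rather than evaluating it at a point estimate.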


