Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview -- computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Phongsirikul, Marissa – rEFLections, 2018
The study aimed to investigate teachers' and students' perceptions towards traditional and alternative types of assessment within a classroom context of an English course provided for English-majoring students at tertiary level. A combination of traditional and alternative assessment tools was implemented in the study. The researcher developed…
Descriptors: Teacher Attitudes, Student Attitudes, Alternative Assessment, Second Language Learning
Davis, Andrew – Ethics and Education, 2015
PISA claims that it can extend its reach from its current core subjects of Reading, Science, Maths and problem-solving. Yet given the requirement for high levels of reliability for PISA, especially in the light of its current high stakes character, proposed widening of its subject coverage cannot embrace some important aspects of the social and…
Descriptors: International Assessment, High Stakes Tests, Reliability, Academic Achievement
Ghilay, Yaron; Ghilay, Ruth – Journal of Educational Technology, 2012
The study examined advantages and disadvantages of computerised assessment compared to traditional evaluation. It was based on two samples of college students (n=54) being examined in computerised tests instead of paper-based exams. Students were asked to answer a questionnaire focused on test effectiveness, experience, flexibility and integrity.…
Descriptors: Student Evaluation, Higher Education, Comparative Analysis, Computer Assisted Testing
Baker, Beverly A. – Assessing Writing, 2010
In high-stakes writing assessments, rater training in the use of a rating scale does not eliminate variability in grade attribution. This realisation has been accompanied by research that explores possible sources of rater variability, such as rater background or rating scale type. However, there has been little consideration thus far of…
Descriptors: Foreign Countries, Writing Evaluation, Writing Tests, Testing
Peer reviewed: Green, Samuel B. – Educational and Psychological Measurement, 1981
The proportion of agreement, G, and kappa indexes are shown to differ in how they correct for chance agreements between two observers. On the basis of the findings, it is suggested that no single agreement index is appropriate for all sets of data. (Author/BW)
Descriptors: Comparative Analysis, Measurement Techniques, Test Reliability, Testing Problems
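As an illustration of how such indexes can disagree on the same data, the sketch below computes the proportion of agreement, a G index, and Cohen's kappa for two observers. It is a minimal sketch, not the article's own procedure: the function names are hypothetical, the G index is taken in its common form with chance agreement fixed at 1/k for k categories, and kappa uses the observers' own marginal proportions as the chance term.

```python
from collections import Counter


def proportion_agreement(a, b):
    """Share of cases on which the two observers give the same rating."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def g_index(a, b, k=2):
    """G index, assuming chance agreement is 1/k for k categories (assumption)."""
    p_o = proportion_agreement(a, b)
    p_c = 1.0 / k
    return (p_o - p_c) / (1.0 - p_c)


def cohen_kappa(a, b):
    """Cohen's kappa: chance agreement estimated from each observer's marginals."""
    n = len(a)
    p_o = proportion_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings from two observers (1 = behaviour present, 0 = absent).
obs_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
obs_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]

print(proportion_agreement(obs_1, obs_2))
print(g_index(obs_1, obs_2))
print(cohen_kappa(obs_1, obs_2))
```

The three values differ only in the chance term that is subtracted, which is the point of contrast the abstract describes.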
Peer reviewed: Kaiser, Henry F. – Educational and Psychological Measurement, 1980
The use of Bayes' estimates for proportions in the Law of Comparative Judgment is suggested to avoid sample proportions of zero and one. (Author)
Descriptors: Bayesian Statistics, Comparative Analysis, Reliability, Statistical Analysis
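The problem the abstract refers to can be sketched briefly: scaling under the Law of Comparative Judgment converts each preference proportion to a unit normal deviate, which is infinite for proportions of exactly zero or one. The sketch below assumes a Beta(a, b) prior with a = b = 1 (a uniform prior), which is an illustrative choice and not necessarily the prior used in the article; the function names are hypothetical.

```python
from statistics import NormalDist


def bayes_proportion(wins, n, a=1.0, b=1.0):
    """Posterior mean of a proportion under a Beta(a, b) prior (assumed uniform)."""
    if n <= 0 or not 0 <= wins <= n:
        raise ValueError("need n > 0 and 0 <= wins <= n")
    return (wins + a) / (n + a + b)


def normal_deviate(p):
    """Unit normal deviate for a proportion strictly between 0 and 1."""
    return NormalDist().inv_cdf(p)


# Hypothetical counts: stimulus A preferred over B by 0, 10, or 20 of 20 judges.
for wins in (0, 10, 20):
    p = bayes_proportion(wins, 20)
    print(wins, round(p, 4), round(normal_deviate(p), 4))
```

With the raw sample proportions 0/20 and 20/20 the deviate would be undefined, while the shrunken estimates stay strictly inside the unit interval.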
Thrash, Susan K.; Porter, Andrew C. – 1974
The purpose of this paper is to prove that one currently recommended method of obtaining the reliability of an instrument defined on a population of aggregate units is invalid. This method randomly splits the aggregate into two halves, correlates the two half unit scores by a Pearson product moment correlation coefficient, and corrects the…
Descriptors: Comparative Analysis, Correlation, Measurement Techniques, Sampling
Peer reviewed: Wagner, Edwin E.; And Others – Educational and Psychological Measurement, 1990
Maximized correlation as an internal reliability estimate for tests with few items was investigated. An actual sampling distribution of maximum correlation--"r" max--was empirically derived from 100 samples of 50 cases each from Rorschach test data and compared with those of alpha and an odd/even split, using 2,020 Rorschach protocols.…
Descriptors: Comparative Analysis, Correlation, Estimation (Mathematics), Sample Size
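For readers unfamiliar with the two comparison estimates named in the abstract, the sketch below computes coefficient alpha and an odd/even split-half correlation for a small score matrix. It is a minimal sketch on made-up data: the function names are hypothetical, the odd/even correlation is left uncorrected for test length, and the maximized correlation ("r" max) studied in the article is not reproduced here because the abstract does not define it.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Coefficient alpha from a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*rows))
    totals = [sum(r) for r in rows]
    sum_item_var = sum(variance(col) for col in item_columns)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def odd_even_correlation(rows):
    """Correlate half scores from odd-numbered and even-numbered items."""
    odd_half = [sum(r[0::2]) for r in rows]   # items 1, 3, ...
    even_half = [sum(r[1::2]) for r in rows]  # items 2, 4, ...
    return pearson(odd_half, even_half)


# Hypothetical data: five respondents, four dichotomous items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(cronbach_alpha(scores))
print(odd_even_correlation(scores))
```

With so few items the two estimates can diverge noticeably, which is the situation the study addresses.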
Peer reviewed: Rubin, Donald B.; Thayer, Dorothy – Psychometrika, 1978
A procedure is developed for estimating correlations among new tests when non-overlapping sub-samples each are administered a different new test and all sub-samples are administered a set of standard tests. (JKS)
Descriptors: Comparative Analysis, Correlation, Measurement, Standardized Tests
Nevo, Barukh – Measurement and Evaluation in Guidance, 1976
Freshmen (N=202) took two batteries of aptitude tests 10 months apart. Six pairs of tests were studied. Two pairs were identical, two were parallel, and two were completely different. This design made it possible to separate three components of practice: (a) general test sophistication, (b) specific practice effect, and (c) item familiarization.…
Descriptors: Aptitude Tests, College Freshmen, Comparative Analysis, Group Testing
Huang, Jinyan – Assessing Writing, 2008
Using generalizability theory, this study examined both the rating variability and reliability of ESL students' writing in the provincial English examinations in Canada. Three years' data were used in order to complete the analyses and examine the stability of the results. The major research question that guided this study was: Are there any…
Descriptors: Generalizability Theory, Foreign Countries, English (Second Language), Writing Tests
Comparison of Yes-No, Matched-Pairs, and All-No Scoring of a First-Grade Economics Achievement Test.
Larkins, A. Guy; Shaver, James P. – 1968
Developing practical achievement tests for use at the primary-grade level is a difficult task. Some problems encountered appear to be resolved by using verbally administered yes-no tests. But such tests are criticized as having a low reliability because they offer only two choices. Two modifications of the yes-no test have been proposed to…
Descriptors: Achievement Tests, Comparative Analysis, Primary Education, Test Construction
Peer reviewed: Linn, Robert L.; Werts, Charles E. – Journal of Educational Measurement, 1971
Two problems in the investigation of predictive bias in tests, the effect of unreliability of the predictors, and the effect of excluding a predictor from the regression equation on which there are preexisting group differences, are discussed. (Author)
Descriptors: Comparative Analysis, Minority Groups, Predictive Measurement, Predictor Variables
Peer reviewed: Coleman, Marilyn; And Others – Psychology in the Schools, 1980
The mean IQ on the Slosson Intelligence Test (SIT) was substantially higher than expected based on the earlier Peabody Picture Vocabulary Test (PPVT) scores. Sampling error and examiner error were excluded as explanations. Results suggest that the PPVT and SIT yield different scores and lack comparability. (Author)
Descriptors: Children, Comparative Analysis, Intelligence Tests, Intervention

