Publication Date
| In 2026 | 10 |
| Since 2025 | 2328 |
| Since 2022 (last 5 years) | 12843 |
| Since 2017 (last 10 years) | 33968 |
| Since 2007 (last 20 years) | 68459 |
Descriptor
| Foreign Countries | 30579 |
| Test Validity | 21757 |
| Scores | 18263 |
| Academic Achievement | 16934 |
| Test Construction | 16763 |
| Test Reliability | 15036 |
| Achievement Tests | 14864 |
| Standardized Tests | 14724 |
| Comparative Analysis | 14431 |
| Elementary Secondary Education | 13046 |
| Language Tests | 12551 |
Audience
| Practitioners | 5034 |
| Teachers | 3394 |
| Researchers | 2630 |
| Policymakers | 1232 |
| Administrators | 979 |
| Students | 687 |
| Parents | 325 |
| Counselors | 216 |
| Community | 162 |
| Support Staff | 50 |
| Media Staff | 34 |
Location
| Turkey | 2823 |
| Australia | 2430 |
| Canada | 2270 |
| California | 1854 |
| United States | 1727 |
| Texas | 1615 |
| China | 1579 |
| United Kingdom | 1315 |
| Florida | 1312 |
| United Kingdom (England) | 1203 |
| Germany | 1123 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 121 |
| Meets WWC Standards with or without Reservations | 189 |
| Does not meet standards | 174 |
Yildirim, Ozen – International Education Studies, 2019
A measurement tool that does not measure its intended construct has a validity problem, and individuals should not be evaluated on the basis of results obtained from such a tool. The purpose of this study was to examine the differential item functioning and item bias of mathematics items in the Programme for International Student Assessment 2012…
Descriptors: Gender Differences, Mathematics Tests, Test Bias, Achievement Tests
Siegfried, Christin; Wuttke, Eveline – Citizenship, Social and Economics Education, 2019
Due to their test economy and objective evaluability, multiple-choice items are used much more frequently to test knowledge than constructed-response questions. However, studies point out that dependencies may exist between the individual test result and the test format (multiple-choice or constructed-response). Studies testing economic knowledge…
Descriptors: Multiple Choice Tests, Test Bias, Sex Fairness, Gender Differences
Yasar, Metin – International Journal of Assessment Tools in Education, 2019
The main purpose of this study is to develop a perceived stress scale based on Classical Test Theory (CTT) and the Graded Response Model (GRM); to compare the item parameters of the scale being developed under both models; and to determine under which theory the measurement tool produces more reliable and valid results…
Descriptors: Affective Measures, Anxiety, Test Theory, Test Construction
Wise, Steven L.; Kuhfeld, Megan R.; Soland, James – Applied Measurement in Education, 2019
When we administer educational achievement tests, we want to be confident that the resulting scores validly indicate what the test takers know and can do. However, if the test is perceived as low stakes by the test taker, disengaged test taking sometimes occurs, which poses a serious threat to score validity. When computer-based tests are used,…
Descriptors: Guessing (Tests), Computer Assisted Testing, Achievement Tests, Scores
Parkin, Jason R. – Journal of Psychoeducational Assessment, 2019
Theories of reading and writing development suggest that the factor structure of achievement batteries could change across development. As a result, it is important to test achievement batteries for invariance across development. The purpose of these analyses is to determine whether the factor structure of reading, writing, and oral language…
Descriptors: Achievement Tests, Reading Tests, Writing Tests, Language Tests
Raykov, Tenko; Dimitrov, Dimiter M.; Marcoulides, George A.; Harrison, Michael – Educational and Psychological Measurement, 2019
This note highlights and illustrates the links between item response theory and classical test theory in the context of polytomous items. An item response modeling procedure is discussed that can be used for point and interval estimation of the individual true score on any item in a measuring instrument or item set following the popular and widely…
Descriptors: Correlation, Item Response Theory, Test Items, Scores
Petscher, Y.; Pentimonti, J.; Stanley, C. – National Center on Improving Literacy, 2019
Validity is broadly defined as how well something measures what it's supposed to measure. The reliability and validity of scores from assessments are two concepts that are closely knit together and feed into each other.
Descriptors: Screening Tests, Scores, Test Validity, Test Reliability
Polatcan, Mahmut – International Journal of Contemporary Educational Research, 2020
The purpose of this study is to adapt the Professional Learning Activities Scale (PLAS) developed by Geijsel, Sleegers, Stoel, and Krüger (2009) into Turkish through conducting the relevant validity and reliability analyses. This study followed the pathway recommended by Hambleton and Patsula (1999) for the adaptation process. The data we used…
Descriptors: Faculty Development, Learning Activities, Test Construction, Test Validity
Hosbein, Kathryn N.; Barbera, Jack – Chemistry Education Research and Practice, 2020
Identity has been proposed as a mechanism to increase persistence within Science, Technology, Engineering and Mathematics (STEM) education programs. To assess the impact of identity on STEM persistence, measures that produce valid and reliable data within a given STEM discipline need to be employed. Therefore, this study developed and evaluated…
Descriptors: STEM Education, Chemistry, Identification (Psychology), Academic Persistence
Liu, Yuan; Hau, Kit-Tai – Educational and Psychological Measurement, 2020
In large-scale low-stakes assessments such as the Programme for International Student Assessment (PISA), students may skip items (missingness) that are within their ability to complete. Detecting and handling these noneffortful responses, as a measure of test-taking motivation, is an important issue in modern psychometric models.…
Descriptors: Response Style (Tests), Motivation, Test Items, Statistical Analysis
Akman, Berrin; Alabay, Erhan; Veziroglu-Celik, Mefharet; Aksoy, Pinar; Gelbal, Selahattin – Journal of Education in Science, Environment and Health, 2020
Children interact with science from the moment of their birth. By means of the scientific experiences they have as a result of this interaction, children acquire numerous skills related to science at an early age. These scientific skills and children's orientation towards science such as awareness, attitude, proficiency, inquisitiveness, as well…
Descriptors: Test Construction, Foreign Countries, Cultural Relevance, Science Interests
Özcan, Gülsen; Aktag, Isil; Gülözer, Kaine – International Journal of Evaluation and Research in Education, 2020
The study aimed to develop a valid and reliable scale to measure students' expectations of the discipline program implemented in their schools. The study was conducted with students studying in seven different high schools in the fall semester of the 2019-2020 school year. As a result of the Confirmatory Factor Analysis (CFA), a 5-point Likert…
Descriptors: Test Construction, Expectation, Discipline, Test Validity
Barenthien, Julia; Lindner, Marlit Annalena; Ziegler, Tobias; Steffensky, Mirjam – Early Years: An International Journal of Research and Development, 2020
Preschool teachers' domain-specific professional knowledge is assumed to play an important role in the quality of early childhood education and thus in young children's learning in different areas. Due to a lack of adequate instruments, little is known about preschool teachers' science-specific knowledge. In order to develop such a test instrument,…
Descriptors: Foreign Countries, Preschool Teachers, Scientific Literacy, Pedagogical Content Knowledge
Yavas, Tuba; Celik, Vehbi – Cypriot Journal of Educational Sciences, 2020
This study aims to develop a scale in order to determine the organisational learning levels of educational organisations. The scale took its final form after the items were written, experts' opinions were received and pilot applications were implemented. A survey was administered to 267 teachers using a simple random sampling method. An 18-item…
Descriptors: Organizational Culture, Learning, Test Construction, Measures (Individuals)
Steele, Catriona M.; Peladeau-Pigeon, Melanie; Nagy, Ahmed; Waito, Ashley A. – Journal of Speech, Language, and Hearing Research, 2020
Purpose: The field lacks consensus about preferred metrics for capturing pharyngeal residue on videofluoroscopy. We explored four different methods, namely, the visuoperceptual Eisenhuber scale and three pixel-based methods: (a) residue area divided by vallecular or pyriform sinus spatial housing ("%-Full"), (b) the Normalized Residue…
Descriptors: Human Body, Physiology, Speech Language Pathology, Measurement Techniques