Publication Date
| In 2026 | 0 |
| Since 2025 | 197 |
| Since 2022 (last 5 years) | 1067 |
| Since 2017 (last 10 years) | 2577 |
| Since 2007 (last 20 years) | 4938 |
Audience
| Practitioners | 653 |
| Teachers | 563 |
| Researchers | 250 |
| Students | 201 |
| Administrators | 81 |
| Policymakers | 22 |
| Parents | 17 |
| Counselors | 8 |
| Community | 7 |
| Support Staff | 3 |
| Media Staff | 1 |
Location
| Turkey | 225 |
| Canada | 223 |
| Australia | 155 |
| Germany | 116 |
| United States | 99 |
| China | 90 |
| Florida | 86 |
| Indonesia | 82 |
| Taiwan | 78 |
| United Kingdom | 73 |
| California | 65 |
What Works Clearinghouse Rating
| Meets WWC Standards without Reservations | 4 |
| Meets WWC Standards with or without Reservations | 4 |
| Does not meet standards | 1 |
Özkan, Yesim Özer; Güvendir, Meltem Acar – Journal of Pedagogical Research, 2021
Large-scale assessment is conducted at different class levels for various purposes such as identifying student success in education, observing the impacts of educational reforms on student achievement, assessment, selection, and placement. It is expected that these tests and their items used in education do not display different traits with…
Descriptors: Foreign Countries, Test Bias, Student Evaluation, Test Items
Khoshdel, Fahimeh; Baghaei, Purya; Bemani, Masoumeh – International Journal of Language Testing, 2016
In this paper we tried to demonstrate the validity of the C-Test using a construct identification approach. In this approach to construct validation, the factors which contribute to item difficulty are identified. The assumption is that the factors which make items difficult are actually the construct underlying the test. For the purposes of this study,…
Descriptors: Test Items, Difficulty Level, Test Validity, Cloze Procedure
Fitzpatrick, Joseph; Skorupski, William P. – Journal of Educational Measurement, 2016
The equating performance of two internal anchor test structures--miditests and minitests--is studied for four IRT equating methods using simulated data. Originally proposed by Sinharay and Holland, miditests are anchors that have the same mean difficulty as the overall test but less variance in item difficulties. Four popular IRT equating methods…
Descriptors: Difficulty Level, Test Items, Comparative Analysis, Test Construction
Matlock, Ki Lynn; Turner, Ronna – Educational and Psychological Measurement, 2016
When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall.…
Descriptors: Item Response Theory, Computation, Test Items, Difficulty Level
Davenport, Ernest C.; Davison, Mark L.; Liou, Pey-Yan; Love, Quintin U. – Educational Measurement: Issues and Practice, 2016
The main points of Sijtsma and Green and Yang in Educational Measurement: Issues and Practice (34, 4) are that reliability, internal consistency, and unidimensionality are distinct and that Cronbach's alpha may be problematic. Neither of these assertions is at odds with Davenport, Davison, Liou, and Love in the same issue. However, many authors…
Descriptors: Educational Assessment, Reliability, Validity, Test Construction
O'Shea, Ann; Breen, Sinéad; Jaworski, Barbara – International Journal of Research in Undergraduate Mathematics Education, 2016
This paper describes the development of a concept inventory, a test designed to investigate undergraduate students' understanding of the concept of function. A central purpose was to address "conceptual" understanding. We outline a set of elements of the understanding of function, based on key properties of the function concept, which…
Descriptors: Undergraduate Students, Mathematical Concepts, College Mathematics, Mathematics Instruction
Raspa, Melissa; Bann, Carla M.; Gwaltney, Angela; Benke, Timothy A.; Fu, Cary; Glaze, Daniel G.; Haas, Richard; Heydemann, Peter; Jones, Mary; Kaufmann, Walter E.; Lieberman, David; Marsh, Eric; Peters, Sarika; Ryther, Robin; Standridge, Shannon; Skinner, Steven A.; Percy, Alan K.; Neul, Jeffrey L. – American Journal on Intellectual and Developmental Disabilities, 2020
Rett syndrome (RTT) is a neurodevelopmental disorder that primarily affects females. Recent work indicates the potential for disease modifying therapies. However, there remains a need to develop outcome measures for use in clinical trials. Using data from a natural history study (n = 1,075), we examined the factor structure, internal consistency,…
Descriptors: Genetic Disorders, Psychometrics, Psychomotor Skills, Physical Disabilities
Smarter Balanced Assessment Consortium, 2020
The Smarter Balanced Assessment Consortium (Smarter Balanced) strives to provide every student with a positive and productive assessment experience, generating results that are a fair and accurate estimate of each student's achievement. Further, Smarter Balanced is building on a framework of accessibility for all students, including English…
Descriptors: Student Evaluation, Evaluation Methods, English Language Learners, Students with Disabilities
Chiavaroli, Neville – Practical Assessment, Research & Evaluation, 2017
Despite the majority of MCQ writing guides discouraging the use of negatively-worded multiple choice questions (NWQs), they continue to be regularly used both in locally produced examinations and commercially available questions. There are several reasons why the use of NWQs may prove resistant to sound pedagogical advice. Nevertheless, systematic…
Descriptors: Multiple Choice Tests, Test Construction, Test Items, Validity
Schneider, Arthur E. – Journal of Education and Practice, 2017
Action research was undertaken to begin to explore the possibility of improving second-language Thai college student performance on completion questions by using bolded and underscored words in test item stems, called "assist devices." This intervention was designed to focus student attention on key terms. Twenty-one students, in an…
Descriptors: Foreign Countries, Assistive Technology, Test Items, Second Language Learning
Guenole, Nigel; Chernyshenko, Oleksandr S.; Weekly, Jeff – International Journal of Testing, 2017
Situational judgment tests (SJTs) are widely agreed to be a measurement technique. It is also widely agreed that SJTs are a questionable methodological choice for measurement of psychological constructs, such as behavioral competencies, due to a lack of evidence supporting appropriate factor structures and high internal consistencies.…
Descriptors: Situational Tests, Psychological Evaluation, Test Construction, Industrial Psychology
Caliskan, Nihat; Kuzu, Okan; Kuzu, Yasemin – Journal of Education and Learning, 2017
The purpose of this study was to develop a rating scale that can be used to evaluate behavior patterns of the organization people pattern of preservice teachers (PSTs). By reviewing the related literature on people patterns, a preliminary scale of 38 items with a five-point Likert-type format was prepared. The number of items was reduced to 29 after…
Descriptors: Foreign Countries, Behavior Rating Scales, Test Construction, Preservice Teachers
Reed, Jessica J.; Brandriet, Alexandra R.; Holme, Thomas A. – Journal of Chemical Education, 2017
Recent efforts to reform K-12 science curricula, embedded within the "NRC Framework for K-12 Science Education" and the "Next Generation Science Standards," have focused on unifying core disciplinary content with crosscutting concepts that span across science disciplines and scientific practices. With these reforms comes the…
Descriptors: Science Education, Chemistry, Elementary Secondary Education, Science Tests
Kim, Chae-Eun; O'Grady, William; Deen, Kamil; Kim, Kitaek – Language Acquisition: A Journal of Developmental Linguistics, 2017
This article shows that the Korean Extrinsic Plural Marker (EPM) may be acquired by children on the basis of very little evidence. The EPM marks distributivity, unlike the Intrinsic Plural Marker, which marks plurality. Thirty monolingual learners of Korean aged 5;03 to 6;09 (mean age 6;01) were tested using a series of Truth Value Judgment Tasks…
Descriptors: Monolingualism, Pretests Posttests, Experimental Groups, Control Groups
Naumann, Alexander; Hartig, Johannes; Hochweber, Jan – Journal of Educational and Behavioral Statistics, 2017
Valid inferences on teaching drawn from students' test scores require that tests are sensitive to the instruction students received in class. Accordingly, measures of the test items' instructional sensitivity provide empirical support for validity claims about inferences on instruction. In the present study, we first introduce the concepts of…
Descriptors: Test Items, Item Response Theory, Instructional Effectiveness, Psychometrics

