Farrell, C.C., Penuel, W., Arce-Trigatti, P., Soland, J., Singleton, C., Fox Resnick, A., Stamatis, K., Riedy, R., Henrick, E., Sexton, S., Wellberg, S., & Schmidt, D.
R&R at Frontiers in Research Metrics and Analytics
An increasingly popular form of collaboration involves forming partnerships among researchers, educators, and community members to improve or transform education systems through research inquiry. However, not all partnerships are successful. The field needs valid, reliable, and useful measures to help with assessing progress towards partnership goals. In this community case study, we present a participatory, mixed-methods approach designed to create measures for assessing progress of research-practice partnerships (RPPs) in the field of education. The case illustrates a novel approach to participatory measurement design, driven by needs from the field. As a result, the measures align with the values and practices of the very collaborations the measures were intended to assess.
Developing a measure to evaluate education research-practice partnerships
Soland, J., Penuel, W.R., Farrell, C.C., & Wellberg, S.
R&R at Research Evaluation
Research-practice partnerships are an increasingly popular approach for supporting the use of research to inform and guide efforts to improve and transform education. To date, however, evaluators have lacked measures to evaluate such partnerships, which typically aim for a range of outcomes. This paper describes a project to develop validity evidence for a survey to evaluate the effectiveness of research-practice partnerships in education. The project followed an evidence-centered design approach to developing and evaluating the validity of the survey measure, collecting and analyzing data from 65 different research-practice partnerships. Results indicate the scales were reliable overall, but that educators and researchers exhibited different response patterns to items. Evidence from this study does not support summative uses of the scales for evaluating individual RPPs; however, they could be used formatively for informing discussions about improvement. Future research can investigate the possibility that the scales could be used in evaluating portfolios of RPP projects.
Revisiting the 20% rule of thumb for linking items in vertical scales
Soland, J., Wellberg, S., & Kuhfeld, M.
Under review at Journal of Educational Measurement
Developing educational assessments with vertical scales is essential to understanding how learning develops over time. While vertical scales have been studied for decades, many informal rules of thumb still linger in applied vertical scaling spaces. For example, there is an ad hoc suggestion that 20% of items overlap between adjacent grades when calibrating item parameters (Kolen & Brennan, 2004). Yet, there are many reasons to suspect that the required proportion of linking items might differ considerably depending on item and person parameters, or mode of calibration. For example, one could imagine needing fewer linking items if those items are more discriminating or needing more items when using separate rather than concurrent calibration. In this study, we investigate such issues through extensive Monte Carlo simulations and empirical analyses. Specifically, we vary conditions like characteristics of the items (especially discrimination and range of difficulty), the population being assessed (e.g., the magnitude and variability of gains over time), the linking method, and ultimately the proportion of linking items. We then calibrate the items and examine recovery of true person and item parameters. Similar analyses are conducted using empirical data from a vertically scaled early literacy assessment under development in Virginia. Results suggest that the proportion of linking items needed can vary substantively depending on item and person parameters, as well as the linking strategy used. These results have the potential to create rules of thumb designed for specific aspects of a given testing scenario.
A grade-level comparison of instructional and testing practices in U.S. mathematics classes
Wellberg, S.
R&R at Educational Studies in Mathematics
This study uses regression analyses and multi-group structural equation modeling to investigate the relationships among the learning objectives, instructional practices, and test types used in U.S. mathematics classes at the elementary and secondary levels. The results indicate that there are substantial differences in the pedagogical and testing approaches used in mathematics classes across grade levels. At the elementary level, students are more likely to experience reform-based instruction but are typically given tests that are primarily composed of short-answer/multiple-choice items. At the secondary level, however, the instruction is more teacher-centered, but students are given tests that contain constructed-response items more frequently. Additionally, the strength, and sometimes direction, of the relationships among these aspects of the classroom environment differ between elementary and secondary classes. Most notably, while secondary classes with a heavy emphasis on traditional learning objectives are less likely to be given tests with constructed-response items, the opposite holds for elementary classes.
Race, instruction, and testing in U.S. mathematics classes: A QuantCrit investigation
Wellberg, S.
R&R at Educational Assessment
This study uses the QuantCrit extension of Critical Race Theory to investigate the relationship between the instructional and testing practices used by teachers of U.S. mathematics classes and whether that relationship changes based on the racial composition of the class. Findings suggest that teachers with larger percentages of Black students in their class test more frequently and are less likely to give tests that require students to demonstrate their thought processes. Additionally, there is a strong relationship between the instructional approach a teacher uses with their class and the types of tests they give, and teachers who use more student-centered instructional techniques appear to have larger race-based discrepancies in their testing practices.
"A shot in the dark": High-school mathematics teachers' test development processes
Wellberg, S.
This study explores the factors that high-school mathematics teachers consider when constructing classroom tests.