Open Access Publications from the University of California


Students Making Sense of Multi-level Data

This study builds on research by Konold, Finzer, and Kreetong (2014), which demonstrated that students with little or no training in statistics or data analysis created nested tables or narratives to encode data that had an underlying hierarchical structure. The question remains whether students can understand and work with such structures when they are presented with them. In this study, we examine the reasoning of a seventh-grade student as she used the data-analysis environment CODAP to explore data about the growth of plants. The data were represented in a case table that structured them hierarchically at three levels. We attend in particular to her developing understanding of both the hierarchical case table and related graphs at the level Curcio (1987) refers to as “reading the data.” Critical to her coming to understand the data were (1) coming to see the graphs and table as dynamic rather than static representations, and (2) using dynamic linking among the objects to explore their relations. These findings have the potential to inform curricula, software development, and teachers’ knowledge for teaching.

Technology Innovations

Dynamic Data in the Statistics Classroom

The call for using real data in the classroom has long meant using datasets that are culled, cleaned, and wrangled before any student works with the observations. However, teaching statistics should also include retrieving data directly from the Internet. Many data sources are now continually updated by the organizations that host them. The R tools for downloading such dynamic data have improved enough to make accessing the data feasible even in an introductory statistics class. We provide five full analyses of dynamic data as well as an additional nine sources of dynamic data that can be brought into the classroom. The goal of our work is to demonstrate that using dynamic data can have a short learning curve, even for introductory students or faculty unfamiliar with the landscape. The examples provided are unlikely to create expert data scrapers, but they should help motivate students and faculty toward more engaged use of online data sources.

  • 5 supplemental PDFs
  • 5 supplemental files

The fivethirtyeight R Package: "Tame Data" Principles for Introductory Statistics and Data Science Courses

As statistics and data science instructors, we often seek to use data in our courses that are rich, real, realistic, and relevant. To this end we created the fivethirtyeight R package of data and code behind the stories and interactives at the data journalism website FiveThirtyEight. After a discussion of the conflicting pedagogical goals of "minimizing prerequisites to research" (Cobb 2015) while at the same time presenting students with a realistic view of data as it exists "in the wild," we articulate how a desired balance between these two goals informed the design of the package. We present the details behind this balance as our proposed "Tame data principles for introductory statistics and data science courses." Details of the package's construction and example uses are included as well.

  • 1 supplemental ZIP

Using Twitter to Energize the Introductory Statistics Class

In an effort to increase student learning in the introductory statistics class and simultaneously cater to students' social media personas, an innovative pedagogical approach was developed using the social media platform Twitter. Our approach took advantage of over two decades’ worth of research findings on established and successful pedagogical innovations in the teaching of statistics. By deploying a planned multi-tier approach that rests heavily on active learning strategies, we observed favorable student learning outcomes, including active class participation, enhanced engagement and enjoyment, application of statistics concepts to everyday life, and some improvement in statistical literacy, reasoning, and thinking skills. This innovation used a combination of analytical writing, small group discussions, and Socratic discussion to accomplish these results.

Classroom Management with RStudio Server Professional

We introduce a new RStudio addin that facilitates the exchange, grading, and real-time supervision of student work in RStudio. We discuss the results of using the addin in a course in Fall 2016, and conclude with future extensions. Readers who expect the addin to benefit their own teaching will find installation instructions in the addin's GitHub repository. The source code is released under the LGPL version 3.

The TSHS Resources Portal: A Source of Real and Relevant Data for Teaching Statistics in the Health Sciences

The use of real and relevant data in the statistics classroom is an ideal for many teachers of statistics, as such data offer clear benefits for engagement with the kinds of real-world applications that students will encounter in the future. Yet the time required to identify and obtain well-documented data and to prepare classroom teaching materials is often prohibitive. In this paper we review the many benefits of using real and relevant data, and use these to define the characteristics of a well-documented health sciences data resource. We review the barriers to using real and relevant data in teaching. Lastly, we introduce the Teaching of Statistics in the Health Sciences (TSHS) Resources Portal, a source for well-documented health-related datasets and teaching materials. This dynamic resource provides a means to both obtain and contribute peer-reviewed educational materials. The target audience of the tool is teachers of statistics in health sciences settings, particularly those who teach introductory and applied courses in the graduate or post-graduate setting, but we expect that other statistics educators seeking rich, well-documented datasets will also find the Portal a useful resource.