Center for the Teaching of Statistics
Internet Data Analysis for the Undergraduate Statistics Curriculum. Journal of Statistics Education (forthcoming)
- Author(s): Sanchez, Juana
- He, Yan
- et al.
Statistics textbooks for undergraduates have not caught up with the large volume of Internet data analysis now taking place. Case studies that use Web server log data, Internet survey data, or Internet network traffic data are rare in undergraduate Statistics education. This paper summarizes the results of research in three areas of Internet data analysis: users' web browsing behavior, user demographics, and network performance. We present some of the main questions analyzed in the literature, some unsolved problems, and some typical data analysis methods used. We illustrate the questions and the methods with large data sets obtained from publicly available sources. These data sets had to be processed and transformed to make them suitable for classroom exercises. The processed data sets, along with additional classroom material, are available at a web site whose address can be obtained from the main author.