Internet Data Analysis for the Undergraduate Statistics Curriculum. Journal of Statistics Education (forthcoming)
Undergraduate Statistics textbooks have not caught up with the large volume of Internet data analysis now taking place. Case studies that use Web server log data, Internet survey data, or Internet network traffic data are rare in undergraduate Statistics education. This paper summarizes research results in three areas of Internet data analysis: users' web browsing behavior, user demographics, and network performance. We present some of the main questions analyzed in the literature, some unsolved problems, and some typical data analysis methods. We illustrate the questions and methods with large, publicly available data sets, which had to be processed and transformed to make them suitable for classroom exercises. The processed data sets, as well as additional class material, are available at a web site whose address can be obtained from the main author.
Purposes of the study: (1) Assessing the extent to which students in a lower-division class (Statistics 10) are expected to engage in recall of statistical information; comprehension and interpretation of statistical information; and application, analysis, synthesis, and evaluation of statistical concepts and methods. (2) Assessing the extent to which the questions asked on Statistics 10 examinations are stated within context and with reference to real-world problems. (3) Analyzing the types of questions (multiple-choice, true-false, word problems, and calculation problems) asked on Statistics 10 exams by level of challenge, context, and content taught.
Data Analysis Activities and Problems for the Computer Science Major in a Post-calculus Introductory Statistics Course
The material presented here is a small subset of the problems currently being prepared for a larger instructional improvement project funded by the Office of Instructional Development (OID) at UCLA. The objective of the project is to create a manual with data sets and contextual problems for Computer Science majors that will complement the textbooks used in the calculus-based upper-division Applied Statistics course. More than one third of the students in this course are Computer Science majors; the rest come from Engineering and Applied Math, with very few majoring in fields such as Economics, Biology, or Genetics. The course is a prerequisite for a Computer Science Department course on advanced Probability Models for Computer Science, although many majors who take Applied Statistics never go on to take it.