This dissertation focuses on measuring, analyzing, and modeling emerging applications
in the Internet. Specifically, we concentrate on understanding the internals of content
distribution paradigms such as Peer-to-Peer (P2P) systems and podcasts. The dissertation
consists of three main thrusts, which we describe below.
P2P streams have been reported to constitute nearly 61% of all upstream traffic. They
are used for disseminating content ranging from video programs to Linux images.
The ubiquity of P2P networks has also allowed them to be used for sharing
copyrighted material, which has led organizations such as the RIAA to take legal action
against file-sharers. As a result, P2P users have employed defenses against being monitored
by such organizations. We find that a little caution pays off a lot, since a naive P2P user
has a 100% probability of being monitored when accessing these networks.
Further, we present a comprehensive case study of eDonkey, a popular
P2P network. We identify the limitations of current approaches to measuring P2P networks.
Additionally, we find that P2P flows traverse the Internet quite differently from HTTP
flows. Based on this, we present metrics useful for distinguishing P2P traffic from
traditional forms of content distribution in the Internet.
Finally, podcasts, a relatively new content distribution mechanism, are expected to garner
an audience of nearly 56 million subscribers by 2010. Measuring and modeling podcasts
remains an open problem despite the significance this application has gained. This
form of content distribution is best described as a push-based mechanism, which differs
from traditional HTTP-based content distribution. We measure podcast streams, analyze them,
and develop a traffic generator, SimPod, for simulation purposes.