Caching Strategies for Private and Efficient Content Retrieval in Information-Centric Networks
- Author(s): Abani, Noor
- Advisor(s): Gerla, Mario
In recent years, we have witnessed explosive growth in the storage, retrieval, and delivery of content on the Internet. Consequently, the use of computer networking has shifted from sharing hardware and processing resources, its original purpose, to accessing and sharing content. The current TCP/IP Internet architecture, having been designed for the former purpose, therefore faces several problems in adapting to this phenomenal increase in content.
Information-Centric Networking (ICN) is a proposal for a future network architecture that better fits the current needs of Internet users. ICN shifts from the host-centric TCP/IP architecture to one in which content is the focal point. Specifically, IP addresses are replaced by content names as the main identifier for packet routing and forwarding.
By relaxing the end-to-end principle, which keeps end-to-end transactions unaware of the resources and content available along the path, ICN leverages in-network caching to make content distribution more efficient, reducing network traffic, download time, and server load. In-network caching is a potential performance booster, but done naively it can both forfeit these benefits and pose several privacy threats. In this thesis, we propose caching mechanisms that achieve efficient content distribution while also providing privacy guarantees to the requester of content.
In current ICN implementations, the default caching strategy is one in which each router caches every piece of content that passes through it. Previous work has shown that such universal caching is inefficient and highly redundant, incurs high replacement rates, and breaches user privacy.
This thesis addresses the aforementioned limitations by designing caching strategies that reduce redundancy and replacement rates while protecting user privacy. The problem is first formulated as an optimization problem that finds the optimal storage allocation and content placement minimizing data delivery costs, subject to storage budgets and link capacities, while taking content popularity profiles into account. Since such information might not be available a priori, we also propose two popularity-based caching strategies in which routers autonomously track content popularity and progressively cache file chunks as a file grows more popular. Noting that previous caching schemes neglect the effect of content popularity dynamics, we emulate these dynamics by simulating spatial and temporal locality of content items and study their effect on in-network caching policies.
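To make the popularity-based idea concrete, the sketch below models a router that counts requests per file and caches a growing prefix of the file's chunks as its request count crosses popularity tiers. The class name, tier thresholds, and prefix-growth rule are illustrative assumptions, not the thesis's exact scheme.

```python
# Sketch of popularity-based progressive chunk caching (illustrative;
# thresholds and the prefix-growth rule are assumptions, not the thesis's scheme).
from collections import defaultdict

class ProgressiveCachingRouter:
    def __init__(self, capacity_chunks, thresholds=(2, 5, 10)):
        self.capacity = capacity_chunks          # total chunk budget at this router
        self.thresholds = thresholds             # request counts that open new tiers
        self.request_count = defaultdict(int)    # per-file request counter
        self.cache = {}                          # file_id -> set of cached chunk ids

    def chunks_to_cache(self, file_id, total_chunks):
        """More popular files earn a larger cached prefix of their chunks."""
        count = self.request_count[file_id]
        tier = sum(count >= t for t in self.thresholds)   # 0 .. len(thresholds)
        return int(tier / len(self.thresholds) * total_chunks)

    def on_request(self, file_id, total_chunks):
        """Record a request and cache the popularity-determined chunk prefix."""
        self.request_count[file_id] += 1
        target = self.chunks_to_cache(file_id, total_chunks)
        cached = self.cache.setdefault(file_id, set())
        for chunk in range(target):
            if sum(len(c) for c in self.cache.values()) >= self.capacity:
                break                            # respect the storage budget
            cached.add(chunk)
        return sorted(cached)
```

Under this rule a file seen once caches nothing; once it crosses the first threshold, a third of its chunks are cached, and so on, so cache space concentrates on popular content.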
In addition to its inefficiency, universal caching has given rise to several privacy risks. Timing attacks are one such breach, in which attackers use timing analysis of data retrievals to infer users' interest in particular content. In this thesis, we analyze ICN's vulnerability to timing attacks, propose centrality-based caching to mitigate them, and use the anonymity set privacy metric to show that our strategy does provide privacy guarantees. Moreover, our simulations show that we can still achieve high hit rates and low delays while alleviating the network's vulnerability to timing attacks.
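The attack exploits the latency gap between a cache hit and a fetch from the origin. A minimal model of this, with made-up latency values and threshold (all names and numbers here are assumptions for illustration only):

```python
# Illustrative timing attack on an ICN edge cache (simplified model;
# latency values, jitter, and threshold are assumptions for the sketch).
import random

CACHE_HIT_RTT_MS = 5     # content served from the nearby router's cache
CACHE_MISS_RTT_MS = 80   # content fetched from the origin server

def measure_rtt(content_name, cached_contents):
    """Model of probing a content name and timing the response."""
    base = CACHE_HIT_RTT_MS if content_name in cached_contents else CACHE_MISS_RTT_MS
    return base + random.uniform(0, 2)  # small network jitter

def infer_recent_interest(content_name, cached_contents, threshold_ms=40):
    """A short RTT suggests the edge router already holds the content,
    i.e. some nearby user requested it recently."""
    return measure_rtt(content_name, cached_contents) < threshold_ms
```

By probing a list of content names and thresholding the RTTs, an attacker sharing an edge router learns which names its neighbors have recently fetched; caching decisions that decouple a router's contents from its local users' requests shrink what this probe reveals.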
Lastly, we design a proactive caching strategy for ICN in which content is prefetched and cached before a user requests it. Proactive caching has previously been considered only for edge networks. In this thesis, we design a proactive caching strategy that leverages ICN's flexibility to cache data anywhere in the network, rather than just at the edge. This flexibility reduces the redundant caching needed to combat prediction uncertainty, by prefetching uncertain content at routers higher in the network hierarchy.
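One way to realize this placement rule is to map prediction confidence to a level of the caching hierarchy: confident predictions go to the edge, near the predicted requester, while uncertain ones are cached higher up, where a single copy covers many downstream users. The mapping below is a sketch under assumed level numbering, not the thesis's actual policy.

```python
# Sketch: choosing a prefetch level from prediction confidence
# (level numbering and the linear mapping are illustrative assumptions).
def prefetch_level(confidence, num_levels=3):
    """Level 0 = edge router (closest to the user); num_levels-1 = nearest the core.
    High-confidence predictions are prefetched at the edge; low-confidence ones
    higher up, where one cached copy hedges against prediction error."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    level = int((1.0 - confidence) * num_levels)  # less confident -> higher level
    return min(level, num_levels - 1)
```

For example, a 95%-confident prediction lands at the edge (level 0), while a 20%-confident one is parked near the core (level 2), trading proximity for coverage.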
The end goal of this thesis is a better understanding of the limitations of the universal caching policy, together with the design of both reactive and proactive caching strategies that, by comparison, provide more efficient content retrieval while also protecting user privacy.