Distributed Dataset Synchronization in Named Data Networking
- Author(s): Shang, Wentao
- Advisor(s): Zhang, Lixia
- et al.
Distributed dataset synchronization (sync for short) provides an important abstraction for multi-party data-centric communication in the Named Data Networking (NDN) architecture. Several NDN sync protocols have been developed so far, each taking a different design approach. They all enable a group of distributed nodes to publish data to, and consume data from, a shared dataset while maintaining a consistent state of the dataset among the participants. However, each of them has design issues that cause inefficiency in dataset synchronization. In addition, none of them provides built-in membership management support, making it difficult to remove departed nodes from the sync protocol state (maintained by every node in the group). As a result, existing applications running on top of sync have to implement group management themselves, either at the application layer or by extending the underlying sync protocol.
In this dissertation we first present a comparative study of the design of existing sync protocols. Our analysis focuses on the data naming convention, the representation of the dataset namespace, and the state synchronization mechanism. Through a side-by-side comparison of those key design aspects, we identify common design patterns in the existing protocols and articulate their design issues.
Based on the lessons learned from the comparative study, we design a new sync protocol called VectorSync that addresses the issues in existing work and enables new functions. In VectorSync, the state of the shared dataset namespace is concisely represented by a version vector, which allows the sync nodes to detect and reconcile inconsistencies efficiently. Every node maintains a consistent view of the current group members through a leader-based membership synchronization mechanism, which also provides the basis for data authentication and access control. Our simulation-based evaluation shows that the VectorSync design improves the efficiency of dataset synchronization compared to the widely used ChronoSync protocol under various network conditions and provides efficient group membership management without affecting the synchronization speed.
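To illustrate how a version vector can expose and reconcile inconsistencies, the following is a minimal sketch of the general technique, not the VectorSync implementation or its wire format. It assumes each node numbers its own publications sequentially starting from 1; the names `compare`, `missing`, and `merge`, and the string node identifiers, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Assumed representation: node ID -> highest sequence number seen from that node.
VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify vector `a` relative to `b`.

    Returns "equal", "ahead", "behind", or "concurrent" (each side holds
    data the other has not seen). Absent entries count as 0.
    """
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "ahead"
    if b_ahead:
        return "behind"
    return "equal"


def missing(local: VersionVector, remote: VersionVector) -> List[Tuple[str, int]]:
    """List (node, seq) pairs that `remote` has seen and `local` has not."""
    gaps = []
    for node in sorted(remote):
        start = local.get(node, 0) + 1
        for seq in range(start, remote[node] + 1):
            gaps.append((node, seq))
    return gaps


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum: the state after both sides have reconciled."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


# Example with three hypothetical nodes A, B, and C.
local = {"A": 3, "B": 1}
remote = {"A": 2, "B": 3, "C": 1}

relation = compare(local, remote)      # each side is missing some data
to_fetch = missing(local, remote)      # items the local node would request
reconciled = merge(local, remote)      # vector once the fetches complete
```

In this sketch, a node that receives a remote vector can compute exactly which items it lacks from the two vectors alone, which is the property that makes the representation attractive for detecting inconsistencies.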
At the end of this dissertation we present the design of a few pilot applications in the Internet of Things (IoT) area that utilize different NDN sync protocols to enable important functions that are often difficult to achieve under the current TCP/IP-based IoT architecture. These new applications demonstrate the importance of NDN sync and illustrate the unique sync-based application design patterns that arise from NDN's data-centric communication model.