Skip to main content
eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations bannerUC Irvine

Influence Maximization in GOLAP

Abstract

The notion of influence among people or organizations has been the core conceptual basis for making various decisions and performing social activities in our society. With the increasing availability of datasets in various domains such as Social Networks and digital healthcare, it becomes more feasible to apply complex analytics on influence networks. However, there exist technical challenges of representing various types of influence networks, handling the variability on the analytics types, and optimizing the time complexity in running the analytics. We present a comprehensive approach to managing influence networks using a set of extended graph models, called Graph-based OLAP (GOLAP). The design space for GOLAP is defined by the incorporation of node types (i.e., colors), weights on relationships (i.e., edges), constraints on the number of nodes for a certain node type, and constraints on the percentage of nodes for a certain node type. We begin with defining a method to find a Strongest Influence Path (SIP) which is the strongest path from the source node to the target node. Then, we extend it with k-colors, a constraint on the number of nodes, and a constraint on the percentage of nodes. Hence, we can answer complex queries on influence networks such as “find the SIP with t nodes of color c” or “find the SIP with t% nodes of color c.” Based on the SIP model, we present a set of Influence Maximization (IM) methods which find a set of s seed nodes that can influence the whole graph maximally with various constraints such as having ‘t nodes of color c’. We apply the IM methods to Gastrointestinal (GI) cancer data and prove the proposed approach works well in the context of GI cancers. We use text mining to identify objects and relationships to construct a graph and use graph-based IM to discover the most influential co-occurring genes. We also address the methods for optimizing the time complexity of the analytics algorithms. We apply heuristic-based and graph reduction-based methods to reduce the time complexity. In addition to proving the proposed methods, we present the result of our implementation on the methods.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View