No icon

MCS-GPM: Multi-Constrained Simulation Based Graph Pattern Matching in Contextual Social Graphs

MCS-GPM: Multi-Constrained Simulation Based Graph Pattern Matching in Contextual Social Graphs

Abstract:

Graph Pattern Matching (GPM) has been used in lots of areas, like biology, medical science, and physics. With the advent of Online Social Networks (OSNs), recently, GPM has been playing a significant role in social network analysis, which has been widely used in, for example, finding experts, social community mining and social position detection. Given a query which contains a pattern graph GQ and a data graph GD, a GPM algorithm finds those subgraphs, GM, that match GQ in GD. However, the existing GPM methods do not consider the multiple end-to-end constraints of the social contexts, like social relationships, social trust and social positions on edges in GQ, which are commonly found in various applications, such as crowdsourcing travel, social network based e-commerce and study group selection, etc. In this paper, we first conceptually extend Bounded Simulation to Multi-Constrained Simulation (MCS), and propose a novel NP-Complete Multi-Constrained Graph Pattern Matching (MC-GPM) problem. Then, to address the efficiency issue in large-scale MC-GPM, we propose a new concept called Strong Social Component (SSC), consisting of participants with strong social connections. We also propose an approach to identifying SSCs, and propose a novel index method and a graph compression method for SSC. Moreover, we devise a multithreading heuristic algorithm, called M-HAMC, to bidirectionally search the MC-GPM results in parallel without decompressing graphs. An extensive empirical study over five real-world large-scale social graphs has demonstrated the effectiveness and efficiency of our approach.

Existing System:

The bounded simulation based GPM only considers the bounded path length of an edge when matching the edges, which greedily finds the subgraph that has the minimal diameter. However, social graphs have many social contexts associated with vertices and edges, like the Contextual Social Graph (CSG), where each vertex has the social role information in a specific domain (e.g., a professor in data mining), and each edge has the social relationship between participants (e.g., a father and his son) and the social trust information between participants (e.g., Tom trusts Bob in car repairing). In a variety of GPM based applications in social graphs, e.g., traveller selection in crowd-sourcing travel, study group selection (classroomsalon.com), and the expert selection in social graphs, in addition to the bounded path length, people are willing to incorporate the constraints of the intimacy social relationship and social trust between people in the identified subgraph in terms of GPM in a CSG, which have significant influence on people’s collaborations and decision making.

Proposed System:

We first propose a new notion of Multiple-Constrainted Simulation (MCS). In contrast to its traditional counterpart, the MCS based MC-GPM is to find a graph pattern matching result, where each edge of the matching graph satisfies both the bounded path length and the multiple constraints on edges, which can better support many emerging social network based applications.

We then propose a concept called Strong Social Component  (SSC), which consists of participants who have strong social connections, and propose an approach to identifying SSCs. As the social connections in SSC usually stay stable in a very long period of time, we propose a novel index structure and a graph compression method for SSC with polynomial time complexity. Our method can match the pattern graph without any graph decompression, which can reduce storage consumption and improve efficiency.

Comment As:

Comment (0)