Arijit Khan presents "Towards Querying and Mining of Big-Graphs"

Arijit Khan
Title: "Towards Querying and Mining of Big-Graphs"

ABSTRACT

With the advent of the Internet, sources of data have increased dramatically, including the World Wide Web, social networks, knowledge graphs, and medical and government records. These semi-structured data are usually represented as big-graphs with labeled nodes and edges. Querying and mining of these linked datasets are essential for a wide range of emerging applications, such as viral marketing, web search, malware detection, image retrieval, and social network analysis. However, the complex combinations of structure and content, coupled with the massive volume of these data, raise several challenges that require new efforts for smarter and faster graph analysis.

In this talk, I shall discuss two broad directions of my research: (1) querying of large-scale networks, including heterogeneous networks, uncertain graphs, and stream graphs, and (2) pattern mining over large graphs. In the domain of querying heterogeneous networks, due to noise and lack of schema, structured methods such as SPARQL, which require an underlying schema to formulate a query, are often too restrictive. Without knowing the exact structure of the data and the semantics of the entity labels and their relationships, can we still query them and obtain the relevant results? How do we query uncertain graphs and streams? Can we detect malware by mining discriminative features from their call graphs? From the perspective of advertising and viral marketing, what are the top-k most interesting itemsets and the top-k most influential persons in a social network? I shall present our effective and efficient techniques for solving these emerging problems associated with querying and mining of complex Big-Graphs.
Finally, I shall conclude by stating my current and future research directions, including the possibility of hardware-based solutions for graph querying, building a scalable, distributed storage system for massive graphs, and potential applications of graphs in crowd-sourcing.

BIOGRAPHICAL NOTE

Arijit Khan is a postdoctoral researcher in the Systems group at ETH Zurich. His research interests span big-data, big-graphs, and graph systems. He received his PhD from the Department of Computer Science, University of California, Santa Barbara. His PhD dissertation focused on efficiently answering queries in large-scale social and information networks that are noisy and often lack a fixed schema. Arijit received the prestigious IBM PhD Fellowship in 2012-13. He co-presented a tutorial on emerging queries over linked data at ICDE 2012.