Graph Pattern Mining

Thursday, February 28, 2008 - 5:14pm

XIFENG YAN IBM T. J. Watson Research Center

TIME: 3:30 – 4:30 p.m.
PLACE: Computer Science Conference Room, Harold Frank Hall Rm. 1132

Graphs become increasingly important in modeling complex structures that emerge in a wide range of domains. In the core of many graph-related applications, there is a strong need for mining interesting graph patterns with user-specified objective functions. Unfortunately, most interestingness functions are not anti-monotonic, for which existing frequency-centric mining algorithms could fail due to the exponential pattern space. In this talk, I will first introduce a new mining principle that explores the correlation between structural similarity and interestingness similarity, which could help identify the most significant patterns quickly by searching dissimilar patterns. The discovered patterns could be used to build high-quality interpretable graph classifiers. Next, I will address the same issue from the perspective of mining noisy graphs, which are often found in biological data. With introduction of graph summarization technique, it becomes possible to perform effective mining in noisy gene co-expression networks. In addition to graph pattern mining, I will briefly introduce my other work of graph mining for software failure diagnosis and social network analysis.

Xifeng Yan is a Research Staff Member at IBM T. J. Watson Research Center. He obtained his BE degree in computer engineering from Zhejiang University and his PhD degree in computer science from the University of Illinois at Urbana-Champaign. His current research interests are data mining, bioinformatics, and database systems. He has published more than 30 papers in refereed journals and conferences, including TODS, Bioinformatics, TSE, SIGMOD, SIGKDD, VLDB, and ISMB. His dissertation on graph mining and management received the 2006 ACM-SIGMOD Best Dissertation Runner-up Award.

HOST: Ambuj Singh