Sequence data are ubiquitous in diverse domains such as bioinformatics, computational neuroscience, and user behavior analysis. As a result, many critical applications require extracting knowledge from sequences at multiple levels. For example, mining frequent patterns is the central goal of motif discovery in biological sequences, while in computational neuroscience, one essential task is to infer causal networks from neural event sequences (spike trains). Despite these differences, most existing knowledge extraction tools face new challenges posed by modern instruments. That is, as large-scale, high-resolution sequence data become available, we need knowledge extraction tools with better efficiency and higher accuracy.
In this talk, I will first present our work on a statistical tool that can be used to accurately identify inhibitory causal relations from spike trains. While most existing work focuses on characterizing the statistics of neural spike trains, we show that it is crucial to make predictions about how neurons respond to changes. More importantly, our results are validated by real biological experiments with a novel instrument, which makes this work the first of its kind. Second, I will introduce our recent progress on predicting neural conditions with a deep learning approach. In this work, we aim to address the problem of classifying multiple groups of heterogeneous, intra-dependent spike trains, where there is no easy way to use raw sequences directly as inputs for training an end-to-end classification model.