Systems modeling and network inference for genetic and epigenetic regulation of gene expression
Date of Issue: 2016-04-19
School of Computer Engineering
Bioinformatics Research Centre
Gene expression is a fundamental activity in the cellular environment. Because gene regulation is complex, the regulatory mechanisms of gene expression in many cellular processes remain elusive despite long-standing research. Computational methods have become indispensable companions to experimental work by providing predictions and guidance. In this thesis, we propose computational methods to elucidate the regulation of gene expression on two basic and interacting layers, namely the genetic and epigenetic layers. First, we focus on the inference of gene regulatory networks (GRNs) from gene expression data, which is a central and challenging problem in systems biology. Here we aim to reconstruct the transcriptional regulatory networks controlling cell fate decisions in mammalian embryonic development. We propose a novel method that integrates the structure of a cell lineage tree with transcriptional patterns from single-cell data. This method adopts a probabilistic Boolean network for network modeling and a genetic algorithm as the search strategy. Guided by the “directionality” of cell development along the branches of the cell lineage tree, our method is able to accurately infer the regulatory circuits from single-cell gene expression data in a holistic way. Applied to single-cell transcriptional data of mouse preimplantation development, our algorithm outperforms state-of-the-art methods of GRN inference. In addition, we develop an algorithm to infer time-delayed GRNs from time-series gene expression data. Time delay is an essential characteristic of gene regulation. However, the inference of time-delayed GRNs is even more challenging because it is a higher-order inference from less information (i.e., the limited number of time points in time-series data). We propose an algorithm based on cross-correlation and network deconvolution to infer time-delayed GRNs.
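To make the cross-correlation idea concrete, the following is a minimal sketch of how a time delay between a candidate regulator and a target gene might be estimated from two expression time series. It is an illustration under stated assumptions, not the algorithm of the thesis: the function names (`pearson`, `lagged_correlation`, `best_lag`) and the use of the Pearson coefficient as the correlation score are hypothetical choices, and the network deconvolution step is not shown.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def lagged_correlation(regulator, target, lag):
    """Correlate regulator[t] with target[t + lag] for a non-negative lag."""
    if len(regulator) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or len(regulator) - lag < 2:
        raise ValueError("lag must leave at least two overlapping time points")
    if lag == 0:
        return pearson(regulator, target)
    return pearson(regulator[:-lag], target[lag:])


def best_lag(regulator, target, max_lag):
    """Return (lag, score) for the lag in 0..max_lag with the largest |correlation|."""
    scores = {
        lag: lagged_correlation(regulator, target, lag)
        for lag in range(max_lag + 1)
    }
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]
```

In this sketch, a pair of genes whose best-scoring lag is greater than zero would be a candidate time-delayed edge. With the few time points typical of expression time series, each additional lag shortens the overlap between the two series, which is one way to see why the inference becomes harder as the delay grows.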
Experiments on time-series gene expression datasets of yeast show that our method has significantly higher sensitivity and a better F-measure than other methods for time-delayed GRN inference. Next, we investigate the relationships between epigenetic modifications and global gene expression in an organism. Based on the technique of association rule mining, we propose a framework to uncover the relationships between combinatorial histone modifications and the global transcriptional state. Our findings reveal that histone modifications regulate transcription on a large scale. Some of the rules are consistent with the literature, while other rules are de novo and could guide further experiments. We also build a model to predict gene expression based on our association rules. Our model outperforms a published Bayesian network model for gene expression prediction using histone modifications. Our results lend new support to the “histone code” hypothesis and provide insight into the epigenetic regulatory mechanisms of transcription as well as other biological processes. Furthermore, we integrate epigenetic data into GRN inference to improve its performance. We choose a Bayesian network to model a GRN for its strength in integrating prior knowledge. Here a dynamic Bayesian network (DBN) is employed to capture dynamic gene regulation from time-series gene expression data. Epigenetic data (here, histone modifications) are incorporated into the prior probability of the Bayesian model using a Gibbs distribution. The effectiveness of our method has been evaluated on both simulated data and real gene expression data from the yeast cell cycle. Experimental results show that the integration of epigenetic data improves the performance of GRN inference, which indicates an intimate relationship between epigenetic modifications and gene regulation.
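As an illustration of how prior knowledge can enter a Bayesian network model through a Gibbs distribution, the sketch below scores a candidate network structure G with an unnormalized prior P(G) ∝ exp(−β·E(G)). The energy used here, the summed absolute difference between each candidate edge and a prior belief in [0, 1], is an assumption made for this example, as are the names `energy`, `log_gibbs_prior`, `prior_belief`, and `beta`; the abstract does not specify the energy function or how histone modification data are converted into prior beliefs.

```python
def energy(adjacency, prior_belief):
    """Sum of |belief - edge| over all gene pairs.

    adjacency[i][j] is 1 if the candidate network has an edge i -> j, else 0.
    prior_belief[i][j] is a value in [0, 1]; 0.5 means "no prior information".
    Both matrices are hypothetical inputs for this sketch.
    """
    return sum(
        abs(belief - edge)
        for edge_row, belief_row in zip(adjacency, prior_belief)
        for edge, belief in zip(edge_row, belief_row)
    )


def log_gibbs_prior(adjacency, prior_belief, beta):
    """Unnormalized log prior: log P(G) = -beta * E(G) + constant."""
    return -beta * energy(adjacency, prior_belief)
```

Under these assumptions, a larger `beta` makes the prior favor structures that agree with the prior beliefs more strongly, while `beta = 0` gives every structure the same prior and leaves the inference driven by the expression data alone.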
To conclude, we have made the following contributions in this thesis: proposing a novel method to infer GRNs from single-cell gene expression data, providing an effective method to infer time-delayed GRNs with high sensitivity, developing a framework for histone code mining, and integrating epigenetic data into GRN inference to improve its performance. The mechanisms of gene regulation have been investigated on both the genetic and epigenetic layers to achieve more accurate models and more precise predictions. Our work could help biologists elucidate the complex program of gene regulation, which could in turn provide insight into the molecular basis of normal development as well as disease.
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences