Reconstruction of Genetic Networks (Grace Shieh, organizer)
Satoru Miyano (Human Genome Center, Institute of Medical Science, University of Tokyo)
Computational Strategy for Drug Target Gene Discovery with Gene Networks
Saturday 8:30-8:50, Fountain III
Abstract:
We developed a series of computational methods based on Bayesian networks for mining gene networks from microarray gene expression data. We combined the Bayesian network approach with nonparametric regression: genes are regarded as random variables, and the nonparametric regression lets us capture both linear and nonlinear relationships between genes. To improve the biological accuracy of the estimated gene networks, we extended this method into a general framework that can incorporate other genome-wide biological information, such as sequence information on promoter regions, protein-protein interactions, protein-DNA interactions, and subcellular localization information. By definition, a Bayesian network assumes a directed acyclic graph as its underlying structure. However, gene regulatory networks may involve feedback loops, which violate the acyclicity condition. To work around this restriction, we also developed a dynamic Bayesian network with nonparametric regression for time-course gene expression data. Although finding an optimal Bayesian network is known to be computationally intractable, we developed an algorithm that searches for optimal and suboptimal Bayesian networks in feasible time for small networks. Computational experiments with this search algorithm have provided evidence for the biological rationality of our computational strategy.
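Illustrative note: the acyclicity issue described above can be sketched in a few lines. This is not the authors' software, only a minimal example under stated assumptions: the gene names, the three-gene feedback loop, and the helper names `has_cycle` and `unroll_in_time` are all hypothetical. It shows that a feedback loop is cyclic as a static graph, but becomes acyclic once each edge is redrawn from a gene at time t to a gene at time t + 1, which is the basic idea behind a dynamic Bayesian network.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (parent, child) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        graph[parent].append(child)
        nodes.update((parent, child))

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in nodes}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge to the current path: a cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in nodes)


def unroll_in_time(edges, n_steps):
    """Redraw each edge g -> h as (g, t) -> (h, t + 1) for consecutive time points."""
    return [((g, t), (h, t + 1)) for t in range(n_steps - 1) for g, h in edges]


# Hypothetical three-gene feedback loop (names are placeholders, not real data).
feedback_loop = [("geneA", "geneB"), ("geneB", "geneC"), ("geneC", "geneA")]

static_is_cyclic = has_cycle(feedback_loop)
unrolled_is_cyclic = has_cycle(unroll_in_time(feedback_loop, n_steps=4))
print(static_is_cyclic, unrolled_is_cyclic)
```

Because every unrolled edge points strictly forward in time, no path can return to its starting node, so the time-indexed graph satisfies the acyclicity condition even when the underlying regulation contains loops.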
These computational methods for estimating gene networks were applied to the search for drug target genes. For a given drug, our strategy assumes two kinds of microarray gene expression data. One is a short time course of gene expression data for the drug response. The other is a set of gene expression data obtained by knock-downs of several hundred carefully selected genes (one knock-down for each microarray measurement). With these gene expression data, our computational method produces a gene network, expressed as a Bayesian network, that most strongly relates to the mode of action of the drug in cells.
We prepared 270 novel gene knock-downs for HUVEC, and fenofibrate was used as the drug under investigation. Microarray measurements were conducted for these 270 gene knock-downs and for the time-course drug responses. From these data, we generated gene networks of around 1000 genes using the supercomputer system at the Human Genome Center of the University of Tokyo. We report an analysis of the computationally estimated gene networks and discuss how the networks can be explored to search for drug target genes, focusing on the genes around PPAR-alpha, for which fenofibrate is a known agonist. In this talk, we will also mention the computational capabilities and tools required for current and future research.
Ying Nian Wu (Department of Statistics, UCLA)
ChIP-chip: Data, Model, and Analysis
Saturday 8:50-9:10, Fountain III
Abstract:
ChIP-chip (or ChIP-on-chip) is a technology for isolation and
identification of genomic sites occupied by specific DNA binding proteins
in living cells. The ChIP-chip data can be obtained over the whole genome
by tiling arrays, where a peak in the signal is generally observed at a
protein binding site. In this talk, I will describe the ChIP-chip data and
propose a probability model for such data. I will then present a
model-based computational method for locating and testing peaks for the
purpose of identifying potential protein binding sites.
Joint work with Ming Zheng of UCLA, and Leah O. Barrera and Bing Ren of
UCSD. Mpeak software can be downloaded at
http://www.stat.ucla.edu/~zmdl/mpeak/
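Illustrative note: the abstract does not specify the probability model, so the sketch below does not reproduce Mpeak. It substitutes a deliberately naive rule (a local maximum above a fixed threshold) simply to show what "locating a peak" in tiled probe signal means. The function name `find_peaks`, the threshold, and the probe values are all made up for illustration.

```python
def find_peaks(signal, threshold):
    """Return indices of local maxima in `signal` whose value is at least `threshold`.

    A naive stand-in for peak calling: no probability model and no
    significance test, unlike the model-based method the abstract describes.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        rises = signal[i] > signal[i - 1]
        does_not_rise_after = signal[i] >= signal[i + 1]
        if signal[i] >= threshold and rises and does_not_rise_after:
            peaks.append(i)
    return peaks


# Hypothetical signal along consecutive tiled probes (invented values).
probe_signal = [0.1, 0.2, 0.1, 0.9, 2.4, 3.1, 2.2, 0.8, 0.2, 0.1, 1.6, 0.3]
peak_indices = find_peaks(probe_signal, threshold=1.5)
print(peak_indices)
```

A model-based approach would replace the fixed threshold with a fitted peak shape and a test of whether each candidate peak is stronger than background noise would explain.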
Ming-Chi Kao (University of Michigan School of Medicine)
Integrating Cross-Platform Microarray Data by Second-order Analysis: Functional Annotation and Network Reconstruction
Saturday 9:10-9:30, Fountain III
Abstract:
We discuss second-order gene expression analysis, which extracts expression patterns as meta-information from each data set individually and then analyzes them across multiple data sets. Using yeast as a model system, we demonstrate two distinct advantages of our approach. First, we can identify genes that share a function but show no coexpression pattern. Second, we can elucidate cooperativity between transcription factors for regulatory network reconstruction, overcoming a key obstacle in genetic network reconstruction: quantifying the activities of transcription factors.
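Illustrative note: one possible reading of "second-order" analysis is sketched below. It assumes that the pattern extracted from each data set is the Pearson correlation of a gene pair (the first-order quantity), and that the cross-data-set step correlates two such profiles. The abstract does not confirm this choice, and the gene names and expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def first_order_profile(datasets, gene_pair):
    """Correlation of one gene pair, computed separately within each data set."""
    g, h = gene_pair
    return [pearson(data[g], data[h]) for data in datasets]


def second_order_correlation(datasets, pair_1, pair_2):
    """Correlate two first-order profiles across the data sets."""
    return pearson(
        first_order_profile(datasets, pair_1),
        first_order_profile(datasets, pair_2),
    )


# Four hypothetical data sets (gene -> expression over four samples).
datasets = [
    {"a": [1, 2, 3, 4], "b": [1, 2, 3, 5], "c": [2, 1, 4, 3], "d": [2, 1, 4, 4]},
    {"a": [1, 2, 3, 4], "b": [4, 3, 2, 0], "c": [1, 3, 2, 4], "d": [4, 2, 3, 0]},
    {"a": [1, 2, 3, 4], "b": [2, 1, 4, 3], "c": [1, 2, 3, 4], "d": [1, 3, 2, 4]},
    {"a": [1, 2, 3, 4], "b": [3, 4, 1, 2], "c": [1, 2, 3, 4], "d": [4, 2, 3, 1]},
]

profile_ab = first_order_profile(datasets, ("a", "b"))
profile_cd = first_order_profile(datasets, ("c", "d"))
score = second_order_correlation(datasets, ("a", "b"), ("c", "d"))
print(profile_ab, profile_cd, score)
```

In this toy data the pairs (a, b) and (c, d) switch between positive and negative correlation in the same data sets, so their profiles track each other even though no single data set links the two pairs directly. Working at the level of per-data-set summaries also sidesteps direct comparison of raw values across platforms.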
Ines Thiele (Systems Biology Research Group, University of California, San Diego)
Constraint-based Modeling of Genome-scale Metabolic Networks: An Unbiased Assessment of Candidate Metabolic Network States
Saturday 9:30-9:50, Fountain III
Abstract:
Over the past two decades, advances in molecular biology, DNA sequencing and other high-throughput methods have dramatically increased the amount of information available for various organisms. Metabolic reconstructions integrate various 'omics' data sets into a database of genes, enzymes, and specific reactions that quantitatively describe the metabolic processes of an organism. These metabolic reconstructions can be mathematically represented with a stoichiometric matrix, thus enabling the use of various mathematical tools to investigate network capabilities. Constraint-based analyses have proven to be valuable for studying genome-scale metabolic networks. The constraint-based approach is based on the fact that cellular networks are constrained to operate within boundaries set by physico-chemical constraints (mass conservation, directional flow, enzymatic capacity, etc). The imposition of constraints corresponds to a mathematical definition of a solution space within which all feasible solutions lie. Optimization-based studies of the steady state flux space, such as identification of potential cellular objectives, prediction of optimal growth rates, and of lethality of gene knockouts, have proven very useful, but are limited in that they require the a priori statement of an objective for the network and thus bias the search for particular network states. New approaches are needed that will allow for the unbiased assessment of network potential.
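Illustrative note: the constraints described above are commonly written as follows. The symbols are assumptions introduced here, not notation from the abstract: S is the m-by-n stoichiometric matrix (m metabolites, n reactions), v is the vector of reaction fluxes, v_min and v_max are lower and upper flux bounds encoding directionality and capacity, and c is a vector of objective weights.

```latex
\begin{aligned}
  &\text{Steady-state solution space:} && S\,v = 0, \qquad v_{\min} \le v \le v_{\max} \\
  &\text{Optimization-based study:}    && \max_{v}\; c^{\top} v
      \quad \text{subject to} \quad S\,v = 0, \;\; v_{\min} \le v \le v_{\max}
\end{aligned}
```

The first line defines the set of all feasible flux distributions. The second line selects particular points of that set, and the choice of c is the a priori objective that the abstract identifies as a source of bias.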
An approach presented here is focused on uniform random sampling of the steady state flux space in order to fully determine the probability distribution of all possible steady state fluxes allowed in a network subjected to certain physico-chemical constraints. Using this method, candidate steady-state flux distributions were determined under different sets of constraints representing various physiological and patho-physiological conditions for two reconstructed metabolic networks: the human red blood cell and the human cardiac mitochondrion. The uniform random sampling of the steady-state flux spaces under simulated physiologic conditions yielded the following key results: 1) probability distributions for the values of individual metabolic fluxes showed a wide variety of shapes that could not have been inferred without computation; 2) pairwise correlation coefficients were calculated between all fluxes, determining the level of independence between the measurement of any two fluxes, and identifying highly correlated reaction sets; and 3) the network-wide effects of the change in one (or a few) variables (i.e., a simulated enzymopathy, diabetes or ischemia) were computed. Mathematical models provide a compact and informative representation of a hypothesis of how a cell works. Thus, understanding model predictions clearly is vital to driving forward the iterative model-building procedure that is at the heart of systems biology. Taken together, the uniform random sampling procedure provides a broadening of the constraint-based approach by allowing for the unbiased and detailed assessment of the impact of the applied physico-chemical constraints on a reconstructed network.
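Illustrative note: the abstract does not name the sampling algorithm, so the sketch below uses hit-and-run, a standard technique for sampling uniformly from a bounded convex region, as an assumed stand-in. The toy network (one metabolite with uptake flux v1 and two drain fluxes v2 and v3, all bounded between 0 and 10) is hypothetical and far smaller than the red blood cell or mitochondrion reconstructions.

```python
import random

# Hypothetical toy network: steady state for metabolite A requires v1 - v2 - v3 = 0.
# Each basis vector below satisfies that equation, so any combination does too.
NULL_BASIS = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0)]
LOWER = (0.0, 0.0, 0.0)     # assumed lower flux bounds
UPPER = (10.0, 10.0, 10.0)  # assumed upper flux bounds


def chord_limits(point, direction):
    """Range of step sizes t that keeps point + t * direction inside the bounds."""
    t_min, t_max = float("-inf"), float("inf")
    for x, d, lo, hi in zip(point, direction, LOWER, UPPER):
        if abs(d) < 1e-12:
            continue  # this flux does not change along the direction
        t1, t2 = (lo - x) / d, (hi - x) / d
        t_min = max(t_min, min(t1, t2))
        t_max = min(t_max, max(t1, t2))
    return t_min, t_max


def hit_and_run(start, n_samples, rng):
    """Sample steady-state flux vectors, starting from a feasible interior point."""
    point = list(start)
    samples = []
    while len(samples) < n_samples:
        # Random direction inside the null space, so S v = 0 is preserved.
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        direction = [a * u + b * w for u, w in zip(*NULL_BASIS)]
        t_min, t_max = chord_limits(point, direction)
        if t_min == float("-inf") or t_max == float("inf"):
            continue  # degenerate direction; draw another one
        step = rng.uniform(t_min, t_max)  # uniform point on the feasible chord
        point = [x + step * d for x, d in zip(point, direction)]
        samples.append(tuple(point))
    return samples


samples = hit_and_run(start=(2.0, 1.0, 1.0), n_samples=2000, rng=random.Random(0))
mean_flux = [sum(s[i] for s in samples) / len(samples) for i in range(3)]
print(mean_flux)
```

Histograms of each column of `samples` would approximate the per-flux probability distributions, and correlations between columns would correspond to the pairwise flux correlations mentioned in the abstract. No objective function appears anywhere in the procedure.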