efficient gene-network reconstruction: Topics by Science.gov

Sample records for efficient gene-network reconstruction

Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities

PubMed Central

2011-01-01

Background Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. Results This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress.
In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network. Conclusions The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork on E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions. PMID:21668997
Gene regulatory network identification from the yeast cell cycle based on a neuro-fuzzy system.

PubMed

Wang, B H; Lim, J W; Lim, J S

2016-08-30

Many studies exist for reconstructing gene regulatory networks (GRNs). In this paper, we propose a method based on an advanced neuro-fuzzy system for gene regulatory network reconstruction from microarray time-series data. This approach uses a neural network with a weighted fuzzy function to model the relationships between genes. Fuzzy rules, which determine the regulators of genes, are greatly simplified through this method. Additionally, a regulator selection procedure is proposed, which extracts the exact dynamic relationship between genes, using the information obtained from the weighted fuzzy function. Time-series related features are extracted from the original data to employ the characteristics of temporal data that are useful for accurate GRN reconstruction. The microarray dataset of the yeast cell cycle was used for our study. We measured the mean squared prediction error to assess the efficiency of the proposed approach and evaluated the accuracy in terms of precision, sensitivity, and F-score. The proposed method outperformed the other existing approaches.
Identifying Functional Mechanisms of Gene and Protein Regulatory Networks in Response to a Broader Range of Environmental Stresses

PubMed Central

Li, Cheng-Wei; Chen, Bor-Sen

2010-01-01

Cellular responses to sudden environmental stresses or physiological changes provide living organisms with the opportunity for final survival and further development. Therefore, it is important to understand protective mechanisms against environmental stresses from the viewpoint of gene and protein networks. We propose two coupled nonlinear stochastic dynamic models to reconstruct stress-activated gene and protein regulatory networks via microarray data in response to environmental stresses. According to the reconstructed gene/protein networks, some possible mutual interactions, feedforward and feedback loops are found for accelerating response and filtering noise in these signaling pathways. A bow-tie core network is also identified to coordinate mutual interactions and feedforward loops, feedback inhibitions, feedback activations, and cross talks to cope efficiently with a broader range of environmental stresses with limited proteins and pathways. PMID:20454442
Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm

PubMed Central

Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

2012-01-01

Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenging computational problem in systems biology. However, every existing algorithm for inference from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of previous algorithms are not high enough. In this work, we proposed a novel algorithm for inference from gene expression data based on a differential equation model. In this algorithm, two methods were included for inferring GRNs. Before reconstructing GRNs, the singular value decomposition method was used to decompose gene expression data, determine the algorithm solution space, and get all candidate solutions of GRNs. Within this generated family of candidate solutions, a modified gravitation field algorithm was used to infer GRNs, optimize the criteria of the differential equation model, and search for the best network structure. The proposed algorithm is validated on both a simulated scale-free network and a real benchmark gene regulatory network from a networks database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and their results were compared with those of the proposed algorithm. A genetic algorithm and simulated annealing were also used to evaluate the gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which significantly outperforms previous algorithms. PMID:23226565
SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data.

PubMed

Woodhouse, Steven; Piterman, Nir; Wintersteiger, Christoph M; Göttgens, Berthold; Fisher, Jasmin

2018-05-25

Reconstruction of executable mechanistic models from single-cell gene expression data represents a powerful approach to understanding developmental and disease processes. New ambitious efforts like the Human Cell Atlas will soon lead to an explosion of data with potential for uncovering and understanding the regulatory networks which underlie the behaviour of all human cells. In order to take advantage of this data, however, there is a need for general-purpose, user-friendly and efficient computational tools that can be readily used by biologists who do not have specialist computer science knowledge. The Single Cell Network Synthesis toolkit (SCNS) is a general-purpose computational tool for the reconstruction and analysis of executable models from single-cell gene expression data. Through a graphical user interface, SCNS takes single-cell qPCR or RNA-sequencing data taken across a time course, and searches for logical rules that drive transitions from early cell states towards late cell states. Because the resulting reconstructed models are executable, they can be used to make predictions about the effect of specific gene perturbations on the generation of specific lineages. SCNS should be of broad interest to the growing number of researchers working in single-cell genomics and will help further facilitate the generation of valuable mechanistic insights into developmental, homeostatic and disease processes.
CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

PubMed Central

Baumbach, Jan

2007-01-01

Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows the inter-species comparison of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems.
Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up to date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320
Evolutionary Construction of Block-Based Neural Networks in Consideration of Failure

NASA Astrophysics Data System (ADS)

Takamori, Masahito; Koakutsu, Seiichi; Hamagami, Tomoki; Hirata, Hironori

In this paper we propose a modified gene coding and an evolutionary construction in consideration of failure in evolutionary construction of Block-Based Neural Networks (BBNNs). In the modified gene coding, we arrange the genes of weights on a chromosome in consideration of the positional relation between the genes of weight and structure. The modified gene coding increases the efficiency of search by crossover, which is expected to improve the convergence rate of construction and shorten construction time. In the evolutionary construction in consideration of failure, a structure adapted to failure is built in the state where the failure occurred, so that a BBNN can be expected to be reconstructed in a short time when a failure happens. To evaluate the proposed method, we apply it to pattern classification and autonomous mobile robot control problems. The computational experiments indicate that the proposed method can improve the convergence rate of construction and shorten construction and reconstruction time.
BoolNet--an R package for generation, reconstruction and analysis of Boolean networks.

PubMed

Müssel, Christoph; Hopfensitz, Martin; Kestler, Hans A

2010-05-15

As the study of information processing in living cells moves from individual pathways to complex regulatory networks, mathematical models and simulation become indispensable tools for analyzing the complex behavior of such networks and can provide deep insights into the functioning of cells. The dynamics of gene expression, for example, can be modeled with Boolean networks (BNs). These are mathematical models of low complexity, but have the advantage of being able to capture essential properties of gene-regulatory networks. However, current implementations of BNs only focus on different sub-aspects of this model and do not allow for a seamless integration into existing preprocessing pipelines. BoolNet efficiently integrates methods for synchronous, asynchronous and probabilistic BNs. This includes reconstructing networks from time series, generating random networks, robustness analysis via perturbation, Markov chain simulations, and identification and visualization of attractors. The package BoolNet is freely available from the R project at http://cran.r-project.org/ or http://www.informatik.uni-ulm.de/ni/mitarbeiter/HKestler/boolnet/ under Artistic License 2.0. Contact: hans.kestler@uni-ulm.de. Supplementary data are available at Bioinformatics online.
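Illustrative sketch (Python, not BoolNet's R API): the synchronous Boolean network model and attractor identification mentioned in the abstract above can be shown on a toy example. The three-gene network, its rules, and the function names `step` and `find_attractors` are hypothetical and chosen only for illustration. Exhaustive enumeration of all 2^n start states is assumed, which is practical only for small networks.

```python
from itertools import product

# Hypothetical three-gene ring with one inhibitory link (illustration only).
RULES = {
    "A": lambda s: not s["C"],  # C represses A
    "B": lambda s: s["A"],      # A activates B
    "C": lambda s: s["B"],      # B activates C
}
GENES = sorted(RULES)


def step(state):
    """Synchronous update: every gene reads the same previous state."""
    current = dict(zip(GENES, state))
    return tuple(bool(RULES[g](current)) for g in GENES)


def find_attractors():
    """Follow each of the 2^n start states until a state repeats."""
    attractors = set()
    for start in product([False, True], repeat=len(GENES)):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = step(state)
        # The states from the first repeated state onward form the cycle.
        cycle = seen[seen.index(state):]
        # Rotate to a canonical starting state so duplicates collapse.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


if __name__ == "__main__":
    for attractor in sorted(find_attractors(), key=len):
        print(len(attractor), attractor)
```

Each attractor is reported as a tuple of states in (A, B, C) order. A fixed point would appear as a cycle of length one.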
On construction of stochastic genetic networks based on gene expression sequences.

PubMed

Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya

2005-08-01

Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for solving the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
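Illustrative sketch (Python): one way to see why a multivariate Markov model can need on the order of n^2 parameter blocks is to estimate one small transition matrix per ordered pair of genes. The abstract above does not spell out the model, so this pairwise structure is an assumption, and the names `transition_matrix` and `fit_pairwise` and the toy sequences are hypothetical. The weighting of the pairwise matrices into a single predictor is not shown.

```python
def transition_matrix(source, target, n_states=2):
    """Estimate P[a][b] = Pr(target at t+1 is a | source at t is b) by counting."""
    counts = [[0] * n_states for _ in range(n_states)]
    for b, a in zip(source[:-1], target[1:]):
        counts[a][b] += 1
    totals = [sum(counts[a][b] for a in range(n_states)) for b in range(n_states)]
    # Columns with no observations fall back to a uniform distribution.
    return [
        [counts[a][b] / totals[b] if totals[b] else 1.0 / n_states
         for b in range(n_states)]
        for a in range(n_states)
    ]


def fit_pairwise(sequences, n_states=2):
    """One matrix per ordered gene pair (j, k): n * n matrices for n genes."""
    genes = sorted(sequences)
    return {
        (j, k): transition_matrix(sequences[k], sequences[j], n_states)
        for j in genes
        for k in genes
    }


if __name__ == "__main__":
    # Hypothetical binarized expression sequences; g2 copies g1 with a lag of one.
    data = {"g1": [0, 1, 0, 1, 0], "g2": [0, 0, 1, 0, 1]}
    model = fit_pairwise(data)
    print(len(model), "pairwise matrices")
    print(model[("g2", "g1")])
```

For n genes with a fixed number of states, the dictionary holds n^2 matrices of constant size, which matches the quadratic growth stated in the abstract.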
F-MAP: A Bayesian approach to infer the gene regulatory network using external hints

PubMed Central

Shahdoust, Maryam; Mahjub, Hossein; Sadeghi, Mehdi

2017-01-01

The common topological features of the gene regulatory networks of related species suggest that the network of one species can be reconstructed using additional information from the gene expression profiles of related species. We present an algorithm to reconstruct the gene regulatory network, named F-MAP, which applies the knowledge about gene interactions from related species. Our algorithm sets a Bayesian framework to estimate the precision matrix of one species' microarray gene expression dataset to infer the Gaussian graphical model of the network. The conjugate Wishart prior is used, and the information from related species is applied to estimate the hyperparameters of the prior distribution by using factor analysis. Applying the proposed algorithm to six related species of Drosophila shows that the precision of the reconstructed networks is improved considerably compared to the precision of networks constructed by other Bayesian approaches. PMID:28938012
A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data

PubMed Central

Zhang, Wanhong; Zhou, Tong

2015-01-01

Motivation Identifying gene regulatory networks (GRNs) which consist of a large number of interacting units has become a problem of paramount importance in systems biology. In many situations, causal interactions among these units must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods have either high computational complexity or low estimation accuracy. Note that great similarities exist between identification of genes that directly regulate a specific gene and a sparse vector reconstruction, which often relates to the determination of the number, location and magnitude of nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel framework of sparse reconstruction to identify the structure of a GRN, so as to increase accuracy of causal regulation estimations, as well as to reduce their computational complexity. Results In this paper, a sparse reconstruction framework is proposed on the basis of steady-state experiment data to identify GRN structure. Unlike traditional methods, this approach is well suited to large-scale underdetermined problems in inferring a sparse vector. We investigate how to combine the noisy steady-state experiment data and a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and those available results on the DREAM project.
Actual results show that, with a lower computational cost, the proposed method can significantly enhance estimation accuracy and greatly reduce false positive and negative errors. Furthermore, numerical calculations demonstrate that the proposed algorithm may have faster convergence speed and smaller fluctuation than other methods when either estimate error or estimate bias is considered. PMID:26207991
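Illustrative sketch (Python): the abstract above frames regulator identification as recovering a sparse x from y = Φx. A generic way to do this is L1-regularized least squares solved by iterative soft-thresholding (ISTA). This is a stand-in technique, not the authors' algorithm; the function names, the penalty `lam`, and the toy matrix are all hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(Phi, y, x, lam):
    """0.5 * ||Phi x - y||^2 + lam * ||x||_1."""
    resid = [sum(p * xj for p, xj in zip(row, x)) - yi for row, yi in zip(Phi, y)]
    return 0.5 * sum(r * r for r in resid) + lam * sum(abs(xj) for xj in x)


def ista(Phi, y, lam=0.05, iters=500):
    """Minimize the objective above by gradient steps plus soft-thresholding."""
    m, n = len(Phi), len(Phi[0])
    # The squared Frobenius norm bounds the largest eigenvalue of Phi^T Phi,
    # so this step size is conservative but safe.
    step = 1.0 / sum(p * p for row in Phi for p in row)
    x = [0.0] * n
    for _ in range(iters):
        resid = [sum(Phi[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(Phi[i][j] * resid[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x


if __name__ == "__main__":
    # Hypothetical data: 3 measurements, 5 candidate regulators, 2 true regulators.
    Phi = [[1.0, 0.2, 0.0, 0.5, 0.1],
           [0.0, 1.0, 0.3, 0.2, 0.4],
           [0.3, 0.1, 1.0, 0.9, 0.2]]
    x_true = [0.0, 1.5, 0.0, -1.0, 0.0]
    y = [sum(p * v for p, v in zip(row, x_true)) for row in Phi]
    x_hat = ista(Phi, y)
    print([round(v, 3) for v in x_hat])
```

With fewer measurements than unknowns, exact recovery of the true support is not guaranteed. The L1 penalty only biases the solution toward few nonzero entries, and a larger `lam` gives a sparser estimate.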
Determining Regulatory Networks Governing the Differentiation of Embryonic Stem Cells to Pancreatic Lineage

NASA Astrophysics Data System (ADS)

Banerjee, Ipsita

2009-03-01

Knowledge of pathways governing cellular differentiation to a specific phenotype will enable generation of desired cell fates by careful alteration of the governing network through adequate manipulation of the cellular environment. With this aim, we have developed a novel method to reconstruct the underlying regulatory architecture of a differentiating cell population from discrete temporal gene expression data. We utilize an inherent feature of biological networks, that of sparsity, in formulating the network reconstruction problem as a bi-level mixed-integer programming problem. The formulation optimizes the network topology at the upper level and the network connectivity strength at the lower level. The method is first validated on in silico data before being applied to the complex system of embryonic stem (ES) cell differentiation. This formulation enables efficient identification of the underlying network topology, which could accurately predict steps necessary for directing differentiation to subsequent stages. Concurrent experimental verification demonstrated excellent agreement with model prediction.
Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra

NASA Astrophysics Data System (ADS)

Schadt, Eric

2005-03-01

The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.
CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data.

PubMed

Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang

2016-12-23

A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertices and edges stand for genes and their regulatory interactions respectively. Reconstruction of gene regulatory networks, in particular genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive: they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but have difficulty constructing GRNs at genome scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by the conditional mutual information measurement using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, running tests on synthetic datasets demonstrate that CMIP can handle large datasets, especially genome-wide datasets, within an acceptable time period. In addition, successful application on a real genomic dataset confirms the practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to the biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/ .
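Illustrative sketch (Python): conditional mutual information, the interaction measure named in the abstract above, can be estimated from discretized expression values with a plug-in count estimator. This is a generic textbook estimator and not CMIP's implementation, whose estimator may differ. The function name and the toy inputs are hypothetical.

```python
from collections import Counter
from math import log


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats from three discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (a, b, c), k in c_xyz.items():
        # p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ] written with raw counts.
        total += (k / n) * log(k * c_z[c] / (c_xz[(a, c)] * c_yz[(b, c)]))
    return total


if __name__ == "__main__":
    # Hypothetical binarized profiles for two genes and one conditioning gene.
    gene_x = [0, 1, 0, 1, 1, 0, 1, 0]
    gene_y = [0, 1, 0, 1, 0, 0, 1, 1]
    gene_z = [0, 0, 0, 0, 1, 1, 1, 1]
    print(conditional_mutual_information(gene_x, gene_y, gene_z))
```

A value near zero suggests that the dependence between the two genes is explained by the conditioning gene, which is the usual reason for pruning an edge in this family of methods. Plug-in estimates are biased upward on small samples, so a threshold is needed in practice.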
Gene Expression Network Reconstruction by Convex Feature Selection when Incorporating Genetic Perturbations

PubMed Central

Logsdon, Benjamin A.; Mezey, Jason

2010-01-01

Cellular gene expression measurements contain regulatory information that can be used to discover novel network relationships. Here, we present a new algorithm for network reconstruction powered by the adaptive lasso, a theoretically and empirically well-behaved method for selecting the regulatory features of a network. Any algorithms designed for network discovery that make use of directed probabilistic graphs require perturbations, produced by either experiments or naturally occurring genetic variation, to successfully infer unique regulatory relationships from gene expression data. Our approach makes use of appropriately selected cis-expression Quantitative Trait Loci (cis-eQTL), which provide a sufficient set of independent perturbations for maximum network resolution. We compare the performance of our network reconstruction algorithm to four other approaches: the PC-algorithm, QTLnet, the QDG algorithm, and the NEO algorithm, all of which have been used to reconstruct directed networks among phenotypes leveraging QTL. We show that the adaptive lasso can outperform these algorithms for networks of ten genes and ten cis-eQTL, and is competitive with the QDG algorithm for networks with thirty genes and thirty cis-eQTL, with rich topologies and hundreds of samples. Using this novel approach, we identify unique sets of directed relationships in Saccharomyces cerevisiae when analyzing genome-wide gene expression data for an intercross between a wild strain and a lab strain. We recover novel putative network relationships between a tyrosine biosynthesis gene (TYR1), and genes involved in endocytosis (RCY1), the spindle checkpoint (BUB2), sulfonate catabolism (JLP1), and cell-cell communication (PRM7). Our algorithm provides a synthesis of feature selection methods and graphical model theory that has the potential to reveal new directed regulatory relationships from the analysis of population level genetic and gene expression data. PMID:21152011
Structures and Boolean Dynamics in Gene Regulatory Networks

NASA Astrophysics Data System (ADS)

Szedlak, Anthony

This dissertation discusses the topological and dynamical properties of GRNs in cancer, and is divided into four main chapters. First, the basic tools of modern complex network theory are introduced. These traditional tools as well as those developed by myself (set efficiency, interset efficiency, and nested communities) are crucial for understanding the intricate topological properties of GRNs, and later chapters recall these concepts. Second, the biology of gene regulation is discussed, and a method for disease-specific GRN reconstruction developed by our collaboration is presented. This complements the traditional exhaustive experimental approach of building GRNs edge-by-edge by quickly inferring the existence of as of yet undiscovered edges using correlations across sets of gene expression data. This method also provides insight into the distribution of common mutations across GRNs. Third, I demonstrate that the structures present in these reconstructed networks are strongly related to the evolutionary histories of their constituent genes. Investigation of how the forces of evolution shaped the topology of GRNs in multicellular organisms by growing outward from a core of ancient, conserved genes can shed light upon the "reverse evolution" of normal cells into unicellular-like cancer states. Next, I simulate the dynamics of the GRNs of cancer cells using the Hopfield model, an infinite range spin-glass model designed with the ability to encode Boolean data as attractor states. This attractor-driven approach facilitates the integration of gene expression data into predictive mathematical models. Perturbations representing therapeutic interventions are applied to sets of genes, and the resulting deviations from their attractor states are recorded, suggesting new potential drug targets for experimentation.
Finally, I extend the Hopfield model to modular networks, cyclic attractors, and complex attractors, and apply these concepts to simulations of the cell cycle process. Further development of these and other theoretical and computational tools is necessary to analyze the deluge of experimental data produced by modern and future biological high throughput methods. (Abstract shortened by ProQuest.).
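Illustrative sketch (Python): the abstract above describes encoding Boolean expression data as attractor states of a Hopfield model and perturbing sets of genes. The textbook Hebbian construction below shows the idea on a toy scale. It is not the dissertation's code; the patterns, the coding of expression as +1/-1, and the names `train` and `recall` are assumptions made for illustration.

```python
def train(patterns):
    """Hebbian weights W[i][j] = (1/n) * sum over patterns of p[i] * p[j], zero diagonal."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, max_sweeps=20):
    """Asynchronous updates in fixed gene order until no gene changes."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(weights[i][j] * s[j] for j in range(len(s)))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


if __name__ == "__main__":
    # Hypothetical expression signatures: +1 = expressed, -1 = not expressed.
    signature_a = [1, 1, 1, 1, -1, -1, -1, -1]
    signature_b = [1, -1, 1, -1, 1, -1, 1, -1]
    W = train([signature_a, signature_b])
    perturbed = list(signature_a)
    perturbed[2] = -perturbed[2]  # flip one gene, mimicking a targeted intervention
    print(recall(W, perturbed))
```

A small perturbation that relaxes back to the stored signature indicates a robust attractor. A perturbation that drives the state to a different attractor is the kind of event the abstract associates with candidate intervention targets.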
Unraveling gene regulatory networks from time-resolved gene expression data -- a measures comparison study

PubMed Central

2011-01-01

Background Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. Results Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has been shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.
Conclusions Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices. PMID:21771321
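Illustrative sketch (Python): the study above finds rank based measures effective within the relevance network approach. A minimal version of that approach scores every gene pair with Spearman's rank correlation and keeps pairs above a cutoff. The specific measures and scoring schemes evaluated in the study are not reproduced here; the threshold, the gene names, and the function names are hypothetical.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def relevance_edges(expression, threshold=0.8):
    """Undirected edges between gene pairs whose |Spearman rho| meets the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        score = spearman(expression[g1], expression[g2])
        if abs(score) >= threshold:
            edges.append((g1, g2, score))
    return edges


if __name__ == "__main__":
    # Hypothetical short time series for three genes.
    series = {
        "geneA": [0.1, 0.4, 0.9, 1.6, 2.5],
        "geneB": [1.0, 2.1, 2.9, 4.2, 5.0],
        "geneC": [3.0, 1.0, 4.0, 1.5, 2.0],
    }
    for edge in relevance_edges(series):
        print(edge)
```

Because it uses only ranks, the score is unchanged by any monotone transformation of the expression values, which is one reason rank based measures tolerate noise and scaling differences in short series. This sketch yields undirected edges only; recovering direction requires an asymmetric scheme of the kind the abstract discusses.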
Gene regulatory network inference using fused LASSO on multiple data sets

PubMed Central

Omranian, Nooshin; Eloundou-Mbebi, Jeanne M. O.; Mueller-Roeber, Bernd; Nikoloski, Zoran

2016-01-01

Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions. PMID:26864687
Network Reconstruction Using Nonparametric Additive ODE Models

PubMed Central

Henderson, James; Michailidis, George

2014-01-01

Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactococcus lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. 
We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative estimation. PMID:24732037
Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Xiaohan; Ye, Chuyu; Bisaria, Anjali

2011-01-01

Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.

Identification of Functional Differences in Metabolic Networks Using Comparative Genomics and Constraint-Based Models

PubMed Central

Hamilton, Joshua J.; Reed, Jennifer L.

2012-01-01

Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. 
Using CONGA, we can identify differences in reaction and gene content which give rise to different functional predictions. Because CONGA provides a general framework, it can be applied to find functional differences across models and biological systems beyond those presented here. PMID:22666308
Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations.

PubMed

Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin

2014-01-01

The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.
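
To make the linear ODE model mentioned above concrete, here is a minimal sketch that integrates dx/dt = A x with forward Euler, where each component of x stands for the mean expression of one gene module. The function name, the step size, and the choice of integrator are assumptions for illustration; the abstract does not specify how the authors solve or fit the model.

```python
def simulate_linear_ode(A, x0, dt, n_steps):
    """Integrate dx/dt = A x with forward Euler and return the trajectory.

    A  : square matrix as a list of rows; A[i][j] is the (hypothetical)
         influence of module j on the rate of change of module i
    x0 : initial module expression levels
    """
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(n_steps):
        # Rate of change of every module given the current state.
        dx = [sum(a * xj for a, xj in zip(row, x)) for row in A]
        # Forward Euler update.
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory
```

In a setting like the one described, the nonzero entries of A would define the edges of the module-level network, so identifying network structure amounts to deciding which entries are nonzero before refining their values.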
Empirical Bayes conditional independence graphs for regulatory network recovery.

PubMed

Mahdi, Rami; Madduri, Abishek S; Wang, Guoqing; Strulovici-Barel, Yael; Salit, Jacqueline; Hackett, Neil R; Crystal, Ronald G; Mezey, Jason G

2012-08-01

Computational inference methods that make use of graphical models to extract regulatory networks from gene expression data can have difficulty reconstructing dense regions of a network, a consequence of both computational complexity and unreliable parameter estimation when sample size is small. As a result, identification of hub genes is especially difficult for these methods. We present a new algorithm, Empirical Light Mutual Min (ELMM), for large network reconstruction that has properties well suited for recovery of graphs with high-degree nodes. ELMM reconstructs the undirected graph of a regulatory network using empirical Bayes conditional independence testing with a heuristic relaxation of independence constraints in dense areas of the graph. This relaxation allows only one gene of a pair with a putative relation to be aware of the network connection, an approach that is aimed at easing multiple testing problems associated with recovering densely connected structures. Using in silico data, we show that ELMM has better performance than commonly used network inference algorithms including GeneNet, ARACNE, FOCI, GENIE3 and GLASSO. We also apply ELMM to reconstruct a network among 5492 genes expressed in human lung airway epithelium of healthy non-smokers, healthy smokers and individuals with chronic obstructive pulmonary disease assayed using microarrays. The analysis identifies dense sub-networks that are consistent with known regulatory relationships in the lung airway and also suggests novel hub regulatory relationships among a number of genes that play roles in oxidative stress and secretion. Software for running ELMM is made available at http://mezeylab.cb.bscb.cornell.edu/Software.aspx. Contact: ramimahdi@yahoo.com or jgm45@cornell.edu. Supplementary data are available at Bioinformatics online.
Dynamic network reconstruction from gene expression data applied to immune response during bacterial infection.

PubMed

Guthke, Reinhard; Möller, Ulrich; Hoffmann, Martin; Thies, Frank; Töpfer, Susanne

2005-04-15

The immune response to bacterial infection represents a complex network of dynamic gene and protein interactions. We present an optimized reverse engineering strategy aimed at a reconstruction of this kind of interaction network. The proposed approach is based on both microarray data and available biological knowledge. The main kinetics of the immune response were identified by fuzzy clustering of gene expression profiles (time series). The number of clusters was optimized using various evaluation criteria. For each cluster a representative gene with a high fuzzy-membership was chosen in accordance with available physiological knowledge. Then hypothetical network structures were identified by seeking systems of ordinary differential equations, whose simulated kinetics could fit the gene expression profiles of the cluster-representative genes. For the construction of hypothetical network structures, singular value decomposition (SVD) based methods and a newly introduced heuristic Network Generation Method were compared. It turned out that the proposed novel method could find sparser networks and gave better fits to the experimental data. Contact: Reinhard.Guthke@hki-jena.de.
A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures.

PubMed

Kentzoglanakis, Kyriakos; Poole, Matthew

2012-01-01

In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
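
The abstract above uses PSO to search the continuous space of RNN parameters. The following is a minimal, generic particle swarm optimizer, not the authors' implementation: the inertia and acceleration constants, the search bounds, and the function name are assumptions, and in the setting described the objective f would be the mismatch between simulated and observed expression time series.

```python
import random

def pso_minimize(f, dim, n_particles=20, n_iter=100, seed=0,
                 w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0):
    """Minimize f over R^dim with a basic global-best particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal best positions
    best_val = [f(p) for p in pos]          # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

A call such as `pso_minimize(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, dim=2)` searches a toy two-parameter surface; the discrete architecture search handled by ACO in the abstract is a separate step and is not sketched here.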
Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

PubMed

Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

2013-09-02

In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. 
The obtained genome-wide collection of reference RNA motif regulons is available in the RegPrecise database (http://regprecise.lbl.gov/).
The Reconstruction and Analysis of Gene Regulatory Networks.

PubMed

Zheng, Guangyong; Huang, Tao

2018-01-01

In the post-genomic era, an important task is to explore the function of individual biological molecules (i.e., gene, noncoding RNA, protein, metabolite) and their organization in living cells. To this end, gene regulatory networks (GRNs) are constructed to show the relationships between biological molecules, in which the vertices of the network denote biological molecules and the edges of the network represent connections between nodes (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). By interpreting GRNs, biologists can understand not only the function of biological molecules but also the organization of the components of living cells, since a gene regulatory network is a comprehensive physiological map of living cells and reflects the influence of genetic and epigenetic factors (Strogatz, Nature 410:268-276, 2001; Bray, Science 301:1864-1865, 2003). In this paper, we review the inference methods of GRN reconstruction and approaches for analyzing network structure. As the network method is a powerful tool for studying complex diseases and biological processes, its applications in pathway analysis and disease gene identification are also introduced.
ARACNe-AP: Gene Network Reverse Engineering through Adaptive Partitioning inference of Mutual Information. | Office of Cancer Genomics

Cancer.gov

The accurate reconstruction of gene regulatory networks from large scale molecular profile datasets represents one of the grand challenges of Systems Biology. The Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) represents one of the most effective tools to accomplish this goal. However, the initial Fixed Bandwidth (FB) implementation is both inefficient and unable to deal with sample sets providing largely uneven coverage of the probability density space.
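
ARACNe scores gene pairs by mutual information (MI) and then removes likely indirect edges using the data processing inequality (DPI). The sketch below shows both ideas on already-discretized expression values. It uses a simple plug-in MI estimate, which is neither the fixed-bandwidth nor the adaptive-partitioning estimator mentioned above, and the function names and the `tol` parameter are assumptions for this example.

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) p(y))).
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def dpi_prune(mi, tol=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    mi  : dict mapping frozenset({gene_a, gene_b}) -> MI value
    tol : hypothetical tolerance; an edge is removed only if it is weaker
          than both other edges of the triplet by this relative margin
    """
    genes = sorted({g for edge in mi for g in edge})
    removed = set()
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            for k in range(j + 1, len(genes)):
                a, b, c = genes[i], genes[j], genes[k]
                tri = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
                if all(e in mi for e in tri):
                    weakest = min(tri, key=lambda e: mi[e])
                    others = [e for e in tri if e != weakest]
                    if mi[weakest] < min(mi[e] for e in others) * (1 - tol):
                        removed.add(weakest)
    # Decisions are made against the original MI values, then applied at once.
    return {e: v for e, v in mi.items() if e not in removed}
```

The triple loop is cubic in the number of genes, which is one reason efficient implementations matter at genome scale.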
Comparative analysis of gene regulatory networks: from network reconstruction to evolution.

PubMed

Thompson, Dawn; Regev, Aviv; Roy, Sushmita

2015-01-01

Regulation of gene expression is central to many biological processes. Although reconstruction of regulatory circuits from genomic data alone is therefore desirable, this remains a major computational challenge. Comparative approaches that examine the conservation and divergence of circuits and their components across strains and species can help reconstruct circuits as well as provide insights into the evolution of gene regulatory processes and their adaptive contribution. In recent years, advances in genomic and computational tools have led to a wealth of methods for such analysis at the sequence, expression, pathway, module, and entire network level. Here, we review computational methods developed to study transcriptional regulatory networks using comparative genomics, from sequence to functional data. We highlight how these methods use evolutionary conservation and divergence to reliably detect regulatory components as well as estimate the extent and rate of divergence. Finally, we discuss the promise and open challenges in linking regulatory divergence to phenotypic divergence and adaptation.
Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks

PubMed Central

Yamanaka, Ryota; Kitano, Hiroaki

2013-01-01

Elucidating gene regulatory networks (GRNs) from large scale experimental data remains a central challenge in systems biology. Recently, numerous techniques, particularly consensus driven approaches combining different algorithms, have become a potentially promising strategy to infer accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that can integrate multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) a simple strategy to combine many algorithms does not always lead to performance improvement compared to the cost of consensus and (ii) TopkNet integrating only high-performance algorithms provides significant performance improvement compared to the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can be useful to determine potential optimal algorithms for reconstruction of unknown regulatory networks, i.e., if expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, optimal algorithms determined for the known regulatory network can be repurposed to infer the unknown regulatory network. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if similarity between the two expression datasets is high, TopkNet integrating algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study provides a powerful strategy towards harnessing the wisdom of the crowds in reconstruction of unknown regulatory networks. PMID:24278007
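
A simple way to picture consensus inference is rank aggregation: each algorithm ranks all candidate edges, and the consensus orders edges by their combined rank. The sketch below averages ranks across algorithms. This is a generic baseline shown only for illustration and should not be read as TopkNet's actual aggregation rule, which the abstract does not spell out; the function name and input layout are assumptions.

```python
def consensus_ranking(score_tables):
    """Order edges by their average rank across several inference algorithms.

    score_tables : list of dicts, one per algorithm, mapping an edge
                   (regulator, target) to a confidence score where higher
                   means more confident; all dicts are assumed to cover
                   the same set of edges.
    """
    rank_sums = {}
    for scores in score_tables:
        # Rank 1 is the most confident edge for this algorithm.
        ordered = sorted(scores, key=lambda e: -scores[e])
        for rank, edge in enumerate(ordered, start=1):
            rank_sums[edge] = rank_sums.get(edge, 0) + rank
    n = len(score_tables)
    # Lower average rank means stronger consensus; ties break by edge name.
    return sorted(rank_sums, key=lambda e: (rank_sums[e] / n, e))
```

Restricting `score_tables` to the algorithms known to perform well on a similar data set mirrors finding (ii) above, that combining only high-performance algorithms can beat combining all of them.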
Reconstructing Genetic Regulatory Networks Using Two-Step Algorithms with the Differential Equation Models of Neural Networks.

PubMed

Chen, Chi-Kan

2017-07-26

The identification of genetic regulatory networks (GRNs) provides insights into complex cellular processes. A class of recurrent neural networks (RNNs) captures the dynamics of GRNs. Algorithms combining the RNN and machine learning schemes were proposed to reconstruct small-scale GRNs using gene expression time series. We present new GRN reconstruction methods with neural networks. The RNN is extended to a class of recurrent multilayer perceptrons (RMLPs) with latent nodes. Our methods contain two steps: the edge rank assignment step and the network construction step. The former assigns ranks to all possible edges by a recursive procedure based on the estimated weights of wires of RNN/RMLP (RE_RNN/RE_RMLP), and the latter constructs a network consisting of top-ranked edges under which the optimized RNN simulates the gene expression time series. Particle swarm optimization (PSO) is applied to optimize the parameters of RNNs and RMLPs in a two-step algorithm. The proposed RE_RNN-RNN and RE_RMLP-RNN algorithms are tested on synthetic and experimental gene expression time series of small GRNs of about 10 genes. The experimental time series are from the studies of yeast cell cycle regulated genes and E. coli DNA repair genes. The unstable estimation of RNN using experimental time series having limited data points can lead to fairly arbitrary predicted GRNs. Our methods incorporate RNN and RMLP into a two-step structure learning procedure. Results show that the RE_RMLP using the RMLP with a suitable number of latent nodes to reduce the parameter dimension often results in more accurate edge ranks than the RE_RNN using the regularized RNN on short simulated time series. Combining by a weighted majority voting rule the networks derived by the RE_RMLP-RNN using different numbers of latent nodes in step one to infer the GRN, the method performs consistently and outperforms published algorithms for GRN reconstruction on most benchmark time series. 
The framework of two-step algorithms can potentially incorporate different nonlinear differential equation models to reconstruct the GRN.
Differential reconstructed gene interaction networks for deriving toxicity threshold in chemical risk assessment.

PubMed

Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping

2013-01-01

Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG pathways. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach. 
Findings from this proof-of-concept study suggest that our approach has a great potential in providing a novel and sensitive tool for threshold setting in chemical risk assessment. In future work, we plan to analyze more time-series datasets with a full spectrum of concentrations and sufficient replications per treatment. The pathway alteration-derived thresholds will also be compared with those derived from apical endpoints such as cell growth rate.
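
The differential edge inference step described above can be pictured as a comparison of edge sets between a control network and a treatment network. The sketch below is a simplified illustration with hypothetical function names; it treats edges as present or absent and does not reproduce the state space model or any statistical testing the authors may apply.

```python
def differential_edges(control_edges, treated_edges):
    """Return edges gained and lost in the treated network versus control.

    Each argument is an iterable of (source_gene, target_gene) tuples.
    """
    control, treated = set(control_edges), set(treated_edges)
    return {"gained": treated - control, "lost": control - treated}

def differential_degree(diff):
    """Count, per gene, how many differential edges it participates in."""
    counts = {}
    for edges in diff.values():
        for src, dst in edges:
            counts[src] = counts.get(src, 0) + 1
            counts[dst] = counts.get(dst, 0) + 1
    return counts
```

Genes with a high differential degree would be the natural candidates to map onto pathways, and repeating the comparison at each concentration gives a concentration-dependent view of pathway perturbation.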
Reconstruction of an integrated genome-scale co-expression network reveals key modules involved in lung adenocarcinoma.

PubMed

Bidkhori, Gholamreza; Narimani, Zahra; Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

2013-01-01

The goal of this study was to reconstruct a "genome-scale co-expression network" and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named the "genome-scale co-expression network". As the next step, 23 key modules were disclosed through clustering. By analyzing the modules, a number of genes have been identified in this study for the first time as implicated in lung adenocarcinoma. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena, namely cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9, which play a crucial role in cell cycle progression. In addition to the mentioned genes, there are some other genes (e.g., DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, PSMD2, DOK2 and FUS) in the modules.
Reverse engineering and analysis of large genome-scale gene networks

PubMed Central

Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

2013-01-01

Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute-intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
Cellular neural networks, the Navier-Stokes equation, and microarray image reconstruction.

PubMed

Zineddin, Bachar; Wang, Zidong; Liu, Xiaohui

2011-11-01

Although the last decade has witnessed a great deal of improvements achieved for the microarray technology, many major developments in all the main stages of this technology, including image processing, are still needed. Some hardware implementations of microarray image processing have been proposed in the literature and proved to be promising alternatives to the currently available software systems. However, the main drawback of those proposed approaches is the unsuitable addressing of the quantification of the gene spot in a realistic way without any assumption about the image surface. Our aim in this paper is to present a new image-reconstruction algorithm using the cellular neural network that solves the Navier-Stokes equation. This algorithm offers a robust method for estimating the background signal within the gene-spot region. The MATCNN toolbox for Matlab is used to test the proposed method. Quantitative comparisons are carried out, i.e., in terms of objective criteria, between our approach and some other available methods. It is shown that the proposed algorithm gives highly accurate and realistic measurements in a fully automated manner within a remarkably efficient time.
Reconstructing directed gene regulatory network by only gene expression data.

PubMed

Zhang, Lu; Feng, Xi Kang; Ng, Yen Kaow; Li, Shuai Cheng

2016-08-18

Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. The inference of such networks is often accomplished through the use of gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some methods also eliminate transitive interactions. The regulatory (or edge) direction is undetermined if the target gene is also a transcription factor. Some methods predict the regulatory directions in the gene regulatory networks by locating the eQTL single nucleotide polymorphism, or by observing the gene expression changes when knocking out/down the candidate transcription factors; regrettably, these additional data are usually unavailable, especially for samples derived from human tissues. In this study, we propose the Context Based Dependency Network (CBDN), a method that is able to infer gene regulatory networks with the regulatory directions from gene expression data only. To determine the regulatory direction, CBDN computes the influence of source to target by evaluating the magnitude changes of expression dependencies between the target gene and the others with conditioning on the source gene. CBDN extends the data processing inequality by involving the dependency direction to distinguish between direct and transitive relationships between genes. We also define two types of important regulators which can influence a majority of the genes in the network directly or indirectly. CBDN can detect both of these two types of important regulators by averaging the influence functions of candidate regulator to the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms the state-of-the-art approaches for inferring gene regulatory networks. CBDN identifies the important regulators in the predicted network: (1) TYROBP influences a batch of genes that are related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the 'mesenchymal' gene expression signature genes for brain tumors. By merely leveraging gene expression data, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful in the identification of important regulators for complex diseases.
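
For readers unfamiliar with the data processing inequality mentioned above, the following is a minimal sketch of its classic, undirected use for removing transitive edges: in every fully connected gene triplet, the weakest dependency is treated as indirect and dropped. It does not reproduce CBDN's directed extension. The function name, the `tolerance` parameter and the toy scores are hypothetical.

```python
from itertools import combinations


def dpi_prune(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    `mi` maps frozenset({gene_a, gene_b}) to a dependency score.
    """
    genes = sorted({g for pair in mi for g in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(e in mi for e in edges):
            continue  # the inequality is only applied to closed triangles
        weakest = min(edges, key=lambda e: mi[e])
        others = [mi[e] for e in edges if e != weakest]
        # Remove only when the weakest edge is clearly below both others.
        if mi[weakest] < min(others) * (1.0 - tolerance):
            to_remove.add(weakest)
    return {e: s for e, s in mi.items() if e not in to_remove}


# Toy chain TF -> A -> B: the TF-B dependency is transitive and weaker.
scores = {
    frozenset(("TF", "A")): 0.9,
    frozenset(("A", "B")): 0.8,
    frozenset(("TF", "B")): 0.3,
}
pruned = dpi_prune(scores)
print(sorted(tuple(sorted(e)) for e in pruned))
```

Because the scores are symmetric, this pruning says nothing about which gene regulates which; resolving that direction from expression data alone is the part CBDN adds.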
JRmGRN: Joint reconstruction of multiple gene regulatory networks with common hub genes using data from multiple tissues or conditions.

PubMed

Deng, Wenping; Zhang, Kui; Liu, Sanzhen; Zhao, Patrick; Xu, Shizhong; Wei, Hairong

2018-04-30

Joint reconstruction of multiple gene regulatory networks (GRNs) using gene expression data from multiple tissues/conditions is very important for understanding common and tissue/condition-specific regulation. However, there are currently no computational models and methods available for directly constructing such multiple GRNs that not only share some common hub genes but also possess tissue/condition-specific regulatory edges. In this paper, we proposed a new graphic Gaussian model for joint reconstruction of multiple gene regulatory networks (JRmGRN), which highlighted hub genes, using gene expression data from several tissues/conditions. Under the framework of the Gaussian graphical model, the JRmGRN method constructs the GRNs by maximizing a penalized log likelihood function. We formulated it as a convex optimization problem, and then solved it with an alternating direction method of multipliers (ADMM) algorithm. The performance of JRmGRN was first evaluated with synthetic data and the results showed that JRmGRN outperformed several other methods for reconstruction of GRNs. We also applied our method to real Arabidopsis thaliana RNA-seq data from two light regime conditions in comparison with other methods, and both common hub genes and some condition-specific hub genes were identified with higher accuracy and precision. JRmGRN is available as an R program from: https://github.com/wenpingd. Contact: hairong@mtu.edu. Proof of theorem, derivation of algorithm and supplementary data are available at Bioinformatics online.
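
As a point of orientation, a penalized log likelihood for several Gaussian graphical models generally has the shape sketched below. This is the generic joint form, up to constants and scaling, and not the specific JRmGRN objective: the symbols and the penalty P are assumptions, and the hub-specific structure of the actual penalty is given in the paper, not here.

```latex
\max_{\Theta^{(1)},\dots,\Theta^{(K)} \succ 0}\;
\sum_{k=1}^{K} n_k \left[ \log\det \Theta^{(k)}
  - \operatorname{tr}\!\left( S^{(k)} \Theta^{(k)} \right) \right]
\;-\; P\!\left( \Theta^{(1)},\dots,\Theta^{(K)} \right)
```

Here K is the number of tissues or conditions, n_k and S^{(k)} are the sample size and sample covariance matrix of condition k, and \Theta^{(k)} is the precision (inverse covariance) matrix whose non-zero off-diagonal entries are read as edges. Choosing P = \lambda \sum_k \lVert \Theta^{(k)} \rVert_1 recovers independent graphical lasso fits; joint methods use penalties that couple the K matrices.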
Parenclitic networks: uncovering new functions in biological data

PubMed Central

Zanin, Massimiliano; Alcazar, Joaquín Medina; Carbajosa, Jesus Vicente; Paez, Marcela Gomez; Papo, David; Sousa, Pedro; Menasalvas, Ernestina; Boccaletti, Stefano

2014-01-01

We introduce a novel method to represent time independent, scalar data sets as complex networks. We apply our method to investigate gene expression in the response to osmotic stress of Arabidopsis thaliana. In the proposed network representation, the most important genes for the plant response turn out to be the nodes with highest centrality in appropriately reconstructed networks. We also performed a target experiment, in which the predicted genes were artificially induced one by one, and the growth of the corresponding phenotypes compared to that of the wild-type. The joint application of the network reconstruction method and of the in vivo experiments allowed identifying 15 previously unknown key genes, and provided models of their mutual relationships. This novel representation extends the use of graph theory to data sets hitherto considered outside of the realm of its application, vastly simplifying the characterization of their underlying structure. PMID:24870931
Reconstruction of the metabolic network of Pseudomonas aeruginosa to interrogate virulence factor synthesis

NASA Astrophysics Data System (ADS)

Bartell, Jennifer A.; Blazier, Anna S.; Yen, Phillip; Thøgersen, Juliane C.; Jelsbak, Lars; Goldberg, Joanna B.; Papin, Jason A.

2017-03-01

Virulence-linked pathways in opportunistic pathogens are putative therapeutic targets that may be associated with less potential for resistance than targets in growth-essential pathways. However, efficacy of virulence-linked targets may be affected by the contribution of virulence-related genes to metabolism. We evaluate the complex interrelationships between growth and virulence-linked pathways using a genome-scale metabolic network reconstruction of Pseudomonas aeruginosa strain PA14 and an updated, expanded reconstruction of P. aeruginosa strain PAO1. The PA14 reconstruction accounts for the activity of 112 virulence-linked genes and virulence factor synthesis pathways that produce 17 unique compounds. We integrate eight published genome-scale mutant screens to validate gene essentiality predictions in rich media, contextualize intra-screen discrepancies and evaluate virulence-linked gene distribution across essentiality datasets. Computational screening further elucidates interconnectivity between inhibition of virulence factor synthesis and growth. Successful validation of selected gene perturbations using PA14 transposon mutants demonstrates the utility of model-driven screening of therapeutic targets.

Construct and Compare Gene Coexpression Networks with DAPfinder and DAPview.

PubMed

Skinner, Jeff; Kotliarov, Yuri; Varma, Sudhir; Mine, Karina L; Yambartsev, Anatoly; Simon, Richard; Huyen, Yentram; Morgun, Andrey

2011-07-14

DAPfinder and DAPview are novel BRB-ArrayTools plug-ins to construct gene coexpression networks and identify significant differences in pairwise gene-gene coexpression between two phenotypes. Each significant difference in gene-gene association represents a Differentially Associated Pair (DAP). Our tools include several choices of filtering methods, gene-gene association metrics, statistical testing methods and multiple comparison adjustments. Network results are easily displayed in Cytoscape. Analyses of glioma experiments and microarray simulations demonstrate the utility of these tools. DAPfinder is a new user-friendly tool for reconstruction and comparison of biological networks.
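
One standard way to test whether a gene pair is coexpressed differently in two phenotypes is to compare the two Pearson correlations after a Fisher z-transformation. The sketch below illustrates that textbook test only; whether it matches any of the testing options offered by DAPfinder is an assumption, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_correlation_z(r1, n1, r2, n2):
    """Fisher z statistic and two-sided p-value for the difference r1 - r2."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # variance-stabilizing transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided standard normal tail
    return z, p


# A pair correlated at 0.8 in 30 samples of one phenotype and at -0.2
# in 30 samples of the other.
z, p = differential_correlation_z(0.8, 30, -0.2, 30)
print(round(z, 2), p)
```

In a genome-wide screen the resulting p-values still need a multiple comparison adjustment, since the number of gene pairs grows quadratically with the number of genes.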
Reverse-engineering of gene networks for regulating early blood development from single-cell measurements.

PubMed

Wei, Jiangyong; Hu, Xiaohua; Zou, Xiufen; Tian, Tianhai

2017-12-28

Recent advances in omics technologies have raised great opportunities to study large-scale regulatory networks inside the cell. In addition, single-cell experiments have measured the gene and protein activities in a large number of cells under the same experimental conditions. However, a significant challenge in computational biology and bioinformatics is how to derive quantitative information from the single-cell observations and how to develop sophisticated mathematical models to describe the dynamic properties of regulatory networks using the derived quantitative information. This work designs an integrated approach to reverse-engineer gene networks for regulating early blood development based on single-cell experimental observations. The wanderlust algorithm is initially used to develop the pseudo-trajectory for the activities of a number of genes. Since the gene expression data in the developed pseudo-trajectory show large fluctuations, we then use Gaussian process regression methods to smooth the gene expression data in order to obtain pseudo-trajectories with much smaller fluctuations. The proposed integrated framework consists of both bioinformatics algorithms to reconstruct the regulatory network and mathematical models using differential equations to describe the dynamics of gene expression. The developed approach is applied to study the network regulating early blood cell development. A graphic model is constructed for a regulatory network with forty genes and a dynamic model using differential equations is developed for a network of nine genes. Numerical results suggest that the proposed model is able to match experimental data very well. We also examine the networks with more regulatory relations and numerical results show that more regulations may exist. We test the possibility of auto-regulation but numerical simulations do not support the positive auto-regulation. In addition, robustness is used as an important additional criterion to select candidate networks. The research results in this work show that the developed approach is an efficient and effective method to reverse-engineer gene networks using single-cell experimental observations.
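
To make the phrase "mathematical models using differential equations" concrete, here is a minimal sketch of a two-gene cascade in which gene A activates gene B through a Hill function, integrated with the forward Euler method. The network, the rate constants and the function name are all hypothetical; the nine-gene model of the paper is not reproduced.

```python
def simulate(steps=5000, dt=0.01):
    """Forward-Euler integration of a two-gene activation cascade."""
    a, b = 0.0, 0.0
    prod_a, prod_b = 1.0, 2.0   # maximal production rates (hypothetical)
    deg_a, deg_b = 0.5, 1.0     # first-order degradation rates (hypothetical)
    half_max, hill = 1.0, 2     # Hill threshold and cooperativity (hypothetical)
    trajectory = []
    for _ in range(steps):
        # Fraction of maximal B production induced by the current level of A.
        activation = a ** hill / (half_max ** hill + a ** hill)
        da = prod_a - deg_a * a
        db = prod_b * activation - deg_b * b
        a, b = a + dt * da, b + dt * db
        trajectory.append((a, b))
    return trajectory


final_a, final_b = simulate()[-1]
print(round(final_a, 3), round(final_b, 3))  # settles near a = 2.0, b = 1.6
```

Fitting such a model means adjusting the rate constants, and possibly the set of regulatory terms, until the simulated trajectories match the smoothed pseudo-trajectories.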
Informed walks: whispering hints to gene hunters inside networks' jungle.

PubMed

Bourdakou, Marilena M; Spyrou, George M

2017-10-11

Systemic approaches offer a different point of view on the analysis of several types of molecular associations as well as on the identification of specific gene communities in several cancer types. However, due to lack of sufficient data needed to construct networks based on experimental evidence, statistical gene co-expression networks are widely used instead. Many efforts have been made to exploit the information hidden in these networks. However, these approaches still need to capitalize comprehensively on the prior knowledge encoded in molecular pathway associations and improve their efficiency regarding the discovery of both exclusive subnetworks as candidate biomarkers and conserved subnetworks that may uncover common origins of several cancer types. In this study we present the development of the Informed Walks model based on random walks that incorporate information from molecular pathways to mine candidate genes and gene-gene links. The proposed model has been applied to TCGA (The Cancer Genome Atlas) datasets from seven different cancer types, exploring the reconstructed co-expression networks of the whole set of genes and leading to highlighted sub-networks for each cancer type. Subsequently, we elucidated the impact of each subnetwork on the indication of underlying exclusive and common molecular mechanisms as well as on the short-listing of drugs that have the potential to suppress the corresponding cancer type through a drug-repurposing pipeline. We have developed a method of gene subnetwork highlighting based on prior knowledge, capable of giving fruitful insights regarding the underlying molecular mechanisms and valuable input to drug-repurposing pipelines for a variety of cancer types.
Integrated Module and Gene-Specific Regulatory Inference Implicates Upstream Signaling Networks

PubMed Central

Roy, Sushmita; Lagree, Stephen; Hou, Zhonggang; Thomson, James A.; Stewart, Ron; Gasch, Audrey P.

2013-01-01

Regulatory networks that control gene expression are important in diverse biological contexts including stress response and development. Each gene's regulatory program is determined by module-level regulation (e.g. co-regulation via the same signaling system), as well as gene-specific determinants that can fine-tune expression. We present a novel approach, Modular regulatory network learning with per gene information (MERLIN), that infers regulatory programs for individual genes while probabilistically constraining these programs to reveal module-level organization of regulatory networks. Using edge-, regulator- and module-based comparisons of simulated networks of known ground truth, we find MERLIN reconstructs regulatory programs of individual genes as well as or better than existing approaches of network reconstruction, while additionally identifying modular organization of the regulatory networks. We use MERLIN to dissect global transcriptional behavior in two biological contexts: yeast stress response and human embryonic stem cell differentiation. Regulatory modules inferred by MERLIN capture co-regulatory relationships between signaling proteins and downstream transcription factors, thereby revealing the upstream signaling systems controlling transcriptional responses. The inferred networks are enriched for regulators with genetic or physical interactions, supporting the inference, and identify modules of functionally related genes bound by the same transcriptional regulators. Our method combines the strengths of per-gene and per-module methods to reveal new insights into transcriptional regulation in stress and development. PMID:24146602
MINER: exploratory analysis of gene interaction networks by machine learning from expression data.

PubMed

Kadupitige, Sidath Randeni; Leung, Kin Chun; Sellmeier, Julia; Sivieng, Jane; Catchpoole, Daniel R; Bain, Michael E; Gaëta, Bruno A

2009-12-03

The reconstruction of gene regulatory networks from high-throughput "omics" data has become a major goal in the modelling of living systems. Numerous approaches have been proposed, most of which attempt only "one-shot" reconstruction of the whole network with no intervention from the user, or offer only simple correlation analysis to infer gene dependencies. We have developed MINER (Microarray Interactive Network Exploration and Representation), an application that combines multivariate non-linear tree learning of individual gene regulatory dependencies, visualisation of these dependencies as both trees and networks, and representation of known biological relationships based on common Gene Ontology annotations. MINER allows biologists to explore the dependencies influencing the expression of individual genes in a gene expression data set in the form of decision, model or regression trees, using their domain knowledge to guide the exploration and formulate hypotheses. Multiple trees can then be summarised in the form of a gene network diagram. MINER is being adopted by several of our collaborators and has already led to the discovery of a new significant regulatory relationship with subsequent experimental validation. Unlike most gene regulatory network inference methods, MINER allows the user to start from genes of interest and build the network gene-by-gene, incorporating domain expertise in the process. This approach has been used successfully with RNA microarray data but is applicable to other quantitative data produced by high-throughput technologies such as proteomics and "next generation" DNA sequencing.
Biblio-MetReS: A bibliometric network reconstruction application and server

PubMed Central

2011-01-01

Background Reconstruction of genes and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user friendly tool that simultaneously analyzes the latest set of scientific documents available on line and reconstructs the set of genes referenced in those documents is available. Results This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other user-friendly applications (iHOP, STRING) that are widely used. Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user friendly tools. Furthermore, analysis of full text documents provides more complete reconstructions than those that result from using only the abstract of the document. Conclusions Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio-MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy to use environment for researchers to reconstruct their networks of interest from an always up to date set of scientific documents. PMID:21975133
Reverse engineering highlights potential principles of large gene regulatory network design and learning.

PubMed

Carré, Clément; Mas, André; Krouk, Gabriel

2017-01-01

Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factor-to-target relationships, defining important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
Reconstructing regulatory networks from the dynamic plasticity of gene expression by mutual information

PubMed Central

Wang, Jianxin; Chen, Bo; Wang, Yaqun; Wang, Ningtao; Garbey, Marc; Tran-Son-Tay, Roger; Berceli, Scott A.; Wu, Rongling

2013-01-01

The capacity of an organism to respond to its environment is facilitated by the environmentally induced alteration of gene and protein expression, i.e. expression plasticity. The reconstruction of gene regulatory networks based on expression plasticity can provide new insights not only into the causality of transcriptional and cellular processes but also into the complex regulatory mechanisms that underlie biological function and adaptation. We describe an approach for network inference by integrating expression plasticity into Shannon’s mutual information. Beyond Pearson correlation, mutual information can capture non-linear dependencies and topology sparseness. The approach measures the network of dependencies of genes expressed in different environments, allowing the environment-induced plasticity of gene dependencies to be tested in unprecedented detail. The approach is also able to characterize the extent to which the same genes trigger different amounts of expression in response to environmental changes. We demonstrated the usefulness of this approach through analysing gene expression data from a rabbit vein graft study that includes two distinct blood flow environments. The proposed approach provides a powerful tool for the modelling and analysis of dynamic regulatory networks using gene expression data from distinct environments. PMID:23470995
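
The claim that mutual information captures non-linear dependencies that Pearson correlation misses can be checked on a toy case. In the sketch below, y is an exact quadratic function of x, so the correlation is zero while the mutual information is clearly positive. The data and function names are illustrative assumptions and unrelated to the rabbit vein graft dataset.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def discrete_mi(x, y):
    """Shannon mutual information (in nats) of two discrete samples."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# y depends on x deterministically, but symmetrically around zero.
x = [-2, -1, 0, 1, 2] * 4
y = [v * v for v in x]
print(pearson(x, y), discrete_mi(x, y))
```

Because y is fully determined by x, the mutual information here equals the entropy of y, whereas the positive and negative halves of the parabola cancel exactly in the correlation.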
Rhodobase, a meta-analytical tool for reconstructing gene regulatory networks in a model photosynthetic bacterium.

PubMed

Moskvin, Oleg V; Bolotin, Dmitry; Wang, Andrew; Ivanov, Pavel S; Gomelsky, Mark

2011-02-01

We present Rhodobase, a web-based meta-analytical tool for analysis of transcriptional regulation in a model anoxygenic photosynthetic bacterium, Rhodobacter sphaeroides. The gene association meta-analysis is based on the pooled data from 100 R. sphaeroides whole-genome DNA microarrays. Gene-centric regulatory networks were visualized using the StarNet approach (Jupiter, D.C., VanBuren, V., 2008. A visual data mining tool that facilitates reconstruction of transcription regulatory networks. PLoS ONE 3, e1717) with several modifications. We developed a means to identify and visualize operons and superoperons. We designed a framework for the cross-genome search for transcription factor binding sites that takes into account high GC-content and oligonucleotide usage profile characteristic of the R. sphaeroides genome. To facilitate reconstruction of directional relationships between co-regulated genes, we screened upstream sequences (-400 to +20bp from start codons) of all genes for putative binding sites of bacterial transcription factors using a self-optimizing search method developed here. To test performance of the meta-analysis tools and transcription factor site predictions, we reconstructed selected nodes of the R. sphaeroides transcription factor-centric regulatory matrix. The test revealed regulatory relationships that correlate well with the experimentally derived data. The database of transcriptional profile correlations, the network visualization engine and the optimized search engine for transcription factor binding sites analysis are available at http://rhodobase.org.
Semi-Supervised Multi-View Learning for Gene Network Reconstruction

PubMed Central

Ceci, Michelangelo; Pio, Gianvito; Kuzmanovski, Vladimir; Džeroski, Sašo

2015-01-01

The task of gene regulatory network reconstruction from high-throughput data has received increasing attention in recent years. As a consequence, many inference methods for solving this task have been proposed in the literature. It has been recently observed, however, that no single inference method performs optimally across all datasets. It has also been shown that the integration of predictions from multiple inference methods is more robust and shows high performance across diverse datasets. Inspired by this research, in this paper, we propose a machine learning solution which learns to combine predictions from multiple inference methods. While this approach adds complexity to the inference process, we expect it would also carry substantial benefits. These would come from the automatic adaptation to patterns on the outputs of individual inference methods, so that it is possible to identify regulatory interactions more reliably when these patterns occur. This article demonstrates the benefits (in terms of accuracy of the reconstructed networks) of the proposed method, which exploits an iterative, semi-supervised ensemble-based algorithm. The algorithm learns to combine the interactions predicted by many different inference methods in the multi-view learning setting. The empirical evaluation of the proposed algorithm on a prokaryotic model organism (E. coli) and on a eukaryotic model organism (S. cerevisiae) clearly shows improved performance over the state-of-the-art methods. The results indicate that gene regulatory network reconstruction for the real datasets is more difficult for S. cerevisiae than for E. coli. The software, all the datasets used in the experiments and all the results are available for download at the following link: http://figshare.com/articles/Semi_supervised_Multi_View_Learning_for_Gene_Network_Reconstruction/1604827. PMID:26641091
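
The simplest form of integrating predictions from multiple inference methods is to average the rank that each method assigns to each candidate edge. The sketch below shows only that unsupervised baseline, not the learned semi-supervised combination proposed in the paper. The function name, the input layout and the toy scores are assumptions.

```python
def average_rank_consensus(predictions):
    """Combine edge scores from several methods by averaging their ranks.

    `predictions` is a list of dicts mapping (regulator, target) to a score,
    where a higher score means a more confident edge. An edge missing from
    a method is ranked last for that method.
    """
    edges = sorted(set().union(*predictions))
    totals = dict.fromkeys(edges, 0.0)
    for scores in predictions:
        # Stable sort: ties keep the alphabetical order of `edges`.
        ranked = sorted(edges, key=lambda e: -scores.get(e, float("-inf")))
        for rank, edge in enumerate(ranked, start=1):
            totals[edge] += rank
    n = len(predictions)
    return sorted(((e, totals[e] / n) for e in edges), key=lambda item: item[1])


method_1 = {("tf1", "g1"): 0.9, ("tf1", "g2"): 0.2, ("tf2", "g1"): 0.5}
method_2 = {("tf1", "g1"): 0.7, ("tf1", "g2"): 0.6, ("tf2", "g1"): 0.1}
method_3 = {("tf1", "g1"): 0.8, ("tf2", "g1"): 0.4}
for edge, avg_rank in average_rank_consensus([method_1, method_2, method_3]):
    print(edge, round(avg_rank, 2))
```

Ranks are used instead of raw scores because different methods report confidence on incomparable scales. A learned combiner can go further by weighting methods according to how well they agree with known interactions.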
Reconstruction of cellular signal transduction networks using perturbation assays and linear programming.

PubMed

Knapp, Bettina; Kaderali, Lars

2013-01-01

Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this question. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance in particular on larger networks over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4(+) T-cells, as well as ErbB signaling in trastuzumab resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling the cell cycle progression.
Reconstruction of the regulatory network for Bacillus subtilis and reconciliation with gene expression data

DOE PAGES

Faria, Jose P.; Overbeek, Ross; Taylor, Ronald C.; ...

2016-03-18

Here, we introduce a manually constructed and curated regulatory network model that describes the current state of knowledge of transcriptional regulation of B. subtilis. The model corresponds to an updated and enlarged version of the regulatory model of central metabolism originally proposed in 2008. We extended the original network to the whole genome by integration of information from DBTBS, a compendium of regulatory data that includes promoters, transcription factors (TFs), binding sites, motifs and regulated operons. Additionally, we consolidated our network with all the information on regulation included in the SporeWeb and Subtiwiki community-curated resources on B. subtilis. Finally, we reconciled our network with data from RegPrecise, which recently released their own less comprehensive reconstruction of the regulatory network for B. subtilis. Our model describes 275 regulators and their target genes, representing 30 different mechanisms of regulation such as TFs, RNA switches, Riboswitches and small regulatory RNAs. Overall, regulatory information is included in the model for approximately 2500 of the ~4200 genes in B. subtilis 168. In an effort to further expand our knowledge of B. subtilis regulation, we reconciled our model with expression data. For this process, we reconstructed the Atomic Regulons (ARs) for B. subtilis, which are the sets of genes that share the same “ON” and “OFF” gene expression profiles across multiple samples of experimental data. We show how atomic regulons for B. subtilis are able to capture many sets of genes corresponding to regulated operons in our manually curated network. Additionally, we demonstrate how atomic regulons can be used to help expand or validate the knowledge of the regulatory networks by looking at highly correlated genes in the ARs for which regulatory information is lacking. During this process, we were also able to infer novel stimuli for hypothetical genes by exploring the genome expression metadata relating to experimental conditions, gaining insights into novel biology.
Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

PubMed Central

Michailidis, George

2014-01-01

Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network. PMID:24586224
Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks

NASA Astrophysics Data System (ADS)

Zhu, Shijia; Wang, Yadong

2015-12-01

Dynamic Bayesian Networks (DBNs) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Their standard assumption is ‘stationarity’, and several methods have recently been proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of a Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces the search space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more consistently high prediction accuracy and significantly improved computational efficiency, even with no prior knowledge and parameter settings.
CoryneRegNet 3.0--an interactive systems biology platform for the analysis of gene regulatory networks in corynebacteria and Escherichia coli.

PubMed

Baumbach, Jan; Wittkop, Tobias; Rademacher, Katrin; Rahmann, Sven; Brinkrolf, Karina; Tauch, Andreas

2007-04-30

CoryneRegNet is an ontology-based data warehouse for the reconstruction and visualization of transcriptional regulatory interactions in prokaryotes. To extend the biological content of CoryneRegNet, we added comprehensive data on transcriptional regulations in the model organism Escherichia coli K-12, originally deposited in the international reference database RegulonDB. The enhanced web interface of CoryneRegNet offers several types of search options. The results of a search are displayed in a table-based style and include a visualization of the genetic organization of the respective gene region. Information on DNA binding sites of transcriptional regulators is depicted by sequence logos. The results can also be displayed by several layouters implemented in the graphical user interface GraphVis, allowing, for instance, the visualization of genome-wide network reconstructions and the homology-based inter-species comparison of reconstructed gene regulatory networks. In an application example, we compare the composition of the gene regulatory networks involved in the SOS response of E. coli and Corynebacterium glutamicum. CoryneRegNet is available at the following URL: http://www.cebitec.uni-bielefeld.de/groups/gi/software/coryneregnet/.
Biblio-MetReS for user-friendly mining of genes and biological processes in scientific documents.

PubMed

Usie, Anabel; Karathia, Hiren; Teixidó, Ivan; Alves, Rui; Solsona, Francesc

2014-01-01

One way to initiate the reconstruction of molecular circuits is by using automated text-mining techniques. Developing more efficient methods for such reconstruction is a topic of active research, and those methods are typically included by bioinformaticians in pipelines used to mine and curate large literature datasets. Nevertheless, experimental biologists have a limited number of available user-friendly tools that use text-mining for network reconstruction and require no programming skills to use. One of these tools is Biblio-MetReS. Originally, this tool permitted an on-the-fly analysis of documents contained in a number of web-based literature databases to identify co-occurrence of proteins/genes. This approach ensured results that were always up-to-date with the latest live version of the databases. However, this 'up-to-dateness' came at the cost of large execution times. Here we report an evolution of the application Biblio-MetReS that permits constructing co-occurrence networks for genes, GO processes, Pathways, or any combination of the three types of entities and graphically representing those entities. We show that the performance of Biblio-MetReS in identifying gene co-occurrence is at least as good as that of other comparable applications (STRING and iHOP). In addition, we also show that the identification of GO processes is on par with that reported in the latest BioCreAtIvE challenge. Finally, we also report the implementation of a new strategy that combines on-the-fly analysis of new documents with preprocessed information from documents that were encountered in previous analyses. This combination simultaneously decreases program run time and maintains 'up-to-dateness' of the results. http://metres.udl.cat/index.php/downloads, metres.cmb@gmail.com.
Combination of a proteomics approach and reengineering of meso scale network models for prediction of mode-of-action for tyrosine kinase inhibitors.

PubMed

Balabanov, Stefan; Wilhelm, Thomas; Venz, Simone; Keller, Gunhild; Scharf, Christian; Pospisil, Heike; Braig, Melanie; Barett, Christine; Bokemeyer, Carsten; Walther, Reinhard; Brümmendorf, Tim H; Schuppert, Andreas

2013-01-01

In drug discovery, the characterisation of the precise modes of action (MoA) and of unwanted off-target effects of novel molecularly targeted compounds is of highest relevance. Recent approaches for identification of MoA have employed various techniques for modeling of well defined signaling pathways including structural information, changes in phenotypic behavior of cells and gene expression patterns after drug treatment. However, efficient approaches focusing on proteome wide data for the identification of MoA including interference with mutations are underrepresented. As mutations are key drivers of drug resistance in molecularly targeted tumor therapies, efficient analysis and modeling of downstream effects of mutations on drug MoA is a key to efficient development of improved targeted anti-cancer drugs. Here we present a combination of a global proteome analysis, reengineering of network models and integration of apoptosis data used to infer the mode-of-action of various tyrosine kinase inhibitors (TKIs) in chronic myeloid leukemia (CML) cell lines expressing wild type as well as TKI resistance conferring mutants of BCR-ABL. The inferred network models provide a tool to predict the main MoA of drugs as well as to group drugs with known similar kinase inhibitory activity patterns in comparison to drugs with an additional MoA. We believe that our direct network reconstruction approach, demonstrated on proteomics data, can provide a complementary method to the established network reconstruction approaches for the preclinical modeling of the MoA of various types of targeted drugs in cancer treatment. Hence it may contribute to the more precise prediction of clinically relevant on- and off-target effects of TKIs.
Combination of a Proteomics Approach and Reengineering of Meso Scale Network Models for Prediction of Mode-of-Action for Tyrosine Kinase Inhibitors

PubMed Central

Balabanov, Stefan; Wilhelm, Thomas; Venz, Simone; Keller, Gunhild; Scharf, Christian; Pospisil, Heike; Braig, Melanie; Barett, Christine; Bokemeyer, Carsten; Walther, Reinhard

2013-01-01

In drug discovery, the characterisation of the precise modes of action (MoA) and of unwanted off-target effects of novel molecularly targeted compounds is of highest relevance. Recent approaches for identification of MoA have employed various techniques for modeling of well defined signaling pathways including structural information, changes in phenotypic behavior of cells and gene expression patterns after drug treatment. However, efficient approaches focusing on proteome wide data for the identification of MoA including interference with mutations are underrepresented. As mutations are key drivers of drug resistance in molecularly targeted tumor therapies, efficient analysis and modeling of downstream effects of mutations on drug MoA is a key to efficient development of improved targeted anti-cancer drugs. Here we present a combination of a global proteome analysis, reengineering of network models and integration of apoptosis data used to infer the mode-of-action of various tyrosine kinase inhibitors (TKIs) in chronic myeloid leukemia (CML) cell lines expressing wild type as well as TKI resistance conferring mutants of BCR-ABL. The inferred network models provide a tool to predict the main MoA of drugs as well as to group drugs with known similar kinase inhibitory activity patterns in comparison to drugs with an additional MoA. We believe that our direct network reconstruction approach, demonstrated on proteomics data, can provide a complementary method to the established network reconstruction approaches for the preclinical modeling of the MoA of various types of targeted drugs in cancer treatment. Hence it may contribute to the more precise prediction of clinically relevant on- and off-target effects of TKIs. PMID:23326482
Reconstruction of network topology using status-time-series data

NASA Astrophysics Data System (ADS)

Pandey, Pradumn Kumar; Badarla, Venkataramana

2018-01-01

Uncovering the heterogeneous connection pattern of a networked system from the available status-time-series (STS) data of a dynamical process on the network is of great interest in network science and known as a reverse engineering problem. Dynamical processes on a network are affected by the structure of the network. The dependency between the diffusion dynamics and structure of the network can be utilized to retrieve the connection pattern from the diffusion data. Information of the network structure can help to devise the control of dynamics on the network. In this paper, we consider the problem of network reconstruction from the available status-time-series (STS) data using matrix analysis. The proposed method of network reconstruction from the STS data is tested successfully under susceptible-infected-susceptible (SIS) diffusion dynamics on real-world and computer-generated benchmark networks. High accuracy and efficiency of the proposed reconstruction procedure from the status-time-series data define the novelty of the method. Our proposed method outperforms compressed sensing theory (CST) based method of network reconstruction using STS data. Further, the same procedure of network reconstruction is applied to the weighted networks. The ordering of the edges in the weighted networks is identified with high accuracy.

Reconstruction of metabolic networks from high-throughput metabolite profiling data: in silico analysis of red blood cell metabolism.

PubMed

Nemenman, Ilya; Escola, G Sean; Hlavacek, William S; Unkefer, Pat J; Unkefer, Clifford J; Wall, Michael E

2007-12-01

We investigate the ability of algorithms developed for reverse engineering of transcriptional regulatory networks to reconstruct metabolic networks from high-throughput metabolite profiling data. For benchmarking purposes, we generate synthetic metabolic profiles based on a well-established model for red blood cell metabolism. A variety of data sets are generated, accounting for different properties of real metabolic networks, such as experimental noise, metabolite correlations, and temporal dynamics. These data sets are made available online. We use ARACNE, a mainstream algorithm for reverse engineering of transcriptional regulatory networks from gene expression data, to predict metabolic interactions from these data sets. We find that the performance of ARACNE on metabolic data is comparable to that on gene expression data.
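ARACNE is generally described as scoring gene pairs by mutual information (MI) and then using the data processing inequality (DPI) to discard the weakest edge of each fully connected triplet as a likely indirect interaction. The abstract gives no implementation details, so the following Python sketch is only an illustration of that pruning step on a precomputed MI table; the function name `dpi_prune`, the exact form of the `tolerance` rule, and the MI values are assumptions.

```python
from itertools import combinations


def dpi_prune(mi, tolerance=0.0):
    """Remove the weakest edge of every fully connected triplet.

    mi: dict mapping frozenset({a, b}) to a mutual-information estimate.
    tolerance: assumed relative margin; an edge is dropped only if its MI
    is below (1 - tolerance) times the smaller of the other two MIs.
    """
    nodes = sorted({node for edge in mi for node in edge})
    to_remove = set()
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if any(edge not in mi for edge in edges):
            continue  # DPI is applied only when all three edges exist
        weakest = min(edges, key=mi.get)
        others = [edge for edge in edges if edge != weakest]
        if mi[weakest] < min(mi[edge] for edge in others) * (1 - tolerance):
            to_remove.add(weakest)
    # Edges are removed only after every triplet was checked against the
    # original table, so the result does not depend on iteration order.
    return {edge: value for edge, value in mi.items() if edge not in to_remove}


# Toy MI table (hypothetical values): X-Z is the weakest edge of the triangle.
toy_mi = {
    frozenset(("X", "Y")): 0.9,
    frozenset(("Y", "Z")): 0.8,
    frozenset(("X", "Z")): 0.3,
    frozenset(("Z", "W")): 0.2,
}
pruned = dpi_prune(toy_mi)
```

The edge between Z and W survives because it is not part of any complete triplet, which is why low-MI edges are usually also filtered by a separate significance threshold before this step.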
Dynamic sporulation gene co-expression networks for Bacillus subtilis 168 and the food-borne isolate Bacillus amyloliquefaciens: a transcriptomic model

PubMed Central

Omony, Jimmy; de Jong, Anne; Krawczyk, Antonina O.; Eijlander, Robyn T.; Kuipers, Oscar P.

2018-01-01

Sporulation is a survival strategy, adapted by bacterial cells in response to harsh environmental adversities. The adaptation potential differs between strains and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed the enriched modules with coherent functional groups associated with sporulation. On basis of the GCNs and time-evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes. PMID:29424683
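The abstract does not say which measure was used to infer the co-expression networks. A common baseline, shown here purely as an assumed illustration, links two genes when the absolute Pearson correlation of their expression profiles across samples exceeds a threshold. The gene names, the six-point profiles (standing in for six sampled stages) and the 0.9 cut-off are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return {(gene1, gene2): r} for pairs with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


# Toy profiles over six time points (made-up numbers, not real data).
toy_profiles = {
    "geneA": [1, 2, 3, 4, 5, 6],
    "geneB": [2, 4, 6, 8, 10, 12],
    "geneC": [6, 5, 4, 3, 2, 1],
    "geneD": [3, 1, 4, 1, 5, 9],
}
toy_edges = coexpression_edges(toy_profiles)
```

Using the absolute value keeps anti-correlated pairs such as geneA and geneC in the network; with only six samples, correlations are noisy, so a strict threshold or a significance test is advisable.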
Dynamic sporulation gene co-expression networks for Bacillus subtilis 168 and the food-borne isolate Bacillus amyloliquefaciens: a transcriptomic model.

PubMed

Omony, Jimmy; de Jong, Anne; Krawczyk, Antonina O; Eijlander, Robyn T; Kuipers, Oscar P

2018-02-09

Sporulation is a survival strategy, adapted by bacterial cells in response to harsh environmental adversities. The adaptation potential differs between strains and the variations may arise from differences in gene regulation. Gene networks are a valuable way of studying such regulation processes and establishing associations between genes. We reconstructed and compared sporulation gene co-expression networks (GCNs) of the model laboratory strain Bacillus subtilis 168 and the food-borne industrial isolate Bacillus amyloliquefaciens. Transcriptome data obtained from samples of six stages during the sporulation process were used for network inference. Subsequently, a gene set enrichment analysis was performed to compare the reconstructed GCNs of B. subtilis 168 and B. amyloliquefaciens with respect to biological functions, which showed the enriched modules with coherent functional groups associated with sporulation. On basis of the GCNs and time-evolution of differentially expressed genes, we could identify novel candidate genes strongly associated with sporulation in B. subtilis 168 and B. amyloliquefaciens. The GCNs offer a framework for exploring transcription factors, their targets, and co-expressed genes during sporulation. Furthermore, the methodology described here can conveniently be applied to other species or biological processes.
Trade-off between Multiple Constraints Enables Simultaneous Formation of Modules and Hubs in Neural Systems

PubMed Central

Chen, Yuhan; Wang, Shengjun; Hilgetag, Claus C.; Zhou, Changsong

2013-01-01

The formation of the complex network architecture of neural systems is subject to multiple structural and functional constraints. Two obvious but apparently contradictory constraints are low wiring cost and high processing efficiency, characterized by short overall wiring length and a small average number of processing steps, respectively. Growing evidence shows that neural networks result from a trade-off between physical cost and functional value of the topology. However, the relationship between these competing constraints and complex topology is not well understood quantitatively. We explored this relationship systematically by reconstructing two known neural networks, Macaque cortical connectivity and C. elegans neuronal connections, from combinatory optimization of wiring cost and processing efficiency constraints, using a control parameter, and comparing the reconstructed networks to the real networks. We found that in both neural systems, the reconstructed networks derived from the two constraints can reveal some important relations between the spatial layout of nodes and the topological connectivity, and match several properties of the real networks. The reconstructed and real networks had a similar modular organization in a broad range of the control parameter, resulting from spatial clustering of network nodes. Hubs emerged due to the competition of the two constraints, and their positions were close to, and partly coincided with, the real hubs in a range of values. The degree of nodes was correlated with the density of nodes in their spatial neighborhood in both reconstructed and real networks. Generally, the rebuilt network matched a significant portion of real links, especially short-distance ones. These findings provide clear evidence to support the hypothesis of trade-off between multiple constraints on brain networks. The two constraints of wiring cost and processing efficiency, however, cannot explain all salient features in the real networks. The discrepancy suggests that there are further relevant factors that are not yet captured here. PMID:23505352
Gene network reconstruction from transcriptional dynamics under kinetic model uncertainty: a case for the second derivative

PubMed Central

Bickel, David R.; Montazeri, Zahra; Hsieh, Pei-Chun; Beatty, Mary; Lawit, Shai J.; Bate, Nicholas J.

2009-01-01

Motivation: Measurements of gene expression over time enable the reconstruction of transcriptional networks. However, Bayesian networks and many other current reconstruction methods rely on assumptions that conflict with the differential equations that describe transcriptional kinetics. Practical approximations of kinetic models would enable inferring causal relationships between genes from expression data of microarray, tag-based and conventional platforms, but conclusions are sensitive to the assumptions made. Results: The representation of a sufficiently large portion of genome enables computation of an upper bound on how much confidence one may place in influences between genes on the basis of expression data. Information about which genes encode transcription factors is not necessary but may be incorporated if available. The methodology is generalized to cover cases in which expression measurements are missing for many of the genes that might control the transcription of the genes of interest. The assumption that the gene expression level is roughly proportional to the rate of translation led to better empirical performance than did either the assumption that the gene expression level is roughly proportional to the protein level or the Bayesian model average of both assumptions. Availability: http://www.oisb.ca points to R code implementing the methods (R Development Core Team 2004). Contact: dbickel@uottawa.ca Supplementary information: http://www.davidbickel.com PMID:19218351
Maximum likelihood of phylogenetic networks.

PubMed

Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir

2006-11-01

Horizontal gene transfer (HGT) is believed to be ubiquitous among bacteria, and plays a major role in their genome diversification as well as their ability to develop resistance to antibiotics. In light of its evolutionary significance and implications for human health, developing accurate and efficient methods for detecting and reconstructing HGT is imperative. In this article we provide a new HGT-oriented likelihood framework for many problems that involve phylogeny-based HGT detection and reconstruction. Beside the formulation of various likelihood criteria, we show that most of these problems are NP-hard, and offer heuristics for efficient and accurate reconstruction of HGT under these criteria. We implemented our heuristics and used them to analyze biological as well as synthetic data. In both cases, our criteria and heuristics exhibited very good performance with respect to identifying the correct number of HGT events as well as inferring their correct location on the species tree. Implementation of the criteria as well as heuristics and hardness proofs are available from the authors upon request. Hardness proofs can also be downloaded at http://www.cs.tau.ac.il/~tamirtul/MLNET/Supp-ML.pdf
CaSPIAN: A Causal Compressive Sensing Algorithm for Discovering Directed Interactions in Gene Networks

PubMed Central

Emad, Amin; Milenkovic, Olgica

2014-01-01

We introduce a novel algorithm for inference of causal gene interactions, termed CaSPIAN (Causal Subspace Pursuit for Inference and Analysis of Networks), which is based on coupling compressive sensing and Granger causality techniques. The core of the approach is to discover sparse linear dependencies between shifted time series of gene expressions using a sequential list-version of the subspace pursuit reconstruction algorithm and to estimate the direction of gene interactions via Granger-type elimination. The method is conceptually simple and computationally efficient, and it allows for dealing with noisy measurements. Its performance as a stand-alone platform without biological side-information was tested on simulated networks, on the synthetic IRMA network in Saccharomyces cerevisiae, and on data pertaining to the human HeLa cell network and the SOS network in E. coli. The results produced by CaSPIAN are compared to the results of several related algorithms, demonstrating significant improvements in inference accuracy of documented interactions. These findings highlight the importance of Granger causality techniques for reducing the number of false-positives, as well as the influence of noise and sampling period on the accuracy of the estimates. In addition, the performance of the method was tested in conjunction with biological side information of the form of sparse “scaffold networks”, to which new edges were added using available RNA-seq or microarray data. These biological priors aid in increasing the sensitivity and precision of the algorithm in the small sample regime. PMID:24622336
Robust Learning of High-dimensional Biological Networks with Bayesian Networks

NASA Astrophysics Data System (ADS)

Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

Structure learning of Bayesian networks applied to gene expression data has become a potentially useful method to estimate interactions between genes. However, the NP-hardness of Bayesian network structure learning renders the reconstruction of the full genetic network with thousands of genes unfeasible. Consequently, the maximal network size is usually restricted dramatically to a small set of genes (corresponding with variables in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, on the downside, the learned structure might be adversely affected due to the introduction of missing genes. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of different observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches tackling the robustness issue in order to obtain a more reliable estimation of learned network features.
Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. As a result, the defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation.

PubMed

Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra; Ng, Patrick; Khraiwesh, Basel; Jaiswal, Ashish; Jijakli, Kenan; Koussa, Joseph; Nelson, David R; Cai, Hong; Yang, Xinping; Chang, Roger L; Papin, Jason; Yu, Haiyuan; Balaji, Santhanam; Salehi-Ashtiani, Kourosh

2016-07-19

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. The defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
Systems level analysis of the Chlamydomonas reinhardtii metabolic network reveals variability in evolutionary co-conservation

DOE PAGES

Chaiboonchoe, Amphun; Ghamsari, Lila; Dohai, Bushra; ...

2016-06-14

Metabolic networks, which are mathematical representations of organismal metabolism, are reconstructed to provide computational platforms to guide metabolic engineering experiments and explore fundamental questions on metabolism. Systems level analyses, such as interrogation of phylogenetic relationships within the network, can provide further guidance on the modification of metabolic circuitries. Chlamydomonas reinhardtii, a biofuel relevant green alga that has retained key genes with plant, animal, and protist affinities, serves as an ideal model organism to investigate the interplay between gene function and phylogenetic affinities at multiple organizational levels. Here, using detailed topological and functional analyses, coupled with transcriptomics studies on a metabolic network that we have reconstructed for C. reinhardtii, we show that network connectivity has a significant concordance with the co-conservation of genes; however, a distinction between topological and functional relationships is observable within the network. Dynamic and static modes of co-conservation were defined and observed in a subset of gene-pairs across the network topologically. In contrast, genes with predicted synthetic interactions, or genes involved in coupled reactions, show significant enrichment for both shorter and longer phylogenetic distances. Based on our results, we propose that the metabolic network of C. reinhardtii is assembled with an architecture to minimize phylogenetic profile distances topologically, while it includes an expansion of such distances for functionally interacting genes. This arrangement may increase the robustness of C. reinhardtii's network in dealing with varied environmental challenges that the species may face. As a result, the defined evolutionary constraints within the network, which identify important pairings of genes in metabolism, may offer guidance on synthetic biology approaches to optimize the production of desirable metabolites.
Simulation-Based Evaluation of Hybridization Network Reconstruction Methods in the Presence of Incomplete Lineage Sorting

PubMed Central

Kamneva, Olga K; Rosenberg, Noah A

2017-01-01

Hybridization events generate reticulate species relationships, giving rise to species networks rather than species trees. We report a comparative study of consensus, maximum parsimony, and maximum likelihood methods of species network reconstruction using gene trees simulated assuming a known species history. We evaluate the role of the divergence time between species involved in a hybridization event, the relative contributions of the hybridizing species, and the error in gene tree estimation. When gene tree discordance is mostly due to hybridization and not due to incomplete lineage sorting (ILS), most of the methods can detect even highly skewed hybridization events between highly divergent species. For recent divergences between hybridizing species, when the influence of ILS is sufficiently high, likelihood methods outperform parsimony and consensus methods, which erroneously identify extra hybridizations. The more sophisticated likelihood methods, however, are affected by gene tree errors to a greater extent than are consensus and parsimony. PMID:28469378
Reconstruction of Complex Network based on the Noise via QR Decomposition and Compressed Sensing.

PubMed

Li, Lixiang; Xu, Dafei; Peng, Haipeng; Kurths, Jürgen; Yang, Yixian

2017-11-08

It is generally known that the states of network nodes are stable and have strong correlations in a linear network system. We find that without the control input, the method of compressed sensing cannot succeed in reconstructing complex networks in which the states of nodes are generated through the linear network system. However, noise can drive the dynamics between nodes to break the stability of the system state. Therefore, a new method integrating QR decomposition and compressed sensing is proposed to solve the reconstruction problem of complex networks with the assistance of input noise. The state matrix of the system is decomposed by QR decomposition. We construct the measurement matrix with the aid of Gaussian noise so that the sparse input matrix can be reconstructed by compressed sensing. We also discover that noise can build a bridge between the dynamics and the topological structure. Experiments comparing the proposed method with compressed sensing alone show that it reconstructs four model networks and six real networks more accurately and more efficiently. In addition, the proposed method can reconstruct not only sparse complex networks but also dense ones.
NetMiner-an ensemble pipeline for building genome-wide and high-quality gene co-expression network using massive-scale RNA-seq samples.

PubMed

Yu, Hua; Jiao, Bingke; Lu, Lu; Wang, Pengfei; Chen, Shuangcheng; Liang, Chengzhi; Liu, Wei

2018-01-01

Accurately reconstructing gene co-expression networks is of great importance for uncovering the genetic architecture underlying complex and varied phenotypes. The recent availability of high-throughput RNA-seq has made genome-wide detection and quantification of novel, rare and low-abundance transcripts practical. However, its potential merits in reconstructing gene co-expression networks have still not been well explored. Using massive-scale RNA-seq samples, we have designed an ensemble pipeline, called NetMiner, for building a genome-scale, high-quality Gene Co-expression Network (GCN) by integrating three frequently used inference algorithms. We constructed an RNA-seq-based GCN in rice, a monocot species. The quality of the network obtained by our method was verified and evaluated against curated gene functional association data sets, and it clearly outperformed each single method. In addition, the capability of the network for associating genes with functions and agronomic traits was shown by enrichment analysis and case studies. In particular, we demonstrated the potential value of our proposed method to predict the biological roles of unknown protein-coding genes, long non-coding RNA (lncRNA) genes and circular RNA (circRNA) genes. Our results provided a valuable and highly reliable data source to select key candidate genes for subsequent experimental validation. To facilitate identification of novel genes regulating important biological processes and phenotypes in other plants or animals, we have published the source code of NetMiner, making it freely available at https://github.com/czllab/NetMiner.
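The abstract does not say how NetMiner combines its three inference algorithms. One common way to integrate edge scores from several methods is to average the rank each method assigns to an edge, and the sketch below illustrates only that general idea. The method labels (`pearson`, `mi`, `regression`), the scores, and the `aggregate_ranks` helper are hypothetical and are not taken from the NetMiner source code.

```python
from collections import defaultdict

def aggregate_ranks(score_tables):
    """Average the per-method rank of each edge (rank 1 = strongest score).

    score_tables maps a method name to {edge: score}. Every method is
    assumed to score the same set of edges.
    """
    rank_sum = defaultdict(float)
    for scores in score_tables.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, edge in enumerate(ordered, start=1):
            rank_sum[edge] += rank
    n_methods = len(score_tables)
    mean_rank = {edge: total / n_methods for edge, total in rank_sum.items()}
    # Edges with the lowest mean rank come first.
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Hypothetical edge scores from three hypothetical inference methods.
scores = {
    "pearson":    {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5},
    "mi":         {("g1", "g2"): 0.7, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8},
    "regression": {("g1", "g2"): 0.6, ("g1", "g3"): 0.3, ("g2", "g3"): 0.4},
}
consensus = aggregate_ranks(scores)
```

Working on ranks rather than raw scores avoids having to put correlation coefficients, mutual information, and regression weights on a common scale.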
Reconstruction of an Integrated Genome-Scale Co-Expression Network Reveals Key Modules Involved in Lung Adenocarcinoma

PubMed Central

Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

2013-01-01

The goal of this study was to reconstruct a “genome-scale co-expression network” and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named “genome-scale co-expression network”. As the next step, 23 key modules were disclosed through clustering. In this study a number of genes have been identified for the first time to be implicated in lung adenocarcinoma by analyzing the modules. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena of cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes that play a crucial role in cell cycle progression, such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9. In addition to the mentioned genes, there are some other genes (e.g. DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS) in the modules. PMID:23874428
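The abstract does not state which similarity measure or cutoff was used to build the co-expression network. A minimal sketch of the general technique, assuming Pearson correlation with a hard threshold, is given below; the gene names, expression values, and the 0.8 cutoff are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_edges(expr, cutoff):
    """Return {(gene1, gene2): r} for gene pairs with |r| >= cutoff."""
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= cutoff:
            edges[(g1, g2)] = r
    return edges

# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],   # weakly anti-correlated with both
}
edges = coexpression_edges(expr, cutoff=0.8)
```

Modules such as the 23 mentioned above would then be obtained by clustering the resulting graph, a step this sketch leaves out.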
Independence screening for high dimensional nonlinear additive ODE models with applications to dynamic gene regulatory networks.

PubMed

Xue, Hongqi; Wu, Shuang; Wu, Yichao; Ramirez Idarraga, Juan C; Wu, Hulin

2018-05-02

Mechanism-driven low-dimensional ordinary differential equation (ODE) models are often used to model viral dynamics at cellular levels and epidemics of infectious diseases. However, low-dimensional mechanism-based ODE models are limited for modeling infectious diseases at molecular levels such as transcriptomic or proteomic levels, which is critical to understand pathogenesis of diseases. Although linear ODE models have been proposed for gene regulatory networks (GRNs), nonlinear regulations are common in GRNs. The reconstruction of large-scale nonlinear networks from time-course gene expression data remains an unresolved issue. Here, we use high-dimensional nonlinear additive ODEs to model GRNs and propose a 4-step procedure to efficiently perform variable selection for nonlinear ODEs. To tackle the challenge of high dimensionality, we couple the 2-stage smoothing-based estimation method for ODEs and a nonlinear independence screening method to perform variable selection for the nonlinear ODE models. We have shown that our method possesses the sure screening property and it can handle problems with non-polynomial dimensionality. Numerical performance of the proposed method is illustrated with simulated data and a real data example for identifying the dynamic GRN of Saccharomyces cerevisiae. Copyright © 2018 John Wiley & Sons, Ltd.
An approach for reduction of false predictions in reverse engineering of gene regulatory networks.

PubMed

Khan, Abhinandan; Saha, Goutam; Pal, Rajat Kumar

2018-05-14

A gene regulatory network discloses the regulatory interactions amongst genes, at a particular condition of the human body. The accurate reconstruction of such networks from time-series genetic expression data using computational tools offers a stiff challenge for contemporary computer scientists. This is crucial to facilitate the understanding of the proper functioning of a living organism. Unfortunately, the computational methods produce many false predictions along with the correct predictions, which is unwanted. Investigations in the domain focus on the identification of as many correct regulations as possible in the reverse engineering of gene regulatory networks to make it more reliable and biologically relevant. One way to achieve this is to reduce the number of incorrect predictions in the reconstructed networks. In the present investigation, we have proposed a novel scheme to decrease the number of false predictions by suitably combining several metaheuristic techniques. We have implemented the same using a dataset ensemble approach (i.e. combining multiple datasets) also. We have employed the proposed methodology on real-world experimental datasets of the SOS DNA Repair network of Escherichia coli and the IMRA network of Saccharomyces cerevisiae. Subsequently, we have experimented upon somewhat larger, in silico networks, namely, DREAM3 and DREAM4 Challenge networks, and 15-gene and 20-gene networks extracted from the GeneNetWeaver database. To study the effect of multiple datasets on the quality of the inferred networks, we have used four datasets in each experiment. The obtained results are encouraging enough as the proposed methodology can reduce the number of false predictions significantly, without using any supplementary prior biological information for larger gene regulatory networks. It is also observed that if a small amount of prior biological information is incorporated here, the results improve further w.r.t. the prediction of true positives. 
Copyright © 2018 Elsevier Ltd. All rights reserved.
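To make the notion of false predictions concrete, the sketch below counts true positives, false positives, and false negatives for a predicted edge set, and shows how a simple vote across several inference runs can remove edges that only one run supports. The voting rule is a generic stand-in chosen for illustration, not the authors' metaheuristic scheme, and the gene names and edge sets are hypothetical.

```python
def edge_metrics(predicted, true):
    """Count correct and incorrect directed edges against a reference network."""
    predicted, true = set(predicted), set(true)
    tp = len(predicted & true)
    fp = len(predicted - true)   # false predictions
    fn = len(true - predicted)   # missed regulations
    precision = tp / (tp + fp) if predicted else 0.0
    recall = tp / (tp + fn) if true else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}

def consensus_edges(edge_sets, min_votes):
    """Keep edges reported by at least min_votes of the runs."""
    votes = {}
    for edges in edge_sets:
        for edge in set(edges):
            votes[edge] = votes.get(edge, 0) + 1
    return {edge for edge, count in votes.items() if count >= min_votes}

# Hypothetical reference network and three hypothetical inference runs.
true_edges = {("A", "B"), ("A", "C")}
runs = [
    {("A", "B"), ("A", "C"), ("B", "D")},
    {("A", "B"), ("A", "C"), ("C", "D")},
    {("A", "B"), ("B", "C"), ("C", "D")},
]
single = edge_metrics(runs[0], true_edges)
strict = edge_metrics(consensus_edges(runs, min_votes=3), true_edges)
```

In this toy case the strict vote removes every false positive but also loses one true edge, the same trade-off between false predictions and true positives that the abstract discusses.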
An empirical Bayes approach to network recovery using external knowledge.

PubMed

Kpogbezan, Gino B; van der Vaart, Aad W; van Wieringen, Wessel N; Leday, Gwenaël G R; van de Wiel, Mark A

2017-09-01

Reconstruction of a high-dimensional network may benefit substantially from the inclusion of prior knowledge on the network topology. In the case of gene interaction networks such knowledge may come for instance from pathway repositories like KEGG, or be inferred from data of a pilot study. The Bayesian framework provides a natural means of including such prior knowledge. Based on a Bayesian Simultaneous Equation Model, we develop an appealing Empirical Bayes (EB) procedure that automatically assesses the agreement of the used prior knowledge with the data at hand. We use a variational Bayes method to approximate the posterior densities and compare its accuracy with that of a Gibbs sampling strategy. Our method is computationally fast, and can outperform known competitors. In a simulation study, we show that accurate prior data can greatly improve the reconstruction of the network, but need not harm the reconstruction if wrong. We demonstrate the benefits of the method in an analysis of gene expression data from GEO. In particular, the edges of the recovered network have superior reproducibility (compared to that of competitors) over resampled versions of the data. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Dynamic modelling of microRNA regulation during mesenchymal stem cell differentiation.

PubMed

Weber, Michael; Sotoca, Ana M; Kupfer, Peter; Guthke, Reinhard; van Zoelen, Everardus J

2013-11-12

Network inference from gene expression data is a typical approach to reconstruct gene regulatory networks. During chondrogenic differentiation of human mesenchymal stem cells (hMSCs), a complex transcriptional network is active and regulates the temporal differentiation progress. As modulators of transcriptional regulation, microRNAs (miRNAs) play a critical role in stem cell differentiation. Integrated network inference aims at determining interrelations between miRNAs and mRNAs on the basis of expression data as well as miRNA target predictions. We applied the NetGenerator tool in order to infer an integrated gene regulatory network. Time series experiments were performed to measure mRNA and miRNA abundances of TGF-beta1+BMP2 stimulated hMSCs. Network nodes were identified by analysing temporal expression changes, miRNA target gene predictions, time series correlation and literature knowledge. Network inference was performed using NetGenerator to reconstruct a dynamical regulatory model based on the measured data and prior knowledge. The resulting model is robust against noise and shows an optimal trade-off between fitting precision and inclusion of prior knowledge. It predicts the influence of miRNAs on the expression of chondrogenic marker genes and therefore proposes novel regulatory relations in differentiation control. By analysing the inferred network, we identified a previously unknown regulatory effect of miR-524-5p on the expression of the transcription factor SOX9 and the chondrogenic marker genes COL2A1, ACAN and COL10A1. Genome-wide exploration of miRNA-mRNA regulatory relationships is a reasonable approach to identify miRNAs which have so far not been associated with the investigated differentiation process. The NetGenerator tool is able to identify valid gene regulatory networks on the basis of miRNA and mRNA time series data.
Network reconstruction and systems analysis of plant cell wall deconstruction by Neurospora crassa.

PubMed

Samal, Areejit; Craig, James P; Coradetti, Samuel T; Benz, J Philipp; Eddy, James A; Price, Nathan D; Glass, N Louise

2017-01-01

Plant biomass degradation by fungal-derived enzymes is rapidly expanding in economic importance as a clean and efficient source for biofuels. The ability to rationally engineer filamentous fungi would facilitate biotechnological applications for degradation of plant cell wall polysaccharides. However, incomplete knowledge of biomolecular networks responsible for plant cell wall deconstruction impedes experimental efforts in this direction. To expand this knowledge base, a detailed network of reactions important for deconstruction of plant cell wall polysaccharides into simple sugars was constructed for the filamentous fungus Neurospora crassa. To reconstruct this network, information was integrated from five heterogeneous data types: functional genomics, transcriptomics, proteomics, genetics, and biochemical characterizations. The combined information was encapsulated into a feature matrix and the evidence weighted to assign annotation confidence scores for each gene within the network. Comparative analyses of RNA-seq and ChIP-seq data shed light on the regulation of the plant cell wall degradation network, leading to a novel hypothesis for degradation of the hemicellulose mannan. The transcription factor CLR-2 was subsequently experimentally shown to play a key role in the mannan degradation pathway of N. crassa. Here we built a network that serves as a scaffold for integration of diverse experimental datasets. This approach led to the elucidation of regulatory design principles for plant cell wall deconstruction by filamentous fungi and a novel function for the transcription factor CLR-2. This expanding network will aid in efforts to rationally engineer industrially relevant hyper-production strains.

ROBNCA: robust network component analysis for recovering transcription factor activities.

PubMed

Noor, Amina; Ahmad, Aitzaz; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem

2013-10-01

Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of the gene expression data and prior information available about transcription factor (TF)-gene regulations. Most of the contemporary algorithms either exhibit the drawback of inconsistency and poor reliability, or suffer from prohibitive computational complexity. In addition, the existing algorithms do not possess the ability to counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). ROBNCA estimates the TF activity profiles as well as the TF-gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation and/or amount of outliers in case of synthetic data. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and is hundreds of times faster than NI-NCA. The ROBNCA software is available at http://people.tamu.edu/~amina/ROBNCA
MIIC online: a web server to reconstruct causal or non-causal networks from non-perturbative data.

PubMed

Sella, Nadir; Verny, Louis; Uguzzoni, Guido; Affeldt, Séverine; Isambert, Hervé

2018-07-01

We present a web server running the MIIC algorithm, a network learning method combining constraint-based and information-theoretic frameworks to reconstruct causal, non-causal or mixed networks from non-perturbative data, without the need for an a priori choice on the class of reconstructed network. Starting from a fully connected network, the algorithm first removes dispensable edges by iteratively subtracting the most significant information contributions from indirect paths between each pair of variables. The remaining edges are then filtered based on their confidence assessment or oriented based on the signature of causality in observational data. MIIC online server can be used for a broad range of biological data, including possible unobserved (latent) variables, from single-cell gene expression data to protein sequence evolution and outperforms or matches state-of-the-art methods for either causal or non-causal network reconstruction. MIIC online can be freely accessed at https://miic.curie.fr. Supplementary data are available at Bioinformatics online.
Inference of cancer-specific gene regulatory networks using soft computing rules.

PubMed

Wang, Xiaosheng; Gotoh, Osamu

2010-03-24

Perturbations of gene regulatory networks are essentially responsible for oncogenesis. Therefore, inferring the gene regulatory networks is a key step to overcoming cancer. In this work, we propose a method for inferring directed gene regulatory networks based on soft computing rules, which can identify important cause-effect regulatory relations of gene expression. First, we identify important genes associated with a specific cancer (colon cancer) using a supervised learning approach. Next, we reconstruct the gene regulatory networks by inferring the regulatory relations among the identified genes, and their regulated relations by other genes within the genome. We obtain two meaningful findings. One is that upregulated genes are regulated by more genes than downregulated ones, while downregulated genes regulate more genes than upregulated ones. The other one is that tumor suppressors suppress tumor activators and activate other tumor suppressors strongly, while tumor activators activate other tumor activators and suppress tumor suppressors weakly, indicating the robustness of biological systems. These findings provide valuable insights into the pathogenesis of cancer.
Reconstruction of extended Petri nets from time series data and its application to signal transduction and to gene regulatory networks

PubMed Central

2011-01-01

Background Network inference methods reconstruct mathematical models of molecular or genetic networks directly from experimental data sets. We have previously reported a mathematical method which is exclusively data-driven, does not involve any heuristic decisions within the reconstruction process, and delivers all possible alternative minimal networks in terms of simple place/transition Petri nets that are consistent with a given discrete time series data set. Results We fundamentally extended the previously published algorithm to consider catalysis and inhibition of the reactions that occur in the underlying network. The results of the reconstruction algorithm are encoded in the form of an extended Petri net involving control arcs. This allows the consideration of processes involving mass flow and/or regulatory interactions. As a non-trivial test case, the phosphate regulatory network of enterobacteria was reconstructed using in silico-generated time-series data sets on wild-type and in silico mutants. Conclusions The new exact algorithm reconstructs extended Petri nets from time series data sets by finding all alternative minimal networks that are consistent with the data. It suggested alternative molecular mechanisms for certain reactions in the network. The algorithm is useful to combine data from wild-type and mutant cells and may potentially integrate physiological, biochemical, pharmacological, and genetic data in the form of a single model. PMID:21762503
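For readers unfamiliar with the formalism, the sketch below shows how a place/transition Petri net with control arcs can be represented and executed. It covers only the forward firing rule, not the reconstruction algorithm described in the abstract. The semantics used here are an assumption: a catalyst arc requires its place to be marked without consuming tokens, and an inhibitor arc requires its place to be empty. The place names and the dictionary layout are hypothetical.

```python
def enabled(marking, transition):
    """Check whether a transition may fire under the given marking.

    Assumed semantics: input places need enough tokens, catalyst places
    must hold at least one token, and inhibitor places must be empty.
    """
    has_inputs = all(marking.get(place, 0) >= n
                     for place, n in transition["consume"].items())
    has_catalysts = all(marking.get(place, 0) > 0
                        for place in transition.get("catalysts", ()))
    not_inhibited = all(marking.get(place, 0) == 0
                        for place in transition.get("inhibitors", ()))
    return has_inputs and has_catalysts and not_inhibited

def fire(marking, transition):
    """Return the marking after firing; control arcs move no tokens."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["consume"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["produce"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking

# Hypothetical reaction: substrate -> product, catalysed by an enzyme
# and blocked while a repressor is present.
reaction = {
    "consume": {"substrate": 1},
    "produce": {"product": 1},
    "catalysts": ["enzyme"],
    "inhibitors": ["repressor"],
}
start = {"substrate": 2, "enzyme": 1, "product": 0}
after = fire(start, reaction)
blocked = enabled(dict(start, repressor=1), reaction)
```

A time series of markings such as `start` followed by `after` is the kind of discrete data from which the reconstruction method would infer candidate transitions.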
Model-based reconstruction of synthetic promoter library in Corynebacterium glutamicum.

PubMed

Zhang, Shuanghong; Liu, Dingyu; Mao, Zhitao; Mao, Yufeng; Ma, Hongwu; Chen, Tao; Zhao, Xueming; Wang, Zhiwen

2018-05-01

To develop an efficient synthetic promoter library for fine-tuned expression of target genes in Corynebacterium glutamicum. A synthetic promoter library for C. glutamicum was developed based on conserved sequences of the -10 and -35 regions. The synthetic promoter library covered a wide range of strengths, ranging from 1 to 193% of the tac promoter. 68 promoters were selected and sequenced for correlation analysis between promoter sequence and strength with a statistical model. A new promoter library was further reconstructed with improved promoter strength and coverage based on the results of correlation analysis. Tandem promoter P70 was finally constructed with increased strength by 121% over the tac promoter. The promoter library developed in this study showed a great potential for applications in metabolic engineering and synthetic biology for the optimization of metabolic networks. To the best of our knowledge, this is the first reconstruction of synthetic promoter library based on statistical analysis of C. glutamicum.
Cross disease analysis of co-functional microRNA pairs on a reconstructed network of disease-gene-microRNA tripartite.

PubMed

Peng, Hui; Lan, Chaowang; Zheng, Yi; Hutvagner, Gyorgy; Tao, Dacheng; Li, Jinyan

2017-03-24

MicroRNAs always function cooperatively in their regulation of gene expression. Dysfunctions of these co-functional microRNAs can play significant roles in disease development. We are interested in those multi-disease associated co-functional microRNAs that regulate their common dysfunctional target genes cooperatively in the development of multiple diseases. The research is potentially useful for human disease studies at the transcriptional level and for the study of multi-purpose microRNA therapeutics. We designed a computational method to detect multi-disease associated co-functional microRNA pairs and conducted cross disease analysis on a reconstructed disease-gene-microRNA (DGR) tripartite network. The DGR tripartite network is constructed by integrating newly predicted disease-microRNA associations with the relationships among diseases, microRNAs and genes maintained by existing databases. The prediction method uses a set of reliable negative samples of disease-microRNA association and a pre-computed kernel matrix instead of kernel functions. From this reconstructed DGR tripartite network, multi-disease associated co-functional microRNA pairs are detected together with their common dysfunctional target genes and ranked by a novel scoring method. We also conducted proof-of-concept case studies on cancer-related co-functional microRNA pairs as well as on non-cancer disease-related microRNA pairs. With the prioritization of the co-functional microRNAs that relate to a series of diseases, we found that the co-function phenomenon is not unusual. We also confirmed that the regulation of the microRNAs for the development of cancers is more complex and has more unique properties than that of non-cancer diseases.
Efficient Exploration of the Space of Reconciled Gene Trees

PubMed Central

Szöllősi, Gergely J.; Rosikiewicz, Wojciech; Boussau, Bastien; Tannier, Eric; Daubin, Vincent

2013-01-01

Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with 24%, 59%, and 46% reductions in the mean numbers of duplications, transfers, and losses per gene family, respectively.
The open source implementation of ALE is available from https://github.com/ssolo/ALE.git. [amalgamation; gene tree reconciliation; gene tree reconstruction; lateral gene transfer; phylogeny.] PMID:23925510
Single-shot T2 mapping using overlapping-echo detachment planar imaging and a deep convolutional neural network.

PubMed

Cai, Congbo; Wang, Chao; Zeng, Yiqing; Cai, Shuhui; Liang, Dong; Wu, Yawen; Chen, Zhong; Ding, Xinghao; Zhong, Jianhui

2018-04-24

An end-to-end deep convolutional neural network (CNN) based on deep residual network (ResNet) was proposed to efficiently reconstruct reliable T2 mapping from single-shot overlapping-echo detachment (OLED) planar imaging. The training dataset was obtained from simulations that were carried out on SPROM (Simulation with PRoduct Operator Matrix) software developed by our group. The relationship between the original OLED image containing two echo signals and the corresponding T2 mapping was learned by ResNet training. After the ResNet was trained, it was applied to reconstruct the T2 mapping from simulation and in vivo human brain data. Although the ResNet was trained entirely on simulated data, the trained network generalized well to real human brain data. The results from simulation and in vivo human brain experiments show that the proposed method significantly outperforms the echo-detachment-based method. Reliable T2 mapping with higher accuracy is achieved within 30 ms after the network has been trained, while the echo-detachment-based OLED reconstruction method took approximately 2 min. The proposed method will facilitate real-time dynamic and quantitative MR imaging via OLED sequence, and deep convolutional neural network has the potential to reconstruct maps from complex MRI sequences efficiently. © 2018 International Society for Magnetic Resonance in Medicine.
Sequence-based model of gap gene regulatory network.

PubMed

Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

2014-01-01

The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature.
We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3) functionally important sites are not exclusively located in cis-regulatory elements, but are rather dispersed throughout the regulatory region. It is of importance that some of the sites with high functional impact in hb, Kr and kni regulatory regions coincide with strong sites annotated and verified in DNase I footprint assays.
Context-specific metabolic networks are consistent with experiments.

PubMed

Becker, Scott A; Palsson, Bernhard O

2008-05-16

Reconstructions of cellular metabolism are publicly available for a variety of different microorganisms and some mammalian genomes. To date, these reconstructions are "genome-scale" and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence. Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type. Methods to tailor these comprehensive genome-scale reconstructions into context-specific networks will aid predictive in silico modeling for a particular situation. We present a method called Gene Inactivity Moderated by Metabolism and Expression (GIMME) to achieve this goal. The GIMME algorithm uses quantitative gene expression data and one or more presupposed metabolic objectives to produce the context-specific reconstruction that is most consistent with the available data. Furthermore, the algorithm provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective. We show that this algorithm produces results consistent with biological experiments and intuition for adaptive evolution of bacteria, rational design of metabolic engineering strains, and human skeletal muscle cells. This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available.
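The abstract describes an inconsistency score but gives no formula. The sketch below assumes one plausible form: each reaction whose associated expression level falls below a cutoff is penalized by the shortfall multiplied by the absolute flux it carries, and the score is the sum of those penalties. The cutoff, reaction identifiers, flux values, and expression values are hypothetical, and the sketch only evaluates the score for a given flux distribution rather than solving the underlying optimization problem.

```python
def inconsistency_score(fluxes, expression, cutoff):
    """Sum of (cutoff - expression) * |flux| over under-expressed reactions.

    This form is an assumption made for illustration. Reactions without
    expression data are not penalized in this sketch.
    """
    score = 0.0
    for reaction, flux in fluxes.items():
        level = expression.get(reaction)
        if level is None:
            continue
        shortfall = max(0.0, cutoff - level)
        score += shortfall * abs(flux)
    return score

# Hypothetical flux distribution and reaction-level expression values.
fluxes = {"R1": 10.0, "R2": -2.0, "R3": 0.0, "R4": 5.0}
expression = {"R1": 12.0, "R2": 3.0, "R3": 1.0}   # no data for R4
score = inconsistency_score(fluxes, expression, cutoff=5.0)
```

Here only `R2` contributes, because `R1` is expressed above the cutoff, `R3` carries no flux, and `R4` has no data. A lower score means the flux distribution relies less on reactions that the expression data suggest are inactive.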
Simulated maximum likelihood method for estimating kinetic rates in gene expression.

PubMed

Tian, Tianhai; Xu, Songlin; Gao, Junbin; Burrage, Kevin

2007-01-01

Kinetic rate in gene expression is a key measurement of the stability of gene products and gives important information for the reconstruction of genetic regulatory networks. Recent developments in experimental technologies have made it possible to measure the numbers of transcripts and protein molecules in single cells. Although estimation methods based on deterministic models have been proposed aimed at evaluating kinetic rates from experimental observations, these methods cannot tackle noise in gene expression that may arise from discrete processes of gene expression, small numbers of mRNA transcript, fluctuations in the activity of transcriptional factors and variability in the experimental environment. In this paper, we develop effective methods for estimating kinetic rates in genetic regulatory networks. The simulated maximum likelihood method is used to evaluate parameters in stochastic models described by either stochastic differential equations or discrete biochemical reactions. Different types of non-parametric density functions are used to measure the transitional probability of experimental observations. For stochastic models described by biochemical reactions, we propose to use the simulated frequency distribution to evaluate the transitional density based on the discrete nature of stochastic simulations. The genetic optimization algorithm is used as an efficient tool to search for optimal reaction rates. Numerical results indicate that the proposed methods can give robust estimations of kinetic rates with good accuracy.
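The simulated maximum likelihood approach relies on repeated stochastic simulation of the model to approximate transition densities. As an illustration of that simulation component only, the sketch below runs Gillespie's stochastic simulation algorithm for a single-species model with constant synthesis and first-order degradation. The model, the rate values, and the function name are hypothetical, and the likelihood evaluation and genetic optimization steps described in the abstract are not shown.

```python
import random

def gillespie_birth_death(k_syn, k_deg, x0, t_end, rng):
    """Simulate mRNA copy number with synthesis rate k_syn and
    per-molecule degradation rate k_deg until time t_end."""
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_syn = k_syn            # propensity of synthesis
        a_deg = k_deg * x        # propensity of degradation
        a_total = a_syn + a_deg
        if a_total == 0:
            break                # no reaction can occur
        t += rng.expovariate(a_total)   # waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_syn:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts

# Hypothetical rate constants; the seed only makes the run repeatable.
times, counts = gillespie_birth_death(
    k_syn=2.0, k_deg=0.1, x0=0, t_end=50.0, rng=random.Random(1)
)
```

Running many such trajectories from an observed state and recording where they end at the next observation time gives the simulated frequency distribution from which a transition probability can be estimated.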
Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data

PubMed Central

Zhang, Ai-Di; Dai, Shao-Xing; Huang, Jing-Fei

2013-01-01

With the advent of high-throughput data production, recent studies of tissue-specific metabolic networks have greatly advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for the kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. Furthermore, we found that the discrepancies in the metabolic processes of different tissues correspond directly to the tissues' functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information on the metabolism of the kidney and offer excellent insights into complex kidney diseases. PMID:24222897
The Convolutional Visual Network for Identification and Reconstruction of NOvA Events

DOE Office of Scientific and Technical Information (OSTI.GOV)

Psihas, Fernanda

In 2016 the NOvA experiment released results for the observation of oscillations in the νμ and νe channels as well as νe cross section measurements using neutrinos from Fermilab’s NuMI beam. These and other measurements in progress rely on the accurate identification and reconstruction of the neutrino flavor and energy recorded by our detectors. This presentation describes the first application of convolutional neural network technology for event identification and reconstruction in particle detectors like NOvA. The Convolutional Visual Network (CVN) Algorithm was developed for identification, categorization, and reconstruction of NOvA events. It increased the selection efficiency of the νe appearance signal by 40% and studies show potential impact to the νμ disappearance analysis.
Photoacoustic image reconstruction via deep learning

NASA Astrophysics Data System (ADS)

Antholzer, Stephan; Haltmeier, Markus; Nuster, Robert; Schwab, Johannes

2018-02-01

Applying standard algorithms to sparse data problems in photoacoustic tomography (PAT) yields low-quality images containing severe under-sampling artifacts. To some extent, these artifacts can be reduced by iterative image reconstruction algorithms, which allow prior knowledge such as smoothness, total variation (TV) or sparsity constraints to be included. These algorithms tend to be time consuming as the forward and adjoint problems have to be solved repeatedly. Further, iterative algorithms have additional drawbacks. For example, the reconstruction quality strongly depends on a-priori model assumptions about the objects to be recovered, which are often not strictly satisfied in practical applications. To overcome these issues, in this paper, we develop direct and efficient reconstruction algorithms based on deep learning. As opposed to iterative algorithms, we apply a convolutional neural network, whose parameters are trained before the reconstruction process based on a set of training data. For actual image reconstruction, a single evaluation of the trained network yields the desired result. Our presented numerical results (using two different network architectures) demonstrate that the proposed deep learning approach reconstructs images with a quality comparable to state-of-the-art iterative reconstruction methods.
A method of reconstructing the spatial measurement network by mobile measurement transmitter for shipbuilding

NASA Astrophysics Data System (ADS)

Guo, Siyang; Lin, Jiarui; Yang, Linghui; Ren, Yongjie; Guo, Yin

2017-07-01

The workshop Measurement Position System (wMPS) is a distributed measurement system that is suitable for large-scale metrology. However, there are some inevitable measurement problems in the shipbuilding industry, such as restriction by obstacles and limited measurement range. To deal with these factors, this paper presents a method of reconstructing the spatial measurement network by mobile transmitter. A high-precision coordinate control network with more than six target points is established. The mobile measuring transmitter can be added into the measurement network using this coordinate control network with the spatial resection method. This method reconstructs the measurement network and broadens the measurement scope efficiently. To verify this method, two comparison experiments are designed with the laser tracker as the reference. The results demonstrate that the accuracy of point-to-point length is better than 0.4 mm and the accuracy of coordinate measurement is better than 0.6 mm.
Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism

PubMed Central

Chang, Roger L; Ghamsari, Lila; Manichaikul, Ani; Hom, Erik F Y; Balaji, Santhanam; Fu, Weiqi; Shen, Yun; Hao, Tong; Palsson, Bernhard Ø; Salehi-Ashtiani, Kourosh; Papin, Jason A

2011-01-01

Metabolic network reconstruction encompasses existing knowledge about an organism's metabolism and genome annotation, providing a platform for omics data analysis and phenotype prediction. The model alga Chlamydomonas reinhardtii is employed to study diverse biological processes from photosynthesis to phototaxis. Recent heightened interest in this species results from an international movement to develop algal biofuels. Integrating biological and optical data, we reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology. PMID:21811229
Parallel Mutual Information Based Construction of Genome-Scale Networks on the Intel® Xeon Phi™ Coprocessor.

PubMed

Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas

2015-01-01

Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first-ever inference of a plant whole-genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.
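The combination of mutual information and permutation testing can be illustrated with a small serial sketch. It is not TINGe and makes no attempt at parallelism; it assumes a simple equal-width histogram estimator of mutual information (one of several possible estimators), and the function names, bin count, and toy expression profiles are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, n_bins):
    """Map values to equal-width bins numbered 0 .. n_bins - 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant profiles
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Histogram estimate of the mutual information (in nats) between two profiles."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Observed MI and the fraction of shuffled profiles scoring at least as high."""
    rng = random.Random(seed)
    observed = mutual_information(x, y)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any dependence between x and y
        if mutual_information(x, shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Hypothetical profiles for two genes across 40 experiments.
gene_a = [float(i) for i in range(40)]
gene_b = [2.0 * v + 1.0 for v in gene_a]  # perfectly dependent on gene_a
mi, p_value = permutation_p_value(gene_a, gene_b)
```

An edge between the two genes would be kept only if the p-value falls below a chosen significance threshold. At genome scale this test is repeated for every gene pair, which is the workload the abstract describes parallelizing.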
Passing messages between biological networks to refine predicted interactions.

PubMed

Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng

2013-01-01

Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.
Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA

PubMed Central

2017-01-01

Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository. PMID:28263984
Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

PubMed

Biggs, Matthew B; Papin, Jason A

2017-03-01

Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.

CoryneRegNet: an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks.

PubMed

Baumbach, Jan; Brinkrolf, Karina; Czaja, Lisa F; Rahmann, Sven; Tauch, Andreas

2006-02-14

The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.
Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

PubMed Central

2014-01-01

Background At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a σ-factor to recognize the genomic location at which the process initiates. Although the crucial role of σ-factors has long been appreciated and characterized for many individual promoters, we do not yet have a genome-scale assessment of their function. Results Using multiple genome-scale measurements, we elucidated the network of σ-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 σ-factor-specific promoters corresponding to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative σ-factors (the σ70 and σ38 regulons), confirming the competition model of σ substitution and negative regulation by alternative σ-factors. Comparison with σ-factor binding in Klebsiella pneumoniae showed that transcriptional regulation of conserved genes in closely related species is unexpectedly divergent. Conclusions The reconstructed network reveals the regulatory complexity of the promoter architecture in prokaryotic genomes, and opens a path to the direct determination of the systems biology of their transcriptional regulatory networks. PMID:24461193
Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets.

PubMed

Levering, Jennifer; Fiedler, Tomas; Sieg, Antje; van Grinsven, Koen W A; Hering, Silvio; Veith, Nadine; Olivier, Brett G; Klett, Lara; Hugenholtz, Jeroen; Teusink, Bas; Kreikemeyer, Bernd; Kummer, Ursula

2016-08-20

Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49. Initially, we based the reconstruction on genome annotations and already existing and curated metabolic networks of Bacillus subtilis, Escherichia coli, Lactobacillus plantarum and Lactococcus lactis. This initial draft was manually curated with the final reconstruction accounting for 480 genes associated with 576 reactions and 558 metabolites. In order to constrain the model further, we performed growth experiments of wild type and arcA deletion strains of S. pyogenes M49 in a chemically defined medium and calculated nutrient uptake and production fluxes. We additionally performed amino acid auxotrophy experiments to test the consistency of the model. The established genome-scale model can be used to understand the growth requirements of the human pathogen S. pyogenes and define optimal and suboptimal conditions, but also to describe differences and similarities between S. pyogenes and related lactic acid bacteria such as L. lactis in order to find strategies to reduce the growth of the pathogen and propose drug targets. Copyright © 2016 Elsevier B.V. All rights reserved.
HiDi: an efficient reverse engineering schema for large-scale dynamic regulatory network reconstruction using adaptive differentiation.

PubMed

Deng, Yue; Zenil, Hector; Tegnér, Jesper; Kiani, Narsis A

2017-12-15

The use of ordinary differential equations (ODEs) is one of the most promising approaches to network inference. The success of ODE-based approaches has, however, been limited, due to the difficulty in estimating parameters and by their lack of scalability. Here, we introduce a novel method and pipeline to reverse engineer gene regulatory networks from time-series and perturbation gene expression data based upon an improvement on the calculation scheme of the derivatives and a pre-filtration step to reduce the number of possible links. The method introduces a linear differential equation model with adaptive numerical differentiation that is scalable to extremely large regulatory networks. We demonstrate the ability of this method to outperform current state-of-the-art methods applied to experimental and synthetic data using test data from the DREAM4 and DREAM5 challenges. Our method displays greater accuracy and scalability. We benchmark the performance of the pipeline with respect to dataset size and levels of noise. We show that the computation time is linear over various network sizes. The Matlab code of the HiDi implementation is available at: www.complexitycalculator.com/HiDiScript.zip. Contact: hzenilc@gmail.com or narsis.kiani@ki.se. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Connectome sensitivity or specificity: which is more important?

PubMed

Zalesky, Andrew; Fornito, Alex; Cocchi, Luca; Gollo, Leonardo L; van den Heuvel, Martijn P; Breakspear, Michael

2016-11-15

Connectomes with high sensitivity and high specificity are unattainable with current axonal fiber reconstruction methods, particularly at the macro-scale afforded by magnetic resonance imaging. Tensor-guided deterministic tractography yields sparse connectomes that are incomplete and contain false negatives (FNs), whereas probabilistic methods steered by crossing-fiber models yield dense connectomes, often with low specificity due to false positives (FPs). Densely reconstructed probabilistic connectomes are typically thresholded to improve specificity at the cost of a reduction in sensitivity. What is the optimal tradeoff between connectome sensitivity and specificity? We show empirically and theoretically that specificity is paramount. Our evaluations of the impact of FPs and FNs on empirical connectomes indicate that specificity is at least twice as important as sensitivity when estimating key properties of brain networks, including topological measures of network clustering, network efficiency and network modularity. Our asymptotic analysis of small-world networks with idealized modular structure reveals that as the number of nodes grows, specificity becomes exactly twice as important as sensitivity to the estimation of the clustering coefficient. For the estimation of network efficiency, the relative importance of specificity grows linearly with the number of nodes. The greater importance of specificity is due to FPs occurring more prevalently between network modules rather than within them. These spurious inter-modular connections have a dramatic impact on network topology. We argue that efforts to maximize the sensitivity of connectome reconstruction should be realigned with the need to map brain networks with high specificity. Copyright © 2016 Elsevier Inc. All rights reserved.
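For readers less familiar with the two quantities being traded off, the sketch below computes sensitivity and specificity of a reconstructed undirected network against a reference network. It illustrates only the definitions, not the paper's analysis; the function names and the four-node toy networks are hypothetical.

```python
from itertools import combinations


def edge_set(edges):
    """Represent undirected edges as frozensets so (a, b) equals (b, a)."""
    return {frozenset(e) for e in edges}


def sensitivity_specificity(nodes, true_edges, predicted_edges):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP) over all node pairs."""
    truth, pred = edge_set(true_edges), edge_set(predicted_edges)
    all_pairs = {frozenset(p) for p in combinations(nodes, 2)}
    tp = len(truth & pred)              # true connections that were recovered
    fn = len(truth - pred)              # true connections that were missed
    fp = len(pred - truth)              # spurious connections
    tn = len(all_pairs - truth - pred)  # absent connections correctly left out
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical reference and reconstructed networks on four nodes.
nodes = ["A", "B", "C", "D"]
true_edges = [("A", "B"), ("B", "C"), ("C", "D")]
predicted_edges = [("B", "A"), ("B", "C"), ("A", "D")]
sens, spec = sensitivity_specificity(nodes, true_edges, predicted_edges)
```

Thresholding a dense probabilistic reconstruction removes predicted edges, which tends to lower FP (raising specificity) at the cost of more FN (lowering sensitivity).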
Version 6 of the consensus yeast metabolic network refines biochemical coverage and improves model performance

PubMed Central

Heavner, Benjamin D.; Smallbone, Kieran; Price, Nathan D.; Walker, Larry P.

2013-01-01

Updates to maintain a state-of-the-art reconstruction of the yeast metabolic network are essential to reflect our understanding of yeast metabolism and functional organization, to eliminate any inaccuracies identified in earlier iterations, to improve predictive accuracy and to continue to expand into novel subsystems to extend the comprehensiveness of the model. Here, we present version 6 of the consensus yeast metabolic network (Yeast 6) as an update to the community effort to computationally reconstruct the genome-scale metabolic network of Saccharomyces cerevisiae S288c. Yeast 6 comprises 1458 metabolites participating in 1888 reactions, which are annotated with 900 yeast genes encoding the catalyzing enzymes. Compared with Yeast 5, Yeast 6 demonstrates improved sensitivity, specificity and positive and negative predictive values for predicting gene essentiality in glucose-limited aerobic conditions when analyzed with flux balance analysis. Additionally, Yeast 6 improves the accuracy of predicting the likelihood that a mutation will cause auxotrophy. The network reconstruction is available as a Systems Biology Markup Language (SBML) file enriched with Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM)-compliant annotations. Small- and macromolecules in the network are referenced to authoritative databases such as UniProt or ChEBI. Molecules and reactions are also annotated with appropriate publications that contain supporting evidence. Yeast 6 is freely available at http://yeast.sf.net/ as three separate SBML files: a model using the SBML level 3 Flux Balance Constraint package, a model compatible with the MATLAB® COBRA Toolbox for backward compatibility and a reconstruction containing only reactions for which there is experimental evidence (without the non-biological reactions necessary for simulating growth). Database URL: http://yeast.sf.net/ PMID:23935056
PyPanda: a Python package for gene regulatory network reconstruction

PubMed Central

van IJzendoorn, David G.P.; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L.

2016-01-01

Summary: PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. Availability and implementation: The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl PMID:27402905
PyPanda: a Python package for gene regulatory network reconstruction.

PubMed

van IJzendoorn, David G P; Glass, Kimberly; Quackenbush, John; Kuijjer, Marieke L

2016-11-01

PANDA (Passing Attributes between Networks for Data Assimilation) is a gene regulatory network inference method that uses message-passing to integrate multiple sources of 'omics data. PANDA was originally coded in C++. In this application note we describe PyPanda, the Python version of PANDA. PyPanda runs considerably faster than the C++ version and includes additional features for network analysis. The open source PyPanda Python package is freely available at http://github.com/davidvi/pypanda. Contact: mkuijjer@jimmy.harvard.edu or d.g.p.van_ijzendoorn@lumc.nl. © The Author 2016. Published by Oxford University Press.
Enhancing biological relevance of a weighted gene co-expression network for functional module identification.

PubMed

Prom-On, Santitham; Chanthaphan, Atthawut; Chan, Jonathan Hoyin; Meechai, Asawin

2011-02-01

Relationships among gene expression levels may be associated with the mechanisms of a disease. While identifying a direct association, such as a difference in expression levels between case and control groups, links genes to disease mechanisms, uncovering an indirect association in the form of a network structure may help reveal the underlying functional module associated with the disease under scrutiny. This paper presents a method to improve the biological relevance of functional module identification from gene expression microarray data by enhancing the structure of a weighted gene co-expression network using a minimum spanning tree. The enhanced network, which is called a backbone network, contains only the essential structural information to represent the gene co-expression network. The entire backbone network is decoupled into a number of coherent sub-networks, and then the functional modules are reconstructed from these sub-networks to ensure minimum redundancy. The method was tested with a simulated gene expression dataset and case-control expression datasets of autism spectrum disorder and colorectal cancer studies. The results indicate that the proposed method can accurately identify clusters in the simulated dataset, and the functional modules of the backbone network are more biologically relevant than those obtained from the original approach.
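A backbone of this kind can be sketched with Kruskal's algorithm. The example assumes that edge weights are co-expression similarities in [0, 1] and that the tree is built on the distance 1 - weight, so the strongest co-expression links are kept; the abstract does not state the distance actually used, and the function name and toy network are hypothetical. The later steps (decoupling into sub-networks and rebuilding modules) are not shown.

```python
def mst_backbone(nodes, weighted_edges):
    """Kruskal's algorithm on distance = 1 - similarity; returns the kept edges."""
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    backbone = []
    for u, v, w in sorted(weighted_edges, key=lambda e: 1.0 - e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so it cannot form a cycle
            parent[root_u] = root_v
            backbone.append((u, v, w))
    return backbone


# Hypothetical weighted co-expression network on four genes.
genes = ["g1", "g2", "g3", "g4"]
coexpression = [
    ("g1", "g2", 0.9),
    ("g2", "g3", 0.8),
    ("g1", "g3", 0.3),
    ("g3", "g4", 0.7),
    ("g2", "g4", 0.2),
]
backbone = mst_backbone(genes, coexpression)
```

For a connected network on n genes the backbone has n - 1 edges, which is why it retains only the essential structure of the original co-expression network.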
Reverse Engineering Validation using a Benchmark Synthetic Gene Circuit in Human Cells

PubMed Central

Kang, Taek; White, Jacob T.; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas

2013-01-01

Multi-component biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network. PMID:23654266
Reverse engineering validation using a benchmark synthetic gene circuit in human cells.

PubMed

Kang, Taek; White, Jacob T; Xie, Zhen; Benenson, Yaakov; Sontag, Eduardo; Bleris, Leonidas

2013-05-17

Multicomponent biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network.
Systematic Evaluation of Molecular Networks for Discovery of Disease Genes.

PubMed

Huang, Justin K; Carlin, Daniel E; Yu, Michael Ku; Zhang, Wei; Kreisberg, Jason F; Tamayo, Pablo; Ideker, Trey

2018-04-25

Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research. Copyright © 2018 Elsevier Inc. All rights reserved.
Reconstruction of financial networks for robust estimation of systemic risk

NASA Astrophysics Data System (ADS)

Mastromatteo, Iacopo; Zarinelli, Elia; Marsili, Matteo

2012-03-01

In this paper we estimate the propagation of liquidity shocks through interbank markets when the information about the underlying credit network is incomplete. We show that techniques such as maximum entropy currently used to reconstruct credit networks severely underestimate the risk of contagion by assuming a trivial (fully connected) topology, a type of network structure which can be very different from the one empirically observed. We propose an efficient message-passing algorithm to explore the space of possible network structures and show that a correct estimation of the network degree of connectedness leads to more reliable estimations for systemic risk. Such an algorithm is also able to produce maximally fragile structures, providing a practical upper bound for the risk of contagion when the actual network structure is unknown. We test our algorithm on ensembles of synthetic data encoding some features of real financial networks (sparsity and heterogeneity), finding that more accurate estimations of risk can be achieved. Finally we find that this algorithm can be used to control the amount of information that regulators need to require from banks in order to sufficiently constrain the reconstruction of financial networks.
Constructing an integrated gene similarity network for the identification of disease genes.

PubMed

Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin

2017-09-20

Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are based mainly on protein-protein interaction (PPI) networks. However, since these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are very low. Therefore, it is highly necessary to fuse multiple genomic data to construct a credible gene similarity network and then infer disease genes on the whole genomic scale. We propose a novel method, named RWRB, to infer causal genes of diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is reconstructed based on the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB through leave-one-out cross-validation in inferring phenotype-gene relationships. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB benefits from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer's disease and predict some novel disease genes that are supported by the literature. RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/ .
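The random walk with restart step named in this abstract can be illustrated with a minimal Python sketch. This is not the RWRB implementation: it runs on a single weighted network rather than the phenotype-gene bilayer network, and the function name, the dictionary-based graph format and the restart probability of 0.7 are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> {neighbour: weight}; assumed symmetric,
               with every neighbour also present as a key (hypothetical format).
    seeds:     set of nodes where the walk starts and restarts.
    """
    nodes = sorted(adjacency)
    total_weight = {u: sum(adjacency[u].values()) for u in nodes}
    # Restart distribution: uniform over the seed nodes.
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed ...
        nxt = {u: restart * p0[u] for u in nodes}
        # ... otherwise it moves to a neighbour in proportion to edge weight.
        for u in nodes:
            if total_weight[u] == 0:
                continue  # isolated node: nothing to propagate
            share = (1.0 - restart) * p[u] / total_weight[u]
            for v, w in adjacency[u].items():
                nxt[v] += share * w
        change = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy path graph A - B - C - D with a single seed gene A (made-up data).
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(toy_network, {"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Candidate genes would then be ranked by their steady-state score; in the toy graph the scores fall off with distance from the seed.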
Reconstruction of metabolic pathways by combining probabilistic graphical model-based and knowledge-based methods

PubMed Central

2014-01-01

Automatic reconstruction of metabolic pathways for an organism from genomics and transcriptomics data has been a challenging and important problem in bioinformatics. Traditionally, known reference pathways can be mapped onto organism-specific ones based on genome annotation and protein homology. However, this simple knowledge-based mapping method might produce incomplete pathways and generally cannot predict unknown new relations and reactions. In contrast, ab initio metabolic network construction methods can predict novel reactions and interactions, but their accuracy tends to be low, leading to many false positives. Here we combine existing pathway knowledge and a new ab initio Bayesian probabilistic graphical model in a novel fashion to improve automatic reconstruction of metabolic networks. Specifically, we built a knowledge database containing known, individual gene / protein interactions and metabolic reactions extracted from existing reference pathways. Known reactions and interactions were then used as constraints for Bayesian network learning methods to predict metabolic pathways. Using individual reactions and interactions extracted from different pathways of many organisms to guide pathway construction is new and improves both the coverage and accuracy of metabolic pathway construction. We applied this probabilistic knowledge-based approach to construct the metabolic networks from yeast gene expression data and compared its results with 62 known metabolic networks in the KEGG database. The experiment showed that the method improved the coverage of metabolic network construction over the traditional reference pathway mapping method and was more accurate than pure ab initio methods. PMID:25374614
[Study on intersection and regulation mechanism of "efficacy-toxicity network" of aconite in combination environment of Sini decoction].

PubMed

Li, Zhi-yong; Bao, Hong-juan; Zhang, Shuo-feng; Ye, Tian-yuan; Yang, Ce; Li, Yan-wen

2015-02-01

To explore the intersection and regulation mechanism of "efficacy-toxicity network" of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata's action gene in the combination environment of Sini decoction with the network pharmacological method. The gene interaction networks of Aconiti Lateralis Radix Praeparata, Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma were mined and established with Cytoscape software and Agilent literature search plug-in. The "efficiency-toxicity network" intersection of Aconiti Lateralis Radix Praeparata was formed according to its effects in anti-heart failure, neurotoxicity and cardiotoxicity. The target genes were clustered with Clusterviz plug-in. And the possible pathways of the "efficacy-toxicity network" intersection of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata were forecasted in DAVID database. There were five genes related to neurotoxicity, cardiotoxicity and anti-heart failure function of Aconiti Lateralis Radix Praeparata, namely AKT1, BAX, HCC, IL6 and IL8, which formed 47 node genes in the "efficiency-toxicity network" intersection of Aconiti Lateralis Radix Praeparata. There were 29 and 27 coincident genes in the "efficiency-toxicity network" of Glycyrrhizae Radix et Rhizoma, Zingiberis Rhizoma and Aconiti Lateralis Radix Praeparata. There were 23 and 17 possible regulatory pathways. In the combination environment of Sini decoction, Glycyrrhizae Radix et Rhizoma and Zingiberis Rhizoma may regulate the efficiency-toxicity network of Aconiti Lateralis Radix Praeparata by influencing immune-inflammatory signaling pathway, apoptosis-autophagy signaling pathway, nerve cell and myocardial ischemia and hypoxia protection signaling pathways.
CoryneRegNet: An ontology-based data warehouse of corynebacterial transcription factors and regulatory networks

PubMed Central

Baumbach, Jan; Brinkrolf, Karina; Czaja, Lisa F; Rahmann, Sven; Tauch, Andreas

2006-01-01

Background The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. Description CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. Conclusion CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation. PMID:16478536
Machine Learning-Assisted Network Inference Approach to Identify a New Class of Genes that Coordinate the Functionality of Cancer Networks.

PubMed

Ghanat Bari, Mehrab; Ung, Choong Yong; Zhang, Cheng; Zhu, Shizhen; Li, Hu

2017-08-01

Emerging evidence indicates the existence of a new class of cancer genes that act as "signal linkers" coordinating oncogenic signals between mutated and differentially expressed genes. While frequently mutated oncogenes and differentially expressed genes, which we term Class I cancer genes, are readily detected by most analytical tools, the new class of cancer-related genes, i.e., Class II, escape detection because they are neither mutated nor differentially expressed. Given this hypothesis, we developed a Machine Learning-Assisted Network Inference (MALANI) algorithm, which assesses all genes regardless of expression or mutational status in the context of cancer etiology. We used 8807 expression arrays, corresponding to 9 cancer types, to build more than 2 × 10^8 Support Vector Machine (SVM) models for reconstructing a cancer network. We found that ~3% of ~19,000 not differentially expressed genes are Class II cancer gene candidates. Some Class II genes that we found, such as SLC19A1 and ATAD3B, have been recently reported to associate with cancer outcomes. To our knowledge, this is the first study that utilizes both machine learning and network biology approaches to uncover Class II cancer genes in coordinating functionality in cancer networks, and it will improve our understanding of how genes modulated in a tissue-specific network contribute to tumorigenesis and therapy development.
Thermodynamic Constraints Improve Metabolic Networks.

PubMed

Krumholz, Elias W; Libourel, Igor G L

2017-08-08

In pursuit of establishing a realistic metabolic phenotypic space, the reversibility of reactions is thermodynamically constrained in modern metabolic networks. The reversibility constraints follow from heuristic thermodynamic poise approximations that take anticipated cellular metabolite concentration ranges into account. Because constraints reduce the feasible space, draft metabolic network reconstructions may need more extensive reconciliation, and a larger number of genes may become essential. Notwithstanding ubiquitous application, the effect of reversibility constraints on the predictive capabilities of metabolic networks has not been investigated in detail. Instead, work has focused on the implementation and validation of the thermodynamic poise calculation itself. With the advance of fast linear programming-based network reconciliation, studying the effects of reversibility constraints on network reconciliation and gene essentiality predictions has become feasible and is the subject of this study. Networks with thermodynamically informed reversibility constraints gave better gene essentiality predictions than networks that were constrained with randomly shuffled constraints. Unconstrained networks predicted gene essentiality as accurately as thermodynamically constrained networks, but predicted substantially fewer essential genes. Networks that were reconciled with sequence similarity data and strongly enforced reversibility constraints outperformed all other networks. We conclude that metabolic network analysis confirmed the validity of the thermodynamic constraints, and that thermodynamic poise information is actionable during network reconciliation. Copyright © 2017 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets

PubMed Central

Kotera, Masaaki; Tabei, Yasuo; Yamanishi, Yoshihiro; Tokimatsu, Toshiaki; Goto, Susumu

2013-01-01

Motivation: The metabolic pathway is an important biochemical reaction network involving enzymatic reactions among chemical compounds. However, it is assumed that a large number of metabolic pathways remain unknown, and many reactions are still missing even in known pathways. Therefore, the most important challenge in metabolomics is the automated de novo reconstruction of metabolic pathways, which includes the elucidation of previously unknown reactions to bridge the metabolic gaps. Results: In this article, we develop a novel method to reconstruct metabolic pathways from a large compound set in the reaction-filling framework. We define feature vectors representing the chemical transformation patterns of compound–compound pairs in enzymatic reactions using chemical fingerprints. We apply a sparsity-induced classifier to learn what we refer to as ‘enzymatic-reaction likeness’, i.e. whether compound pairs are possibly converted to each other by enzymatic reactions. The originality of our method lies in the search for potential reactions among many compounds at a time, in the extraction of reaction-related chemical transformation patterns and in the large-scale applicability owing to the computational efficiency. In the results, we demonstrate the usefulness of our proposed method on the de novo reconstruction of 134 metabolic pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG). Our comprehensively predicted reaction networks of 15 698 compounds enable us to suggest many potential pathways and to increase research productivity in metabolomics. Availability: Software is available on request. Supplementary material is available at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2013/. Contact: goto@kuicr.kyoto-u.ac.jp PMID:23812977

PlantPAN 2.0: an update of plant promoter analysis navigator for reconstructing transcriptional regulatory networks in plants.

PubMed

Chow, Chi-Nga; Zheng, Han-Qin; Wu, Nai-Yun; Chien, Chia-Hung; Huang, Hsien-Da; Lee, Tzong-Yi; Chiang-Hsieh, Yi-Fan; Hou, Ping-Fu; Yang, Tien-Yi; Chang, Wen-Chi

2016-01-04

Transcription factors (TFs) are sequence-specific DNA-binding proteins acting as critical regulators of gene expression. The Plant Promoter Analysis Navigator (PlantPAN; http://PlantPAN2.itps.ncku.edu.tw) provides an informative resource for detecting transcription factor binding sites (TFBSs), corresponding TFs, and other important regulatory elements (CpG islands and tandem repeats) in a promoter or a set of plant promoters. Additionally, TFBSs, CpG islands, and tandem repeats in the conserved regions between similar gene promoters are also identified. The current PlantPAN release (version 2.0) contains 16 960 TFs and 1143 TF binding site matrices among 76 plant species. In addition to updating the annotation information, adding experimentally verified TF matrices, and making improvements in the visualization of transcriptional regulatory networks, several new features and functions are incorporated. These features include: (i) comprehensive curation of TF information (response conditions, target genes, and sequence logos of binding motifs, etc.), (ii) co-expression profiles of TFs and their target genes under various conditions, (iii) protein-protein interactions among TFs and their co-factors, (iv) TF-target networks, and (v) downstream promoter elements. Furthermore, a dynamic transcriptional regulatory network under various conditions is provided in PlantPAN 2.0. PlantPAN 2.0 is a systematic platform for plant promoter analysis and reconstructing transcriptional regulatory networks. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Passing Messages between Biological Networks to Refine Predicted Interactions

PubMed Central

Glass, Kimberly; Huttenhower, Curtis; Quackenbush, John; Yuan, Guo-Cheng

2013-01-01

Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net. PMID:23741402
Integration of Steady-State and Temporal Gene Expression Data for the Inference of Gene Regulatory Networks

PubMed Central

Wang, Yi Kan; Hurley, Daniel G.; Schnell, Santiago; Print, Cristin G.; Crampin, Edmund J.

2013-01-01

We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data. PMID:23967277
Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models

PubMed Central

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.

2014-01-01

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. 
This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface. PMID:25329157
Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

PubMed

Zhou, Xionghui; Liu, Juan

2014-01-01

Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying a phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed the gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been supported by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those of classification models with published signatures.
In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for phenotypic change.
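The conditional mutual information used in this abstract to define a gene dependency pair (A,B) can be written, for discrete variables, as I(A; P | B) = Σ p(a,p,b) log [ p(b) p(a,p,b) / ( p(a,b) p(p,b) ) ], where P is the phenotype. The Python sketch below is a plug-in estimate from counts. It assumes expression values have already been discretised (for example into low/high), which the abstract does not specify, and the function name and toy data are hypothetical; the authors' estimator and significance test may well differ.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for equal-length discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]; the 1/n factors cancel.
        ratio = n_z[zi] * count / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (count / n) * log2(ratio)
    return cmi


# Made-up example: the phenotype equals gene_a XOR gene_b, so gene_a alone says
# nothing about the phenotype, but it is fully informative once gene_b is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
dependency = conditional_mutual_information(gene_a, phenotype, gene_b)
```

In this toy case the influence of gene A on the phenotype is visible only when conditioning on gene B, which is the situation the dependency pair (A,B) is meant to capture.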
Novel candidate genes important for asthma and hypertension comorbidity revealed from associative gene networks.

PubMed

Saik, Olga V; Demenkov, Pavel S; Ivanisenko, Timofey V; Bragina, Elena Yu; Freidin, Maxim B; Goncharova, Irina A; Dosenko, Victor E; Zolotareva, Olga I; Hofestaedt, Ralf; Lavrik, Inna N; Rogaev, Evgeny I; Ivanisenko, Vladimir A

2018-02-13

Hypertension and bronchial asthma are major public health issues. As of 2014, approximately one billion adults, or ~22% of the world population, have had hypertension. As of 2011, 235-330 million people globally have been affected by asthma and approximately 250,000-345,000 people have died each year from the disease. The development of effective therapies against these diseases is complicated by their comorbidity, which is often a major problem in their diagnosis and treatment. Hence, in this study a bioinformatics methodology for the analysis of the comorbidity of these two diseases has been developed. As such, the search for candidate genes related to the comorbid conditions of asthma and hypertension can help in elucidating the molecular mechanisms underlying the comorbid condition of these two diseases, and can also be useful for genotyping and identifying new drug targets. Using ANDSystem, the reconstruction and analysis of gene networks associated with asthma and hypertension was carried out. The gene network of asthma included 755 genes/proteins and 62,603 interactions, while the gene network of hypertension included 713 genes/proteins and 45,479 interactions. Two hundred and five genes/proteins and 9638 interactions were shared between asthma and hypertension. An approach for ranking genes implicated in the comorbid condition of two diseases was proposed. The approach is based on nine criteria for ranking genes by their importance, including standard methods of gene prioritization (Endeavor, ToppGene) as well as original criteria that take into account the characteristics of an associative gene network and the presence of known polymorphisms in the analysed genes. According to the proposed approach, the genes IL10, TLR4, and CAT had the highest priority in the development of comorbidity of these two diseases.
Additionally, it was revealed that the list of top genes is enriched with apoptotic genes and genes involved in biological processes related to the functioning of the central nervous system. The application of methods of reconstruction and analysis of gene networks is a productive tool for studying the molecular mechanisms of comorbid conditions. The proposed method for ranking genes by their importance to the comorbid condition of asthma and hypertension predicted 10 genes that play a key role in the development of the comorbid condition. The results can be utilised to plan experiments for identification of novel candidate genes along with searching for novel pharmacological targets.
P-Finder: Reconstruction of Signaling Networks from Protein-Protein Interactions and GO Annotations.

PubMed

Young-Rae Cho; Yanan Xin; Speegle, Greg

2015-01-01

Because most complex genetic diseases are caused by defects of cell signaling, illuminating a signaling cascade is essential for understanding their mechanisms. We present three novel computational algorithms to reconstruct signaling networks between a starting protein and an ending protein using genome-wide protein-protein interaction (PPI) networks and gene ontology (GO) annotation data. A signaling network is represented as a directed acyclic graph in a merged form of multiple linear pathways. An advanced semantic similarity metric is applied for weighting PPIs as the preprocessing of all three methods. The first algorithm repeatedly extends the list of nodes based on path frequency towards an ending protein. The second algorithm repeatedly appends edges based on the occurrence of network motifs which indicate the link patterns more frequently appearing in a PPI network than in a random graph. The last algorithm uses the information propagation technique which iteratively updates edge orientations based on the path strength and merges the selected directed edges. Our experimental results demonstrate that the proposed algorithms achieve higher accuracy than previous methods when they are tested on well-studied pathways of S. cerevisiae. Furthermore, we introduce an interactive web application tool, called P-Finder, to visualize reconstructed signaling networks.
Phylomemetics—Evolutionary Analysis beyond the Gene

PubMed Central

Howe, Christopher J.; Windram, Heather F.

2011-01-01

Genes are propagated by error-prone copying, and the resulting variation provides the basis for phylogenetic reconstruction of evolutionary relationships. Horizontal gene transfer may be superimposed on a tree-like evolutionary pattern, with some relationships better depicted as networks. The copying of manuscripts by scribes is very similar to the replication of genes, and phylogenetic inference programs can be used directly for reconstructing the copying history of different versions of a manuscript text. Phylogenetic methods have also been used for some time to analyse the evolution of languages and the development of physical cultural artefacts. These studies can help to answer a range of anthropological questions. We propose the adoption of the term “phylomemetics” for phylogenetic analysis of reproducing non-genetic elements. PMID:21655311
Molecular mechanisms of system responses to novel stimuli are predictable from public data

PubMed Central

Danziger, Samuel A.; Ratushny, Alexander V.; Smith, Jennifer J.; Saleem, Ramsey A.; Wan, Yakun; Arens, Christina E.; Armstrong, Abraham M.; Sitko, Katherine; Chen, Wei-Ming; Chiang, Jung-Hsien; Reiss, David J.; Baliga, Nitin S.; Aitchison, John D.

2014-01-01

Systems scale models provide the foundation for an effective iterative cycle between hypothesis generation, experiment and model refinement. Such models also enable predictions facilitating the understanding of biological complexity and the control of biological systems. Here, we demonstrate the reconstruction of a globally predictive gene regulatory model from public data: a model that can drive rational experiment design and reveal new regulatory mechanisms underlying responses to novel environments. Specifically, using ∼1500 publicly available genome-wide transcriptome data sets from Saccharomyces cerevisiae, we have reconstructed an environment and gene regulatory influence network that accurately predicts regulatory mechanisms and gene expression changes on exposure of cells to completely novel environments. Focusing on transcriptional networks that induce peroxisome biogenesis, the model-guided experiments allow us to expand a core regulatory network to include novel transcriptional influences and linkage across signaling and transcription. Thus, the approach and model provide a multi-scalar picture of gene dynamics and are powerful resources for exploiting extant data to rationally guide experimentation. The techniques outlined here are generally applicable to any biological system, which is especially important when experimental systems are challenging and samples are difficult and expensive to obtain—a common problem in laboratory animal and human studies. PMID:24185701
Construction and comparison of gene co-expression networks shows complex plant immune responses

PubMed Central

López, Camilo; López-Kleine, Liliana

2014-01-01

Gene co-expression networks (GCNs) are graphic representations that depict the coordinated transcription of genes in response to certain stimuli. GCNs provide functional annotations of genes whose function is unknown and are further used in studies of translational functional genomics among species. In this work, a methodology for the reconstruction and comparison of GCNs is presented. This approach was applied using gene expression data that were obtained from immunity experiments in Arabidopsis thaliana, rice, soybean, tomato and cassava. After the evaluation of diverse similarity metrics for the GCN reconstruction, we recommended the mutual information coefficient measurement and a clustering coefficient-based method for similarity threshold selection. To compare GCNs, we proposed a multivariate approach based on the Principal Component Analysis (PCA). Branches of plant immunity that were exemplified by each experiment were analyzed in conjunction with the PCA results, suggesting both the robustness and the dynamic nature of the cellular responses. The dynamics of molecular plant responses produced networks with different characteristics that are differentiable using our methodology. The comparison of GCNs from plant pathosystems showed that, in response to similar pathogens, plants could activate conserved signaling pathways. The results confirmed that the closeness of GCNs projected on the principal component space is indicative of similarity among GCNs. This can also be used to understand global patterns of events triggered during plant immune responses. PMID:25320678
Reconstructing gene regulatory networks from knock-out data using Gaussian Noise Model and Pearson Correlation Coefficient.

PubMed

Mohamed Salleh, Faridah Hani; Arif, Shereena Mohd; Zainudin, Suhaila; Firdaus-Raih, Mohd

2015-12-01

A gene regulatory network (GRN) is a large and complex network consisting of interacting elements that, over time, affect each other's state. The dynamics of complex gene regulatory processes are difficult to understand using intuitive approaches alone. To overcome this problem, we propose an algorithm for inferring the regulatory interactions from knock-out data using a Gaussian model combined with the Pearson Correlation Coefficient (PCC). There are several problems relating to GRN construction that have been outlined in this paper. We demonstrated the ability of our proposed method to predict (1) the presence of regulatory interactions between genes, (2) their directionality and (3) their states (activation or suppression). The algorithm was applied to network sizes of 10 and 50 genes from DREAM3 datasets and network sizes of 10 from DREAM4 datasets. The predicted networks were evaluated based on AUROC and AUPR. We discovered that high false positive values were generated by our GRN prediction methods because the indirect regulations have been wrongly predicted as true relationships. We achieved satisfactory results as the majority of sub-networks achieved AUROC values above 0.5. Copyright © 2015 Elsevier Ltd. All rights reserved.
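The abstract above names the Pearson Correlation Coefficient as one ingredient of the method. The following is only a minimal sketch of that one ingredient, not the authors' algorithm (their Gaussian noise model and handling of knock-out data are not reproduced here). The gene names and expression values are made up for illustration, and the helper names `pearson` and `rank_edges` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def rank_edges(expr):
    """Rank all gene pairs by |PCC|, strongest candidate edge first."""
    scored = [
        (abs(pearson(expr[a], expr[b])), a, b)
        for a, b in combinations(sorted(expr), 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy expression profiles (one value per experiment).
toy = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 1.0, 3.0, 2.0],
}
ranked = rank_edges(toy)
```

A plain correlation ranking like this cannot by itself separate direct from indirect regulation, which is consistent with the false positives the abstract reports.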
Reconstructing the regulatory network controlling commitment and sporulation in Physarum polycephalum based on hierarchical Petri Net modelling and simulation.

PubMed

Marwan, Wolfgang; Sujatha, Arumugam; Starostzik, Christine

2005-10-21

We reconstruct the regulatory network controlling commitment and sporulation of Physarum polycephalum from experimental results using a hierarchical Petri Net-based modelling and simulation framework. The stochastic Petri Net consistently describes the structure and simulates the dynamics of the molecular network as analysed by genetic, biochemical and physiological experiments within a single coherent model. The Petri Net then is extended to simulate time-resolved somatic complementation experiments performed by mixing the cytoplasms of mutants altered in the sporulation response, to systematically explore the network structure and to probe its dynamics. This reverse engineering approach presumably can be employed to explore other molecular or genetic signalling systems where the activity of genes or their products can be experimentally controlled in a time-resolved manner.
RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

PubMed

Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

2013-11-01

Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons is available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes.
RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.
Efficient content-based low-altitude images correlated network and strips reconstruction

NASA Astrophysics Data System (ADS)

He, Haiqing; You, Qi; Chen, Xiaoyong

2017-01-01

Manual intervention is widely used to reconstruct strips for further aerial triangulation in low-altitude photogrammetry. Clearly, such a method is not suitable for fully automatic photogrammetric data processing. In this paper, we explore a content-based approach to strip reconstruction that requires neither manual intervention nor external information. Feature descriptors of the local spatial patterns are extracted by SIFT to construct a vocabulary tree, in which these features are encoded with the TF-IDF numerical statistic to generate a new representation for each low-altitude image. Then the image correlation network is reconstructed by similarity measure, image matching and geometric graph theory. Finally, strips are reconstructed automatically by tracing straight lines and gradually growing adjacent images. Experimental results show that the proposed approach is highly effective in automatically rearranging strips of low-altitude images and can provide rough relative orientation for further aerial triangulation.
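The TF-IDF encoding mentioned in the abstract above can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' pipeline: each image is assumed to be already reduced to a list of integer "visual word" ids (in the paper these would come from SIFT descriptors quantized by a vocabulary tree), and the names `tfidf_vectors`, `cosine`, and the example images are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def tfidf_vectors(images):
    """Map each image (a list of visual-word ids) to a sparse TF-IDF vector."""
    n_images = len(images)
    doc_freq = Counter()
    for words in images.values():
        doc_freq.update(set(words))  # count each word once per image
    vectors = {}
    for name, words in images.items():
        counts = Counter(words)
        total = len(words)
        vectors[name] = {
            w: (c / total) * log(n_images / doc_freq[w])
            for w, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical visual-word ids; word 3 occurs in every image, so its IDF is 0.
images = {
    "img_a": [1, 1, 2, 3],
    "img_b": [1, 2, 2, 3],
    "img_c": [7, 8, 9, 3],
}
vecs = tfidf_vectors(images)
sim_ab = cosine(vecs["img_a"], vecs["img_b"])
sim_ac = cosine(vecs["img_a"], vecs["img_c"])
```

Words shared by every image contribute nothing to the similarity, so overlapping images score high only when they share distinctive features, which is the property a correlation network between images needs.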
Reconstruction and visualization of carbohydrate, N-glycosylation pathways in Pichia pastoris CBS7435 using computational and system biology approaches.

PubMed

Srivastava, Akriti; Somvanshi, Pallavi; Mishra, Bhartendu Nath

2013-06-01

Pichia pastoris is an efficient expression system for production of recombinant proteins. To understand its physiology for building novel applications it is important to understand and reconstruct its metabolic network. The metabolic reconstruction approach connects genotype with phenotype. Here, we have attempted to reconstruct carbohydrate metabolism pathways responsible for high biomass density and N-glycosylation pathways involved in the post translational modification of proteins of P. pastoris CBS7435. Both these metabolic pathways play a crucial role in heterologous protein production. We report novel, missing and unannotated enzymes involved in the target metabolic pathways. A strong possibility of cellulose and xylose metabolic processes in P. pastoris CBS7435 suggests its use in the area of biofuels. The reconstructed metabolic networks can be used for increased yields and improved product quality, for designing appropriate growth medium, for production of recombinant therapeutics and for making biofuels.
Metabolism and evolution: A comparative study of reconstructed genome-level metabolic networks

NASA Astrophysics Data System (ADS)

Almaas, Eivind

2008-03-01

The availability of high-quality annotations of sequenced genomes has made it possible to generate organism-specific comprehensive maps of cellular metabolism. Currently, more than twenty such metabolic reconstructions are publicly available, with the majority focused on bacteria. A typical metabolic reconstruction for a bacterium results in a complex network containing hundreds of metabolites (nodes) and reactions (links), while some even contain more than a thousand. The constraint-based optimization approach of flux-balance analysis (FBA) is used to investigate the functional characteristics of such large-scale metabolic networks, making it possible to estimate an organism's growth behavior in a wide variety of nutrient environments, as well as its robustness to gene loss. We have recently completed the genome-level metabolic reconstruction of Yersinia pseudotuberculosis, as well as the three Yersinia pestis biovars Antiqua, Mediaevalis, and Orientalis. While Y. pseudotuberculosis typically only causes fever and abdominal pain that can mimic appendicitis, the evolutionarily closely related Y. pestis strains are the aetiological agents of the bubonic plague. In this presentation, I will discuss our results and conclusions from a comparative study on the evolution of metabolic function in the four Yersiniae networks using FBA and related techniques, and I will give particular focus to the interplay between metabolic network topology and evolutionary flexibility.
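For readers unfamiliar with FBA, the optimization it refers to is usually written as a linear program. The statement below is the standard textbook form, not something given in the abstract; the symbols S (stoichiometric matrix), v (flux vector), c (objective weights, typically selecting a biomass reaction) and the bounds are notation assumed here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\mathsf{T}} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}.
\end{aligned}
```

The equality constraint imposes steady state on every internal metabolite. A nutrient environment can be represented through the bounds on exchange fluxes, and a gene loss by fixing the bounds of the affected reactions to zero.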
NetBenchmark: a bioconductor package for reproducible benchmarks of gene regulatory network inference.

PubMed

Bellot, Pau; Olsen, Catharina; Salembier, Philippe; Oliveras-Vergés, Albert; Meyer, Patrick E

2015-09-29

In the last decade, a great number of methods for reconstructing gene regulatory networks from expression data have been proposed. However, very few tools and datasets allow those methods to be evaluated accurately and reproducibly. Hence, we propose here a new tool, able to perform a systematic, yet fully reproducible, evaluation of transcriptional network inference methods. Our open-source and freely available Bioconductor package aggregates a large set of tools to assess the robustness of network inference algorithms against different simulators, topologies, sample sizes and noise intensities. The benchmarking framework, which uses various datasets, highlights the specialization of some methods toward particular network types and data. As a result, it is possible to identify the techniques that have broad overall performance.
Wisdom of crowds for robust gene network inference

PubMed Central

Marbach, Daniel; Costello, James C.; Küffner, Robert; Vega, Nicci; Prill, Robert J.; Camacho, Diogo M.; Allison, Kyle R.; Kellis, Manolis; Collins, James J.; Stolovitzky, Gustavo

2012-01-01

Reconstructing gene regulatory networks from high-throughput data is a long-standing problem. Through the DREAM project (Dialogue on Reverse Engineering Assessment and Methods), we performed a comprehensive blind assessment of over thirty network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae, and in silico microarray data. We characterize performance, data requirements, and inherent biases of different inference approaches offering guidelines for both algorithm application and development. We observe that no single inference method performs optimally across all datasets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse datasets. Thereby, we construct high-confidence networks for E. coli and S. aureus, each comprising ~1700 transcriptional interactions at an estimated precision of 50%. We experimentally test 53 novel interactions in E. coli, of which 23 were supported (43%). Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks. PMID:22796662
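The integration of predictions from multiple inference methods described above can be illustrated with a simple rank-averaging scheme. The abstract does not spell out the integration rule, so averaging ranks is an assumption made here for illustration, and the function name `average_rank`, the edge labels, and the confidence scores are all hypothetical. Ties within one method are broken arbitrarily (by edge name) to keep the sketch short.

```python
def average_rank(predictions):
    """Combine several edge-score dicts into one consensus ranking.

    predictions: list of dicts mapping an edge (regulator, target) to a
    confidence score, one dict per inference method. An edge missing
    from a method is ranked last for that method.
    Returns a list of (mean_rank, edge), best (lowest mean rank) first.
    """
    edges = sorted(set().union(*predictions))
    totals = dict.fromkeys(edges, 0.0)
    for scores in predictions:
        ordered = sorted(
            edges,
            key=lambda e: scores.get(e, float("-inf")),
            reverse=True,
        )
        for rank, edge in enumerate(ordered, start=1):
            totals[edge] += rank
    n_methods = len(predictions)
    return sorted((totals[e] / n_methods, e) for e in edges)


# Hypothetical confidence scores from three inference methods.
method_1 = {("tfA", "g1"): 0.9, ("tfA", "g2"): 0.2, ("tfB", "g1"): 0.5}
method_2 = {("tfA", "g1"): 0.7, ("tfA", "g2"): 0.6, ("tfB", "g1"): 0.1}
method_3 = {("tfA", "g1"): 0.3, ("tfA", "g2"): 0.1, ("tfB", "g1"): 0.8}

consensus = average_rank([method_1, method_2, method_3])
```

Working on ranks rather than raw scores avoids having to calibrate confidence values that different methods report on incomparable scales.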
Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data.

PubMed

Gong, Wuming; Koyano-Nakagawa, Naoko; Li, Tongbin; Garry, Daniel J

2015-03-07

Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoters or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10^-100), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately -9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (-9435 to -8922 bp). 
TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation.
Evaluation of the skin irritation using a DNA microarray on a reconstructed human epidermal model.

PubMed

Niwa, Makoto; Nagai, Kanji; Oike, Hideaki; Kobori, Masuko

2009-02-01

To avoid the need to use animals to test the skin irritancy potential of chemicals and cosmetics, it is important to establish an in vitro method based on the reconstructed human epidermal model. To evaluate skin irritancy efficiently and sensitively, we determined the gene expression induced by a topically-applied mild irritant sodium dodecyl sulfate (SDS) in a reconstructed human epidermal model LabCyte EPI-MODEL (LabCyte) using a DNA microarray carrying genes that were related to inflammation, immunity, stress and housekeeping. The expression and secretion of IL-1alpha in reconstructed human epidermal culture is known to be induced by irritation. We detected the induction of IL-1alpha expression and its secretion into the cell culture medium by treatment with 0.075% SDS for 18 h in LabCyte culture using DNA microarray, quantitative reverse-transcription polymerase chain reaction (RT-PCR) and ELISA. DNA microarray analysis indicated that the expression of 10 of the 205 genes carried on the DNA microarray was significantly induced in a LabCyte culture by 0.05% or 0.075% SDS irritation for 18 h. RT-PCR analysis confirmed that SDS treatment significantly induced the expressions of interleukin-1 receptor antagonist (IL-1RN), FOS-like antigen 1 (FOSL1), heat shock 70 kDa protein 1A (HSPA1) and myeloid differentiation primary response gene (88) (MYD88), as well as the known marker genes for irritation IL-1beta and IL-8 in a LabCyte culture. Our results showed that a DNA microarray is a useful tool for efficiently evaluating mild skin irritation using a reconstructed human epidermal model.

Reconstruction of stochastic temporal networks through diffusive arrival times

PubMed Central

Li, Xun; Li, Xiang

2017-01-01

Temporal networks have opened a new dimension in the definition and quantification of complex interacting systems. Our ability to identify and reproduce time-resolved interaction patterns is, however, limited by the restricted access to empirical individual-level data. Here we propose an inverse modelling method based on first-arrival observations of the diffusion process taking place on temporal networks. We describe an efficient coordinate-ascent implementation for inferring stochastic temporal networks that builds in particular, but not exclusively, on the null model assumption of mutually independent interaction sequences at the dyadic level. The results of benchmark tests applied to both synthesized and empirical network data sets confirm the validity of our algorithm, showing the feasibility of statistically accurate inference of temporal networks from only moderate-sized samples of diffusion cascades. Our approach provides an effective and flexible scheme for the temporally augmented inverse problems of network reconstruction and has potential in a broad variety of applications. PMID:28604687
Identification and Analyses of AUX-IAA target genes controlling multiple pathways in developing fiber cells of Gossypium hirsutum L

PubMed Central

Nigam, Deepti; Sawant, Samir V

2013-01-01

Technological development led to an increased interest in systems biological approaches in plants to characterize developmental mechanisms and candidate genes relevant to specific tissue or cell morphology. AUX-IAA proteins are important plant-specific putative transcription factors. There are several reports on the physiological response of this family in Arabidopsis, but in cotton fiber the transcriptional network through which AUX-IAA regulates its target genes is still unknown. In silico modelling of cotton fiber development-specific gene expression data (108 microarrays and 22,737 genes) using the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) reveals 3690 putative AUX-IAA target genes, of which 139 genes were known to be AUX-IAA co-regulated within Arabidopsis. Further, the AUX-IAA targeted gene regulatory network (GRN) had a substantial impact on the transcriptional dynamics of cotton fiber, as shown by altered TF networks, Gene Ontology (GO) biological processes and metabolic pathways associated with its target genes. Analysis of the AUX-IAA-correlated gene network reveals multiple functions for AUX-IAA target genes such as unidimensional cell growth, cellular nitrogen compound metabolic process, nucleosome organization, DNA-protein complex and processes related to the cell wall. These candidate networks/pathways have a variety of profound impacts on such cellular functions as stress response, cell proliferation, and cell differentiation. While these functions are fairly broad, their underlying TF networks may provide a global view of AUX-IAA regulated gene expression and a GRN that guides future studies in understanding the role of AUX-IAA box protein and its targets regulating fiber development. PMID:24497725
Genome scale metabolic reconstruction of Chlorella variabilis for exploring its metabolic potential for biofuels.

PubMed

Juneja, Ankita; Chaplen, Frank W R; Murthy, Ganti S

2016-08-01

A compartmentalized genome scale metabolic network was reconstructed for Chlorella variabilis to offer insight into various metabolic potentials from this alga. The model, iAJ526, was reconstructed with 1455 reactions, 1236 metabolites and 526 genes. 21% of the reactions were transport reactions and about 81% of the total reactions were associated with enzymes. Along with gap filling reactions, 2 major sub-pathways were added to the model, chitosan synthesis and rhamnose metabolism. The reconstructed model had reaction participation of 4.3 metabolites per reaction and average lethality fraction of 0.21. The model was effective in capturing the growth of C. variabilis under three light conditions (white, red and red+blue light) with fair agreement. This reconstructed metabolic network will serve an important role in systems biology for further exploration of metabolism for specific target metabolites and enable improved characteristics in the strain through metabolic engineering. Copyright © 2016 Elsevier Ltd. All rights reserved.
A reproducible approach to high-throughput biological data acquisition and integration

PubMed Central

Rahnavard, Gholamali; Waldron, Levi; McIver, Lauren; Shafquat, Afrah; Franzosa, Eric A.; Miropolsky, Larissa; Sweeney, Christopher

2015-01-01

Modern biological research requires rapid, complex, and reproducible integration of multiple experimental results generated both internally and externally (e.g., from public repositories). Although large systematic meta-analyses are among the most effective approaches both for clinical biomarker discovery and for computational inference of biomolecular mechanisms, identifying, acquiring, and integrating relevant experimental results from multiple sources for a given study can be time-consuming and error-prone. To enable efficient and reproducible integration of diverse experimental results, we developed a novel approach for standardized acquisition and analysis of high-throughput and heterogeneous biological data. This allowed, first, novel biomolecular network reconstruction in human prostate cancer, which correctly recovered and extended the NFκB signaling pathway. Next, we investigated host-microbiome interactions. In less than an hour of analysis time, the system retrieved data and integrated six germ-free murine intestinal gene expression datasets to identify the genes most influenced by the gut microbiota, which comprised a set of immune-response and carbohydrate metabolism processes. Finally, we constructed integrated functional interaction networks to compare connectivity of peptide secretion pathways in the model organisms Escherichia coli, Bacillus subtilis, and Pseudomonas aeruginosa. PMID:26157642
NEAT: an efficient network enrichment analysis test.

PubMed

Signorelli, Mirko; Vinciotti, Veronica; Wit, Ernst C

2016-09-05

Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes that is provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow and are based on normality assumptions. We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but to directed and partially directed networks as well. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN ( https://cran.r-project.org/package=neat ).
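Because the abstract above rests on the hypergeometric distribution as the null, a generic upper-tail computation may help make that concrete. This is a sketch of the distribution only, not of the neat package or of how NEAT maps network links onto the distribution's parameters (the abstract does not give that mapping). The function name and the argument names `population`, `successes`, `draws` and `observed` are assumptions for illustration.

```python
from math import comb


def hypergeom_upper_tail(observed, population, successes, draws):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    X counts the 'success' items among `draws` items sampled without
    replacement from `population` items, `successes` of which are marked.
    """
    denom = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / denom


# Hypothetical numbers: 10 items, 3 marked, 4 drawn, all 3 marked ones seen.
p_value = hypergeom_upper_tail(3, 10, 3, 4)
```

Evaluating a closed-form tail like this needs no resampling, which is in line with the speed advantage over resampling-based tests that the abstract reports.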
BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions

PubMed Central

2010-01-01

Background Genome-scale metabolic reconstructions under the Constraint Based Reconstruction and Analysis (COBRA) framework are valuable tools for analyzing the metabolic capabilities of organisms and interpreting experimental data. As the number of such reconstructions and analysis methods increases, there is a greater need for data uniformity and ease of distribution and use. Description We describe BiGG, a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG integrates several published genome-scale metabolic networks into one resource with standard nomenclature which allows components to be compared across different organisms. BiGG can be used to browse model content, visualize metabolic pathway maps, and export SBML files of the models for further analysis by external software packages. Users may follow links from BiGG to several external databases to obtain additional information on genes, proteins, reactions, metabolites and citations of interest. Conclusions BiGG addresses a need in the systems biology community to have access to high quality curated metabolic models and reconstructions. It is freely available for academic use at http://bigg.ucsd.edu. PMID:20426874
Enhancer Linking by Methylation/Expression Relationships (ELMER) | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

R tool for analysis of DNA methylation and expression datasets. Integrative analysis allows reconstruction of in vivo transcription factor networks altered in cancer along with identification of the underlying gene regulatory sequences.
Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models

DOE PAGES

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...

2014-10-16

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. 
This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
Inferring Nonlinear Gene Regulatory Networks from Gene Expression Data Based on Distance Correlation

PubMed Central

Guo, Xiaobo; Zhang, Ye; Hu, Wenhao; Tan, Haizhu; Wang, Xueqin

2014-01-01

Nonlinear dependence is common in the regulatory mechanisms of gene regulatory networks (GRNs). It is vital to properly measure or test nonlinear dependence in real data when reconstructing GRNs and studying the complex regulatory mechanisms within the cellular system. A recently developed measure called the distance correlation (DC) has been shown to be powerful and computationally effective at detecting nonlinear dependence in many situations. In this work, we incorporate the DC into the inference of GRNs from gene expression data without any underlying distribution assumptions. We propose three DC-based GRN inference algorithms: CLR-DC, MRNET-DC and REL-DC, and then compare them with the mutual information (MI)-based algorithms by analyzing two simulated datasets (benchmark GRNs from the DREAM challenge and GRNs generated by the SynTReN network generator) and an experimentally determined SOS DNA repair network in Escherichia coli. According to both the receiver operating characteristic (ROC) curve and the precision-recall (PR) curve, our proposed algorithms significantly outperform the MI-based algorithms in GRN inference. PMID:24551058
Exploring Hydrogenotrophic Methanogenesis: a Genome Scale Metabolic Reconstruction of Methanococcus maripaludis

DOE PAGES

Richards, Matthew A.; Lie, Thomas J.; Zhang, Juan; ...

2016-10-10

Hydrogenotrophic methanogenesis occurs in multiple environments, ranging from the intestinal tracts of animals to anaerobic sediments and hot springs. Energy conservation in hydrogenotrophic methanogens was long a mystery; only within the last decade was it reported that net energy conservation for growth depends on electron bifurcation. In this work, we focus on Methanococcus maripaludis, a well-studied hydrogenotrophic marine methanogen. To better understand hydrogenotrophic methanogenesis and compare it with methylotrophic methanogenesis that utilizes oxidative phosphorylation rather than electron bifurcation, we have built iMR539, a genome scale metabolic reconstruction that accounts for 539 of the 1,722 protein-coding genes of M. maripaludis strain S2. Our reconstructed metabolic network uses recent literature to not only represent the central electron bifurcation reaction but also incorporate vital biosynthesis and assimilation pathways, including unique cofactor and coenzyme syntheses. We show that our model accurately predicts experimental growth and gene knockout data, with 93% accuracy and a Matthews correlation coefficient of 0.78. Furthermore, we use our metabolic network reconstruction to probe the implications of electron bifurcation by showing its essentiality, as well as investigating the infeasibility of aceticlastic methanogenesis in the network. Additionally, we demonstrate a method of applying thermodynamic constraints to a metabolic model to quickly estimate overall free-energy changes between what comes in and out of the cell. Finally, we describe a novel reconstruction-specific computational toolbox we created to improve usability. Together, our results provide a computational network for exploring hydrogenotrophic methanogenesis and confirm the importance of electron bifurcation in this process. Understanding and applying hydrogenotrophic methanogenesis is a promising avenue for developing new bioenergy technologies around methane gas. 
Although a significant portion of biological methane is generated through this environmentally ubiquitous pathway, existing methanogen models portray the more traditional energy conservation mechanisms that are found in other methanogens. In conclusion, we have constructed a genome scale metabolic network of Methanococcus maripaludis that explicitly accounts for all major reactions involved in hydrogenotrophic methanogenesis. Our reconstruction demonstrates the importance of electron bifurcation in central metabolism, providing both a window into hydrogenotrophic methanogenesis and a hypothesis-generating platform to fuel metabolic engineering efforts.
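The abstract above reports model quality as both accuracy and a Matthews correlation coefficient (MCC). As a hedged illustration of how the two metrics are computed from a confusion matrix of growth/no-growth predictions, here is a small sketch. The counts in the usage lines are hypothetical and are not the data behind the reported 93% and 0.78; the function names are also assumptions.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that match the experimental outcome."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0.0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not taken from the iMR539 study).
counts = dict(tp=50, tn=43, fp=4, fn=3)
print("accuracy:", accuracy(**counts))
print("MCC:", mcc(**counts))
```

Unlike accuracy, the MCC stays informative when one phenotype class (for example, viable knockouts) dominates the test set, which is a common reason for reporting both.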
Part mutual information for quantifying direct associations in networks.

PubMed

Zhao, Juan; Zhou, Yiwei; Zhang, Xiujun; Chen, Luonan

2016-05-03

Quantitatively identifying direct dependencies between variables is an important task in data analysis, in particular for reconstructing various types of networks and causal relations in science and engineering. One of the most widely used criteria is partial correlation, but it can only measure linear direct associations and misses nonlinear ones. Conditional mutual information (CMI), which is based on conditional independence, can quantify nonlinear direct relationships among variables from observed data and is in that respect superior to linear measures, but it suffers from a serious problem of underestimation, in particular for variables with tight associations in a network, which severely limits its applications. In this work, we propose a new concept, "partial independence," with a new measure, "part mutual information" (PMI), which not only overcomes the problem of CMI but also retains the quantification properties of both mutual information (MI) and CMI. Specifically, we first defined PMI to measure nonlinear direct dependencies between variables and then derived its relations with MI and CMI. Finally, we used a number of simulated datasets as benchmark examples to numerically demonstrate PMI features, and further used real gene expression data from Escherichia coli and yeast to reconstruct gene regulatory networks, all of which validated the advantages of PMI for accurately quantifying nonlinear direct associations in networks.
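For reference, the two established quantities that this abstract contrasts with PMI can be written in their standard textbook forms. The symbols X, Y, Z and the pairwise correlation coefficients are notation chosen here for illustration; the definition of PMI itself is not reproduced because the abstract does not state it.

```latex
% Partial correlation of X and Y given a single variable Z
% (captures linear direct association only)
\rho_{XY\cdot Z} = \frac{\rho_{XY} - \rho_{XZ}\,\rho_{YZ}}
                        {\sqrt{\left(1-\rho_{XZ}^{2}\right)\left(1-\rho_{YZ}^{2}\right)}}

% Conditional mutual information for discrete variables
% (captures nonlinear direct association)
I(X;Y \mid Z) = \sum_{x,y,z} p(x,y,z)\,
                \log\frac{p(x,y \mid z)}{p(x \mid z)\,p(y \mid z)}
```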
State Space Model with hidden variables for reconstruction of gene regulatory networks.

PubMed

Wu, Xi; Li, Peng; Wang, Nan; Gong, Ping; Perkins, Edward J; Deng, Youping; Zhang, Chaoyang

2011-01-01

State Space Model (SSM) is a relatively new approach to inferring gene regulatory networks. It requires less computational time than Dynamic Bayesian Networks (DBN). There are two types of variables in the linear SSM: observed variables and hidden variables. SSM uses an iterative method, namely Expectation-Maximization, to infer regulatory relationships from microarray datasets. The hidden variables cannot be directly observed from experiments. The choice of the number of hidden variables has a significant impact on the accuracy of network inference. In this study, we used SSM to infer gene regulatory networks (GRNs) from synthetic time series datasets, investigated Bayesian Information Criterion (BIC) and Principal Component Analysis (PCA) approaches to determining the number of hidden variables in SSM, and evaluated the performance of SSM in comparison with DBN. True GRNs and synthetic gene expression datasets were generated using GeneNetWeaver. Both DBN and linear SSM were used to infer GRNs from the synthetic datasets. The inferred networks were compared with the true networks. Our results show that inference precision varied with the number of hidden variables. For some regulatory networks, the inference precision of DBN was higher, but SSM performed better in other cases. Although the overall performance of the two approaches is comparable, SSM is much faster and capable of inferring much larger networks than DBN. This study provides useful information for handling the hidden variables and improving the inference precision.
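One way to read the BIC-based selection described above is sketched below, assuming each candidate number of hidden variables has already been fitted (for example by Expectation-Maximization) and yields a maximized log-likelihood and a free-parameter count. The log-likelihoods, parameter counts and function names are hypothetical placeholders, not values or code from the study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_hidden_dim(candidates, n_obs):
    """Return the candidate hidden-variable count with the lowest BIC.

    candidates maps a hidden-variable count to a tuple
    (maximized log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda k: bic(*candidates[k], n_obs))


# Hypothetical fit results for 1, 2 and 3 hidden variables.
fits = {1: (-520.0, 10), 2: (-480.0, 20), 3: (-478.0, 30)}
print("selected number of hidden variables:", select_hidden_dim(fits, n_obs=100))
```

The penalty term grows with the parameter count, so a small gain in likelihood from adding a hidden variable (as between 2 and 3 in the toy numbers) is not enough to justify the larger model.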
Co-expression networks reveal the tissue-specific regulation of transcription and splicing

PubMed Central

Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D.H.; Jo, Brian; Gao, Chuan; McDowell, Ian C.; Engelhardt, Barbara E.

2017-01-01

Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. PMID:29021288
DOE Office of Scientific and Technical Information (OSTI.GOV)

Harwood, Caroline S

The goal of this project is to identify gene networks that are critical for efficient biohydrogen production by leveraging variation in gene content and gene expression in independently isolated Rhodopseudomonas palustris strains. Coexpression methods were applied to large data sets that we have collected to define probabilistic causal gene networks. To our knowledge this is the first systems-level approach that takes advantage of strain-to-strain variability to computationally define networks critical for a particular bacterial phenotypic trait.
Gene Network Rewiring to Study Melanoma Stage Progression and Elements Essential for Driving Melanoma

PubMed Central

Kaushik, Abhinav; Bhatia, Yashuma; Ali, Shakir; Gupta, Dinesh

2015-01-01

Metastatic melanoma patients have a poor prognosis, mainly attributable to the underlying heterogeneity in melanoma driver genes and altered gene expression profiles. These characteristics of melanoma also make the development of drugs and identification of novel drug targets for metastatic melanoma a daunting task. Systems biology offers an alternative approach to re-explore the genes or gene sets that display dysregulated behaviour without being differentially expressed. In this study, we have performed systems biology studies to enhance our knowledge about the conserved property of disease genes or gene sets among mutually exclusive datasets representing melanoma progression. We meta-analysed 642 microarray samples to generate melanoma reconstructed networks representing four different stages of melanoma progression to extract genes with altered molecular circuitry wiring as compared to a normal cellular state. Intriguingly, a majority of the melanoma network-rewired genes are not differentially expressed and the disease genes involved in melanoma progression consistently modulate their activity by rewiring network connections. We found that the shortlisted disease genes in the study show strong and abnormal network connectivity, which increases with disease progression. Moreover, the deviated network properties of the disease gene sets allow ranking/prioritization of different enriched, dysregulated and conserved pathway terms in metastatic melanoma, in agreement with previous findings. Our analysis also reveals the presence of distinct network hubs in different stages of metastasizing tumor for the same set of pathways in the statistically conserved gene sets. The study results are also presented as a freely available database at http://bioinfo.icgeb.res.in/m3db/. The web-based database resource consists of results from the analysis presented here, integrated with Cytoscape Web and user-friendly tools for visualization, retrieval and further analysis. 
PMID:26558755
Reconstruction and topological characterization of the sigma factor regulatory network of Mycobacterium tuberculosis

PubMed Central

Chauhan, Rinki; Ravi, Janani; Datta, Pratik; Chen, Tianlong; Schnappinger, Dirk; Bassler, Kevin E.; Balázsi, Gábor; Gennaro, Maria Laura

2016-01-01

Accessory sigma factors, which reprogram RNA polymerase to transcribe specific gene sets, activate bacterial adaptive responses to noxious environments. Here we reconstruct the complete sigma factor regulatory network of the human pathogen Mycobacterium tuberculosis by an integrated approach. The approach combines identification of direct regulatory interactions between M. tuberculosis sigma factors in an E. coli model system, validation of selected links in M. tuberculosis, and extensive literature review. The resulting network comprises 41 direct interactions among all 13 sigma factors. Analysis of network topology reveals (i) a three-tiered hierarchy initiating at master regulators, (ii) high connectivity and (iii) distinct communities containing multiple sigma factors. These topological features are likely associated with multi-layer signal processing and specialized stress responses involving multiple sigma factors. Moreover, the identification of overrepresented network motifs, such as autoregulation and coregulation of sigma and anti-sigma factor pairs, provides structural information that is relevant for studies of network dynamics. PMID:27029515
Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

PubMed

Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

2016-04-01

Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of genes associated for nearly all traits was found in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
PLAU inferred from a correlation network is critical for suppressor function of regulatory T cells

PubMed Central

He, Feng; Chen, Hairong; Probst-Kepper, Michael; Geffers, Robert; Eifes, Serge; del Sol, Antonio; Schughart, Klaus; Zeng, An-Ping; Balling, Rudi

2012-01-01

Human FOXP3+CD25+CD4+ regulatory T cells (Tregs) are essential to the maintenance of immune homeostasis. Several genes are known to be important for murine Tregs, but for human Tregs the genes and underlying molecular networks controlling the suppressor function still largely remain unclear. Here, we describe a strategy to identify the key genes directly from an undirected correlation network which we reconstruct from a very high time-resolution (HTR) transcriptome during the activation of human Tregs/CD4+ T-effector cells. We show that a predicted top-ranked new key gene PLAU (the plasminogen activator urokinase) is important for the suppressor function of both human and murine Tregs. Further analysis unveils that PLAU is particularly important for memory Tregs and that PLAU mediates Treg suppressor function via STAT5 and ERK signaling pathways. Our study demonstrates the potential for identifying novel key genes for complex dynamic biological processes using a network strategy based on HTR data, and reveals a critical role for PLAU in Treg suppressor function. PMID:23169000

Functional architecture and global properties of the Corynebacterium glutamicum regulatory network: Novel insights from a dataset with a high genomic coverage.

PubMed

Freyre-González, Julio A; Tauch, Andreas

2017-09-10

Corynebacterium glutamicum is a Gram-positive, anaerobic, rod-shaped soil bacterium able to grow on a diversity of carbon sources like sugars and organic acids. It is a biotechnologically relevant organism because of its highly efficient ability to biosynthesize amino acids, such as l-glutamic acid and l-lysine. Here, we reconstructed the most complete C. glutamicum regulatory network to date and comprehensively analyzed its global organizational properties, systems-level features and functional architecture. Our analyses show the tremendous power of Abasy Atlas to study the functional organization of regulatory networks. We created two models of the C. glutamicum regulatory network: all-evidences (containing both weakly and strongly supported interactions, genomic coverage=73%) and strongly-supported (only accounting for strongly supported evidence, genomic coverage=71%). Using state-of-the-art methodologies, we prove that power-law behaviors truly govern the connectivity and clustering coefficient distributions. We found a previously unreported circuit motif that we named the complex feed-forward motif. We highlighted the importance of feedback loops for the functional architecture, regardless of whether they are statistically over-represented in the network. We show that the previously reported top-down approach is inadequate to infer the hierarchy governing a regulatory network because feedback bridges different hierarchical layers, and the top-down approach disregards the presence of intermodular genes shaping the integration layer. Our findings altogether further support a diamond-shaped, three-layered hierarchy exhibiting some feedback between processing and coordination layers, which is shaped by four classes of systems-level elements: global regulators, locally autonomous modules, basal machinery and intermodular genes. Copyright © 2016 Elsevier B.V. All rights reserved.
Social networks in primates: smart and tolerant species have more efficient networks.

PubMed

Pasquaretta, Cristian; Levé, Marine; Claidière, Nicolas; van de Waal, Erica; Whiten, Andrew; MacIntosh, Andrew J J; Pelé, Marie; Bergstrom, Mackenzie L; Borgeaud, Christèle; Brosnan, Sarah F; Crofoot, Margaret C; Fedigan, Linda M; Fichtel, Claudia; Hopper, Lydia M; Mareno, Mary Catherine; Petit, Odile; Schnoell, Anna Viktoria; di Sorrentino, Eugenia Polizzi; Thierry, Bernard; Tiddi, Barbara; Sueur, Cédric

2014-12-23

Network optimality has been described in genes, proteins and human communicative networks. In the latter, optimality leads to the efficient transmission of information with a minimum number of connections. Whilst studies show that differences in centrality exist in animal networks with central individuals having higher fitness, network efficiency has never been studied in animal groups. Here we studied 78 groups of primates (24 species). We found that group size and neocortex ratio were correlated with network efficiency. Centralisation (whether several individuals are central in the group) and modularity (how a group is clustered) had opposing effects on network efficiency, showing that tolerant species have more efficient networks. Such network properties affecting individual fitness could be shaped by natural selection. Our results are in accordance with the social brain and cultural intelligence hypotheses, which suggest that the importance of network efficiency and information flow through social learning relates to cognitive abilities.
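The abstract does not say which efficiency measure was used. As one common possibility, the sketch below computes the global efficiency of an unweighted, undirected graph, defined as the average of inverse shortest-path lengths over all ordered node pairs. The adjacency-dictionary representation, the function names and the toy groups are assumptions made for illustration.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered pairs i != j; unreachable pairs add 0."""
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for s in nodes:
        dist = shortest_path_lengths(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))


# Toy groups (hypothetical): a centralised "star" and a fully connected group.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
clique = {i: [j for j in range(4) if j != i] for i in range(4)}
print("star:", global_efficiency(star))
print("clique:", global_efficiency(clique))
```

A fully connected group reaches the maximum efficiency of 1, but at the cost of many connections, which is why efficiency is usually interpreted together with the number of edges.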
Social networks in primates: smart and tolerant species have more efficient networks

PubMed Central

Pasquaretta, Cristian; Levé, Marine; Claidière, Nicolas; van de Waal, Erica; Whiten, Andrew; MacIntosh, Andrew J. J.; Pelé, Marie; Bergstrom, Mackenzie L.; Borgeaud, Christèle; Brosnan, Sarah F.; Crofoot, Margaret C.; Fedigan, Linda M.; Fichtel, Claudia; Hopper, Lydia M.; Mareno, Mary Catherine; Petit, Odile; Schnoell, Anna Viktoria; di Sorrentino, Eugenia Polizzi; Thierry, Bernard; Tiddi, Barbara; Sueur, Cédric

2014-01-01

Network optimality has been described in genes, proteins and human communicative networks. In the latter, optimality leads to the efficient transmission of information with a minimum number of connections. Whilst studies show that differences in centrality exist in animal networks with central individuals having higher fitness, network efficiency has never been studied in animal groups. Here we studied 78 groups of primates (24 species). We found that group size and neocortex ratio were correlated with network efficiency. Centralisation (whether several individuals are central in the group) and modularity (how a group is clustered) had opposing effects on network efficiency, showing that tolerant species have more efficient networks. Such network properties affecting individual fitness could be shaped by natural selection. Our results are in accordance with the social brain and cultural intelligence hypotheses, which suggest that the importance of network efficiency and information flow through social learning relates to cognitive abilities. PMID:25534964
Network representations of immune system complexity

PubMed Central

Subramanian, Naeha; Torabi-Parizi, Parizad; Gottschalk, Rachel A.; Germain, Ronald N.; Dutta, Bhaskar

2015-01-01

The mammalian immune system is a dynamic multi-scale system composed of a hierarchically organized set of molecular, cellular and organismal networks that act in concert to promote effective host defense. These networks range from those involving gene regulatory and protein-protein interactions underlying intracellular signaling pathways and single cell responses to increasingly complex networks of in vivo cellular interaction, positioning and migration that determine the overall immune response of an organism. Immunity is thus not the product of simple signaling events but rather non-linear behaviors arising from dynamic, feedback-regulated interactions among many components. One of the major goals of systems immunology is to quantitatively measure these complex multi-scale spatial and temporal interactions, permitting development of computational models that can be used to predict responses to perturbation. Recent technological advances permit collection of comprehensive datasets at multiple molecular and cellular levels while advances in network biology support representation of the relationships of components at each level as physical or functional interaction networks. The latter facilitate effective visualization of patterns and recognition of emergent properties arising from the many interactions of genes, molecules, and cells of the immune system. We illustrate the power of integrating ‘omics’ and network modeling approaches for unbiased reconstruction of signaling and transcriptional networks with a focus on applications involving the innate immune system. We further discuss future possibilities for reconstruction of increasingly complex cellular and organism-level networks and development of sophisticated computational tools for prediction of emergent immune behavior arising from the concerted action of these networks. PMID:25625853
Network Reconstruction From High-Dimensional Ordinary Differential Equations.

PubMed

Chen, Shizhe; Shojaie, Ali; Witten, Daniela M

2017-01-01

We consider the task of learning a dynamical system from high-dimensional time-course data. For instance, we might wish to estimate a gene regulatory network from gene expression data measured at discrete time points. We model the dynamical system nonparametrically as a system of additive ordinary differential equations. Most existing methods for parameter estimation in ordinary differential equations estimate the derivatives from noisy observations. This is known to be challenging and inefficient. We propose a novel approach that does not involve derivative estimation. We show that the proposed method can consistently recover the true network structure even in high dimensions, and we demonstrate empirical improvement over competing approaches. Supplementary materials for this article are available online.
The G-Box Transcriptional Regulatory Code in Arabidopsis

PubMed Central

Shepherd, Samuel J.K.; Brestovitsky, Anna; Dickinson, Patrick; Biswas, Surojit

2017-01-01

Plants have significantly more transcription factor (TF) families than animals and fungi, and plant TF families tend to contain more genes; these expansions are linked to adaptation to environmental stressors. Many TF family members bind to similar or identical sequence motifs, such as G-boxes (CACGTG), so it is difficult to predict regulatory relationships. We determined that the flanking sequences near G-boxes help determine in vitro specificity but that this is insufficient to predict the transcription pattern of genes near G-boxes. Therefore, we constructed a gene regulatory network that identifies the set of bZIPs and bHLHs that are most predictive of the expression of genes downstream of perfect G-boxes. This network accurately predicts transcriptional patterns and reconstructs known regulatory subnetworks. Finally, we present Ara-BOX-cis (araboxcis.org), a Web site that provides interactive visualizations of the G-box regulatory network, a useful resource for generating predictions for gene regulatory relations. PMID:28864470
Identifying interactions in the time and frequency domains in local and global networks - A Granger Causality Approach.

PubMed

Zou, Cunlu; Ladroue, Christophe; Guo, Shuixia; Feng, Jianfeng

2010-06-21

Reverse-engineering approaches such as Bayesian network inference, ordinary differential equations (ODEs) and information theory are widely applied to deriving causal relationships among different elements such as genes, proteins, metabolites, neurons, brain areas and so on, based upon multi-dimensional spatial and temporal data. There are several well-established reverse-engineering approaches to explore causal relationships in a dynamic network, such as ordinary differential equations (ODE), Bayesian networks, information theory and Granger Causality. Here we focused on Granger causality both in the time and frequency domain and in local and global networks, and applied our approach to experimental data (genes and proteins). For a small gene network, Granger causality outperformed all the other three approaches mentioned above. A global protein network of 812 proteins was reconstructed, using a novel approach. The obtained results fitted well with known experimental findings and predicted many experimentally testable results. In addition to interactions in the time domain, interactions in the frequency domain were also recovered. The results on the proteomic data and gene data confirm that Granger causality is a simple and accurate approach to recover the network structure. Our approach is general and can be easily applied to other types of temporal data.
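The time-domain idea behind Granger causality can be sketched in its simplest pairwise form: a series x is said to Granger-cause y if adding the past of x to an autoregressive model of y reduces the residual error. The code below uses a single lag and ordinary least squares solved through the normal equations; it is an illustrative simplification, not the authors' method (which also covers the frequency domain and global networks), and the synthetic data and function names are assumptions.

```python
import random


def _ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via normal equations."""
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting (assumes full rank).
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    beta = [m[i][k] / m[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' using one lag."""
    y = effect[1:]
    steps = range(len(effect) - 1)
    restricted = [[1.0, effect[t]] for t in steps]        # own past only
    full = [[1.0, effect[t], cause[t]] for t in steps]    # plus past of cause
    rss_r, rss_f = _ols_rss(restricted, y), _ols_rss(full, y)
    dof = len(y) - 3  # observations minus parameters in the full model
    return (rss_r - rss_f) / (rss_f / dof)


# Synthetic example (hypothetical): y is driven by the previous value of x.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0) for t in range(1, 200)]
print("F(x -> y):", granger_f(x, y))
print("F(y -> x):", granger_f(y, x))
```

A large F statistic in one direction and a small one in the other is what would be read as a directed edge in a reconstructed network; in practice the lag order and a significance threshold would also need to be chosen.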
Evaluating the Small-World-Ness of a Sampled Network: Functional Connectivity of Entorhinal-Hippocampal Circuitry

NASA Astrophysics Data System (ADS)

She, Qi; Chen, Guanrong; Chan, Rosa H. M.

2016-02-01

The amount of publicly accessible experimental data has gradually increased in recent years, which makes it possible to reconsider many longstanding questions in neuroscience. In this paper, an efficient framework is presented for reconstructing functional connectivity using experimental spike-train data. A modified generalized linear model (GLM) with L1-norm penalty was used to investigate 10 datasets. These datasets contain spike-train data collected from the entorhinal-hippocampal region in the brains of rats performing different tasks. The analysis shows that the entorhinal-hippocampal network of well-trained rats exhibits significant small-world features. It is found that the connectivity structure generated by distance-dependent models is responsible for the observed small-world features of the reconstructed networks. The models are utilized to simulate a subset of units recorded from a large biological neural network using multiple electrodes. Two metrics for quantifying the small-world-ness both suggest that the network reconstructed from the sampled nodes shows a more prominent small-world-ness than the original unknown network when the number of recorded neurons is small. Finally, this study shows that it is feasible to adjust the estimated small-world-ness results based on the number of neurons recorded to provide a more accurate reference of the network property.
Functional architecture of Escherichia coli: new insights provided by a natural decomposition approach.

PubMed

Freyre-González, Julio A; Alonso-Pavón, José A; Treviño-Quintanilla, Luis G; Collado-Vides, Julio

2008-10-27

Previous studies have used different methods in an effort to extract the modular organization of transcriptional regulatory networks. However, these approaches are not natural, as they try to cluster strongly connected genes into a module or locate known pleiotropic transcription factors in lower hierarchical layers. Here, we unravel the transcriptional regulatory network of Escherichia coli by separating it into its key elements, thus revealing its natural organization. We also present a mathematical criterion, based on the topological features of the transcriptional regulatory network, to classify the network elements into one of two possible classes: hierarchical or modular genes. We found that modular genes are clustered into physiologically correlated groups validated by a statistical analysis of the enrichment of the functional classes. Hierarchical genes encode transcription factors responsible for coordinating module responses based on general interest signals. Hierarchical elements correlate highly with the previously studied global regulators, suggesting that this could be the first mathematical method to identify global regulators. We identified a new element in transcriptional regulatory networks never described before: intermodular genes. These are structural genes that integrate, at the promoter level, signals coming from different modules, and therefore from different physiological responses. Using the concept of pleiotropy, we have reconstructed the hierarchy of the network and discuss the role of feedforward motifs in shaping the hierarchical backbone of the transcriptional regulatory network. This study sheds new light on the design principles underpinning the organization of transcriptional regulatory networks, showing a novel nonpyramidal architecture composed of independent modules globally governed by hierarchical transcription factors, whose responses are integrated by intermodular genes.
Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

PubMed Central

Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim; Krogsgaard, Steen; Nielsen, Jens

2008-01-01

Background Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other related fungi. Here we proposed gene prediction by construction of an A. oryzae Expressed Sequence Tag (EST) library, sequencing and assembly. We enhanced the function assignment using our own annotation strategy. The resulting improved annotation was used to reconstruct the metabolic network leading to a genome scale metabolic model of A. oryzae. Results From our assembled EST sequences we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Notably, our annotation strategy resulted in assignment of new putative functions to 1,469 hypothetical proteins already present in the A. oryzae genome database. Using the substantially improved annotated genome we reconstructed the metabolic network of A. oryzae. This network contains 729 enzymes, 1,314 enzyme-encoding genes, 1,073 metabolites and 1,846 (1,053 unique) biochemical reactions. The metabolic reactions are compartmentalized into the cytosol, the mitochondria, the peroxisome and the extracellular space. Transport steps between the compartments and the extracellular space represent 281 reactions, of which 161 are unique. The metabolic model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources.
Conclusion A much enhanced annotation of the A. oryzae genome was performed and a genome-scale metabolic model of A. oryzae was reconstructed. The model accurately predicted the growth and biomass yield on different carbon sources. The model serves as an important resource for gaining further insight into our understanding of A. oryzae physiology. PMID:18500999
Exploring candidate biomarkers for lung and prostate cancers using gene expression and flux variability analysis.

PubMed

Asgari, Yazdan; Khosravi, Pegah; Zabihinpour, Zahra; Habibi, Mahnaz

2018-02-19

Genome-scale metabolic models have provided valuable resources for exploring changes in metabolism under normal and cancer conditions. However, metabolism itself is strongly linked to gene expression, so integration of gene expression data into metabolic models might improve the detection of genes involved in the control of tumor progression. Herein, we considered gene expression data as extra constraints to enhance the predictive powers of metabolic models. We reconstructed genome-scale metabolic models for lung and prostate, under normal and cancer conditions to detect the major genes associated with critical subsystems during tumor development. Furthermore, we utilized gene expression data in combination with an information theory-based approach to reconstruct co-expression networks of the human lung and prostate in both cohorts. Our results revealed 19 genes as candidate biomarkers for lung and prostate cancer cells. This study also revealed that the development of a complementary approach (integration of gene expression and metabolic profiles) could lead to proposing novel biomarkers and suggesting renovated cancer treatment strategies which have not been possible to detect using either of the methods alone.
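For readers unfamiliar with the technique named in the title above, the standard textbook formulation of flux balance analysis and flux variability analysis can be written as follows. This is a generic sketch, not the authors' exact model: the symbols S (stoichiometric matrix), v (flux vector), l and u (flux bounds), c (objective weights) and the optimality fraction γ are conventional notation, and the remark about how expression data enter the bounds is an assumption, since the abstract does not say how the constraints were applied.

```latex
% Flux balance analysis (FBA): optimal objective value Z^*
Z^{*} = \max_{v} \; c^{\top} v
\quad \text{subject to} \quad S v = 0, \qquad l \le v \le u

% Flux variability analysis (FVA): range of each flux v_i
% compatible with a near-optimal objective (0 \le \gamma \le 1)
v_i^{\min} = \min_{v} \; v_i, \qquad v_i^{\max} = \max_{v} \; v_i
\quad \text{subject to} \quad S v = 0, \qquad l \le v \le u, \qquad c^{\top} v \ge \gamma Z^{*}

% Assumption: expression data could enter as tighter bounds, e.g.
% u_j \le f(e_j) for a reaction j whose gene has expression level e_j.
```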
Co-expression networks reveal the tissue-specific regulation of transcription and splicing.

PubMed

Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H; Jo, Brian; Gao, Chuan; McDowell, Ian C; Engelhardt, Barbara E; Battle, Alexis

2017-11-01

Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues. © 2017 Saha et al.; Published by Cold Spring Harbor Laboratory Press.
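As a rough illustration of what a co-expression network is, the sketch below links two genes when the absolute Pearson correlation of their expression profiles exceeds a threshold. This is a deliberately simpler stand-in and not the TWN or TSN method described above; the gene names, the expression values and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def coexpression_edges(expr, threshold=0.8):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical expression profiles across four samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 1.0, 3.0],
}
edges = coexpression_edges(expr)
```

In this toy data geneA, geneB and geneC are perfectly (anti-)correlated and form a triangle, while geneD stays unconnected. A sparse method such as the one in the abstract would additionally try to separate direct from indirect associations, which plain correlation thresholding does not do.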
Gene order in rosid phylogeny, inferred from pairwise syntenies among extant genomes

PubMed Central

2012-01-01

Background Ancestral gene order reconstruction for flowering plants has lagged behind developments in yeasts, insects and higher animals, because of the recency of widespread plant genome sequencing, sequencers' embargoes on public data use, paralogies due to whole genome duplication (WGD) and fractionation of undeleted duplicates, extensive paralogy from other sources, and the computational cost of existing methods. Results We address these problems, using the gene order of four core eudicot genomes (cacao, castor bean, papaya and grapevine) that have escaped any recent WGD events, and two others (poplar and cucumber) that descend from independent WGDs, in inferring the ancestral gene order of the rosid clade and those of its main subgroups, the fabids and malvids. We improve and adapt techniques including the OMG method for extracting large, paralogy-free, multiple orthologies from conflated pairwise synteny data among the six genomes and the PATHGROUPS approach for ancestral gene order reconstruction in a given phylogeny, where some genomes may be descendants of WGD events. We use the gene order evidence to evaluate the hypothesis that the order Malpighiales belongs to the malvids rather than as traditionally assigned to the fabids. Conclusions Gene orders of ancestral eudicot species, involving 10,000 or more genes can be reconstructed in an efficient, parsimonious and consistent way, despite paralogies due to WGD and other processes. Pairwise genomic syntenies provide appropriate input to a parameter-free procedure of multiple ortholog identification followed by gene-order reconstruction in solving instances of the "small phylogeny" problem. PMID:22759433
Continuous time Bayesian networks identify Prdm1 as a negative regulator of TH17 cell differentiation in humans

PubMed Central

Acerbi, Enzo; Viganò, Elena; Poidinger, Michael; Mortellaro, Alessandra; Zelante, Teresa; Stella, Fabio

2016-01-01

T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4+ naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data reveal a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as a gene network reconstruction method and as a hypothesis generation tool for wet-lab biological experiments. PMID:26976045
Combining inferred regulatory and reconstructed metabolic networks enhances phenotype prediction in yeast.

PubMed

Wang, Zhuo; Danziger, Samuel A; Heavner, Benjamin D; Ma, Shuyi; Smith, Jennifer J; Li, Song; Herricks, Thurston; Simeonidis, Evangelos; Baliga, Nitin S; Aitchison, John D; Price, Nathan D

2017-05-01

Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present an Integrated Deduced REgulation And Metabolism (IDREAM) method that combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote, Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose grown cells.
Honey bee-inspired algorithms for SNP haplotype reconstruction problem

NASA Astrophysics Data System (ADS)

PourkamaliAnaraki, Maryam; Sadeghi, Mehdi

2016-03-01

Reconstructing haplotypes from SNP fragments is an important problem in computational biology. There has been considerable interest in this field because haplotypes have been shown to contain promising data for disease association research. Haplotype reconstruction under the Minimum Error Correction model has been proved to be an NP-hard problem. Therefore, several methods such as clustering techniques, evolutionary algorithms, neural networks and swarm intelligence approaches have been proposed in order to solve this problem in reasonable time. In this paper, we focus on various evolutionary clustering techniques and try to find an efficient technique for solving the haplotype reconstruction problem. It can be inferred from our experiments that the clustering methods relying on the behaviour of honey bee colonies in nature, specifically the bees algorithm and artificial bee colony methods, are expected to result in more efficient solutions. An application program of the methods is available at the following link. http://www.bioinf.cs.ipm.ir/software/haprs/
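The Minimum Error Correction (MEC) objective mentioned above can be made concrete with a small sketch. Assuming fragments are strings over "0", "1" and "-" (site not covered), and that a candidate solution assigns each fragment to one of two haplotype groups, the MEC score counts how many fragment entries disagree with the majority-vote consensus of their group. The encoding, the function names and the tie-breaking rule (ties go to "0") are assumptions made for illustration; a clustering or bee-inspired search would try to find the assignment that minimizes this score.

```python
def consensus(fragments, n_sites):
    """Majority-vote haplotype of a group of fragments (ties resolved to '0')."""
    hap = []
    for j in range(n_sites):
        ones = sum(1 for f in fragments if f[j] == "1")
        zeros = sum(1 for f in fragments if f[j] == "0")
        hap.append("1" if ones > zeros else "0")
    return "".join(hap)


def mismatches(fragment, hap):
    """Number of covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for a, b in zip(fragment, hap) if a != "-" and a != b)


def mec_score(fragments, assignment):
    """MEC score of a bipartition: assignment[i] is 0 or 1 for fragment i."""
    n_sites = len(fragments[0])
    total = 0
    for label in (0, 1):
        group = [f for f, g in zip(fragments, assignment) if g == label]
        hap = consensus(group, n_sites)
        total += sum(mismatches(f, hap) for f in group)
    return total


# Hypothetical fragments over four SNP sites.
fragments = ["0101", "01-1", "0111", "1010", "10-0"]
good = mec_score(fragments, [0, 0, 0, 1, 1])  # fragments grouped by similarity
bad = mec_score(fragments, [0, 1, 0, 1, 0])   # an arbitrary, poorer grouping
```

With this toy input the similarity-based grouping needs a single correction (the third site of "0111"), whereas the arbitrary grouping needs several, which is the signal a search heuristic would follow.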
Reconstruction of three-dimensional porous media using generative adversarial neural networks

NASA Astrophysics Data System (ADS)

Mosser, Lukas; Dubrule, Olivier; Blunt, Martin J.

2017-10-01

To evaluate the variability of multiphase flow properties of porous media at the pore scale, it is necessary to acquire a number of representative samples of the void-solid structure. While modern x-ray computer tomography has made it possible to extract three-dimensional images of the pore space, assessment of the variability in the inherent material properties is often experimentally not feasible. We present a method to reconstruct the solid-void structure of porous media by applying a generative neural network that allows an implicit description of the probability distribution represented by three-dimensional image data sets. We show, by using an adversarial learning approach for neural networks, that this method of unsupervised learning is able to generate representative samples of porous media that honor their statistics. We successfully compare measures of pore morphology, such as the Euler characteristic, two-point statistics, and directional single-phase permeability of synthetic realizations with the calculated properties of a bead pack, Berea sandstone, and Ketton limestone. Results show that generative adversarial networks can be used to reconstruct high-resolution three-dimensional images of porous media at different scales that are representative of the morphology of the images used to train the neural network. The fully convolutional nature of the trained neural network allows the generation of large samples while maintaining computational efficiency. Compared to classical stochastic methods of image reconstruction, the implicit representation of the learned data distribution can be stored and reused to generate multiple realizations of the pore structure very rapidly.
Identification of Direct Target Genes Using Joint Sequence and Expression Likelihood with Application to DAF-16

PubMed Central

Yu, Ron X.; Liu, Jie; True, Nick; Wang, Wei

2008-01-01

A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated up to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness in manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. The usefulness of TRANSMODIS was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets. PMID:18350157
Technologies and Approaches to Elucidate and Model the Virulence Program of Salmonella.

DOE Office of Scientific and Technical Information (OSTI.GOV)

McDermott, Jason E.; Yoon, Hyunjin; Nakayasu, Ernesto S.

Salmonella is a primary cause of enteric diseases in a variety of animals. During its evolution into a pathogenic bacterium, Salmonella acquired an elaborate regulatory network that responds to multiple environmental stimuli within host animals and integrates them resulting in fine regulation of the virulence program. The coordinated action by this regulatory network involves numerous virulence regulators, necessitating genome-wide profiling analysis to assess and combine efforts from multiple regulons. In this review we discuss recent high-throughput analytic approaches to understand the regulatory network of Salmonella that controls virulence processes. Application of high-throughput analyses has generated a large amount of data and driven development of computational approaches required for data integration. Therefore, we also cover computer-aided network analyses to infer regulatory networks, and demonstrate how genome-scale data can be used to construct regulatory and metabolic systems models of Salmonella pathogenesis. Genes that are coordinately controlled by multiple virulence regulators under infectious conditions are more likely to be important for pathogenesis. Thus, reconstructing the global regulatory network during infection or, at the very least, under conditions that mimic the host cellular environment not only provides a bird’s eye view of Salmonella survival strategy in response to hostile host environments but also serves as an efficient means to identify novel virulence factors that are essential for Salmonella to accomplish systemic infection in the host.
Hemispheric lateralization of topological organization in structural brain networks.

PubMed

Caeyenberghs, Karen; Leemans, Alexander

2014-09-01

The study on structural brain asymmetries in healthy individuals plays an important role in our understanding of the factors that modulate cognitive specialization in the brain. Here, we used fiber tractography to reconstruct the left and right hemispheric networks of a large cohort of 346 healthy participants (20-86 years) and performed a graph theoretical analysis to investigate this brain laterality from a network perspective. Findings revealed that the left hemisphere is significantly more "efficient" than the right hemisphere, whereas the right hemisphere showed higher values of "betweenness centrality" and "small-worldness." In particular, left-hemispheric networks displayed increased nodal efficiency in brain regions related to language and motor actions, whereas the right hemisphere showed an increase in nodal efficiency in brain regions involved in memory and visuospatial attention. In addition, we found that hemispheric networks decrease in efficiency with age. Finally, we observed significant gender differences in measures of global connectivity. By analyzing the structural hemispheric brain networks, we have provided new insights into understanding the neuroanatomical basis of lateralized brain functions. Copyright © 2014 Wiley Periodicals, Inc.
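The notion of network "efficiency" used above is commonly defined as the average inverse shortest-path length over all ordered node pairs. The sketch below computes that global efficiency for an unweighted, undirected graph using breadth-first search. Treating the network as unweighted is an assumption made for brevity (tractography-derived networks are often weighted), and the two toy graphs are hypothetical.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search distances (in edges) from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered pairs i != j; unreachable pairs contribute 0."""
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for s in nodes:
        dist = shortest_path_lengths(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))


# Hypothetical toy networks: a three-node path and a fully connected triangle.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
eff_path = global_efficiency(path)          # 5/6
eff_triangle = global_efficiency(triangle)  # 1.0
```

The triangle is maximally efficient because every pair is directly connected, while the path loses efficiency on the a-c pair, which is two steps apart.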

Integrative network alignment reveals large regions of global network similarity in yeast and human.

PubMed

Kuchaiev, Oleksii; Przulj, Natasa

2011-05-15

High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alignment, efficient and reliable network alignment methods are expected to improve our understanding of biological systems. Unlike sequence alignment, network alignment is computationally intractable. Hence, devising efficient network alignment heuristics is currently a foremost challenge in computational biology. We introduce a novel network alignment algorithm, called Matching-based Integrative GRAph ALigner (MI-GRAAL), which can integrate any number and type of similarity measures between network nodes (e.g. proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity and structural similarity. Hence, we resolve the ties in similarity measures and find a combination of similarity measures yielding the largest contiguous (i.e. connected) and biologically sound alignments. MI-GRAAL exposes the largest functional, connected regions of protein-protein interaction (PPI) network similarity to date: surprisingly, it reveals that 77.7% of proteins in the baker's yeast high-confidence PPI network participate in such a subnetwork that is fully contained in the human high-confidence PPI network. This is the first demonstration that species as diverse as yeast and human contain so large, continuous regions of global network similarity. We apply MI-GRAAL's alignments to predict functions of un-annotated proteins in yeast, human and bacteria validating our predictions in the literature. Furthermore, using network alignment scores for PPI networks of different herpes viruses, we reconstruct their phylogenetic relationship. This is the first time that phylogeny is exactly reconstructed from purely topological alignments of PPI networks. Supplementary files and MI-GRAAL executables: http://bio-nets.doc.ic.ac.uk/MI-GRAAL/.
Reconstruction of a composite comparative map composed of ten legume genomes.

PubMed

Lee, Chaeyoung; Yu, Dongwoon; Choi, Hong-Kyu; Kim, Ryan W

2017-01-01

The Fabaceae (legume family) is the third largest and the second of agricultural importance among flowering plant groups. In this study, we report the reconstruction of a composite comparative map composed of ten legume genomes, including seven species from the galegoid clade (Medicago truncatula, Medicago sativa, Lens culinaris, Pisum sativum, Lotus japonicus, Cicer arietinum, Vicia faba) and three species from the phaseoloid clade (Vigna radiata, Phaseolus vulgaris, Glycine max). To accomplish this comparison, a total of 209 cross-species gene-derived markers were employed. The comparative analysis resulted in a single extensive genetic/genomic network composed of 93 chromosomes or linkage groups, from which 110 synteny blocks and other evolutionary events (e.g., 13 inversions) were identified. This comparative map also allowed us to deduce several large scale evolutionary events, such as chromosome fusion/fission, which might explain differences in chromosome numbers among compared species or between the two clades. As a result, useful properties of cross-species genic markers were re-verified as an efficient tool for cross-species translation of genomic information, and similar approaches, combined with a high throughput bioinformatic marker design program, should be effective for applying the knowledge of trait-associated genes to other important crop species for breeding purposes. Here, we provide a basic comparative framework for the ten legume species, and expect it to be usefully applied towards crop improvement in legume breeding.
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

PubMed

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it has become a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: the most popular approach is the MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation.
By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks.
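To give a feel for the population-based optimization described above, here is a minimal sketch of plain particle swarm optimization (PSO) fitting a small parameter vector. It is not the authors' hybrid GA-PSO and it runs sequentially rather than on Hadoop MapReduce; the target parameters, the squared-error objective and all hyperparameters (inertia, c1, c2, swarm size) are hypothetical. In a parallel setting, the per-particle objective evaluations inside the loop are the part that would typically be distributed.

```python
import random


def pso(objective, dim, n_particles=20, n_iter=100, seed=0,
        inertia=0.7, c1=1.5, c2=1.5, bound=5.0):
    """Minimize objective over R^dim; returns (best position, best value, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


# Hypothetical "true" network parameters to recover.
TARGET = [0.5, -1.2, 2.0]


def squared_error(params):
    """Toy stand-in for the mismatch between simulated and observed gene profiles."""
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


best, best_val, history = pso(squared_error, dim=len(TARGET))
```

The history list records the best error after each iteration and can only stay flat or decrease, which makes it a convenient diagnostic for the premature convergence issue the abstract mentions.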
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment

PubMed Central

2014-01-01

Background To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it has become a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: the most popular approach is the MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. Results This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use the well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Conclusions Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation.
By coupling the parallel model population-based optimization method and the parallel computational framework, high quality solutions can be obtained within relatively short time. This integrated approach is a promising way for inferring large networks. PMID:24428926
Improvement of transgenic cloning efficiencies by culturing recipient oocytes and donor cells with antioxidant vitamins in cattle.

PubMed

Wongsrikeao, Pimprapar; Nagai, Takashi; Agung, Budiyanto; Taniguchi, Masayasu; Kunishi, Miho; Suto, Shizuyo; Otoi, Takeshige

2007-06-01

The present study was conducted to investigate effects of antioxidants during maturation culture of recipient oocytes and/or culture of gene-transfected donor cells on the meiotic competence of recipient oocytes, and the developmental competence and quality of the reconstructed embryos after nuclear transfer (NT) in cattle. Gene-transfected donor cells had negative effects on the proportions of blastocyst formation, total cell numbers, and DNA fragmentation indices of reconstructed embryos. Supplementation of either vitamin E (alpha-tocopherol: 100 microM) or vitamin C (ascorbic acid: 100 microM) during maturation culture significantly enhanced the cytoplasmic maturation of oocytes and subsequent development of embryos reconstructed with the oocytes and gene-transfected donor cells, but did not have synergistic effects. The supplementation of vitamin E during maturation culture of recipient oocytes increased the proportions of fusion and blastocyst formation of gene-transfected NT embryos, in which the proportions were similar to those of nontransfected NT embryos. When the gene-transfected donor cells that had been cultured with 0, 50, or 100 microM of vitamin E were transferred into recipient oocytes matured with vitamin E (100 microM), 50 microM of vitamin E increased the proportion of blastocyst formation and reduced the index of DNA fragmentation of blastocysts. In conclusion, gene-transfected donor cells have negatively influenced the NT outcome. Supplementation of vitamin E during both recipient oocyte maturation and donor cell culture enhanced the blastocyst formation and efficiently blocked DNA damage in transgenic NT embryos. (c) 2006 Wiley-Liss, Inc.
Natural selection drove metabolic specialization of the chromatophore in Paulinella chromatophora.

PubMed

Valadez-Cano, Cecilio; Olivares-Hernández, Roberto; Resendis-Antonio, Osbaldo; DeLuna, Alexander; Delaye, Luis

2017-04-14

Genome degradation of host-restricted mutualistic endosymbionts has been attributed to inactivating mutations and genetic drift while genes coding for host-relevant functions are conserved by purifying selection. Unlike their free-living relatives, the metabolism of mutualistic endosymbionts and endosymbiont-originated organelles is specialized in the production of metabolites which are released to the host. This specialization suggests that natural selection crafted these metabolic adaptations. In this work, we analyzed the evolution of the metabolism of the chromatophore of Paulinella chromatophora by in silico modeling. We asked whether genome reduction is driven by metabolic engineering strategies resulting from the interaction with the host. As is widely known, the loss of enzyme coding genes leads to metabolic network restructuring, sometimes improving production rates, in this case the production rate of reduced-carbon in the metabolism of the chromatophore. We reconstructed the metabolic networks of the chromatophore of P. chromatophora CCAC 0185 and a close free-living relative, the cyanobacterium Synechococcus sp. WH 5701. We found that the evolution of free-living to host-restricted lifestyle rendered a fragile metabolic network where >80% of genes in the chromatophore are essential for metabolic functionality. Despite the lack of experimental information, the metabolic reconstruction of the chromatophore suggests that the host provides several metabolites to the endosymbiont. By using these metabolites as intracellular conditions, in silico simulations of genome evolution by gene loss recover with 77% accuracy the actual metabolic gene content of the chromatophore. Also, the metabolic model of the chromatophore allowed us to predict by flux balance analysis a maximum rate of reduced-carbon released by the endosymbiont to the host.
By inspecting the central metabolism of the chromatophore and the free-living cyanobacteria we found that, through improvements in the gluconeogenic pathway, the metabolism of the endosymbiont uses the carbon source more efficiently for reduced-carbon production. In addition, our in silico simulations of the evolutionary process leading to the reduced metabolic network of the chromatophore showed that the predicted rate of released reduced-carbon is obtained in less than 5% of cases under a process guided by random gene deletion and genetic drift. We interpret previous findings as evidence that natural selection at holobiont level shaped the rate at which reduced-carbon is exported to the host. Finally, our model also predicts that the ABC phosphate transporter (pstSACB) which is conserved in the genome of the chromatophore of P. chromatophora strain CCAC 0185 is a necessary component to release reduced-carbon molecules to the host. Our evolutionary analysis suggests that in the case of Paulinella chromatophora natural selection at the holobiont level played a prominent role in shaping the metabolic specialization of the chromatophore. We propose that natural selection acted as a "metabolic engineer" by favoring metabolic restructurings that led to an increased release of reduced-carbon to the host.
Understanding Biological Regulation Through Synthetic Biology.

PubMed

Bashor, Caleb J; Collins, James J

2018-05-20

Engineering synthetic gene regulatory circuits proceeds through iterative cycles of design, building, and testing. Initial circuit designs must rely on often-incomplete models of regulation established by fields of reductive inquiry: biochemistry and molecular and systems biology. As differences in designed and experimentally observed circuit behavior are inevitably encountered, investigated, and resolved, each turn of the engineering cycle can force a resynthesis in understanding of natural network function. Here, we outline research that uses the process of gene circuit engineering to advance biological discovery. Synthetic gene circuit engineering research has not only refined our understanding of cellular regulation but furnished biologists with a toolkit that can be directed at natural systems to exact precision manipulation of network structure. As we discuss, using circuit engineering to predictively reorganize, rewire, and reconstruct cellular regulation serves as the ultimate means of testing and understanding how cellular phenotype emerges from systems-level network function.
MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers

PubMed Central

Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier

2017-01-01

Background The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. Objective MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. Methods MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. Results MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user’s specific interests and provides an efficient way to share information with collaborators. Furthermore, the user’s behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. Conclusions We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. 
MyGeneFriends is available at lbgi.fr/mygenefriends. PMID:28623182
Degrees of separation as a statistical tool for evaluating candidate genes.

PubMed

Nelson, Ronald M; Pettersson, Mats E

2014-12-01

Selection of candidate genes is an important step in the exploration of complex genetic architecture. The number of gene networks available is increasing and these can provide information to help with candidate gene selection. It is currently common to use the degree of connectedness in gene networks as validation in Genome Wide Association (GWA) and Quantitative Trait Locus (QTL) mapping studies. However, this can produce misleading results if not validated properly. Here we present a method and tool for validating the gene pairs from GWA studies given the context of the network they co-occur in. It ensures that proposed interactions and gene associations are not statistical artefacts inherent to the specific gene network architecture. The CandidateBacon package provides an easy and efficient method to calculate the average degree of separation (DoS) between pairs of genes in currently available gene networks. We show how these empirical estimates of average connectedness are used to validate candidate gene pairs. Validation of interacting genes by comparing their connectedness with the average connectedness in the gene network will provide support for said interactions by utilising the growing amount of gene network information available. Copyright © 2014 Elsevier Ltd. All rights reserved.
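The degree-of-separation idea in the record above can be illustrated with a minimal sketch. It is not the CandidateBacon implementation: the function names, the adjacency-dictionary format, and the toy network are all assumptions made for illustration. It computes the shortest-path length between two genes by breadth-first search and averages it over all connected pairs, which gives a baseline against which a candidate pair could be compared.

```python
from collections import deque
from itertools import combinations


def degrees_of_separation(network, source, target):
    """Shortest path length (in edges) between two genes, or None if unconnected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        gene, dist = queue.popleft()
        for neighbour in network.get(gene, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def average_dos(network, genes=None):
    """Mean degree of separation over all connected gene pairs."""
    genes = sorted(network) if genes is None else list(genes)
    dists = [degrees_of_separation(network, a, b) for a, b in combinations(genes, 2)]
    dists = [d for d in dists if d is not None]  # unconnected pairs are skipped
    return sum(dists) / len(dists) if dists else None


# Hypothetical toy network: a chain A-B-C-D plus an isolated gene E.
toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}, "E": set()}

if __name__ == "__main__":
    print("DoS(A, D):", degrees_of_separation(toy, "A", "D"))
    print("network average:", average_dos(toy))
```

A candidate pair whose separation is clearly smaller than the network average would be better supported than one that merely sits in a densely connected region; how unconnected pairs are treated is a design choice, and here they are simply left out of the average.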
Efficiency of the neighbor-joining method in reconstructing deep and shallow evolutionary relationships in large phylogenies.

PubMed

Kumar, S; Gadagkar, S R

2000-12-01

The neighbor-joining (NJ) method is widely used in reconstructing large phylogenies because of its computational speed and the high accuracy in phylogenetic inference as revealed in computer simulation studies. However, most computer simulation studies have quantified the overall performance of the NJ method in terms of the percentage of branches inferred correctly or the percentage of replications in which the correct tree is recovered. We have examined other aspects of its performance, such as the relative efficiency in correctly reconstructing shallow (close to the external branches of the tree) and deep branches in large phylogenies; the contribution of zero-length branches to topological errors in the inferred trees; and the influence of increasing the tree size (number of sequences), evolutionary rate, and sequence length on the efficiency of the NJ method. Results show that the correct reconstruction of deep branches is no more difficult than that of shallower branches. The presence of zero-length branches in realized trees contributes significantly to the overall error observed in the NJ tree, especially in large phylogenies or slowly evolving genes. Furthermore, the tree size does not influence the efficiency of NJ in reconstructing shallow and deep branches in our simulation study, in which the evolutionary process is assumed to be homogeneous in all lineages.
Exploring novel key regulators in breast cancer network.

PubMed

Ali, Shahnawaz; Malik, Md Zubbair; Singh, Soibam Shyamchand; Chirom, Keilash; Ishrat, Romana; Singh, R K Brojen

2018-01-01

The breast cancer network constructed from 70 experimentally verified genes is found to follow a hierarchical scale-free nature with heterogeneous modular organization and diverse leading hubs. The topological parameters (degree distributions, clustering coefficient, connectivity and centralities) of this network obey fractal rules, indicating absence of the centrality-lethality rule and efficient communication among the components. From the network theoretical approach, we identified a few key regulators out of a large number of leading hubs, which are deeply rooted from top to bottom of the network, serve as the backbone of the network, and are possible target genes. However, p53, which is one of these key regulators, is found to be in low rank and keeps itself at low profile but directly cross-talks with the important genes BRCA2 and BRCA3. The popularity of these hubs changes in an unpredictable way at various levels of organization, thus showing a disassortative nature. The local community paradigm approach in this network shows strong correlation of nodes in the majority of modules/sub-modules (fast communication among nodes) and weak correlation of nodes in only a few modules/sub-modules (slow communication among nodes) at various levels of network organization.
Genome-wide Reconstruction of OxyR and SoxRS Transcriptional Regulatory Networks under Oxidative Stress in Escherichia coli K-12 MG1655.

PubMed

Seo, Sang Woo; Kim, Donghyuk; Szubin, Richard; Palsson, Bernhard O

2015-08-25

Three transcription factors (TFs), OxyR, SoxR, and SoxS, play a critical role in transcriptional regulation of the defense system for oxidative stress in bacteria. However, their full genome-wide regulatory potential is unknown. Here, we perform a genome-scale reconstruction of the OxyR, SoxR, and SoxS regulons in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 68 genes in 51 transcription units (TUs) belong to these regulons. Among them, 48 genes showed more than 2-fold changes in expression level under single-TF-knockout conditions. This reconstruction expands the genome-wide roles of these factors to include direct activation of genes related to amino acid biosynthesis (methionine and aromatic amino acids), cell wall synthesis (lipid A biosynthesis and peptidoglycan growth), and divalent metal ion transport (Mn(2+), Zn(2+), and Mg(2+)). Investigating the co-regulation of these genes with other stress-response TFs reveals that they are independently regulated by stress-specific TFs. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
TRACING CO-REGULATORY NETWORK DYNAMICS IN NOISY, SINGLE-CELL TRANSCRIPTOME TRAJECTORIES.

PubMed

Cordero, Pablo; Stuart, Joshua M

2017-01-01

The availability of gene expression data at the single cell level makes it possible to probe the molecular underpinnings of complex biological processes such as differentiation and oncogenesis. Promising new methods have emerged for reconstructing a progression 'trajectory' from static single-cell transcriptome measurements. However, it remains unclear how to adequately model the appreciable level of noise in these data to elucidate gene regulatory network rewiring. Here, we present a framework called Single Cell Inference of MorphIng Trajectories and their Associated Regulation (SCIMITAR) that infers progressions from static single-cell transcriptomes by employing a continuous parametrization of Gaussian mixtures in high-dimensional curves. SCIMITAR yields rich models from the data that highlight genes with expression and co-expression patterns that are associated with the inferred progression. Further, SCIMITAR extracts regulatory states from the implicated trajectory-evolving co-expression networks. We benchmark the method on simulated data to show that it yields accurate cell ordering and gene network inferences. Applied to the interpretation of a single-cell human fetal neuron dataset, SCIMITAR finds progression-associated genes in cornerstone neural differentiation pathways missed by standard differential expression tests. Finally, by leveraging the rewiring of gene-gene co-expression relations across the progression, the method reveals the rise and fall of co-regulatory states and trajectory-dependent gene modules. These analyses implicate new transcription factors in neural differentiation including putative co-factors for the multi-functional NFAT pathway.
Efficient experimental design for uncertainty reduction in gene regulatory networks.

PubMed

Dehghannasiri, Roozbeh; Yoon, Byung-Jun; Dougherty, Edward R

2015-01-01

An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Simulation results based on synthetic and real gene regulatory networks show that the proposed approximate method has close performance to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. A MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/.
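The MOCU-based selection rule described above can be made concrete with a small sketch. This is the plain exhaustive calculation on a toy uncertainty class, not the authors' network-reduction method or their MATLAB code. The function names, the representation of models as parameter tuples, and all cost values are hypothetical. MOCU is taken here as the expected extra cost of applying the single intervention that is best on average, compared with the intervention that would be best for each model.

```python
def mocu(models, cost):
    """Expected cost increase caused by not knowing which model is true.

    models: dict mapping model -> probability (probabilities sum to 1)
    cost:   dict mapping model -> list of costs, one per candidate intervention
    """
    n_interventions = len(next(iter(cost.values())))
    expected = [sum(p * cost[m][i] for m, p in models.items())
                for i in range(n_interventions)]
    robust = min(range(n_interventions), key=expected.__getitem__)
    return sum(p * (cost[m][robust] - min(cost[m])) for m, p in models.items())


def expected_remaining_mocu(models, cost, outcome_of):
    """Average MOCU left after an experiment whose result is outcome_of(model)."""
    groups = {}
    for m, p in models.items():
        groups.setdefault(outcome_of(m), {})[m] = p
    total = 0.0
    for group in groups.values():
        mass = sum(group.values())  # probability of observing this outcome
        posterior = {m: p / mass for m, p in group.items()}
        total += mass * mocu(posterior, cost)
    return total


# Hypothetical example: two unknown binary parameters (a, b), uniform prior,
# two candidate interventions with made-up costs.
prior = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
costs = {(0, 0): [0.1, 0.5], (0, 1): [0.2, 0.5],
         (1, 0): [0.6, 0.2], (1, 1): [0.7, 0.3]}
experiments = {"measure a": lambda m: m[0], "measure b": lambda m: m[1]}

if __name__ == "__main__":
    print("MOCU before any experiment:", mocu(prior, costs))
    for name, outcome_of in experiments.items():
        print(name, "->", expected_remaining_mocu(prior, costs, outcome_of))
    best = min(experiments,
               key=lambda e: expected_remaining_mocu(prior, costs, experiments[e]))
    print("experiment to run first:", best)
```

In this toy case the best intervention depends only on parameter a, so measuring a removes all objective-relevant uncertainty while measuring b removes none. The cost of this exhaustive approach grows with the number of compatible models, which is the expense the abstract's reduction scheme is meant to cut.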
Efficient experimental design for uncertainty reduction in gene regulatory networks

PubMed Central

2015-01-01

Background An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. Results The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Conclusions Simulation results based on synthetic and real gene regulatory networks show that the proposed approximate method has close performance to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. A MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/. PMID:26423515
De novo reconstruction of gene regulatory networks from time series data, an approach based on formal methods.

PubMed

Ceccarelli, Michele; Cerulo, Luigi; Santone, Antonella

2014-10-01

Reverse engineering of gene regulatory relationships from genomics data is a crucial task to dissect the complex underlying regulatory mechanism occurring in a cell. From a computational point of view the reconstruction of gene regulatory networks is an underdetermined problem, as the number of possible solutions is typically high in contrast to the number of available independent data points. Many possible solutions can fit the available data, explaining the data equally well, but only one of them can be the biologically true solution. Several strategies have been proposed in the literature to reduce the search space and/or extend the amount of independent information. In this paper we propose a novel algorithm based on formal methods, mathematically rigorous techniques widely adopted in engineering to specify and verify complex software and hardware systems. Starting with a formal specification of gene regulatory hypotheses we are able to mathematically prove whether a time course experiment belongs or not to the formal specification, determining in fact whether a gene regulation exists or not. The method is able to detect both direction and sign (inhibition/activation) of regulations, whereas most methods in the literature are limited to undirected and/or unsigned relationships. We empirically evaluated the approach on experimental and synthetic datasets in terms of precision and recall. In most cases we observed high levels of accuracy, outperforming the current state of the art, although the computational cost increases exponentially with the size of the network. We made available the tool implementing the algorithm at the following url: http://www.bioinformatics.unisannio.it. Copyright © 2014 Elsevier Inc. All rights reserved.
Discovering time-lagged rules from microarray data using gene profile classifiers

PubMed Central

2011-01-01

Background Gene regulatory networks have an essential role in every process of life. In this regard, genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. To this end, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the compared methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method, which inferred highly related, statistically significant gene associations. Conclusions A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation. PMID:21524308
Reconstruction of a Functional Human Gene Network, with an Application for Prioritizing Positional Candidate Genes

PubMed Central

Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca

2006-01-01

Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651
Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network

PubMed Central

Martín-Jiménez, Cynthia A.; Salazar-Barreto, Diego; Barreto, George E.; González, Janneth

2017-01-01

Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework to elucidate how astrocytes modulate human brain metabolic states during normal conditions and in neurodegenerative diseases. We performed a Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network with the purpose of elucidating a significant portion of the metabolic map of the astrocyte. This is the first global high-quality, manually curated metabolic reconstruction network of a human astrocyte. It includes 5,007 metabolites and 5,659 reactions distributed among 8 cell compartments (extracellular, cytoplasm, mitochondria, endoplasmic reticulum, Golgi apparatus, lysosome, peroxisome and nucleus). Using the reconstructed network, the metabolic capabilities of human astrocytes were calculated and compared both in normal and ischemic conditions. We identified reactions activated in these two states, which can be useful for understanding the astrocytic pathways that are affected during brain disease. Additionally, we showed that the obtained flux distributions in the model are in accordance with literature-based findings. To date, this is the most complete representation of the human astrocyte in terms of inclusion of genes, proteins, reactions and metabolic pathways, being a useful guide for in-silico analysis of several metabolic behaviors of the astrocyte during normal and pathologic states. PMID:28243200
Discovering the hidden sub-network component in a ranked list of genes or proteins derived from genomic experiments

PubMed Central

García-Alonso, Luz; Alonso, Roberto; Vidal, Enrique; Amadoz, Alicia; de María, Alejandro; Minguez, Pablo; Medina, Ignacio; Dopazo, Joaquín

2012-01-01

Genomic experiments (e.g. differential gene expression, single-nucleotide polymorphism association) typically produce ranked lists of genes. We present a simple but powerful approach which uses protein–protein interaction data to detect sub-networks within such ranked lists of genes or proteins. We performed an exhaustive study of network parameters that allowed us to conclude that the average number of components and the average number of nodes per component are the parameters that best discriminate between real and random networks. A novel aspect that increases the efficiency of this strategy in finding sub-networks is that, in addition to direct connections, connections mediated by intermediate nodes are also considered to build up the sub-networks. The possibility of using such intermediate nodes makes this approach more robust to noise. It also overcomes some limitations intrinsic to experimental designs based on differential expression, in which some nodes are invariant across conditions. The proposed approach can also be used for candidate disease-gene prioritization. Here, we demonstrate the usefulness of the approach by means of several case examples that include a differential expression analysis in Fanconi Anemia, a genome-wide association study of bipolar disorder and a genome-scale study of essentiality in cancer genes. An efficient and easy-to-use web interface (available at http://www.babelomics.org) based on HTML5 technologies is also provided to run the algorithm and represent the network. PMID:22844098
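The role of intermediate nodes described above can be illustrated with a short sketch. It is not the Babelomics implementation: the function name, the symmetric adjacency-dictionary format, the restriction to a single intermediate node, and the toy data are assumptions. Genes from a list are grouped into components, optionally treating two listed genes as connected when they share one non-listed interaction partner.

```python
def seed_components(ppi, seeds, allow_intermediate=True):
    """Group seed genes into connected components of an interaction network.

    ppi:   dict mapping gene -> set of interaction partners (assumed symmetric)
    seeds: genes taken from the top of a ranked list
    """
    seeds = set(seeds)
    links = {s: set() for s in seeds}
    for s in seeds:
        for partner in ppi.get(s, ()):
            if partner in seeds:
                # direct seed-seed interaction
                links[s].add(partner)
                links[partner].add(s)
            elif allow_intermediate:
                # partner is not a seed: use it as a one-step bridge
                for other in ppi.get(partner, ()):
                    if other in seeds and other != s:
                        links[s].add(other)
                        links[other].add(s)
    components = []
    unvisited = set(seeds)
    while unvisited:
        stack = [unvisited.pop()]
        component = set(stack)
        while stack:
            for neighbour in links[stack.pop()]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        unvisited -= component
        components.append(component)
    return components


# Hypothetical network: A and B interact only through the non-listed protein X.
toy_ppi = {"A": {"X"}, "X": {"A", "B"}, "B": {"X"},
           "C": {"D"}, "D": {"C"}, "E": set()}
toy_seeds = ["A", "B", "C", "D", "E"]

if __name__ == "__main__":
    for flag in (False, True):
        comps = seed_components(toy_ppi, toy_seeds, allow_intermediate=flag)
        sizes = sorted(len(c) for c in comps)
        print("intermediates allowed:", flag, "component sizes:", sizes)
```

The number of components and the mean component size returned here correspond to the two parameters the abstract reports as most discriminative; comparing them with values from random gene lists of the same length would be the natural next step.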

The transcriptional regulatory network of Corynebacterium jeikeium K411 and its interaction with metabolic routes contributing to human body odor formation.

PubMed

Barzantny, Helena; Schröder, Jasmin; Strotmeier, Jasmin; Fredrich, Eugenie; Brune, Iris; Tauch, Andreas

2012-06-15

Lipophilic corynebacteria are involved in the generation of volatile odorous products in the process of human body odor formation by degrading skin lipids and specific odor precursors. Therefore, these bacteria represent appropriate model systems for the cosmetic industry to examine axillary malodor formation on the molecular level. To understand the transcriptional control of metabolic pathways involved in this process, the transcriptional regulatory network of the lipophilic axilla isolate Corynebacterium jeikeium K411 was reconstructed from the complete genome sequence. This bioinformatic approach detected a gene-regulatory repertoire of 83 candidate proteins, including 56 DNA-binding transcriptional regulators, nine two-component systems, nine sigma factors, and nine regulators with diverse physiological functions. Furthermore, a cross-genome comparison among selected corynebacterial species of the taxonomic cluster 3 revealed a common gene-regulatory repertoire of 44 transcriptional regulators, including the MarR-like regulator Jk0257, which is exclusively encoded in the genomes of this taxonomical subline. The current network reconstruction comprises 48 transcriptional regulators and 674 gene-regulatory interactions that were assigned to five interconnected functional modules. Most genes involved in lipid degradation are under the combined control of the global cAMP-sensing transcriptional regulator GlxR and the LuxR-family regulator RamA, probably reflecting the essential role of lipid degradation in C. jeikeium. This study provides the first genome-scale in silico analysis of the transcriptional regulation of metabolism in a lipophilic bacterium involved in the formation of human body odor. Copyright © 2012 Elsevier B.V. All rights reserved.
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems

PubMed Central

Zainudin, Suhaila; Arif, Shereena M.

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods has been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by past studies were not specifically geared towards proving the ability of GRN prediction methods to avoid the occurrence of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5. PMID:28250767
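Why regressing on several candidate regulators at once can suppress a cascade error is easiest to see on simulated data. The sketch below is an illustration of the general idea only, not the authors' procedure: the simulated cascade, the effect sizes, the noise levels, and the helper names are assumptions. Gene C is driven by B, which is driven by A. A regression of C on A alone gives a large slope (a spurious direct edge A → C), whereas a regression of C on A and B together assigns A a coefficient near zero.

```python
import random


def simple_slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def two_predictor_ols(x1, x2, y):
    """Least-squares coefficients of y on x1 and x2 (intercept removed by centering)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # nonzero unless x1 and x2 are perfectly collinear
    beta1 = (s1y * s22 - s2y * s12) / det
    beta2 = (s2y * s11 - s1y * s12) / det
    return beta1, beta2


# Simulated cascade A -> B -> C with hypothetical effect sizes and noise.
rng = random.Random(0)
gene_a = [rng.gauss(0, 1) for _ in range(2000)]
gene_b = [2.0 * a + rng.gauss(0, 0.5) for a in gene_a]
gene_c = [3.0 * b + rng.gauss(0, 0.5) for b in gene_b]

marginal_a = simple_slope(gene_a, gene_c)                 # A alone looks like a regulator of C
coef_a, coef_b = two_predictor_ols(gene_a, gene_b, gene_c)  # jointly, B explains C

if __name__ == "__main__":
    print("C ~ A      : slope for A =", round(marginal_a, 3))
    print("C ~ A + B  : coef A =", round(coef_a, 3), ", coef B =", round(coef_b, 3))
```

Because A and B are strongly correlated in a cascade, the joint estimate becomes unstable as the correlation approaches one, which is why a collinearity check such as the one mentioned in the abstract matters for this kind of model.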
Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network

PubMed Central

Hwang, Sohyun; Rhee, Seung Y; Marcotte, Edward M; Lee, Insuk

2012-01-01

AraNet is a functional gene network for the reference plant Arabidopsis and has been constructed in order to identify new genes associated with plant traits. It is highly predictive for diverse biological pathways and can be used to prioritize genes for functional screens. Moreover, AraNet provides a web-based tool with which plant biologists can efficiently discover novel functions of Arabidopsis genes (http://www.functionalnet.org/aranet/). This protocol explains how to conduct network-based prediction of gene functions using AraNet and how to interpret the prediction results. Functional discovery in plant biology is facilitated by combining candidate prioritization by AraNet with focused experimental tests. PMID:21886106
Long-term solar UV radiation reconstructed by Artificial Neural Networks (ANN)

NASA Astrophysics Data System (ADS)

Feister, U.; Junk, J.; Woldt, M.

2008-01-01

Artificial Neural Networks (ANN) are efficient tools to derive solar UV radiation from measured meteorological parameters such as global radiation, aerosol optical depths and atmospheric column ozone. The ANN model has been tested with different combinations of data from the two sites Potsdam and Lindenberg, and used to reconstruct solar UV radiation at eight European sites more than 100 years into the past. Annual totals of UV radiation derived from reconstructed daily UV values reflect interannual variations and long-term patterns that are compatible with variabilities and changes of measured input data, in particular global dimming by about 1980-1990, subsequent global brightening, volcanic eruption effects such as that of Mt. Pinatubo, and the long-term ozone decline since the 1970s. Patterns of annual erythemal UV radiation are very similar at sites located at latitudes close to each other, but different patterns occur between UV radiation at sites in different latitude regions.
Sieve-based relation extraction of gene regulatory networks from biological literature

PubMed Central

2015-01-01

Background Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. Results We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. 
Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Conclusions Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to a broad range of relation extraction tasks and data domains. PMID:26551454
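The skip-mention transformation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a k-skip-mention sequence keeps every (k+1)-th mention of the original ordered mention list, so that mentions k positions apart become adjacent and visible to a first-order model. The function name and the placeholder mention labels are hypothetical and are not taken from the authors' system.

```python
from typing import List, Sequence


def skip_mention_sequences(mentions: Sequence[str], skip: int) -> List[List[str]]:
    """Split an ordered mention list into interleaved subsequences.

    Assumption: with skip = k, every (k+1)-th mention is kept, which yields
    up to k+1 subsequences (one per starting offset). skip = 0 returns the
    original sequence unchanged.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    step = skip + 1
    sequences = [list(mentions[offset::step]) for offset in range(step)]
    # Drop empty subsequences that arise when the input is shorter than step.
    return [seq for seq in sequences if seq]


# Placeholder mentions m1..m5 in document order.
example = skip_mention_sequences(["m1", "m2", "m3", "m4", "m5"], skip=1)
```

With skip=1, mentions m1 and m3 become neighbours, so a linear-chain model that only relates adjacent items can still label a relation between them.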
Sieve-based relation extraction of gene regulatory networks from biological literature.

PubMed

Žitnik, Slavko; Žitnik, Marinka; Zupan, Blaž; Bajec, Marko

2015-01-01

Relation extraction is an essential procedure in literature mining. It focuses on extracting semantic relations between parts of text, called mentions. Biomedical literature includes an enormous amount of textual descriptions of biological entities, their interactions and results of related experiments. To extract them in an explicit, computer readable format, these relations were at first extracted manually from databases. Manual curation was later replaced with automatic or semi-automatic tools with natural language processing capabilities. The current challenge is the development of information extraction procedures that can directly infer more complex relational structures, such as gene regulatory networks. We develop a computational approach for extraction of gene regulatory networks from textual data. Our method is designed as a sieve-based system and uses linear-chain conditional random fields and rules for relation extraction. With this method we successfully extracted the sporulation gene regulation network in the bacterium Bacillus subtilis for the information extraction challenge at the BioNLP 2013 conference. To enable extraction of distant relations using first-order models, we transform the data into skip-mention sequences. We infer multiple models, each of which is able to extract different relationship types. Following the shared task, we conducted additional analysis using different system settings that resulted in reducing the reconstruction error of bacterial sporulation network from 0.73 to 0.68, measured as the slot error rate between the predicted and the reference network. We observe that all relation extraction sieves contribute to the predictive performance of the proposed approach. Also, features constructed by considering mention words and their prefixes and suffixes are the most important features for higher accuracy of extraction. 
Analysis of distances between different mention types in the text shows that our choice of transforming data into skip-mention sequences is appropriate for detecting relations between distant mentions. Linear-chain conditional random fields, along with appropriate data transformations, can be efficiently used to extract relations. The sieve-based architecture simplifies the system as new sieves can be easily added or removed and each sieve can utilize the results of previous ones. Furthermore, sieves with conditional random fields can be trained on arbitrary text data and hence are applicable to a broad range of relation extraction tasks and data domains.
Lung evolution as a cipher for physiology

PubMed Central

Torday, J. S.; Rehan, V. K.

2009-01-01

In the postgenomic era, we need an algorithm to readily translate genes into physiologic principles. The failure to advance biomedicine is due to the false hope raised in the wake of the Human Genome Project (HGP) by the promise of systems biology as a ready means of reconstructing physiology from genes. Like the atom in physics, the cell, not the gene, is the smallest completely functional unit of biology. Trying to reassemble gene regulatory networks without accounting for this fundamental feature of evolution will result in a genomic atlas, but not an algorithm for functional genomics. For example, the evolution of the lung can be “deconvoluted” by applying cell-cell communication mechanisms to all aspects of lung biology: development, homeostasis, and regeneration/repair. Gene regulatory networks common to these processes predict ontogeny, phylogeny, and the disease-related consequences of failed signaling. This algorithm elucidates characteristics of vertebrate physiology as a cascade of emergent and contingent cellular adaptational responses. By reducing complex physiological traits to gene regulatory networks and arranging them hierarchically in a self-organizing map, like the periodic table of elements in physics, the first principles of physiology will emerge. PMID:19366785
GIGA: a simple, efficient algorithm for gene tree inference in the genomic age

PubMed Central

2010-01-01

Background Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. Results We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. Conclusions GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. 
We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events. PMID:20534164
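The "very simple distance metric (pairwise sequence differences)" named in the abstract can be sketched as follows. This is only an illustration of such a metric on pre-aligned sequences. The gap handling (positions with a gap in either sequence are ignored) and the function names are assumptions, not details confirmed by the GIGA description.

```python
from typing import Dict, List, Tuple


def pairwise_difference(a: str, b: str, gap: str = "-") -> float:
    """Fraction of compared alignment columns at which two sequences differ.

    Assumption: columns containing a gap in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not compared:
        raise ValueError("no columns without gaps to compare")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Distances for every unordered pair of named sequences."""
    names: List[str] = sorted(aligned)
    return {
        (p, q): pairwise_difference(aligned[p], aligned[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }
```

A matrix of this kind is the sort of input an agglomerative tree-building procedure starts from; the species-tree rules that GIGA applies on top of it are not reproduced here.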
GIGA: a simple, efficient algorithm for gene tree inference in the genomic age.

PubMed

Thomas, Paul D

2010-06-09

Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. 
We compared trees produced by GIGA to those in the TreeFam database, and they were very similar in general, with most differences likely due to poor alignment quality. However, some remaining differences are algorithmic, and can be explained by the fact that GIGA tends to put a larger emphasis on minimizing gene duplication and deletion events.
A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory.

PubMed

Nogales, Juan; Palsson, Bernhard Ø; Thiele, Ines

2008-09-16

Pseudomonas putida is the best-studied pollutant-degrading bacterium and is harnessed by industrial biotechnology to synthesize fine chemicals. Since the publication of P. putida KT2440's genome, some in silico analyses of its metabolic and biotechnology capacities have been published. However, global understanding of the capabilities of P. putida KT2440 requires the construction of a metabolic model that enables the integration of classical experimental data along with genomic and high-throughput data. The constraint-based reconstruction and analysis (COBRA) approach has been successfully used to build and analyze in silico genome-scale metabolic reconstructions. We present a genome-scale reconstruction of P. putida KT2440's metabolism, iJN746, which was constructed based on genomic, biochemical, and physiological information. This manually-curated reconstruction accounts for 746 genes, 950 reactions, and 911 metabolites. iJN746 captures biotechnologically relevant pathways, including polyhydroxyalkanoate synthesis and catabolic pathways of aromatic compounds (e.g., toluene, benzoate, phenylacetate, nicotinate), not described in other metabolic reconstructions or biochemical databases. The predictive potential of iJN746 was validated using experimental data including growth performance and gene deletion studies. Furthermore, in silico growth on toluene was found to be oxygen-limited, suggesting the existence of oxygen-efficient pathways not yet annotated in P. putida's genome. Moreover, we evaluated the production efficiency of polyhydroxyalkanoates from various carbon sources and found fatty acids as the most prominent candidates, as expected. Here we present the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-surrounder. Taken together, this work illustrates the utility of iJN746 as i) a knowledge-base, ii) a discovery tool, and iii) an engineering platform to explore P. putida's potential in bioremediation and bioplastic production.
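For orientation, constraint-based (COBRA-style) growth predictions are commonly posed as a flux balance analysis linear program. The generic textbook form is given below; the abstract does not state the exact objective or bounds used for iJN746, so the symbols are illustrative assumptions.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{aligned}
```

Here S is the stoichiometric matrix (rows are metabolites, columns are reactions), v is the vector of reaction fluxes, c selects the objective (typically a biomass reaction), and the bounds encode reaction reversibility and substrate uptake limits such as oxygen availability.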
Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

PubMed

Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

2013-09-22

High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.
A Small World of Neuronal Synchrony

PubMed Central

Yu, Shan; Huang, Debin; Singer, Wolf

2008-01-01

A small-world network has been suggested to be an efficient solution for achieving both modular and global processing—a property highly desirable for brain computations. Here, we investigated functional networks of cortical neurons using correlation analysis to identify functional connectivity. To reconstruct the interaction network, we applied the Ising model based on the principle of maximum entropy. This allowed us to assess the interactions by measuring pairwise correlations and to assess the strength of coupling from the degree of synchrony. Visual responses were recorded in visual cortex of anesthetized cats, simultaneously from up to 24 neurons. First, pairwise correlations captured most of the patterns in the population's activity and, therefore, provided a reliable basis for the reconstruction of the interaction networks. Second, and most importantly, the resulting networks had small-world properties; the average path lengths were as short as in simulated random networks, but the clustering coefficients were larger. Neurons differed considerably with respect to the number and strength of interactions, suggesting the existence of “hubs” in the network. Notably, there was no evidence for scale-free properties. These results suggest that cortical networks are optimized for the coexistence of local and global computations: feature detection and feature integration or binding. PMID:18400792
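The two statistics used in the abstract to diagnose small-world structure, average path length and clustering coefficient, can be computed on an undirected, unweighted graph as in the sketch below. The adjacency-set representation and function names are illustrative assumptions; the study itself worked with interaction networks inferred from an Ising model, which is not reproduced here.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # node -> set of neighbouring nodes (undirected)


def average_clustering(g: Graph) -> float:
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for nbrs in g.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in g[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(g)


def average_path_length(g: Graph) -> float:
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in g[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


triangle: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
chain: Graph = {0: {1}, 1: {0, 2}, 2: {1}}
```

A network is usually called small-world when its path length is close to that of a size-matched random graph while its clustering is clearly higher, which is the comparison the abstract reports.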
Strawberry: Fast and accurate genome-guided transcript reconstruction and quantification from RNA-Seq.

PubMed

Liu, Ruolin; Dickerson, Julie

2017-11-01

We propose a novel method and software tool, Strawberry, for transcript reconstruction and quantification from RNA-Seq data under the guidance of genome alignment and independent of gene annotation. Strawberry consists of two modules: assembly and quantification. The novelty of Strawberry is that the two modules use different optimization frameworks but utilize the same data graph structure, which allows a highly efficient, expandable and accurate algorithm for dealing with large data. The assembly module parses aligned reads into splicing graphs, and uses network flow algorithms to select the most likely transcripts. The quantification module uses a latent class model to assign read counts from the nodes of splicing graphs to transcripts. Strawberry simultaneously estimates the transcript abundances and corrects for sequencing bias through an EM algorithm. Based on simulations, Strawberry outperforms Cufflinks and StringTie in terms of both assembly and quantification accuracies. Under the evaluation of a real data set, the estimated transcript expression by Strawberry has the highest correlation with Nanostring probe counts, an independent experiment measure for transcript expression. Strawberry is written in C++14, and is available as open source software at https://github.com/ruolin/strawberry under the MIT license.
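The idea of assigning ambiguous read counts to transcripts with an EM algorithm can be illustrated with a deliberately simplified sketch. It is not Strawberry's model: transcript lengths and sequencing-bias correction are ignored, and the data layout (read counts keyed by the set of compatible transcripts) and all names are assumptions made for the example.

```python
from typing import Dict, FrozenSet, List


def em_abundance(
    read_classes: Dict[FrozenSet[str], int],
    transcripts: List[str],
    n_iter: int = 200,
) -> Dict[str, float]:
    """Estimate relative transcript abundances from ambiguous read counts.

    read_classes maps a set of transcripts to the number of reads that are
    compatible with exactly that set.
    """
    total = sum(read_classes.values())
    if total == 0:
        raise ValueError("no reads to assign")
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        expected = {t: 0.0 for t in transcripts}
        # E-step: share each class's reads in proportion to current abundances.
        for compatible, count in read_classes.items():
            norm = sum(theta[t] for t in compatible)
            if norm == 0.0:
                continue
            for t in compatible:
                expected[t] += count * theta[t] / norm
        # M-step: abundances are the normalised expected read counts.
        theta = {t: expected[t] / total for t in transcripts}
    return theta


# Hypothetical counts: 30 reads unique to A, 10 unique to B, 40 shared.
counts = {frozenset({"A"}): 30, frozenset({"A", "B"}): 40, frozenset({"B"}): 10}
estimate = em_abundance(counts, ["A", "B"])
```

In this toy case the shared reads are split 3:1 in favour of A at convergence, giving abundances of 0.75 and 0.25.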
MAGIA2: from miRNA and genes expression data integrative analysis to microRNA–transcription factor mixed regulatory circuits (2012 update)

PubMed Central

Bisognin, Andrea; Sales, Gabriele; Coppe, Alessandro; Bortoluzzi, Stefania; Romualdi, Chiara

2012-01-01

MAGIA2 (http://gencomp.bio.unipd.it/magia2) is an update, extension and evolution of the MAGIA web tool. It is dedicated to the integrated analysis of in silico target prediction, microRNA (miRNA) and gene expression data for the reconstruction of post-transcriptional regulatory networks. miRNAs are fundamental post-transcriptional regulators of several key biological and pathological processes. As miRNAs act prevalently through target degradation, their expression profiles are expected to be inversely correlated to those of the target genes. Low specificity of target prediction algorithms makes integration approaches an interesting solution for target prediction refinement. MAGIA2 performs this integrative approach supporting different association measures, multiple organisms and almost all target prediction algorithms. Nevertheless, miRNA activity should be viewed as part of a more complex scenario where regulatory elements and their interactors generate a highly connected network and where gene expression profiles are the result of different levels of regulation. The updated MAGIA2 tries to dissect this complexity by reconstructing mixed regulatory circuits involving either miRNA or transcription factor (TF) as regulators. Two types of circuits are identified: (i) a TF that regulates both a miRNA and its target and (ii) a miRNA that regulates both a TF and its target. PMID:22618880
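The two circuit types listed at the end of the abstract can be found by a simple search over directed regulatory edges, as in the sketch below. The edge-set representation, the function name, and the regulator and gene labels are hypothetical; this does not reflect how the MAGIA2 web tool is implemented.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target)
Circuit = Tuple[str, str, str]


def find_mixed_circuits(
    tf_edges: Set[Edge], mirna_edges: Set[Edge]
) -> Tuple[List[Circuit], List[Circuit]]:
    """Return (type_i, type_ii) circuits.

    type_i:  (TF, miRNA, gene) with TF->miRNA, miRNA->gene and TF->gene.
    type_ii: (miRNA, TF, gene) with miRNA->TF, TF->gene and miRNA->gene.
    """
    mirnas = {src for src, _ in mirna_edges}
    tfs = {src for src, _ in tf_edges}
    type_i = [
        (tf, mid, gene)
        for tf, mid in tf_edges if mid in mirnas
        for src, gene in mirna_edges
        if src == mid and (tf, gene) in tf_edges
    ]
    type_ii = [
        (mirna, mid, gene)
        for mirna, mid in mirna_edges if mid in tfs
        for src, gene in tf_edges
        if src == mid and (mirna, gene) in mirna_edges
    ]
    return sorted(type_i), sorted(type_ii)


# Hypothetical regulators and genes, for illustration only.
tf_net = {("TF1", "miR-a"), ("TF1", "G1"), ("TF2", "G2")}
mirna_net = {("miR-a", "G1"), ("miR-b", "TF2"), ("miR-b", "G2")}
circuits_i, circuits_ii = find_mixed_circuits(tf_net, mirna_net)
```

In practice the edges would first be filtered by the expression-based association measures the abstract describes, so that only supported interactions enter the search.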
A sub-space greedy search method for efficient Bayesian Network inference.

PubMed

Zhang, Qing; Cao, Yong; Li, Yong; Zhu, Yanming; Sun, Samuel S M; Guo, Dianjing

2011-09-01

Bayesian network (BN) has been successfully used to infer the regulatory relationships of genes from microarray datasets. However, one major limitation of the BN approach is the computational cost because the calculation time grows more than exponentially with the dimension of the dataset. In this paper, we propose a sub-space greedy search method for efficient Bayesian Network inference. Particularly, this method limits the greedy search space by only selecting gene pairs with higher partial correlation coefficients. Using both synthetic and real data, we demonstrate that the proposed method achieved comparable results with the standard greedy search method yet saved ∼50% of the computational time. We believe that the sub-space search method can be widely used for efficient BN inference in systems biology. Copyright © 2011 Elsevier Ltd. All rights reserved.
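The search-space restriction described above can be sketched as a pre-filter that keeps only gene pairs whose partial correlation exceeds a cut-off. The abstract does not say how the partial correlations are computed or which cut-off is used; the sketch assumes full-order partial correlations obtained from the inverse of the correlation matrix and a hypothetical `threshold` parameter. The greedy structure search itself is not shown.

```python
import math
from typing import List, Set, Tuple

Matrix = List[List[float]]


def correlation_matrix(data: Matrix) -> Matrix:
    """data[g] is the expression profile of gene g across samples."""
    def pearson(x: List[float], y: List[float]) -> float:
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx = sum((a - mx) ** 2 for a in x)
        syy = sum((b - my) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)
    return [[pearson(x, y) for y in data] for x in data]


def invert(m: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(m)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def candidate_pairs(data: Matrix, threshold: float) -> Set[Tuple[int, int]]:
    """Gene index pairs whose |partial correlation| is at least threshold."""
    prec = invert(correlation_matrix(data))
    keep = set()
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            pcor = -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            if abs(pcor) >= threshold:
                keep.add((i, j))
    return keep


# Toy chain x -> z -> y: x and y are correlated only through z.
x = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
z = [2.0, 2.0, 0.0, 0.0, 0.0, 0.0, -2.0, -2.0]
y = [3.0, 1.0, 1.0, -1.0, 1.0, -1.0, -1.0, -3.0]
pairs = candidate_pairs([x, z, y], threshold=0.3)
```

In the toy data the pair (x, y) is dropped because its partial correlation given z is essentially zero, so a subsequent greedy search would only need to score the two remaining edges.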
Phylogenetic comparative methods on phylogenetic networks with reticulations.

PubMed

Bastide, Paul; Solís-Lemus, Claudia; Kriebel, Ricardo; Sparks, K William; Ané, Cécile

2018-04-25

The goal of Phylogenetic Comparative Methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. One natural extension of the BM is to use a weighted average model for the trait of a hybrid, at a reticulation point. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's λ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts, and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios, and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a dataset of Xiphophorus fishes, to confirm and complete previous analysis in this group. All the methods developed here are available in the Julia package PhyloNetworks.
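The one-pass computation of the trait variance matrix can be sketched under the weighted-average Brownian motion model described above. The sketch assumes that nodes are supplied in an order in which every parent precedes its children, that each hybrid node lists its parent edges with inheritance weights summing to 1, and that tree nodes have a single parent with weight 1. The data layout and names are illustrative and do not mirror the PhyloNetworks API.

```python
from typing import Dict, List, Tuple

# node -> list of (parent, branch_length, inheritance_weight)
ParentEdges = Dict[str, List[Tuple[str, float, float]]]
Cov = Dict[Tuple[str, str], float]


def network_variance(
    preorder: List[str], parents: ParentEdges, sigma2: float = 1.0
) -> Cov:
    """Covariance of a BM trait between all node pairs, in one traversal."""
    V: Cov = {}
    for idx, node in enumerate(preorder):
        edges = parents.get(node, [])  # the root has no parent edges
        # Covariance with every earlier node: weighted average over parents.
        for other in preorder[:idx]:
            cov = sum(w * V[(p, other)] for p, _, w in edges)
            V[(node, other)] = V[(other, node)] = cov
        # Variance: parental (co)variances plus variance gained on each edge.
        var = sum(
            wp * wq * V[(p, q)] for p, _, wp in edges for q, _, wq in edges
        )
        var += sum(w * w * sigma2 * length for _, length, w in edges)
        V[(node, node)] = var
    return V


# Toy network: root R with children A and B; H is an equal-weight hybrid.
toy_parents: ParentEdges = {
    "A": [("R", 1.0, 1.0)],
    "B": [("R", 1.0, 1.0)],
    "H": [("A", 0.5, 0.5), ("B", 0.5, 0.5)],
}
V = network_variance(["R", "A", "B", "H"], toy_parents)
```

For a tree node the update reduces to the familiar rule that a node's variance is its parent's variance plus the branch length times sigma2, so the same loop handles trees and networks.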
MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers.

PubMed

Allot, Alexis; Chennen, Kirsley; Nevers, Yannis; Poidevin, Laetitia; Kress, Arnaud; Ripp, Raymond; Thompson, Julie Dawn; Poch, Olivier; Lecompte, Odile

2017-06-16

The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction. MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data. MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization. MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user's specific interests and provides an efficient way to share information with collaborators. Furthermore, the user's behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents. We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends. 
©Alexis Allot, Kirsley Chennen, Yannis Nevers, Laetitia Poidevin, Arnaud Kress, Raymond Ripp, Julie Dawn Thompson, Olivier Poch, Odile Lecompte. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 16.06.2017.
A platform for rapid prototyping of synthetic gene networks in mammalian cells

PubMed Central

Duportet, Xavier; Wroblewska, Liliana; Guye, Patrick; Li, Yinqing; Eyquem, Justin; Rieders, Julianne; Rimchala, Tharathorn; Batt, Gregory; Weiss, Ron

2014-01-01

Mammalian synthetic biology may provide novel therapeutic strategies, help decipher new paths for drug discovery and facilitate synthesis of valuable molecules. Yet, our capacity to genetically program cells is currently hampered by the lack of efficient approaches to streamline the design, construction and screening of synthetic gene networks. To address this problem, here we present a framework for modular and combinatorial assembly of functional (multi)gene expression vectors and their efficient and specific targeted integration into a well-defined chromosomal context in mammalian cells. We demonstrate the potential of this framework by assembling and integrating different functional mammalian regulatory networks including the largest gene circuit built and chromosomally integrated to date (6 transcription units, 27kb) encoding an inducible memory device. Using a library of 18 different circuits as a proof of concept, we also demonstrate that our method enables one-pot/single-flask chromosomal integration and screening of circuit libraries. This rapid and powerful prototyping platform is well suited for comparative studies of genetic regulatory elements, genes and multi-gene circuits as well as facile development of libraries of isogenic engineered cell lines. PMID:25378321
A Graphical Model of Smoking-Induced Global Instability in Lung Cancer.

PubMed

Wang, Yanbo; Qian, Weikang; Yuan, Bo

2018-01-01

Smoking is the major cause of lung cancer and the leading cause of cancer-related death in the world. The most current view about lung cancer is no longer limited to individual genes being mutated by any carcinogenic insults from smoking. Instead, tumorigenesis is a phenotype conferred by many systematic and global alterations, leading to extensive heterogeneity and variation for both the genotypes and phenotypes of individual cancer cells. Thus, strategically it is of foremost importance to develop a methodology to capture any consistent and global alterations presumably shared by most of the cancerous cells for a given population. This is particularly true because almost all of the data collected from solid cancers (including lung cancers) are usually widely separated in temporal or even spatial context. Here, we report a multiple non-Gaussian graphical model to reconstruct the gene interaction network using two previously published gene expression datasets. Our graphical model aims to selectively detect gross structural changes at the level of gene interaction networks. Our methodology is extensively validated, demonstrating good robustness, as well as the selectivity and specificity expected based on our biological insights. In summary, gene regulatory networks are still relatively stable during presumably the early stage of neoplastic transformation. But drastic structural differences can be found between lung cancer and its normal control, including the gain of functional modules for cellular proliferation such as EGFR and PDGFRA, as well as the loss of the important IL6 module, supporting their roles as potential drug targets. Interestingly, our method can also detect early modular changes, with ALDH3A1 and its associated interactions being strongly implicated as a potential early marker, whose activation appears to alter the LCN2 module as well as its interactions with the important TP53-MDM2 circuitry. Our strategy of using the graphical model to reconstruct gene interaction networks with biologically-inspired constraints exemplifies the importance and beauty of biology in developing any bio-computational approach.
Genes under weaker stabilizing selection increase network evolvability and rapid regulatory adaptation to an environmental shift.

PubMed

Laarits, T; Bordalo, P; Lemos, B

2016-08-01

Regulatory networks play a central role in the modulation of gene expression, the control of cellular differentiation, and the emergence of complex phenotypes. Regulatory networks could constrain or facilitate evolutionary adaptation in gene expression levels. Here, we model the adaptation of regulatory networks and gene expression levels to a shift in the environment that alters the optimal expression level of a single gene. Our analyses show signatures of natural selection on regulatory networks that both constrain and facilitate rapid evolution of gene expression level towards new optima. The analyses are interpreted from the standpoint of neutral expectations and illustrate the challenge of making inferences about network adaptation. Furthermore, we examine the consequence of variable stabilizing selection across genes on the strength and direction of interactions in regulatory networks and in their subsequent adaptation. We observe that directional selection on a highly constrained gene previously under strong stabilizing selection was more efficient when the gene was embedded within a network of partners under relaxed stabilizing selection pressure. The observation leads to the expectation that evolutionarily resilient regulatory networks will contain optimal ratios of genes whose expression is under weak and strong stabilizing selection. Altogether, our results suggest that the variable strengths of stabilizing selection across genes within regulatory networks might themselves contribute to the long-term adaptation of complex phenotypes. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.

Cyanobacterial Biofuels: Strategies and Developments on Network and Modeling.

PubMed

Klanchui, Amornpan; Raethong, Nachon; Prommeenate, Peerada; Vongsangnak, Wanwipa; Meechai, Asawin

Cyanobacteria, the phototrophic microorganisms, have attracted much attention recently as a promising source for environmentally sustainable biofuels production. However, barriers to commercial markets for cyanobacteria-based biofuels concern economic feasibility. Miscellaneous strategies for improving the production performance of cyanobacteria have thus been developed. Among these, simple ad hoc strategies, which fail to fully optimize cell growth coupled with the desired product yield, are explored. With the advancement of genomics and systems biology, a new paradigm toward systems metabolic engineering has been recognized. In particular, genome-scale metabolic network reconstruction and modeling is a crucial systems-based tool for whole-cell-wide investigation and prediction. In this review, the cyanobacterial genome-scale metabolic models, which offer a system-level understanding of cyanobacterial metabolism, are described. The main process of metabolic network reconstruction and modeling of cyanobacteria is summarized. Strategies and developments on genome-scale network and modeling through the systems metabolic engineering approach are advanced and employed for efficient cyanobacterial-based biofuels production.
SPECTRAL GRAPH THEORY AND GRAPH ENERGY METRICS SHOW EVIDENCE FOR THE ALZHEIMER’S DISEASE DISCONNECTION SYNDROME IN APOE-4 RISK GENE CARRIERS

PubMed Central

Daianu, Madelaine; Mezher, Adam; Jahanshad, Neda; Hibar, Derrek P.; Nir, Talia M.; Jack, Clifford R.; Weiner, Michael W.; Bernstein, Matt A.; Thompson, Paul M.

2015-01-01

Our understanding of network breakdown in Alzheimer’s disease (AD) is likely to be enhanced through advanced mathematical descriptors. Here, we applied spectral graph theory to provide novel metrics of structural connectivity based on 3-Tesla diffusion weighted images in 42 AD patients and 50 healthy controls. We reconstructed connectivity networks using whole-brain tractography and examined, for the first time here, cortical disconnection based on the graph energy and spectrum. We further assessed supporting metrics - link density and nodal strength - to better interpret our results. Metrics were analyzed in relation to the well-known APOE-4 genetic risk factor for late-onset AD. The number of disconnected cortical regions increased with the number of copies of the APOE-4 risk gene in people with AD. Each additional copy of the APOE-4 risk gene may lead to more dysfunctional networks with weakened or abnormal connections, providing evidence for the previously hypothesized “disconnection syndrome”. PMID:26413205
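For readers unfamiliar with the term, the standard definition of graph energy from spectral graph theory is reproduced below. The abstract does not state which matrix the authors used for their weighted connectivity networks, so treating A as the (possibly weighted) adjacency matrix is an assumption.

```latex
E(G) \;=\; \sum_{i=1}^{n} \lvert \lambda_i \rvert ,
\qquad \lambda_1, \dots, \lambda_n \ \text{the eigenvalues of the adjacency matrix } A \text{ of } G
```

Here n is the number of nodes (cortical regions). Because the eigenvalues of A depend on the whole pattern of connections, the energy is a single global summary of the spectrum that changes when links are lost or weakened.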
SPECTRAL GRAPH THEORY AND GRAPH ENERGY METRICS SHOW EVIDENCE FOR THE ALZHEIMER'S DISEASE DISCONNECTION SYNDROME IN APOE-4 RISK GENE CARRIERS.

PubMed

Daianu, Madelaine; Mezher, Adam; Jahanshad, Neda; Hibar, Derrek P; Nir, Talia M; Jack, Clifford R; Weiner, Michael W; Bernstein, Matt A; Thompson, Paul M

2015-04-01

Our understanding of network breakdown in Alzheimer's disease (AD) is likely to be enhanced through advanced mathematical descriptors. Here, we applied spectral graph theory to provide novel metrics of structural connectivity based on 3-Tesla diffusion weighted images in 42 AD patients and 50 healthy controls. We reconstructed connectivity networks using whole-brain tractography and examined, for the first time here, cortical disconnection based on the graph energy and spectrum. We further assessed supporting metrics - link density and nodal strength - to better interpret our results. Metrics were analyzed in relation to the well-known APOE-4 genetic risk factor for late-onset AD. The number of disconnected cortical regions increased with the number of copies of the APOE-4 risk gene in people with AD. Each additional copy of the APOE-4 risk gene may lead to more dysfunctional networks with weakened or abnormal connections, providing evidence for the previously hypothesized "disconnection syndrome".
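The abstract names graph energy without defining it. As a hedged reminder, the standard definition from spectral graph theory is given below; whether the authors applied it to a binary or a weighted connectivity matrix is not stated here, so the matrix A is an assumption.

```latex
E(G) = \sum_{i=1}^{n} \lvert \lambda_i \rvert ,
\qquad \lambda_1, \dots, \lambda_n \ \text{the eigenvalues of the } n \times n \text{ connectivity (adjacency) matrix } A .
```

Under this definition, removing or weakening connections generally changes the spectrum of A, which is why the energy can serve as a summary of network disconnection.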
A Continental-Wide Perspective: The Genepool of Nuclear Encoded Ribosomal DNA and Single-Copy Gene Sequences in North American Boechera (Brassicaceae)

PubMed Central

Kiefer, Christiane; Koch, Marcus A.

2012-01-01

74 of the currently accepted 111 taxa of the North American genus Boechera (Brassicaceae) were subject to phylogenetic reconstruction and network analysis. The dataset comprised 911 accessions for which ITS sequences were analyzed. Phylogenetic analyses yielded largely unresolved trees. Together with the network analysis confirming this result, this can be interpreted as an indication of multiple, independent, and rapid diversification events. Network analyses were superimposed with datasets describing i) geographical distribution, ii) taxonomy, iii) reproductive mode, and iv) distribution history based on phylogeographic evidence. Our results provide first direct evidence for enormous reticulate evolution in the entire genus and give further insights into the evolutionary history of this complex genus on a continental scale. In addition, two novel single-copy gene markers, orthologues of the Arabidopsis thaliana genes At2g25920 and At3g18900, were analyzed for subsets of taxa and confirmed the findings obtained through the ITS data. PMID:22606266
Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

PubMed

Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

2017-08-01

Correct inference of genetic regulations inside a cell from biological data such as time series microarray data is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches to model the dynamics as well as to infer correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, namely the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small and medium scale artificial genetic networks with and without different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against real time-series gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.
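The abstract does not give the model equations. The sketch below assumes a discrete-time RNN formulation that is commonly used for GRN inference, x_i(t+dt) = x_i(t) + (dt/tau_i) * (sigmoid(sum_j w_ij x_j(t) + beta_i) - x_i(t)), together with a squared-error fitness that a metaheuristic such as ESWSA could minimize. The function names (`rnn_step`, `simulate`, `fitness`) and the toy parameters are hypothetical; the search algorithm itself is not reproduced.

```python
import math

def rnn_step(x, w, beta, tau, dt):
    """Advance all gene expression levels by one time step (assumed RNN model)."""
    n = len(x)
    nxt = []
    for i in range(n):
        # Weighted regulatory input to gene i plus its bias term.
        drive = sum(w[i][j] * x[j] for j in range(n)) + beta[i]
        activation = 1.0 / (1.0 + math.exp(-drive))
        nxt.append(x[i] + (dt / tau[i]) * (activation - x[i]))
    return nxt

def simulate(x0, w, beta, tau, dt, steps):
    """Return the trajectory [x(0), x(dt), ..., x(steps*dt)]."""
    traj = [list(x0)]
    for _ in range(steps):
        traj.append(rnn_step(traj[-1], w, beta, tau, dt))
    return traj

def fitness(observed, w, beta, tau, dt):
    """Sum of squared errors between observed data and the model trajectory."""
    predicted = simulate(observed[0], w, beta, tau, dt, len(observed) - 1)
    return sum((p - o) ** 2
               for p_row, o_row in zip(predicted, observed)
               for p, o in zip(p_row, o_row))
```

A candidate parameter set (w, beta, tau) that reproduces the observed time series exactly has fitness zero; nonzero entries of w are then read as inferred regulatory links, with the sign indicating activation or repression.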
Insights on fluid-rock interaction evolution during deformation from fracture network geochemistry at reservoir-scale

NASA Astrophysics Data System (ADS)

Beaudoin, Nicolas; Koehn, Daniel; Lacombe, Olivier; Bellahsen, Nicolas; Emmanuel, Laurent

2015-04-01

Fluid migration and fluid-rock interactions during deformation are challenging to picture. Numerous interplays, such as between porosity-permeability creation and clogging, or the evolution of the mechanical properties of rock, are key features when it comes to monitoring reservoir evolution or to better understanding the seismic cycle in the shallow crust. These phenomena are especially important in foreland basins, where various fluids can invade strata and efficiently react with limestones, altering their physical properties. Stable isotope (O, C, Sr) measurements and fluid inclusion microthermometry of fault cements and vein cements lead to efficient reconstruction of the origin, temperature and migration pathways of fluids (i.e. the fluid system) that precipitated during joint opening or fault activation. Such a toolbox can be used on a diffuse fracture network that testifies to the local and/or regional deformation history experienced by the rock at reservoir-scale. This contribution underlines the advantages and limits of geochemical studies of diffuse fracture networks at reservoir-scale by presenting results of fluid system reconstruction during deformation in folded structures from various thrust-belts, tectonic contexts and deformation histories. We compare reconstructions of fluid-rock interaction evolution during post-deposition, post-burial growth of basement-involved folds in the Sevier-Laramide American Rocky Mountains foreland, a reconstruction of fluid-rock interaction evolution during syn-deposition shallow detachment folding in the Southern Pyrenean foreland, and a preliminary reconstruction of fluid-rock interactions in a post-deposition, post-burial development of a detachment fold in the Apennines. Beyond regional specification for the nature of fluids, a common behavior appears during deformation as in every fold, curvature-related joints (related either to folding or to foreland flexure) connected vertically the pre-existing stratified fluid system. 
The lengthscale of the migration and the nature of invading fluids during these connections is different in every studied example, and can be related to the tectonic nature of the fold, along with the burial depth at the time of deformation. Thus, to decipher fluid-fracture relationships provides insights to better reconstruct the mechanisms of deformation at reservoir-scale.
Genome-scale model reveals metabolic basis of biomass partitioning in a model diatom

DOE PAGES

Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L.; ...

2016-05-06

Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. Furthermore, the model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.
Bio-crude transcriptomics: Gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa)*

DOE PAGES

Molnár, István; Lopez, David; Wisecaver, Jennifer H.; ...

2012-10-30

Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. The biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy.
The Applied Mathematics for Power Systems (AMPS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chertkov, Michael

2012-07-24

Increased deployment of new technologies, e.g., renewable generation and electric vehicles, is rapidly transforming electrical power networks by crossing previously distinct spatiotemporal scales and invalidating many traditional approaches for designing, analyzing, and operating power grids. This trend is expected to accelerate over the coming years, bringing the disruptive challenge of complexity, but also opportunities to deliver unprecedented efficiency and reliability. Our Applied Mathematics for Power Systems (AMPS) Center will discover, enable, and solve emerging mathematics challenges arising in power systems and, more generally, in complex engineered networks. We will develop foundational applied mathematics resulting in rigorous algorithms and simulation toolboxes for modern and future engineered networks. The AMPS Center deconstruction/reconstruction approach 'deconstructs' complex networks into sub-problems within non-separable spatiotemporal scales, a missing step in 20th century modeling of engineered networks. These sub-problems are addressed within the appropriate AMPS foundational pillar - complex systems, control theory, and optimization theory - and merged or 'reconstructed' at their boundaries into more general mathematical descriptions of complex engineered networks where important new questions are formulated and attacked. These two steps, iterated multiple times, will bridge the growing chasm between the legacy power grid and its future as a complex engineered network.
A neural network approach for image reconstruction in electron magnetic resonance tomography.

PubMed

Durairaj, D Christopher; Krishna, Murali C; Murugesan, Ramachandran

2007-10-01

An object-oriented, artificial neural network (ANN) based application system for reconstruction of two-dimensional spatial images in electron magnetic resonance (EMR) tomography is presented. The standard back propagation algorithm is utilized to train a three-layer sigmoidal feed-forward, supervised ANN to perform the image reconstruction. The network learns the relationship between the 'ideal' images that are reconstructed using the filtered back projection (FBP) technique and the corresponding projection data (sinograms). The input layer of the network is provided with a training set that contains projection data from various phantoms as well as in vivo objects, acquired from an EMR imager. Twenty-five different network configurations are investigated to test the generalization ability of the network. The trained ANN then reconstructs two-dimensional temporal spatial images that present the distribution of free radicals in biological systems. Image reconstruction by the trained neural network shows better time complexity than conventional iterative reconstruction algorithms such as the multiplicative algebraic reconstruction technique (MART). The network is further explored for image reconstruction from 'noisy' EMR data and the results show better performance than the FBP method. The network is also tested for its ability to reconstruct from limited-angle EMR data sets.
Reverse engineering of gene regulatory networks.

PubMed

Cho, K H; Choo, S M; Jung, S H; Kim, J R; Choi, H S; Kim, J

2007-05-01

Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches of inferring GRNs from gene expression profiles and biological information, including machine learning approaches, have been reviewed, with a brief introduction of DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced and discussions on the challenges and opportunities for future research are provided.
Pathgroups, a dynamic data structure for genome reconstruction problems.

PubMed

Zheng, Chunfang

2010-07-01

Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html. Contact: chunfang313@gmail.com. Supplementary data are available at Bioinformatics online.
Reconstructing targetable pathways in lung cancer by integrating diverse omics data

PubMed Central

Balbin, O. Alejandro; Prensner, John R.; Sahu, Anirban; Yocum, Anastasia; Shankar, Sunita; Malik, Rohit; Fermin, Damian; Dhanasekaran, Saravana M.; Chandler, Benjamin; Thomas, Dafydd; Beer, David G.; Cao, Xuhong; Nesvizhskii, Alexey I.; Chinnaiyan, Arul M.

2014-01-01

Global ‘multi-omics’ profiling of cancer cells harbours the potential for characterizing the signaling networks associated with specific oncogenes. Here we profile the transcriptome, proteome and phosphoproteome in a panel of non-small cell lung cancer (NSCLC) cell lines in order to reconstruct targetable networks associated with KRAS dependency. We develop a two-step bioinformatics strategy addressing the challenge of integrating these disparate data sets. We first define an ‘abundance-score’ combining transcript, protein and phospho-protein abundances to nominate differentially abundant proteins and then use the Prize Collecting Steiner Tree algorithm to identify functional sub-networks. We identify three modules centered on KRAS and MET, LCK and PAK1, and β-catenin. We validate activation of these proteins in KRAS-dependent (KRAS-Dep) cells and perform functional studies defining LCK as a critical gene for cell proliferation in KRAS-Dep but not KRAS-independent NSCLCs. These results suggest that LCK is a potential druggable target protein in KRAS-Dep lung cancers. PMID:24135919
Reconstruction method for inversion problems in an acoustic tomography based temperature distribution measurement

NASA Astrophysics Data System (ADS)

Liu, Sha; Liu, Shi; Tong, Guowei

2017-11-01

In industrial settings, temperature distribution information provides powerful data support for improving system efficiency, reducing pollutant emission, ensuring safe operation, etc. As a noninvasive measurement technology, acoustic tomography (AT) has been widely used to measure temperature distribution, where the efficiency of the reconstruction algorithm is crucial for the reliability of the measurement results. Different from traditional reconstruction techniques, in this paper a two-phase reconstruction method is proposed to improve the reconstruction accuracy (RA). In the first phase, the measurement domain is discretized by a coarse square grid to reduce the number of unknown variables and thereby mitigate the ill-posed nature of the AT inverse problem. By taking into consideration the inaccuracy of the measured time-of-flight data, a new cost function is constructed to improve the robustness of the estimation, and a grey wolf optimizer is used to minimize the proposed cost function to obtain the temperature distribution on the coarse grid. In the second phase, the Adaboost.RT based BP neural network algorithm is developed for predicting the temperature distribution on the refined grid in accordance with the temperature distribution data estimated in the first phase. Numerical simulations and experimental measurement results validate the superiority of the proposed reconstruction algorithm in improving the robustness and RA.
Data based identification and prediction of nonlinear and complex dynamical systems

NASA Astrophysics Data System (ADS)

Wang, Wen-Xu; Lai, Ying-Cheng; Grebogi, Celso

2016-07-01

The problem of reconstructing nonlinear and complex dynamical systems from measured data or time series is central to many scientific disciplines including physical, biological, computer, and social sciences, as well as engineering and economics. The classic approach to phase-space reconstruction through the methodology of delay-coordinate embedding has been practiced for more than three decades, but the paradigm is effective mostly for low-dimensional dynamical systems. Often, the methodology yields only a topological correspondence of the original system. There are situations in various fields of science and engineering where the systems of interest are complex and high dimensional with many interacting components. A complex system typically exhibits a rich variety of collective dynamics, and it is of great interest to be able to detect, classify, understand, predict, and control the dynamics using data that are becoming increasingly accessible due to the advances of modern information technology. To accomplish these goals, especially prediction and control, an accurate reconstruction of the original system is required. Nonlinear and complex systems identification aims at inferring, from data, the mathematical equations that govern the dynamical evolution and the complex interaction patterns, or topology, among the various components of the system. With successful reconstruction of the system equations and the connecting topology, it may be possible to address challenging and significant problems such as identification of causal relations among the interacting components and detection of hidden nodes. The "inverse" problem thus presents a grand challenge, requiring new paradigms beyond the traditional delay-coordinate embedding methodology. The past fifteen years have witnessed rapid development of contemporary complex graph theory with broad applications in interdisciplinary science and engineering. 
The combination of graph, information, and nonlinear dynamical systems theories with tools from statistical physics, optimization, engineering control, applied mathematics, and scientific computing enables the development of a number of paradigms to address the problem of nonlinear and complex systems reconstruction. In this Review, we describe the recent advances in this forefront and rapidly evolving field, with a focus on compressive sensing based methods. In particular, compressive sensing is a paradigm developed in recent years in applied mathematics, electrical engineering, and nonlinear physics to reconstruct sparse signals using only limited data. It has broad applications ranging from image compression/reconstruction to the analysis of large-scale sensor networks, and it has become a powerful technique to obtain high-fidelity signals for applications where sufficient observations are not available. We will describe in detail how compressive sensing can be exploited to address a diverse array of problems in data based reconstruction of nonlinear and complex networked systems. The problems include identification of chaotic systems and prediction of catastrophic bifurcations, forecasting future attractors of time-varying nonlinear systems, reconstruction of complex networks with oscillatory and evolutionary game dynamics, detection of hidden nodes, identification of chaotic elements in neuronal networks, reconstruction of complex geospatial networks and nodal positioning, and reconstruction of complex spreading networks with binary data. A number of alternative methods, such as those based on system response to external driving, synchronization, and noise-induced dynamical correlation, will also be discussed. Due to the high relevance of network reconstruction to biological sciences, a special section is devoted to a brief survey of the current methods to infer biological networks. 
Finally, a number of open problems including control and controllability of complex nonlinear dynamical networks are discussed. The methods outlined in this Review are principled on various concepts in complexity science and engineering such as phase transitions, bifurcations, stabilities, and robustness. The methodologies have the potential to significantly improve our ability to understand a variety of complex dynamical systems ranging from gene regulatory systems to social networks toward the ultimate goal of controlling such systems.
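For orientation, the compressive sensing problem referred to in this abstract is usually written as an L1-norm minimization. The symbols below (measurement vector y, measurement matrix Φ, sparse coefficient vector a) are generic notation assumed for illustration and are not taken from the Review.

```latex
\hat{a} = \arg\min_{a \in \mathbb{R}^{N}} \lVert a \rVert_{1}
\quad \text{subject to} \quad y = \Phi a ,
\qquad y \in \mathbb{R}^{M},\ \Phi \in \mathbb{R}^{M \times N},\ M \ll N .
```

In a network reconstruction setting, a would typically collect the unknown coupling coefficients of one node, most of which are zero in a sparse network, so that far fewer measurements than unknowns can suffice.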
Reconstitution of the ERG Gene Expression Network Reveals New Biomarkers and Therapeutic Targets in ERG Positive Prostate Tumors

PubMed Central

Dubovenko, Alexey; Serebryiskaya, Tatiana; Nikolsky, Yuri; Nikolskaya, Tatiana; Perlina, Ally; JeBailey, Lellean; Bureeva, Svetlana; Katta, Shilpa; Srivastava, Shiv; Dobi, Albert; Khasanova, Tatiana

2015-01-01

Background: Despite a growing number of studies evaluating cancer of prostate (CaP) specific gene alterations, oncogenic activation of the ETS Related Gene (ERG) by gene fusions remains the most validated cancer gene alteration in CaP. Prevalent gene fusions have been described between the ERG gene and promoter upstream sequences of androgen-inducible genes, predominantly TMPRSS2 (transmembrane protease serine 2). Despite the extensive evaluations of ERG genomic rearrangements, fusion transcripts and the ERG oncoprotein, the prognostic value of ERG remains to be better understood. Using a gene expression dataset from matched prostate tumor and normal epithelial cells from an 80 GeneChip experiment examining 40 tumors and their matching normal pairs in 40 patients with known ERG status, we conducted a cancer signaling-focused functional analysis of prostatic carcinoma representing moderate and aggressive cancers stratified by ERG expression. Results: In the present study of matched pairs of laser capture microdissected normal epithelial cells and well-to-moderately differentiated tumor epithelial cells with known ERG gene expression status from 20 patients with localized prostate cancer, we have discovered novel ERG associated biochemical networks. Conclusions: Using causal network reconstruction methods, we have identified three major signaling pathways related to the MAPK/PI3K cascade that may indeed contribute synergistically to ERG dependent tumor development. Moreover, the key components of these pathways have potential as biomarkers and therapeutic targets for ERG positive prostate tumors. PMID:26000039
Systems Genetic Analysis of Osteoblast-Lineage Cells

PubMed Central

Calabrese, Gina; Bennett, Brian J.; Orozco, Luz; Kang, Hyun M.; Eskin, Eleazar; Dombret, Carlos; De Backer, Olivier; Lusis, Aldons J.; Farber, Charles R.

2012-01-01

The osteoblast-lineage consists of cells at various stages of maturation that are essential for skeletal development, growth, and maintenance. Over the past decade, many of the signaling cascades that regulate this lineage have been elucidated; however, little is known of the networks that coordinate, modulate, and transmit these signals. Here, we identify a gene network specific to the osteoblast-lineage through the reconstruction of a bone co-expression network using microarray profiles collected on 96 Hybrid Mouse Diversity Panel (HMDP) inbred strains. Of the 21 modules that comprised the bone network, module 9 (M9) contained genes that were highly correlated with prototypical osteoblast marker genes and were more highly expressed in osteoblasts relative to other bone cells. In addition, the M9 contained many of the key genes that define the osteoblast-lineage, which together suggested that it was specific to this lineage. To use the M9 to identify novel osteoblast genes and highlight its biological relevance, we knocked down the expression of its two most connected “hub” genes, Maged1 and Pard6g. Their perturbation altered both osteoblast proliferation and differentiation. Furthermore, we demonstrated that mice deficient in Maged1 had decreased bone mineral density (BMD). It was also discovered that a local expression quantitative trait locus (eQTL) regulating the Wnt signaling antagonist Sfrp1 was a key driver of the M9. We also show that the M9 is associated with BMD in the HMDP and is enriched for genes implicated in the regulation of human BMD through genome-wide association studies. In conclusion, we have identified a physiologically relevant gene network and used it to discover novel genes and regulatory mechanisms involved in the function of osteoblast-lineage cells. Our results highlight the power of harnessing natural genetic variation to generate co-expression networks that can be used to gain insight into the function of specific cell-types. 
PMID:23300464
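The idea of a co-expression network and its most connected "hub" genes can be illustrated with a minimal sketch. It assumes a hard threshold on the absolute Pearson correlation between expression profiles, which is a simplification chosen for illustration and not necessarily the procedure used in the study; the gene labels and the helper names (`pearson`, `coexpression_edges`, `hub_ranking`) are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_a * var_b)

def coexpression_edges(profiles, threshold):
    """Link every gene pair whose |correlation| reaches the threshold."""
    genes = sorted(profiles)
    edges = []
    for idx, g in enumerate(genes):
        for h in genes[idx + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= threshold:
                edges.append((g, h))
    return edges

def hub_ranking(edges):
    """Genes ordered by decreasing number of network neighbours."""
    degree = {}
    for g, h in edges:
        degree[g] = degree.get(g, 0) + 1
        degree[h] = degree.get(h, 0) + 1
    return sorted(degree, key=lambda gene: (-degree[gene], gene))
```

Each profile here would hold one gene's expression across samples (for instance, across inbred strains); genes at the top of the ranking are candidates for perturbation experiments of the kind described in the abstract.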
Systems-level modeling of mycobacterial metabolism for the identification of new (multi-)drug targets.

PubMed

Rienksma, Rienk A; Suarez-Diez, Maria; Spina, Lucie; Schaap, Peter J; Martins dos Santos, Vitor A P

2014-12-01

Systems-level metabolic network reconstructions and the derived constraint-based (CB) mathematical models are efficient tools to explore bacterial metabolism. Approximately one-fourth of the Mycobacterium tuberculosis (Mtb) genome contains genes that encode proteins directly involved in its metabolism. These represent potential drug targets that can be systematically probed with CB models through the prediction of genes (or combinations thereof) essential for the pathogen to grow. However, gene essentiality depends on the growth conditions and, so far, no in vitro model precisely mimics the host at the different stages of mycobacterial infection, limiting model predictions. These limitations can be circumvented by combining expression data from in vivo samples with a validated CB model, creating an accurate description of pathogen metabolism in the host. To this end, we present here a thoroughly curated and extended genome-scale CB metabolic model of Mtb quantitatively validated using 13C measurements. We describe some of the efforts made in integrating CB models and high-throughput data to generate condition-specific models, and we discuss the challenges ahead. This knowledge and the framework presented herein will enable the identification of potential new drug targets and will foster the development of optimal therapeutic strategies. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
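As background, constraint-based models of this kind are most often analyzed with flux balance analysis, which can be stated as a linear program. The formulation below is the generic textbook form, given as an assumption about the type of model meant; the symbols S (stoichiometric matrix), v (flux vector), c (objective weights, typically selecting the biomass reaction) and the flux bounds are not taken from the abstract.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0 , \\
& v^{\min} \le v \le v^{\max} .
\end{aligned}
```

In such a model, a gene is usually predicted to be essential when forcing the fluxes of its associated reactions to zero drives the maximal growth objective to (near) zero under the chosen growth conditions, which is why the predictions depend on those conditions.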
Genome-Wide Identification of Regulatory Elements and Reconstruction of Gene Regulatory Networks of the Green Alga Chlamydomonas reinhardtii under Carbon Deprivation

PubMed Central

Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

2013-01-01

The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic conditions and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. 
Our work can serve as a basis for future functional studies of transcriptional regulator genes and genomic regulatory elements in Chlamydomonas. PMID:24224019
Autonomous Byte Stream Randomizer

NASA Technical Reports Server (NTRS)

Paloulian, George K.; Woo, Simon S.; Chow, Edward T.

2013-01-01

Net-centric networking environments are often faced with limited resources and must utilize bandwidth as efficiently as possible. In networking environments that span wide areas, the data transmission has to be efficient, without any redundant or excessive metadata. The Autonomous Byte Stream Randomizer software provides an extra level of security on top of existing data encryption methods. Randomizing the data's byte stream adds an extra layer to existing data protection methods, thus making it harder for an attacker to decrypt protected data. Based on a generated cryptographically secure random seed, a random sequence of numbers is used to intelligently and efficiently swap the organization of bytes in data using the unbiased and memory-efficient in-place Fisher-Yates shuffle method. Swapping bytes and reorganizing the crucial structure of the byte data renders the data file unreadable and leaves the data in a deconstructed state. This deconstruction adds an extra level of security, requiring the byte stream to be reconstructed with the random seed in order to be readable. Once the data byte stream has been randomized, the software enables the data to be distributed to N nodes in an environment. Each piece of the data in randomized and distributed form is a separate entity, unreadable in its own right, but when combined with all N pieces it can be reconstructed into one. Reconstruction requires possession of the key used for randomizing the bytes, leading to the generation of the same cryptographically secure random sequence of numbers used to randomize the data. This software is a cornerstone capability, able to generate the same cryptographically secure sequence on different machines and at different times, thus allowing this software to be used more heavily in net-centric environments where data transfer bandwidth is limited.
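The seeded shuffle, split and reconstruction steps described above can be sketched as follows. This is a minimal illustration and not the NASA software: Python's `random.Random` (a Mersenne Twister) stands in for the cryptographically secure generator the abstract requires and must not be used where security matters, and the function names (`shuffle_bytes`, `unshuffle_bytes`, `split_pieces`) are hypothetical.

```python
import random

def shuffle_bytes(data: bytes, seed: int) -> bytes:
    """Permute bytes with an in-place Fisher-Yates shuffle driven by `seed`."""
    buf = bytearray(data)
    rng = random.Random(seed)  # ASSUMPTION: placeholder, not a secure generator
    for i in range(len(buf) - 1, 0, -1):
        j = rng.randint(0, i)  # uniform index in [0, i] keeps the shuffle unbiased
        buf[i], buf[j] = buf[j], buf[i]
    return bytes(buf)

def unshuffle_bytes(data: bytes, seed: int) -> bytes:
    """Regenerate the same swap sequence from `seed` and undo it in reverse."""
    buf = bytearray(data)
    rng = random.Random(seed)
    swaps = [(i, rng.randint(0, i)) for i in range(len(buf) - 1, 0, -1)]
    for i, j in reversed(swaps):
        buf[i], buf[j] = buf[j], buf[i]
    return bytes(buf)

def split_pieces(data: bytes, n: int) -> list:
    """Cut the randomized stream into n contiguous pieces, one per node."""
    size = -(-len(data) // n)  # ceiling division
    return [data[k * size:(k + 1) * size] for k in range(n)]
```

A receiver that holds all N pieces and the seed would concatenate the pieces in order and call `unshuffle_bytes`. Note that this sketch stores the swap list while inverting, so only the forward shuffle is memory-efficient in the sense described above.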

lpNet: a linear programming approach to reconstruct signal transduction networks.

PubMed

Matos, Marta R A; Knapp, Bettina; Kaderali, Lars

2015-10-01

With the widespread availability of high-throughput experimental technologies it has become possible to study hundreds to thousands of cellular factors simultaneously, such as coding- or non-coding mRNA or protein concentrations. Still, extracting information about the underlying regulatory or signaling interactions from these data remains a difficult challenge. We present a flexible approach towards network inference based on linear programming. Our method reconstructs the interactions of factors from a combination of perturbation/non-perturbation and steady-state/time-series data. We show both on simulated and real data that our methods are able to reconstruct the underlying networks fast and efficiently, thus shedding new light on biological processes and, in particular, on disease mechanisms of action. We have implemented the approach as an R package available through Bioconductor. This R package is freely available under the GNU General Public License (GPL-3) from bioconductor.org (http://bioconductor.org/packages/release/bioc/html/lpNet.html) and is compatible with most operating systems (Windows, Linux, Mac OS) and hardware architectures. Contact: bettina.knapp@helmholtz-muenchen.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Reconstruction of magnetic configurations in W7-X using artificial neural networks

NASA Astrophysics Data System (ADS)

Böckenhoff, Daniel; Blatzheim, Marko; Hölbe, Hauke; Niemann, Holger; Pisano, Fabio; Labahn, Roger; Pedersen, Thomas Sunn; The W7-X Team

2018-05-01

It is demonstrated that artificial neural networks can be used to accurately and efficiently predict details of the magnetic topology at the plasma edge of the Wendelstein 7-X stellarator, based on simulated as well as measured heat load patterns onto plasma-facing components observed with infrared cameras. The connection between heat load patterns and the magnetic topology is a challenging regression problem, but one that suits artificial neural networks well. The use of a neural network makes it feasible to analyze and control the plasma exhaust in real-time, an important goal for Wendelstein 7-X, and for magnetic confinement fusion research in general.
Decoding the Regulatory Network for Blood Development from Single-Cell Gene Expression Measurements

PubMed Central

Haghverdi, Laleh; Lilly, Andrew J.; Tanaka, Yosuke; Wilkinson, Adam C.; Buettner, Florian; Macaulay, Iain C.; Jawaid, Wajid; Diamanti, Evangelia; Nishikawa, Shin-Ichi; Piterman, Nir; Kouskoff, Valerie; Theis, Fabian J.; Fisher, Jasmin; Göttgens, Berthold

2015-01-01

Here we report the use of diffusion maps and network synthesis from state transition graphs to better understand developmental pathways from single cell gene expression profiling. We map the progression of mesoderm towards blood in the mouse by single-cell expression analysis of 3,934 cells, capturing cells with blood-forming potential at four sequential developmental stages. By adapting the diffusion plot methodology for dimensionality reduction to single-cell data, we reconstruct the developmental journey to blood at single-cell resolution. Using transitions between individual cellular states as input, we develop a single-cell network synthesis toolkit to generate a computationally executable transcriptional regulatory network model that recapitulates blood development. Model predictions were validated by showing that Sox7 inhibits primitive erythropoiesis, and that Sox and Hox factors control early expression of Erg. We therefore demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the transcriptional programs that control organogenesis. PMID:25664528
A novel method to identify pathways associated with renal cell carcinoma based on a gene co-expression network

PubMed Central

RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG

2015-01-01

The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842; 371; 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
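
The Pearson's-correlation step mentioned in this record can be illustrated with a minimal sketch that lists co-expressed gene pairs from a small expression table. The gene names, the toy values and the 0.9 cutoff are assumptions made for illustration; the record does not state the thresholds used, and this sketch covers plain co-expression in one condition rather than the study's differential co-expression or rank-based merging.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pairs(expr, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expr` maps a gene name to its expression values across samples.
    The threshold is a hypothetical cutoff chosen for this example.
    """
    pairs = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Toy expression profiles (hypothetical genes, four samples each).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
edges = coexpressed_pairs(toy_expr)
```

Each returned triple corresponds to one edge of a co-expression network; in this toy table only the geneA/geneB pair passes the cutoff.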
Whole-genome sequencing of the blue whale and other rorquals finds signatures for introgressive gene flow

PubMed Central

Árnason, Úlfur; Kumar, Vikas

2018-01-01

Reconstructing the evolution of baleen whales (Mysticeti) has been problematic because morphological and genetic analyses have produced different scenarios. This might be caused by genomic admixture that may have taken place among some rorquals. We present the genomes of six whales, including the blue whale (Balaenoptera musculus), to reconstruct a species tree of baleen whales and to identify phylogenetic conflicts. Evolutionary multilocus analyses of 34,192 genome fragments reveal a fast radiation of rorquals at 10.5 to 7.5 million years ago coinciding with oceanic circulation shifts. The evolutionarily enigmatic gray whale (Eschrichtius robustus) is placed among rorquals, and the blue whale genome shows a high degree of heterozygosity. The nearly equal frequency of conflicting gene trees suggests that speciation of rorqual evolution occurred under gene flow, which is best depicted by evolutionary networks. Especially in marine environments, sympatric speciation might be common; our results raise questions about how genetic divergence can be established. PMID:29632892
Metabolic network modeling with model organisms.

PubMed

Yilmaz, L Safak; Walhout, Albertha Jm

2017-02-01

Flux balance analysis (FBA) with genome-scale metabolic network models (GSMNM) allows systems level predictions of metabolism in a variety of organisms. Different types of predictions with different accuracy levels can be made depending on the applied experimental constraints ranging from measurement of exchange fluxes to the integration of gene expression data. Metabolic network modeling with model organisms has pioneered method development in this field. In addition, model organism GSMNMs are useful for basic understanding of metabolism, and in the case of animal models, for the study of metabolic human diseases. Here, we discuss GSMNMs of most highly used model organisms with the emphasis on recent reconstructions. Published by Elsevier Ltd.
Metabolic network modeling with model organisms

PubMed Central

Yilmaz, L. Safak; Walhout, Albertha J.M.

2017-01-01

Flux balance analysis (FBA) with genome-scale metabolic network models (GSMNM) allows systems level predictions of metabolism in a variety of organisms. Different types of predictions with different accuracy levels can be made depending on the applied experimental constraints ranging from measurement of exchange fluxes to the integration of gene expression data. Metabolic network modeling with model organisms has pioneered method development in this field. In addition, model organism GSMNMs are useful for basic understanding of metabolism, and in the case of animal models, for the study of metabolic human diseases. Here, we discuss GSMNMs of most highly used model organisms with the emphasis on recent reconstructions. PMID:28088694
Unsupervised Network Analysis of the Plastic Supraoptic Nucleus Transcriptome Predicts Caprin2 Regulatory Interactions.

PubMed

Loh, Su-Yi; Jahans-Price, Thomas; Greenwood, Michael P; Greenwood, Mingkwan; Hoe, See-Ziau; Konopacka, Agnieszka; Campbell, Colin; Murphy, David; Hindmarch, Charles C T

2017-01-01

The supraoptic nucleus (SON) is a group of neurons in the hypothalamus responsible for the synthesis and secretion of the peptide hormones vasopressin and oxytocin. Following physiological cues, such as dehydration, salt-loading and lactation, the SON undergoes a function-related plasticity that we have previously described in the rat at the transcriptome level. Using the unsupervised graphical lasso (Glasso) algorithm, we reconstructed a putative network from 500 plastic SON genes in which genes are the nodes and the edges are the inferred interactions. The most active nodal gene identified within the network was Caprin2. Caprin2 encodes an RNA-binding protein that we have previously shown to be vital for the functioning of osmoregulatory neuroendocrine neurons in the SON of the rat hypothalamus. To test the validity of the Glasso network, we either overexpressed or knocked down Caprin2 transcripts in differentiated rat pheochromocytoma PC12 cells and showed that these manipulations had significant opposite effects on the levels of putative target mRNAs. These studies suggest that the predictive power of the Glasso algorithm within an in vivo system is accurate, and identifies biological targets that may be important to the functional plasticity of the SON.
Discrete dynamical system modelling for gene regulatory networks of 5-hydroxymethylfurfural tolerance for ethanologenic yeast.

PubMed

Song, M; Ouyang, Z; Liu, Z L

2009-05-01

Composed of linear difference equations, a discrete dynamical system (DDS) model was designed to reconstruct transcriptional regulations in gene regulatory networks (GRNs) for ethanologenic yeast Saccharomyces cerevisiae in response to 5-hydroxymethylfurfural (HMF), a bioethanol conversion inhibitor. The modelling aims at identification of a system of linear difference equations to represent temporal interactions among significantly expressed genes. Power stability is imposed on a system model under the normal condition in the absence of the inhibitor. Non-uniform sampling, typical in a time-course experimental design, is addressed by a log-time domain interpolation. A statistically significant DDS model of the yeast GRN derived from time-course gene expression measurements by exposure to HMF, revealed several verified transcriptional regulation events. These events implicate Yap1 and Pdr3, transcription factors consistently known for their regulatory roles by other studies or postulated by independent sequence motif analysis, suggesting their involvement in yeast tolerance and detoxification of the inhibitor.
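
A system of linear difference equations of the kind this record describes, x(t+1) = A x(t), together with interpolation on a log-time axis for non-uniformly sampled time points, can be sketched as below. The matrix, the function names and the choice of linear interpolation are assumptions for illustration; the record does not give the model's exact form or its interpolation scheme.

```python
from math import log


def step(A, x):
    """One update of the linear difference system x(t+1) = A x(t)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def simulate(A, x0, n_steps):
    """Return the trajectory [x(0), x(1), ..., x(n_steps)]."""
    trajectory = [list(x0)]
    for _ in range(n_steps):
        trajectory.append(step(A, trajectory[-1]))
    return trajectory


def interp_log_time(times, values, t):
    """Interpolate `values` at time t, linearly in log(time).

    Assumes `times` is strictly increasing and positive; linear
    interpolation is an assumption made for this sketch.
    """
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            w = (log(t) - log(t0)) / (log(t1) - log(t0))
            return v0 + w * (v1 - v0)
    raise ValueError("t lies outside the sampled time range")


# Hypothetical two-gene system: gene 2 is driven partly by gene 1.
A_example = [[0.5, 0.0],
             [0.1, 0.9]]
trajectory = simulate(A_example, [1.0, 1.0], 2)
```

Entry A[i][j] plays the role of the influence of gene j on gene i at the next time step, so a nonzero off-diagonal entry corresponds to an inferred regulatory interaction.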
JCell--a Java-based framework for inferring regulatory networks from time series data.

PubMed

Spieth, C; Supper, J; Streichert, F; Speer, N; Zell, A

2006-08-15

JCell is a Java-based application for reconstructing gene regulatory networks from experimental data. The framework provides several algorithms to identify genetic and metabolic dependencies based on experimental data conjoint with mathematical models to describe and simulate regulatory systems. Owing to the modular structure, researchers can easily implement new methods. JCell is a pure Java application with additional scripting capabilities and thus widely usable, e.g. on parallel or cluster computers. The software is freely available for download at http://www-ra.informatik.uni-tuebingen.de/software/JCell.
NETWORK ASSISTED ANALYSIS TO REVEAL THE GENETIC BASIS OF AUTISM

PubMed Central

Liu, Li; Lei, Jing; Roeder, Kathryn

2016-01-01

While studies show that autism is highly heritable, the nature of the genetic basis of this disorder remains elusive. Based on the idea that highly correlated genes are functionally interrelated and more likely to affect risk, we develop a novel statistical tool to find more potential autism risk genes by combining the genetic association scores with gene co-expression in specific brain regions and periods of development. The gene dependence network is estimated using a novel partial neighborhood selection (PNS) algorithm, where node specific properties are incorporated into network estimation for improved statistical and computational efficiency. Then we adopt a hidden Markov random field (HMRF) model to combine the estimated network and the genetic association scores in a systematic manner. The proposed modeling framework can be naturally extended to incorporate additional structural information concerning the dependence between genes. Using currently available genetic association data from whole exome sequencing studies and brain gene expression levels, the proposed algorithm successfully identified 333 genes that plausibly affect autism risk. PMID:27134692
Integration of heterogeneous molecular networks to unravel gene-regulation in Mycobacterium tuberculosis.

PubMed

van Dam, Jesse C J; Schaap, Peter J; Martins dos Santos, Vitor A P; Suárez-Diez, María

2014-09-26

Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks and efforts have been devoted to automatically integrate them into consensus sets. However, each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework where the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network. We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on providing the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms, heterogeneous datasets, and a suite of analysis tools. As a showcase, we have analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system level analysis allows the development of custom made methodologies. As showcases we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers.
Our workflow is based on representing the multiple types of information as network representations and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping specific associations from the different networks. By simultaneously exploring these networks and metadata, we gained insights into regulatory mechanisms in M. tuberculosis that could not be obtained through the separate analysis of each data type.
Reveal genes functionally associated with ACADS by a network study.

PubMed

Chen, Yulong; Su, Zhiguang

2015-09-15

Establishing a systematic network is aimed at finding essential human gene-gene/gene-disease pathways by means of network inter-connecting patterns and functional annotation analysis. In the present study, we have analyzed functional gene interactions of the short-chain acyl-coenzyme A dehydrogenase gene (ACADS). ACADS plays a vital role in free fatty acid β-oxidation and regulates energy homeostasis. Modules of highly inter-connected genes in the disease-specific ACADS network are derived by integrating gene function and protein interaction data. Among the 8 genes in the ACADS web retrieved from both STRING and GeneMANIA, ACADS is effectively conjoined with 4 genes including HADHA, HADHB, ECHS1 and ACAT1. The functional analysis is done via ontological briefing and candidate disease identification. We observed that the most highly interlinked genes connected with ACADS are HADHA, HADHB, ECHS1 and ACAT1. Interestingly, the ontological aspect of genes in the ACADS network reveals that ACADS, HADHA and HADHB play equally vital roles in fatty acid metabolism. The gene ACAT1 together with ACADS participates in ketone metabolism. Our computational gene web analysis also predicts potential candidate disease recognition, thus indicating the involvement of ACADS, HADHA, HADHB, ECHS1 and ACAT1 not only with lipid metabolism but also with infant death syndrome, skeletal myopathy, acute hepatic encephalopathy, Reye-like syndrome, episodic ketosis, and metabolic acidosis. The current study presents a comprehensible layout of the ACADS network, its functional strategies and the candidate disease approach associated with the ACADS network. Copyright © 2015 Elsevier B.V. All rights reserved.
Simultaneous grouping pursuit and feature selection over an undirected graph

PubMed Central

Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei

2013-01-01

Summary In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients can be grouped, whose absolute values are the same or close. This is motivated from gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into analysis through an undirected graph. PMID:24098061
Self-organizing adaptive map: autonomous learning of curves and surfaces from point samples.

PubMed

Piastra, Marco

2013-05-01

Competitive Hebbian Learning (CHL) (Martinetz, 1993) is a simple and elegant method for estimating the topology of a manifold from point samples. The method has been adopted in a number of self-organizing networks described in the literature and has given rise to related studies in the fields of geometry and computational topology. Recent results from these fields have shown that a faithful reconstruction can be obtained using the CHL method only for curves and surfaces. Within these limitations, these findings constitute a basis for defining a CHL-based, growing self-organizing network that produces a faithful reconstruction of an input manifold. The SOAM (Self-Organizing Adaptive Map) algorithm adapts its local structure autonomously in such a way that it can match the features of the manifold being learned. The adaptation process is driven by the defects arising when the network structure is inadequate, which cause a growth in the density of units. Regions of the network undergo a phase transition and change their behavior whenever a simple, local condition of topological regularity is met. The phase transition is eventually completed across the entire structure and the adaptation process terminates. In specific conditions, the structure thus obtained is homeomorphic to the input manifold. During the adaptation process, the network also has the capability to focus on the acquisition of input point samples in critical regions, with a substantial increase in efficiency. The behavior of the network has been assessed experimentally with typical data sets for surface reconstruction, including suboptimal conditions, e.g. with undersampling and noise. Copyright © 2012 Elsevier Ltd. All rights reserved.
Fast Construction of Near Parsimonious Hybridization Networks for Multiple Phylogenetic Trees.

PubMed

Mirzaei, Sajad; Wu, Yufeng

2016-01-01

Hybridization networks represent plausible evolutionary histories of species that are affected by reticulate evolutionary processes. An established computational problem on hybridization networks is constructing the most parsimonious hybridization network such that each of the given phylogenetic trees (called gene trees) is "displayed" in the network. There have been several previous approaches, including an exact method and several heuristics, for this NP-hard problem. However, the exact method is only applicable to a limited range of data, and heuristic methods can be less accurate and also slow sometimes. In this paper, we develop a new algorithm for constructing near parsimonious networks for multiple binary gene trees. This method is more efficient for large numbers of gene trees than previous heuristics. This new method also produces more parsimonious results on many simulated datasets as well as a real biological dataset than a previous method. We also show that our method produces topologically more accurate networks for many datasets.
Database constraints applied to metabolic pathway reconstruction tools.

PubMed

Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

2014-01-01

Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes.
Modular and hierarchical structure of social contact networks

NASA Astrophysics Data System (ADS)

Ge, Yuanzheng; Song, Zhichao; Qiu, Xiaogang; Song, Hongbin; Wang, Yong

2013-10-01

Social contact networks exhibit overlapping qualities of communities, hierarchical structure and spatial-correlated nature. We propose a mixing pattern of modular and growing hierarchical structures to reconstruct social contact networks by using an individual’s geospatial distribution information in the real world. The hierarchical structure of social contact networks is defined based on the spatial distance between individuals, and edges among individuals are added in turn from the modular layer to the highest layer. It is a gradual process to construct the hierarchical structure: from the basic modular model up to the global network. The proposed model not only shows hierarchically increasing degree distribution and large clustering coefficients in communities, but also exhibits spatial clustering features of individual distributions. As an evaluation of the method, we reconstruct a hierarchical contact network based on the investigation data of a university. Transmission experiments of influenza H1N1 are carried out on the generated social contact networks, and results show that the constructed network can efficiently reproduce the dynamic process of an outbreak and evaluate interventions. The reproduced spread process shows that the spatial clustering of infection is consistent with the clustering of network topology. Moreover, the effect of individual topological character on the spread of influenza is analyzed, and the experiment results indicate that the spread is limited by individual daily contact patterns and local clustering topology rather than individual degree.
Three-Dimensional Terahertz Coded-Aperture Imaging Based on Matched Filtering and Convolutional Neural Network.

PubMed

Chen, Shuo; Luo, Chenggao; Wang, Hongqiang; Deng, Bin; Cheng, Yongqiang; Zhuang, Zhaowen

2018-04-26

As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. However, there are still two problems in three-dimensional (3D) TCAI. Firstly, the large-scale reference-signal matrix based on meshing the 3D imaging area creates a heavy computational burden, thus leading to unsatisfactory efficiency. Secondly, it is difficult to resolve the target under low signal-to-noise ratio (SNR). In this paper, we propose a 3D imaging method based on matched filtering (MF) and convolutional neural network (CNN), which can reduce the computational burden and achieve high-resolution imaging for low SNR targets. In terms of the frequency-hopping (FH) signal, the original echo is processed with MF. By extracting the processed echo in different spike pulses separately, targets in different imaging planes are reconstructed simultaneously to decompose the global computational complexity, and then are synthesized together to reconstruct the 3D target. Based on the conventional TCAI model, we deduce and build a new TCAI model based on MF. Furthermore, the convolutional neural network (CNN) is designed to teach the MF-TCAI how to reconstruct the low SNR target better. The experimental results demonstrate that the MF-TCAI achieves impressive performance on imaging ability and efficiency under low SNR. Moreover, the MF-TCAI has learned to better resolve the low-SNR 3D target with the help of CNN. In summary, the proposed 3D TCAI can achieve: (1) low-SNR high-resolution imaging by using MF; (2) efficient 3D imaging by downsizing the large-scale reference-signal matrix; and (3) intelligent imaging with CNN. Therefore, the TCAI based on MF and CNN has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
Reconstruction of the genome-scale co-expression network for the Hippo signaling pathway in colorectal cancer.

PubMed

Dehghanian, Fariba; Hojati, Zohreh; Hosseinkhan, Nazanin; Mousavian, Zaynab; Masoudi-Nejad, Ali

2018-05-26

The Hippo signaling pathway (HSP) has been identified as an essential and complex signaling pathway for tumor suppression that coordinates proliferation, differentiation, cell death, cell growth and stemness. In the present study, we conducted a genome-scale co-expression analysis to reconstruct the HSP in colorectal cancer (CRC). Five key modules were detected through network clustering, and a detailed discussion of two modules containing respectively 18 and 13 over and down-regulated members of HSP was provided. Our results suggest new potential regulatory factors in the HSP. The detected modules also suggest novel genes contributing to CRC. Moreover, differential expression analysis confirmed the differential expression pattern of HSP members and new suggested regulatory factors between tumor and normal samples. These findings can further reveal the importance of HSP in CRC. Copyright © 2018 Elsevier Ltd. All rights reserved.

Reconstruction of networks from one-step data by matching positions

NASA Astrophysics Data System (ADS)

Wu, Jianshe; Dang, Ni; Jiao, Yang

2018-05-01

Estimating the topology of a network from short time series data is a challenge. In this paper, a matching-positions method is developed to reconstruct the topology of a network from only one-step data. We consider a general network model of coupled agents, in which the phase transformation of each node is determined by its neighbors. From the phase transformation information from one step to the next, the connections of the tail vertices are reconstructed first by matching positions. By removing the already reconstructed vertices and repeatedly reconstructing the connections of tail vertices, the topology of the entire network is reconstructed. For sparse scale-free networks with more than ten thousand nodes, we almost obtain the actual topology using only the one-step data in simulations.
Bio-crude transcriptomics: gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa).

PubMed

Molnár, István; Lopez, David; Wisecaver, Jennifer H; Devarenne, Timothy P; Weiss, Taylor L; Pellegrini, Matteo; Hackett, Jeremiah D

2012-10-30

Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy. A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated. The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome sequence for the Showa strain of B. braunii, race B.
Further, the transcriptome database empowers future biosynthetic engineering approaches for strain improvement and the transfer of desirable traits to heterologous hosts.
Bio-crude transcriptomics: Gene discovery and metabolic network reconstruction for the biosynthesis of the terpenome of the hydrocarbon oil-producing green alga, Botryococcus braunii race B (Showa)*

PubMed Central

2012-01-01

Background Microalgae hold promise for yielding a biofuel feedstock that is sustainable, carbon-neutral, distributed, and only minimally disruptive for the production of food and feed by traditional agriculture. Amongst oleaginous eukaryotic algae, the B race of Botryococcus braunii is unique in that it produces large amounts of liquid hydrocarbons of terpenoid origin. These are comparable to fossil crude oil, and are sequestered outside the cells in a communal extracellular polymeric matrix material. Biosynthetic engineering of terpenoid bio-crude production requires identification of genes and reconstruction of metabolic pathways responsible for production of both hydrocarbons and other metabolites of the alga that compete for photosynthetic carbon and energy. Results A de novo assembly of 1,334,609 next-generation pyrosequencing reads from the Showa strain of the B race of B. braunii yielded a transcriptomic database of 46,422 contigs with an average length of 756 bp. Contigs were annotated with pathway, ontology, and protein domain identifiers. Manual curation allowed the reconstruction of pathways that produce terpenoid liquid hydrocarbons from primary metabolites, and pathways that divert photosynthetic carbon into tetraterpenoid carotenoids, diterpenoids, and the prenyl chains of meroterpenoid quinones and chlorophyll. Inventories of machine-assembled contigs are also presented for reconstructed pathways for the biosynthesis of competing storage compounds including triacylglycerol and starch. Regeneration of S-adenosylmethionine, and the extracellular localization of the hydrocarbon oils by active transport and possibly autophagy are also investigated. 
Conclusions The construction of an annotated transcriptomic database, publicly available in a web-based data depository and annotation tool, provides a foundation for metabolic pathway and network reconstruction, and facilitates further omics studies in the absence of a genome sequence for the Showa strain of B. braunii, race B. Further, the transcriptome database empowers future biosynthetic engineering approaches for strain improvement and the transfer of desirable traits to heterologous hosts. PMID:23110428
Efficient Usage of Dense GNSS Networks in Central Europe for the Visualization and Investigation of Ionospheric TEC Variations

PubMed Central

Zanimonskiy, Yevgen M.; Yampolski, Yuri M.; Figurski, Mariusz

2017-01-01

The technique of the orthogonal projection of ionosphere electronic content variations for mapping total electron content (TEC) allows us to visualize ionospheric irregularities. For the reconstruction of global ionospheric characteristics, numerous global navigation satellite system (GNSS) receivers located in different regions of the Earth are used as sensors. We used dense GNSS networks in central Europe to detect and investigate a special type of plasma inhomogeneities, called travelling ionospheric disturbances (TID). Such use of GNSS sensors allows us to reconstruct the main TID parameters, such as spatial dimensions, velocities, and directions of their movement. The paper gives examples of the restoration of dynamic characteristics of ionospheric irregularities for quiet and disturbed geophysical conditions. Special attention is paid to the dynamics of ionospheric disturbances stimulated by the magnetic storms of two St. Patrick’s Days (17 March 2013 and 2015). Additional opportunities for the remote sensing of the ionosphere with the use of dense regional networks of GNSS receiving sensors have been noted too. PMID:28994718
Efficient Usage of Dense GNSS Networks in Central Europe for the Visualization and Investigation of Ionospheric TEC Variations.

PubMed

Nykiel, Grzegorz; Zanimonskiy, Yevgen M; Yampolski, Yuri M; Figurski, Mariusz

2017-10-10

The technique of the orthogonal projection of ionosphere electronic content variations for mapping total electron content (TEC) allows us to visualize ionospheric irregularities. For the reconstruction of global ionospheric characteristics, numerous global navigation satellite system (GNSS) receivers located in different regions of the Earth are used as sensors. We used dense GNSS networks in central Europe to detect and investigate a special type of plasma inhomogeneities, called travelling ionospheric disturbances (TID). Such use of GNSS sensors allows us to reconstruct the main TID parameters, such as spatial dimensions, velocities, and directions of their movement. The paper gives examples of the restoration of dynamic characteristics of ionospheric irregularities for quiet and disturbed geophysical conditions. Special attention is paid to the dynamics of ionospheric disturbances stimulated by the magnetic storms of two St. Patrick's Days (17 March 2013 and 2015). Additional opportunities for the remote sensing of the ionosphere with the use of dense regional networks of GNSS receiving sensors have been noted too.
Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways.

PubMed

Azimzadeh Jamalkandi, Sadegh; Mozhgani, Sayed-Hamidreza; Gholami Pourbadie, Hamid; Mirzaie, Mehdi; Noorbakhsh, Farshid; Vaziri, Behrouz; Gholami, Alireza; Ansari-Pour, Naser; Jafari, Mohieddin

2016-01-01

The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus and many precautionary methods have been implemented to avert the disease outbreak over the last century, the disease surprisingly has no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival rate from rabies encephalitis, still remain a mystery. We, therefore, undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in the signal propagation system were obtained by means of unifying mRNA and microRNA detected integrated DEGs. We then reconstructed a refined protein-protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using Real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways including interferon circumvent, toward proliferation and survival, and neuropathological clue, explaining the intricate underlying molecular neuropathology of rabies infection, and thus renders a molecular framework for predicting potential drug targets.
Inference of the oxidative stress network in Anopheles stephensi upon Plasmodium infection.

PubMed

Shrinet, Jatin; Nandal, Umesh Kumar; Adak, Tridibes; Bhatnagar, Raj K; Sunil, Sujatha

2014-01-01

Ookinete invasion of Anopheles midgut is a critical step for malaria transmission; the parasite numbers drop drastically and practically reach a minimum during the parasite's whole life cycle. At this stage, the parasite as well as the vector undergoes immense oxidative stress. Thereafter, the vector undergoes oxidative stress at different time points as the parasite invades its tissues during the parasite development. The present study was undertaken to reconstruct the network of differentially expressed genes involved in oxidative stress in Anopheles stephensi during Plasmodium development and maturation in the midgut. Using high throughput next generation sequencing methods, we generated the transcriptome of the An. stephensi midgut during Plasmodium vinckei petteri oocyst invasion of the midgut epithelium. Further, we utilized large datasets available in the public domain on Anopheles during Plasmodium ookinete invasion and Drosophila datasets and arrived at clusters of genes that may play a role in oxidative stress. Finally, we used support vector machines for the functional prediction of the un-annotated genes of An. stephensi. Integrating the results from all the different data analyses, we identified a total of 516 genes that were involved in oxidative stress in An. stephensi during Plasmodium development. The significantly regulated genes were further extracted from this gene cluster and used to infer an oxidative stress network of An. stephensi. Using systems biology approaches, we have been able to ascertain the role of several putative genes in An. stephensi with respect to oxidative stress. Further experimental validations of these genes are underway.
Comparative thoracic anatomy of the wild type and wingless (wg1cn1) mutant of Drosophila melanogaster (Diptera).

PubMed

Fabian, Benjamin; Schneeberg, Katharina; Beutel, Rolf Georg

2016-11-01

Genetically modified organisms are crucial for our understanding of gene regulatory networks, physiological processes and ontogeny. With modern molecular genetic techniques allowing the rapid generation of different Drosophila melanogaster mutants, efficient in-depth morphological investigations become an important issue. Anatomical studies can elucidate the role of certain genes in developmental processes and point out which parts of gene regulatory networks are involved in evolutionary changes of morphological structures. The wingless mutation wg 1 of D. melanogaster was discovered more than 40 years ago. While early studies addressed the external phenotype of these mutants, the documentation of the internal organization was largely restricted to the prominent indirect flight muscles. We used SEM micrographs, histological serial sections, μ-computed tomography, CLSM and 3D reconstructions to study and document the thoracic skeletomuscular system of the wild type and mutant. A recently introduced nomenclature for the musculature of neopteran insects was applied to facilitate comparisons with closely or more distantly related taxa. The mutation is phenotypically mainly characterized by the absence of one or both wings and halteres. The wing is partly or entirely replaced by duplications of mesonotal structures, whereas the haltere and its associated muscles are completely absent on body sides showing the reduction. Both the direct and indirect mesothoracic flight muscles are affected by loss and reorientation of bundles or fibers. Our observations lead to the conclusion that the wingless mutation causes a homeotic transformation in the imaginal discs of wings and halteres with a direct effect on the development of skeletal structures and an indirect effect on the associated muscular system. Copyright © 2016 Elsevier Ltd. All rights reserved.
Next-generation sequencing of mixed genomic DNA allows efficient assembly of rearranged mitochondrial genomes in Amolops chunganensis and Quasipaa boulengeri

PubMed Central

Yuan, Siqi; Zheng, Yuchi; Zeng, Xiaomao

2016-01-01

Recent improvements in next-generation sequencing (NGS) technologies can facilitate the obtainment of mitochondrial genomes. However, it is not clear whether NGS could be effectively used to reconstruct the mitogenome with high gene rearrangement. These high rearrangements would cause amplification failure, and/or assembly and alignment errors. Here, we choose two frogs with rearranged gene order, Amolops chunganensis and Quasipaa boulengeri, to test whether gene rearrangements affect the mitogenome assembly and alignment by using NGS. The mitogenomes with gene rearrangements are sequenced through Illumina MiSeq genomic sequencing and assembled effectively by Trinity v2.1.0 and SOAPdenovo2. Gene order and contents in the mitogenome of A. chunganensis and Q. boulengeri follow the typical neobatrachian pattern except for rearrangements at the position of the “WANCY” tRNA gene cluster. Further, the mitogenome of Q. boulengeri is characterized by a tandem duplication of trnM. Moreover, we utilize 13 protein-coding genes of A. chunganensis, Q. boulengeri and other neobatrachians to reconstruct the phylogenetic tree for evaluating mitochondrial sequence authenticity of A. chunganensis and Q. boulengeri. In this work, we provide nearly complete mitochondrial genomes of A. chunganensis and Q. boulengeri. PMID:27994980
Reconstruction of the experimentally supported human protein interactome: what can we learn?

PubMed

Klapa, Maria I; Tsafou, Kalliopi; Theodoridis, Evangelos; Tsakalidis, Athanasios; Moschonas, Nicholas K

2013-10-02

Understanding the topology and dynamics of the human protein-protein interaction (PPI) network will significantly contribute to biomedical research, therefore its systematic reconstruction is required. Several meta-databases integrate source PPI datasets, but the protein node sets of their networks vary depending on the PPI data combined. Due to this inherent heterogeneity, the way in which the human PPI network expands via multiple dataset integration has not been comprehensively analyzed. We aim at assembling the human interactome in a global structured way and exploring it to gain insights of biological relevance. First, we defined the UniProtKB manually reviewed human "complete" proteome as the reference protein-node set and then we mined five major source PPI datasets for direct PPIs exclusively between the reference proteins. We updated the protein and publication identifiers and normalized all PPIs to the UniProt identifier level. The reconstructed interactome covers approximately 60% of the human proteome and has a scale-free structure. No apparent differentiating gene functional classification characteristics were identified for the unrepresented proteins. The source dataset integration augments the network mainly in PPIs. Polyubiquitin emerged as the highest-degree node, but the inclusion of most of its identified PPIs may be reconsidered. The high number (>300) of connections of the subsequent fifteen proteins correlates well with their essential biological role. According to the power-law network structure, the unrepresented proteins should mainly have up to four connections with equally poorly-connected interactors. Reconstructing the human interactome based on the a priori definition of the protein nodes enabled us to identify the currently included part of the human "complete" proteome, and discuss the role of the proteins within the network topology with respect to their function. 
As the network expansion has to comply with the scale-free theory, we suggest that the core of the human interactome has essentially emerged. Thus, it could be employed in systems biology and biomedical research, despite the considerable number of currently unrepresented proteins. The latter are probably involved in specialized physiological conditions, justifying the scarcity of related PPI information, and their identification can assist in designing relevant functional experiments and targeted text mining algorithms.
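The degree-based statements in this abstract (a highest-degree node, hubs with more than 300 connections, poorly connected proteins) rest on counting distinct interaction partners per protein. The following is a minimal sketch of that counting step only, assuming an undirected edge list; the function name and the toy identifiers are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree k to the number of nodes with exactly k distinct partners."""
    partners = {}
    for a, b in edges:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)
    return Counter(len(p) for p in partners.values())

# Hypothetical toy interactions; identifiers are placeholders, not real PPIs.
# ("P2", "HUB") repeats ("HUB", "P2") in reverse and is counted once.
edges = [("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3"),
         ("HUB", "P4"), ("P1", "P2"), ("P2", "HUB")]
dist = degree_distribution(edges)
for k in sorted(dist):
    print(k, dist[k])
```

In a scale-free network the node counts fall off roughly as a power of k, so most proteins would land in the lowest-degree bins of such a table.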
Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

PubMed Central

Li, Xia; Rao, Shaoqi; Jiang, Wei; Li, Chuanxing; Xiao, Yun; Guo, Zheng; Zhang, Qingpu; Wang, Lihong; Du, Lei; Li, Jing; Li, Li; Zhang, Tianwen; Wang, Qing K

2006-01-01

Background It is one of the ultimate goals for modern biological research to fully elucidate the intricate interplays and the regulations of the molecular determinants that propel and characterize the progression of versatile life phenomena, to name a few, cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. The vast amount of large-scale and genome-wide time-resolved data is becoming increasingly available, which provides the golden opportunity to unravel the challenging reverse-engineering problem of time-delayed gene regulatory networks. Results In particular, this methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have thus developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any unit(s) of time intervals. This bioinformatics toolbox has provided a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with the current knowledge on phase characteristics for the cell cyclings. Conclusion We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. 
In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experiment units might occur. PMID:16420705
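The idea of a delayed correlation between two expression profiles can be illustrated with a small sketch. It assumes plain Pearson correlation evaluated at each candidate shift and picks the shift with the largest absolute value; the function names and the toy profiles are hypothetical, and this is not the TdGRN toolbox or its decision-analysis step.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def best_lag(regulator, target, max_lag):
    """Return (lag, r) with the largest |r| between regulator[t] and target[t + lag]."""
    best = (0, pearson(regulator, target))
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` regulator points and the first `lag` target points
        # so that the two slices line up with the requested delay.
        r = pearson(regulator[:-lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Made-up profiles: the target repeats the regulator two time points later.
regulator = [1, 3, 2, 5, 4, 6, 2, 1]
target = [0, 0, 1, 3, 2, 5, 4, 6]
lag, r = best_lag(regulator, target, max_lag=3)
print(lag, round(r, 3))
```

Each extra unit of lag shortens the overlapping window by one time point, so on short time series the maximum lag should stay small relative to the series length.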
RegNetwork: an integrated database of transcriptional and post-transcriptional regulatory networks in human and mouse

PubMed Central

Liu, Zhi-Ping; Wu, Canglin; Miao, Hongyu; Wu, Hulin

2015-01-01

Transcriptional and post-transcriptional regulation of gene expression is of fundamental importance to numerous biological processes. Nowadays, an increasing number of gene regulatory relationships have been documented in various databases and literature. However, to more efficiently exploit such knowledge for biomedical research and applications, it is necessary to construct a genome-wide regulatory network database to integrate the information on gene regulatory relationships that are widely scattered in many different places. Therefore, in this work, we build a knowledge-based database, named ‘RegNetwork’, of gene regulatory networks for human and mouse by collecting and integrating the documented regulatory interactions among transcription factors (TFs), microRNAs (miRNAs) and target genes from 25 selected databases. Moreover, we also inferred and incorporated potential regulatory relationships based on transcription factor binding site (TFBS) motifs into RegNetwork. As a result, RegNetwork contains a comprehensive set of experimentally observed or predicted transcriptional and post-transcriptional regulatory relationships, and the database framework is flexibly designed for potential extensions to include gene regulatory networks for other organisms in the future. Based on RegNetwork, we characterized the statistical and topological properties of genome-wide regulatory networks for human and mouse; we also extracted and interpreted simple yet important network motifs that involve the interplays between TF-miRNA and their targets. In summary, RegNetwork provides an integrated resource on the prior information for gene regulatory relationships, and it enables us to further investigate context-specific transcriptional and post-transcriptional regulatory interactions based on domain-specific experimental data. Database URL: http://www.regnetworkweb.org PMID:26424082
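One simple motif involving a TF, a miRNA and a shared target is the feed-forward loop, in which the TF regulates both the miRNA and the target while the miRNA also regulates the target. Whether this is among the motifs the authors extracted is an assumption here. The sketch below searches a directed edge list for such triples; the function name and all node identifiers are hypothetical, and it does not use the RegNetwork database or its file formats.

```python
def feed_forward_loops(edges, tfs, mirnas):
    """Return sorted (tf, mirna, target) triples where tf->mirna, tf->target
    and mirna->target are all present in the directed edge list."""
    out = {}
    for src, dst in set(edges):
        out.setdefault(src, set()).add(dst)
    loops = []
    for tf in tfs:
        tf_targets = out.get(tf, set())
        for mirna in tf_targets & mirnas:
            # Shared targets of the TF and of the miRNA it regulates.
            for target in out.get(mirna, set()) & tf_targets:
                if target != tf and target != mirna:
                    loops.append((tf, mirna, target))
    return sorted(loops)

# Hypothetical toy network; the names are placeholders, not database records.
edges = [("TF_A", "miR_x"), ("TF_A", "GENE_1"), ("miR_x", "GENE_1"),
         ("miR_x", "GENE_2"), ("TF_B", "GENE_2")]
loops = feed_forward_loops(edges, tfs={"TF_A", "TF_B"}, mirnas={"miR_x"})
print(loops)
```

Here GENE_2 is not reported because TF_B does not regulate miR_x, so only two of the three required edges are present.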
Embracing the comparative approach: how robust phylogenies and broader developmental sampling impacts the understanding of nervous system evolution.

PubMed

Hejnol, Andreas; Lowe, Christopher J

2015-12-19

Molecular biology has provided a rich dataset to develop hypotheses of nervous system evolution. The startling patterning similarities between distantly related animals during the development of their central nervous system (CNS) have resulted in the hypothesis that a CNS with a single centralized medullary cord and a partitioned brain is homologous across bilaterians. However, the ability to precisely reconstruct ancestral neural architectures from molecular genetic information requires that these gene networks specifically map with particular neural anatomies. A growing body of literature representing the development of a wider range of metazoan neural architectures demonstrates that patterning gene network complexity is maintained in animals with more modest levels of neural complexity. Furthermore, a robust phylogenetic framework that provides the basis for testing the congruence of these homology hypotheses has been lacking since the advent of the field of 'evo-devo'. Recent progress in molecular phylogenetics is refining the necessary framework to test previous homology statements that span large evolutionary distances. In this review, we describe recent advances in animal phylogeny and exemplify for two neural characters-the partitioned brain of arthropods and the ventral centralized nerve cords of annelids-a test for congruence using this framework. The sequential sister taxa at the base of Ecdysozoa and Spiralia comprise small, interstitial groups. This topology is not consistent with the hypothesis of homology of tripartitioned brain of arthropods and vertebrates as well as the ventral arthropod and rope-like ladder nervous system of annelids. There can be exquisite conservation of gene regulatory networks between distantly related groups with contrasting levels of nervous system centralization and complexity. Consequently, the utility of molecular characters to reconstruct ancestral neural organization in deep time is limited. © 2015 The Authors.
Embracing the comparative approach: how robust phylogenies and broader developmental sampling impacts the understanding of nervous system evolution

PubMed Central

Hejnol, Andreas; Lowe, Christopher J.

2015-01-01

Molecular biology has provided a rich dataset to develop hypotheses of nervous system evolution. The startling patterning similarities between distantly related animals during the development of their central nervous system (CNS) have resulted in the hypothesis that a CNS with a single centralized medullary cord and a partitioned brain is homologous across bilaterians. However, the ability to precisely reconstruct ancestral neural architectures from molecular genetic information requires that these gene networks specifically map with particular neural anatomies. A growing body of literature representing the development of a wider range of metazoan neural architectures demonstrates that patterning gene network complexity is maintained in animals with more modest levels of neural complexity. Furthermore, a robust phylogenetic framework that provides the basis for testing the congruence of these homology hypotheses has been lacking since the advent of the field of ‘evo-devo’. Recent progress in molecular phylogenetics is refining the necessary framework to test previous homology statements that span large evolutionary distances. In this review, we describe recent advances in animal phylogeny and exemplify for two neural characters—the partitioned brain of arthropods and the ventral centralized nerve cords of annelids—a test for congruence using this framework. The sequential sister taxa at the base of Ecdysozoa and Spiralia comprise small, interstitial groups. This topology is not consistent with the hypothesis of homology of tripartitioned brain of arthropods and vertebrates as well as the ventral arthropod and rope-like ladder nervous system of annelids. There can be exquisite conservation of gene regulatory networks between distantly related groups with contrasting levels of nervous system centralization and complexity. Consequently, the utility of molecular characters to reconstruct ancestral neural organization in deep time is limited. PMID:26554039
Developmental Progression in the Coral Acropora digitifera Is Controlled by Differential Expression of Distinct Regulatory Gene Networks

PubMed Central

Reyes-Bermudez, Alejandro; Villar-Briones, Alejandro; Ramirez-Portilla, Catalina; Hidaka, Michio; Mikheyev, Alexander S.

2016-01-01

Corals belong to the most basal class of the Phylum Cnidaria, which is considered the sister group of bilaterian animals, and thus have become an emerging model to study the evolution of developmental mechanisms. Although cell renewal, differentiation, and maintenance of pluripotency are cellular events shared by multicellular animals, the cellular basis of these fundamental biological processes is still poorly understood. To understand how changes in gene expression regulate morphogenetic transitions at the base of the eumetazoa, we performed quantitative RNA-seq analysis during Acropora digitifera’s development. We collected embryonic, larval, and adult samples to characterize stage-specific transcription profiles, as well as broad expression patterns. Transcription profiles reconstructed development revealing two main expression clusters. The first cluster grouped blastula and gastrula and the second grouped subsequent developmental time points. Consistently, we observed clear differences in gene expression between early and late developmental transitions, with higher numbers of differentially expressed genes and fold changes around gastrulation. Furthermore, we identified three coexpression clusters that represented discrete gene expression patterns. During early transitions, transcriptional networks seemed to regulate cellular fate and morphogenesis of the larval body. In late transitions, these networks seemed to play important roles preparing planulae for the switch in lifestyle and regulation of adult processes. Although developmental progression in A. digitifera is regulated to some extent by differential coexpression of well-defined gene networks, stage-specific transcription profiles appear to be independent entities. While negative regulation of transcription is predominant in early development, cell differentiation was upregulated in larval and adult stages. PMID:26941230
The SF3M approach to 3-D photo-reconstruction for non-expert users: application to a gully network

NASA Astrophysics Data System (ADS)

Castillo, C.; James, M. R.; Redel-Macías, M. D.; Pérez, R.; Gómez, J. A.

2015-04-01

3-D photo-reconstruction (PR) techniques have been successfully used to produce high resolution elevation models for different applications and over different spatial scales. However, innovative approaches are required to overcome some limitations that this technique may present in challenging scenarios. Here, we evaluate SF3M, a new graphical user interface for implementing a complete PR workflow based on freely available software (including external calls to VisualSFM and CloudCompare), in combination with a low-cost survey design for the reconstruction of a several-hundred-meters-long gully network. SF3M provided a semi-automated workflow for 3-D reconstruction requiring ~49 h (of which only 17% required operator assistance) for obtaining a final gully network model of >17 million points over a gully plan area of 4230 m². We show that a walking itinerary along the gully perimeter using two light-weight automatic cameras (1 s time-lapse mode) and a 6 m-long pole is an efficient method for 3-D monitoring of gullies, at a low cost (about EUR 1000 budget for the field equipment) and time requirements (~90 min for image collection). A mean error of 6.9 cm at the ground control points was found, mainly due to model deformations derived from the linear geometry of the gully and residual errors in camera calibration. The straightforward image collection and processing approach can be of great benefit for non-expert users working on gully erosion assessment.
Robust synthetic biology design: stochastic game theory approach.

PubMed

Chen, Bor-Sen; Chang, Chia-Hung; Lee, Hsiao-Ching

2009-07-15

Synthetic biology aims to engineer artificial biological systems to investigate natural biological phenomena and for a variety of applications. However, the development of synthetic gene networks is still difficult and most newly created gene networks are non-functioning due to uncertain initial conditions and disturbances of extra-cellular environments on the host cell. At present, how to design a robust synthetic gene network to work properly under these uncertain factors is the most important topic of synthetic biology. A robust regulation design is proposed for a stochastic synthetic gene network to achieve the prescribed steady states under these uncertain factors from the minimax regulation perspective. This minimax regulation design problem can be transformed to an equivalent stochastic game problem. Since it is not easy to solve the robust regulation design problem of synthetic gene networks by non-linear stochastic game method directly, the Takagi-Sugeno (T-S) fuzzy model is proposed to approximate the non-linear synthetic gene network via the linear matrix inequality (LMI) technique through the Robust Control Toolbox in Matlab. Finally, an in silico example is given to illustrate the design procedure and to confirm the efficiency and efficacy of the proposed robust gene design method. http://www.ee.nthu.edu.tw/bschen/SyntheticBioDesign_supplement.pdf.
Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique

PubMed Central

2012-01-01

Background Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both these two types of interactions simultaneously would represent gene regulatory networks more accurately. Results In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. Conclusion By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach. PMID:22691450
Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency.

PubMed

Yeh, Hsiang-Yuan; Cheng, Shih-Wu; Lin, Yu-Chun; Yeh, Cheng-Yu; Lin, Shih-Fang; Soo, Von-Wun

2009-12-21

Prostate cancer is a leading cancer worldwide and is characterized by its aggressive metastasis. Owing to its clinical heterogeneity, prostate cancer displays different stages and grades related to the aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease processes, the important gene regulations remain unclear. We present a computational method for automatically inferring genetic regulatory networks from microarray data with transcription factor analysis and conditional independence testing, in order to explore the potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to estimate the missing expression values. We applied web services technology to wrap the bioinformatics toolkits and databases, to automatically extract the promoter regions of DNA sequences, and to predict the transcription factors that regulate gene expression. We adopted microarray datasets consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as a target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are strongly related to a high extent while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may explain clinically the differentiation associated with high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2), regulated by RUNX1 and STAT3, is correlated with the pathological stage.
We provide a computational framework to reconstruct the genetic regulatory network from microarray data using biological knowledge and constraint-based inferences. Our method is helpful in verifying possible interaction relations in gene regulatory networks and filtering out incorrect relations inferred by imperfect methods. We not only predicted individual genes related to cancer but also discovered significant gene regulation networks. Our method is also validated in several enriched published papers and databases, and the significant gene regulatory networks perform critical biological functions and processes including cell adhesion molecules, androgen and estrogen metabolism, smooth muscle contraction, and GO-annotated processes. Those significant gene regulations and the critical concept of tumor progression are useful for understanding cancer biology and disease treatment.
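The KNN imputation step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the distance measure, the value of k, or the data layout, so everything below is an assumption: rows are treated as genes, columns as samples, distance is the root-mean-square difference over jointly observed columns, and the function name `knn_impute` and the toy matrix are hypothetical.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest rows (assumed scheme).

    Distance between two rows is the root-mean-square difference over the
    columns that are observed in both rows.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbour must be a different row with column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Hypothetical toy expression matrix (rows: genes, columns: samples).
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],   # missing value to be imputed
    [0.9, 1.9, 2.8],
    [5.0, 6.0, 7.0],
]
imputed = knn_impute(expression, k=2)
print(imputed[1])
```

The missing entry is filled from the two rows whose observed values are closest, so the distant fourth row does not influence the estimate.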
Protein-protein interaction analysis of Alzheimer's disease and NAFLD based on systems biology methods unhide common ancestor pathways.

PubMed

Karbalaei, Reza; Allahyari, Marzieh; Rezaei-Tavirani, Mostafa; Asadzadeh-Aghdaei, Hamid; Zali, Mohammad Reza

2018-01-01

The aim was to analyze reconstructed networks of two diseases, NAFLD and Alzheimer's disease, and their relationship using systems biology methods. NAFLD and Alzheimer's disease are two complex diseases with rising prevalence and high costs for countries. There are some reports on the relationship between these two diseases and on pathways they share. In addition, they have some similar risk factors, particularly lifestyle factors such as diet and exercise. Therefore, a systems biology approach can help to discover their relationship. The DisGeNET and STRING databases were used as sources of disease genes and for constructing networks. Three plugins of the Cytoscape software, ClusterONE, ClueGO and CluePedia, were used to analyze and cluster networks and to perform pathway enrichment. An R package was used to determine the best centrality method. Finally, hub and bottleneck nodes were defined based on degree and betweenness. NAFLD and Alzheimer's disease shared 190 genes, which were used to construct a network with the STRING database. The resulting network contained 182 nodes and 2591 edges and comprised four clusters. Separate enrichment analysis of these clusters led to carbohydrate metabolism, long chain fatty acid, and regulation of the JAK-STAT and IL-17 signaling pathways, respectively. In addition, seven genes were selected as hub-bottlenecks: IL6, AKT1, TP53, TNF, JUN, VEGFA and PPARG. Enrichment of these proteins and their first neighbors in the network using the OMIM database pointed to diabetes and obesity as ancestors of NAFLD and AD. Systems biology methods, specifically PPI networks, can be useful for analyzing complicated related diseases. Identifying hub and bottleneck proteins should be a goal of drug design and of the introduction of disease markers.
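The hub-bottleneck selection described above (high degree combined with high betweenness) can be sketched in a few lines. This is only an illustration under stated assumptions: the toy graph, its node labels, and the two cut-off values are hypothetical, and betweenness is computed with Brandes' algorithm for unweighted graphs rather than with the Cytoscape or R tools the abstract names.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness (Brandes' algorithm), unweighted undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in score.items()}

# Hypothetical toy network: a 4-node clique attached through d and h to two leaves.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("d", "h"), ("h", "x"), ("h", "y")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree = {v: len(nbrs) for v, nbrs in adj.items()}
bc = betweenness(adj)

DEGREE_CUT = 3          # assumed threshold for calling a node a hub
BETWEENNESS_CUT = 5.0   # assumed threshold for calling a node a bottleneck
hubs = {v for v in adj if degree[v] >= DEGREE_CUT}
bottlenecks = {v for v in adj if bc[v] >= BETWEENNESS_CUT}
hub_bottlenecks = hubs & bottlenecks
print(sorted(hub_bottlenecks))
```

In this toy graph the clique members a, b and c have high degree but carry no shortest paths between other nodes, so only d and h pass both filters.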

Progression of Brain Network Alterations in Cerebral Amyloid Angiopathy.

PubMed

Reijmer, Yael D; Fotiadis, Panagiotis; Riley, Grace A; Xiong, Li; Charidimou, Andreas; Boulouis, Gregoire; Ayres, Alison M; Schwab, Kristin; Rosand, Jonathan; Gurol, M Edip; Viswanathan, Anand; Greenberg, Steven M

2016-10-01

We recently showed that cerebral amyloid angiopathy (CAA) is associated with functionally relevant brain network impairments, in particular affecting posterior white matter connections. Here we examined how these brain network impairments progress over time. Thirty-three patients with probable CAA underwent multimodal brain magnetic resonance imaging at 2 time points (mean follow-up time: 1.3±0.4 years). Brain networks of the hemisphere free of intracerebral hemorrhages were reconstructed using fiber tractography and graph theory. The global efficiency of the network and mean fractional anisotropies of posterior-posterior, frontal-frontal, and posterior-frontal network connections were calculated. Patients with moderate versus severe CAA were defined based on microbleed count, dichotomized at the median (median=35). Global efficiency of the intracerebral hemorrhage-free hemispheric network declined from baseline to follow-up (-0.008±0.003; P=0.029). The decline in global efficiency was most pronounced for patients with severe CAA (group×time interaction P=0.03). The decline in global network efficiency was associated with worse executive functioning (β=0.46; P=0.03). Examination of subgroups of network connections revealed a decline in fractional anisotropies of posterior-posterior connections at both levels of CAA severity (-0.006±0.002; P=0.017; group×time interaction P=0.16). The fractional anisotropies of posterior-frontal and frontal-frontal connections declined in patients with severe but not moderate CAA (group×time interaction P=0.007 and P=0.005). Associations were independent of change in white matter hyperintensity volume. Brain network impairment in patients with CAA worsens measurably over just 1.3 years of follow-up and seems to progress from posterior to frontal connections with increasing disease severity. © 2016 American Heart Association, Inc.
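For readers unfamiliar with the global efficiency measure used above, a minimal sketch follows. It uses the standard graph-theoretic definition, the mean of inverse shortest-path lengths over all ordered node pairs, on an unweighted graph. The abstract does not say whether the tractography networks were weighted, so the unweighted form and the toy path graph are assumptions made purely for illustration.

```python
from collections import deque

def global_efficiency(adj):
    """Mean of 1/d(i, j) over ordered node pairs; unreachable pairs contribute 0."""
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for s in nodes:
        # Breadth-first search gives shortest path lengths from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(1.0 / d for node, d in dist.items() if node != s)
    return total / (n * (n - 1))

# Hypothetical toy network: a path 0-1-2-3, then the same path with one link lost.
intact = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
damaged = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}

eff_intact = global_efficiency(intact)
eff_damaged = global_efficiency(damaged)
print(eff_intact, eff_damaged)
```

Removing a single connection lowers the value because the disconnected node no longer contributes any inverse path lengths, which is the sense in which a decline in global efficiency reflects network impairment.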
Inference of Spatio-Temporal Functions Over Graphs via Multikernel Kriged Kalman Filtering

NASA Astrophysics Data System (ADS)

Ioannidis, Vassilis N.; Romero, Daniel; Giannakis, Georgios B.

2018-06-01

Inference of space-time varying signals on graphs emerges naturally in a plethora of network science related applications. A frequently encountered challenge pertains to reconstructing such dynamic processes, given their values over a subset of vertices and time instants. The present paper develops a graph-aware kernel-based kriged Kalman filter that accounts for the spatio-temporal variations, and offers efficient online reconstruction, even for dynamically evolving network topologies. The kernel-based learning framework bypasses the need for statistical information by capitalizing on the smoothness that graph signals exhibit with respect to the underlying graph. To address the challenge of selecting the appropriate kernel, the proposed filter is combined with a multi-kernel selection module. Such a data-driven method selects a kernel attuned to the signal dynamics on-the-fly within the linear span of a pre-selected dictionary. The novel multi-kernel learning algorithm exploits the eigenstructure of Laplacian kernel matrices to reduce computational complexity. Numerical tests with synthetic and real data demonstrate the superior reconstruction performance of the novel approach relative to state-of-the-art alternatives.
Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

DOE PAGES

Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.; ...

2015-03-27

Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
Integrated in silico analyses of regulatory and metabolic networks of Synechococcus sp. PCC 7002 reveal relationships between gene centrality and essentiality

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Hyun-Seob; McClure, Ryan S.; Bernstein, Hans C.

Cyanobacteria dynamically relay environmental inputs to intracellular adaptations through a coordinated adjustment of photosynthetic efficiency and carbon processing rates. The output of such adaptations is reflected through changes in transcriptional patterns and metabolic flux distributions that ultimately define growth strategy. To address interrelationships between metabolism and regulation, we performed integrative analyses of metabolic and gene co-expression networks in a model cyanobacterium, Synechococcus sp. PCC 7002. Centrality analyses using the gene co-expression network identified a set of key genes, which were defined here as ‘topologically important.’ Parallel in silico gene knock-out simulations, using the genome-scale metabolic network, classified what we termed ‘functionally important’ genes, deletion of which affected growth or metabolism. A strong positive correlation was observed between topologically and functionally important genes. Functionally important genes exhibited variable levels of topological centrality; however, the majority of topologically central genes were found to be functionally essential for growth. Subsequent functional enrichment analysis revealed that both functionally and topologically important genes in Synechococcus sp. PCC 7002 are predominantly associated with translation and energy metabolism, two cellular processes critical for growth. This research demonstrates how synergistic network-level analyses can be used for reconciliation of metabolic and gene expression data to uncover fundamental biological principles.
Efficient Reverse-Engineering of a Developmental Gene Regulatory Network

PubMed Central

Cicin-Sain, Damjan; Ashyraliyev, Maksat; Jaeger, Johannes

2012-01-01

Understanding the complex regulatory networks underlying development and evolution of multi-cellular organisms is a major problem in biology. Computational models can be used as tools to extract the regulatory structure and dynamics of such networks from gene expression data. This approach is called reverse engineering. It has been successfully applied to many gene networks in various biological systems. However, to reconstitute the structure and non-linear dynamics of a developmental gene network in its spatial context remains a considerable challenge. Here, we address this challenge using a case study: the gap gene network involved in segment determination during early development of Drosophila melanogaster. A major problem for reverse-engineering pattern-forming networks is the significant amount of time and effort required to acquire and quantify spatial gene expression data. We have developed a simplified data processing pipeline that considerably increases the throughput of the method, but results in data of reduced accuracy compared to those previously used for gap gene network inference. We demonstrate that we can infer the correct network structure using our reduced data set, and investigate minimal data requirements for successful reverse engineering. Our results show that timing and position of expression domain boundaries are the crucial features for determining regulatory network structure from data, while it is less important to precisely measure expression levels. Based on this, we define minimal data requirements for gap gene network inference. Our results demonstrate the feasibility of reverse-engineering with much reduced experimental effort. This enables more widespread use of the method in different developmental contexts and organisms. Such systematic application of data-driven models to real-world networks has enormous potential. 
Only the quantitative investigation of a large number of developmental gene regulatory networks will allow us to discover whether there are rules or regularities governing development and evolution of complex multi-cellular organisms. PMID:22807664
A framework for scalable parameter estimation of gene circuit models using structural information.

PubMed

Kuwahara, Hiroyuki; Fan, Ming; Wang, Suojin; Gao, Xin

2013-07-01

Systematic and scalable parameter estimation is key to constructing complex gene regulatory models and to ultimately facilitating an integrative systems biology approach to quantitatively understand the molecular mechanisms underpinning gene regulation. Here, we report a novel framework for efficient and scalable parameter estimation that focuses specifically on modeling of gene circuits. Exploiting the structure commonly found in gene circuit models, this framework decomposes a system of coupled rate equations into individual ones and efficiently integrates them separately to reconstruct the mean time evolution of the gene products. The accuracy of the parameter estimates is refined by iteratively increasing the accuracy of numerical integration using the model structure. As a case study, we applied our framework to four gene circuit models with complex dynamics based on three synthetic datasets and one time series microarray data set. We compared our framework to three state-of-the-art parameter estimation methods and found that our approach consistently and efficiently generated higher quality parameter solutions. Although many general-purpose parameter estimation methods have been applied to the modeling of gene circuits, our results suggest that more tailored approaches that use domain-specific information may be key to reverse engineering of complex biological systems. http://sfb.kaust.edu.sa/Pages/Software.aspx. Supplementary data are available at Bioinformatics online.
Knowledge Discovery in Spectral Data by Means of Complex Networks

PubMed Central

Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro

2013-01-01

In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease. PMID:24957895
Knowledge discovery in spectral data by means of complex networks.

PubMed

Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Jaimes-Reategui, Rider; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro

2013-03-11

In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical gaussian models and bayesian networks.

PubMed

Werhli, Adriano V; Grzegorczyk, Marco; Husmeier, Dirk

2006-10-15

An important problem in systems biology is the inference of biochemical pathways and regulatory networks from postgenomic data. Various reverse engineering methods have been proposed in the literature, and it is important to understand their relative merits and shortcomings. In the present paper, we compare the accuracy of reconstructing gene regulatory networks with three different modelling and inference paradigms: (1) Relevance networks (RNs): pairwise association scores independent of the remaining network; (2) graphical Gaussian models (GGMs): undirected graphical models with constraint-based inference, and (3) Bayesian networks (BNs): directed graphical models with score-based inference. The evaluation is carried out on the Raf pathway, a cellular signalling network describing the interaction of 11 phosphorylated proteins and phospholipids in human immune system cells. We use both laboratory data from cytometry experiments as well as data simulated from the gold-standard network. We also compare passive observations with active interventions. On Gaussian observational data, BNs and GGMs were found to outperform RNs. The difference in performance was not significant for the non-linear simulated data and the cytoflow data, though. Also, we did not observe a significant difference between BNs and GGMs on observational data in general. However, for interventional data, BNs outperform GGMs and RNs, especially when taking the edge directions rather than just the skeletons of the graphs into account. This suggests that the higher computational costs of inference with BNs over GGMs and RNs are not justified when using only passive observations, but that active interventions in the form of gene knockouts and over-expressions are required to exploit the full potential of BNs. Data, software and supplementary material are available from http://www.bioss.sari.ac.uk/staff/adriano/research.html
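The difference between the first two paradigms compared above can be made concrete with a small sketch. A relevance network scores a pair by its marginal association, whereas a graphical Gaussian model scores it by partial correlation, which removes the influence of the other variables. The example below assumes a hypothetical three-variable chain a → b → c with Gaussian noise and uses the closed-form three-variable partial correlation; it is not the evaluation protocol or data of the paper.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical simulated chain: a drives b, and b drives c.
rng = random.Random(0)
n = 2000
a = [rng.gauss(0, 1) for _ in range(n)]
b = [v + 0.5 * rng.gauss(0, 1) for v in a]
c = [v + 0.5 * rng.gauss(0, 1) for v in b]

marginal_ac = pearson(a, c)         # relevance-network style score
partial_ac = partial_corr(a, c, b)  # GGM style score, conditioned on b
print(round(marginal_ac, 3), round(partial_ac, 3))
```

The marginal score for the indirect pair (a, c) is large, so a relevance network would draw a spurious edge, while the partial correlation is close to zero once b is accounted for.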
Unsupervised Network Analysis of the Plastic Supraoptic Nucleus Transcriptome Predicts Caprin2 Regulatory Interactions

PubMed Central

Jahans-Price, Thomas; Greenwood, Michael P.; Greenwood, Mingkwan; Hoe, See-Ziau; Konopacka, Agnieszka

2017-01-01

The supraoptic nucleus (SON) is a group of neurons in the hypothalamus responsible for the synthesis and secretion of the peptide hormones vasopressin and oxytocin. Following physiological cues, such as dehydration, salt-loading and lactation, the SON undergoes a function-related plasticity that we have previously described in the rat at the transcriptome level. Using the unsupervised graphical lasso (Glasso) algorithm, we reconstructed a putative network from 500 plastic SON genes in which genes are the nodes and the edges are the inferred interactions. The most active nodal gene identified within the network was Caprin2. Caprin2 encodes an RNA-binding protein that we have previously shown to be vital for the functioning of osmoregulatory neuroendocrine neurons in the SON of the rat hypothalamus. To test the validity of the Glasso network, we either overexpressed or knocked down Caprin2 transcripts in differentiated rat pheochromocytoma PC12 cells and showed that these manipulations had significant opposite effects on the levels of putative target mRNAs. These studies suggest that the predictive power of the Glasso algorithm within an in vivo system is accurate, and identifies biological targets that may be important to the functional plasticity of the SON. PMID:29279858
ITEP: an integrated toolkit for exploration of microbial pan-genomes.

PubMed

Benedict, Matthew N; Henriksen, James R; Metcalf, William W; Whitaker, Rachel J; Price, Nathan D

2014-01-03

Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitutes their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network evolution.
ITEP is a powerful, flexible toolkit for generation and curation of protein families. ITEP's modular design allows for straightforward extension as analysis methods and tools evolve. By integrating comparative genomics with the development of draft metabolic networks, ITEP harnesses the power of comparative genomics to build confidence in links between genotype and phenotype and helps disambiguate gene annotations when they are evaluated in both evolutionary and metabolic network contexts.
An experimentally validated network of nine haematopoietic transcription factors reveals mechanisms of cell state stability

PubMed Central

Schütte, Judith; Wang, Huange; Antoniou, Stella; Jarratt, Andrew; Wilson, Nicola K; Riepsaame, Joey; Calero-Nieto, Fernando J; Moignard, Victoria; Basilico, Silvia; Kinston, Sarah J; Hannah, Rebecca L; Chan, Mun Chiang; Nürnberg, Sylvia T; Ouwehand, Willem H; Bonzanni, Nicola; de Bruijn, Marella FTR; Göttgens, Berthold

2016-01-01

Transcription factor (TF) networks determine cell-type identity by establishing and maintaining lineage-specific expression profiles, yet reconstruction of mammalian regulatory network models has been hampered by a lack of comprehensive functional validation of regulatory interactions. Here, we report comprehensive ChIP-Seq, transgenic and reporter gene experimental data that have allowed us to construct an experimentally validated regulatory network model for haematopoietic stem/progenitor cells (HSPCs). Model simulation coupled with subsequent experimental validation using single cell expression profiling revealed potential mechanisms for cell state stabilisation, and also how a leukaemogenic TF fusion protein perturbs key HSPC regulators. The approach presented here should help to improve our understanding of both normal physiological and disease processes. DOI: http://dx.doi.org/10.7554/eLife.11469.001 PMID:26901438
Synthetic incoherent feedforward circuits show adaptation to the amount of their genetic template

PubMed Central

Bleris, Leonidas; Xie, Zhen; Glass, David; Adadey, Asa; Sontag, Eduardo; Benenson, Yaakov

2011-01-01

Natural and synthetic biological networks must function reliably in the face of fluctuating stoichiometry of their molecular components. These fluctuations are caused in part by changes in relative expression efficiency and the DNA template amount of the network-coding genes. Gene product levels could potentially be decoupled from these changes via built-in adaptation mechanisms, thereby boosting network reliability. Here, we show that a mechanism based on an incoherent feedforward motif enables adaptive gene expression in mammalian cells. We modeled, synthesized, and tested transcriptional and post-transcriptional incoherent loops and found that in all cases the gene product adapts to changes in DNA template abundance. We also observed that the post-transcriptional form results in superior adaptation behavior, higher absolute expression levels, and lower intrinsic fluctuations. Our results support a previously hypothesized endogenous role in gene dosage compensation for such motifs and suggest that their incorporation in synthetic networks will improve their robustness and reliability. PMID:21811230
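The adaptation property described above can be illustrated with a deliberately simple steady-state calculation. The model below is a hypothetical sketch, not the authors' model: it assumes the template copy number drives both the output and a repressor in proportion, and that the repressor divides the output by a saturating term. The functions `unregulated_output` and `iffl_output` and the parameters `alpha`, `beta` and `k` are invented for illustration.

```python
def unregulated_output(copies, alpha=1.0):
    """Output proportional to template copy number (no feedforward repression)."""
    return alpha * copies

def iffl_output(copies, alpha=1.0, beta=1.0, k=1.0):
    """Assumed incoherent feedforward steady state.

    The template produces the output (rate alpha * copies) and, in parallel,
    a repressor (level beta * copies) that attenuates the output.
    """
    repressor = beta * copies
    return alpha * copies / (1.0 + repressor / k)

low, high = 10, 100  # a tenfold change in template amount
fold_unregulated = unregulated_output(high) / unregulated_output(low)
fold_iffl = iffl_output(high) / iffl_output(low)
print(fold_unregulated, round(fold_iffl, 3))
```

Under these assumptions the output approaches alpha * k / beta once the repressor level is well above k, so a tenfold change in template amount shifts the output by less than ten percent, whereas the unregulated output scales tenfold.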
Revealing the Strong Functional Association of adipor2 and cdh13 with adipoq: A Gene Network Study.

PubMed

Bag, Susmita; Anbarasu, Anand

2015-04-01

In the present study, we have analyzed functional gene interactions of the adiponectin gene (adipoq). The key role of adipoq is in regulating energy homeostasis, and it functions as a novel signaling molecule for adipose tissue. Modules of highly inter-connected genes in the disease-specific adipoq network are derived by integrating gene function and protein interaction data. Among twenty genes in the adipoq web, adipoq is effectively conjoined with two genes: Adiponectin receptor 2 (adipor2) and cadherin 13 (cdh13). The functional analysis is done via ontological briefing and candidate disease identification. We observed that the most highly interlinked genes connected with adipoq are adipor2 and cdh13. Interestingly, the ontological aspect of adipor2 and cdh13 in the adipoq network reveals that adipoq and adipor2 are involved mostly in glucose and lipid metabolic processes. The gene cdh13 participates in the cell adhesion process with adipoq and adipor2. Our computational gene web analysis also predicts potential candidate diseases, indicating the involvement of adipoq, adipor2, and cdh13 not only with obesity but also with breast cancer, leukemia, renal cancer, lung cancer, and cervical cancer. The current study provides researchers with a comprehensible layout of the adipoq network, its functional strategies and the candidate disease approach associated with the adipoq network.
Graph Curvature for Differentiating Cancer Networks

PubMed Central

Sandhu, Romeil; Georgiou, Tryphon; Reznik, Ed; Zhu, Liangjia; Kolesov, Ivan; Senbabaoglu, Yasin; Tannenbaum, Allen

2015-01-01

Cellular interactions can be modeled as complex dynamical systems represented by weighted graphs. The functionality of such networks, including measures of robustness, reliability, performance, and efficiency, is intrinsically tied to the topology and geometry of the underlying graph. Utilizing recently proposed geometric notions of curvature on weighted graphs, we investigate the features of gene co-expression networks derived from large-scale genomic studies of cancer. We find that the curvature of these networks reliably distinguishes between cancer and normal samples, with cancer networks exhibiting higher curvature than their normal counterparts. We establish a quantitative relationship between our findings and prior investigations of network entropy. Furthermore, we demonstrate how our approach yields additional, non-trivial pair-wise (i.e. gene-gene) interactions which may be disrupted in cancer samples. The mathematical formulation of our approach yields an exact solution to calculating pair-wise changes in curvature, which was computationally infeasible using prior methods. As such, our findings lay the foundation for an analytical approach to studying complex biological networks. PMID:26169480
Pre-Clinical Drug Prioritization via Prognosis-Guided Genetic Interaction Networks

PubMed Central

Xiong, Jianghui; Liu, Juan; Rayner, Simon; Tian, Ze; Li, Yinghui; Chen, Shanguang

2010-01-01

The high rates of failure in oncology drug clinical trials highlight the problems of using pre-clinical data to predict the clinical effects of drugs. Patient population heterogeneity and unpredictable physiology complicate pre-clinical cancer modeling efforts. We hypothesize that gene networks associated with cancer outcome in heterogeneous patient populations could serve as a reference for identifying drug effects. Here we propose a novel in vivo genetic interaction which we call ‘synergistic outcome determination’ (SOD), a concept similar to ‘Synthetic Lethality’. SOD is defined as the synergy of a gene pair with respect to cancer patients' outcome, whose correlation with outcome is due to cooperative, rather than independent, contributions of genes. The method combines microarray gene expression data with cancer prognostic information to identify synergistic gene-gene interactions that are then used to construct interaction networks based on gene modules (a group of genes which share similar function). In this way, we identified a cluster of important epigenetically regulated gene modules. By projecting drug sensitivity-associated genes on to the cancer-specific inter-module network, we defined a perturbation index for each drug based upon its characteristic perturbation pattern on the inter-module network. Finally, by calculating this index for compounds in the NCI Standard Agent Database, we significantly discriminated successful drugs from a broad set of test compounds, and further revealed the mechanisms of drug combinations. Thus, prognosis-guided synergistic gene-gene interaction networks could serve as an efficient in silico tool for pre-clinical drug prioritization and rational design of combinatorial therapies. PMID:21085674
Algebraic model checking for Boolean gene regulatory networks.

PubMed

Tran, Quoc-Nam

2011-01-01

We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method will never grow, resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.
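For readers unfamiliar with the attractor problem mentioned in this abstract, the following minimal sketch enumerates the attractors of a tiny synchronous Boolean network by brute-force traversal of the state space. It is not the Groebner-basis method described above; the three-gene update rules and the names `step` and `find_attractors` are hypothetical and chosen only for illustration.

```python
from itertools import product

def step(state):
    """Synchronous update of a hypothetical 3-gene network (a, b, c)."""
    a, b, c = state
    # Assumed rules: a' = NOT b, b' = a, c' = a AND b
    return (not b, a, a and b)

def find_attractors(update, n_genes):
    """Return all attractors, each as a frozenset of the states on its cycle."""
    attractors = set()
    for start in product((False, True), repeat=n_genes):
        seen = {}          # state -> position in the trajectory
        trajectory = []
        state = start
        while state not in seen:
            seen[state] = len(trajectory)
            trajectory.append(state)
            state = update(state)
        # The first repeated state marks where the cycle begins.
        attractors.add(frozenset(trajectory[seen[state]:]))
    return attractors

attractors = find_attractors(step, 3)
```

Exhaustive search visits all 2^n states, so it is only practical for very small networks, which is the scaling limitation that algebraic and model-checking approaches try to address.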
Systems Biomedicine of Rabies Delineates the Affected Signaling Pathways

PubMed Central

Azimzadeh Jamalkandi, Sadegh; Mozhgani, Sayed-Hamidreza; Gholami Pourbadie, Hamid; Mirzaie, Mehdi; Noorbakhsh, Farshid; Vaziri, Behrouz; Gholami, Alireza; Ansari-Pour, Naser; Jafari, Mohieddin

2016-01-01

The prototypical neurotropic virus, rabies, is a member of the Rhabdoviridae family that causes lethal encephalomyelitis. Although there have been a plethora of studies investigating the etiological mechanism of the rabies virus and many precautionary methods have been implemented to avert the disease outbreak over the last century, the disease surprisingly has no definite remedy at its late stages. The psychological symptoms and the underlying etiology, as well as the rare survival rate from rabies encephalitis, still remain a mystery. We, therefore, undertook a systems biomedicine approach to identify the network of gene products implicated in rabies. This was done by meta-analyzing whole-transcriptome microarray datasets of the CNS infected by strain CVS-11, and integrating them with interactome data using computational and statistical methods. We first determined the differentially expressed genes (DEGs) in each study and horizontally integrated the results at the mRNA and microRNA levels separately. A total of 61 seed genes involved in signal propagation system were obtained by means of unifying mRNA and microRNA detected integrated DEGs. We then reconstructed a refined protein–protein interaction network (PPIN) of infected cells to elucidate the rabies-implicated signal transduction network (RISN). To validate our findings, we confirmed differential expression of randomly selected genes in the network using Real-time PCR. In conclusion, the identification of seed genes and their network neighborhood within the refined PPIN can be useful for demonstrating signaling pathways including interferon circumvent, toward proliferation and survival, and neuropathological clue, explaining the intricate underlying molecular neuropathology of rabies infection and thus rendering a molecular framework for predicting potential drug targets. PMID:27872612
An algebra-based method for inferring gene regulatory networks.

PubMed

Vera-Licona, Paola; Jarrah, Abdul; Garcia-Puente, Luis David; McGee, John; Laubenbacher, Reinhard

2014-03-26

The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. 
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the dynamic patterns present in the network. Boolean polynomial dynamical systems provide a powerful modeling framework for the reverse engineering of gene regulatory networks that enables a rich mathematical structure on the model search space. A C++ implementation of the method, distributed under the LGPL license, is available, together with the source code, at http://www.paola-vera-licona.net/Software/EARevEng/REACT.html.
Genotet: An Interactive Web-based Visual Exploration Framework to Support Validation of Gene Regulatory Networks.

PubMed

Yu, Bowen; Doraiswamy, Harish; Chen, Xi; Miraldi, Emily; Arrieta-Ortiz, Mario Luis; Hafemeister, Christoph; Madar, Aviv; Bonneau, Richard; Silva, Cláudio T

2014-12-01

Elucidation of transcriptional regulatory networks (TRNs) is a fundamental goal in biology, and one of the most important components of TRNs are transcription factors (TFs), proteins that specifically bind to gene promoter and enhancer regions to alter target gene expression patterns. Advances in genomic technologies as well as advances in computational biology have led to multiple large regulatory network models (directed networks) each with a large corpus of supporting data and gene-annotation. There are multiple possible biological motivations for exploring large regulatory network models, including: validating TF-target gene relationships, figuring out co-regulation patterns, and exploring the coordination of cell processes in response to changes in cell state or environment. Here we focus on queries aimed at validating regulatory network models, and on coordinating visualization of primary data and directed weighted gene regulatory networks. The large size of both the network models and the primary data can make such coordinated queries cumbersome with existing tools and, in particular, inhibits the sharing of results between collaborators. In this work, we develop and demonstrate a web-based framework for coordinating visualization and exploration of expression data (RNA-seq, microarray), network models and gene-binding data (ChIP-seq). Using specialized data structures and multiple coordinated views, we design an efficient querying model to support interactive analysis of the data. Finally, we show the effectiveness of our framework through case studies for the mouse immune system (a dataset focused on a subset of key cellular functions) and a model bacteria (a small genome with high data-completeness).

Network-constrained group lasso for high-dimensional multinomial classification with application to cancer subtype prediction.

PubMed

Tian, Xinyu; Wang, Xuefeng; Chen, Jun

2014-01-01

The classic multinomial logit model, commonly used in multiclass regression problems, is restricted to few predictors and does not take into account the relationships among variables. It has limited use for genomic data, where the number of genomic features far exceeds the sample size. Genomic features such as gene expressions are usually related by an underlying biological network. Efficient use of the network information is important to improve classification performance as well as biological interpretability. We proposed a multinomial logit model that is capable of addressing both the high dimensionality of predictors and the underlying network information. Group lasso was used to induce model sparsity, and a network constraint was imposed to induce the smoothness of the coefficients with respect to the underlying network structure. To deal with the non-smoothness of the objective function in optimization, we developed a proximal gradient algorithm for efficient computation. The proposed model was compared to models with no prior structure information in both simulations and a problem of cancer subtype prediction with real TCGA (The Cancer Genome Atlas) gene expression data. The network-constrained model outperformed the traditional ones in both cases.
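The non-smooth part of a group-lasso objective is typically handled in a proximal gradient method by group soft-thresholding. The sketch below shows only that proximal step, under the assumption of non-overlapping groups; it omits the multinomial likelihood and the network-smoothness term of the abstract, and the function name and arguments are hypothetical.

```python
import math

def group_soft_threshold(coeffs, groups, threshold):
    """Proximal operator of threshold * sum_g ||beta_g||_2.

    coeffs: list of floats.
    groups: list of index lists that partition the coefficients (assumed).
    """
    out = list(coeffs)
    for idx in groups:
        norm = math.sqrt(sum(coeffs[i] ** 2 for i in idx))
        # Shrink the whole group toward zero; drop it if its norm is small.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * coeffs[i]
    return out

shrunk = group_soft_threshold([3.0, 4.0, 0.1, 0.2], [[0, 1], [2, 3]], 1.0)
```

In a proximal gradient iteration this operator would be applied to the coefficients after a gradient step on the smooth part of the objective, with the threshold set to the step size times the penalty weight.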
Zebra: A striped network file system

NASA Technical Reports Server (NTRS)

Hartman, John H.; Ousterhout, John K.

1992-01-01

The design of Zebra, a striped network file system, is presented. Zebra applies ideas from log-structured file system (LFS) and RAID research to network file systems, resulting in a network file system that has scalable performance, uses its servers efficiently even when its applications are using small files, and provides high availability. Zebra stripes file data across multiple servers, so that the file transfer rate is not limited by the performance of a single server. High availability is achieved by maintaining parity information for the file system. If a server fails its contents can be reconstructed using the contents of the remaining servers and the parity information. Zebra differs from existing striped file systems in the way it stripes file data: Zebra does not stripe on a per-file basis; instead it stripes the stream of bytes written by each client. Clients write to the servers in units called stripe fragments, which are analogous to segments in an LFS. Stripe fragments contain file blocks that were written recently, without regard to which file they belong. This method of striping has numerous advantages over per-file striping, including increased server efficiency, efficient parity computation, and elimination of parity update.
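The parity-based recovery described in this abstract can be illustrated with a RAID-style XOR parity over equal-length fragments. This is a minimal sketch of the general idea, not Zebra's actual implementation; the function names and the fixed-size fragments are assumptions.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity(fragments):
    """Parity fragment for a stripe of equal-length data fragments."""
    return reduce(xor_bytes, fragments)

def reconstruct(surviving, parity_fragment):
    """Rebuild the single missing fragment from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity_fragment)

stripe = [b"abcd", b"efgh", b"ijkl"]   # one fragment per (hypothetical) server
p = parity(stripe)
recovered = reconstruct([stripe[0], stripe[2]], p)   # server 1 has failed
```

Because XOR is its own inverse, combining the parity with every surviving fragment leaves exactly the lost fragment; this tolerates the loss of one fragment per stripe.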
Background rejection in NEXT using deep neural networks

DOE PAGES

Renner, J.; Farbin, A.; Vidal, J. Muñoz; ...

2017-01-16

Here, we investigate the potential of using deep learning techniques to reject background events in searches for neutrinoless double beta decay with high pressure xenon time projection chambers capable of detailed track reconstruction. The differences in the topological signatures of background and signal events can be learned by deep neural networks via training over many thousands of events. These networks can then be used to classify further events as signal or background, providing an additional background rejection factor at an acceptable loss of efficiency. The networks trained in this study performed better than previous methods developed based on the use of the same topological signatures by a factor of 1.2 to 1.6, and there is potential for further improvement.
Genetic networks and soft computing.

PubMed

Mitra, Sushmita; Das, Ranajit; Hayashi, Yoichi

2011-01-01

The analysis of gene regulatory networks provides enormous information on various fundamental cellular processes involving growth, development, hormone secretion, and cellular communication. Their extraction from available gene expression profiles is a challenging problem. Such reverse engineering of genetic networks offers insight into cellular activity toward prediction of adverse effects of new drugs or possible identification of new drug targets. Tasks such as classification, clustering, and feature selection enable efficient mining of knowledge about gene interactions in the form of networks. It is known that biological data is prone to different kinds of noise and ambiguity. Soft computing tools, such as fuzzy sets, evolutionary strategies, and neurocomputing, have been found to be helpful in providing low-cost, acceptable solutions in the presence of various types of uncertainties. In this paper, we survey the role of these soft methodologies and their hybridizations, for the purpose of generating genetic networks.
Functional brain networks reconstruction using group sparsity-regularized learning.

PubMed

Zhao, Qinghua; Li, Will X Y; Jiang, Xi; Lv, Jinglei; Lu, Jianfeng; Liu, Tianming

2018-06-01

Investigating functional brain networks and patterns using sparse representation of fMRI data has received significant interest in the neuroimaging community. It has been reported that sparse representation is effective in reconstructing concurrent and interactive functional brain networks. To date, most data-driven network reconstruction approaches rarely take into consideration anatomical structures, which are the substrate of brain function. Furthermore, it has been rarely explored whether structured sparse representation with anatomical guidance could facilitate functional network reconstruction. To address this problem, in this paper, we propose to reconstruct brain networks utilizing the structure guided group sparse regression (S2GSR) in which 116 anatomical regions from the AAL template, as prior knowledge, are employed to guide the network reconstruction when performing sparse representation of whole-brain fMRI data. Specifically, we extract fMRI signals from standard space aligned with the AAL template. Then by learning a global over-complete dictionary, with the learned dictionary as a set of features (regressors), the group structured regression employs anatomical structures as group information to regress whole brain signals. Finally, the decomposition coefficients matrix is mapped back to the brain volume to represent functional brain networks and patterns. We use the publicly available Human Connectome Project (HCP) Q1 dataset as the test bed, and the experimental results indicate that the proposed anatomically guided structure sparse representation is effective in reconstructing concurrent functional brain networks.
A group LASSO-based method for robustly inferring gene regulatory networks from multiple time-course datasets.

PubMed

Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun

2014-01-01

As an abstract mapping of the gene regulations in the cell, gene regulatory network is important to both biological research study and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets might be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets as well as to take the robustness to large error or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulation datasets and real experimental datasets. It shows that Huber group LASSO outperforms the group LASSO in terms of both areas under receiver operating characteristic curves and areas under the precision-recall curves. The convergence analysis of the algorithm theoretically shows that the sequence generated from the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of the Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
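The robustness to outliers mentioned in this abstract comes from replacing the squared-error loss with a Huber-type loss, which is quadratic for small residuals and linear for large ones. The sketch below uses the standard textbook form of the Huber loss; the exact parametrization used in the paper is not given in the abstract, so the `delta` default is an assumption.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss: quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)

small = huber_loss(0.5)   # quadratic regime
large = huber_loss(3.0)   # linear regime: grows more slowly than 0.5 * r**2
```

Because large residuals contribute linearly rather than quadratically, a single outlying expression measurement has a bounded influence on the fitted coefficients.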
iCN718, an Updated and Improved Genome-Scale Metabolic Network Reconstruction of Acinetobacter baumannii AYE.

PubMed

Norsigian, Charles J; Kavvas, Erol; Seif, Yara; Palsson, Bernhard O; Monk, Jonathan M

2018-01-01

Acinetobacter baumannii has become an urgent clinical threat due to the recent emergence of multi-drug resistant strains. There is thus a significant need to discover new therapeutic targets in this organism. One means for doing so is through the use of high-quality genome-scale reconstructions. Well-curated and accurate genome-scale models (GEMs) of A. baumannii would be useful for improving treatment options. We present an updated and improved genome-scale reconstruction of A. baumannii AYE, named iCN718, that improves and standardizes previous A. baumannii AYE reconstructions. iCN718 has 80% accuracy for predicting gene essentiality data and additionally can predict large-scale phenotypic data with as much as 89% accuracy, a new capability for an A. baumannii reconstruction. We further demonstrate that iCN718 can be used to analyze conserved metabolic functions in the A. baumannii core genome and to build strain-specific GEMs of 74 other A. baumannii strains from genome sequence alone. iCN718 will serve as a resource to integrate and synthesize new experimental data being generated for this urgent threat pathogen.
Sequence- and Structure-Based Functional Annotation and Assessment of Metabolic Transporters in Aspergillus oryzae: A Representative Case Study

PubMed Central

Raethong, Nachon; Wong-ekkabut, Jirasak; Laoteng, Kobkul; Vongsangnak, Wanwipa

2016-01-01

Aspergillus oryzae is widely used for the industrial production of enzymes. In A. oryzae metabolism, transporters appear to play crucial roles in controlling the flux of molecules for energy generation, nutrients delivery, and waste elimination in the cell. While the A. oryzae genome sequence is available, transporter annotation remains limited and thus the connectivity of metabolic networks is incomplete. In this study, we developed a metabolic annotation strategy to understand the relationship between the sequence, structure, and function for annotation of A. oryzae metabolic transporters. Sequence-based analysis with manual curation showed that 58 genes of 12,096 total genes in the A. oryzae genome encoded metabolic transporters. Under consensus integrative databases, 55 unambiguous metabolic transporter genes were distributed into channels and pores (7 genes), electrochemical potential-driven transporters (33 genes), and primary active transporters (15 genes). To reveal the transporter functional role, a combination of homology modeling and molecular dynamics simulation was implemented to assess the relationship between sequence to structure and structure to function. As in the energy metabolism of A. oryzae, the H+-ATPase encoded by the AO090005000842 gene was selected as a representative case study of multilevel linkage annotation. Our developed strategy can be used for enhancing metabolic network reconstruction. PMID:27274991
Structural network alterations and neurological dysfunction in cerebral amyloid angiopathy

PubMed Central

Reijmer, Yael D.; Fotiadis, Panagiotis; Martinez-Ramirez, Sergi; Salat, David H.; Schultz, Aaron; Shoamanesh, Ashkan; Ayres, Alison M.; Vashkevich, Anastasia; Rosas, Diana; Schwab, Kristin; Leemans, Alexander; Biessels, Geert-Jan; Rosand, Jonathan; Johnson, Keith A.; Viswanathan, Anand; Gurol, M. Edip

2015-01-01

Cerebral amyloid angiopathy is a common form of small-vessel disease and an important risk factor for cognitive impairment. The mechanisms linking small-vessel disease to cognitive impairment are not well understood. We hypothesized that in patients with cerebral amyloid angiopathy, multiple small spatially distributed lesions affect cognition through disruption of brain connectivity. We therefore compared the structural brain network in patients with cerebral amyloid angiopathy to healthy control subjects and examined the relationship between markers of cerebral amyloid angiopathy-related brain injury, network efficiency, and potential clinical consequences. Structural brain networks were reconstructed from diffusion-weighted magnetic resonance imaging in 38 non-demented patients with probable cerebral amyloid angiopathy (69 ± 10 years) and 29 similar aged control participants. The efficiency of the brain network was characterized using graph theory and brain amyloid deposition was quantified by Pittsburgh compound B retention on positron emission tomography imaging. Global efficiency of the brain network was reduced in patients compared to controls (0.187 ± 0.018 and 0.201 ± 0.015, respectively, P < 0.001). Network disturbances were most pronounced in the occipital, parietal, and posterior temporal lobes. Among patients, lower global network efficiency was related to higher cortical amyloid load (r = −0.52; P = 0.004), and to magnetic resonance imaging markers of small-vessel disease including increased white matter hyperintensity volume (P < 0.001), lower total brain volume (P = 0.02), and number of microbleeds (trend P = 0.06). Lower global network efficiency was also related to worse performance on tests of processing speed (r = 0.58, P < 0.001), executive functioning (r = 0.54, P = 0.001), gait velocity (r = 0.41, P = 0.02), but not memory. 
Correlations with cognition were independent of age, sex, education level, and other magnetic resonance imaging markers of small-vessel disease. These findings suggest that reduced structural brain network efficiency might mediate the relationship between advanced cerebral amyloid angiopathy and neurologic dysfunction and that such large-scale brain network measures may represent useful outcome markers for tracking disease progression. PMID:25367025
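Global efficiency, the graph-theoretic measure reported in this abstract, is the average inverse shortest-path length over all pairs of distinct nodes. The following minimal sketch computes it for an unweighted, undirected graph using breadth-first search; the study itself may have used weighted networks, and the function name and the adjacency-dictionary input format are assumptions.

```python
from collections import deque

def global_efficiency(adjacency):
    """Mean of 1/d(i, j) over ordered pairs of distinct nodes.

    adjacency: dict mapping each node to an iterable of its neighbours.
    Unreachable pairs contribute 0 (inverse of infinite distance).
    """
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0.0
    for source in nodes:
        # Breadth-first search gives hop distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))

path = {0: [1], 1: [0, 2], 2: [1]}          # 0 - 1 - 2
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

A fully connected graph has efficiency 1, and removing connections lowers the value, which is the sense in which disrupted connectivity reduces network efficiency.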
Network reconstructions with partially available data

NASA Astrophysics Data System (ADS)

Zhang, Chaoyang; Chen, Yang; Hu, Gang

2017-06-01

Many practical systems in natural and social sciences can be described by dynamical networks. Huge amounts of data are continually measured and accumulated from these networks and can be used to further our understanding of the world. The structures of the networks producing these data are often unknown. Consequently, understanding the structures of these networks from available data turns out to be one of the central issues in interdisciplinary fields; this is called the network reconstruction problem. In this paper, we considered problems of network reconstruction using partially available data and some situations where data availability is not sufficient for conventional network reconstruction. Furthermore, we proposed to infer a subnetwork when only data from that subnetwork are available and the other nodes of the entire network are hidden; to depict group-group interactions in networks when only averages of groups of node variables are available; and to perform network reconstruction with known data of node variables only when networks are driven by both unknown internal fast-varying noise and unknown external slowly-varying signals. All these situations are expected to be common in practical systems, and the methods and results may be useful for real-world applications.
Strategy on energy saving reconstruction of distribution networks based on life cycle cost

NASA Astrophysics Data System (ADS)

Chen, Xiaofei; Qiu, Zejing; Xu, Zhaoyang; Xiao, Chupeng

2017-08-01

Because the actual distribution network reconstruction project funds are often limited, the cost-benefit model and the decision-making method are crucial for distribution network energy saving reconstruction project. From the perspective of life cycle cost (LCC), firstly the research life cycle is determined for the energy saving reconstruction of distribution networks with multi-devices. Then, a new life cycle cost-benefit model for energy-saving reconstruction of distribution network is developed, in which the modification schemes include distribution transformers replacement, lines replacement and reactive power compensation. In the operation loss cost and maintenance cost area, the operation cost model considering the influence of load season characteristics and the maintenance cost segmental model of transformers are proposed. Finally, aiming at the highest energy saving profit per LCC, a decision-making method is developed while considering financial and technical constraints as well. The model and method are applied to a real distribution network reconstruction, and the results prove that the model and method are effective.
Network Analysis Reveals Putative Genes Affecting Meat Quality in Angus Cattle.

PubMed

Mateescu, Raluca G; Garrick, Dorian J; Reecy, James M

2017-01-01

Improvements in eating satisfaction will benefit consumers and should increase beef demand, which is of interest to the beef industry. Tenderness, juiciness, and flavor are major determinants of the palatability of beef and are often used to reflect eating satisfaction. Carcass qualities are used as indicator traits for meat quality, with higher quality grade carcasses expected to relate to more tender and palatable meat. However, meat quality is a complex concept determined by many component traits, making genome-wide association studies (GWAS) on any one component challenging to interpret. Recent approaches combining traditional GWAS with gene network interactions theory could be more efficient in dissecting the genetic architecture of complex traits. Phenotypic measures of 23 traits reflecting carcass characteristics, components of meat quality, along with mineral and peptide concentrations were used along with Illumina 54k bovine SNP genotypes to derive an annotated gene network associated with meat quality in 2,110 Angus beef cattle. The efficient mixed model association (EMMAX) approach in combination with a genomic relationship matrix was used to directly estimate the associations between 54k SNP genotypes and each of the 23 component traits. Genomic correlated regions were identified by partial correlations which were further used along with an information theory algorithm to derive gene network clusters. Correlated SNP across 23 component traits were subjected to network scoring and visualization software to identify significant SNP. Significant pathways implicated in the meat quality complex through GO term enrichment analysis included angiogenesis, inflammation, transmembrane transporter activity, and receptor activity. 
These results suggest that network analysis using partial correlations and annotation of significant SNP can reveal the genetic architecture of complex traits and provide novel information regarding biological mechanisms and genes that lead to complex phenotypes, like meat quality, and the nutritional and healthfulness value of beef. Improvements in genome annotation and knowledge of gene function will contribute to more comprehensive analyses that will advance our ability to dissect the complex architecture of complex traits.
Network analysis of S. aureus response to ramoplanin reveals modules for virulence factors and resistance mechanisms and characteristic novel genes.

PubMed

Subramanian, Devika; Natarajan, Jeyakumar

2015-12-10

Staphylococcus aureus is a major human pathogen and ramoplanin is an antimicrobial attributed for effective treatment. The goal of this study was to examine the transcriptomic profiles of ramoplanin sensitive and resistant S. aureus to identify putative modules responsible for virulence and resistance-mechanisms and its characteristic novel genes. The dysregulated genes were used to reconstruct protein functional association networks for virulence-factors and resistance-mechanisms individually. A strong link between metabolic-pathways and development of virulence/resistance is suggested. We identified 15 putative modules of virulence factors. Six hypothetical genes were annotated with novel virulence activity among which SACOL0281 was discovered to be an essential virulence factor EsaD. The roles of MazEF toxin-antitoxin system, SACOL0202/SACOL0201 two-component system and that of amino-sugar and nucleotide-sugar metabolism in virulence are also suggested. In addition, 14 putative modules of resistance mechanisms including modules of ribosomal protein-coding genes and metabolic pathways such as biotin-synthesis, TCA-cycle, riboflavin-biosynthesis, peptidoglycan-biosynthesis etc. are also indicated.
EGFR Signal-Network Reconstruction Demonstrates Metabolic Crosstalk in EMT

PubMed Central

Choudhary, Kumari Sonal; Rohatgi, Neha; Briem, Eirikur; Gudjonsson, Thorarinn; Gudmundsson, Steinn; Rolfsson, Ottar

2016-01-01

Epithelial to mesenchymal transition (EMT) is an important event during development and cancer metastasis. There is limited understanding of the metabolic alterations that give rise to and take place during EMT. Dysregulation of signalling pathways that impact metabolism, including epidermal growth factor receptor (EGFR), is however a hallmark of EMT and metastasis. In this study, we report the investigation into EGFR signalling and metabolic crosstalk of EMT through constraint-based modelling and analysis of the breast epithelial EMT cell model D492 and its mesenchymal counterpart D492M. We built an EGFR signalling network for EMT based on stoichiometric coefficients and constrained the network with gene expression data to build epithelial (EGFR_E) and mesenchymal (EGFR_M) networks. Metabolic alterations arising from differential expression of EGFR genes were derived from a literature review of AKT regulated metabolic genes. Signaling flux differences between EGFR_E and EGFR_M models subsequently allowed metabolism in D492 and D492M cells to be assessed. Higher flux within the AKT pathway in the D492 cells compared to D492M suggested higher glycolytic activity in D492 that we confirmed experimentally through measurements of glucose uptake and lactate secretion rates. The signaling genes from the AKT, RAS/MAPK and CaM pathways were predicted to revert D492M to the D492 phenotype. Follow-up analysis of EGFR signaling metabolic crosstalk in three additional breast epithelial cell lines highlighted variability in in vitro cell models of EMT. This study shows that the metabolic phenotype may be predicted by in silico analyses of gene expression data of EGFR signaling genes, but this phenomenon is cell-specific and does not follow a simple trend. PMID:27253373
EGFR Signal-Network Reconstruction Demonstrates Metabolic Crosstalk in EMT.

PubMed

Choudhary, Kumari Sonal; Rohatgi, Neha; Halldorsson, Skarphedinn; Briem, Eirikur; Gudjonsson, Thorarinn; Gudmundsson, Steinn; Rolfsson, Ottar

2016-06-01

Epithelial to mesenchymal transition (EMT) is an important event during development and cancer metastasis. There is limited understanding of the metabolic alterations that give rise to and take place during EMT. Dysregulation of signalling pathways that impact metabolism, including epidermal growth factor receptor (EGFR), is, however, a hallmark of EMT and metastasis. In this study, we report the investigation into EGFR signalling and metabolic crosstalk of EMT through constraint-based modelling and analysis of the breast epithelial EMT cell model D492 and its mesenchymal counterpart D492M. We built an EGFR signalling network for EMT based on stoichiometric coefficients and constrained the network with gene expression data to build epithelial (EGFR_E) and mesenchymal (EGFR_M) networks. Metabolic alterations arising from differential expression of EGFR genes were derived from a literature review of AKT-regulated metabolic genes. Signalling flux differences between the EGFR_E and EGFR_M models subsequently allowed metabolism in D492 and D492M cells to be assessed. Higher flux within the AKT pathway in D492 cells compared to D492M suggested higher glycolytic activity in D492, which we confirmed experimentally through measurements of glucose uptake and lactate secretion rates. The signalling genes from the AKT, RAS/MAPK and CaM pathways were predicted to revert D492M to the D492 phenotype. Follow-up analysis of EGFR signalling metabolic crosstalk in three additional breast epithelial cell lines highlighted variability in in vitro cell models of EMT. This study shows that the metabolic phenotype may be predicted by in silico analyses of gene expression data of EGFR signalling genes, but this phenomenon is cell-specific and does not follow a simple trend.
Exploring Normalization and Network Reconstruction Methods using In Silico and In Vivo Models

EPA Science Inventory

Abstract: Lessons learned from the recent DREAM competitions include: The search for the best network reconstruction method continues, and we need more complete datasets with ground truth from more complex organisms. It has become obvious that the network reconstruction methods t...
Identifying metabolic enzymes with multiple types of association evidence

PubMed Central

Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M

2006-01-01

Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which can not be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding for a specific metabolic function based on a local structure of metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as a top candidate within 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
Of woods and webs: possible alternatives to the tree of life for studying genomic fluidity in E. coli.

PubMed

Beauregard-Racine, Julie; Bicep, Cédric; Schliep, Klaus; Lopez, Philippe; Lapointe, François-Joseph; Bapteste, Eric

2011-07-20

We introduce several forest-based and network-based methods for exploring microbial evolution, and apply them to the study of thousands of genes from 30 strains of E. coli. This case study illustrates how additional analyses could offer fast heuristic alternatives to standard tree of life (TOL) approaches. We use gene networks to identify genes with atypical modes of evolution, and genome networks to characterize the evolution of genetic partnerships between E. coli and mobile genetic elements. We develop a novel polychromatic quartet method to capture patterns of recombination within E. coli, to update the clanistic toolkit, and to search for the impact of lateral gene transfer and of pathogenicity on gene evolution in two large forests of trees bearing E. coli. We unravel high rates of lateral gene transfer involving E. coli (about 40% of the trees under study), and show that both core genes and shell genes of E. coli are affected by non-tree-like evolutionary processes. We show that pathogenic lifestyle impacted the structure of 30% of the gene trees, and that pathogenic strains are more likely to transfer genes with one another than with non-pathogenic strains. In addition, we propose five groups of genes as candidate mobile modules of pathogenicity. We also present strong evidence for recent lateral gene transfer between E. coli and mobile genetic elements. Depending on which evolutionary questions biologists want to address (i.e. the identification of modules, genetic partnerships, recombination, lateral gene transfer, or genes with atypical evolutionary modes, etc.), forest-based and network-based methods are preferable to the reconstruction of a single tree, because they provide insights and produce hypotheses about the dynamics of genome evolution, rather than the relative branching order of species and lineages. 
Such a methodological pluralism - the use of woods and webs - is to be encouraged to analyse the evolutionary processes at play in microbial evolution. This manuscript was reviewed by: Ford Doolittle, Tal Pupko, Richard Burian, James McInerney, Didier Raoult, and Yan Boucher.
Inference and interrogation of a coregulatory network in the context of lipid accumulation in Yarrowia lipolytica.

PubMed

Trébulle, Pauline; Nicaud, Jean-Marc; Leplat, Christophe; Elati, Mohamed

2017-01-01

Complex phenotypes, such as lipid accumulation, result from cooperativity between regulators and the integration of multiscale information. However, the elucidation of such regulatory programs by experimental approaches may be challenging, particularly in context-specific conditions. In particular, we know very little about the regulators of lipid accumulation in the oleaginous yeast of industrial interest Yarrowia lipolytica. This lack of knowledge limits the development of this yeast as an industrial platform, due to the time-consuming and costly laboratory efforts required to design strains with the desired phenotypes. In this study, we aimed to identify context-specific regulators and mechanisms, to guide explorations of the regulation of lipid accumulation in Y. lipolytica. Using gene regulatory network inference, and considering the expression of 6539 genes over 26 time points from GSE35447 for biolipid production and a list of 151 transcription factors, we reconstructed a gene regulatory network comprising 111 transcription factors, 4451 target genes and 17048 regulatory interactions (YL-GRN-1) supported by evidence of protein-protein interactions. This study, based on network interrogation and wet laboratory validation, (a) highlights the relevance of our proposed measure, the transcription factors influence, for identifying phases corresponding to changes in physiological state without prior knowledge, (b) suggests new potential regulators and drivers of lipid accumulation, and (c) experimentally validates the impact of six of the nine regulators identified on lipid accumulation, with variations in lipid content from +43.2% to -31.2% on glucose or glycerol.

Yeast 5 – an expanded reconstruction of the Saccharomyces cerevisiae metabolic network

PubMed Central

2012-01-01

Background Efforts to improve the computational reconstruction of the Saccharomyces cerevisiae biochemical reaction network and to refine the stoichiometrically constrained metabolic models that can be derived from such a reconstruction have continued since the first stoichiometrically constrained yeast genome scale metabolic model was published in 2003. Continuing this ongoing process, we have constructed an update to the Yeast Consensus Reconstruction, Yeast 5. The Yeast Consensus Reconstruction is a product of efforts to forge a community-based reconstruction emphasizing standards compliance and biochemical accuracy via evidence-based selection of reactions. It draws upon models published by a variety of independent research groups as well as information obtained from biochemical databases and primary literature. Results Yeast 5 refines the biochemical reactions included in the reconstruction, particularly reactions involved in sphingolipid metabolism; updates gene-reaction annotations; and emphasizes the distinction between reconstruction and stoichiometrically constrained model. Although it was not a primary goal, this update also improves the accuracy of model prediction of viability and auxotrophy phenotypes and increases the number of epistatic interactions. This update maintains an emphasis on standards compliance, unambiguous metabolite naming, and computer-readable annotations available through a structured document format. Additionally, we have developed MATLAB scripts to evaluate the model’s predictive accuracy and to demonstrate basic model applications such as simulating aerobic and anaerobic growth. These scripts, which provide an independent tool for evaluating the performance of various stoichiometrically constrained yeast metabolic models using flux balance analysis, are included as Additional files 1, 2 and 3. 
Conclusions Yeast 5 expands and refines the computational reconstruction of yeast metabolism and improves the predictive accuracy of a stoichiometrically constrained yeast metabolic model. It differs from previous reconstructions and models by emphasizing the distinction between the yeast metabolic reconstruction and the stoichiometrically constrained model, and makes both available as Additional file 4 and Additional file 5 and at http://yeast.sf.net/ as separate systems biology markup language (SBML) files. Through this separation, we intend to make the modeling process more accessible, explicit, transparent, and reproducible. PMID:22663945
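The stoichiometric constraint that such models rest on can be illustrated with a minimal sketch. The three-reaction network, the bounds and the function names below are hypothetical and are not taken from Yeast 5 or its MATLAB scripts. The code only checks whether a candidate flux vector v satisfies the steady-state condition S·v = 0 within flux bounds. Flux balance analysis would additionally optimize an objective (for example, a growth flux) over all such vectors with a linear-programming solver, which is not shown here.

```python
# Toy stoichiometric matrix S (rows: metabolites, columns: reactions).
# Hypothetical network for illustration only:
#   R1:   -> A   (uptake)
#   R2: A -> B   (conversion)
#   R3: B ->     (export, a stand-in for growth)
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]


def mass_balance_residual(S, v):
    """Return S.v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_feasible(S, v, lower, upper, tol=1e-9):
    """True if v is at steady state (S.v = 0) and within the flux bounds."""
    balanced = all(abs(r) <= tol for r in mass_balance_residual(S, v))
    bounded = all(lo - tol <= x <= hi + tol
                  for x, lo, hi in zip(v, lower, upper))
    return balanced and bounded


# Assumed bounds: uptake capped at 10 flux units, other reactions loosely bounded.
lower = [0.0, 0.0, 0.0]
upper = [10.0, 1000.0, 1000.0]

print(is_feasible(S, [10.0, 10.0, 10.0], lower, upper))  # balanced, within bounds
print(is_feasible(S, [10.0, 5.0, 5.0], lower, upper))    # A accumulates: not steady state
```

Keeping the matrix S (the reconstruction) separate from the bounds and objective (the constrained model) mirrors the distinction the abstract emphasizes.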
SF3M software: 3-D photo-reconstruction for non-expert users and its application to a gully network

NASA Astrophysics Data System (ADS)

Castillo, C.; James, M. R.; Redel-Macías, M. D.; Pérez, R.; Gómez, J. A.

2015-08-01

Three-dimensional photo-reconstruction (PR) techniques have been successfully used to produce high-resolution surface models for different applications and over different spatial scales. However, innovative approaches are required to overcome some limitations that this technique may present for field image acquisition in challenging scene geometries. Here, we evaluate SF3M, a new graphical user interface for implementing a complete PR workflow based on freely available software (including external calls to VisualSFM and CloudCompare), in combination with a low-cost survey design for the reconstruction of a several-hundred-metres-long gully network. SF3M provided a semi-automated workflow for 3-D reconstruction requiring ~49 h (of which only 17% required operator assistance) for obtaining a final gully network model of > 17 million points over a gully plan area of 4230 m². We show that a walking itinerary along the gully perimeter using two lightweight automatic cameras (1 s time-lapse mode) and a 6 m long pole is an efficient method for 3-D monitoring of gullies, with low cost (~EUR 1000 budget for the field equipment) and low time requirements (~90 min for image collection). A mean error of 6.9 cm at the ground control points was found, mainly due to model deformations derived from the linear geometry of the gully and residual errors in camera calibration. The straightforward image collection and processing approach can be of great benefit for non-expert users working on gully erosion assessment.
Boosting probabilistic graphical model inference by incorporating prior knowledge from multiple sources.

PubMed

Praveen, Paurush; Fröhlich, Holger

2013-01-01

Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available.
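The idea of picking up the strongest support for an interaction can be sketched with the standard noisy-OR combination rule, P = 1 - Π(1 - p_i). This is a minimal illustration under assumptions: the authors' Noisy-OR model may be parameterized differently, and the gene names and confidence values below are invented for the example.

```python
def noisy_or(supports):
    """Combine per-source support values in [0, 1] with the noisy-OR rule."""
    prob_no_support = 1.0
    for p in supports:
        if not 0.0 <= p <= 1.0:
            raise ValueError("support values must lie in [0, 1]")
        # The edge stays unsupported only if every source fails to support it.
        prob_no_support *= (1.0 - p)
    return 1.0 - prob_no_support


# Hypothetical confidence values per candidate edge from three sources
# (e.g. a pathway database, GO-term similarity, protein domain data).
evidence = {
    ("geneA", "geneB"): [0.9, 0.1, 0.0],
    ("geneA", "geneC"): [0.2, 0.2, 0.2],
    ("geneB", "geneC"): [0.0, 0.0, 0.0],
}

# Consensus prior: one combined support value per candidate edge.
prior = {edge: noisy_or(values) for edge, values in evidence.items()}
for edge, p in sorted(prior.items(), key=lambda item: -item[1]):
    print(edge, round(p, 3))
```

A single strong source dominates the result, so one confident database entry outweighs several weak hints. That matches the description of the second model above.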
Database Constraints Applied to Metabolic Pathway Reconstruction Tools

PubMed Central

Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

2014-01-01

Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745
Inferring gene ontologies from pairwise similarity data

PubMed Central

Kramer, Michael; Dutkowski, Janusz; Yu, Michael; Bafna, Vineet; Ideker, Trey

2014-01-01

Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene–gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method’s ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20–25% precision, recall). Conclusion: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data. Contact: tideker@ucsd.edu PMID:24932003
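The core operation described for CliXO, finding maximal cliques in a network induced by progressively lowering a similarity threshold, can be sketched as follows. This is not the CliXO algorithm itself: it uses a basic Bron-Kerbosch enumeration, and the gene names, similarity values and cutoffs are invented for illustration.

```python
def threshold_graph(sim, genes, cutoff):
    """Adjacency sets linking gene pairs whose similarity is >= cutoff."""
    adj = {g: set() for g in genes}
    for (a, b), s in sim.items():
        if s >= cutoff:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical pairwise similarities between four genes.
genes = ["a", "b", "c", "d"]
sim = {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.85,
    ("c", "d"): 0.5, ("a", "d"): 0.2, ("b", "d"): 0.3,
}

# Lowering the cutoff merges tight gene sets into progressively broader ones.
for cutoff in (0.8, 0.5, 0.2):
    found = maximal_cliques(threshold_graph(sim, genes, cutoff))
    terms = sorted(sorted(c) for c in found if len(c) > 1)
    print(cutoff, terms)
```

In this toy run gene c belongs to two cliques at the middle cutoff. A gene shared between terms in this way is the kind of overlap that a strict clustering tree cannot express.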
DOE Office of Scientific and Technical Information (OSTI.GOV)

Renner, J.; Farbin, A.; Vidal, J. Muñoz

Here, we investigate the potential of using deep learning techniques to reject background events in searches for neutrinoless double beta decay with high pressure xenon time projection chambers capable of detailed track reconstruction. The differences in the topological signatures of background and signal events can be learned by deep neural networks via training over many thousands of events. These networks can then be used to classify further events as signal or background, providing an additional background rejection factor at an acceptable loss of efficiency. The networks trained in this study performed better than previous methods developed based on the use of the same topological signatures by a factor of 1.2 to 1.6, and there is potential for further improvement.
Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency

PubMed Central

2009-01-01

Background Prostate cancer is a leading cancer worldwide and is characterized by aggressive metastasis. Owing to its clinical heterogeneity, prostate cancer displays different stages and grades related to aggressive metastatic disease. Although numerous studies have used microarray analysis and traditional clustering methods to identify individual genes involved in the disease process, the important gene regulations remain unclear. We present a computational method for automatically inferring genetic regulatory networks from microarray data with transcription factor analysis and conditional independence testing, to explore the potentially significant gene regulatory networks that are correlated with cancer, tumor grade and stage in prostate cancer. Results To deal with missing values in microarray data, we used a K-nearest-neighbors (KNN) algorithm to estimate the missing expression values. We applied web services technology to wrap the bioinformatics toolkits and databases to automatically extract the promoter regions of DNA sequences and predict the transcription factors that regulate gene expression. We adopted a microarray dataset consisting of 62 primary tumors and 41 normal prostate tissues from the Stanford Microarray Database (SMD) as the target dataset to evaluate our method. The predicted results identified possible biomarker genes related to cancer and indicated that androgen functions and processes may be involved in the development of prostate cancer and promote cell death in the cell cycle. Our predicted results showed that sub-networks of genes SREBF1, STAT6 and PBX1 are related to a high extent, while ETS transcription factors ELK1, JUN and EGR2 are related to a low extent. Gene SLC22A3 may clinically explain the differentiation associated with high grade cancer compared with low grade cancer. Enhancer of Zeste Homolog 2 (EZH2), regulated by RUNX1 and STAT3, is correlated with the pathological stage.
Conclusions We provide a computational framework to reconstruct the genetic regulatory network from microarray data using biological knowledge and constraint-based inference. Our method is helpful in verifying possible interaction relations in gene regulatory networks and filtering out incorrect relations inferred by imperfect methods. We not only predicted individual genes related to cancer but also discovered significant gene regulatory networks. Our method is also supported by several published papers and databases, and the significant gene regulatory networks perform critical biological functions and processes including cell adhesion molecules, androgen and estrogen metabolism, smooth muscle contraction, and GO-annotated processes. These significant gene regulations and the critical concept of tumor progression are useful for understanding cancer biology and disease treatment. PMID:20025723
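The KNN step for missing microarray values can be illustrated with a simplified sketch. The abstract does not specify the distance metric, the value of k or the weighting, so the choices below (root-mean-square distance over co-observed samples, unweighted mean of the k nearest genes) and the expression values are assumptions made for the example.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k nearest rows (genes).

    Distance between two rows is the root-mean-square difference over the
    columns (samples) observed in both rows.
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                # A neighbour must itself have a value in the missing column.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical expression matrix: rows are genes, columns are samples.
expr = [
    [1.0, 2.0, None],  # gene 1, third sample missing
    [1.1, 2.1, 3.0],   # gene 2, similar profile to gene 1
    [0.9, 1.9, 3.2],   # gene 3, similar profile to gene 1
    [5.0, 6.0, 9.0],   # gene 4, dissimilar profile
]
print(knn_impute(expr, k=2)[0])
```

Here the missing value is filled from genes 2 and 3 because their profiles are closest to gene 1, and the dissimilar gene 4 is ignored.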
Zipf's Law in Gene Expression

NASA Astrophysics Data System (ADS)

Furusawa, Chikara; Kaneko, Kunihiko

2003-02-01

Using data from gene expression databases on various organisms and tissues, including yeast, nematodes, human normal and cancer tissues, and embryonic stem cells, we found that the abundances of expressed genes exhibit a power-law distribution with an exponent close to -1; i.e., they obey Zipf’s law. Furthermore, by simulations of a simple model with an intracellular reaction network, we found that Zipf’s law of chemical abundance is a universal feature of cells where such a network optimizes the efficiency and faithfulness of self-reproduction. These findings provide novel insights into the nature of the organization of reaction dynamics in living cells.
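A power law with exponent -1 can be checked with a simple log-log fit. The sketch below reads the exponent as the slope of log(abundance) against log(rank), which is an assumption about how the abstract's exponent is defined. The abundances are synthetic and the function name is hypothetical.

```python
import math


def rank_abundance_slope(abundances):
    """Least-squares slope of log(abundance) against log(rank)."""
    ranked = sorted((a for a in abundances if a > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(a) for a in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic abundances that follow abundance = 1000 / rank exactly.
synthetic = [1000.0 / rank for rank in range(1, 201)]
print(round(rank_abundance_slope(synthetic), 3))
```

On real expression data the fit would be noisier, and the low-abundance tail is often excluded before fitting. No such cutoff is applied here.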
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.

PubMed

Anitha, P; Anbarasu, Anand; Ramaiah, Sudha

2014-05-01

Multidrug resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in the hospital environment. Multidrug resistance in A. baumannii may be due to a combination of resistance mechanisms against various existing classes of antibiotics, such as β-lactamase synthesis, changes in penicillin-binding proteins (PBPs), and alterations in porin proteins and efflux pumps. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes and their gene products may also tend to interact with each other. It is therefore necessary to understand the impact of these interactions on the antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis of various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). We then selected eight efficient clusters based on score. Interestingly, the associated proteins we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach to the resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance.
Reconstruction of the experimentally supported human protein interactome: what can we learn?

PubMed Central

2013-01-01

Background Understanding the topology and dynamics of the human protein-protein interaction (PPI) network will significantly contribute to biomedical research, therefore its systematic reconstruction is required. Several meta-databases integrate source PPI datasets, but the protein node sets of their networks vary depending on the PPI data combined. Due to this inherent heterogeneity, the way in which the human PPI network expands via multiple dataset integration has not been comprehensively analyzed. We aim at assembling the human interactome in a global structured way and exploring it to gain insights of biological relevance. Results First, we defined the UniProtKB manually reviewed human “complete” proteome as the reference protein-node set and then we mined five major source PPI datasets for direct PPIs exclusively between the reference proteins. We updated the protein and publication identifiers and normalized all PPIs to the UniProt identifier level. The reconstructed interactome covers approximately 60% of the human proteome and has a scale-free structure. No apparent differentiating gene functional classification characteristics were identified for the unrepresented proteins. The source dataset integration augments the network mainly in PPIs. Polyubiquitin emerged as the highest-degree node, but the inclusion of most of its identified PPIs may be reconsidered. The high number (>300) of connections of the subsequent fifteen proteins correlates well with their essential biological role. According to the power-law network structure, the unrepresented proteins should mainly have up to four connections with equally poorly-connected interactors. Conclusions Reconstructing the human interactome based on the a priori definition of the protein nodes enabled us to identify the currently included part of the human “complete” proteome, and discuss the role of the proteins within the network topology with respect to their function. 
As the network expansion has to comply with the scale-free theory, we suggest that the core of the human interactome has essentially emerged. Thus, it could be employed in systems biology and biomedical research, despite the considerable number of currently unrepresented proteins. The latter are probably involved in specialized physiological conditions, justifying the scarcity of related PPI information, and their identification can assist in designing relevant functional experiments and targeted text mining algorithms. PMID:24088582
Liquid metal angiography for mega contrast X-ray visualization of vascular network in reconstructing in-vitro organ anatomy.

PubMed

Wang, Qian; Yu, Yang; Pan, Keqin; Liu, Jing

2014-07-01

Visualization of anatomical vessel networks plays a vital role in physiological and pathological investigations. However, it remains a big challenge to identify the fine structures of the smallest capillary vessel networks via conventional imaging methods. Here, room temperature liquid metal angiography is proposed for the first time to generate mega contrast X-ray images for multiscale vasculature mapping. Specifically, gallium was adopted as the room temperature liquid metal contrast agent and infused into the vessels of in vitro pig hearts and kidneys. We scanned the samples under X-ray and compared the angiograms with those obtained with a conventional contrast agent, iohexol. As quantitatively demonstrated by the grayscale histograms and numerical indexes, the contrast of the vessels to the surrounding tissues in the liquid metal angiograms is orders of magnitude higher than that of the iohexol-enhanced images. The angiograms resolved tiny vessels down to a width of 0.1 mm, which indicated that the capillaries can be clearly distinguished in the liquid metal enhanced images. Further, with tomography from micro-CT, we also managed to reconstruct the 3-D structures of the kidney vessels. The clarity and efficiency of the method over existing approaches have been demonstrated experimentally. The usually invisible capillary networks become distinctly clear in the gallium angiograms. This basic mechanism is general-purpose and can be extended to a wide spectrum of 3-D computed tomography areas. It opens a new soft tool for quickly reconstructing high-resolution spatial channel networks for scientific research as well as engineering practice, where complicated and time-consuming resections are no longer a necessity.
Draft sequencing and comparative genomics of Xylella fastidiosa strains reveal novel biological insights.

PubMed

Bhattacharyya, Anamitra; Stilwagen, Stephanie; Reznik, Gary; Feil, Helene; Feil, William S; Anderson, Iain; Bernal, Axel; D'Souza, Mark; Ivanova, Natalia; Kapatral, Vinayak; Larsen, Niels; Los, Tamara; Lykidis, Athanasios; Selkov, Eugene; Walunas, Theresa L; Purcell, Alexander; Edwards, Rob A; Hawkins, Trevor; Haselkorn, Robert; Overbeek, Ross; Kyrpides, Nikos C; Predki, Paul F

2002-10-01

Draft sequencing is a rapid and efficient method for determining the near-complete sequence of microbial genomes. Here we report a comparative analysis of one complete and two draft genome sequences of the phytopathogenic bacterium, Xylella fastidiosa, which causes serious disease in plants, including citrus, almond, and oleander. We present highlights of an in silico analysis based on a comparison of reconstructions of core biological subsystems. Cellular pathway reconstructions have been used to identify a small number of genes, which are likely to reside within the draft genomes but are not captured in the draft assembly. These represented only a small fraction of all genes and were predominantly large and small ribosomal subunit protein components. By using this approach, some of the inherent limitations of draft sequence can be significantly reduced. Despite the incomplete nature of the draft genomes, it is possible to identify several phage-related genes, which appear to be absent from the draft genomes and not the result of insufficient sequence sampling. This region may therefore identify potential host-specific functions. Based on this first functional reconstruction of a phytopathogenic microbe, we spotlight an unusual respiration machinery as a potential target for biological control. We also predicted and developed a new defined growth medium for Xylella.
NABIC marker database: A molecular markers information network of agricultural crops.

PubMed

Kim, Chang-Kug; Seol, Young-Joo; Lee, Dong-Jun; Jeong, In-Seon; Yoon, Ung-Han; Lee, Gang-Seob; Hahn, Jang-Ho; Park, Dong-Suk

2013-01-01

In 2013, the National Agricultural Biotechnology Information Center (NABIC) reconstructed a molecular marker database for useful genetic resources. The web-based marker database consists of three major functional categories: map viewer, RSN marker and gene annotation. It provides 7250 marker locations, 3301 RSN marker properties and 3280 molecular marker annotations for agricultural plants. Each molecular marker entry provides information such as marker name, expressed sequence tag number, gene definition and general marker information. This updated marker-based database provides useful information through a user-friendly web interface that assists in tracing any new structures of the chromosomes and gene positional functions using specific molecular markers. The database is available for free at http://nabic.rda.go.kr/gere/rice/molecularMarkers/
Craniofacial reconstruction evaluation by geodesic network.

PubMed

Zhao, Junli; Liu, Cuiting; Wu, Zhongke; Duan, Fuqing; Wang, Kang; Jia, Taorui; Liu, Quansheng

2014-01-01

Craniofacial reconstruction is to estimate an individual's face model from its skull. It has a widespread application in forensic medicine, archeology, medical cosmetic surgery, and so forth. However, little attention is paid to the evaluation of craniofacial reconstruction. This paper proposes an objective method to evaluate globally and locally the reconstructed craniofacial faces based on the geodesic network. Firstly, the geodesic networks of the reconstructed craniofacial face and the original face are built, respectively, by geodesics and isogeodesics, whose intersections are network vertices. Then, the absolute value of the correlation coefficient of the features of all corresponding geodesic network vertices between two models is taken as the holistic similarity, where the weighted average of the shape index values in a neighborhood is defined as the feature of each network vertex. Moreover, the geodesic network vertices of each model are divided into six subareas, that is, forehead, eyes, nose, mouth, cheeks, and chin, and the local similarity is measured for each subarea. Experiments using 100 pairs of reconstructed craniofacial faces and their corresponding original faces show that the evaluation by our method is roughly consistent with the subjective evaluation derived from thirty-five persons in five groups.
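The holistic similarity described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the per-vertex features (weighted averages of shape index values) have already been computed and put in correspondence, and it assumes that the local similarity of a subarea is the same absolute correlation restricted to that subarea's vertices, which the abstract does not state. The function names and the `labels` argument are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def holistic_similarity(features_a, features_b):
    """Absolute correlation over all corresponding network vertices."""
    return abs(pearson(features_a, features_b))


def local_similarity(features_a, features_b, labels):
    """Assumed per-subarea score: labels[i] names the subarea of vertex i."""
    scores = {}
    for region in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == region]
        scores[region] = abs(pearson([features_a[i] for i in idx],
                                     [features_b[i] for i in idx]))
    return scores
```

Calling `holistic_similarity` on the feature vectors of a reconstructed face and its original gives a value between 0 and 1, with values near 1 indicating that the shape features vary together across the network.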
Predicting gene regulatory networks by combining spatial and temporal gene expression data in Arabidopsis root stem cells

PubMed Central

de Luis Balaguer, Maria Angels; Fisher, Adam P.; Clark, Natalie M.; Fernandez-Espinosa, Maria Guadalupe; Möller, Barbara K.; Weijers, Dolf; Williams, Cranos; Lorenzo, Oscar; Sozzani, Rosangela

2017-01-01

Identifying the transcription factors (TFs) and associated networks involved in stem cell regulation is essential for understanding the initiation and growth of plant tissues and organs. Although many TFs have been shown to have a role in the Arabidopsis root stem cells, a comprehensive view of the transcriptional signature of the stem cells is lacking. In this work, we used spatial and temporal transcriptomic data to predict interactions among the genes involved in stem cell regulation. To accomplish this, we transcriptionally profiled several stem cell populations and developed a gene regulatory network inference algorithm that combines clustering with dynamic Bayesian network inference. We leveraged the topology of our networks to infer potential major regulators. Specifically, through mathematical modeling and experimental validation, we identified PERIANTHIA (PAN) as an important molecular regulator of quiescent center function. The results presented in this work show that our combination of molecular biology, computational biology, and mathematical modeling is an efficient approach to identify candidate factors that function in the stem cells. PMID:28827319
Proteome- and transcriptome-driven reconstruction of the human myocyte metabolic network and its use for identification of markers for diabetes.

PubMed

Väremo, Leif; Scheele, Camilla; Broholm, Christa; Mardinoglu, Adil; Kampf, Caroline; Asplund, Anna; Nookaew, Intawat; Uhlén, Mathias; Pedersen, Bente Klarlund; Nielsen, Jens

2015-05-12

Skeletal myocytes are metabolically active and susceptible to insulin resistance and are thus implicated in type 2 diabetes (T2D). This complex disease involves systemic metabolic changes, and their elucidation at the systems level requires genome-wide data and biological networks. Genome-scale metabolic models (GEMs) provide a network context for the integration of high-throughput data. We generated myocyte-specific RNA-sequencing data and investigated their correlation with proteome data. These data were then used to reconstruct a comprehensive myocyte GEM. Next, we performed a meta-analysis of six studies comparing muscle transcription in T2D versus healthy subjects. Transcriptional changes were mapped on the myocyte GEM, revealing extensive transcriptional regulation in T2D, particularly around pyruvate oxidation, branched-chain amino acid catabolism, and tetrahydrofolate metabolism, connected through the downregulated dihydrolipoamide dehydrogenase. Strikingly, the gene signature underlying this metabolic regulation successfully classifies the disease state of individual samples, suggesting that regulation of these pathways is a ubiquitous feature of myocytes in response to T2D. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Global Landscape of a Co-Expressed Gene Network in Barley and its Application to Gene Discovery in Triticeae Crops

PubMed Central

Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo

2011-01-01

Accumulated transcriptome data can be used to investigate regulatory networks of genes involved in various biological systems. Co-expression analysis data sets generated from comprehensively collected transcriptome data sets now represent efficient resources that are capable of facilitating the discovery of genes with closely correlated expression patterns. In order to construct a co-expression network for barley, we analyzed 45 publicly available experimental series, which are composed of 1,347 sets of GeneChip data for barley. On the basis of a gene-to-gene weighted correlation coefficient, we constructed a global barley co-expression network and classified it into clusters of subnetwork modules. The resulting clusters are candidates for functional regulatory modules in the barley transcriptome. To annotate each of the modules, we performed comparative annotation using genes in Arabidopsis and Brachypodium distachyon. On the basis of a comparative analysis between barley and two model species, we investigated functional properties from the representative distributions of the gene ontology (GO) terms. Modules putatively involved in drought stress response and cellulose biogenesis have been identified. These modules are discussed to demonstrate the effectiveness of the co-expression analysis. Furthermore, we applied the data set of co-expressed genes coupled with comparative analysis in attempts to discover potentially Triticeae-specific network modules. These results demonstrate that analysis of the co-expression network of the barley transcriptome together with comparative analysis should promote the process of gene discovery in barley. Furthermore, the insights obtained should be transferable to investigations of Triticeae plants. The associated data set generated in this analysis is publicly accessible at http://coexpression.psc.riken.jp/barley/. PMID:21441235
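As a rough sketch of the kind of pipeline this abstract describes, the code below builds a co-expression network by thresholding pairwise correlations and then groups genes into modules. Several simplifications are assumed: plain Pearson correlation stands in for the weighted correlation coefficient used in the study, connected components stand in for the clustering step, and the threshold of 0.9 and the toy expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def coexpression_edges(expr, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(expr), 2):
        r = pearson(expr[a], expr[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


def modules(genes, edges):
    """Group genes into connected components using union-find."""
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Toy expression matrix: gene -> values across four samples (made-up numbers).
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 1, 3, 2],
    "g4": [8, 2, 6, 4],
}
edges = coexpression_edges(expr)
found = modules(expr, edges)
```

On real data the all-pairs loop grows quadratically with the number of genes, so a genome-scale analysis would normally use a vectorized correlation matrix instead.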
The Evolution of Gene Regulatory Networks that Define Arthropod Body Plans.

PubMed

Auman, Tzach; Chipman, Ariel D

2017-09-01

Our understanding of the genetics of arthropod body plan development originally stems from work on Drosophila melanogaster from the late 1970s and onward. In Drosophila, there is a relatively detailed model for the network of gene interactions that proceeds in a sequential-hierarchical fashion to define the main features of the body plan. Over the years, we have a growing understanding of the networks involved in defining the body plan in an increasing number of arthropod species. It is now becoming possible to tease out the conserved aspects of these networks and to try to reconstruct their evolution. In this contribution, we focus on several key nodes of these networks, starting from early patterning in which the main axes are determined and the broad morphological domains of the embryo are defined, and on to later stage wherein the growth zone network is active in sequential addition of posterior segments. The pattern of conservation of networks is very patchy, with some key aspects being highly conserved in all arthropods and others being very labile. Many aspects of early axis patterning are highly conserved, as are some aspects of sequential segment generation. In contrast, regional patterning varies among different taxa, and some networks, such as the terminal patterning network, are only found in a limited range of taxa. The growth zone segmentation network is ancient and is probably plesiomorphic to all arthropods. In some insects, it has undergone significant modification to give rise to a more hardwired network that generates individual segments separately. In other insects and in most arthropods, the sequential segmentation network has undergone a significant amount of systems drift, wherein many of the genes have changed. However, it maintains a conserved underlying logic and function. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. 
Differential C3NET reveals disease networks of direct physical interactions

PubMed Central

2011-01-01

Background Genes might have different gene interactions in different cell conditions, which might be mapped into different networks. Differential analysis of gene networks allows spotting condition-specific interactions that, for instance, form disease networks if the conditions are a disease, such as cancer, and normal. This could potentially allow developing better and subtly targeted drugs to cure cancer. Differential network analysis with direct physical gene interactions needs to be explored in this endeavour. Results C3NET is a recently introduced information theory based gene network inference algorithm that infers direct physical gene interactions from expression data, which was shown to give consistently higher inference performances over various networks than its competitors. In this paper, we present DC3net, an approach to employ C3NET in inferring disease networks. We apply DC3net to synthetic and real prostate cancer datasets, with promising results. With loose cutoffs, we predicted 18583 interactions from tumor and normal samples in total. Although there are no reference interactions databases for the specific conditions of our samples in the literature, we found verifications for 54 of our predicted direct physical interactions from only four of the biological interaction databases. As an example, we predicted a prostate cancer specific interaction between RAD50 and TRF2, which turned out to be validated in the literature. It is known that RAD50 complex associates with TRF2 in the S phase of cell cycle, which suggests that this predicted interaction may promote telomere maintenance in tumor cells in order to allow tumor cells to divide indefinitely. Our enrichment analysis suggests that the identified tumor specific gene interactions may be potentially important in driving the growth in prostate cancer. Additionally, we found that the highest connected subnetwork of our predicted tumor specific network is enriched for all proliferation genes, which further suggests that the genes in this network may serve in the process of oncogenesis. Conclusions Our approach reveals disease specific interactions. It may help to make experimental follow-up studies more cost and time efficient by prioritizing disease relevant parts of the global gene network. PMID:21777411
Saliva Microbiota Carry Caries-Specific Functional Gene Signatures

PubMed Central

Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L.; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian

2014-01-01

Human saliva microbiota is phylogenetically divergent among host individuals yet their roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, despite both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis. PMID:24533043

Saliva microbiota carry caries-specific functional gene signatures.

PubMed

Yang, Fang; Ning, Kang; Chang, Xingzhi; Yuan, Xiao; Tu, Qichao; Yuan, Tong; Deng, Ye; Hemme, Christopher L; Van Nostrand, Joy; Cui, Xinping; He, Zhili; Chen, Zhenggang; Guo, Dawei; Yu, Jiangbo; Zhang, Yue; Zhou, Jizhong; Xu, Jian

2014-01-01

Human saliva microbiota is phylogenetically divergent among host individuals yet their roles in health and disease are poorly appreciated. We employed a microbial functional gene microarray, HuMiChip 1.0, to reconstruct the global functional profiles of human saliva microbiota from ten healthy and ten caries-active adults. Saliva microbiota in the pilot population featured a vast diversity of functional genes. No significant distinction in gene number or diversity indices was observed between healthy and caries-active microbiota. However, co-presence network analysis of functional genes revealed that caries-active microbiota was more divergent in non-core genes than healthy microbiota, despite both groups exhibited a similar degree of conservation at their respective core genes. Furthermore, functional gene structure of saliva microbiota could potentially distinguish caries-active patients from healthy hosts. Microbial functions such as Diaminopimelate epimerase, Prephenate dehydrogenase, Pyruvate-formate lyase and N-acetylmuramoyl-L-alanine amidase were significantly linked to caries. Therefore, saliva microbiota carried disease-associated functional signatures, which could be potentially exploited for caries diagnosis.
AtmiRNET: a web-based resource for reconstructing regulatory networks of Arabidopsis microRNAs.

PubMed

Chien, Chia-Hung; Chiang-Hsieh, Yi-Fan; Chen, Yi-An; Chow, Chi-Nga; Wu, Nai-Yun; Hou, Ping-Fu; Chang, Wen-Chi

2015-01-01

Compared with animal microRNAs (miRNAs), our knowledge of how miRNAs are involved in significant biological processes in plants is still limited. AtmiRNET is a novel resource geared toward plant scientists for reconstructing regulatory networks of Arabidopsis miRNAs. By means of highlighted miRNA studies in target recognition, functional enrichment of target genes, promoter identification and detection of cis- and trans-elements, AtmiRNET allows users to explore mechanisms of transcriptional regulation and miRNA functions in Arabidopsis thaliana, which have rarely been investigated so far. High-throughput next-generation sequencing datasets from transcriptional start site (TSS)-relevant experiments as well as five core promoter elements were collected to establish the support vector machine-based prediction model for Arabidopsis miRNA TSSs. Then, high-confidence transcription factors that participate in transcriptional regulation of Arabidopsis miRNAs are provided based on a statistical approach. Furthermore, both experimentally verified and putative miRNA-target interactions, whose validity was supported by the correlations between the expression levels of miRNAs and their targets, are elucidated for functional enrichment analysis. The inferred regulatory networks give users an intuitive insight into the pivotal roles of Arabidopsis miRNAs through the crosstalk between miRNA transcriptional regulation (upstream) and miRNA-mediated (downstream) gene circuits. The valuable information that is visually oriented in AtmiRNET supplements the scant understanding of plant miRNAs and will be useful (e.g. ABA-miR167c-auxin signaling pathway) for further research. Database URL: http://AtmiRNET.itps.ncku.edu.tw/ © The Author(s) 2015. Published by Oxford University Press.
Relationships between probabilistic Boolean networks and dynamic Bayesian networks as models of gene regulatory networks

PubMed Central

Lähdesmäki, Harri; Hautaniemi, Sampsa; Shmulevich, Ilya; Yli-Harja, Olli

2006-01-01

A significant amount of attention has recently been focused on modeling of gene regulatory networks. Two frequently used large-scale modeling frameworks are Bayesian networks (BNs) and Boolean networks, the latter being a special case of its recent stochastic extension, probabilistic Boolean networks (PBNs). PBN is a promising model class that generalizes the standard rule-based interactions of Boolean networks into the stochastic setting. Dynamic Bayesian networks (DBNs) are a general and versatile model class that is able to represent complex temporal stochastic processes and has also been proposed as a model for gene regulatory systems. In this paper, we concentrate on these two model classes and demonstrate that PBNs and a certain subclass of DBNs can represent the same joint probability distribution over their common variables. The major benefit of introducing the relationships between the models is that it opens up the possibility of applying the standard tools of DBNs to PBNs and vice versa. Hence, the standard learning tools of DBNs can be applied in the context of PBNs, and the inference methods give a natural way of handling the missing values in PBNs which are often present in gene expression measurements. Conversely, the tools for controlling the stationary behavior of the networks, tools for projecting networks onto sub-networks, and efficient learning schemes can be used for DBNs. In other words, the introduced relationships between the models extend the collection of analysis tools for both model classes. PMID:17415411
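The correspondence between the two model classes can be made concrete with a toy example. The sketch below assumes an independent PBN without random perturbation, in which each gene picks one of its candidate Boolean predictors at every step with a fixed probability. Under that assumption, the probability that a gene is ON at time t+1 given the state at time t is the summed probability of the predictors that output ON, which is the form of a conditional probability table in a two-slice DBN. The two-gene network and all names are hypothetical.

```python
from itertools import product

# Hypothetical PBN: gene index -> list of (selection probability, predictor).
# Each predictor maps the current state tuple to a truthy or falsy value.
pbn = {
    0: [(0.7, lambda x: x[0] and x[1]), (0.3, lambda x: x[1])],
    1: [(1.0, lambda x: not x[0])],
}


def cpt_entry(network, gene, state):
    """P(gene is ON at t+1 | state at t): sum over predictors that fire."""
    return sum(p for p, f in network[gene] if f(state))


def transition_prob(network, state, next_state):
    """P(next_state | state), a product of per-gene CPT entries."""
    prob = 1.0
    for gene, value in enumerate(next_state):
        p_on = cpt_entry(network, gene, state)
        prob *= p_on if value else 1.0 - p_on
    return prob


# Enumerate the full transition matrix of the toy network.
states = list(product((0, 1), repeat=len(pbn)))
matrix = {s: {t: transition_prob(pbn, s, t) for t in states} for s in states}
```

Because the transition probability factorizes over genes, each row of `matrix` sums to one, and the per-gene tables returned by `cpt_entry` are all that a DBN with the same structure would need to store.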
Streamlining and Large Ancestral Genomes in Archaea Inferred with a Phylogenetic Birth-and-Death Model

PubMed Central

Miklós, István

2009-01-01

Homologous genes originate from a common ancestor through vertical inheritance, duplication, or horizontal gene transfer. Entire homolog families spawned by a single ancestral gene can be identified across multiple genomes based on protein sequence similarity. The sequences, however, do not always reveal conclusively the history of large families. To study the evolution of complete gene repertoires, we propose here a mathematical framework that does not rely on resolved gene family histories. We show that so-called phylogenetic profiles, formed by family sizes across multiple genomes, are sufficient to infer principal evolutionary trends. The main novelty in our approach is an efficient algorithm to compute the likelihood of a phylogenetic profile in a model of birth-and-death processes acting on a phylogeny. We examine known gene families in 28 archaeal genomes using a probabilistic model that involves lineage- and family-specific components of gene acquisition, duplication, and loss. The model enables us to consider all possible histories when inferring statistics about archaeal evolution. According to our reconstruction, most lineages are characterized by a net loss of gene families. Major increases in gene repertoire have occurred only a few times. Our reconstruction underlines the importance of persistent streamlining processes in shaping genome composition in Archaea. It also suggests that early archaeal genomes were as complex as typical modern ones, and even show signs, in the case of the methanogenic ancestor, of an extremely large gene repertoire. PMID:19570746
A protocol for generating a high-quality genome-scale metabolic reconstruction.

PubMed

Thiele, Ines; Palsson, Bernhard Ø

2010-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.
A protocol for generating a high-quality genome-scale metabolic reconstruction

PubMed Central

Thiele, Ines; Palsson, Bernhard Ø.

2011-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have developed over the past 10 years. These reconstructions represent structured knowledge-bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates myriad computational biological studies including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics, and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge-bases. Here, we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction as well as common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process. PMID:20057383
Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

PubMed

Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

2014-01-01

Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods, which was able to recover accurately both common and condition-specific network-modules without entailing ad-hoc input parameters as required by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interactions processes in ventricular zones during neocortex expansion and further, we uncovered pathways related to development of later cognitive functions in the cortical plate of the developing brain which were previously unappreciated. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.
ocsESTdb: a database of oil crop seed EST sequences for comparative analysis and investigation of a global metabolic network and oil accumulation metabolism.

PubMed

Ke, Tao; Yu, Jingyin; Dong, Caihua; Mao, Han; Hua, Wei; Liu, Shengyi

2015-01-21

Oil crop seeds are important sources of fatty acids (FAs) for human and animal nutrition. Despite their importance, there is a lack of an essential bioinformatics resource on gene transcription of oil crops from a comparative perspective. In this study, we developed ocsESTdb, the first database of expressed sequence tag (EST) information on seeds of four large-scale oil crops with an emphasis on global metabolic networks and oil accumulation metabolism that target the involved unigenes. A total of 248,522 ESTs and 106,835 unigenes were collected from the cDNA libraries of rapeseed (Brassica napus), soybean (Glycine max), sesame (Sesamum indicum) and peanut (Arachis hypogaea). These unigenes were annotated by a sequence similarity search against databases including TAIR, NR protein database, Gene Ontology, COG, Swiss-Prot, TrEMBL and Kyoto Encyclopedia of Genes and Genomes (KEGG). Five genome-scale metabolic networks that contain different numbers of metabolites and gene-enzyme reaction-association entries were analysed and constructed using Cytoscape and yEd programs. Details of unigene entries, deduced amino acid sequences and putative annotation are available from our database to browse, search and download. Intuitive and graphical representations of EST/unigene sequences, functional annotations, metabolic pathways and metabolic networks are also available. ocsESTdb will be updated regularly and can be freely accessed at http://ocri-genomics.org/ocsESTdb/ . ocsESTdb may serve as a valuable and unique resource for comparative analysis of acyl lipid synthesis and metabolism in oilseed plants. It also may provide vital insights into improving oil content in seeds of oil crop species by transcriptional reconstruction of the metabolic network.
Network Analysis of Epidermal Growth Factor Signaling Using Integrated Genomic, Proteomic and Phosphorylation Data

PubMed Central

Waters, Katrina M.; Liu, Tao; Quesenberry, Ryan D.; Willse, Alan R.; Bandyopadhyay, Somnath; Kathmann, Loel E.; Weber, Thomas J.; Smith, Richard D.; Wiley, H. Steven; Thrall, Brian D.

2012-01-01

To understand how integration of multiple data types can help decipher cellular responses at the systems level, we analyzed the mitogenic response of human mammary epithelial cells to epidermal growth factor (EGF) using whole genome microarrays, mass spectrometry-based proteomics and large-scale western blots with over 1000 antibodies. A time course analysis revealed significant differences in the expression of 3172 genes and 596 proteins, including protein phosphorylation changes measured by western blot. Integration of these disparate data types showed that each contributed qualitatively different components to the observed cell response to EGF and that varying degrees of concordance in gene expression and protein abundance measurements could be linked to specific biological processes. Networks inferred from individual data types were relatively limited, whereas networks derived from the integrated data recapitulated the known major cellular responses to EGF and exhibited more highly connected signaling nodes than networks derived from any individual dataset. While cell cycle regulatory pathways were altered as anticipated, we found the most robust response to mitogenic concentrations of EGF was induction of matrix metalloprotease cascades, highlighting the importance of the EGFR system as a regulator of the extracellular environment. These results demonstrate the value of integrating multiple levels of biological information to more accurately reconstruct networks of cellular response. PMID:22479638
Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics.

PubMed

Penrod, Nadia M; Moore, Jason H

2014-02-05

The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. We use this approach to prioritize genes as drug target candidates in a set of ER⁺ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER⁺ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use.
Influence networks based on coexpression improve drug target discovery for the development of novel cancer therapeutics

PubMed Central

2014-01-01

Background The demand for novel molecularly targeted drugs will continue to rise as we move forward toward the goal of personalizing cancer treatment to the molecular signature of individual tumors. However, the identification of targets and combinations of targets that can be safely and effectively modulated is one of the greatest challenges facing the drug discovery process. A promising approach is to use biological networks to prioritize targets based on their relative positions to one another, a property that affects their ability to maintain network integrity and propagate information-flow. Here, we introduce influence networks and demonstrate how they can be used to generate influence scores as a network-based metric to rank genes as potential drug targets. Results We use this approach to prioritize genes as drug target candidates in a set of ER+ breast tumor samples collected during the course of neoadjuvant treatment with the aromatase inhibitor letrozole. We show that influential genes, those with high influence scores, tend to be essential and include a higher proportion of essential genes than those prioritized based on their position (i.e. hubs or bottlenecks) within the same network. Additionally, we show that influential genes represent novel biologically relevant drug targets for the treatment of ER+ breast cancers. Moreover, we demonstrate that gene influence differs between untreated tumors and residual tumors that have adapted to drug treatment. In this way, influence scores capture the context-dependent functions of genes and present the opportunity to design combination treatment strategies that take advantage of the tumor adaptation process. Conclusions Influence networks efficiently find essential genes as promising drug targets and combinations of targets to inform the development of molecularly targeted drugs and their use. PMID:24495353
Design of Compressed Sensing Algorithm for Coal Mine IoT Moving Measurement Data Based on a Multi-Hop Network and Total Variation.

PubMed

Wang, Gang; Zhao, Zhikai; Ning, Yongjie

2018-05-28

With the spread of the coal mine Internet of Things (IoT), mobile measurement devices such as intelligent mine lamps are producing growing volumes of moving measurement data. How to transmit these large amounts of mobile measurement data effectively has become an urgent problem. This paper presents a compressed sensing algorithm for the large amount of coal mine IoT moving measurement data based on a multi-hop network and total variation. By taking gas data in mobile measurement data as an example, two network models for the transmission of gas data flow, namely single-hop and multi-hop transmission modes, are investigated in depth, and a gas data compressed sensing collection model is built based on a multi-hop network. To utilize the sparse characteristics of gas data, the concept of total variation is introduced and a high-efficiency gas data compression and reconstruction method based on Total Variation Sparsity based on Multi-Hop (TVS-MH) is proposed. According to the simulation results, by using the proposed method, the moving measurement data flow from an underground distributed mobile network can be acquired and transmitted efficiently.
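As a rough illustration of why total variation suits slowly changing gas readings, the sketch below computes the discrete total variation of a piecewise-constant series and counts its nonzero consecutive differences. The readings and function names are invented for illustration; this is not the TVS-MH method itself, only the quantity it builds on.

```python
def total_variation(signal):
    """Sum of absolute differences between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:]))


def gradient_sparsity(signal, tol=1e-9):
    """Number of consecutive differences whose magnitude exceeds tol."""
    return sum(1 for a, b in zip(signal, signal[1:]) if abs(b - a) > tol)


# Hypothetical gas concentration readings (percent), piecewise constant.
readings = [0.5, 0.5, 0.5, 0.75, 0.75, 0.75, 0.75, 0.25, 0.25]

tv = total_variation(readings)      # small when the signal has few jumps
jumps = gradient_sparsity(readings) # only 2 of the 8 differences are nonzero
```

A series with few jumps has a sparse difference sequence, which is the kind of structure a total-variation-based reconstruction can exploit.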
COBRApy: COnstraints-Based Reconstruction and Analysis for Python.

PubMed

Ebrahim, Ali; Lerman, Joshua A; Palsson, Bernhard O; Hyduke, Daniel R

2013-08-08

COnstraint-Based Reconstruction and Analysis (COBRA) methods are widely used for genome-scale modeling of metabolic networks in both prokaryotes and eukaryotes. Due to the successes with metabolism, there is an increasing effort to apply COBRA methods to reconstruct and analyze integrated models of cellular processes. The COBRA Toolbox for MATLAB is a leading software package for genome-scale analysis of metabolism; however, it was not designed to elegantly capture the complexity inherent in integrated biological networks and lacks an integration framework for the multiomics data used in systems biology. The openCOBRA Project is a community effort to promote constraints-based research through the distribution of freely available software. Here, we describe COBRA for Python (COBRApy), a Python package that provides support for basic COBRA methods. COBRApy is designed in an object-oriented fashion that facilitates the representation of the complex biological processes of metabolism and gene expression. COBRApy does not require MATLAB to function; however, it includes an interface to the COBRA Toolbox for MATLAB to facilitate use of legacy codes. For improved performance, COBRApy includes parallel processing support for computationally intensive processes. COBRApy is an object-oriented framework designed to meet the computational challenges associated with the next generation of stoichiometric constraint-based models and high-density omics data sets. http://opencobra.sourceforge.net/
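The stoichiometric constraint at the core of such models can be sketched without any COBRA software: a flux vector v is admissible when S·v = 0 (no metabolite accumulates) and every flux lies within its bounds. The three-reaction network, bounds and helper names below are hypothetical and do not use the COBRApy API.

```python
def mat_vec(S, v):
    """Multiply a stoichiometric matrix (list of rows) by a flux vector."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_feasible(S, v, bounds, tol=1e-9):
    """True if v satisfies the steady state S.v = 0 and all flux bounds."""
    balanced = all(abs(x) <= tol for x in mat_vec(S, v))
    within = all(lo - tol <= f <= hi + tol for f, (lo, hi) in zip(v, bounds))
    return balanced and within


# Hypothetical toy network: R1: -> A, R2: A -> B, R3: B ->
# Rows are metabolites (A, B); columns are reactions (R1, R2, R3).
S = [
    [1, -1, 0],
    [0, 1, -1],
]
bounds = [(0, 10), (0, 10), (0, 10)]

steady = is_feasible(S, [2, 2, 2], bounds)       # balanced flux through the chain
accumulating = is_feasible(S, [2, 1, 1], bounds) # A is produced faster than consumed
```

Flux balance analysis then searches this feasible set for the flux vector that maximizes a chosen objective, which requires a linear programming solver and is left out of this sketch.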
Reconstructing networks from dynamics with correlated noise

NASA Astrophysics Data System (ADS)

Tam, H. C.; Ching, Emily S. C.; Lai, Pik-Yin

2018-07-01

Reconstructing the structure of complex networks from measurements of the nodes is a challenge in many branches of science. External influences are always present and act as a noise to the networks of interest. In this paper, we present a method for reconstructing networks from measured dynamics of the nodes subjected to correlated noise that cannot be approximated by a white noise. This method can reconstruct the links of both bidirectional and directed networks, the correlation time and strength of the noise, and also the relative coupling strength of the links when the coupling functions have certain properties. Our method is built upon theoretical relations between network structure and measurable quantities from the dynamics that we have derived for systems that have fixed point dynamics in the noise-free limit. Using these theoretical results, we can further explain the shortcomings of two common practices of inferring links for bidirectional networks using the Pearson correlation coefficient and the partial correlation coefficient.
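For orientation, the two common link-inference statistics named in the abstract can be computed as in the sketch below. It is a minimal pure-Python illustration with invented data in which two nodes x and y share a common driver z; it does not implement the authors' reconstruction method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical measurements: x and y both follow z plus small disturbances.
z = [1, 2, 3, 4, 5, 6]
x = [zi + e for zi, e in zip(z, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]
y = [zi + e for zi, e in zip(z, [0.1, 0.1, -0.1, -0.1, 0.1, -0.1])]

r_xy = pearson(x, y)             # close to 1 although x and y are not directly linked
r_xy_given_z = partial_corr(x, y, z)  # much smaller once z is controlled for
```

Thresholding either statistic to decide whether a link exists is the practice whose shortcomings the abstract refers to.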
Mechanisms and Evolution of Control Logic in Prokaryotic Transcriptional Regulation

PubMed Central

van Hijum, Sacha A. F. T.; Medema, Marnix H.; Kuipers, Oscar P.

2009-01-01

Summary: A major part of organismal complexity and versatility of prokaryotes resides in their ability to fine-tune gene expression to adequately respond to internal and external stimuli. Evolution has been very innovative in creating intricate mechanisms by which different regulatory signals operate and interact at promoters to drive gene expression. The regulation of target gene expression by transcription factors (TFs) is governed by control logic brought about by the interaction of regulators with TF binding sites (TFBSs) in cis-regulatory regions. A factor that in large part determines the strength of the response of a target to a given TF is motif stringency, the extent to which the TFBS fits the optimal TFBS sequence for a given TF. Advances in high-throughput technologies and computational genomics allow reconstruction of transcriptional regulatory networks in silico. To optimize the prediction of transcriptional regulatory networks, i.e., to separate direct regulation from indirect regulation, a thorough understanding of the control logic underlying the regulation of gene expression is required. This review summarizes the state of the art of the elements that determine the functionality of TFBSs by focusing on the molecular biological mechanisms and evolutionary origins of cis-regulatory regions. PMID:19721087
Azimuth-invariant mueller-matrix differentiation of the optical anisotropy of biological tissues

NASA Astrophysics Data System (ADS)

Ushenko, V. A.; Sidor, M. I.; Marchuk, Yu. F.; Pashkovskaya, N. V.; Andreichuk, D. R.

2014-07-01

A Mueller-matrix model is proposed for analysis of the optical anisotropy of protein networks of optically thin nondepolarizing layers of biological tissues with allowance for birefringence and dichroism. The model is used to construct algorithms for reconstruction of coordinate distributions of phase shifts and coefficient of linear dichroism. Objective criteria for differentiation of benign and malignant tissues of female genitals are formulated in the framework of the statistical analysis of such distributions. Approaches of evidence-based medicine are used to determine the working characteristics (sensitivity, specificity, and accuracy) of the Mueller-matrix method for the reconstruction of the parameters of optical anisotropy and show its efficiency in the differentiation of benign and malignant tumors.
redGEM: Systematic reduction and analysis of genome-scale metabolic reconstructions for development of consistent core metabolic models

PubMed Central

Ataman, Meric

2017-01-01

Genome-scale metabolic reconstructions have proven to be valuable resources in enhancing our understanding of metabolic networks as they encapsulate all known metabolic capabilities of the organisms from genes to proteins to their functions. However the complexity of these large metabolic networks often hinders their utility in various practical applications. Although reduced models are commonly used for modeling and in integrating experimental data, they are often inconsistent across different studies and laboratories due to different criteria and detail, which can compromise transferability of the findings and also integration of experimental data from different groups. In this study, we have developed a systematic semi-automatic approach to reduce genome-scale models into core models in a consistent and logical manner focusing on the central metabolism or subsystems of interest. The method minimizes the loss of information using an approach that combines graph-based search and optimization methods. The resulting core models are shown to be able to capture key properties of the genome-scale models and preserve consistency in terms of biomass and by-product yields, flux and concentration variability and gene essentiality. The development of these “consistently-reduced” models will help to clarify and facilitate integration of different experimental data to draw new understanding that can be directly extendable to genome-scale models. PMID:28727725
An algebra-based method for inferring gene regulatory networks

PubMed Central

2014-01-01

Background The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used. Results This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. 
Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the dynamic patterns present in the network. Conclusions Boolean polynomial dynamical systems provide a powerful modeling framework for the reverse engineering of gene regulatory networks that enables a rich mathematical structure on the model search space. A C++ implementation of the method, distributed under the LGPL license, is available, together with the source code, at http://www.paola-vera-licona.net/Software/EARevEng/REACT.html. PMID:24669835
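To make the Boolean polynomial representation concrete, the sketch below encodes Boolean rules as polynomials over GF(2), where AND is x*y, NOT is 1 + x, and OR is x + y + x*y, all taken modulo 2. The three-gene network and its rules are hypothetical, and the code only simulates such a system; it does not perform the evolutionary inference described in the abstract.

```python
def step(state):
    """One synchronous update of a hypothetical 3-gene Boolean polynomial system."""
    x1, x2, x3 = state
    new_x1 = (x2 * x3) % 2            # x2 AND x3
    new_x2 = (1 + x1) % 2             # NOT x1
    new_x3 = (x1 + x2 + x1 * x2) % 2  # x1 OR x2
    return (new_x1, new_x2, new_x3)


def trajectory(state, n_steps):
    """Initial state followed by n_steps successive updates."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


# Time series generated from an arbitrary initial expression pattern.
series = trajectory((1, 0, 0), 4)
```

An inference method of this kind works in the opposite direction: given such a time series, it searches the space of polynomial update rules for those that reproduce the data.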
Digital 3D reconstructions using histological serial sections of lung tissue including the alveolar capillary network.

PubMed

Grothausmann, Roman; Knudsen, Lars; Ochs, Matthias; Mühlfeld, Christian

2017-02-01

Grothausmann R, Knudsen L, Ochs M, Mühlfeld C. Digital 3D reconstructions using histological serial sections of lung tissue including the alveolar capillary network. Am J Physiol Lung Cell Mol Physiol 312: L243-L257, 2017. First published December 2, 2016; doi:10.1152/ajplung.00326.2016. The alveolar capillary network (ACN) provides an enormously large surface area that is necessary for pulmonary gas exchange. Changes of the ACN during normal or pathological development or in pulmonary diseases are of great functional impact and warrant further analysis. Due to the complexity of the three-dimensional (3D) architecture of the ACN, 2D approaches are limited in providing a comprehensive impression of the characteristics of the normal ACN or the nature of its alterations. Stereological methods offer a quantitative way to assess the ACN in 3D in terms of capillary volume, surface area, or number but lack a 3D visualization to interpret the data. Hence, the necessity to visualize the ACN in 3D and to correlate this with data from the same set of data arises. Such an approach requires a large sample volume combined with a high resolution. Here, we present a technically simple and cost-efficient approach to create 3D representations of lung tissue ranging from bronchioles over alveolar ducts and alveoli up to the ACN from more than 1 mm sample extent to a resolution of less than 1 μm. The method is based on automated image acquisition of serially sectioned epoxy resin-embedded lung tissue fixed by vascular perfusion and subsequent automated digital reconstruction and analysis of the 3D data. This efficient method may help to better understand mechanisms of vascular development and pathology of the lung. Copyright © 2017 the American Physiological Society.
Differential network analysis reveals the genome-wide landscape of estrogen receptor modulation in hormonal cancers

PubMed Central

Hsiao, Tzu-Hung; Chiu, Yu-Chiao; Hsu, Pei-Yin; Lu, Tzu-Pin; Lai, Liang-Chuan; Tsai, Mong-Hsun; Huang, Tim H.-M.; Chuang, Eric Y.; Chen, Yidong

2016-01-01

Several mutual information (MI)-based algorithms have been developed to identify dynamic gene-gene and function-function interactions governed by key modulators (genes, proteins, etc.). Due to intensive computation, however, these methods rely heavily on prior knowledge and are limited in genome-wide analysis. We present the modulated gene/gene set interaction (MAGIC) analysis to systematically identify genome-wide modulation of interaction networks. Based on a novel statistical test employing conjugate Fisher transformations of correlation coefficients, MAGIC features fast computation and adaption to variations of clinical cohorts. In simulated datasets MAGIC achieved greatly improved computation efficiency and overall superior performance than the MI-based method. We applied MAGIC to construct the estrogen receptor (ER) modulated gene and gene set (representing biological function) interaction networks in breast cancer. Several novel interaction hubs and functional interactions were discovered. ER+ dependent interaction between TGFβ and NFκB was further shown to be associated with patient survival. The findings were verified in independent datasets. Using MAGIC, we also assessed the essential roles of ER modulation in another hormonal cancer, ovarian cancer. Overall, MAGIC is a systematic framework for comprehensively identifying and constructing the modulated interaction networks in a whole-genome landscape. MATLAB implementation of MAGIC is available for academic uses at https://github.com/chiuyc/MAGIC. PMID:26972162
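The abstract does not spell out MAGIC's conjugate Fisher transformation test. For orientation only, the sketch below shows the textbook Fisher z comparison of one gene pair's correlation in two sample groups (for example ER+ and ER- tumours). The function names and input values are hypothetical, and this standard test should not be read as MAGIC's actual statistic.

```python
import math


def fisher_z(r):
    """Fisher transformation of a correlation coefficient (|r| < 1)."""
    return math.atanh(r)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for a difference between two independent correlations.

    r1, r2 are correlations estimated from n1 and n2 samples (each n > 3).
    Returns the z statistic and its two-sided normal p-value.
    """
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical gene pair: strongly co-expressed in one group, barely in the other.
z_stat, p_value = compare_correlations(0.8, 100, 0.1, 100)
```

Because the test needs only correlation coefficients and sample sizes, it is cheap to evaluate for every gene pair, which is consistent with the speed advantage over MI-based methods that the abstract reports.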

[Nuclear transfer of goat somatic cells transgenic for human lactoferrin].

PubMed

Li, Lan; Shen, Wei; Pan, Qing-Yu; Min, Ling-Jiang; Sun, Yu-Jiang; Fang, Yong-Wei; Deng, Ji-Xian; Pan, Qing-Jie

2006-12-01

Transgenic animal mammary gland bioreactors are being used to produce recombinant proteins with appropriate post-translational modifications, and nuclear transfer of transgenic somatic cells is a more powerful method to produce mammary gland bioreactors. Here we describe efficient gene transfer and nuclear transfer in goat somatic cells. Gene targeting vector pGBC2LF was constructed by cloning human lactoferrin (LF) gene cDNA into exon 2 of the milk goat beta-casein gene, and the endogenous start codon was replaced by that of the human LF gene. Goat fetal fibroblasts were transfected with linearized pGBC2LF and 14 cell lines were positive according to PCR and Southern blot. The transgenic cells were used as donor cells for nuclear transfer, and some of the reconstructed embryos could develop to the blastocyst stage in vitro.
Consensus between Pipelines in Structural Brain Networks

PubMed Central

Parker, Christopher S.; Deligianni, Fani; Cardoso, M. Jorge; Daga, Pankaj; Modat, Marc; Dayan, Michael; Clark, Chris A.

2014-01-01

Structural brain networks may be reconstructed from diffusion MRI tractography data and have great potential to further our understanding of the topological organisation of brain structure in health and disease. Network reconstruction is complex and involves a series of processing methods including anatomical parcellation, registration, fiber orientation estimation and whole-brain fiber tractography. Methodological choices at each stage can affect the anatomical accuracy and graph theoretical properties of the reconstructed networks, meaning applying different combinations in a network reconstruction pipeline may produce substantially different networks. Furthermore, the choice of which connections are considered important is unclear. In this study, we assessed the similarity between structural networks obtained using two independent state-of-the-art reconstruction pipelines. We aimed to quantify network similarity and identify the core connections emerging most robustly in both pipelines. Similarity of network connections was compared between pipelines employing different atlases by merging parcels to a common and equivalent node scale. We found a high agreement between the networks across a range of fiber density thresholds. In addition, we identified a robust core of highly connected regions coinciding with a peak in similarity across network density thresholds, and replicated these results with atlases at different node scales. The binary network properties of these core connections were similar between pipelines but showed some differences in atlases across node scales. This study demonstrates the utility of applying multiple structural network reconstruction pipelines to diffusion data in order to identify the most important connections for further study. PMID:25356977
Recovering time-varying networks of dependencies in social and biological studies.

PubMed

Ahmed, Amr; Xing, Eric P

2009-07-21

A plausible representation of the relational information among entities in dynamic systems such as a living cell or a social community is a stochastic network that is topologically rewiring and semantically evolving over time. Although there is a rich literature in modeling static or temporally invariant networks, little has been done toward recovering the network structure when the networks are not observable in a dynamic context. In this article, we present a machine learning method called TESLA, which builds on a temporally smoothed l(1)-regularized logistic regression formalism that can be cast as a standard convex-optimization problem and solved efficiently by using generic solvers scalable to large networks. We report promising results on recovering simulated time-varying networks and on reverse engineering the latent sequence of temporally rewiring political and academic social networks from longitudinal data, and the evolving gene networks over >4,000 genes during the life cycle of Drosophila melanogaster from a microarray time course at a resolution limited only by sample frequency.
Reconstitution of the myocardium in regenerating newt hearts is preceded by transient deposition of extracellular matrix components.

PubMed

Piatkowski, Tanja; Mühlfeld, Christian; Borchardt, Thilo; Braun, Thomas

2013-07-01

Adult newts efficiently regenerate the heart after injury in a process that involves proliferation of cardiac muscle and nonmuscle cells and repatterning of the myocardium. To analyze the processes that underlie heart regeneration in newts, we characterized the structural changes in the myocardium that allow regeneration after mechanical injury. We found that cardiomyocytes in the damaged ventricle mainly die by necrosis and are removed during the first week after injury, paving the way for the extension of thin myocardial trabeculae, which initially contain only very few cardiomyocytes. During the following 200 days, these thin trabeculae fill up with new cardiomyocytes until the myocardium is fully reconstituted. Interestingly, reconstruction of the newly formed trabeculated network is accompanied by transient deposition of extracellular matrix (ECM) components such as collagen III. We conclude that the ECM is a critical guidance cue for outgrowing and branching trabeculae to reconstruct the trabeculated network, which represents a hallmark of uninjured cardiac tissue in newts.
Gene regulatory network inference from multifactorial perturbation data using both regression and correlation analyses.

PubMed

Xiong, Jie; Zhou, Tong

2012-01-01

An important problem in systems biology is to reconstruct gene regulatory networks (GRNs) from experimental data and other a priori information. The DREAM project offers some types of experimental data, such as knockout data, knockdown data, time series data, etc. Among them, multifactorial perturbation data are easier and less expensive to obtain than other types of experimental data and are thus more common in practice. In this article, a new algorithm is presented for the inference of GRNs using the DREAM4 multifactorial perturbation data. The GRN inference problem among [Formula: see text] genes is decomposed into [Formula: see text] different regression problems. In each of the regression problems, the expression level of a target gene is predicted solely from the expression level of a potential regulation gene. For different potential regulation genes, different weights for a specific target gene are constructed by using the sum of squared residuals and the Pearson correlation coefficient. Then these weights are normalized to reflect effort differences of regulating distinct genes. By appropriately choosing the parameters of the power law, we construct a 0-1 integer programming problem. By solving this problem, direct regulation genes for an arbitrary gene can be estimated. The normalized weight of a gene is then modified on the basis of the estimation results about the existence of direct regulations to it. These normalized and modified weights are used in queuing the possibility of the existence of a corresponding direct regulation. Computation results with the DREAM4 In Silico Size 100 Multifactorial subchallenge show that estimation performances of the suggested algorithm can even outperform the best team. Using the real data provided by the DREAM5 Network Inference Challenge, estimation performances can be ranked third.
Furthermore, the high precision of the obtained most reliable predictions shows the suggested algorithm may be helpful in guiding biological experiment designs.
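The per-target, single-regulator regressions described in the abstract can be sketched as follows: each candidate regulator is fitted to the target by ordinary least squares, and the sum of squared residuals (SSR) and the Pearson correlation are recorded. The expression values, gene names and function names are hypothetical. The weight construction, the power-law normalization and the 0-1 integer program are not given in the abstract and are not reproduced here.

```python
def simple_regression(x, y):
    """Least-squares fit of y on a single predictor x (x must not be constant).

    Returns the sum of squared residuals and the Pearson correlation.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r = sxy / (sxx * syy) ** 0.5
    return ssr, r


def score_regulators(expr, target):
    """Regress the target on every other gene, one regulator at a time."""
    return {
        gene: simple_regression(values, expr[target])
        for gene, values in expr.items()
        if gene != target
    }


# Hypothetical expression levels across five perturbation experiments.
expr = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [3, 1, 4, 1, 5],
}
scores = score_regulators(expr, "g2")
```

For a single predictor the two quantities are tied together by SSR = SST (1 - r^2), where SST is the total sum of squares of the target, so a low SSR and a high absolute correlation point to the same candidate regulators.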
Comparative genomics of the lactic acid bacteria

DOE Office of Scientific and Technical Information (OSTI.GOV)

Makarova, K.; Slesarev, A.; Wolf, Y.

Lactic acid-producing bacteria are associated with various plant and animal niches and play a key role in the production of fermented foods and beverages. We report nine genome sequences representing the phylogenetic and functional diversity of these bacteria. The small genomes of lactic acid bacteria encode a broad repertoire of transporters for efficient carbon and nitrogen acquisition from the nutritionally rich environments they inhabit and reflect a limited range of biosynthetic capabilities that indicate both prototrophic and auxotrophic strains. Phylogenetic analyses, comparison of gene content across the group, and reconstruction of ancestral gene sets indicate a combination of extensive gene loss and key gene acquisitions via horizontal gene transfer during the coevolution of lactic acid bacteria with their habitats.
Boosting Probabilistic Graphical Model Inference by Incorporating Prior Knowledge from Multiple Sources

PubMed Central

Praveen, Paurush; Fröhlich, Holger

2013-01-01

Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available. PMID:23826291
AlgaGEM – a genome-scale metabolic reconstruction of algae based on the Chlamydomonas reinhardtii genome

PubMed Central

2011-01-01

Background Microalgae have the potential to deliver biofuels without the associated competition for land resources. In order to realise the rates and titres necessary for commercial production, however, system-level metabolic engineering will be required. Genome scale metabolic reconstructions have revolutionized microbial metabolic engineering and are used routinely for in silico analysis and design. While genome scale metabolic reconstructions have been developed for many prokaryotes and model eukaryotes, the application to less well characterized eukaryotes such as algae is challenging, not least due to a lack of compartmentalization data. Results We have developed a genome-scale metabolic network model (named AlgaGEM) covering the metabolism for a compartmentalized algae cell based on the Chlamydomonas reinhardtii genome. AlgaGEM is a comprehensive literature-based genome scale metabolic reconstruction that accounts for the functions of 866 unique ORFs, 1862 metabolites, 2249 gene-enzyme-reaction-association entries, and 1725 unique reactions. The reconstruction was compartmentalized into the cytoplasm, mitochondrion, plastid and microbody using available data for algae complemented with compartmentalisation data for Arabidopsis thaliana. AlgaGEM describes a functional primary metabolism of Chlamydomonas and significantly predicts distinct algal behaviours such as the catabolism or secretion rather than recycling of phosphoglycolate in photorespiration. AlgaGEM was validated through the simulation of growth and algae metabolic functions inferred from literature. Using efficient resource utilisation as the optimality criterion, AlgaGEM predicted observed metabolic effects under autotrophic, heterotrophic and mixotrophic conditions. AlgaGEM predicts increased hydrogen production when cyclic electron flow is disrupted as seen in a high producing mutant derived from mutational studies.
The model also predicted the physiological pathway for H2 production and identified new targets to further improve H2 yield. Conclusions AlgaGEM is a viable and comprehensive framework for in silico functional analysis and can be used to derive new, non-trivial hypotheses for exploring this metabolically versatile organism. Flux balance analysis can be used to identify bottlenecks and new targets to metabolically engineer microalgae for production of biofuels. PMID:22369158
Inverse problems in eddy current testing using neural network

NASA Astrophysics Data System (ADS)

Yusa, N.; Cheng, W.; Miya, K.

2000-05-01

Reconstruction of cracks in conductive material is one of the most important issues in the field of eddy current testing. Although many attempts to reconstruct cracks have been made, most of them deal only with artificial cracks machined by electro-discharge. However, in the case of natural cracks like stress corrosion cracking or inter-granular attack, there must be a contact region and therefore their conductivity is not necessarily zero. In this study, an attempt to reconstruct natural cracks using a neural network is presented. The neural network was trained on numerically simulated data obtained by the fast forward solver, which calculated unflawed potential data a priori to save computational time. The solver is based on the A-φ method discretized by using FEM-BEM. A natural crack was modeled as an area whose conductivity was less than that of a specimen. The distribution of conductivity in that area was reconstructed as well. It took much time to train the network, but reconstruction was extremely fast once it was trained. A well-trained network gave good reconstruction results.
Pleiotropy, redundancy and the evolution of flowers.

PubMed

Albert, Victor A; Oppenheimer, David G; Lindqvist, Charlotte

2002-07-01

Most angiosperm flowers are tightly integrated, functionally bisexual shoots that have carpels with enclosed ovules. Flowering plants evolved from within the gymnosperms, which lack this combination of innovations. Paradoxically, phylogenetic reconstructions suggest that the flowering plant lineage substantially pre-dates the evolution of flowers themselves. We provide a model based on known gene regulatory networks whereby positive selection on a single, partially redundant gene duplicate 'trapped' the ancestors of flower-bearing plants into the condensed, bisexual state approximately 130 million years ago. The LEAFY (LFY) gene of Arabidopsis encodes a master regulator that functions as the main conduit of environmental signals to the reproductive developmental program. We directly link the elimination of one LFY paralog, pleiotropically maintained in gymnosperms, to the sudden appearance of flowers in the fossil record.
Computational Tools for Metabolic Engineering

PubMed Central

Copeland, Wilbert B.; Bartley, Bryan A.; Chandran, Deepak; Galdzicki, Michal; Kim, Kyung H.; Sleight, Sean C.; Maranas, Costas D.; Sauro, Herbert M.

2012-01-01

A great variety of software applications are now employed in the metabolic engineering field. These applications have been created to support a wide range of experimental and analysis techniques. Computational tools are utilized throughout the metabolic engineering workflow to extract and interpret relevant information from large data sets, to present complex models in a more manageable form, and to propose efficient network design strategies. In this review, we present a number of tools that can assist in modifying and understanding cellular metabolic networks. The review covers seven areas of relevance to metabolic engineers. These include metabolic reconstruction efforts, network visualization, nucleic acid and protein engineering, metabolic flux analysis, pathway prospecting, post-structural network analysis and culture optimization. The list of available tools is extensive and we can only highlight a small, representative portion of the tools from each area. PMID:22629572
Skeletal camera network embedded structure-from-motion for 3D scene reconstruction from UAV images

NASA Astrophysics Data System (ADS)

Xu, Zhihua; Wu, Lixin; Gerke, Markus; Wang, Ran; Yang, Huachao

2016-11-01

Structure-from-Motion (SfM) techniques have been widely used for 3D scene reconstruction from multi-view images. However, due to the large computational costs of SfM methods there is a major challenge in processing highly overlapping images, e.g. images from unmanned aerial vehicles (UAV). This paper embeds a novel skeletal camera network (SCN) into SfM to enable efficient 3D scene reconstruction from a large set of UAV images. First, the flight control data are used within a weighted graph to construct a topologically connected camera network (TCN) to determine the spatial connections between UAV images. Second, the TCN is refined using a novel hierarchical degree bounded maximum spanning tree to generate a SCN, which contains a subset of edges from the TCN and ensures that each image is involved in at least a 3-view configuration. Third, the SCN is embedded into the SfM to produce a novel SCN-SfM method, which allows performing tie-point matching only for the actually connected image pairs. The proposed method was applied in three experiments with images from two fixed-wing UAVs and an octocopter UAV, respectively. In addition, the SCN-SfM method was compared to three other methods for image connectivity determination. The comparison shows a significant reduction in the number of matched images if our method is used, which leads to less computational costs. At the same time the achieved scene completeness and geometric accuracy are comparable.
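The idea of thinning a camera connectivity graph with a maximum spanning tree can be sketched with plain Kruskal's algorithm, as below. The edge weights are hypothetical overlap scores between image pairs. The paper's hierarchical degree-bounded variant and its 3-view guarantee are not implemented here; this is the unconstrained maximum spanning tree only.

```python
def maximum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm on edges sorted by decreasing weight.

    edges is a list of (weight, u, v) tuples with nodes numbered 0..num_nodes-1.
    Returns the selected edges in the order they were accepted.
    """
    parent = list(range(num_nodes))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for weight, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Hypothetical overlap scores between four UAV images.
edges = [(0.9, 0, 1), (0.8, 1, 2), (0.7, 0, 2), (0.6, 2, 3), (0.2, 1, 3)]
skeleton = maximum_spanning_tree(4, edges)
```

Restricting tie-point matching to the retained edges is what reduces the number of image pairs to process; a bare spanning tree keeps only n - 1 pairs, which is why the paper adds further edges to secure multi-view configurations.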
Identification of Gene Networks for Residual Feed Intake in Angus Cattle Using Genomic Prediction and RNA-seq.

PubMed

Weber, Kristina L; Welly, Bryan T; Van Eenennaam, Alison L; Young, Amy E; Porto-Neto, Laercio R; Reverter, Antonio; Rincon, Gonzalo

2016-01-01

Improvement in feed conversion efficiency can improve the sustainability of beef cattle production, but genomic selection for feed efficiency affects many underlying molecular networks and physiological traits. This study describes the differences between steer progeny of two influential Angus bulls with divergent genomic predictions for residual feed intake (RFI). Eight steer progeny of each sire were phenotyped for growth and feed intake from 8 mo. of age (average BW 254 kg, with a mean difference between sire groups of 4.8 kg) until slaughter at 14-16 mo. of age (average BW 534 kg, sire group difference of 28.8 kg). Terminal samples from pituitary gland, skeletal muscle, liver, adipose, and duodenum were collected from each steer for transcriptome sequencing. Gene expression networks were derived using partial correlation and information theory (PCIT), including differentially expressed (DE) genes, tissue specific (TS) genes, transcription factors (TF), and genes associated with RFI from a genome-wide association study (GWAS). Relative to progeny of the high RFI sire, progeny of the low RFI sire had -0.56 kg/d finishing period RFI (P = 0.05), -1.08 finishing period feed conversion ratio (P = 0.01), +3.3 kg^0.75 finishing period metabolic mid-weight (MMW; P = 0.04), +28.8 kg final body weight (P = 0.01), -12.9 feed bunk visits per day (P = 0.02) with +0.60 min/visit duration (P = 0.01), and +0.0045 carcass specific gravity (weight in air/(weight in air - weight in water), a predictor of carcass fat content; P = 0.03). RNA-seq identified 633 DE genes between sire groups among 17,016 expressed genes. PCIT analysis identified >115,000 significant co-expression correlations between genes and 25 TF hubs, i.e. controllers of clusters of DE, TS, and GWAS SNP genes. Pathway analysis suggests low RFI bull progeny possess heightened gut inflammation and reduced fat deposition. 
This multi-omics analysis shows how differences in RFI genomic breeding values can impact other traits and gene co-expression networks.
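The partial-correlation part of PCIT can be illustrated with a minimal sketch of the standard first-order partial correlation: the correlation between two genes after removing the linear effect of a third. The information-theoretic thresholding that PCIT adds is not reproduced here, and the expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical expression profiles: genes x and y both track regulator z
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
y = [0.9, 2.2, 2.8, 4.1, 4.9, 6.2]

raw = pearson(x, y)
conditioned = partial_correlation(x, y, z)
print(raw, conditioned)
```

In this toy case the raw correlation between x and y is very high, yet it is fully accounted for by z; once z is conditioned on, the remaining association is no longer positive. Network methods of this kind use that contrast to discard indirect edges.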
Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks

PubMed Central

Prigent, Sylvain; Frioux, Clémence; Dittami, Simon M.; Larhlimi, Abdelhalim; Collet, Guillaume; Gutknecht, Fabien; Got, Jeanne; Eveillard, Damien; Bourdon, Jérémie; Plewniak, Frédéric; Tonon, Thierry; Siegel, Anne

2017-01-01

Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understand their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism largely suffer from incompleteness. Appropriate gap-filling methods to identify and add missing reactions are therefore required to address this issue. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often not available or at least prone to errors for newly-explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting constraints raised by the stoichiometry of a metabolic network considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium Candidatus Phaeomarinobacter ectocarpi revealed several candidate metabolic pathways for algal-bacterial interactions. Then Meneco was used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. 
These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to suggest relevant reactions that explain the metabolic capacity of a biological system. PMID:28129330
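The topological view of producibility that underlies this kind of gap-filling can be illustrated with a minimal sketch. It only propagates which metabolites are reachable from a set of seeds and checks whether a candidate reaction restores a target; it does not reproduce Meneco's Answer Set Programming formulation or its optimization over minimal reaction sets. Reaction and metabolite names are hypothetical.

```python
def producible_scope(reactions, seeds):
    """Return all metabolites reachable from the seeds.

    reactions: dict mapping a reaction name to (substrates, products),
    both given as sets. Stoichiometry is deliberately ignored.
    """
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction fires once all of its substrates are in scope
            if substrates <= scope and not products <= scope:
                scope |= products
                changed = True
    return scope


# Hypothetical degraded draft network: the step g6p -> f6p is missing
draft = {
    "r1": ({"glc"}, {"g6p"}),
    "r3": ({"f6p"}, {"fbp"}),
}
# Hypothetical candidate reaction taken from a reference database
candidate = {"r2": ({"g6p"}, {"f6p"})}

seeds = {"glc"}
targets = {"fbp"}

missing_before = targets - producible_scope(draft, seeds)
missing_after = targets - producible_scope({**draft, **candidate}, seeds)
print(missing_before, missing_after)
```

Because only set membership is tested, the check is insensitive to cofactor balancing, which is the property the abstract contrasts with stoichiometry-based tools.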
Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks.

PubMed

Prigent, Sylvain; Frioux, Clémence; Dittami, Simon M; Thiele, Sven; Larhlimi, Abdelhalim; Collet, Guillaume; Gutknecht, Fabien; Got, Jeanne; Eveillard, Damien; Bourdon, Jérémie; Plewniak, Frédéric; Tonon, Thierry; Siegel, Anne

2017-01-01

Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understand their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism largely suffer from incompleteness. Appropriate gap-filling methods to identify and add missing reactions are therefore required to address this issue. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often not available or at least prone to errors for newly-explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting constraints raised by the stoichiometry of a metabolic network considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium Candidatus Phaeomarinobacter ectocarpi revealed several candidate metabolic pathways for algal-bacterial interactions. Then Meneco was used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. 
These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to suggest relevant reactions that explain the metabolic capacity of a biological system.
m6A-Driver: Identifying Context-Specific mRNA m6A Methylation-Driven Gene Interaction Networks

PubMed Central

Zhang, Song-Yao; Zhang, Shao-Wu; Liu, Lian; Huang, Yufei

2016-01-01

As the most prevalent mammalian mRNA epigenetic modification, N6-methyladenosine (m6A) has been shown to possess important post-transcriptional regulatory functions. However, the regulatory mechanisms and functional circuits of m6A are still largely elusive. To help unveil the regulatory circuitry mediated by mRNA m6A methylation, we develop here m6A-Driver, an algorithm for predicting m6A-driven genes and associated networks, whose functional interactions are likely to be actively modulated by m6A methylation under a specific condition. Specifically, m6A-Driver integrates the PPI network and the predicted differential m6A methylation sites from methylated RNA immunoprecipitation sequencing (MeRIP-Seq) data using a Random Walk with Restart (RWR) algorithm and then builds a consensus m6A-driven network of m6A-driven genes. To evaluate the performance, we applied m6A-Driver to build the context-specific m6A-driven networks for 4 known m6A (de)methylases, i.e., FTO, METTL3, METTL14 and WTAP. Our results suggest that m6A-Driver can robustly and efficiently identify m6A-driven genes that are functionally more enriched and associated with higher degree of differential expression than differential m6A methylated genes. Pathway analysis of the constructed context-specific m6A-driven gene networks further revealed the regulatory circuitry underlying the dynamic interplays between the methyltransferases and demethylase at the epitranscriptomic layer of gene regulation. PMID:28027310
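The Random Walk with Restart step can be illustrated with a minimal sketch of the generic algorithm on a small undirected graph. The way m6A-Driver weights the PPI network and builds its consensus network is not reproduced; the graph, seed, and restart probability below are assumptions for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency: symmetric dict of dicts, adjacency[u][v] = edge weight.
    seeds: set of nodes where the walker restarts.
    """
    nodes = sorted(adjacency)
    degree = {n: sum(adjacency[n].values()) for n in nodes}
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Probability mass arriving at n from each neighbour m
            inflow = sum(p[m] * w / degree[m]
                         for m, w in adjacency[n].items())
            nxt[n] = (1 - restart) * inflow + restart * p0[n]
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical path graph A - B - C - D with unit weights, seeded at A
graph = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
scores = random_walk_with_restart(graph, seeds={"A"})
print(scores)
```

Nodes close to the seed receive higher steady-state scores, which is how proximity to differentially methylated genes would be turned into a ranking.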
Analysis of copy number variations in Holstein cows identify potential mechanisms contributing to differences in residual feed intake.

PubMed

Hou, Yali; Bickhart, Derek M; Chung, Hoyoung; Hutchison, Jana L; Norman, H Duane; Connor, Erin E; Liu, George E

2012-11-01

Genomic structural variation is an important and abundant source of genetic and phenotypic variation. In this study, we performed an initial analysis of copy number variations (CNVs) using BovineHD SNP genotyping data from 147 Holstein cows identified as having high or low feed efficiency as estimated by residual feed intake (RFI). We detected 443 candidate CNV regions (CNVRs) that represent 18.4 Mb (0.6 %) of the genome. To investigate the functional impacts of CNVs, we created two groups of 30 individual animals with extremely low or high estimated breeding values (EBVs) for RFI, and referred to these groups as low intake (LI; more efficient) or high intake (HI; less efficient), respectively. We identified 240 (~9.0 Mb) and 274 (~10.2 Mb) CNVRs from LI and HI groups, respectively. Approximately 30-40 % of the CNVRs were specific to the LI group or HI group of animals. The 240 LI CNVRs overlapped with 137 Ensembl genes. Network analyses indicated that the LI-specific genes were predominantly enriched for those functioning in the inflammatory response and immunity. By contrast, the 274 HI CNVRs contained 177 Ensembl genes. Network analyses indicated that the HI-specific genes were particularly involved in the cell cycle, and organ and bone development. These results relate CNVs to two key variables, namely immune response and organ and bone development. The data indicate that greater feed efficiency relates more closely to immune response, whereas cattle with reduced feed efficiency may have a greater capacity for organ and bone development.
Rapid tomographic reconstruction based on machine learning for time-resolved combustion diagnostics

NASA Astrophysics Data System (ADS)

Yu, Tao; Cai, Weiwei; Liu, Yingzheng

2018-04-01

Optical tomography has recently attracted a surge of research effort owing to progress in both the imaging concepts and the sensor and laser technologies. The high spatial and temporal resolutions achievable by these methods provide an unprecedented opportunity for diagnosis of complicated turbulent combustion. However, due to the high data throughput and the inefficiency of the prevailing iterative methods, the tomographic reconstructions, which are typically conducted off-line, are computationally formidable. In this work, we propose an efficient inversion method based on a machine learning algorithm, which can extract useful information from the previous reconstructions and build efficient neural networks to serve as a surrogate model to rapidly predict the reconstructions. The extreme learning machine is used here as a demonstrative example simply because of its ease of implementation, fast learning speed, and good generalization performance. Extensive numerical studies were performed, and the results show that the new method can dramatically reduce the computational time compared with the classical iterative methods. This technique is expected to be an alternative to existing methods when sufficient training data are available. Although this work is discussed in the context of tomographic absorption spectroscopy, we expect it to be useful also for other high-speed tomographic modalities such as volumetric laser-induced fluorescence and tomographic laser-induced incandescence, which have been demonstrated for combustion diagnostics.
Rapid tomographic reconstruction based on machine learning for time-resolved combustion diagnostics.

PubMed

Yu, Tao; Cai, Weiwei; Liu, Yingzheng

2018-04-01

Optical tomography has recently attracted a surge of research effort owing to progress in both the imaging concepts and the sensor and laser technologies. The high spatial and temporal resolutions achievable by these methods provide an unprecedented opportunity for diagnosis of complicated turbulent combustion. However, due to the high data throughput and the inefficiency of the prevailing iterative methods, the tomographic reconstructions, which are typically conducted off-line, are computationally formidable. In this work, we propose an efficient inversion method based on a machine learning algorithm, which can extract useful information from the previous reconstructions and build efficient neural networks to serve as a surrogate model to rapidly predict the reconstructions. The extreme learning machine is used here as a demonstrative example simply because of its ease of implementation, fast learning speed, and good generalization performance. Extensive numerical studies were performed, and the results show that the new method can dramatically reduce the computational time compared with the classical iterative methods. This technique is expected to be an alternative to existing methods when sufficient training data are available. Although this work is discussed in the context of tomographic absorption spectroscopy, we expect it to be useful also for other high-speed tomographic modalities such as volumetric laser-induced fluorescence and tomographic laser-induced incandescence, which have been demonstrated for combustion diagnostics.
iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte metabolism that can be used to simulate its physiological and patho-physiological states.

PubMed

Bordbar, Aarash; Jamshidi, Neema; Palsson, Bernhard O

2011-07-12

The development of high-throughput technologies capable of whole cell measurements of genes, proteins, and metabolites has led to the emergence of systems biology. Integrated analysis of the resulting omic data sets has proved to be hard to achieve. Metabolic network reconstructions enable complex relationships amongst molecular components to be represented formally in a biologically relevant manner while respecting physical constraints. In silico models derived from such reconstructions can then be queried or interrogated through mathematical simulations. Proteomic profiling studies of the mature human erythrocyte have shown more proteins present related to metabolic function than previously thought; however the significance and the causal consequences of these findings have not been explored. Erythrocyte proteomic data was used to reconstruct the most expansive description of erythrocyte metabolism to date, following extensive manual curation, assessment of the literature, and functional testing. The reconstruction contains 281 enzymes representing functions from glycolysis to cofactor and amino acid metabolism. Such a comprehensive view of erythrocyte metabolism implicates the erythrocyte as a potential biomarker for different diseases as well as a 'cell-based' drug-screening tool. The analysis shows that 94 erythrocyte enzymes are implicated in morbid single nucleotide polymorphisms, representing 142 pathologies. In addition, over 230 FDA-approved and experimental pharmaceuticals have enzymatic targets in the erythrocyte. The advancement of proteomic technologies and increased generation of high-throughput proteomic data have created the need for a means to analyze these data in a coherent manner. Network reconstructions provide a systematic means to integrate and analyze proteomic data in a biologically meaningful manner. Analysis of the red cell proteome has revealed an unexpected level of complexity in the functional capabilities of human erythrocyte metabolism.

Sybil--efficient constraint-based modelling in R.

PubMed

Gelius-Dietrich, Gabriel; Desouki, Abdelmoneim Amer; Fritzemeier, Claus Jonathan; Lercher, Martin J

2013-11-13

Constraint-based analyses of metabolic networks are widely used to simulate the properties of genome-scale metabolic networks. Publicly available implementations tend to be slow, impeding large scale analyses such as the genome-wide computation of pairwise gene knock-outs, or the automated search for model improvements. Furthermore, available implementations cannot easily be extended or adapted by users. Here, we present sybil, an open source software library for constraint-based analyses in R; R is a free, platform-independent environment for statistical computing and graphics that is widely used in bioinformatics. Among other functions, sybil currently provides efficient methods for flux-balance analysis (FBA), MOMA, and ROOM that are about ten times faster than previous implementations when calculating the effect of whole-genome single gene deletions in silico on a complete E. coli metabolic model. Due to the object-oriented architecture of sybil, users can easily build analysis pipelines in R or even implement their own constraint-based algorithms. Based on its highly efficient communication with different mathematical optimisation programs, sybil facilitates the exploration of high-dimensional optimisation problems on small time scales. Sybil and all its dependencies are open source. Sybil and its documentation are available for download from the comprehensive R archive network (CRAN).
A methodology for the analysis of differential coexpression across the human lifespan.

PubMed

Gillis, Jesse; Pavlidis, Paul

2009-09-22

Differential coexpression is a change in coexpression between genes that may reflect 'rewiring' of transcriptional networks. It has previously been hypothesized that such changes might be occurring over time in the lifespan of an organism. While both coexpression and differential expression of genes have been previously studied in life stage change or aging, differential coexpression has not. Generalizing differential coexpression analysis to many time points presents a methodological challenge. Here we introduce a method for analyzing changes in coexpression across multiple ordered groups (e.g., over time) and extensively test its validity and usefulness. Our method is based on the use of the Haar basis set to efficiently represent changes in coexpression at multiple time scales, and thus represents a principled and generalizable extension of the idea of differential coexpression to life stage data. We used published microarray studies categorized by age to test the methodology. We validated the methodology by testing our ability to reconstruct Gene Ontology (GO) categories using our measure of differential coexpression and compared this result to using coexpression alone. Our method allows significant improvement in characterizing these groups of genes. Further, we examine the statistical properties of our measure of differential coexpression and establish that the results are significant both statistically and by an improvement in semantic similarity. In addition, we found that our method finds more significant changes in gene relationships compared to several other methods of expressing temporal relationships between genes, such as coexpression over time. Differential coexpression over age generates significant and biologically relevant information about the genes producing it. Our Haar basis methodology for determining age-related differential coexpression performs better than other tested methods. 
The Haar basis set also lends itself to ready interpretation in terms of both evolutionary and physiological mechanisms of aging and can be seen as a natural generalization of two-category differential coexpression.
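The way a Haar basis represents change at multiple time scales can be illustrated with a minimal sketch of an unnormalized Haar decomposition applied to a sequence of coexpression values over ordered age groups. This is the generic transform only, not the authors' statistical procedure, and the input values are invented.

```python
def haar_transform(values):
    """Unnormalized Haar decomposition of a power-of-two-length sequence.

    Returns [overall mean, coarsest difference, ..., finest differences].
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = list(values)
    details = []
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        averages = [(a + b) / 2 for a, b in pairs]
        diffs = [(a - b) / 2 for a, b in pairs]
        # Coarser-scale differences go in front of finer-scale ones
        details = diffs + details
        current = averages
    return current + details


# Hypothetical coexpression of one gene pair in four ordered age groups
coexpression_by_age = [0.8, 0.8, 0.2, 0.2]
coefficients = haar_transform(coexpression_by_age)
print(coefficients)
```

Here the only sizeable coefficient besides the mean is the coarsest difference, which says that coexpression drops between the younger half and the older half of the groups while staying constant within each half. With two groups the transform reduces to a single difference, which is the two-category case mentioned above.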
Rearrangement moves on rooted phylogenetic networks

PubMed Central

Gambette, Philippe; van Iersel, Leo; Jones, Mark; Scornavacca, Celine

2017-01-01

Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based principles, and in a Bayesian context. Their basic components are rearrangement moves that specify all possible ways of generating alternative phylogenies from a given one, and whose fundamental property is to be able to transform, by repeated application, any phylogeny into any other phylogeny. Despite their long tradition in tree-based phylogenetics, very little research has gone into studying similar rearrangement operations for phylogenetic networks, that is, phylogenies explicitly representing scenarios that include reticulate events such as hybridization, horizontal gene transfer, population admixture, and recombination. To fill this gap, we propose “horizontal” moves that ensure that every network of a certain complexity can be reached from any other network of the same complexity, and “vertical” moves that ensure reachability between networks of different complexities. When applied to phylogenetic trees, our horizontal moves, named rNNI and rSPR, reduce to the best-known moves on rooted phylogenetic trees, nearest-neighbor interchange and rooted subtree pruning and regrafting. Besides a number of reachability results, which separate the contributions of horizontal and vertical moves, we prove that rNNI moves are local versions of rSPR moves, and provide bounds on the sizes of the rNNI neighborhoods. The paper focuses on the most biologically meaningful versions of phylogenetic networks, where edges are oriented and reticulation events clearly identified. Moreover, our rearrangement moves are robust to the fact that networks with higher complexity usually allow a better fit with the data. 
Our goal is to provide a solid basis for practical phylogenetic network reconstruction. PMID:28763439
Systems Biology Analysis Merging Phenotype, Metabolomic and Genomic Data Identifies Non-SMC Condensin I Complex, Subunit G (NCAPG) and Cellular Maintenance Processes as Major Contributors to Genetic Variability in Bovine Feed Efficiency

PubMed Central

Widmann, Philipp; Reverter, Antonio; Weikard, Rosemarie; Suhre, Karsten; Hammon, Harald M.; Albrecht, Elke; Kuehn, Christa

2015-01-01

Feed efficiency is a paramount factor for livestock economy. Previous studies had indicated a substantial heritability of several feed efficiency traits. In our study, we investigated the genetic background of residual feed intake, a commonly used parameter of feed efficiency, in a cattle resource population generated from crossing dairy and beef cattle. Starting from a whole genome association analysis, we subsequently performed combined phenotype-metabolome-genome analysis taking a systems biology approach by inferring gene networks based on partial correlation and information theory approaches. Our data about biological processes enriched with genes from the feed efficiency network suggest that genetic variation in feed efficiency is driven by genetic modulation of basic processes relevant to general cellular functions. When looking at the predicted upstream regulators from the feed efficiency network, the Tumor Protein P53 (TP53) and Transforming Growth Factor beta 1 (TGFB1) genes stood out regarding significance of overlap and number of target molecules in the data set. These results further support the hypothesis that TP53 is a major upstream regulator for genetic variation of feed efficiency. Furthermore, our data revealed a significant effect of both the Non-SMC Condensin I Complex, Subunit G (NCAPG) I442M (rs109570900) and the Growth/differentiation factor 8 (GDF8) Q204X (rs110344317) loci on residual feed intake and feed conversion. For both loci, the growth promoting allele at the onset of puberty was associated with a negative, but favorable effect on residual feed intake. The elevated energy demand for increased growth triggered by the NCAPG 442M allele is obviously not fully compensated for by an increased efficiency in converting feed into body tissue. 
As a consequence, the individuals carrying the NCAPG 442M allele had an additional demand for energy uptake that is reflected by the association of the allele with increased daily energy intake as observed in our study. PMID:25875852
Using evolutionary computations to understand the design and evolution of gene and cell regulatory networks.

PubMed

Spirov, Alexander; Holloway, David

2013-07-15

This paper surveys modeling approaches for studying the evolution of gene regulatory networks (GRNs). Modeling of the design or 'wiring' of GRNs has become increasingly common in developmental and medical biology, as a means of quantifying gene-gene interactions, the response to perturbations, and the overall dynamic motifs of networks. Drawing from developments in GRN 'design' modeling, a number of groups are now using simulations to study how GRNs evolve, both for comparative genomics and to uncover general principles of evolutionary processes. Such work can generally be termed evolution in silico. Complementary to these biologically-focused approaches, a now well-established field of computer science is Evolutionary Computations (ECs), in which highly efficient optimization techniques are inspired by evolutionary principles. In surveying biological simulation approaches, we discuss the considerations that must be taken with respect to: (a) the precision and completeness of the data (e.g. are the simulations for very close matches to anatomical data, or are they for more general exploration of evolutionary principles); (b) the level of detail to model (we proceed from 'coarse-grained' evolution of simple gene-gene interactions to 'fine-grained' evolution at the DNA sequence level); (c) to what degree it is important to include the genome's cellular context; and (d) the efficiency of computation. With respect to the latter, we argue that developments in computer science EC offer the means to perform more complete simulation searches, and will lead to more comprehensive biological predictions.
Solving gap metabolites and blocked reactions in genome-scale models: application to the metabolic network of Blattabacterium cuenoti.

PubMed

Ponce-de-León, Miguel; Montero, Francisco; Peretó, Juli

2013-10-31

Metabolic reconstruction is the computation-based process that aims to elucidate the network of metabolites interconnected through reactions catalyzed by activities assigned to one or more genes. Reconstructed models may contain inconsistencies that appear as gap metabolites and blocked reactions. Although automatic methods for solving this problem have been previously developed, there are many situations where manual curation is still needed. We introduce a general definition of gap metabolite that allows its detection in a straightforward manner. Moreover, a method for the detection of Unconnected Modules, defined as isolated sets of blocked reactions connected through gap metabolites, is proposed. The method has been successfully applied to the curation of iCG238, the genome-scale metabolic model for the bacterium Blattabacterium cuenoti, obligate endosymbiont of cockroaches. We found the proposed approach to be a valuable tool for the curation of genome-scale metabolic models. The outcome of its application to the genome-scale model B. cuenoti iCG238 is a more accurate model version named B. cuenoti iMP240.
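Gap-metabolite detection can be illustrated with a minimal sketch of the simplest structural check: a metabolite that is consumed but never produced, or produced but never consumed, anywhere in the network. This is an assumed, commonly used simplification and not necessarily the general definition introduced in the paper. All reactions are treated as irreversible and the toy network is hypothetical.

```python
def find_gap_metabolites(reactions):
    """Flag metabolites that appear on only one side of the network.

    reactions: dict mapping a reaction name to a dict of
    metabolite -> stoichiometric coefficient
    (negative = consumed, positive = produced).
    """
    produced, consumed = set(), set()
    for stoichiometry in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient > 0:
                produced.add(metabolite)
            elif coefficient < 0:
                consumed.add(metabolite)
    return {
        "never_produced": consumed - produced,
        "never_consumed": produced - consumed,
    }


# Hypothetical toy model; EX_* are exchange reactions with the medium
model = {
    "EX_a": {"a": 1},            # uptake of a
    "r1": {"a": -1, "b": 1},
    "r2": {"b": -1, "c": 1},
    "EX_c": {"c": -1},           # secretion of c
    "r3": {"d": -1, "e": 1},     # disconnected from the rest
}
gaps = find_gap_metabolites(model)
print(gaps)
```

Reaction `r3` can carry no steady-state flux because `d` has no source and `e` has no sink, so it is a blocked reaction attached to two gap metabolites, the kind of isolated set the abstract calls an Unconnected Module.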
A Computational Approach to Estimate Interorgan Metabolic Transport in a Mammal

PubMed Central

Cui, Xiao; Geffers, Lars; Eichele, Gregor; Yan, Jun

2014-01-01

In multicellular organisms metabolism is distributed across different organs, each of which has specific requirements to perform its own specialized task. But different organs also have to support the metabolic homeostasis of the organism as a whole by interorgan metabolite transport. Recent studies have successfully reconstructed global metabolic networks in tissues and cell types and attempts have been made to connect organs with interorgan metabolite transport. Instead of these complicated approaches to reconstruct global metabolic networks, we proposed in this study a novel approach to study interorgan metabolite transport focusing on transport processes mediated by solute carrier (Slc) transporters and their couplings to cognate enzymatic reactions. We developed a computational approach to identify and score potential interorgan metabolite transports based on the integration of metabolism and transports in different organs in the adult mouse from quantitative gene expression data. This allowed us to computationally estimate the connectivity between 17 mouse organs via metabolite transport. Finally, by applying our method to circadian metabolism, we showed that our approach can shed new light on the current understanding of interorgan metabolite transport at a whole-body level in mammals. PMID:24971892
Genome-Scale Reconstruction and Analysis of the Metabolic Network in the Hyperthermophilic Archaeon Sulfolobus Solfataricus

PubMed Central

Ulas, Thomas; Riemer, S. Alexander; Zaparty, Melanie; Siebers, Bettina; Schomburg, Dietmar

2012-01-01

We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs with growth occurring at pH 2–4 (optimum 3.5) and a temperature of 75–80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data was also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism. PMID:22952675
The TF-miRNA Coregulation Network in Oral Lichen Planus

PubMed Central

Zuo, Yu-Ling; Gong, Di-Ping; Li, Bi-Ze; Zhao, Juan; Zhou, Ling-Yue; Shao, Fang-Yang; Jin, Zhao; He, Yuan

2015-01-01

Oral lichen planus (OLP) is a chronic inflammatory disease that affects the oral mucosa, and some cases may eventually develop into oral squamous cell carcinoma. Therefore, pinpointing the molecular mechanisms underlying the pathogenesis of OLP is important for developing efficient treatments. Recently, the accumulation of large amounts of omics data, especially transcriptome data, has provided opportunities to investigate OLP from a systematic perspective. In this paper, assuming that OLP-associated genes have functional relationships, we present a new approach to identify OLP-related gene modules from gene regulatory networks. In particular, we find that the gene modules regulated by both transcription factors (TFs) and microRNAs (miRNAs) play important roles in the pathogenesis of OLP, and many genes in the modules have been reported in the literature to be related to OLP. PMID:26064947
Refining Pathways: A Model Comparison Approach

PubMed Central

Moffa, Giusi; Erdmann, Gerrit; Voloshanenko, Oksana; Hundsrucker, Christian; Sadeh, Mohammad J.; Boutros, Michael; Spang, Rainer

2016-01-01

Cellular signalling pathways consolidate multiple molecular interactions into working models of signal propagation, amplification, and modulation. They are described and visualized as networks. Adjusting network topologies to experimental data is a key goal of systems biology. While network reconstruction algorithms like nested effects models are well established tools of computational biology, their data requirements can be prohibitive for their practical use. In this paper we suggest focussing on well defined aspects of a pathway and develop the computational tools to do so. We adapt the framework of nested effect models to focus on a specific aspect of activated Wnt signalling in HCT116 colon cancer cells: Does the activation of Wnt target genes depend on the secretion of Wnt ligands or do mutations in the signalling molecule β-catenin make this activation independent from them? We framed this question into two competing classes of models: Models that depend on Wnt ligands secretion versus those that do not. The model classes translate into restrictions of the pathways in the network topology. Wnt dependent models are more flexible than Wnt independent models. Bayes factors are the standard Bayesian tool to compare different models fairly on the data evidence. In our analysis, the Bayes factors depend on the number of potential Wnt signalling target genes included in the models. Stability analysis with respect to this number showed that the data strongly favours Wnt ligands dependent models for all realistic numbers of target genes. PMID:27248690
Suppressed Expression of T-Box Transcription Factors is Involved in Senescence in Chronic Obstructive Pulmonary Disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Acquaah-Mensah, George; Malhotra, Deepti; Vulimiri, Madhulika

2012-06-19

Chronic obstructive pulmonary disease (COPD) is a major global health problem. The etiology of COPD has been associated with apoptosis, oxidative stress, and inflammation. However, understanding of the molecular interactions that modulate COPD pathogenesis remains only partly resolved. We conducted an exploratory study on COPD etiology to identify the key molecular participants. We used information-theoretic algorithms including Context Likelihood of Relatedness (CLR), Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE), and Inferelator. We captured direct functional associations among genes, given a compendium of gene expression profiles of human lung epithelial cells. A set of genes differentially expressed in COPD, as reported in a previous study, was superposed with the resulting transcriptional regulatory networks. After factoring in the properties of the networks, an established COPD susceptibility locus, and domain-domain interactions involving protein products of genes in the generated networks, several molecular candidates were predicted to be involved in the etiology of COPD. These include COL4A3, CFLAR, GULP1, PDCD1, CASP10, PAX3, BOK, HSPD1, PITX2, and PML. Furthermore, T-box (TBX) genes and cyclin-dependent kinase inhibitor 2A (CDKN2A), which are in a direct transcriptional regulatory relationship, emerged as preeminent participants in the etiology of COPD by means of senescence. Contrary to observations in neoplasms, our study reveals that the expression of genes and proteins in the lung samples from patients with COPD indicates an increased tendency towards cellular senescence. The expression of the anti-senescence mediators TBX transcription factors, chromatin modifiers histone deacetylases, and sirtuins was suppressed, while the expression of TBX-regulated cellular senescence markers such as CDKN2A, CDKN1A, and CAV1 was elevated in the peripheral lung tissue samples from patients with COPD.
The critical balance between senescence and anti-senescence factors is disrupted towards senescence in COPD lungs.
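The pruning step that ARACNE-style algorithms rely on, the data processing inequality (DPI), can be illustrated with a minimal sketch. This is not the authors' pipeline: the Gaussian mutual-information estimator, the function names `gaussian_mi` and `dpi_prune`, and the three simulated "genes" are all assumptions made for illustration, and the published algorithm uses a different (kernel-based) estimator together with a significance threshold.

```python
import itertools
import numpy as np

def gaussian_mi(x, y):
    """Mutual information in nats, assuming x and y are jointly Gaussian."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)

def dpi_prune(expr, eps=0.0):
    """Keep gene pairs (i, j), i < j, that survive DPI pruning.

    expr: array of shape (n_genes, n_samples).
    eps:  tolerance; 0.0 removes every strictly weakest link in a triplet.
    """
    n = expr.shape[0]
    mi = np.zeros((n, n))
    for i, j in itertools.combinations(range(n), 2):
        mi[i, j] = mi[j, i] = gaussian_mi(expr[i], expr[j])
    edges = set(itertools.combinations(range(n), 2))
    removed = set()
    for i, j, k in itertools.combinations(range(n), 3):
        # Within each triplet, the weakest link is treated as indirect.
        trio = [(i, j), (i, k), (j, k)]
        weakest = min(trio, key=lambda e: mi[e])
        others = [mi[e] for e in trio if e != weakest]
        if mi[weakest] < min(others) * (1.0 - eps):
            removed.add(weakest)
    return edges - removed, mi

# Hypothetical regulatory chain g0 -> g1 -> g2 with additive noise.
rng = np.random.default_rng(0)
g0 = rng.standard_normal(2000)
g1 = g0 + 0.5 * rng.standard_normal(2000)
g2 = g1 + 0.5 * rng.standard_normal(2000)
kept, mi = dpi_prune(np.vstack([g0, g1, g2]))
print(sorted(kept))
```

In this simulated chain the g0-g2 association arises only through g1, so it is the weakest link of the triplet and is dropped, while the two direct links are kept.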
optGpSampler: an improved tool for uniformly sampling the solution-space of genome-scale metabolic networks.

PubMed

Megchelenbrink, Wout; Huynen, Martijn; Marchiori, Elena

2014-01-01

Constraint-based models of metabolic networks are typically underdetermined, because they contain more reactions than metabolites. Therefore the solutions to this system do not consist of unique flux rates for each reaction, but rather a space of possible flux rates. By uniformly sampling this space, an estimated probability distribution for each reaction's flux in the network can be obtained. However, sampling a high dimensional network is time-consuming. Furthermore, the constraints imposed on the network give rise to an irregularly shaped solution space. Therefore more tailored, efficient sampling methods are needed. We propose an efficient sampling algorithm (called optGpSampler), which implements the Artificial Centering Hit-and-Run algorithm in a different manner than the sampling algorithm implemented in the COBRA Toolbox for metabolic network analysis, here called gpSampler. Results of extensive experiments on different genome-scale metabolic networks show that optGpSampler is up to 40 times faster than gpSampler. Application of existing convergence diagnostics on small network reconstructions indicates that optGpSampler converges roughly ten times faster than gpSampler towards similar sampling distributions. For networks of higher dimension (i.e. containing more than 500 reactions), we observed significantly better convergence of optGpSampler and a large deviation between the samples generated by the two algorithms. optGpSampler for Matlab and Python is available for non-commercial use at: http://cs.ru.nl/~wmegchel/optGpSampler/.
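To make the idea of sampling a constrained solution space concrete, the following is a minimal sketch of plain hit-and-run sampling over a bounded polytope {x : A x <= b}. It is neither optGpSampler nor the Artificial Centering variant named in the abstract; the function name `hit_and_run` and the unit-square example are assumptions chosen for illustration, and a real flux space would additionally carry the steady-state equality constraints.

```python
import numpy as np

def hit_and_run(A, b, x0, n_samples, rng):
    """Plain hit-and-run sampler for the bounded polytope {x : A x <= b}.

    x0 must be a strictly interior starting point.
    """
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))
    for s in range(n_samples):
        # Draw a uniformly random direction on the unit sphere.
        d = rng.standard_normal(x.size)
        d /= np.linalg.norm(d)
        ad = A @ d
        # Slack of each constraint; clipped to guard against rounding error.
        slack = np.maximum(b - A @ x, 0.0)
        pos, neg = ad > 1e-12, ad < -1e-12
        t_max = np.min(slack[pos] / ad[pos])   # furthest feasible step forward
        t_min = np.max(slack[neg] / ad[neg])   # furthest feasible step backward
        # Move to a uniformly chosen point on the feasible chord.
        x = x + rng.uniform(t_min, t_max) * d
        samples[s] = x
    return samples

# Hypothetical example: the unit square 0 <= x, y <= 1.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
rng = np.random.default_rng(42)
samples = hit_and_run(A, b, x0=[0.5, 0.5], n_samples=5000, rng=rng)
print(samples.mean(axis=0))
```

Each step costs one matrix-vector product per constraint set, which hints at why sampling genome-scale networks with thousands of reactions becomes time-consuming.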
Reconstructing the regulatory circuit of cell fate determination in yeast mating response.

PubMed

Shao, Bin; Yuan, Haiyu; Zhang, Rongfei; Wang, Xuan; Zhang, Shuwen; Ouyang, Qi; Hao, Nan; Luo, Chunxiong

2017-07-01

Massive technological advances enabled high-throughput measurements of proteomic changes in biological processes. However, retrieving biological insights from large-scale protein dynamics data remains a challenging task. Here we used the mating differentiation in yeast Saccharomyces cerevisiae as a model and developed integrated experimental and computational approaches to analyze the proteomic dynamics during the process of cell fate determination. When exposed to a high dose of mating pheromone, the yeast cell undergoes growth arrest and forms a shmoo-like morphology; however, at intermediate doses, chemotropic elongated growth is initialized. To understand the gene regulatory networks that control this differentiation switch, we employed a high-throughput microfluidic imaging system that allows real-time and simultaneous measurements of cell growth and protein expression. Using kinetic modeling of protein dynamics, we classified the stimulus-dependent changes in protein abundance into two sources: global changes due to physiological alterations and gene-specific changes. A quantitative framework was proposed to decouple gene-specific regulatory modes from the growth-dependent global modulation of protein abundance. Based on the temporal patterns of gene-specific regulation, we established the network architectures underlying distinct cell fates using a reverse engineering method and uncovered the dose-dependent rewiring of gene regulatory network during mating differentiation. Furthermore, our results suggested a potential crosstalk between the pheromone response pathway and the target of rapamycin (TOR)-regulated ribosomal biogenesis pathway, which might underlie a cell differentiation switch in yeast mating response. In summary, our modeling approach addresses the distinct impacts of the global and gene-specific regulation on the control of protein dynamics and provides new insights into the mechanisms of cell fate determination. 
We anticipate that our integrated experimental and modeling strategies could be widely applicable to other biological systems.
Systems biology of the structural proteome.

PubMed

Brunk, Elizabeth; Mih, Nathan; Monk, Jonathan; Zhang, Zhen; O'Brien, Edward J; Bliven, Spencer E; Chen, Ke; Chang, Roger L; Bourne, Philip E; Palsson, Bernhard O

2016-03-11

The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology. Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular scale analyses with systems biology approaches by discussing several comparative analyses on the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole cell proteomes. Finally, to aid in the grand challenge of big data to knowledge, we provide several explicit tutorials of how protein-related information can be linked to genome-scale models in a public GitHub repository ( https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/). Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene-product to protein structure to biochemical reaction to network states to phenotypic function. Integration of molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties, and can ultimately influence how to model and predict whole cell phenotypes as well as perform comparative systems biology approaches to study differences between organisms. 
GEM-PRO offers insight into the physical embodiment of an organism's genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research. With these provided tools, tutorials, and background, the reader will be in a position to run GEM-PRO for their own purposes.
Alzheimer's disease master regulators analysis: search for potential molecular targets and drug repositioning candidates.

PubMed

Vargas, D M; De Bastiani, M A; Zimmer, E R; Klamt, F

2018-06-23

Alzheimer's disease (AD) is a multifactorial and complex neuropathology that involves impairment of many intricate molecular mechanisms. Despite recent advances, AD pathophysiological characterization remains incomplete, which hampers the development of effective treatments. In fact, currently, there are no effective pharmacological treatments for AD. Integrative strategies such as transcription regulatory network and master regulator analyses exemplify promising new approaches to study complex diseases and may help in the identification of potential pharmacological targets. In this study, we used transcription regulatory network and master regulator analyses on transcriptomic data of human hippocampus to identify transcription factors (TFs) that can potentially act as master regulators in AD. All expression profiles were obtained from the Gene Expression Omnibus database using the GEOquery package. A normal hippocampus transcription factor-centered regulatory network was reconstructed using the ARACNe algorithm. Master regulator analysis and two-tail gene set enrichment analysis were employed to evaluate the inferred regulatory units in AD case-control studies. Finally, we used a connectivity map adaptation to prospect new potential therapeutic interventions by drug repurposing. We identified TFs with already reported involvement in AD, such as ATF2 and PARK2, as well as possible new targets for future investigations, such as CNOT7, CSRNP2, SLC30A9, and TSC22D1. Furthermore, Connectivity Map Analysis adaptation suggested the repositioning of six FDA-approved drugs that can potentially modulate master regulator candidate regulatory units (Cefuroxime, Cyproterone, Dydrogesterone, Metrizamide, Trimethadione, and Vorinostat). Using a transcription factor-centered regulatory network reconstruction we were able to identify several potential molecular targets and six drug candidates for repositioning in AD. 
Our study provides further support for the use of bioinformatics tools as exploratory strategies in neurodegenerative diseases research, and also provides new perspectives on molecular targets and drug therapies for future investigation and validation in AD.
Integrative View of α2,3-Sialyltransferases (ST3Gal) Molecular and Functional Evolution in Deuterostomes: Significance of Lineage-Specific Losses

PubMed Central

Petit, Daniel; Teppa, Elin; Mir, Anne-Marie; Vicogne, Dorothée; Thisse, Christine; Thisse, Bernard; Filloux, Cyril; Harduin-Lepers, Anne

2015-01-01

Sialyltransferases are responsible for the synthesis of a diverse range of sialoglycoconjugates predicted to be pivotal to deuterostomes’ evolution. In this work, we reconstructed the evolutionary history of the metazoan α2,3-sialyltransferases family (ST3Gal), a subset of sialyltransferases encompassing six subfamilies (ST3Gal I–ST3Gal VI) functionally characterized in mammals. Exploration of genomic and expressed sequence tag databases and search of conserved sialylmotifs led to the identification of a large data set of st3gal-related gene sequences. Molecular phylogeny and large scale sequence similarity network analysis identified four new vertebrate subfamilies called ST3Gal III-r, ST3Gal VII, ST3Gal VIII, and ST3Gal IX. To address the issue of the origin and evolutionary relationships of the st3gal-related genes, we performed comparative syntenic mapping of st3gal gene loci combined to ancestral genome reconstruction. The ten vertebrate ST3Gal subfamilies originated from genome duplication events at the base of vertebrates and are organized in three distinct and ancient groups of genes predating the early deuterostomes. Inferring st3gal gene family history identified also several lineage-specific gene losses, the significance of which was explored in a functional context. Toward this aim, spatiotemporal distribution of st3gal genes was analyzed in zebrafish and bovine tissues. In addition, molecular evolutionary analyses using specificity determining position and coevolved amino acid predictions led to the identification of amino acid residues with potential implication in functional divergence of vertebrate ST3Gal. We propose a detailed scenario of the evolutionary relationships of st3gal genes coupled to a conceptual framework of the evolution of ST3Gal functions. PMID:25534026
Efficiency of nuclear and mitochondrial markers recovering and supporting known amniote groups.

PubMed

Lambret-Frotté, Julia; Perini, Fernando Araújo; de Moraes Russo, Claudia Augusta

2012-01-01

We have analysed the efficiency of all mitochondrial protein coding genes and six nuclear markers (Adora3, Adrb2, Bdnf, Irbp, Rag2 and Vwf) in reconstructing and statistically supporting known amniote groups (murines, rodents, primates, eutherians, metatherians, therians). The efficiencies of maximum likelihood, Bayesian inference, maximum parsimony, neighbor-joining and UPGMA were also evaluated, by assessing the number of correct and incorrect recovered groupings. In addition, we have compared support values using the conservative bootstrap test and the Bayesian posterior probabilities. First, no correlation was observed between gene size and marker efficiency in recovering or supporting correct nodes. As expected, tree-building methods performed similarly, even UPGMA, which in some cases outperformed other, more extensively used methods. Bayesian posterior probabilities tend to show much higher support values than the conservative bootstrap test, for correct and incorrect nodes. Our results also suggest that nuclear markers do not necessarily show a better performance than mitochondrial genes. The so-called dependency among mitochondrial markers was not observed when comparing genome performances. Finally, the amniote groups with lowest recovery rates were therians and rodents, despite the morphological support for their monophyletic status. We suggest that, regardless of the tree-building method, a few carefully selected genes are able to unfold a detailed and robust scenario of phylogenetic hypotheses, particularly if taxon sampling is increased.
Temporal network analysis identifies early physiological and transcriptomic indicators of mild drought in Brassica rapa

PubMed Central

Gehan, Malia A; Mockler, Todd C; Weinig, Cynthia; Ewers, Brent E

2017-01-01

The dynamics of local climates make development of agricultural strategies challenging. Yield improvement has progressed slowly, especially in drought-prone regions where annual crop production suffers from episodic aridity. Underlying drought responses are circadian and diel control of gene expression that regulate daily variations in metabolic and physiological pathways. To identify transcriptomic changes that occur in the crop Brassica rapa during initial perception of drought, we applied a co-expression network approach to associate rhythmic gene expression changes with physiological responses. Coupled analysis of transcriptome and physiological parameters over a two-day time course in control and drought-stressed plants provided temporal resolution necessary for correlation of network modules with dynamic changes in stomatal conductance, photosynthetic rate, and photosystem II efficiency. This approach enabled the identification of drought-responsive genes based on their differential rhythmic expression profiles in well-watered versus droughted networks and provided new insights into the dynamic physiological changes that occur during drought. PMID:28826479
Enhanced capital-asset pricing model for the reconstruction of bipartite financial networks.

PubMed

Squartini, Tiziano; Almog, Assaf; Caldarelli, Guido; van Lelyveld, Iman; Garlaschelli, Diego; Cimini, Giulio

2017-09-01

Reconstructing patterns of interconnections from partial information is one of the most important issues in the statistical physics of complex networks. A paramount example is provided by financial networks. In fact, the spreading and amplification of financial distress in capital markets are strongly affected by the interconnections among financial institutions. Yet, while the aggregate balance sheets of institutions are publicly disclosed, information on single positions is mostly confidential and, as such, unavailable. Standard approaches to reconstruct the network of financial interconnection produce unrealistically dense topologies, leading to a biased estimation of systemic risk. Moreover, reconstruction techniques are generally designed for monopartite networks of bilateral exposures between financial institutions, thus failing in reproducing bipartite networks of security holdings (e.g., investment portfolios). Here we propose a reconstruction method based on constrained entropy maximization, tailored for bipartite financial networks. Such a procedure enhances the traditional capital-asset pricing model (CAPM) and allows us to reproduce the correct topology of the network. We test this enhanced CAPM (ECAPM) method on a dataset, collected by the European Central Bank, of detailed security holdings of European institutional sectors over a period of six years (2009-2015). Our approach outperforms the traditional CAPM and the recently proposed maximum-entropy CAPM both in reproducing the network topology and in estimating systemic risk due to fire sales spillovers. In general, ECAPM can be applied to the whole class of weighted bipartite networks described by the fitness model.
Enhanced capital-asset pricing model for the reconstruction of bipartite financial networks

NASA Astrophysics Data System (ADS)

Squartini, Tiziano; Almog, Assaf; Caldarelli, Guido; van Lelyveld, Iman; Garlaschelli, Diego; Cimini, Giulio

2017-09-01

Reconstructing patterns of interconnections from partial information is one of the most important issues in the statistical physics of complex networks. A paramount example is provided by financial networks. In fact, the spreading and amplification of financial distress in capital markets are strongly affected by the interconnections among financial institutions. Yet, while the aggregate balance sheets of institutions are publicly disclosed, information on single positions is mostly confidential and, as such, unavailable. Standard approaches to reconstruct the network of financial interconnection produce unrealistically dense topologies, leading to a biased estimation of systemic risk. Moreover, reconstruction techniques are generally designed for monopartite networks of bilateral exposures between financial institutions, thus failing in reproducing bipartite networks of security holdings (e.g., investment portfolios). Here we propose a reconstruction method based on constrained entropy maximization, tailored for bipartite financial networks. Such a procedure enhances the traditional capital-asset pricing model (CAPM) and allows us to reproduce the correct topology of the network. We test this enhanced CAPM (ECAPM) method on a dataset, collected by the European Central Bank, of detailed security holdings of European institutional sectors over a period of six years (2009-2015). Our approach outperforms the traditional CAPM and the recently proposed maximum-entropy CAPM both in reproducing the network topology and in estimating systemic risk due to fire sales spillovers. In general, ECAPM can be applied to the whole class of weighted bipartite networks described by the fitness model.

Compressed sensing for energy-efficient wireless telemonitoring of noninvasive fetal ECG via block sparse Bayesian learning.

PubMed

Zhang, Zhilin; Jung, Tzyy-Ping; Makeig, Scott; Rao, Bhaskar D

2013-02-01

Fetal ECG (FECG) telemonitoring is an important branch in telemedicine. The design of a telemonitoring system via a wireless body area network with low energy consumption for ambulatory use is highly desirable. As an emerging technique, compressed sensing (CS) shows great promise in compressing/reconstructing data with low energy consumption. However, due to some specific characteristics of raw FECG recordings such as nonsparsity and strong noise contamination, current CS algorithms generally fail in this application. This paper proposes to use the block sparse Bayesian learning framework to compress/reconstruct nonsparse raw FECG recordings. Experimental results show that the framework can reconstruct the raw recordings with high quality. Notably, the reconstruction does not destroy the interdependence relation among the multichannel recordings. This ensures that the independent component analysis decomposition of the reconstructed recordings has high fidelity. Furthermore, the framework allows the use of a sparse binary sensing matrix with far fewer nonzero entries to compress recordings. In particular, each column of the matrix can contain only two nonzero entries. This shows that the framework, compared to other algorithms such as current CS algorithms and wavelet algorithms, can greatly reduce code execution on the CPU in the data compression stage.
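The compression stage described above can be sketched as follows: a sparse binary sensing matrix with exactly two nonzero entries per column turns the product y = Phi x into a handful of additions. The matrix dimensions, the random signal, and the function names `sparse_binary_matrix` and `compress` are hypothetical choices for illustration; the block sparse Bayesian learning reconstruction itself is not shown.

```python
import numpy as np

def sparse_binary_matrix(m, n, rng, ones_per_column=2):
    """Build an m x n binary matrix with a fixed number of ones per column."""
    phi = np.zeros((m, n), dtype=np.int8)
    for j in range(n):
        rows = rng.choice(m, size=ones_per_column, replace=False)
        phi[rows, j] = 1
    return phi

def compress(phi, x):
    """Compute y = phi @ x using additions only (no multiplications)."""
    y = np.zeros(phi.shape[0], dtype=x.dtype)
    rows, cols = np.nonzero(phi)
    # Each input sample is added into the rows where its column holds a one.
    np.add.at(y, rows, x[cols])
    return y

# Hypothetical sizes: a 512-sample segment compressed to 128 measurements.
rng = np.random.default_rng(1)
phi = sparse_binary_matrix(m=128, n=512, rng=rng)
x = rng.standard_normal(512)
y = compress(phi, x)
print(y.shape, int(phi.sum()))
```

With two ones per column, compressing an n-sample segment takes 2n additions, which is the kind of saving in on-sensor computation the abstract refers to.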
Phylogeny and evolutionary histories of Pyrus L. revealed by phylogenetic trees and networks based on data from multiple DNA sequences.

PubMed

Zheng, Xiaoyan; Cai, Danying; Potter, Daniel; Postman, Joseph; Liu, Jing; Teng, Yuanwen

2014-11-01

Reconstructing the phylogeny of Pyrus has been difficult due to the wide distribution of the genus and lack of informative data. In this study, we collected 110 accessions representing 25 Pyrus species and constructed both phylogenetic trees and phylogenetic networks based on multiple DNA sequence datasets. Phylogenetic trees based on both cpDNA and nuclear LFY2int2-N (LN) data resulted in poor resolution; in particular, only five primary species were monophyletic in the LN tree. A phylogenetic network of LN suggested that reticulation caused by hybridization is one of the major evolutionary processes for Pyrus species. Polytomies of the gene trees and star-like structure of cpDNA networks suggested rapid radiation is another major evolutionary process, especially for the occidental species. Pyrus calleryana and P. regelii were the earliest diverged Pyrus species. Two North African species, P. cordata, P. spinosa and P. betulaefolia were descendants of primitive stock Pyrus species and still share some common molecular characters. Southwestern China, where a large number of P. pashia populations are found, is probably the most important diversification center of Pyrus. More accessions and nuclear genes are needed for further understanding the evolutionary histories of Pyrus. Copyright © 2014 Elsevier Inc. All rights reserved.
A paradigm for viewing biologic systems as scale-free networks based on energy efficiency: implications for present therapies and the future of evolution.

PubMed

Yun, Anthony J; Lee, Patrick Y; Doux, John D

2006-01-01

A network constitutes an abstract description of the relationships among entities, respectively termed links and nodes. If a power law describes the probability distribution of the number of links per node, the network is said to be scale-free. Scale-free networks feature link clustering around certain hubs based on preferential attachments that emerge due either to merit or legacy. Biologic systems ranging from sub-atomic to ecosystems represent scale-free networks in which energy efficiency forms the basis of preferential attachments. This paradigm engenders a novel scale-free network theory of evolution based on energy efficiency. As environmental flux induces fitness dislocations and compels a new meritocracy, new merit-based hubs emerge, previously merit-based hubs become legacy hubs, and network recalibration occurs to achieve system optimization. To date, Darwinian evolution, characterized by innovation sampling, variation, and selection through filtered termination, has enabled biologic progress through optimization of energy efficiency. However, as humans remodel their environment, increasing the level of unanticipated fitness dislocations and inducing evolutionary stress, the tendency of networks to exhibit inertia and retain legacy hubs engenders maladaptations. Many modern diseases may fundamentally derive from these evolutionary displacements. Death itself may constitute a programmed adaptation, terminating individuals who represent legacy hubs and recalibrating the network. As memes replace genes as the basis of innovation, death itself has become a legacy hub. Post-Darwinian evolution may favor indefinite persistence to optimize energy efficiency. We describe strategies to reprogram or decommission legacy hubs that participate in human disease and death.
CardioNet: a human metabolic network suited for the study of cardiomyocyte metabolism.

PubMed

Karlstädt, Anja; Fliegner, Daniela; Kararigas, Georgios; Ruderisch, Hugo Sanchez; Regitz-Zagrosek, Vera; Holzhütter, Hermann-Georg

2012-08-29

Availability of oxygen and nutrients in the coronary circulation is a crucial determinant of cardiac performance. Nutrient composition of coronary blood may significantly vary in specific physiological and pathological conditions, for example, administration of special diets, long-term starvation, physical exercise or diabetes. Quantitative analysis of cardiac metabolism from a systems biology perspective may lead to a better understanding of the relationship between nutrient supply and efficiency of metabolic processes required for an adequate cardiac output. Here we present CardioNet, the first large-scale reconstruction of the metabolic network of the human cardiomyocyte comprising 1793 metabolic reactions, including 560 transport processes in six compartments. We use flux-balance analysis to demonstrate the capability of the network to accomplish a set of 368 metabolic functions required for maintaining the structural and functional integrity of the cell. Taking the maintenance of ATP, biosynthesis of ceramide, cardiolipin and further important phospholipids as examples, we analyse how a changed supply of glucose, lactate, fatty acids and ketone bodies may influence the efficiency of these essential processes. CardioNet is a functionally validated metabolic network of the human cardiomyocyte that enables theoretical studies of cellular metabolic processes crucial for the accomplishment of an adequate cardiac output.
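Flux-balance analysis, as used above, amounts to a linear program: maximize an objective flux subject to steady state (S v = 0) and flux bounds. The sketch below uses a hypothetical four-reaction toy network, not CardioNet; the stoichiometric matrix, the bounds, and the function name `max_objective_flux` are assumptions for illustration, and SciPy's `linprog` stands in for whatever solver the authors used.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical). Columns are reactions:
#   0: uptake (-> A), 1: r1 (A -> B), 2: r2 (A -> B, alternative), 3: output (B ->)
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 6), (0, 3), (0, None)]

def max_objective_flux(S, bounds, objective_idx=3):
    """Maximize one flux subject to S v = 0 and the given flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_idx] = -1.0  # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return -res.fun, res.x

wild_type, _ = max_objective_flux(S, bounds)

# Simulate a changed supply or a knockout by tightening a bound.
knockout_bounds = list(bounds)
knockout_bounds[1] = (0, 0)      # block reaction r1
knockout, _ = max_objective_flux(S, knockout_bounds)
print(wild_type, knockout)
```

Changing the uptake bound instead of a reaction bound would mimic an altered nutrient supply in the same way.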
Abasy Atlas: a comprehensive inventory of systems, global network properties and systems-level elements across bacteria

PubMed Central

Ibarra-Arellano, Miguel A.; Campos-González, Adrián I.; Treviño-Quintanilla, Luis G.; Tauch, Andreas; Freyre-González, Julio A.

2016-01-01

The availability of databases electronically encoding curated regulatory networks and of high-throughput technologies and methods to discover regulatory interactions provides an invaluable source of data to understand the principles underpinning the organization and evolution of these networks responsible for cellular regulation. Nevertheless, data on these sources never goes beyond the regulon level despite the fact that regulatory networks are complex hierarchical-modular structures still challenging our understanding. This brings the necessity for an inventory of systems across a large range of organisms, a key step to rendering feasible comparative systems biology approaches. In this work, we take the first step towards a global understanding of the regulatory networks organization by making a cartography of the functional architectures of diverse bacteria. Abasy (Across-bacteria systems) Atlas provides a comprehensive inventory of annotated functional systems, global network properties and systems-level elements (global regulators, modular genes shaping functional systems, basal machinery genes and intermodular genes) predicted by the natural decomposition approach for reconstructed and meta-curated regulatory networks across a large range of bacteria, including pathogenically and biotechnologically relevant organisms. The meta-curation of regulatory datasets provides the most complete and reliable set of regulatory interactions currently available, which can even be projected into subsets by considering the force or weight of evidence supporting them or the systems that they belong to. Besides, Abasy Atlas provides data enabling large-scale comparative systems biology studies aimed at understanding the common principles and particular lifestyle adaptions of systems across bacteria. 
Abasy Atlas contains systems and system-level elements for 50 regulatory networks comprising 78 649 regulatory interactions covering 42 bacteria in nine taxa, containing 3708 regulons and 1776 systems. All this brings together a large corpus of data that will surely inspire studies to generate hypotheses regarding the principles governing the evolution and organization of systems and the functional architectures controlling them. Database URL: http://abasy.ccg.unam.mx PMID:27242034
HOXB7 and Hsa-miR-222 as the Potential Therapeutic Candidates for Metastatic Colorectal Cancer.

PubMed

Iman, Maryam; Mostafavi, Seyede Samaneh; Arab, Seyed Shahriar; Azimzadeh, Sadegh; Poorebrahim, Mansour

2016-01-01

Recent studies have shown that the high mortality of patients with colorectal cancer (CRC) is related to its ability to spread to the surrounding tissues; thus, there is a need for designing and developing new drugs. Here, we proposed a combinational therapy strategy, an inhibitory peptide in combination with miRNA targeting, for modulating CRC metastasis. In this study, some of the recent patents were also reviewed. After data analysis with GEO2R and gene annotation using the DAVID server, regulatory interactions of differentially expressed genes (DEGs) were obtained from the STRING, GeneMANIA, KEGG and TRED databases. In parallel, the corresponding validated microRNAs (miRNAs) were obtained from the mirDIP web server and a miRNA-DEG regulatory network was also reconstructed. Clustering and topological analyses of the regulatory networks were performed using Cytoscape plug-ins. We found the HOXB family to be the most important functional complex in the DEG-derived regulatory network. Accordingly, an anti-HOXB7 peptide was designed based on the binding interface of its coactivator, PBX1. Topological analysis of the miRNA-DEG network indicated that hsa-miR-222 is one of the most important oncomirs involved in the regulation of DEG activities. Thus, this miRNA, along with HOXB7, was also considered as a potential target for inhibiting CRC metastasis. Molecular docking studies showed that the designed peptide can bind to the desired binding pocket of HOXB7 in a high-affinity manner. Further confirmation was obtained from molecular dynamics (MD) simulations carried out with the GROMACS v5.0.2 simulation package. In conclusion, our findings suggest that simultaneous targeting of key regulatory genes and miRNAs may be a useful strategy for prevention of CRC metastasis.
Enhanced reconstruction of weighted networks from strengths and degrees

NASA Astrophysics Data System (ADS)

Mastrandrea, Rossana; Squartini, Tiziano; Fagiolo, Giorgio; Garlaschelli, Diego

2014-04-01

Network topology plays a key role in many phenomena, from the spreading of diseases to that of financial crises. Whenever the whole structure of a network is unknown, one must resort to reconstruction methods that identify the least biased ensemble of networks consistent with the partial information available. A challenging case, frequently encountered due to privacy issues in the analysis of interbank flows and Big Data, is when there is only local (node-specific) aggregate information available. For binary networks, the relevant ensemble is one where the degree (number of links) of each node is constrained to its observed value. However, for weighted networks the problem is much more complicated. While the naïve approach prescribes to constrain the strengths (total link weights) of all nodes, recent counter-intuitive results suggest that in weighted networks the degrees are often more informative than the strengths. This implies that the reconstruction of weighted networks would be significantly enhanced by the specification of both strengths and degrees, a computationally hard and bias-prone procedure. Here we solve this problem by introducing an analytical and unbiased maximum-entropy method that works in the shortest possible time and does not require the explicit generation of reconstructed samples. We consider several real-world examples and show that, while the strengths alone give poor results, the additional knowledge of the degrees yields accurately reconstructed networks. Information-theoretic criteria rigorously confirm that the degree sequence, as soon as it is non-trivial, is irreducible to the strength sequence. Our results have strong implications for the analysis of motifs and communities and whenever the reconstructed ensemble is required as a null model to detect higher-order patterns.
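To make the binary case mentioned in this abstract concrete: in the degree-constrained maximum-entropy ensemble, each pair of nodes is linked independently with probability p_ij = x_i x_j / (1 + x_i x_j), where the hidden variables x_i are chosen so that every expected degree equals its observed value. The following is a minimal sketch of that binary model only; it does not implement the strengths-and-degrees method of the paper. The function names, the node-by-node bisection solver and the toy degree sequence are all assumptions made for illustration.

```python
def expected_degree(i, x):
    """Expected degree of node i under p_ij = x_i*x_j / (1 + x_i*x_j)."""
    return sum(x[i] * x[j] / (1.0 + x[i] * x[j])
               for j in range(len(x)) if j != i)


def fit_hidden_variables(degrees, sweeps=500, tol=1e-9):
    """Find x so that every expected degree matches its observed value.

    Assumption of this sketch: nodes are updated one at a time by bisection,
    which works because the expected degree of node i increases with x_i.
    """
    n = len(degrees)
    if any(k <= 0 or k >= n - 1 for k in degrees):
        raise ValueError("sketch requires 0 < degree < n - 1 for every node")
    x = [1.0] * n
    for _ in range(sweeps):
        for i, k in enumerate(degrees):
            lo, hi = 0.0, 1.0
            x[i] = hi
            while expected_degree(i, x) < k:  # grow the bracket until it holds the root
                hi *= 2.0
                x[i] = hi
            for _ in range(60):               # bisection on x_i, other nodes held fixed
                x[i] = 0.5 * (lo + hi)
                if expected_degree(i, x) < k:
                    lo = x[i]
                else:
                    hi = x[i]
        error = max(abs(expected_degree(i, x) - k)
                    for i, k in enumerate(degrees))
        if error < tol:
            break
    return x


def link_probability(i, j, x):
    """Probability that nodes i and j are connected in the ensemble."""
    return x[i] * x[j] / (1.0 + x[i] * x[j])


observed_degrees = [3, 2, 2, 2, 1]  # hypothetical toy degree sequence
x = fit_hidden_variables(observed_degrees)
for i, k in enumerate(observed_degrees):
    print(i, k, round(expected_degree(i, x), 4))
```

Because the link probabilities are available in closed form once the x_i are known, expected network properties can be computed directly from them without sampling any graphs, which is the sense in which such methods avoid the explicit generation of reconstructed samples.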
Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

DOE PAGES

Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina; ...

2015-11-10

Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.
Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oyserman, Ben O.; Noguera, Daniel R.; del Rio, Tijana Glavina

Previous studies on enhanced biological phosphorus removal (EBPR) have focused on reconstructing genomic blueprints for the model polyphosphate-accumulating organism Candidatus Accumulibacter phosphatis. Here, a time series metatranscriptome generated from enrichment cultures of Accumulibacter was used to gain insight into anaerobic/aerobic metabolism and regulatory mechanisms within an EBPR cycle. Co-expressed gene clusters were identified displaying ecologically relevant trends consistent with batch cycle phases. Transcripts displaying increased abundance during anaerobic acetate contact were functionally enriched in energy production and conversion, including upregulation of both cytoplasmic and membrane-bound hydrogenases, demonstrating the importance of transcriptional regulation to manage energy and electron flux during anaerobic acetate contact. We hypothesized and demonstrated hydrogen production after anaerobic acetate contact, a previously unknown strategy for Accumulibacter to maintain redox balance. Genes involved in anaerobic glycine utilization were identified and phosphorus release after anaerobic glycine contact demonstrated, suggesting that Accumulibacter routes diverse carbon sources to acetyl-CoA formation via previously unrecognized pathways. A comparative genomics analysis of sequences upstream of co-expressed genes identified two statistically significant putative regulatory motifs. One palindromic motif was identified upstream of genes involved in PHA synthesis and acetate activation and is hypothesized to be a phaR binding site, hence representing a hypothetical PHA modulon. A second motif was identified ~35 base pairs (bp) upstream of a large and diverse array of genes and hence may represent a sigma factor binding site. As a result, this analysis provides a basis and framework for further investigations into Accumulibacter metabolism and the reconstruction of regulatory networks in uncultured organisms.
Statistical inference approach to structural reconstruction of complex networks from binary time series

NASA Astrophysics Data System (ADS)

Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

2018-02-01

Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
Statistical inference approach to structural reconstruction of complex networks from binary time series.

PubMed

Ma, Chuang; Chen, Han-Shuang; Lai, Ying-Cheng; Zhang, Hai-Feng

2018-02-01

Complex networks hosting binary-state dynamics arise in a variety of contexts. In spite of previous works, to fully reconstruct the network structure from observed binary data remains challenging. We articulate a statistical inference based approach to this problem. In particular, exploiting the expectation-maximization (EM) algorithm, we develop a method to ascertain the neighbors of any node in the network based solely on binary data, thereby recovering the full topology of the network. A key ingredient of our method is the maximum-likelihood estimation of the probabilities associated with actual or nonexistent links, and we show that the EM algorithm can distinguish the two kinds of probability values without any ambiguity, insofar as the length of the available binary time series is reasonably long. Our method does not require any a priori knowledge of the detailed dynamical processes, is parameter-free, and is capable of accurate reconstruction even in the presence of noise. We demonstrate the method using combinations of distinct types of binary dynamical processes and network topologies, and provide a physical understanding of the underlying reconstruction mechanism. Our statistical inference based reconstruction method contributes an additional piece to the rapidly expanding "toolbox" of data based reverse engineering of complex networked systems.
Reconstructing Networks from Profit Sequences in Evolutionary Games via a Multiobjective Optimization Approach with Lasso Initialization

PubMed Central

Wu, Kai; Liu, Jing; Wang, Shuai

2016-01-01

Evolutionary games (EG) model a common type of interactions in various complex, networked, natural and social systems. Given such a system with only profit sequences being available, reconstructing the interacting structure of EG networks is fundamental to understand and control its collective dynamics. Existing approaches used to handle this problem, such as the lasso, a convex optimization method, need a user-defined constant to control the tradeoff between the natural sparsity of networks and measurement error (the difference between observed data and simulated data). However, a shortcoming of these approaches is that it is not easy to determine these key parameters which can maximize the performance. In contrast to these approaches, we first model the EG network reconstruction problem as a multiobjective optimization problem (MOP), and then develop a framework which involves multiobjective evolutionary algorithm (MOEA), followed by solution selection based on knee regions, termed as MOEANet, to solve this MOP. We also design an effective initialization operator based on the lasso for MOEA. We apply the proposed method to reconstruct various types of synthetic and real-world networks, and the results show that our approach is effective to avoid the above parameter selecting problem and can reconstruct EG networks with high accuracy. PMID:27886244
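The role of the user-defined constant mentioned in this abstract can be illustrated with a generic lasso problem, minimizing 0.5 * ||y - A w||^2 + lam * ||w||_1, where lam trades data fit against sparsity. The sketch below is a plain cyclic coordinate-descent lasso and not the MOEANet method; the matrix A, the vector y, the lam values and all function names are hypothetical and stand in for data that would be derived from profit sequences.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(A, y, lam, sweeps=200):
    """Minimize 0.5 * ||y - A w||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n_rows, n_cols = len(A), len(A[0])
    w = [0.0] * n_cols
    for _ in range(sweeps):
        for j in range(n_cols):
            # Correlation of column j with the residual that ignores w[j].
            rho = 0.0
            for i in range(n_rows):
                partial = sum(A[i][k] * w[k] for k in range(n_cols) if k != j)
                rho += A[i][j] * (y[i] - partial)
            norm_sq = sum(A[i][j] ** 2 for i in range(n_rows))
            w[j] = soft_threshold(rho, lam) / norm_sq
    return w


# Hypothetical toy data: y was generated from the sparse vector [0, 2, 0, -1].
A = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
]
y = [0.0, 2.0, 1.0, -1.0, -1.0, 2.0]

for lam in (0.1, 1.0, 5.0):
    w = lasso_coordinate_descent(A, y, lam)
    nonzero = sum(1 for v in w if v != 0.0)
    print(lam, [round(v, 3) for v in w], "nonzero:", nonzero)
```

Small values of lam keep the solution close to the least-squares fit, while large values drive every coefficient to zero, so the reconstructed network depends on a constant that has to be chosen by the user. This is the parameter-selection difficulty that the abstract describes.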
Reconstructing Networks from Profit Sequences in Evolutionary Games via a Multiobjective Optimization Approach with Lasso Initialization

NASA Astrophysics Data System (ADS)

Wu, Kai; Liu, Jing; Wang, Shuai

2016-11-01

Evolutionary games (EG) model a common type of interactions in various complex, networked, natural and social systems. Given such a system with only profit sequences being available, reconstructing the interacting structure of EG networks is fundamental to understand and control its collective dynamics. Existing approaches used to handle this problem, such as the lasso, a convex optimization method, need a user-defined constant to control the tradeoff between the natural sparsity of networks and measurement error (the difference between observed data and simulated data). However, a shortcoming of these approaches is that it is not easy to determine these key parameters which can maximize the performance. In contrast to these approaches, we first model the EG network reconstruction problem as a multiobjective optimization problem (MOP), and then develop a framework which involves multiobjective evolutionary algorithm (MOEA), followed by solution selection based on knee regions, termed as MOEANet, to solve this MOP. We also design an effective initialization operator based on the lasso for MOEA. We apply the proposed method to reconstruct various types of synthetic and real-world networks, and the results show that our approach is effective to avoid the above parameter selecting problem and can reconstruct EG networks with high accuracy.
Reconstruction of Tissue-Specific Metabolic Networks Using CORDA

PubMed Central

Schultz, André; Qutub, Amina A.

2016-01-01

Human metabolism involves thousands of reactions and metabolites. To interpret this complexity, computational modeling becomes an essential experimental tool. One of the most popular techniques to study human metabolism as a whole is genome scale modeling. A key challenge to applying genome scale modeling is identifying critical metabolic reactions across diverse human tissues. Here we introduce a novel algorithm called Cost Optimization Reaction Dependency Assessment (CORDA) to build genome scale models in a tissue-specific manner. CORDA performs more efficiently computationally, shows better agreement to experimental data, and displays better model functionality and capacity when compared to previous algorithms. CORDA also returns reaction associations that can greatly assist in any manual curation to be performed following the automated reconstruction process. Using CORDA, we developed a library of 76 healthy and 20 cancer tissue-specific reconstructions. These reconstructions identified which metabolic pathways are shared across diverse human tissues. Moreover, we identified changes in reactions and pathways that are differentially included and present different capacity profiles in cancer compared to healthy tissues, including up-regulation of folate metabolism, the down-regulation of thiamine metabolism, and tight regulation of oxidative phosphorylation. PMID:26942765
A database of human genes and a gene network involved in response to tick-borne encephalitis virus infection.

PubMed

Ignatieva, Elena V; Igoshin, Alexander V; Yudin, Nikolay S

2017-12-28

Tick-borne encephalitis is caused by the neurotropic, positive-sense RNA virus, tick-borne encephalitis virus (TBEV). TBEV infection can lead to a variety of clinical manifestations ranging from slight fever to severe neurological illness. Very little is known about genetic factors predisposing to severe forms of disease caused by TBEV. The aims of the study were to compile a catalog of human genes involved in response to TBEV infection and to rank genes from the catalog based on the number of neighbors in the network of pairwise interactions involving these genes and TBEV RNA or proteins. Based on manual review and curation of scientific publications, a catalog comprising 140 human genes involved in response to TBEV infection was developed. To provide access to data on all genes, the TBEVHostDB web resource (http://icg.nsc.ru/TBEVHostDB/) was created. We reconstructed a network formed by pairwise interactions between the TBEV virion itself, viral RNA and viral proteins and 140 genes/proteins from TBEVHostDB. Genes were ranked according to the number of interactions in the network. Two genes/proteins (CCR5 and IFNAR1) that had the maximal number of interactions were revealed. It was found that the subnetworks formed by CCR5 and IFNAR1 and their neighbors were fragments of two key pathways functioning during the course of tick-borne encephalitis: (1) the attenuation of the interferon-I signaling pathway by the TBEV NS5 protein that targeted peptidase D; (2) the proinflammation and tissue damage pathway triggered by chemokine receptor CCR5 interacting with CD4, CCL3, CCL4 and CCL2. Among nine genes associated with severe forms of TBEV infection, three genes/proteins (CCR5, IL10, ARID1B) were found to have protein-protein interactions within the network, and two genes/proteins (IFNL3 and the aforementioned IL10) were up- or down-regulated in response to TBEV infection.
Based on this finding, potential mechanisms for participation of CCR5, IL10, ARID1B, and IFNL3 in the host response to TBEV infection were suggested. A database comprising 140 human genes involved in response to TBEV infection was compiled and the TBEVHostDB web resource, providing access to all genes was created. This is the first effort of integrating and unifying data on genetic factors that may predispose to severe forms of diseases caused by TBEV. The TBEVHostDB could potentially be used for assessment of risk factors for severe forms of tick-borne encephalitis and for the design of personalized pharmacological strategies for the treatment of TBEV infection.
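The ranking step described in this abstract, ordering genes by their number of interactions in the network, amounts to sorting nodes by degree. The sketch below is a minimal illustration under stated assumptions: the four CCR5 interactions are the ones named in the abstract, while GENE_X, GENE_Y and their edges are hypothetical placeholders, and the function name is not taken from TBEVHostDB.

```python
def rank_by_interactions(edges):
    """Return (node, degree) pairs sorted by decreasing number of distinct neighbors."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(((node, len(nb)) for node, nb in neighbors.items()),
                  key=lambda item: (-item[1], item[0]))


edges = [
    ("CCR5", "CD4"),       # interactions named in the abstract
    ("CCR5", "CCL3"),
    ("CCR5", "CCL4"),
    ("CCR5", "CCL2"),
    ("GENE_X", "CD4"),     # hypothetical placeholder edges
    ("GENE_X", "GENE_Y"),
]

for gene, degree in rank_by_interactions(edges):
    print(gene, degree)
```

Storing neighbors in sets means that an interaction listed twice is counted once, which matters when the pairwise interactions are merged from several publications.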
DNA-Binding Kinetics Determines the Mechanism of Noise-Induced Switching in Gene Networks

PubMed Central

Tse, Margaret J.; Chu, Brian K.; Roy, Mahua; Read, Elizabeth L.

2015-01-01

Gene regulatory networks are multistable dynamical systems in which attractor states represent cell phenotypes. Spontaneous, noise-induced transitions between these states are thought to underlie critical cellular processes, including cell developmental fate decisions, phenotypic plasticity in fluctuating environments, and carcinogenesis. As such, there is increasing interest in the development of theoretical and computational approaches that can shed light on the dynamics of these stochastic state transitions in multistable gene networks. We applied a numerical rare-event sampling algorithm to study transition paths of spontaneous noise-induced switching for a ubiquitous gene regulatory network motif, the bistable toggle switch, in which two mutually repressive genes compete for dominant expression. We find that the method can efficiently uncover detailed switching mechanisms that involve fluctuations both in occupancies of DNA regulatory sites and copy numbers of protein products. In addition, we show that the rate parameters governing binding and unbinding of regulatory proteins to DNA strongly influence the switching mechanism. In a regime of slow DNA-binding/unbinding kinetics, spontaneous switching occurs relatively frequently and is driven primarily by fluctuations in DNA-site occupancies. In contrast, in a regime of fast DNA-binding/unbinding kinetics, switching occurs rarely and is driven by fluctuations in levels of expressed protein. Our results demonstrate how spontaneous cell phenotype transitions involve collective behavior of both regulatory proteins and DNA. Computational approaches capable of simulating dynamics over many system variables are thus well suited to exploring dynamic mechanisms in gene networks. PMID:26488666
Node fingerprinting: an efficient heuristic for aligning biological networks.

PubMed

Radu, Alex; Charleston, Michael

2014-10-01

With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

PubMed

Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

2016-02-27

In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies.
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Correction of dog dystrophic epidermolysis bullosa by transplantation of genetically modified epidermal autografts.

PubMed

Gache, Yannick; Pin, Didier; Gagnoux-Palacios, Laurent; Carozzo, Claude; Meneguzzi, Guerrino

2011-10-01

Recessive dystrophic epidermolysis bullosa (RDEB) is a severe skin blistering condition caused by mutations in the gene coding for collagen type VII. Genetically engineered RDEB dog keratinocytes were used to generate autologous epidermal sheets subsequently grafted on two RDEB dogs carrying a homozygous missense mutation in the col7a1 gene and expressing baseline amounts of the aberrant protein. Transplanted cells regenerated a differentiated and vascularized auto-renewing epidermis progressively repopulated by dendritic cells and melanocytes. No adverse immune reaction was detected in either dog. In dog 1, the grafted epidermis firmly adhered to the dermis throughout the 24-month follow-up, which correlated with efficient transduction (100%) of highly clonogenic epithelial cells and sustained transgene expression. In dog 2, less efficient (65%) transduction of primary keratinocytes resulted in a loss of the transplanted epidermis and graft blistering 5 months after transplantation. These data provide the proof of principle for ex vivo gene therapy of RDEB patients with missense mutations in collagen type VII by engraftment of the reconstructed epidermis, and demonstrate that highly efficient transduction of epidermal stem cells is crucial for successful gene therapy of inherited skin diseases in which correction of the genetic defect confers no major selective advantage in cell culture.
Combining graph and flux-based structures to decipher phenotypic essential metabolites within metabolic networks.

PubMed

Laniau, Julie; Frioux, Clémence; Nicolas, Jacques; Baroukh, Caroline; Cortes, Maria-Paz; Got, Jeanne; Trottier, Camille; Eveillard, Damien; Siegel, Anne

2017-01-01

The emergence of functions in biological systems is a long-standing issue that can now be addressed at the cell level with the emergence of high throughput technologies for genome sequencing and phenotyping. The reconstruction of complete metabolic networks for various organisms is a key outcome of the analysis of these data, giving access to a global view of cell functioning. The analysis of metabolic networks may be carried out by simply considering the architecture of the reaction network or by taking into account the stoichiometry of reactions. In both approaches, this analysis is generally centered on the outcome of the network and considers all metabolic compounds to be equivalent in this respect. As in the case of genes and reactions, about which the concept of essentiality has been developed, it seems, however, that some metabolites play crucial roles in system responses, due to the cell structure or the internal wiring of the metabolic network. We propose a classification of metabolic compounds according to their capacity to influence the activation of targeted functions (generally the growth phenotype) in a cell. We generalize the concept of essentiality to metabolites and introduce the concept of the phenotypic essential metabolite (PEM) which influences the growth phenotype according to sustainability, producibility or optimal-efficiency criteria. We have developed and made available a tool, Conquests, which implements a method combining graph-based and flux-based analysis, two approaches that are usually considered separately. The identification of PEMs is made effective by using a logical programming approach. The exhaustive study of phenotypic essential metabolites in six genome-scale metabolic models suggests that the combination and the comparison of graph, stoichiometry and optimal flux-based criteria allows some features of the metabolic network functionality to be deciphered by focusing on a small number of compounds.
By considering the best combination of both graph-based and flux-based techniques, the Conquests python package advocates for a broader use of these compounds both to facilitate network curation and to promote a precise understanding of metabolic phenotype.

Combining graph and flux-based structures to decipher phenotypic essential metabolites within metabolic networks

PubMed Central

Frioux, Clémence; Nicolas, Jacques; Baroukh, Caroline; Cortes, Maria-Paz; Got, Jeanne; Trottier, Camille; Eveillard, Damien

2017-01-01

Background The emergence of functions in biological systems is a long-standing issue that can now be addressed at the cell level with the emergence of high throughput technologies for genome sequencing and phenotyping. The reconstruction of complete metabolic networks for various organisms is a key outcome of the analysis of these data, giving access to a global view of cell functioning. The analysis of metabolic networks may be carried out by simply considering the architecture of the reaction network or by taking into account the stoichiometry of reactions. In both approaches, this analysis is generally centered on the outcome of the network and considers all metabolic compounds to be equivalent in this respect. As in the case of genes and reactions, about which the concept of essentiality has been developed, it seems, however, that some metabolites play crucial roles in system responses, due to the cell structure or the internal wiring of the metabolic network. Results We propose a classification of metabolic compounds according to their capacity to influence the activation of targeted functions (generally the growth phenotype) in a cell. We generalize the concept of essentiality to metabolites and introduce the concept of the phenotypic essential metabolite (PEM) which influences the growth phenotype according to sustainability, producibility or optimal-efficiency criteria. We have developed and made available a tool, Conquests, which implements a method combining graph-based and flux-based analysis, two approaches that are usually considered separately. The identification of PEMs is made effective by using a logical programming approach. 
Conclusion The exhaustive study of phenotypic essential metabolites in six genome-scale metabolic models suggests that the combination and the comparison of graph, stoichiometry and optimal flux-based criteria allows some features of the metabolic network functionality to be deciphered by focusing on a small number of compounds. By considering the best combination of both graph-based and flux-based techniques, the Conquests python package advocates for a broader use of these compounds both to facilitate network curation and to promote a precise understanding of metabolic phenotype. PMID:29038751
Reconstructing Cell Lineages from Single-Cell Gene Expression Data: A Pilot Study

DTIC Science & Technology

2016-08-30

Reconstructing cell lineages from single-cell gene expression data: a pilot study The goal of this pilot study is to develop novel mathematical...methods, by leveraging tools developed in the bifurcation theory, to infer the underlying cell-state dynamics from single-cell gene expression data. Our...proposed method contains two steps. The first step is to reconstruct the temporal order of the cells from gene expression data, whereas the second
SSER: Species specific essential reactions database.

PubMed

Labena, Abraham A; Ye, Yuan-Nong; Dong, Chuan; Zhang, Fa-Z; Guo, Feng-Biao

2017-04-19

Essential reactions are vital components of cellular networks. They are the foundations of synthetic biology and are potential candidate targets for antimetabolic drug design. Especially if a single reaction is catalyzed by multiple enzymes, then inhibiting the reaction would be a better option than targeting the enzymes or the corresponding enzyme-encoding gene. The existing databases such as BRENDA, BiGG, KEGG, Bio-models, Biosilico, and many others offer useful and comprehensive information on biochemical reactions. But none of these databases especially focus on essential reactions. Therefore, building a centralized repository for this class of reactions would be of great value. Here, we present a species-specific essential reactions database (SSER). The current version comprises essential biochemical and transport reactions of twenty-six organisms which are identified via flux balance analysis (FBA) combined with manual curation on experimentally validated metabolic network models. Quantitative data on the number of essential reactions, number of the essential reactions associated with their respective enzyme-encoding genes and shared essential reactions across organisms are the main contents of the database. SSER would be a prime source to obtain essential reactions data and related gene and metabolite information and it can significantly facilitate the metabolic network models reconstruction and analysis, and drug target discovery studies. Users can browse, search, compare and download the essential reactions of organisms of their interest through the website http://cefg.uestc.edu.cn/sser .
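As a rough illustration of how FBA can flag essential reactions, the sketch below deletes each reaction in turn and re-solves the linear program. It is a minimal example under stated assumptions, not the SSER pipeline: the five-reaction network, the bounds and the function names are hypothetical, and `scipy.optimize.linprog` stands in for whatever solver was actually used.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not taken from SSER).
# Columns: uptake (-> A), r1 (A -> B), r2 (B -> C), r3 (B -> C, alternative), biomass (C ->)
REACTIONS = ["uptake", "r1", "r2", "r3", "biomass"]
S = np.array([
    [1, -1,  0,  0,  0],   # metabolite A
    [0,  1, -1, -1,  0],   # metabolite B
    [0,  0,  1,  1, -1],   # metabolite C
], dtype=float)
BOUNDS = [(0, 10), (0, 1000), (0, 1000), (0, 1000), (0, 1000)]
BIOMASS = REACTIONS.index("biomass")


def max_biomass(S, bounds, objective, knockout=None):
    """Maximize flux through `objective` subject to S v = 0 and the flux bounds."""
    bounds = list(bounds)
    if knockout is not None:
        bounds[knockout] = (0, 0)      # the deleted reaction carries no flux
    c = np.zeros(S.shape[1])
    c[objective] = -1.0                # linprog minimizes, so negate the objective
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return -result.fun if result.success else 0.0


def essential_reactions(S, bounds, names, objective, tol=1e-6):
    """Reactions whose single deletion drops the optimal objective flux to about zero."""
    return [
        name for j, name in enumerate(names)
        if j != objective and max_biomass(S, bounds, objective, knockout=j) < tol
    ]


print(max_biomass(S, BOUNDS, BIOMASS))                     # wild-type optimum
print(essential_reactions(S, BOUNDS, REACTIONS, BIOMASS))  # single-deletion scan
```

Here r2 and r3 are parallel routes, so deleting either one leaves growth intact, while uptake and r1 are reported as essential. This mirrors the abstract's point that a reaction catalyzed by multiple enzymes is better treated as a single target.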
From genomics to chemical genomics: new developments in KEGG

PubMed Central

Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika

2006-01-01

The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
The Spring of Systems Biology-Driven Breeding.

PubMed

Lavarenne, Jérémy; Guyomarc'h, Soazig; Sallaud, Christophe; Gantet, Pascal; Lucas, Mikaël

2018-05-12

Genetics and molecular biology have contributed to the development of rationalized plant breeding programs. Recent developments in both high-throughput experimental analyses of biological systems and in silico data processing offer the possibility to address the whole gene regulatory network (GRN) controlling a given trait. GRN models can be applied to identify topological features helping to shortlist potential candidate genes for breeding purposes. Time-series data sets can be used to support dynamic modelling of the network. This will enable a deeper comprehension of network behaviour and the identification of the few elements to be genetically rewired to push the system towards a modified phenotype of interest. This paves the way to design more efficient, systems biology-based breeding strategies. Copyright © 2018 Elsevier Ltd. All rights reserved.
Systems level mapping of metabolic complexity in Mycobacterium tuberculosis to identify high-value drug targets.

PubMed

Vashisht, Rohit; Bhat, Ashwini G; Kushwaha, Shreeram; Bhardwaj, Anshu; Brahmachari, Samir K

2014-10-11

The effectiveness of current therapeutic regimens for Mycobacterium tuberculosis (Mtb) is diminished by the need for prolonged therapy and the rise of drug resistant/tolerant strains. This global health threat, despite decades of basic research and a wealth of legacy knowledge, is due to a lack of systems level understanding that can innovate the process of fast acting and high efficacy drug discovery. The enhanced functional annotations of the Mtb genome, previously obtained through a crowdsourcing approach, were used to reconstruct the metabolic network of Mtb in a bottom-up manner. We represent this information by developing a novel Systems Biology Spindle Map of Metabolism (SBSM) and comprehend its static and dynamic structure using various computational approaches based on simulation and design. The reconstructed metabolism of Mtb encompasses 961 metabolites, involved in 1152 reactions catalyzed by 890 protein coding genes, organized into 50 pathways. By accounting for static and dynamic analysis of SBSM in Mtb we identified various critical proteins required for the growth and survival of bacteria. Further, we assessed the potential of these proteins as putative drug targets that are fast acting and less toxic. We also formulated a novel concept of metabolic persister genes (MPGs) and compared our predictions with published in vitro and in vivo experimental evidence. Through such analyses, we report for the first time that de novo biosynthesis of NAD may give rise to bacterial persistence in Mtb under conditions of metabolic stress induced by conventional anti-tuberculosis therapy. We propose such MPGs as potential combination drug targets for existing antibiotics that can improve their efficacy and efficiency against drug tolerant bacteria. 
The systems level framework formulated by us to identify potential non-toxic drug targets and strategies to circumvent the issue of bacterial persistence can substantially aid in the process of TB drug discovery and translational research.
Extending gene ontology with gene association networks.

PubMed

Peng, Jiajie; Wang, Tao; Wang, Jixuan; Wang, Yadong; Chen, Jin

2016-04-15

Gene ontology (GO) is a widely used resource to describe the attributes of gene products. However, automatic GO maintenance remains difficult because of the complex logical reasoning and the need for biological knowledge that is not explicitly represented in the GO. Existing studies either construct the whole GO based on network data or only infer the relations between existing GO terms. None is designed to add new terms automatically to the existing GO. We propose a new algorithm, 'GOExtender', to efficiently identify all the connected gene pairs labeled by the same parent GO terms. GOExtender is used to predict new GO terms with biological network data, and connect them to the existing GO. Evaluation tests on biological process and cellular component categories of different GO releases showed that GOExtender can extend new GO terms automatically based on the biological network. Furthermore, we applied GOExtender to the recent release of GO and discovered new GO terms with strong support from literature. Software and supplementary document are available at www.msu.edu/%7Ejinchen/GOExtender. Contact: jinchen@msu.edu or ydwang@hit.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Modular design of metabolic network for robust production of n-butanol from galactose-glucose mixtures.

PubMed

Lim, Hyun Gyu; Lim, Jae Hyung; Jung, Gyoo Yeol

2015-01-01

Refactoring microorganisms for efficient production of advanced biofuel such as n-butanol from a mixture of sugars in the cheap feedstock is a prerequisite to achieve economic feasibility in biorefinery. However, production of biofuel from inedible and cheap feedstock is highly challenging due to the slower utilization of biomass-driven sugars, arising from complex assimilation pathway, difficulties in amplification of biosynthetic pathways for heterologous metabolite, and redox imbalance caused by consuming intracellular reducing power to produce quite reduced biofuel. Even with these problems, the microorganisms should show robust production of biofuel to obtain industrial feasibility. Thus, refactoring microorganisms for efficient conversion is highly desirable in biofuel production. In this study, we engineered robust Escherichia coli to accomplish high production of n-butanol from galactose-glucose mixtures via the design of modular pathway, an efficient and systematic way, to reconstruct the entire metabolic pathway with many target genes. Three modular pathways designed using the predictable genetic elements were assembled for efficient galactose utilization, n-butanol production, and redox re-balancing to robustly produce n-butanol from a sugar mixture of galactose and glucose. Specifically, the engineered strain showed dramatically increased n-butanol production (3.3-fold increased to 6.2 g/L after 48-h fermentation) compared to the parental strain (1.9 g/L) in galactose-supplemented medium. Moreover, fermentation with mixtures of galactose and glucose at various ratios from 2:1 to 1:2 confirmed that our engineered strain was able to robustly produce n-butanol regardless of sugar composition with simultaneous utilization of galactose and glucose. Collectively, modular pathway engineering of metabolic network can be an effective approach in strain development for optimal biofuel production with cost-effective fermentable sugars. 
To the best of our knowledge, this study demonstrated the first and highest n-butanol production from galactose in E. coli. Moreover, robust production of n-butanol with sugar mixtures with variable composition would facilitate the economic feasibility of the microbial process using a mixture of sugars from cheap biomass in the near future.
Differential co-expression analysis reveals a novel prognostic gene module in ovarian cancer.

PubMed

Gov, Esra; Arga, Kazim Yalcin

2017-07-10

Ovarian cancer is one of the most significant diseases among the gynecological disorders that women have suffered from over the centuries. However, disease-specific and effective biomarkers are still not available, since studies have focused on individual genes associated with ovarian cancer, ignoring the interactions and associations among the gene products. Here, ovarian cancer differential co-expression networks were reconstructed via meta-analysis of gene expression data and co-expressed gene modules were identified in epithelial cells from ovarian tumor and healthy ovarian surface epithelial samples to propose ovarian cancer associated genes and their interactions. We propose a novel, highly interconnected, differentially co-expressed, and co-regulated gene module in ovarian cancer consisting of 84 prognostic genes. Furthermore, the specificity of the module to ovarian cancer was shown through analyses of datasets in nine other cancers. These observations underscore the importance of transcriptome based systems biomarkers research in deciphering the elusive pathophysiology of ovarian cancer, and here, we present reciprocal interplay between candidate ovarian cancer genes and their transcriptional regulatory dynamics. The corresponding gene module might provide new insights on ovarian cancer prognosis and treatment strategies that continue to place a significant burden on global health.
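The idea of differential co-expression can be sketched as a comparison of pairwise correlations between two conditions. The example below is only an illustration: the abstract does not specify the statistic used, so the plain Pearson correlation, the threshold and the gene names and values are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def differential_coexpression(tumor, healthy, threshold=1.0):
    """Gene pairs whose correlation differs between conditions by at least `threshold`.

    tumor / healthy: dict mapping gene name -> list of expression values.
    """
    pairs = {}
    for g1, g2 in combinations(sorted(tumor), 2):
        delta = pearson(tumor[g1], tumor[g2]) - pearson(healthy[g1], healthy[g2])
        if abs(delta) >= threshold:
            pairs[(g1, g2)] = delta
    return pairs


# Hypothetical expression values, invented for this example.
tumor = {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8], "geneC": [1, 3, 2, 4]}
healthy = {"geneA": [1, 2, 3, 4], "geneB": [8, 6, 4, 2], "geneC": [1, 3, 2, 4]}

print(differential_coexpression(tumor, healthy))
```

Pairs that pass the threshold form the edges of a differential co-expression network, from which modules could then be extracted.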
LitMiner and WikiGene: identifying problem-related key players of gene regulation using publication abstracts.

PubMed

Maier, Holger; Döhr, Stefanie; Grote, Korbinian; O'Keeffe, Sean; Werner, Thomas; Hrabé de Angelis, Martin; Schneider, Ralf

2005-07-01

The LitMiner software is a literature data-mining tool that facilitates the identification of major gene regulation key players related to a user-defined field of interest in PubMed abstracts. The prediction of gene-regulatory relationships is based on co-occurrence analysis of key terms within the abstracts. LitMiner predicts relationships between key terms from the biomedical domain in four categories (genes, chemical compounds, diseases and tissues). Owing to the limitations (no direction, unverified automatic prediction) of the co-occurrence approach, the primary data in the LitMiner database represent postulated basic gene-gene relationships. The usefulness of the LitMiner system has been demonstrated recently in a study that reconstructed disease-related regulatory networks by promoter modelling that was initiated by a LitMiner generated primary gene list. To overcome the limitations and to verify and improve the data, we developed WikiGene, a Wiki-based curation tool that allows revision of the data by expert users over the Internet. LitMiner (http://andromeda.gsf.de/litminer) and WikiGene (http://andromeda.gsf.de/wiki) can be used unrestricted with any Internet browser.
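Co-occurrence analysis of key terms can be sketched in a few lines. The code below is not LitMiner's implementation; the substring matching, the function name and the example abstracts are assumptions. It counts how often two terms appear in the same abstract and, like the approach described above, produces undirected and unverified pairs.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, terms):
    """Count, for each unordered term pair, the abstracts mentioning both terms.

    Matching is a naive case-insensitive substring test; a real system would
    need tokenization and synonym handling.
    """
    vocabulary = {t.lower() for t in terms}
    counts = Counter()
    for text in abstracts:
        lowered = text.lower()
        present = sorted(t for t in vocabulary if t in lowered)
        counts.update(combinations(present, 2))
    return counts


# Hypothetical abstracts, invented for this example.
example_abstracts = [
    "TP53 regulates MDM2 in tumor tissue.",
    "MDM2 inhibits TP53.",
    "BRCA1 and TP53 in breast cancer.",
]
example_counts = cooccurrence_counts(example_abstracts, ["TP53", "MDM2", "BRCA1"])
print(example_counts.most_common())
```

Pairs with high counts would be candidates for the postulated gene-gene relationships, which still need expert curation.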
Metabolic reconstruction, constraint-based analysis and game theory to probe genome-scale metabolic networks.

PubMed

Ruppin, Eytan; Papin, Jason A; de Figueiredo, Luis F; Schuster, Stefan

2010-08-01

With the advent of modern omics technologies, it has become feasible to reconstruct (quasi-) whole-cell metabolic networks and characterize them in more and more detail. Computer simulations of the dynamic behavior of such networks are difficult due to a lack of kinetic data and to computational limitations. In contrast, network analysis based on appropriate constraints such as the steady-state condition (constraint-based analysis) is feasible and allows one to derive conclusions about the system's metabolic capabilities. Here, we review methods for the reconstruction of metabolic networks, modeling techniques such as flux balance analysis and elementary flux modes and current progress in their development and applications. Game-theoretical methods for studying metabolic networks are discussed as well. Copyright © 2010 Elsevier Ltd. All rights reserved.
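The steady-state condition used in constraint-based analysis requires that the stoichiometric matrix S and the flux vector v satisfy S v = 0, so that no internal metabolite accumulates or is depleted. The check below is a minimal sketch with a hypothetical three-reaction chain; the names are chosen for illustration only.

```python
def net_production(S, v):
    """Return S·v, the net production rate of each metabolite (one entry per row of S)."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if every metabolite is produced exactly as fast as it is consumed."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


# Hypothetical chain: uptake (-> A), conversion (A -> B), export (B ->).
S_chain = [
    [1, -1,  0],   # metabolite A
    [0,  1, -1],   # metabolite B
]

print(is_steady_state(S_chain, [2.0, 2.0, 2.0]))   # balanced fluxes
print(net_production(S_chain, [2.0, 1.0, 1.0]))    # A accumulates
```

Flux balance analysis searches among the flux vectors that pass this test, and respect the flux bounds, for one that optimizes an objective such as biomass production.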
Behavior-specific changes in transcriptional modules lead to distinct and predictable neurogenomic states

PubMed Central

Chandrasekaran, Sriram; Ament, Seth A.; Eddy, James A.; Rodriguez-Zas, Sandra L.; Schatz, Bruce R.; Price, Nathan D.; Robinson, Gene E.

2011-01-01

Using brain transcriptomic profiles from 853 individual honey bees exhibiting 48 distinct behavioral phenotypes in naturalistic contexts, we report that behavior-specific neurogenomic states can be inferred from the coordinated action of transcription factors (TFs) and their predicted target genes. Unsupervised hierarchical clustering of these transcriptomic profiles showed three clusters that correspond to three ecologically important behavioral categories: aggression, maturation, and foraging. To explore the genetic influences potentially regulating these behavior-specific neurogenomic states, we reconstructed a brain transcriptional regulatory network (TRN) model. This brain TRN quantitatively predicts with high accuracy gene expression changes of more than 2,000 genes involved in behavior, even for behavioral phenotypes on which it was not trained, suggesting that there is a core set of TFs that regulates behavior-specific gene expression in the bee brain, and other TFs more specific to particular categories. TFs playing key roles in the TRN include well-known regulators of neural and behavioral plasticity, e.g., Creb, as well as TFs better known in other biological contexts, e.g., NF-κB (immunity). Our results reveal three insights concerning the relationship between genes and behavior. First, distinct behaviors are subserved by distinct neurogenomic states in the brain. Second, the neurogenomic states underlying different behaviors rely upon both shared and distinct transcriptional modules. Third, despite the complexity of the brain, simple linear relationships between TFs and their putative target genes are a surprisingly prominent feature of the networks underlying behavior. PMID:21960440
Avian influenza H5N1 viral and bird migration networks in Asia

USGS Publications Warehouse

Tian, Huaiyu; Zhou, Sen; Dong, Lu; Van Boeckel, Thomas P.; Cui, Yujun; Newman, Scott H.; Takekawa, John Y.; Prosser, Diann J.; Xiao, Xiangming; Wu, Yarong; Cazelles, Bernard; Huang, Shanqian; Yang, Ruifu; Grenfell, Bryan T.; Xu, Bing

2015-01-01

The spatial spread of the highly pathogenic avian influenza virus H5N1 and its long-term persistence in Asia have resulted in avian influenza panzootics and enormous economic losses in the poultry sector. However, an understanding of the regional long-distance transmission and seasonal patterns of the virus is still lacking. In this study, we present a phylogeographic approach to reconstruct the viral migration network. We show that within each wild fowl migratory flyway, the timing of H5N1 outbreaks and viral migrations are closely associated, but little viral transmission was observed between the flyways. The bird migration network is shown to better reflect the observed viral gene sequence data than other networks and contributes to seasonal H5N1 epidemics in local regions and its large-scale transmission along flyways. These findings have potentially far-reaching consequences, improving our understanding of how bird migration drives the periodic reemergence of H5N1 in Asia.
Avian influenza H5N1 viral and bird migration networks in Asia

PubMed Central

Tian, Huaiyu; Zhou, Sen; Dong, Lu; Van Boeckel, Thomas P.; Cui, Yujun; Newman, Scott H.; Takekawa, John Y.; Prosser, Diann J.; Xiao, Xiangming; Wu, Yarong; Cazelles, Bernard; Huang, Shanqian; Yang, Ruifu; Grenfell, Bryan T.; Xu, Bing

2015-01-01

The spatial spread of the highly pathogenic avian influenza virus H5N1 and its long-term persistence in Asia have resulted in avian influenza panzootics and enormous economic losses in the poultry sector. However, an understanding of the regional long-distance transmission and seasonal patterns of the virus is still lacking. In this study, we present a phylogeographic approach to reconstruct the viral migration network. We show that within each wild fowl migratory flyway, the timing of H5N1 outbreaks and viral migrations are closely associated, but little viral transmission was observed between the flyways. The bird migration network is shown to better reflect the observed viral gene sequence data than other networks and contributes to seasonal H5N1 epidemics in local regions and its large-scale transmission along flyways. These findings have potentially far-reaching consequences, improving our understanding of how bird migration drives the periodic reemergence of H5N1 in Asia. PMID:25535385
From Gene Trees to a Dated Allopolyploid Network: Insights from the Angiosperm Genus Viola (Violaceae)

PubMed Central

Marcussen, Thomas; Heier, Lise; Brysting, Anne K.; Oxelman, Bengt; Jakobsen, Kjetill S.

2015-01-01

Allopolyploidization accounts for a significant fraction of speciation events in many eukaryotic lineages. However, existing phylogenetic and dating methods require tree-like topologies and are unable to handle the network-like phylogenetic relationships of lineages containing allopolyploids. No explicit framework has so far been established for evaluating competing network topologies, and few attempts have been made to date phylogenetic networks. We used a four-step approach to generate a dated polyploid species network for the cosmopolitan angiosperm genus Viola L. (Violaceae Batch.). The genus contains ca 600 species and both recent (neo-) and more ancient (meso-) polyploid lineages distributed over 16 sections. First, we obtained DNA sequences of three low-copy nuclear genes and one chloroplast region, from 42 species representing all 16 sections. Second, we obtained fossil-calibrated chronograms for each nuclear gene marker. Third, we determined the most parsimonious multilabeled genome tree and its corresponding network, resolved at the section (not the species) level. Reconstructing the “correct” network for a set of polyploids depends on recovering all homoeologs, i.e., all subgenomes, in these polyploids. Assuming the presence of Viola subgenome lineages that were not detected by the nuclear gene phylogenies (“ghost subgenome lineages”) significantly reduced the number of inferred polyploidization events. We identified the most parsimonious network topology from a set of five competing scenarios differing in the interpretation of homoeolog extinctions and lineage sorting, based on (i) fewest possible ghost subgenome lineages, (ii) fewest possible polyploidization events, and (iii) least possible deviation from expected ploidy as inferred from available chromosome counts of the involved polyploid taxa. Finally, we estimated the homoploid and polyploid speciation times of the most parsimonious network. 
Homoploid speciation times were estimated by coalescent analysis of gene tree node ages. Polyploid speciation times were estimated by comparing branch lengths and speciation rates of lineages with and without ploidy shifts. Our analyses recognize Viola as an old genus (crown age 31 Ma) whose evolutionary history has been profoundly affected by allopolyploidy. Between 16 and 21 allopolyploidizations are necessary to explain the diversification of the 16 major lineages (sections) of Viola, suggesting that allopolyploidy has accounted for a high percentage—between 67% and 88%—of the speciation events at this level. The theoretical and methodological approaches presented here for (i) constructing networks and (ii) dating speciation events within a network, have general applicability for phylogenetic studies of groups where allopolyploidization has occurred. They make explicit use of a hitherto underexplored source of ploidy information from chromosome counts to help resolve phylogenetic cases where incomplete sequence data hampers network inference. Importantly, the coalescent-based method used herein circumvents the assumption of tree-like evolution required by most techniques for dating speciation events. PMID:25281848
A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction.

PubMed

Kang, Eunhee; Min, Junhong; Ye, Jong Chul

2017-10-01

Due to the potential risk of inducing cancer, radiation exposure by X-ray CT devices should be reduced for routine patient scanning. However, in low-dose X-ray CT, severe artifacts typically occur due to photon starvation, beam hardening, and other causes, all of which decrease the reliability of the diagnosis. Thus, a high-quality reconstruction method from low-dose X-ray CT data has become a major research topic in the CT community. Conventional model-based de-noising approaches are, however, computationally very expensive, and image-domain de-noising approaches cannot readily remove CT-specific noise patterns. To tackle these problems, we want to develop a new low-dose X-ray CT algorithm based on a deep-learning approach. We propose an algorithm which uses a deep convolutional neural network (CNN) which is applied to the wavelet transform coefficients of low-dose CT images. More specifically, using a directional wavelet transform to extract the directional component of artifacts and exploit the intra- and inter- band correlations, our deep network can effectively suppress CT-specific noise. In addition, our CNN is designed with a residual learning architecture for faster network training and better performance. Experimental results confirm that the proposed algorithm effectively removes complex noise patterns from CT images derived from a reduced X-ray dose. In addition, we show that the wavelet-domain CNN is efficient when used to remove noise from low-dose CT compared to existing approaches. Our results were rigorously evaluated by several radiologists at the Mayo Clinic and won second place at the 2016 "Low-Dose CT Grand Challenge." To the best of our knowledge, this work is the first deep-learning architecture for low-dose CT reconstruction which has been rigorously evaluated and proven to be effective. 
In addition, the proposed algorithm, in contrast to existing model-based iterative reconstruction (MBIR) methods, has considerable potential to benefit from large data sets. Therefore, we believe that the proposed algorithm opens a new direction in the area of low-dose CT research. © 2017 American Association of Physicists in Medicine.
Graphite Web: web tool for gene set analysis exploiting pathway topology

PubMed Central

Sales, Gabriele; Calura, Enrica; Martini, Paolo; Romualdi, Chiara

2013-01-01

Graphite web is a novel web tool for pathway analyses and network visualization for gene expression data of both microarray and RNA-seq experiments. Several pathway analyses have been proposed either in the univariate or in the global and multivariate context to tackle the complexity and the interpretation of expression results. These methods can be further divided into ‘topological’ and ‘non-topological’ methods according to their ability to gain power from pathway topology. Biological pathways are, in fact, not only gene lists but can be represented through a network where genes and connections are, respectively, nodes and edges. To this day, the most used approaches are non-topological and univariate, although they miss the relationships among genes. In contrast, topological and multivariate approaches are more powerful, but difficult for researchers without bioinformatic skills to use. Here we present Graphite web, the first public web server for pathway analysis on gene expression data that combines topological and multivariate pathway analyses with an efficient system of interactive network visualizations for easy results interpretation. Specifically, Graphite web implements five different gene set analyses on three model organisms and two pathway databases. Graphite Web is freely available at http://graphiteweb.bio.unipd.it/. PMID:23666626
The Roland Maze Project school-based extensive air shower network

NASA Astrophysics Data System (ADS)

Feder, J.; Jȩdrzejczak, K.; Karczmarczyk, J.; Lewandowski, R.; Swarzyński, J.; Szabelska, B.; Szabelski, J.; Wibig, T.

2006-01-01

We plan to construct a large-area network of extensive air shower detectors placed on the roofs of high school buildings in the city of Łódź. Detection points will be connected via the Internet to the central server and their work will be synchronized by GPS. The main scientific goal of the project is the study of ultra high energy cosmic rays. Using existing town infrastructure (Internet, power supply, etc.) will significantly reduce the cost of the experiment. Engaging high school students in the research program should significantly increase their knowledge of science and modern technologies, and can be a very efficient way of science popularisation. We performed simulations of the projected network's capabilities for registering Extensive Air Showers and reconstructing energies of primary particles. Results of the simulations and the current status of project realisation will be presented.
Flux balance analysis of primary metabolism in Chlamydomonas reinhardtii.

PubMed

Boyle, Nanette R; Morgan, John A

2009-01-07

Photosynthetic organisms convert atmospheric carbon dioxide into numerous metabolites along the pathways to make new biomass. Aquatic photosynthetic organisms, which fix almost half of global inorganic carbon, have great potential: as a carbon dioxide fixation method, for the economical production of chemicals, or as a source for lipids and starch which can then be converted to biofuels. To harness this potential through metabolic engineering and to maximize production, a more thorough understanding of photosynthetic metabolism must first be achieved. A model algal species, C. reinhardtii, was chosen and the metabolic network reconstructed. Intracellular fluxes were then calculated using flux balance analysis (FBA). The metabolic network of primary metabolism for a green alga, C. reinhardtii, was reconstructed using genomic and biochemical information. The reconstructed network accounts for the intracellular localization of enzymes to three compartments and includes 484 metabolic reactions and 458 intracellular metabolites. Based on BLAST searches, one newly annotated enzyme (fructose-1,6-bisphosphatase) was added to the Chlamydomonas reinhardtii database. FBA was used to predict metabolic fluxes under three growth conditions, autotrophic, heterotrophic and mixotrophic growth. Biomass yields ranged from 28.9 g per mole C for autotrophic growth to 15 g per mole C for heterotrophic growth. The flux balance analysis model of central and intermediary metabolism in C. reinhardtii is the first such model for algae and the first model to include three metabolically active compartments. In addition to providing estimates of intracellular fluxes, metabolic reconstruction and modelling efforts also provide a comprehensive method for annotation of genome databases. As a result of our reconstruction, one new enzyme was annotated in the database and several others were found to be missing; implying new pathways or non-conserved enzymes. 
The use of FBA to estimate intracellular fluxes also provides flux values that can be used as a starting point for rational engineering of C. reinhardtii. From these initial estimates, it is clear that aerobic heterotrophic growth on acetate has a low yield on carbon, while mixotrophically and autotrophically grown cells are significantly more carbon efficient.
A multispecies tree ring reconstruction of Potomac River streamflow (950-2001)

NASA Astrophysics Data System (ADS)

Maxwell, R. Stockton; Hessl, Amy E.; Cook, Edward R.; Pederson, Neil

2011-05-01

Mean May-September Potomac River streamflow was reconstructed from 950-2001 using a network of tree ring chronologies (n = 27) representing multiple species. We chose a nested principal components reconstruction method to maximize use of available chronologies backward in time. Explained variance during the period of calibration ranged from 20% to 53% depending on the number and species of chronologies available in each 25 year time step. The model was verified by two goodness of fit tests, the coefficient of efficiency (CE) and the reduction of error statistic (RE). The RE and CE never fell below zero, suggesting the model had explanatory power over the entire period of reconstruction. Beta weights indicated a loss of explained variance during the 1550-1700 period that we hypothesize was caused by the reduction in total number of predictor chronologies and loss of important predictor species. Thus, the reconstruction is strongest from 1700-2001. Frequency, intensity, and duration of drought and pluvial events were examined to aid water resource managers. We found that the instrumental period did not represent adequately the full range of annual to multidecadal variability present in the reconstruction. Our reconstruction of mean May-September Potomac River streamflow was a significant improvement over the Cook and Jacoby (1983) reconstruction because it expanded the seasonal window, lengthened the record by 780 years, and better replicated the mean and variance of the instrumental record. By capitalizing on variable phenologies and tree growth responses to climate, multispecies reconstructions may provide significantly more information about past hydroclimate, especially in regions with low aridity and high tree species diversity.
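The two verification statistics mentioned above can be written compactly. The sketch below follows the usual dendroclimatological definitions: both compare the squared error of the reconstruction against that of a constant reference, with RE using the calibration-period mean and CE the verification-period mean. The function names and example numbers are assumptions, not values from the study.

```python
def reduction_of_error(observed, predicted, calibration_mean):
    """RE = 1 - SSE / sum((obs - calibration-period mean)^2) over the verification period."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    reference = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / reference


def coefficient_of_efficiency(observed, predicted):
    """CE uses the verification-period mean of the observations as the reference."""
    verification_mean = sum(observed) / len(observed)
    return reduction_of_error(observed, predicted, verification_mean)


# Hypothetical verification-period flows, invented for this example.
obs = [10.0, 12.0, 14.0, 16.0]
pred = [11.0, 12.0, 13.0, 17.0]

print(reduction_of_error(obs, pred, calibration_mean=11.0))
print(coefficient_of_efficiency(obs, pred))
```

Values above zero indicate that the reconstruction predicts better than the corresponding mean. CE is the stricter of the two, since the verification-period mean minimizes the reference sum of squares.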

Microbial Community Metabolic Modeling: A Community Data-Driven Network Reconstruction: COMMUNITY DATA-DRIVEN METABOLIC NETWORK MODELING

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henry, Christopher S.; Bernstein, Hans C.; Weisenhorn, Pamela

Metabolic network modeling of microbial communities provides an in-depth understanding of community-wide metabolic and regulatory processes. Compared to single organism analyses, community metabolic network modeling is more complex because it needs to account for interspecies interactions. To date, most approaches focus on reconstruction of high-quality individual networks so that, when combined, they can predict community behaviors as a result of interspecies interactions. However, this conventional method becomes ineffective for communities whose members are not well characterized and cannot be experimentally interrogated in isolation. Here, we tested a new approach that uses community-level data as a critical input for the network reconstruction process. This method focuses on directly predicting interspecies metabolic interactions in a community, when axenic information is insufficient. We validated our method through the case study of a bacterial photoautotroph-heterotroph consortium that was used to provide data needed for a community-level metabolic network reconstruction. Resulting simulations provided experimentally validated predictions of how a photoautotrophic cyanobacterium supports the growth of an obligate heterotrophic species by providing organic carbon and nitrogen sources.
Stochasticity versus determinism: consequences for realistic gene regulatory network modelling and evolution.

PubMed

Jenkins, Dafyd J; Stekel, Dov J

2010-02-01

Gene regulation is one important mechanism in producing observed phenotypes and heterogeneity. Consequently, the study of gene regulatory network (GRN) architecture, function and evolution now forms a major part of modern biology. However, it is impossible to experimentally observe the evolution of GRNs on the timescales on which living species evolve. In silico evolution provides an approach to studying the long-term evolution of GRNs, but many models have either considered network architecture from non-adaptive evolution, or evolution to non-biological objectives. Here, we address a number of important modelling and biological questions about the evolution of GRNs to the realistic goal of biomass production. Can different commonly used simulation paradigms, in particular deterministic and stochastic Boolean networks, with and without basal gene expression, be used to compare adaptive with non-adaptive evolution of GRNs? Are these paradigms together with this goal sufficient to generate a range of solutions? Will the interaction between a biological goal and evolutionary dynamics produce trade-offs between growth and mutational robustness? We show that stochastic basal gene expression forces shrinkage of genomes due to energetic constraints and is a prerequisite for some solutions. In systems that are able to evolve rates of basal expression, two optima, one with and one without basal expression, are observed. Simulation paradigms without basal expression generate bloated networks with non-functional elements. Further, a range of functional solutions was observed under identical conditions only in stochastic networks. Moreover, there are trade-offs between efficiency and yield, indicating an inherent intertwining of fitness and evolutionary dynamics.
Reconstruction of a Real World Social Network using the Potts Model and Loopy Belief Propagation

PubMed Central

Bisconti, Cristian; Corallo, Angelo; Fortunato, Laura; Gentile, Antonio A.; Massafra, Andrea; Pellè, Piergiuseppe

2015-01-01

The scope of this paper is to test the adoption of a statistical model derived from Condensed Matter Physics, for the reconstruction of the structure of a social network. The inverse Potts model, traditionally applied to recursive observations of quantum states in an ensemble of particles, is here addressed to observations of the members' states in an organization and their (anti)correlations, thus inferring interactions as links among the members. Adopting proper (Bethe) approximations, such an inverse problem is shown to be tractable. Within an operational framework, this network-reconstruction method is tested for a small real-world social network, the Italian parliament. In this case study, it is easy to track statuses of the parliament members, using (co)sponsorships of law proposals as the initial dataset. In previous studies of similar activity-based networks, the graph structure was inferred directly from activity co-occurrences: here we compare our statistical reconstruction with such standard methods, outlining discrepancies and advantages. PMID:26617539
Integrating mitosis, toxicity, and transgene expression in a telecommunications packet-switched network model of lipoplex-mediated gene delivery.

PubMed

Martin, Timothy M; Wysocki, Beata J; Beyersdorf, Jared P; Wysocki, Tadeusz A; Pannier, Angela K

2014-08-01

Gene delivery systems transport exogenous genetic information to cells or biological systems with the potential to directly alter endogenous gene expression and behavior with applications in functional genomics, tissue engineering, medical devices, and gene therapy. Nonviral systems offer advantages over viral systems because of their low immunogenicity, inexpensive synthesis, and easy modification but suffer from lower transfection levels. The representation of gene transfer using models offers perspective and interpretation of complex cellular mechanisms, including nonviral gene delivery where exact mechanisms are unknown. Here, we introduce a novel telecommunications model of the nonviral gene delivery process in which the delivery of the gene to a cell is synonymous with delivery of a packet of information to a destination computer within a packet-switched computer network. Such a model uses nodes and layers to simplify the complexity of modeling the transfection process and to overcome several challenges of existing models. These challenges include a limited scope and limited time frame, which often do not incorporate biological effects known to affect transfection. The telecommunication model was constructed in MATLAB to model lipoplex delivery of the gene encoding the green fluorescent protein to HeLa cells. Mitosis and toxicity events were included in the model resulting in simulation outputs of nuclear internalization and transfection efficiency that correlated with experimental data. A priori predictions based on model sensitivity analysis suggest that increasing endosomal escape and decreasing lysosomal degradation, protein degradation, and GFP-induced toxicity can improve transfection efficiency by three-fold. Application of the telecommunications model to nonviral gene delivery offers insight into the development of new gene delivery systems with therapeutically relevant transfection levels.
A support vector machine based test for incongruence between sets of trees in tree space

PubMed Central

2012-01-01

Background The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees significantly deviates from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Results Motivated by this problem we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two sets of pre-defined trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also include tree reconstruction into its test framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. Conclusions The non-parametric nature of our statistical test provides fast and efficient analyses, and makes it an applicable test for any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions. The software GeneOut is freely available under the GNU public license. PMID:22909268
Developing integrated crop knowledge networks to advance candidate gene discovery.

PubMed

Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

2016-12-01

The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
Experimental Definition and Validation of Protein Coding Transcripts in Chlamydomonas reinhardtii

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kourosh Salehi-Ashtiani; Jason A. Papin

Algal fuel sources promise unsurpassed yields in a carbon neutral manner that minimizes resource competition between agriculture and fuel crops. Many challenges must be addressed before algal biofuels can be accepted as a component of the fossil fuel replacement strategy. One significant challenge is that the cost of algal fuel production must become competitive with existing fuel alternatives. Algal biofuel production presents the opportunity to fine-tune microbial metabolic machinery for an optimal blend of biomass constituents and desired fuel molecules. Genome-scale model-driven algal metabolic design promises to facilitate both goals by directing the utilization of metabolites in the complex, interconnected metabolic networks to optimize production of the compounds of interest. Using Chlamydomonas reinhardtii as a model, we developed a systems-level methodology bridging metabolic network reconstruction with annotation and experimental verification of enzyme encoding open reading frames. We reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology. Our approach to generate a predictive metabolic model integrated with cloned open reading frames provides a cost-effective platform to generate metabolic engineering resources. While the generated resources are specific to algal systems, the approach that we have developed is not specific to algae and can be readily expanded to other microbial systems as well as higher plants and animals.
Constructing networks with correlation maximization methods.

PubMed

Mellor, Joseph C; Wu, Jie; Delisi, Charles

2004-01-01

Problems of inference in systems biology are ideally reduced to formulations which can efficiently represent the features of interest. In the case of predicting gene regulation and pathway networks, an important feature which describes connected genes and proteins is the relationship between active and inactive forms, i.e. between the "on" and "off" states of the components. While not optimal at the limits of resolution, these logical relationships between discrete states can often yield good approximations of the behavior in larger complex systems, where exact representation of measurement relationships may be intractable. We explore techniques for extracting binary state variables from measurement of gene expression, and go on to describe robust measures for statistical significance and information that can be applied to many such types of data. We show how statistical strength and information are equivalent criteria in limiting cases, and demonstrate the application of these measures to simple systems of gene regulation.
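As a rough illustration of the idea of extracting binary "on"/"off" states from expression measurements and scoring the information shared between two genes, the sketch below binarizes each profile at its median and computes the mutual information of the resulting binary vectors. The median threshold, the function names, and the toy profiles are assumptions made for this example; the abstract does not specify the authors' binarization rule or their exact significance and information measures.

```python
import math
from collections import Counter


def binarize(values):
    """Map each measurement to 1 ("on") if it exceeds the profile median, else 0 ("off")."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2:
        median = ordered[n // 2]
    else:
        median = 0.5 * (ordered[n // 2 - 1] + ordered[n // 2])
    return [1 if v > median else 0 for v in values]


def mutual_information(x, y):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


# Hypothetical expression profiles across four conditions.
gene_a = [1.0, 5.0, 2.0, 6.0]
gene_b = [0.5, 4.0, 1.0, 7.0]   # switches on in the same conditions as gene_a
gene_c = [1.0, 2.0, 5.0, 6.0]   # on/off pattern unrelated to gene_a

mi_ab = mutual_information(binarize(gene_a), binarize(gene_b))
mi_ac = mutual_information(binarize(gene_a), binarize(gene_c))
print(mi_ab, mi_ac)
```

With only four conditions the estimate is far too noisy for real inference; the point is only to show how discrete states turn a continuous comparison into a small contingency-table calculation.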
Reconstructing dynamic molecular states from single-cell time series.

PubMed

Huang, Lirong; Pauleve, Loic; Zechner, Christoph; Unger, Michael; Hansen, Anders S; Koeppl, Heinz

2016-09-01

The notion of state for a system is prevalent in the quantitative sciences and refers to the minimal system summary sufficient to describe the time evolution of the system in a self-consistent manner. This is a prerequisite for a principled understanding of the inner workings of a system. Owing to the complexity of intracellular processes, experimental techniques that can retrieve a sufficient summary are beyond our reach. For the case of stochastic biomolecular reaction networks, we show how to convert the partial state information accessible by experimental techniques into a full system state using mathematical analysis together with a computational model. This is intimately related to the notion of conditional Markov processes and we introduce the posterior master equation and derive novel approximations to the corresponding infinite-dimensional posterior moment dynamics. We exemplify this state reconstruction approach using both in silico data and single-cell data from two gene expression systems in Saccharomyces cerevisiae, where we reconstruct the dynamic promoter and mRNA states from noisy protein abundance measurements. © 2016 The Author(s).
Energy-efficient ECG compression on wireless biosensors via minimal coherence sensing and weighted ℓ₁ minimization reconstruction.

PubMed

Zhang, Jun; Gu, Zhenghui; Yu, Zhu Liang; Li, Yuanqing

2015-03-01

Low energy consumption is crucial for body area networks (BANs). In BAN-enabled ECG monitoring, continuous monitoring requires the sensor nodes to transmit a huge amount of data to the sink node, which leads to excessive energy consumption. To reduce airtime over energy-hungry wireless links, this paper presents an energy-efficient compressed sensing (CS)-based approach for on-node ECG compression. First, an algorithm called minimal mutual coherence pursuit is proposed to construct sparse binary measurement matrices, which can be used to encode the ECG signals with superior performance and extremely low complexity. Second, in order to minimize the data rate required for faithful reconstruction, a weighted ℓ1 minimization model is derived by exploring the multisource prior knowledge in the wavelet domain. Experimental results on the MIT-BIH arrhythmia database reveal that the proposed approach can obtain a higher compression ratio than the state-of-the-art CS-based methods. Together with its low encoding complexity, our approach can achieve significant energy saving in both the encoding process and wireless transmission.
Fiber Orientation Estimation Guided by a Deep Network.

PubMed

Ye, Chuyang; Prince, Jerry L

2017-09-01

Diffusion magnetic resonance imaging (dMRI) is currently the only tool for noninvasively imaging the brain's white matter tracts. The fiber orientation (FO) is a key feature computed from dMRI for tract reconstruction. Because the number of FOs in a voxel is usually small, dictionary-based sparse reconstruction has been used to estimate FOs. However, accurate estimation of complex FO configurations in the presence of noise can still be challenging. In this work we explore the use of a deep network for FO estimation in a dictionary-based framework and propose an algorithm named Fiber Orientation Reconstruction guided by a Deep Network (FORDN). FORDN consists of two steps. First, we use a smaller dictionary encoding coarse basis FOs to represent diffusion signals. To estimate the mixture fractions of the dictionary atoms, a deep network is designed to solve the sparse reconstruction problem. Second, the coarse FOs inform the final FO estimation, where a larger dictionary encoding a dense basis of FOs is used and a weighted ℓ1-norm regularized least squares problem is solved to encourage FOs that are consistent with the network output. FORDN was evaluated and compared with state-of-the-art algorithms that estimate FOs using sparse reconstruction on simulated and typical clinical dMRI data. The results demonstrate the benefit of using a deep network for FO estimation.
A low-rank matrix recovery approach for energy efficient EEG acquisition for a wireless body area network.

PubMed

Majumdar, Angshul; Gogna, Anupriya; Ward, Rabab

2014-08-25

We address the problem of acquiring and transmitting EEG signals in Wireless Body Area Networks (WBAN) in an energy efficient fashion. In WBANs, the energy is consumed by three operations: sensing (sampling), processing and transmission. Previous studies only addressed the problem of reducing the transmission energy. For the first time, in this work, we propose a technique to reduce sensing and processing energy as well: this is achieved by randomly under-sampling the EEG signal. We depart from previous Compressed Sensing based approaches and formulate signal recovery (from under-sampled measurements) as a matrix completion problem. A new algorithm to solve the matrix completion problem is derived here. We test our proposed method and find that the reconstruction accuracy of our method is significantly better than state-of-the-art techniques; and we achieve this while saving sensing, processing and transmission energy. Simple power analysis shows that our proposed methodology consumes considerably less power compared to previous CS based techniques.
Multiconstrained gene clustering based on generalized projections

PubMed Central

2010-01-01

Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints for an optimal clustering solution remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different nature without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence in more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints of different nature for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
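The core POCS idea, repeatedly projecting a candidate solution onto each constraint set until it lies in their intersection, can be sketched on a toy problem. The two sets used here (a unit box and the hyperplane where entries sum to one, loosely resembling a soft cluster-membership vector) and all function names are assumptions chosen for illustration; they are not the constraint sets or the generalized projector defined in the paper.

```python
def project_onto_box(x, low=0.0, high=1.0):
    """Euclidean projection onto the box [low, high]^n: clip each coordinate."""
    return [min(max(v, low), high) for v in x]


def project_onto_sum_hyperplane(x, total=1.0):
    """Euclidean projection onto {x : sum(x) == total}: shift all coordinates equally."""
    shift = (sum(x) - total) / len(x)
    return [v - shift for v in x]


def pocs(x, iterations=200):
    """Alternate the two projections; for closed convex sets with a non-empty
    intersection the iterates converge to a point in that intersection."""
    for _ in range(iterations):
        x = project_onto_box(x)
        x = project_onto_sum_hyperplane(x)
    return x


# Hypothetical starting point that violates both constraints.
start = [2.0, -1.0, 0.5]
solution = pocs(start)
print(solution)
```

Each constraint only needs its own projection operator, which is why the framework lends itself to combining heterogeneous constraints: adding a constraint means adding one more projection step to the loop.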
Factors Determining the Efficiency of Porcine Somatic Cell Nuclear Transfer: Data Analysis with Over 200,000 Reconstructed Embryos

PubMed Central

Liu, Tianbin; Dou, Hongwei; Xiang, Xi; Li, Yong; Pang, Xinzhi; Zhang, Yijie; Chen, Yu; Luan, Jing; Xu, Ying; Yang, Zhenzhen; Yang, Wenxian; Liu, Huan; Li, Feida; Wang, Hui; Yang, Huanming; Bolund, Lars; Vajta, Gabor

2015-01-01

Data analysis in somatic cell nuclear transfer (SCNT) research is usually limited to several hundreds or thousands of reconstructed embryos. Here, we report mass results obtained with an established and consistent porcine SCNT system (handmade cloning [HMC]). During the experimental period, 228,230 reconstructed embryos and 82,969 blastocysts were produced. After being transferred into 656 recipients, 1070 piglets were obtained. First, the effects of different types of donor cells, including fetal fibroblasts (FFs), adult fibroblasts (AFs), adult preadipocytes (APs), and adult blood mesenchymal (BM) cells, were investigated on the further in vitro and in vivo development. Compared to adult donor cells (AFs, APs, BM cells, respectively), FF cells resulted in a lower blastocyst/reconstructed embryo rate (30.38% vs. 37.94%, 34.65%, and 34.87%, respectively), but a higher overall efficiency on the number of piglets born alive per total blastocysts transferred (1.50% vs. 0.86%, 1.03%, and 0.91%, respectively) and a lower rate of developmental abnormalities (10.87% vs. 56.57%, 24.39%, and 51.85%, respectively). Second, recloning was performed with cloned adult fibroblasts (CAFs) and cloned fetal fibroblasts (CFFs). When CAFs were used as the nuclear donor, fewer developmental abnormalities and higher overall efficiency were observed compared to AFs (56.57% vs. 28.13% and 0.86% vs. 1.59%, respectively). However, CFFs had an opposite effect on these parameters when compared with CAFs (94.12% vs. 10.87% and 0.31% vs. 1.50%, respectively). Third, effects of genetic modification on the efficiency of SCNT were investigated with transgenic fetal fibroblasts (TFFs) and gene knockout fetal fibroblasts (KOFFs). Genetic modification of FFs increased developmental abnormalities (38.96% and 25.24% vs. 10.87% for KOFFs, TFFs, and FFs, respectively). KOFFs resulted in lower overall efficiency compared to TFFs and FFs (0.68% vs. 1.62% and 1.50%, respectively). In conclusion, this is the first report of large-scale analysis of porcine cell nuclear transfer that provides important data for potential industrialization of HMC technology. PMID:26655078
Link prediction boosted psychiatry disorder classification for functional connectivity network

NASA Astrophysics Data System (ADS)

Li, Weiwei; Mei, Xue; Wang, Hao; Zhou, Yu; Huang, Jiashuang

2017-02-01

The functional connectivity network (FCN) is an effective tool for classifying psychiatric disorders and represents the cross-correlation of regional blood oxygenation level dependent signals. However, an FCN is often incomplete because it suffers from missing and spurious edges. To accurately classify psychiatric disorders and healthy controls with an incomplete FCN, we first 'repair' the FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers to improve classification accuracy. Our method was tested on three psychiatric disorder datasets, covering Alzheimer's Disease, Schizophrenia and Attention Deficit Hyperactivity Disorder. The experimental results show that our method not only significantly improves the classification accuracy, but also efficiently reconstructs the incomplete FCN.
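The clustering-coefficient features mentioned in this abstract can be illustrated with the standard local clustering coefficient of an unweighted, undirected graph: the fraction of a node's neighbor pairs that are themselves connected. The adjacency-dictionary representation, the function names, and the four-node toy graph are assumptions for this sketch; the abstract does not say which variant of the coefficient (for example, a weighted one) the authors used.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def local_clustering(adjacency, node):
    """Fraction of neighbor pairs of `node` that are directly connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to close into triangles
    closed = sum(1 for u in neighbors for v in neighbors
                 if u < v and v in adjacency[u])
    return 2.0 * closed / (k * (k - 1))


# Hypothetical thresholded connectivity graph over four brain regions.
graph = build_adjacency([(0, 1), (0, 2), (1, 2), (2, 3)])
features = [local_clustering(graph, node) for node in sorted(graph)]
print(features)
```

The resulting per-region vector is the kind of feature a weak classifier could be trained on; any link-prediction "repair" step would change the edge list before these features are computed.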
Dynamic regulation of genetic pathways and targets during aging in Caenorhabditis elegans.

PubMed

He, Kan; Zhou, Tao; Shao, Jiaofang; Ren, Xiaoliang; Zhao, Zhongying; Liu, Dahai

2014-03-01

Numerous genetic targets and some individual pathways associated with aging have been identified using the worm model. However, less is known about the genome-wide genetic mechanisms of aging, particularly at the level of multiple pathways and of the regulatory networks active during aging. Here, we used gene expression datasets from three time points during aging in Caenorhabditis elegans (C. elegans) and applied gene set enrichment analysis (GSEA) to each dataset between adjacent stages. As a result, multiple genetic pathways and targets were identified as significantly down- or up-regulated. Among them, 5 truly aging-dependent signaling pathways (the MAPK, mTOR, Wnt, TGF-beta and ErbB signaling pathways) as well as 12 significantly associated genes were identified with dynamic expression patterns during aging. On the other hand, the continued declines in the regulation of several metabolic pathways have been demonstrated to display age-related changes. Furthermore, the regulatory networks reconstructed from three aging-related chromatin immunoprecipitation followed by sequencing (ChIP-seq) datasets and the expression matrices of the 154 genes involved in the above signaling pathways provide new insights into aging at the multiple-pathway level. The combination of multiple genetic pathways and targets needs to be taken into consideration in future studies of aging, in which the dynamic regulation would be uncovered.
From Corynebacterium glutamicum to Mycobacterium tuberculosis—towards transfers of gene regulatory networks and integrated data analyses with MycoRegNet

PubMed Central

Krawczyk, Justina; Kohl, Thomas A.; Goesmann, Alexander; Kalinowski, Jörn; Baumbach, Jan

2009-01-01

Every year, approximately two million people die from tuberculosis, a disease caused by the bacterium Mycobacterium tuberculosis. There is a tremendous need for new anti-tuberculosis therapies (antituberculotica) and drugs to cope with the spread of tuberculosis. Despite many efforts to obtain a better understanding of M. tuberculosis' pathogenicity and its survival strategy in humans, many questions are still unresolved. Among other cellular processes in bacteria, pathogenicity is controlled by transcriptional regulation. Thus, various studies on M. tuberculosis concentrate on the analysis of transcriptional regulation in order to gain new insights into pathogenicity and other essential processes ensuring mycobacterial survival. We designed a bioinformatics pipeline for the reliable transfer of gene regulations between taxonomically closely related organisms that incorporates (i) a prediction of orthologous genes and (ii) the prediction of transcription factor binding sites. In total, 460 regulatory interactions were identified for M. tuberculosis using our comparative approach. Based on that, we designed a publicly available platform that aims at data integration, analysis, visualization and finally the reconstruction of mycobacterial transcriptional gene regulatory networks: MycoRegNet. It is a comprehensive database system and analysis platform that offers several methods for data exploration and the generation of novel hypotheses. MycoRegNet is publicly available at http://mycoregnet.cebitec.uni-bielefeld.de. PMID:19494184
Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities

PubMed Central

Fang, Xin; Sastry, Anand; Mih, Nathan; Kim, Donghyuk; Tan, Justin; Lloyd, Colton J.; Gao, Ye; Yang, Laurence; Palsson, Bernhard O.

2017-01-01

Transcriptional regulatory networks (TRNs) have been studied intensely for >25 y. Yet, even for the Escherichia coli TRN—probably the best characterized TRN—several questions remain. Here, we address three questions: (i) How complete is our knowledge of the E. coli TRN; (ii) how well can we predict gene expression using this TRN; and (iii) how robust is our understanding of the TRN? First, we reconstructed a high-confidence TRN (hiTRN) consisting of 147 transcription factors (TFs) regulating 1,538 transcription units (TUs) encoding 1,764 genes. The 3,797 high-confidence regulatory interactions were collected from published, validated chromatin immunoprecipitation (ChIP) data and RegulonDB. For 21 different TF knockouts, up to 63% of the differentially expressed genes in the hiTRN were traced to the knocked-out TF through regulatory cascades. Second, we trained supervised machine learning algorithms to predict the expression of 1,364 TUs given TF activities using 441 samples. The algorithms accurately predicted condition-specific expression for 86% (1,174 of 1,364) of the TUs, while 193 TUs (14%) were predicted better than random TRNs. Third, we identified 10 regulatory modules whose definitions were robust against changes to the TRN or expression compendium. Using surrogate variable analysis, we also identified three unmodeled factors that systematically influenced gene expression. Our computational workflow comprehensively characterizes the predictive capabilities and systems-level functions of an organism’s TRN from disparate data types. PMID:28874552
Genetic Network Programming with Reconstructed Individuals

NASA Astrophysics Data System (ADS)

Ye, Fengming; Mabu, Shingo; Wang, Lutao; Eto, Shinji; Hirasawa, Kotaro

A lot of research on evolutionary computation has been done, and some significant classical methods such as Genetic Algorithm (GA), Genetic Programming (GP), Evolutionary Programming (EP), and Evolution Strategies (ES) have been studied. Recently, a new approach named Genetic Network Programming (GNP) has been proposed. GNP can evolve itself and find the optimal solution. It is based on the idea of Genetic Algorithm and uses the data structure of directed graphs. Many papers have demonstrated that GNP can deal with complex problems in dynamic environments very efficiently and effectively. As a result, GNP has recently been attracting more and more attention and is used in many different areas such as data mining, extracting trading rules of stock markets, elevator supervised control systems, etc., and GNP has obtained some outstanding results. Focusing on GNP's distinguished expression ability of the graph structure, this paper proposes a method named Genetic Network Programming with Reconstructed Individuals (GNP-RI). The aim of GNP-RI is to balance the exploitation and exploration of GNP, that is, to strengthen the exploitation ability by using the exploited information extensively during the evolution process of GNP and finally obtain better performance than that of GNP. In the proposed method, the worse individuals are reconstructed and enhanced by the elite information before undergoing genetic operations (mutation and crossover). The enhancement of worse individuals mimics the maturing phenomenon in nature, where bad individuals can become smarter after receiving a good education. In this paper, GNP-RI is applied to the tile-world problem, which is an excellent benchmark for evaluating the proposed architecture. The performance of GNP-RI is compared with that of the conventional GNP. The simulation results show some advantages of GNP-RI, demonstrating its superiority over the conventional GNP.

Scriptaid and 5-aza-2'deoxycytidine enhanced expression of pluripotent genes and in vitro developmental competence in interspecies Black-footed cat cloned embryos

USGS Publications Warehouse

Gómez, M. C.; Biancardi, M.N.; Jenkins, J.A.; Dumas, C.; Galiguis, J.; Wang, G.; Earle Pope, C.

2012-01-01

Somatic cell nuclear transfer offers the possibility of preserving endangered species including the black-footed cat, which is threatened with extinction. The effectiveness and efficiency of somatic cell nuclear transfer (SCNT) depend on a variety of factors, but inappropriate epigenetic reprogramming of the transplanted nucleus is the primary cause of the developmental failure of cloned embryos. Abnormal epigenetic events such as DNA methylation and histone modifications during SCNT perturb the expression of imprinted and pluripotent-related genes that, consequently, may result in foetal and neonatal abnormalities. We have demonstrated that pregnancies can be established after transfer of black-footed cat cloned embryos into domestic cat recipients, but none of the implanted embryos developed to term and the foetal failure has been associated with aberrant reprogramming in cloned embryos. There is growing evidence that modifying the epigenetic pattern of the chromatin template of both donor cells and reconstructed embryos with a combination of inhibitors of histone deacetylases and DNA methyltransferases results in enhanced gene reactivation and improved in vitro and in vivo developmental competence. Epigenetic modifications of the chromatin template of black-footed cat donor cells and reconstructed embryos with epigenetic-modifying compounds enhanced in vitro development, and regulated the expression of pluripotent genes, but these epigenetic modifications did not improve in vivo developmental competence.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Deng, Ye; Zhang, Ping; Qin, Yujia

Discerning network interactions among different species/populations in microbial communities has attracted interest in recent years, but little information is available about temporal dynamics of microbial network interactions in response to environmental perturbations. We modified the random matrix theory-based network approach to discern network succession in groundwater microbial communities in response to emulsified vegetable oil (EVO) amendment for uranium bioremediation. Groundwater microbial communities from one control and seven monitor wells were analysed with a functional gene array (GeoChip 3.0), and functional molecular ecological networks (fMENs) at different time points were reconstructed. Our results showed that the network interactions were dramatically altered by EVO amendment. Dynamic and resilient succession was evident: fairly simple at the initial stage (Day 0), increasingly complex at the middle period (Days 4, 17, 31), most complex at Day 80, and then decreasingly complex at a later stage (140–269 days). Unlike previous studies in other habitats, negative interactions predominated in a time-series fMEN, suggesting strong competition among different microbial species in the groundwater systems after EVO injection. In particular, several keystone sulfate-reducing bacteria showed strong negative interactions with their network neighbours. These results provide mechanistic understanding of the decreased phylogenetic diversity during environmental perturbations.
DEEP--a tool for differential expression effector prediction.

PubMed

Degenhardt, Jost; Haubrock, Martin; Dönitz, Jürgen; Wingender, Edgar; Crass, Torsten

2007-07-01

High-throughput methods for measuring transcript abundance, like SAGE or microarrays, are widely used for determining differences in gene expression between different tissue types, dignities (normal/malignant) or time points. Further analysis of such data frequently aims at the identification of gene interaction networks that form the causal basis for the observed properties of the systems under examination. To this end, it is usually not sufficient to rely on the measured gene expression levels alone; rather, additional biological knowledge has to be taken into account in order to generate useful hypotheses about the molecular mechanism leading to the realization of a certain phenotype. We present a method that combines gene expression data with biological expert knowledge on molecular interaction networks, as described by the TRANSPATH database on signal transduction, to predict additional--and not necessarily differentially expressed--genes or gene products which might participate in processes specific for either of the examined tissues or conditions. In a first step, significance values for over-expression in tissue/condition A or B are assigned to all genes in the expression data set. Genes with a significance value exceeding a certain threshold are used as starting points for the reconstruction of a graph with signaling components as nodes and signaling events as edges. In a subsequent graph traversal process, again starting from the previously identified differentially expressed genes, all encountered nodes 'inherit' all their starting nodes' significance values. In a final step, the graph is visualized, the nodes being colored according to a weighted average of their inherited significance values. 
Each node's, or sub-network's, predominant color, ranging from green (significant for tissue/condition A) over yellow (not significant for either tissue/condition) to red (significant for tissue/condition B), thus gives an immediate visual clue on which molecules--differentially expressed or not--may play pivotal roles in the tissues or conditions under examination. The described method has been implemented in Java as a client/server application and a web interface called DEEP (Differential Expression Effector Prediction). The client, which features an easy-to-use graphical interface, can freely be downloaded from the following URL: http://deep.bioinf.med.uni-goettingen.de.
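The traversal-and-colouring scheme described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the DEEP implementation (which the abstract says is written in Java): the signed score convention (+1 for condition A, -1 for condition B), the distance-based weight 1/(1 + distance) and the colour threshold are all assumptions, since the abstract does not specify how the weighted average is formed.

```python
from collections import deque


def inherit_significance(graph, seeds):
    """Propagate seed scores through a directed graph.

    graph: dict node -> list of successor nodes (signaling events as edges).
    seeds: dict seed node -> signed significance score (assumed convention:
        positive = condition A, negative = condition B).
    Returns dict node -> weighted average of all inherited seed scores, with the
    assumed weight 1 / (1 + distance from the seed).
    """
    inherited = {}  # node -> list of (score, weight)
    for seed, score in seeds.items():
        distance = {seed: 0}
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            weight = 1.0 / (1 + distance[node])
            inherited.setdefault(node, []).append((score, weight))
            for successor in graph.get(node, ()):
                if successor not in distance:
                    distance[successor] = distance[node] + 1
                    queue.append(successor)
    return {
        node: sum(s * w for s, w in pairs) / sum(w for _, w in pairs)
        for node, pairs in inherited.items()
    }


def colour(score, threshold=0.3):
    """Map a score to the green / yellow / red scale (threshold is hypothetical)."""
    if score >= threshold:
        return "green"
    if score <= -threshold:
        return "red"
    return "yellow"


if __name__ == "__main__":
    toy_graph = {"a": ["m"], "b": ["m"], "m": ["x"]}
    scores = inherit_significance(toy_graph, {"a": 1.0, "b": -1.0})
    for node in sorted(scores):
        print(node, colour(scores[node]))
```

A node reached from seeds of both conditions ends up near zero and is coloured yellow, which is how nodes that are not themselves differentially expressed can still be placed on the colour scale.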
Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences

PubMed Central

2012-01-01

Background The first draft assembly and gene prediction of the grapevine genome (8X base coverage) was made available to the scientific community in 2007, and functional annotation was developed on this gene prediction. Since then additional Sanger sequences were added to the 8X sequences pool and a new version of the genomic sequence with superior base coverage (12X) was produced. Results In order to more efficiently annotate the function of the genes predicted in the new assembly, it is important to build on as much of the previous work as possible, by transferring 8X annotation of the genome to the 12X version. The 8X and 12X assemblies and gene predictions of the grapevine genome were compared to answer the question, “Can we uniquely map 8X predicted genes to 12X predicted genes?” The results show that while the assemblies and gene structure predictions are too different to make a complete mapping between them, most genes (18,725) showed a one-to-one relationship between 8X predicted genes and the last version of 12X predicted genes. In addition, reshuffled genomic sequence structures appeared. These highlight regions of the genome where the gene predictions need to be taken with caution. Based on the new grapevine gene functional annotation and in-depth functional categorization, twenty eight new molecular networks have been created for VitisNet while the existing networks were updated. Conclusions The outcomes of this study provide a functional annotation of the 12X genes, an update of VitisNet, the system of the grapevine molecular networks, and a new functional categorization of genes. Data are available at the VitisNet website (http://www.sdstate.edu/ps/research/vitis/pathways.cfm). PMID:22554261
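The question "Can we uniquely map 8X predicted genes to 12X predicted genes?" comes down to finding pairs in which each gene appears exactly once. A minimal Python sketch follows, assuming that candidate matches between the two gene sets are already available as pairs of hypothetical identifiers; how the matches are established is outside the sketch and is not taken from the study.

```python
from collections import Counter


def one_to_one_pairs(candidate_pairs):
    """Keep only pairs where the old gene matches exactly one new gene and
    that new gene is matched by exactly one old gene."""
    pairs = set(candidate_pairs)  # ignore duplicated candidate matches
    old_count = Counter(old for old, _ in pairs)
    new_count = Counter(new for _, new in pairs)
    return sorted(
        (old, new)
        for old, new in pairs
        if old_count[old] == 1 and new_count[new] == 1
    )


if __name__ == "__main__":
    matches = [
        ("old1", "new1"),                    # unique in both directions
        ("old2", "new2"), ("old2", "new3"),  # one old gene split into two new genes
        ("old3", "new4"), ("old4", "new4"),  # two old genes merged into one new gene
    ]
    print(one_to_one_pairs(matches))
```

Split and merged predictions fall out of the result, which is one way to flag gene models that need to be treated with caution.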
A novel algorithm for finding optimal driver nodes to target control complex networks and its applications for drug targets identification.

PubMed

Guo, Wei-Feng; Zhang, Shao-Wu; Shi, Qian-Qian; Zhang, Cheng-Ming; Zeng, Tao; Chen, Luonan

2018-01-19

The advances in target control of complex networks not only can offer new insights into the general control dynamics of complex systems, but also be useful for the practical application in systems biology, such as discovering new therapeutic targets for disease intervention. In many cases, e.g. drug target identification in biological networks, we usually require a target control on a subset of nodes (i.e., disease-associated genes) with minimum cost, and we further expect more driver nodes to be consistent with certain well-selected network nodes (i.e., prior-known drug-target genes). Therefore, motivated by this fact, we pose and address a new and practical problem called the target control problem with objectives-guided optimization (TCO): how can we control the variables of interest (or targets) of a system with the optional driver nodes by minimizing the total quantity of drivers and meanwhile maximizing the quantity of constrained nodes among those drivers? Here, we design an efficient algorithm (TCOA) to find the optional driver nodes for controlling targets in complex networks. We apply our TCOA to several real-world networks, and the results support that our TCOA can identify more precise driver nodes than the existing control-focus approaches. Furthermore, we have applied TCOA to two biomolecular expert-curated networks. Source code for our TCOA is freely available from http://sysbio.sibcb.ac.cn/cb/chenlab/software.htm or https://github.com/WilfongGuo/guoweifeng . In the previous theoretical research for the full control, there exists an observation and conclusion that the driver nodes tend to be low-degree nodes. However, for target control of the biological networks, we find interestingly that the driver nodes tend to be high-degree nodes, which is more consistent with the biological experimental observations. 
Furthermore, our results supply novel insights into how we can efficiently target control a complex system, and especially much evidence on the practical strategic utility of TCOA to incorporate prior drug information into potential drug-target forecasts. Thus, our method paves a novel and efficient way to identify the drug targets for leading the phenotype transitions of underlying biological networks.
Toward a physical basis of attention and self-regulation

NASA Astrophysics Data System (ADS)

Posner, Michael I.; Rothbart, Mary K.

2009-06-01

The concept of self-regulation is central to the understanding of human development. Self-regulation allows effective socialization and predicts both psychological pathologies and levels of achievement in schools. What has been missing are neural mechanisms to provide understanding of the cellular and molecular basis for self-regulation. We show that self-regulation can be measured during childhood by parental reports and by self-reports of adolescents and adults. These reports are summarized by a higher order factor called effortful control, which reflects perceptions about the ability of a given person to regulate their behavior in accord with cultural norms. Throughout childhood effortful control is related to children's performance in computerized conflict related tasks. Conflict tasks have been shown in neuroimaging studies to activate specific brain networks of executive attention. Several brain areas work together at rest and during cognitive tasks to regulate competing brain activity and thus control resulting behavior. The cellular structure of the anterior cingulate and insula contain cells, unique to humans and higher primates that provide strong links to remote brain areas. During conflict tasks, anterior cingulate activity is correlated with activity in remote sensory and emotional systems, depending upon the information selected for the task. During adolescence the structure and activity of the anterior cingulate has been found to be correlated with self-reports of effortful control. Studies have provided a perspective on how genes and environment act to shape the executive attention network, providing a physical basis for self-regulation. The anterior cingulate is regulated by dopamine. Genes that influence dopamine levels in the CNS have been shown to influence the efficiency of self-regulation. For example, alleles of the COMT gene that influence the efficiency of dopamine transmission are related to the ability to resolve conflict. 
Humans with disorders involving deletion of this gene exhibit large deficits in self-regulation. Alleles of other genes influencing dopamine and serotonin transmission have also been found to influence ability to resolve conflict in cognitive tasks. However, as is the case for many genes, the effectiveness of COMT alleles in shaping self-regulation depends upon cultural influences such as parenting. Studies find that aspects of parenting quality and parent training can influence child behavior and the efficiency of self-regulation. During development, the network that relates to self-regulation undergoes important changes in connectivity. Infants can use parts of the self-regulatory network to detect errors in sensory information, but the network does not yet have sufficient connectivity to organize brain activity in a coherent way. During middle childhood, along with increased projection cells involved in remote connections of dorsal anterior cingulate and prefrontal and parietal cortex, executive network connectivity increases and shifts from predominantly short to longer range connections. During this period specific exercises can influence network development and improve self-regulation. Understanding the physical basis of self-regulation has already cast light on individual differences in normal and pathological states and gives promise of allowing the design of methods to improve aspects of human development.
Abasy Atlas: a comprehensive inventory of systems, global network properties and systems-level elements across bacteria.

PubMed

Ibarra-Arellano, Miguel A; Campos-González, Adrián I; Treviño-Quintanilla, Luis G; Tauch, Andreas; Freyre-González, Julio A

2016-01-01

The availability of databases electronically encoding curated regulatory networks and of high-throughput technologies and methods to discover regulatory interactions provides an invaluable source of data to understand the principles underpinning the organization and evolution of these networks responsible for cellular regulation. Nevertheless, data on these sources never goes beyond the regulon level despite the fact that regulatory networks are complex hierarchical-modular structures still challenging our understanding. This brings the necessity for an inventory of systems across a large range of organisms, a key step to rendering feasible comparative systems biology approaches. In this work, we take the first step towards a global understanding of the regulatory networks organization by making a cartography of the functional architectures of diverse bacteria. Abasy (Across-BActeria SYstems) Atlas provides a comprehensive inventory of annotated functional systems, global network properties and systems-level elements (global regulators, modular genes shaping functional systems, basal machinery genes and intermodular genes) predicted by the natural decomposition approach for reconstructed and meta-curated regulatory networks across a large range of bacteria, including pathogenically and biotechnologically relevant organisms. The meta-curation of regulatory datasets provides the most complete and reliable set of regulatory interactions currently available, which can even be projected into subsets by considering the force or weight of evidence supporting them or the systems that they belong to. Besides, Abasy Atlas provides data enabling large-scale comparative systems biology studies aimed at understanding the common principles and particular lifestyle adaptions of systems across bacteria. 
Abasy Atlas contains systems and system-level elements for 50 regulatory networks comprising 78 649 regulatory interactions covering 42 bacteria in nine taxa, containing 3708 regulons and 1776 systems. All this brings together a large corpus of data that will surely inspire studies to generate hypotheses regarding the principles governing the evolution and organization of systems and the functional architectures controlling them. Database URL: http://abasy.ccg.unam.mx. © The Author(s) 2016. Published by Oxford University Press.
Biocomputional construction of a gene network under acid stress in Synechocystis sp. PCC 6803.

PubMed

Li, Yi; Rao, Nini; Yang, Feng; Zhang, Ying; Yang, Yang; Liu, Han-ming; Guo, Fengbiao; Huang, Jian

2014-01-01

Acid stress is one of the most serious threats that cyanobacteria have to face, and it has an impact at all levels from genome to phenotype. However, very little is known about the detailed response mechanism to acid stress in this species. We present here a general analysis of the gene regulatory network of Synechocystis sp. PCC 6803 in response to acid stress using comparative genome analysis and biocomputational prediction. In this study, we collected 85 genes and used them as an initial template to predict new genes through co-regulation, protein-protein interactions and the phylogenetic profile, and 179 new genes were obtained to form a complete template. In addition, we found that 11 enriched pathways such as glycolysis are closely related to the acid stress response. Finally, we constructed a regulatory network for the intricate relationship of these genes and summarized the key steps in response to acid stress. This is the first time a bioinformatic approach has been applied systematically to gene interactions in cyanobacteria and to the elaboration of their cell metabolism and regulatory pathways under acid stress, which is more efficient than a traditional experimental study. The results also provide theoretical support for similar research into environmental stresses in cyanobacteria and possible industrial applications. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Candidate Gene Approach for Parasite Resistance in Sheep – Variation in Immune Pathway Genes and Association with Fecal Egg Count

PubMed Central

Periasamy, Kathiravan; Pichler, Rudolf; Poli, Mario; Cristel, Silvina; Cetrá, Bibiana; Medus, Daniel; Basar, Muladno; A. K., Thiruvenkadan; Ramasamy, Saravanan; Ellahi, Masroor Babbar; Mohammed, Faruque; Teneva, Atanaska; Shamsuddin, Mohammed; Podesta, Mario Garcia; Diallo, Adama

2014-01-01

Sheep chromosome 3 (Oar3) has the largest number of QTLs reported to be significantly associated with resistance to gastro-intestinal nematodes. This study aimed to identify single nucleotide polymorphisms (SNPs) within candidate genes located in sheep chromosome 3 as well as genes involved in major immune pathways. A total of 41 SNPs were identified across 38 candidate genes in a panel of unrelated sheep and genotyped in 713 animals belonging to 22 breeds across Asia, Europe and South America. The variations and evolution of immune pathway genes were assessed in sheep populations across these macro-environmental regions that significantly differ in the diversity and load of pathogens. The mean minor allele frequency (MAF) did not vary between Asian and European sheep reflecting the absence of ascertainment bias. Phylogenetic analysis revealed two major clusters with most of South Asian, South East Asian and South West Asian breeds clustering together while European and South American sheep breeds clustered together distinctly. Analysis of molecular variance revealed strong phylogeographic structure at loci located in immune pathway genes, unlike microsatellite and genome wide SNP markers. To understand the influence of natural selection processes, SNP loci located in chromosome 3 were utilized to reconstruct haplotypes, the diversity of which showed significant deviations from selective neutrality. Reduced Median network of reconstructed haplotypes showed balancing selection in force at these loci. Preliminary association of SNP genotypes with phenotypes recorded 42 days post challenge revealed significant differences (P<0.05) in fecal egg count, body weight change and packed cell volume at two, four and six SNP loci respectively. In conclusion, the present study reports strong phylogeographic structure and balancing selection operating at SNP loci located within immune pathway genes. 
Further, SNP loci identified in the study were found to have potential for future large scale association studies in naturally exposed sheep populations. PMID:24533078
Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs

PubMed Central

Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv

2010-01-01

RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
A collaborative computing framework of cloud network and WBSN applied to fall detection and 3-D motion reconstruction.

PubMed

Lai, Chin-Feng; Chen, Min; Pan, Jeng-Shyang; Youn, Chan-Hyun; Chao, Han-Chieh

2014-03-01

As cloud computing and wireless body sensor network technologies are gradually developed, ubiquitous healthcare services can prevent accidents instantly and effectively, as well as provide relevant information to reduce related processing time and cost. This study proposes a co-processing intermediary framework integrating cloud and wireless body sensor networks, which is mainly applied to fall detection and 3-D motion reconstruction. In this study, the main focuses include distributed computing and resource allocation of processing sensing data over the computing architecture, network conditions and performance evaluation. Through this framework, the transmission and computing time of sensing data are reduced to enhance overall performance for the services of fall event detection and 3-D motion reconstruction.
The emergence and early evolution of biological carbon-fixation.

PubMed

Braakman, Rogier; Smith, Eric

2012-01-01

The fixation of CO₂ into living matter sustains all life on Earth, and embeds the biosphere within geochemistry. The six known chemical pathways used by extant organisms for this function are recognized to have overlaps, but their evolution is incompletely understood. Here we reconstruct the complete early evolutionary history of biological carbon-fixation, relating all modern pathways to a single ancestral form. We find that innovations in carbon-fixation were the foundation for most major early divergences in the tree of life. These findings are based on a novel method that fully integrates metabolic and phylogenetic constraints. Comparing gene-profiles across the metabolic cores of deep-branching organisms and requiring that they are capable of synthesizing all their biomass components leads to the surprising conclusion that the most common form for deep-branching autotrophic carbon-fixation combines two disconnected sub-networks, each supplying carbon to distinct biomass components. One of these is a linear folate-based pathway of CO₂ reduction previously only recognized as a fixation route in the complete Wood-Ljungdahl pathway, but which more generally may exclude the final step of synthesizing acetyl-CoA. Using metabolic constraints we then reconstruct a "phylometabolic" tree with a high degree of parsimony that traces the evolution of complete carbon-fixation pathways, and has a clear structure down to the root. This tree requires few instances of lateral gene transfer or convergence, and instead suggests a simple evolutionary dynamic in which all divergences have primary environmental causes. Energy optimization and oxygen toxicity are the two strongest forces of selection. The root of this tree combines the reductive citric acid cycle and the Wood-Ljungdahl pathway into a single connected network. 
This linked network lacks the selective optimization of modern fixation pathways but its redundancy leads to a more robust topology, making it more plausible than any modern pathway as a primitive universal ancestral form.
Improved Maximum Parsimony Models for Phylogenetic Networks.

PubMed

Van Iersel, Leo; Jones, Mark; Scornavacca, Celine

2018-05-01

Phylogenetic networks are well suited to represent evolutionary histories comprising reticulate evolution. Several methods aiming at reconstructing explicit phylogenetic networks have been developed in the last two decades. In this article, we propose a new definition of maximum parsimony for phylogenetic networks that makes it possible to model biological scenarios that cannot be modeled by the definitions currently present in the literature (namely, the "hardwired" and "softwired" parsimony). Building on this new definition, we provide several algorithmic results that lay the foundations for new parsimony-based methods for phylogenetic network reconstruction.
Biochemical Network Stochastic Simulator (BioNetS): software for stochastic modeling of biochemical networks.

PubMed

Adalsteinsson, David; McMillen, David; Elston, Timothy C

2004-03-08

Intrinsic fluctuations due to the stochastic nature of biochemical reactions can have large effects on the response of biochemical networks. This is particularly true for pathways that involve transcriptional regulation, where generally there are two copies of each gene and the number of messenger RNA (mRNA) molecules can be small. Therefore, there is a need for computational tools for developing and investigating stochastic models of biochemical networks. We have developed the software package Biochemical Network Stochastic Simulator (BioNetS) for efficiently and accurately simulating stochastic models of biochemical networks. BioNetS has a graphical user interface that allows models to be entered in a straightforward manner, and allows the user to specify the type of random variable (discrete or continuous) for each chemical species in the network. The discrete variables are simulated using an efficient implementation of the Gillespie algorithm. For the continuous random variables, BioNetS constructs and numerically solves the appropriate chemical Langevin equations. The software package has been developed to scale efficiently with network size, thereby allowing large systems to be studied. BioNetS runs as a BioSpice agent and can be downloaded from http://www.biospice.org. BioNetS also can be run as a stand alone package. All the required files are accessible from http://x.amath.unc.edu/BioNetS. We have developed BioNetS to be a reliable tool for studying the stochastic dynamics of large biochemical networks. Important features of BioNetS are its ability to handle hybrid models that consist of both continuous and discrete random variables and its ability to model cell growth and division. We have verified the accuracy and efficiency of the numerical methods by considering several test systems.
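For readers unfamiliar with the Gillespie algorithm mentioned in this abstract, here is a minimal Python sketch of its direct method for a single mRNA species with constant production and first-order decay. It is not BioNetS code; the function name, the two-reaction model and the rate parameters are assumptions chosen for illustration.

```python
import random


def gillespie_birth_death(k_make, k_decay, n0, t_end, seed=0):
    """Simulate mRNA copy number with the Gillespie direct method.

    Reactions (a hypothetical minimal model):
        production:  0 -> mRNA   with propensity k_make
        decay:       mRNA -> 0   with propensity k_decay * n
    Returns the event times and the copy number after each event.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make
        a_decay = k_decay * n
        a_total = a_make + a_decay
        if a_total == 0:
            break  # no reaction can fire any more
        # The waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k_make=10.0, k_decay=1.0, n0=0, t_end=50.0)
    print(len(times), "events simulated; final copy number:", counts[-1])
```

In this model the copy number fluctuates around k_make / k_decay once the initial transient has passed, and the fluctuations are relatively large when that ratio is small, which is the regime the abstract highlights.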
From gene trees to a dated allopolyploid network: insights from the angiosperm genus Viola (Violaceae).

PubMed

Marcussen, Thomas; Heier, Lise; Brysting, Anne K; Oxelman, Bengt; Jakobsen, Kjetill S

2015-01-01

Allopolyploidization accounts for a significant fraction of speciation events in many eukaryotic lineages. However, existing phylogenetic and dating methods require tree-like topologies and are unable to handle the network-like phylogenetic relationships of lineages containing allopolyploids. No explicit framework has so far been established for evaluating competing network topologies, and few attempts have been made to date phylogenetic networks. We used a four-step approach to generate a dated polyploid species network for the cosmopolitan angiosperm genus Viola L. (Violaceae Batch.). The genus contains ca 600 species and both recent (neo-) and more ancient (meso-) polyploid lineages distributed over 16 sections. First, we obtained DNA sequences of three low-copy nuclear genes and one chloroplast region, from 42 species representing all 16 sections. Second, we obtained fossil-calibrated chronograms for each nuclear gene marker. Third, we determined the most parsimonious multilabeled genome tree and its corresponding network, resolved at the section (not the species) level. Reconstructing the "correct" network for a set of polyploids depends on recovering all homoeologs, i.e., all subgenomes, in these polyploids. Assuming the presence of Viola subgenome lineages that were not detected by the nuclear gene phylogenies ("ghost subgenome lineages") significantly reduced the number of inferred polyploidization events. We identified the most parsimonious network topology from a set of five competing scenarios differing in the interpretation of homoeolog extinctions and lineage sorting, based on (i) fewest possible ghost subgenome lineages, (ii) fewest possible polyploidization events, and (iii) least possible deviation from expected ploidy as inferred from available chromosome counts of the involved polyploid taxa. Finally, we estimated the homoploid and polyploid speciation times of the most parsimonious network. 
Homoploid speciation times were estimated by coalescent analysis of gene tree node ages. Polyploid speciation times were estimated by comparing branch lengths and speciation rates of lineages with and without ploidy shifts. Our analyses recognize Viola as an old genus (crown age 31 Ma) whose evolutionary history has been profoundly affected by allopolyploidy. Between 16 and 21 allopolyploidizations are necessary to explain the diversification of the 16 major lineages (sections) of Viola, suggesting that allopolyploidy has accounted for a high percentage (between 67% and 88%) of the speciation events at this level. The theoretical and methodological approaches presented here for (i) constructing networks and (ii) dating speciation events within a network have general applicability for phylogenetic studies of groups where allopolyploidization has occurred. They make explicit use of a hitherto underexplored source of ploidy information from chromosome counts to help resolve phylogenetic cases where incomplete sequence data hampers network inference. Importantly, the coalescent-based method used herein circumvents the assumption of tree-like evolution required by most techniques for dating speciation events. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL

PubMed Central

2010-01-01

Background Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported into Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. Results A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. Conclusions CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL. PMID:20594316
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL.

PubMed

Laukens, Kris; Hollunder, Jens; Dang, Thanh Hai; De Jaeger, Geert; Kuiper, Martin; Witters, Erwin; Verschoren, Alain; Van Leemput, Koenraad

2010-07-01

Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported into Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL.
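Illustrative sketch (not part of the record): the general idea of turning SQL query results into network nodes and edges, with a feature of an existing node passed in as a SQL argument, might look as follows. This uses Python's built-in sqlite3 module rather than the CytoSQL plugin itself; the table name `interaction`, its columns, and the helper `edges_from_query` are hypothetical.

```python
import sqlite3

def edges_from_query(conn, sql, params=()):
    """Run a query whose first two columns are source and target node IDs.

    Any further columns are kept as edge attributes. Returns (nodes, edges).
    """
    nodes, edges = set(), []
    for source, target, *attrs in conn.execute(sql, params):
        nodes.update((source, target))
        edges.append((source, target, tuple(attrs)))
    return nodes, edges

# Hypothetical in-memory database standing in for a lab-specific schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interaction (a TEXT, b TEXT, score REAL)")
conn.executemany(
    "INSERT INTO interaction VALUES (?, ?, ?)",
    [("P1", "P2", 0.9), ("P2", "P3", 0.4), ("P3", "P4", 0.8)],
)

# The ID of a node already in the network ("P2") is injected as a SQL argument.
nodes, edges = edges_from_query(
    conn,
    "SELECT a, b, score FROM interaction "
    "WHERE (a = ? OR b = ?) AND score >= ? ORDER BY a, b",
    ("P2", "P2", 0.5),
)
```

Parameter binding (the `?` placeholders) keeps node names out of the SQL string itself, which is one plausible way to pass network attributes into a query safely.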
Annotation of gene function in citrus using gene expression information and co-expression networks

PubMed Central

2014-01-01

Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. Results We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. 
Conclusions Integration of citrus gene co-expression networks, functional enrichment analysis and gene expression information provides opportunities to infer gene function in citrus. We present a publicly accessible tool, Network Inference for Citrus Co-Expression (NICCE, http://citrus.adelaide.edu.au/nicce/home.aspx), for gene co-expression analysis in citrus. PMID:25023870
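Illustrative sketch (not part of the record): the "guilt-by-association" principle behind a gene co-expression network can be shown by linking genes whose expression profiles are strongly correlated across conditions. The choice of Pearson correlation, the 0.9 threshold, and the toy expression values are all assumptions made for illustration; the record does not state which measure or cutoff was used.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def coexpression_edges(expr, threshold=0.9):
    """Link every gene pair whose absolute correlation reaches the threshold."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(expr), 2)
        if abs(pearson(expr[g1], expr[g2])) >= threshold
    ]

# Toy expression matrix: gene -> expression across four conditions.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
edges = coexpression_edges(expr)  # only geneA and geneB co-vary strongly
```

A gene of unknown function linked to well-annotated neighbours would then be a candidate for the same biological process.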
Last millennium Northern Hemisphere summer temperatures from tree rings: Part II, spatially resolved reconstructions

NASA Astrophysics Data System (ADS)

Anchukaitis, Kevin J.; Wilson, Rob; Briffa, Keith R.; Büntgen, Ulf; Cook, Edward R.; D'Arrigo, Rosanne; Davi, Nicole; Esper, Jan; Frank, David; Gunnarson, Björn E.; Hegerl, Gabi; Helama, Samuli; Klesse, Stefan; Krusic, Paul J.; Linderholm, Hans W.; Myglan, Vladimir; Osborn, Timothy J.; Zhang, Peng; Rydval, Milos; Schneider, Lea; Schurer, Andrew; Wiles, Greg; Zorita, Eduardo

2017-05-01

Climate field reconstructions from networks of tree-ring proxy data can be used to characterize regional-scale climate changes, reveal spatial anomaly patterns associated with atmospheric circulation changes, radiative forcing, and large-scale modes of ocean-atmosphere variability, and provide spatiotemporal targets for climate model comparison and evaluation. Here we use a multiproxy network of tree-ring chronologies to reconstruct spatially resolved warm season (May-August) mean temperatures across the extratropical Northern Hemisphere (40-90°N) using Point-by-Point Regression (PPR). The resulting annual maps of temperature anomalies (750-1988 CE) reveal a consistent imprint of volcanism, with 96% of reconstructed grid points experiencing colder conditions following eruptions. Solar influences are detected at the bicentennial (de Vries) frequency, although at other time scales the influence of insolation variability is weak. Approximately 90% of reconstructed grid points show warmer temperatures during the Medieval Climate Anomaly when compared to the Little Ice Age, although the magnitude varies spatially across the hemisphere. Estimates of field reconstruction skill through time and over space can guide future temporal extension and spatial expansion of the proxy network.
Genome wide predictions of miRNA regulation by transcription factors.

PubMed

Ruffalo, Matthew; Bar-Joseph, Ziv

2016-09-01

Reconstructing regulatory networks from expression and interaction data is a major goal of systems biology. While much work has focused on trying to experimentally and computationally determine the set of transcription-factors (TFs) and microRNAs (miRNAs) that regulate genes in these networks, relatively little work has focused on inferring the regulation of miRNAs by TFs. Such regulation can play an important role in several biological processes including development and disease. The main challenge for predicting such interactions is the very small positive training set currently available. Another challenge is the fact that a large fraction of miRNAs are encoded within genes making it hard to determine the specific way in which they are regulated. To enable genome wide predictions of TF-miRNA interactions, we extended semi-supervised machine-learning approaches to integrate a large set of different types of data including sequence, expression, ChIP-seq and epigenetic data. As we show, the methods we develop achieve good performance on both a labeled test set, and when analyzing general co-expression networks. We next analyze mRNA and miRNA cancer expression data, demonstrating the advantage of using the predicted set of interactions for identifying more coherent and relevant modules, genes, and miRNAs. The complete set of predictions is available on the supporting website and can be used by any method that combines miRNAs, genes, and TFs. Code and full set of predictions are available from the supporting website: http://cs.cmu.edu/~mruffalo/tf-mirna/ Contact: zivbj@cs.cmu.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Ab initio nanostructure determination

NASA Astrophysics Data System (ADS)

Gujarathi, Saurabh

Reconstruction of complex structures is an inverse problem arising in virtually all areas of science and technology, from protein structure determination to bulk heterostructure solar cells and the structure of nanoparticles. This problem is cast as a complex network problem where the edges in a network have weights equal to the Euclidean distance between their endpoints. A method, called Tribond, is presented for reconstructing the locations of the nodes of the network given only the edge weights of the Euclidean network. The timing results indicate that, in two dimensions, the running time of the algorithm is a low order polynomial in the number of nodes in the network. Using this implementation, Euclidean networks of about one thousand nodes in two dimensions are reconstructed in approximately twenty-four hours on a desktop computer. In three dimensions, the computational cost of the reconstruction is a higher order polynomial in the number of nodes, and reconstruction of small Euclidean networks in three dimensions is shown. If a starting network of size five is assumed to be given, then for a network of size 100, the remaining reconstruction can be done in about two hours on a desktop computer. In situations when we have less precise data, modifications of the method may be necessary and are discussed. A related problem in one dimension known as the Optimal Golomb ruler (OGR) is also studied. A statistical physics Hamiltonian to describe the OGR problem is introduced and the first order phase transition from a symmetric low constraint phase to a complex symmetry broken phase at high constraint is studied. Despite the fact that the Hamiltonian is not disordered, the asymmetric phase is highly irregular with geometric frustration. The phase diagram is obtained and it is seen that even at a very low temperature T there is a phase transition at a finite, non-zero value of the constraint parameter gamma/mu.
Analytic calculations for the scaling of the density and free energy of the ruler are done and they are compared with those from the mean field approach. A scaling law is also derived for the length of the OGR, which is consistent with the Erdos conjecture and with numerical results.
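Illustrative sketch (not part of the record): one elementary building block for recovering node positions from Euclidean edge weights in two dimensions is placing a new node from its distances to three already-placed, non-collinear nodes. The function below does this by subtracting the circle equations to obtain a 2x2 linear system. It is a generic trilateration step under the assumption of exact distances, not the Tribond algorithm itself; the name `locate` is hypothetical.

```python
from math import dist

def locate(p1, p2, p3, d1, d2, d3):
    """Return (x, y) at distances d1, d2, d3 from the known points p1, p2, p3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the circle around p1 from those around p2 and p3 gives
    # two linear equations a11*x + a12*y = b1 and a21*x + a22*y = b2.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear")
    # Cramer's rule for the 2x2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Three placed nodes and one unknown node whose true position is (1, 1).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.0, 1.0)
distances = [dist(a, true_position) for a in anchors]
x, y = locate(*anchors, *distances)
```

With noisy distances the three circles no longer meet in a single point, which is presumably one reason the abstract notes that modifications are needed for less precise data.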
[Reconstruction of Leptospira interrogans lipL21 gene and characteristics of its expression product].

PubMed

Luo, Dong-jiao; Hu, Ye; Dennin, R H; Yan, Jie

2007-09-01

The aims were to reconstruct the nucleotide sequence of the Leptospira interrogans lipL21 gene to increase the output of prokaryotic expression, to understand the changes in immunogenicity of the expression products before and after reconstruction, and to determine the position of the envelope lipoprotein LipL21 on the surface of the leptospiral body. The nucleotide sequence of the lipL21 gene was designed according to the preferred codons of E. coli and synthesized, and its prokaryotic expression system was then constructed. Using SDS-PAGE and a BioRad agarose image analyzer, the expression levels of the lipL21 genes before and after reconstruction were measured. A Western blot assay using rabbit anti-TR/Patoc I serum as the primary antibody was performed to identify the immunoreactivity of the two target recombinant proteins (rLipL21s) before and after reconstruction. The changes in cross-agglutination titers of antisera against the two rLipL21s before and after reconstruction to the different leptospiral serogroups were demonstrated using the microscopic agglutination test (MAT). Immuno-electron microscopy was applied to confirm the location of LipL21. The expression outputs of the original and reconstructed lipL21 genes were 8.5% and 46.5% of the total bacterial proteins, respectively. Both rLipL21s reacted immunologically with TR/Patoc I antiserum. After immunization with each of the two rLipL21s, rabbits produced specific antibody. The two antisera against the rLipL21s showed similar MAT titers of 1:80 to 1:320. LipL21 was confirmed to be located on the surface of the leptospiral envelope. LipL21 is a surface antigen of Leptospira interrogans. The expression output of the reconstructed lipL21 gene is remarkably increased. The expressed rLipL21 maintains good antigenicity and immunoreactivity, and its antibody still shows extensive cross-immunoagglutination activity.
The high expression of the reconstructed lipL21 gene offers favorable conditions for using its product to develop a novel universal vaccine as well as a detection kit for leptospirosis.
Transcriptional Networks in Single Perivascular Cells Sorted from Human Adipose Tissue Reveal a Hierarchy of Mesenchymal Stem Cells.

PubMed

Hardy, W Reef; Moldovan, Nicanor I; Moldovan, Leni; Livak, Kenneth J; Datta, Krishna; Goswami, Chirayu; Corselli, Mirko; Traktuev, Dmitry O; Murray, Iain R; Péault, Bruno; March, Keith

2017-05-01

Adipose tissue is a rich source of multipotent mesenchymal stem-like cells, located in the perivascular niche. Based on their surface markers, these have been assigned to two main categories: CD31-/CD45-/CD34+/CD146- cells (adventitial stromal/stem cells [ASCs]) and CD31-/CD45-/CD34-/CD146+ cells (pericytes [PCs]). These populations display heterogeneity of unknown significance. We hypothesized that aldehyde dehydrogenase (ALDH) activity, a functional marker of primitivity, could help to better define ASC and PC subclasses. To this end, the stromal vascular fraction from a human lipoaspirate was simultaneously stained with fluorescent antibodies to CD31, CD45, CD34, and CD146 antigens and the ALDH substrate Aldefluor, then sorted by fluorescence-activated cell sorting. Individual ASCs (n = 67) and PCs (n = 73) selected from the extremities of the ALDH-staining spectrum were transcriptionally profiled by Fluidigm single-cell quantitative polymerase chain reaction for a predefined set (n = 429) of marker genes. To these single-cell data, we applied differential expression and principal component and clustering analysis, as well as an original gene coexpression network reconstruction algorithm. Despite the stochasticity at the single-cell level, covariation of gene expression analysis yielded multiple network connectivity parameters suggesting that these perivascular progenitor cell subclasses possess the following order of maturity: (a) ALDHbr ASC (most primitive); (b) ALDHdim ASC; (c) ALDHbr PC; (d) ALDHdim PC (least primitive). This order was independently supported by specific combinations of class-specific expressed genes and further confirmed by the analysis of associated signaling pathways. In conclusion, single-cell transcriptional analysis of four populations isolated from fat by surface markers and enzyme activity suggests a developmental hierarchy among perivascular mesenchymal stem cells supported by markers and coexpression networks.
Stem Cells 2017;35:1273-1289. © 2017 AlphaMed Press.
A phylogenetic Kalman filter for ancestral trait reconstruction using molecular data.

PubMed

Lartillot, Nicolas

2014-02-15

Correlation between life history or ecological traits and genomic features such as nucleotide or amino acid composition can be used for reconstructing the evolutionary history of the traits of interest along phylogenies. Thus far, however, such ancestral reconstructions have been done using simple linear regression approaches that do not account for phylogenetic inertia. These reconstructions could instead be seen as a genuine comparative regression problem, such as formalized by classical generalized least-square comparative methods, in which the trait of interest and the molecular predictor are represented as correlated Brownian characters coevolving along the phylogeny. Here, a Bayesian sampler is introduced, representing an alternative and more efficient algorithmic solution to this comparative regression problem, compared with currently existing generalized least-square approaches. Technically, ancestral trait reconstruction based on a molecular predictor is shown to be formally equivalent to a phylogenetic Kalman filter problem, for which backward and forward recursions are developed and implemented in the context of a Markov chain Monte Carlo sampler. The comparative regression method results in more accurate reconstructions and a more faithful representation of uncertainty, compared with simple linear regression. Application to the reconstruction of the evolution of optimal growth temperature in Archaea, using GC composition in ribosomal RNA stems and amino acid composition of a sample of protein-coding genes, confirms previous findings, in particular, pointing to a hyperthermophilic ancestor for the kingdom. The program is freely available at www.phylobayes.org.
Analog "neuronal" networks in early vision.

PubMed Central

Koch, C; Marroquin, J; Yuille, A

1986-01-01

Many problems in early vision can be formulated in terms of minimizing a cost function. Examples are shape from shading, edge detection, motion analysis, structure from motion, and surface interpolation. As shown by Poggio and Koch [Poggio, T. & Koch, C. (1985) Proc. R. Soc. London, Ser. B 226, 303-323], quadratic variational problems, an important subset of early vision tasks, can be "solved" by linear, analog electrical, or chemical networks. However, in the presence of discontinuities, the cost function is nonquadratic, raising the question of designing efficient algorithms for computing the optimal solution. Recently, Hopfield and Tank [Hopfield, J. J. & Tank, D. W. (1985) Biol. Cybern. 52, 141-152] have shown that networks of nonlinear analog "neurons" can be effective in computing the solution of optimization problems. We show how these networks can be generalized to solve the nonconvex energy functionals of early vision. We illustrate this approach by implementing a specific analog network, solving the problem of reconstructing a smooth surface from sparse data while preserving its discontinuities. These results suggest a novel computational strategy for solving early vision problems in both biological and real-time artificial vision systems. PMID:3459172
Gene coexpression measures in large heterogeneous samples using count statistics.

PubMed

Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

2014-11-18

With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
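Illustrative sketch (not part of the record): the idea of counting local patterns of expression ranks can be shown with a deliberately simplified count over sliding windows of a time course. For each window of length k, the two genes score a hit if their values are ordered in the same way, or in exactly the opposite way. This is a toy analogue written for illustration, not either of the statistics proposed in the paper; the window length and the function names are assumptions.

```python
def rank_pattern(values):
    """Indices of `values` in increasing order of value (ties broken by position)."""
    return tuple(sorted(range(len(values)), key=lambda i: (values[i], i)))

def local_pattern_count(x, y, k=3):
    """Count length-k windows where x and y share a rank pattern or its reverse."""
    count = 0
    for start in range(len(x) - k + 1):
        px = rank_pattern(x[start:start + k])
        py = rank_pattern(y[start:start + k])
        # Same ordering: locally co-varying; reversed ordering: opposite trend.
        if px == py or px == py[::-1]:
            count += 1
    return count

# Two toy profiles that agree in some windows only.
x = [1, 2, 3, 9, 8, 7]
y = [2, 4, 6, 1, 5, 3]
hits = local_pattern_count(x, y, k=3)
```

Because only the ordering inside each window matters, a single extreme value cannot dominate the count, which is consistent with the robustness to outliers mentioned in the abstract.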
Comprehensive reconstruction and in silico analysis of Aspergillus niger genome-scale metabolic network model that accounts for 1210 ORFs.

PubMed

Lu, Hongzhong; Cao, Weiqiang; Ouyang, Liming; Xia, Jianye; Huang, Mingzhi; Chu, Ju; Zhuang, Yingping; Zhang, Siliang; Noorman, Henk

2017-03-01

Aspergillus niger is one of the most important cell factories for industrial enzymes and organic acids production. A comprehensive genome-scale metabolic network model (GSMM) with high quality is crucial for efficient strain improvement and process optimization. The lack of accurate reaction equations and gene-protein-reaction associations (GPRs) in the current best model of A. niger named GSMM iMA871, however, limits its application scope. To overcome these limitations, we updated the A. niger GSMM by combining the latest genome annotation and literature mining technology. Compared with iMA871, the number of reactions in iHL1210 was increased from 1,380 to 1,764, and the number of unique ORFs from 871 to 1,210. With the aid of our transcriptomics analysis, the existence of 63% of ORFs and 68% of reactions in iHL1210 can be verified when glucose was used as the only carbon source. Physiological data from chemostat cultivations, 13C-labeled and molecular experiments from the published literature were further used to check the performance of iHL1210. The average correlation coefficients between the predicted fluxes and estimated fluxes from 13C-labeling data were sufficiently high (above 0.89) and the prediction of cell growth on most of the reported carbon and nitrogen sources was consistent. Using the updated genome-scale model, we evaluated gene essentiality on synthetic and yeast extract medium, as well as the effects of NADPH supply on glucoamylase production in A. niger. In summary, the new A. niger GSMM iHL1210 contains significant improvements with respect to the metabolic coverage and prediction performance, which paves the way for systematic metabolic engineering of A. niger. Biotechnol. Bioeng. 2017;114: 685-695. © 2016 Wiley Periodicals, Inc.
MorphDB: Prioritizing Genes for Specialized Metabolism Pathways and Gene Ontology Categories in Plants.

PubMed

Zwaenepoel, Arthur; Diels, Tim; Amar, David; Van Parys, Thomas; Shamir, Ron; Van de Peer, Yves; Tzfadia, Oren

2018-01-01

Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
Prediction of EST functional relationships via literature mining with user-specified parameters.

PubMed

Wang, Hei-Chia; Huang, Tian-Hsiang

2009-04-01

The massive amount of expressed sequence tags (ESTs) gathered over recent years has triggered great interest in efficient applications for genomic research. In particular, EST functional relationships can be used to determine a possible gene network for biological processes of interest. In recent years, many researchers have tried to determine EST functional relationships by analyzing the biological literature. However, it has been challenging to find efficient prediction methods. Moreover, an annotated EST is usually associated with many functions, so successful methods must be able to distinguish between relevant and irrelevant functions based on user specifications. This paper proposes a method to discover functional relationships between ESTs of interest by analyzing literature from the Medical Literature Analysis and Retrieval System Online, with user-specified parameters for selecting keywords. This method performs better than the multiple kernel documents method in setting up a specific threshold for gathering materials. The method is also able to uncover known functional relationships, as shown by a comparison with the Kyoto Encyclopedia of Genes and Genomes database. The reliable EST relationships predicted by the proposed method can help to construct gene networks for specific biological functions of interest.
Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

DTIC Science & Technology

2014-06-01

Specifically, we combined the CRISPR genome editing system with a novel approach allowing efficient single cell cloning of Drosophila cells with the aim of...and culture these to produce cultures completely lacking wildtype sequence at the target locus. No robust methods existed to clone single Drosophila ...targeting all kinases and phosphatases (563 genes) in the Drosophila genome. 65 samples that displayed synthetic lethality (15 genes) or synthetic
Medical image security using modified chaos-based cryptography approach

NASA Astrophysics Data System (ADS)

Talib Gatta, Methaq; Al-latief, Shahad Thamear Abd

2018-05-01

Progressive development in telecommunication and networking technologies has led to the increased popularity of telemedicine, which involves the storage and transfer of medical images and related information, so security concerns have emerged. This paper presents a method for securing medical images, since they play a major role in healthcare organizations. The main idea of this work is to use a chaotic sequence to provide an efficient encryption method that allows the original image to be reconstructed from the encrypted image with high quality and minimal distortion of its content, and that does not affect human treatment and diagnosis. Experimental results demonstrate the efficiency of the proposed method using several statistical measures and a robust correlation between the original image and the decrypted image.
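Illustrative sketch (not part of the record): a common textbook way to build an image cipher from a chaotic sequence is to iterate the logistic map, quantize each value to a byte, and XOR the resulting keystream with the pixel bytes. The record does not say which chaotic map or combination step the authors use, so the logistic map, the parameter values, and the function names below are assumptions. A toy like this is not cryptographically secure.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bytes from the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient before emitting bytes
        x = r * x * (1 - x)
    stream = bytearray()
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)  # quantize x in (0, 1) to one byte
    return bytes(stream)

def xor_cipher(data, x0=0.3141, r=3.99):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    keystream = logistic_keystream(x0, r, len(data))
    return bytes(b ^ k for b, k in zip(data, keystream))

# A few grey-level pixel values standing in for a medical image.
pixels = bytes([0, 17, 128, 255, 64, 64, 64, 200])
encrypted = xor_cipher(pixels)
decrypted = xor_cipher(encrypted)  # same key (x0, r) recovers the pixels exactly
```

Here the pair (x0, r) plays the role of the secret key, and decryption is lossless because XOR with the same keystream restores every byte.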
Equal Graph Partitioning on Estimated Infection Network as an Effective Epidemic Mitigation Measure

PubMed Central

Hadidjojo, Jeremy; Cheong, Siew Ann

2011-01-01

Controlling severe outbreaks remains the most important problem in the area of infectious disease. With time, this problem will only become more severe as population density in urban centers grows. Social interactions play a very important role in determining how infectious diseases spread, and organization of people along social lines gives rise to non-spatial networks in which the infections spread. Infection networks are different for diseases with different transmission modes, but are likely to be identical or highly similar for diseases that spread the same way. Hence, infection networks estimated from common infections can be useful to contain epidemics of a more severe disease with the same transmission mode. Here we present a proof-of-concept study demonstrating the effectiveness of epidemic mitigation based on such estimated infection networks. We first generate artificial social networks of different sizes and average degrees, but with roughly the same clustering characteristic. We then start SIR epidemics on these networks, censor the simulated incidences, and use them to reconstruct the infection network. We then efficiently fragment the estimated network by removing the smallest number of nodes identified by a graph partitioning algorithm. Finally, we demonstrate the effectiveness of this targeted strategy, by comparing it against traditional untargeted strategies, in slowing down and reducing the size of advancing epidemics. PMID:21799777
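Illustrative sketch (not part of the record): the effect of fragmenting a contact network can be seen in a small discrete-time SIR simulation in which selected nodes are removed before the outbreak starts. The toy graph, the hand-picked bridge node, and the function `simulate_sir` are assumptions made for illustration; the study itself selects nodes with a graph partitioning algorithm on an estimated infection network, which is not reproduced here.

```python
import random

def simulate_sir(adjacency, seed_node, beta, gamma, rng, removed=frozenset()):
    """Discrete-time SIR on a graph; returns how many nodes were ever infected.

    beta: per-contact infection probability per step.
    gamma: per-step recovery probability (must be > 0 for the loop to end).
    removed: nodes taken out of the network before the outbreak.
    """
    if seed_node in removed:
        return 0
    infected, recovered = {seed_node}, set()
    while infected:
        new_infected = set()
        for node in infected:
            for neighbour in adjacency[node]:
                susceptible = (neighbour not in infected
                               and neighbour not in recovered
                               and neighbour not in removed)
                if susceptible and rng.random() < beta:
                    new_infected.add(neighbour)
        newly_recovered = {n for n in infected if rng.random() < gamma}
        recovered |= newly_recovered
        infected = (infected - newly_recovered) | new_infected
    return len(recovered)

# Two triangles (0-1-2 and 4-5-6) joined through the bridge node 3.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4],
    4: [3, 5, 6], 5: [4, 6], 6: [4, 5],
}
rng = random.Random(0)
# beta = gamma = 1 makes the outcome deterministic for this demonstration.
size_untreated = simulate_sir(adjacency, 0, 1.0, 1.0, rng)
size_targeted = simulate_sir(adjacency, 0, 1.0, 1.0, rng, removed={3})
```

Removing the single bridge node confines the outbreak to one triangle, which is the intuition behind fragmenting the network with as few removals as possible.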
On the quirks of maximum parsimony and likelihood on phylogenetic networks.

PubMed

Bryant, Christopher; Fischer, Mareike; Linz, Simone; Semple, Charles

2017-03-21

Maximum parsimony is one of the most frequently-discussed tree reconstruction methods in phylogenetic estimation. However, in recent years it has become more and more apparent that phylogenetic trees are often not sufficient to describe evolution accurately. For instance, processes like hybridization or lateral gene transfer that are commonplace in many groups of organisms and result in mosaic patterns of relationships cannot be represented by a single phylogenetic tree. This is why phylogenetic networks, which can display such events, are becoming of more and more interest in phylogenetic research. It is therefore necessary to extend concepts like maximum parsimony from phylogenetic trees to networks. Several suggestions for possible extensions can be found in recent literature, for instance the softwired and the hardwired parsimony concepts. In this paper, we analyze the so-called big parsimony problem under these two concepts, i.e. we investigate maximum parsimonious networks and analyze their properties. In particular, we show that finding a softwired maximum parsimony network is possible in polynomial time. We also show that the set of maximum parsimony networks for the hardwired definition always contains at least one phylogenetic tree. Lastly, we investigate some parallels of parsimony to different likelihood concepts on phylogenetic networks. Copyright © 2017 Elsevier Ltd. All rights reserved.
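Illustrative sketch (not part of the record): for a single character on a fixed rooted binary tree, the parsimony score can be computed with Fitch's algorithm, and the softwired parsimony score of a network is the minimum of that score over the trees the network displays. The sketch below assumes the displayed trees have already been enumerated by hand and are passed in as nested tuples; the function names and the toy data are hypothetical, and the sketch addresses only the scoring of a given network, not the search over networks analyzed in the paper.

```python
def fitch(tree, leaf_states):
    """Return (candidate state set, parsimony score) for one character.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree).
    leaf_states: mapping from leaf name to its observed state.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:  # children can agree: no extra change needed at this node
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1

def softwired_score(displayed_trees, leaf_states):
    """Minimum Fitch score over the trees displayed by a network."""
    return min(fitch(tree, leaf_states)[1] for tree in displayed_trees)

# Two trees assumed to be displayed by one small hybridization network.
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
states = {"a": "A", "b": "A", "c": "C", "d": "C"}
score_1 = fitch(tree_1, states)[1]
score_2 = fitch(tree_2, states)[1]
network_score = softwired_score([tree_1, tree_2], states)
```

The character fits tree_1 with one change but needs two on tree_2, so the network as a whole is scored by its best displayed tree.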
Third-dimension information retrieval from a single convergent-beam transmission electron diffraction pattern using an artificial neural network

NASA Astrophysics Data System (ADS)

Pennington, Robert S.; Van den Broek, Wouter; Koch, Christoph T.

2014-05-01

We have reconstructed third-dimension specimen information from convergent-beam electron diffraction (CBED) patterns simulated using the stacked-Bloch-wave method. By reformulating the stacked-Bloch-wave formalism as an artificial neural network and optimizing with resilient back propagation, we demonstrate specimen orientation reconstructions with depth resolutions down to 5 nm. To show our algorithm's ability to analyze realistic data, we also discuss and demonstrate our algorithm reconstructing from noisy data and using a limited number of CBED disks. Applicability of this reconstruction algorithm to other specimen parameters is discussed.
Optimal Compressed Sensing and Reconstruction of Unstructured Mesh Datasets

DOE PAGES

Salloum, Maher; Fabian, Nathan D.; Hensinger, David M.; ...

2017-08-09

Exascale computing promises quantities of data too large to efficiently store and transfer across networks in order to be able to analyze and visualize the results. We investigate compressed sensing (CS) as an in situ method to reduce the size of the data as it is being generated during a large-scale simulation. CS works by sampling the data on the computational cluster within an alternative function space such as wavelet bases and then reconstructing back to the original space on visualization platforms. While much work has gone into exploring CS on structured datasets, such as image data, we investigate its usefulness for point clouds such as unstructured mesh datasets often found in finite element simulations. We sample using a technique that exhibits low coherence with tree wavelets found to be suitable for point clouds. We reconstruct using the stagewise orthogonal matching pursuit algorithm that we improved to facilitate automated use in batch jobs. We analyze the achievable compression ratios and the quality and accuracy of reconstructed results at each compression ratio. In the considered case studies, we are able to achieve compression ratios up to two orders of magnitude with reasonable reconstruction accuracy and minimal visual deterioration in the data. Finally, our results suggest that, compared to other compression techniques, CS is attractive in cases where the compression overhead has to be minimized and where the reconstruction cost is not a significant concern.
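To make the reconstruction step more concrete, here is a minimal sketch of plain orthogonal matching pursuit (OMP), which recovers a sparse coefficient vector from fewer measurements than unknowns. It is not the improved stagewise variant the abstract refers to, and the toy dictionary stands in for the tree wavelets used in the paper. The helper names `dot`, `solve`, and `omp` are assumptions, and the sketch assumes nonzero, linearly independent selected atoms and a `sparsity` no larger than the number of atoms.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve the small square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: bring the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def omp(atoms, y, sparsity):
    """Plain OMP: atoms is a list of dictionary columns, y the measurements."""
    support, coeffs = [], []
    residual = list(y)
    for _ in range(sparsity):
        # Greedy step: pick the unused atom most correlated with the residual.
        candidates = [j for j in range(len(atoms)) if j not in support]
        best = max(candidates,
                   key=lambda j: abs(dot(atoms[j], residual)) / dot(atoms[j], atoms[j]) ** 0.5)
        support.append(best)
        # Least-squares refit on the whole support via the normal equations.
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], y) for i in support]
        coeffs = solve(gram, rhs)
        residual = [yi - sum(c * atoms[j][k] for c, j in zip(coeffs, support))
                    for k, yi in enumerate(y)]
        if dot(residual, residual) < 1e-20:
            break
    x = [0.0] * len(atoms)
    for c, j in zip(coeffs, support):
        x[j] = c
    return x


# Hypothetical toy problem: 3 measurements, 4 unknown coefficients.
atoms = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
y = [3, 0, -1]  # generated by coefficients [3, 0, -1, 0]
print(omp(atoms, y, sparsity=2))
```

The refit over the full support at every iteration is what distinguishes OMP from simple matching pursuit; stagewise variants instead admit several atoms per iteration.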
Genome reconstructions indicate the partitioning of ecological functions inside a phytoplankton bloom in the Amundsen Sea, Antarctica

PubMed Central

Delmont, Tom O.; Eren, A. Murat; Vineis, Joseph H.; Post, Anton F.

2015-01-01

Antarctic polynyas support intense phytoplankton blooms, impacting their environment by a substantial depletion of inorganic carbon and nutrients. These blooms are dominated by the colony-forming haptophyte Phaeocystis antarctica and they are accompanied by a distinct bacterial population. Yet, the ecological role these bacteria may play in P. antarctica blooms awaits elucidation of their functional gene pool and of the geochemical activities they support. Here, we report on a metagenome (~160 million reads) analysis of the microbial community associated with a P. antarctica bloom event in the Amundsen Sea polynya (West Antarctica). Genomes of the most abundant Bacteroidetes and Proteobacteria populations have been reconstructed and a network analysis indicates a strong functional partitioning of these bacterial taxa. Three of them (SAR92, and members of the Oceanospirillaceae and Cryomorphaceae) are found in close association with P. antarctica colonies. Distinct features of their carbohydrate, nitrogen, sulfur and iron metabolisms may serve to support mutualistic relationships with P. antarctica. The SAR92 genome indicates a specialization in the degradation of fatty acids and dimethylsulfoniopropionate (compounds released by P. antarctica) into dimethyl sulfide, an aerosol precursor. The Oceanospirillaceae genome carries genes that may enhance algal physiology (cobalamin synthesis). Finally, the Cryomorphaceae genome is enriched in genes that function in cell or colony invasion. A novel pico-eukaryote genome related to Micromonas (19.6 Mb, ~94% completion) was also recovered. It contains the gene for an anti-freeze protein, which is lacking in Micromonas at lower latitudes. These draft genomes are representative of abundant microbial taxa across the Southern Ocean surface. PMID:26579075
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

PubMed Central

Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

2018-01-01

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
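For readers unfamiliar with the underlying algorithm, the sketch below shows the basic serial Markov Clustering loop (expansion by matrix squaring, then inflation by element-wise powering and column renormalization) on a dense matrix. It is a didactic, single-process toy and says nothing about how HipMCL distributes or sparsifies the computation. The function names, the default `inflation` of 2.0, the added self-loops, and the cluster read-out threshold are assumptions of this sketch.

```python
def normalize_columns(m):
    """Scale every column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Dense square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Basic Markov Clustering on a dense adjacency matrix (illustrative)."""
    n = len(adjacency)
    # Add self-loops, a common preprocessing choice, then normalize.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: random-walk flow spreads out
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strong flows are boosted, weak flows decay
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Read-out: each row with surviving flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adjacency[u][v] = adjacency[v][u] = 1
print(mcl(adjacency))
```

The dense matrix product costs cubic time in the number of nodes, which hints at why a network with tens of millions of nodes requires sparse, distributed-memory machinery.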
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE PAGES

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

2018-01-05

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.

Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
Automatic Fabric Defect Detection with a Multi-Scale Convolutional Denoising Autoencoder Network Model.

PubMed

Mei, Shuang; Wang, Yudan; Wen, Guojun

2018-04-02

Fabric defect detection is a necessary and essential step of quality control in the textile manufacturing industry. Traditional fabric inspections are usually performed by manual visual methods, which are low in efficiency and poor in precision for long-term industrial applications. In this paper, we propose an unsupervised learning-based automated approach to detect and localize fabric defects without any manual intervention. This approach is used to reconstruct image patches with a convolutional denoising autoencoder network at multiple Gaussian pyramid levels and to synthesize detection results from the corresponding resolution channels. The reconstruction residual of each image patch is used as the indicator for direct pixel-wise prediction. By segmenting and synthesizing the reconstruction residual map at each resolution level, the final inspection result can be generated. This newly developed method has several prominent advantages for fabric defect detection. First, it can be trained with only a small amount of defect-free samples. This is especially important for situations in which collecting large amounts of defective samples is difficult and impracticable. Second, owing to the multi-modal integration strategy, it is relatively more robust and accurate compared to general inspection methods (the results at each resolution level can be viewed as a modality). Third, according to our results, it can address multiple types of textile fabrics, from simple to more complex. Experimental results demonstrate that the proposed model is robust and yields good overall performance with high precision and acceptable recall rates.
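The idea of using the reconstruction residual as a pixel-wise defect indicator can be sketched at a single resolution level as follows. The autoencoder itself is not modeled here: the `reconstruction` argument stands in for its output. The threshold rule (mean plus `k` standard deviations of residuals measured on defect-free patches) and all function names are assumptions of this sketch, since the abstract does not state how the residual map is segmented.

```python
from statistics import mean, pstdev


def residual_map(patch, reconstruction):
    """Absolute per-pixel difference between a patch and its reconstruction."""
    return [[abs(p - r) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(patch, reconstruction)]


def fit_threshold(clean_residuals, k=3.0):
    """Assumed rule: mean + k * std of residuals from defect-free patches."""
    values = [v for res in clean_residuals for row in res for v in row]
    return mean(values) + k * pstdev(values)


def segment(residual, threshold):
    """Binary defect mask: 1 where the residual exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in residual]


# Hypothetical 2x2 grey-level patches; reconstructions are made-up stand-ins.
clean_patch = [[10, 10], [10, 10]]
clean_recon = [[10, 11], [9, 10]]
threshold = fit_threshold([residual_map(clean_patch, clean_recon)])

test_patch = [[10, 50], [10, 10]]   # one bright pixel the model cannot explain
test_recon = [[10, 10], [10, 10]]
print(segment(residual_map(test_patch, test_recon), threshold))
```

Because the threshold is estimated from defect-free samples only, the sketch mirrors the abstract's point that no defective examples are needed for training. A multi-scale version would repeat this per pyramid level and combine the masks.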
