genome-scale metabolic reconstructions: Topics by Science.gov

Sample records for genome-scale metabolic reconstructions

A protocol for generating a high-quality genome-scale metabolic reconstruction.

PubMed

Thiele, Ines; Palsson, Bernhard Ø

2010-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have been developed over the last 10 years. These reconstructions represent structured knowledge bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates a myriad of computational biological studies, including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge bases. Here we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction, as well as the common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process.
A protocol for generating a high-quality genome-scale metabolic reconstruction

PubMed Central

Thiele, Ines; Palsson, Bernhard Ø.

2011-01-01

Network reconstructions are a common denominator in systems biology. Bottom-up metabolic network reconstructions have developed over the past 10 years. These reconstructions represent structured knowledge-bases that abstract pertinent information on the biochemical transformations taking place within specific target organisms. The conversion of a reconstruction into a mathematical format facilitates myriad computational biological studies including evaluation of network content, hypothesis testing and generation, analysis of phenotypic characteristics, and metabolic engineering. To date, genome-scale metabolic reconstructions for more than 30 organisms have been published and this number is expected to increase rapidly. However, these reconstructions differ in quality and coverage that may minimize their predictive potential and use as knowledge-bases. Here, we present a comprehensive protocol describing each step necessary to build a high-quality genome-scale metabolic reconstruction as well as common trials and tribulations. Therefore, this protocol provides a helpful manual for all stages of the reconstruction process. PMID:20057383
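Note on the "mathematical format" mentioned in the record above: in constraint-based work this usually means a stoichiometric matrix S, with one row per metabolite and one column per reaction, such that a steady-state flux vector v satisfies S·v = 0. The following is a minimal sketch under that assumption, not the protocol's own code; the three-reaction toy network and all identifiers (EX_A, R1, EX_B) are hypothetical.

```python
# Hypothetical toy network: uptake of A, conversion A -> B, secretion of B.
# Each reaction maps metabolite IDs to stoichiometric coefficients
# (negative = consumed, positive = produced).
reactions = {
    "EX_A": {"A": 1},
    "R1": {"A": -1, "B": 1},
    "EX_B": {"B": -1},
}


def stoichiometric_matrix(reactions):
    """Return (metabolite IDs, reaction IDs, S) with S[i][j] the
    coefficient of metabolite i in reaction j."""
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    rxn_ids = list(reactions)
    S = [[reactions[r].get(m, 0) for r in rxn_ids] for m in metabolites]
    return metabolites, rxn_ids, S


def is_steady_state(S, v, tol=1e-9):
    """Check S.v = 0, i.e. every metabolite is produced as fast as it is consumed."""
    return all(abs(sum(s * x for s, x in zip(row, v))) <= tol for row in S)


metabolites, rxn_ids, S = stoichiometric_matrix(reactions)
```

A flux vector such as [2, 2, 2] balances both metabolites in this toy network, whereas [2, 1, 1] would accumulate A and so fails the steady-state check.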
BiGG: a Biochemical Genetic and Genomic knowledgebase of large scale metabolic reconstructions

PubMed Central

2010-01-01

Background Genome-scale metabolic reconstructions under the Constraint Based Reconstruction and Analysis (COBRA) framework are valuable tools for analyzing the metabolic capabilities of organisms and interpreting experimental data. As the number of such reconstructions and analysis methods increases, there is a greater need for data uniformity and ease of distribution and use. Description We describe BiGG, a knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG integrates several published genome-scale metabolic networks into one resource with standard nomenclature which allows components to be compared across different organisms. BiGG can be used to browse model content, visualize metabolic pathway maps, and export SBML files of the models for further analysis by external software packages. Users may follow links from BiGG to several external databases to obtain additional information on genes, proteins, reactions, metabolites and citations of interest. Conclusions BiGG addresses a need in the systems biology community to have access to high quality curated metabolic models and reconstructions. It is freely available for academic use at http://bigg.ucsd.edu. PMID:20426874
Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network

PubMed Central

Martín-Jiménez, Cynthia A.; Salazar-Barreto, Diego; Barreto, George E.; González, Janneth

2017-01-01

Astrocytes are the most abundant cells of the central nervous system; they have a predominant role in maintaining brain metabolism. In this sense, abnormal metabolic states have been found in different neuropathological diseases. Determination of metabolic states of astrocytes is difficult to model using current experimental approaches given the high number of reactions and metabolites present. Thus, genome-scale metabolic networks derived from transcriptomic data can be used as a framework to elucidate how astrocytes modulate human brain metabolic states during normal conditions and in neurodegenerative diseases. We performed a Genome-Scale Reconstruction of the Human Astrocyte Metabolic Network with the purpose of elucidating a significant portion of the metabolic map of the astrocyte. This is the first global high-quality, manually curated metabolic reconstruction network of a human astrocyte. It includes 5,007 metabolites and 5,659 reactions distributed among 8 cell compartments (extracellular, cytoplasm, mitochondria, endoplasmic reticulum, Golgi apparatus, lysosome, peroxisome and nucleus). Using the reconstructed network, the metabolic capabilities of human astrocytes were calculated and compared both in normal and ischemic conditions. We identified reactions activated in these two states, which can be useful for understanding the astrocytic pathways that are affected during brain disease. We also showed that the obtained flux distributions in the model are in accordance with literature-based findings. To date, this is the most complete representation of the human astrocyte in terms of inclusion of genes, proteins, reactions and metabolic pathways, being a useful guide for in-silico analysis of several metabolic behaviors of the astrocyte during normal and pathologic states. PMID:28243200
AlgaGEM – a genome-scale metabolic reconstruction of algae based on the Chlamydomonas reinhardtii genome

PubMed Central

2011-01-01

Background Microalgae have the potential to deliver biofuels without the associated competition for land resources. In order to realise the rates and titres necessary for commercial production, however, system-level metabolic engineering will be required. Genome scale metabolic reconstructions have revolutionized microbial metabolic engineering and are used routinely for in silico analysis and design. While genome scale metabolic reconstructions have been developed for many prokaryotes and model eukaryotes, the application to less well characterized eukaryotes such as algae is challenging not at least due to a lack of compartmentalization data. Results We have developed a genome-scale metabolic network model (named AlgaGEM) covering the metabolism for a compartmentalized algae cell based on the Chlamydomonas reinhardtii genome. AlgaGEM is a comprehensive literature-based genome scale metabolic reconstruction that accounts for the functions of 866 unique ORFs, 1862 metabolites, 2249 gene-enzyme-reaction-association entries, and 1725 unique reactions. The reconstruction was compartmentalized into the cytoplasm, mitochondrion, plastid and microbody using available data for algae complemented with compartmentalisation data for Arabidopsis thaliana. AlgaGEM describes a functional primary metabolism of Chlamydomonas and significantly predicts distinct algal behaviours such as the catabolism or secretion rather than recycling of phosphoglycolate in photorespiration. AlgaGEM was validated through the simulation of growth and algae metabolic functions inferred from literature. Using efficient resource utilisation as the optimality criterion, AlgaGEM predicted observed metabolic effects under autotrophic, heterotrophic and mixotrophic conditions. AlgaGEM predicts increased hydrogen production when cyclic electron flow is disrupted as seen in a high producing mutant derived from mutational studies. The model also predicted the physiological pathway for H2 production and
Reconstruction of 24 Penicillium genome-scale metabolic models shows diversity based on their secondary metabolism.

PubMed

Prigent, Sylvain; Nielsen, Jens Christian; Frisvad, Jens Christian; Nielsen, Jens

2018-06-05

Modelling of metabolism at the genome scale has proved to be an efficient method for explaining observed phenotypic traits in living organisms. Further, it can be used as a means of predicting the effect of genetic modifications, e.g. for development of microbial cell factories. With the increasing amount of genome sequencing data available, a need exists to accurately and efficiently generate such genome-scale metabolic models (GEMs) of non-model organisms, for which data is sparse. In this study, we present an automatic reconstruction approach applied to 24 Penicillium species, which have potential for production of pharmaceutical secondary metabolites or are used in the manufacturing of food products such as cheeses. The models were based on the MetaCyc database and a previously published Penicillium GEM, and gave rise to comprehensive genome-scale metabolic descriptions. The models proved that while central carbon metabolism is highly conserved, secondary metabolic pathways represent the main diversity among the species. The automatic reconstruction approach presented in this study can be applied to generate GEMs of other understudied organisms, and the developed GEMs are a useful resource for the study of Penicillium metabolism, for example with the scope of developing novel cell factories. This article is protected by copyright. All rights reserved.
Reconstruction of genome-scale human metabolic models using omics data.

PubMed

Ryu, Jae Yong; Kim, Hyun Uk; Lee, Sang Yup

2015-08-01

The impact of genome-scale human metabolic models on human systems biology and medical sciences is becoming greater, thanks to increasing volumes of model building platforms and publicly available omics data. The genome-scale human metabolic models started with Recon 1 in 2007, and have since been used to describe metabolic phenotypes of healthy and diseased human tissues and cells, and to predict therapeutic targets. Here we review recent trends in genome-scale human metabolic modeling, including various generic and tissue/cell type-specific human metabolic models developed to date, and methods, databases and platforms used to construct them. For generic human metabolic models, we pay attention to Recon 2 and HMR 2.0 with emphasis on data sources used to construct them. Draft and high-quality tissue/cell type-specific human metabolic models have been generated using these generic human metabolic models. Integration of tissue/cell type-specific omics data with the generic human metabolic models is the key step, and we discuss omics data and their integration methods to achieve this task. The initial version of the tissue/cell type-specific human metabolic models can further be computationally refined through gap filling, reaction directionality assignment and the subcellular localization of metabolic reactions. We review relevant tools for this model refinement procedure as well. Finally, we suggest the direction of further studies on reconstructing an improved human metabolic model.
Genome scale metabolic reconstruction of Chlorella variabilis for exploring its metabolic potential for biofuels.

PubMed

Juneja, Ankita; Chaplen, Frank W R; Murthy, Ganti S

2016-08-01

A compartmentalized genome scale metabolic network was reconstructed for Chlorella variabilis to offer insight into various metabolic potentials from this alga. The model, iAJ526, was reconstructed with 1455 reactions, 1236 metabolites and 526 genes. 21% of the reactions were transport reactions and about 81% of the total reactions were associated with enzymes. Along with gap filling reactions, 2 major sub-pathways were added to the model, chitosan synthesis and rhamnose metabolism. The reconstructed model had reaction participation of 4.3 metabolites per reaction and average lethality fraction of 0.21. The model was effective in capturing the growth of C. variabilis under three light conditions (white, red and red+blue light) with fair agreement. This reconstructed metabolic network will serve an important role in systems biology for further exploration of metabolism for specific target metabolites and enable improved characteristics in the strain through metabolic engineering. Copyright © 2016 Elsevier Ltd. All rights reserved.
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

PubMed Central

Kim, Woonsu; Park, Hyesun; Seo, Seongwon

2016-01-01

The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline of an organism for amalgamating genomic annotations from multiple sources was updated. Using this, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between NCBI and Ensembl databases, 8,410 and 5,493 genes only found in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and cattle pathway genome database (PGDB) was also developed using Pathway Tools, followed by an intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxifications. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle. PMID
A genome-scale metabolic reconstruction of Pseudomonas putida KT2440: iJN746 as a cell factory.

PubMed

Nogales, Juan; Palsson, Bernhard Ø; Thiele, Ines

2008-09-16

Pseudomonas putida is the best-studied pollutant-degrading bacterium and is harnessed by industrial biotechnology to synthesize fine chemicals. Since the publication of P. putida KT2440's genome, some in silico analyses of its metabolic and biotechnology capacities have been published. However, global understanding of the capabilities of P. putida KT2440 requires the construction of a metabolic model that enables the integration of classical experimental data along with genomic and high-throughput data. The constraint-based reconstruction and analysis (COBRA) approach has been successfully used to build and analyze in silico genome-scale metabolic reconstructions. We present a genome-scale reconstruction of P. putida KT2440's metabolism, iJN746, which was constructed based on genomic, biochemical, and physiological information. This manually-curated reconstruction accounts for 746 genes, 950 reactions, and 911 metabolites. iJN746 captures biotechnologically relevant pathways, including polyhydroxyalkanoate synthesis and catabolic pathways of aromatic compounds (e.g., toluene, benzoate, phenylacetate, nicotinate), not described in other metabolic reconstructions or biochemical databases. The predictive potential of iJN746 was validated using experimental data including growth performance and gene deletion studies. Furthermore, in silico growth on toluene was found to be oxygen-limited, suggesting the existence of oxygen-efficient pathways not yet annotated in P. putida's genome. Moreover, we evaluated the production efficiency of polyhydroxyalkanoates from various carbon sources and found fatty acids as the most prominent candidates, as expected. Here we present the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-surrounder. Taken together, this work illustrates the utility of iJN746 as i) a knowledge-base, ii) a discovery tool, and iii) an engineering platform to explore P. putida's potential in bioremediation and bioplastic
An Experimentally-Supported Genome-Scale Metabolic Network Reconstruction for Yersinia pestis CO92

DOE Office of Scientific and Technical Information (OSTI.GOV)

Charusanti, Pep; Chauhan, Sadhana; Mcateer, Kathleen

2011-10-13

Yersinia pestis is a gram-negative bacterium that causes plague, a disease linked historically to the Black Death in Europe during the Middle Ages and to several outbreaks during the modern era. Metabolism in Y. pestis displays remarkable flexibility and robustness, allowing the bacterium to proliferate in both warm-blooded mammalian hosts and cold-blooded insect vectors such as fleas. Here we report a genome-scale reconstruction and mathematical model of metabolism for Y. pestis CO92 and supporting experimental growth and metabolite measurements. The model contains 815 genes, 678 proteins, 963 unique metabolites and 1678 reactions, accurately simulates growth on a range of carbon sources both qualitatively and quantitatively, and identifies gaps in several key biosynthetic pathways and suggests how those gaps might be filled. Furthermore, our model presents hypotheses to explain certain known nutritional requirements characteristic of this strain. Y. pestis continues to be a dangerous threat to human health during modern times. The Y. pestis genome-scale metabolic reconstruction presented here, which has been benchmarked against experimental data and correctly reproduces known phenotypes, thus provides an in silico platform with which to investigate the metabolism of this important human pathogen.
Exploring Hydrogenotrophic Methanogenesis: a Genome Scale Metabolic Reconstruction of Methanococcus maripaludis

DOE PAGES

Richards, Matthew A.; Lie, Thomas J.; Zhang, Juan; ...

2016-10-10

Hydrogenotrophic methanogenesis occurs in multiple environments, ranging from the intestinal tracts of animals to anaerobic sediments and hot springs. Energy conservation in hydrogenotrophic methanogens was long a mystery; only within the last decade was it reported that net energy conservation for growth depends on electron bifurcation. In this work, we focus on Methanococcus maripaludis, a well-studied hydrogenotrophic marine methanogen. To better understand hydrogenotrophic methanogenesis and compare it with methylotrophic methanogenesis that utilizes oxidative phosphorylation rather than electron bifurcation, we have built iMR539, a genome scale metabolic reconstruction that accounts for 539 of the 1,722 protein-coding genes of M. maripaludis strain S2. Our reconstructed metabolic network uses recent literature to not only represent the central electron bifurcation reaction but also incorporate vital biosynthesis and assimilation pathways, including unique cofactor and coenzyme syntheses. We show that our model accurately predicts experimental growth and gene knockout data, with 93% accuracy and a Matthews correlation coefficient of 0.78. Furthermore, we use our metabolic network reconstruction to probe the implications of electron bifurcation by showing its essentiality, as well as investigating the infeasibility of aceticlastic methanogenesis in the network. Additionally, we demonstrate a method of applying thermodynamic constraints to a metabolic model to quickly estimate overall free-energy changes between what comes in and out of the cell. Finally, we describe a novel reconstruction-specific computational toolbox we created to improve usability. Together, our results provide a computational network for exploring hydrogenotrophic methanogenesis and confirm the importance of electron bifurcation in this process. Understanding and applying hydrogenotrophic methanogenesis is a promising avenue for developing new bioenergy technologies around methane gas
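Note on the validation metrics quoted for iMR539: accuracy and the Matthews correlation coefficient (MCC) both derive from a confusion matrix of predicted versus observed outcomes, with MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below shows the arithmetic only; the boolean lists are made-up illustrative data, not results from the paper, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math


def confusion_counts(predicted, observed):
    """Count TP, TN, FP, FN for paired boolean outcomes (True = growth)."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    tn = sum(1 for p, o in zip(predicted, observed) if not p and not o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that match the observation."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC; defined as 0.0 here when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical knockout screen: model prediction vs. experiment.
predicted = [True, True, False, False]
observed = [True, False, False, False]
counts = confusion_counts(predicted, observed)
```

Unlike accuracy, MCC stays informative when essential and non-essential genes are heavily imbalanced, which is presumably why both figures are reported.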
iCN718, an Updated and Improved Genome-Scale Metabolic Network Reconstruction of Acinetobacter baumannii AYE.

PubMed

Norsigian, Charles J; Kavvas, Erol; Seif, Yara; Palsson, Bernhard O; Monk, Jonathan M

2018-01-01

Acinetobacter baumannii has become an urgent clinical threat due to the recent emergence of multi-drug resistant strains. There is thus a significant need to discover new therapeutic targets in this organism. One means for doing so is through the use of high-quality genome-scale reconstructions. Well-curated and accurate genome-scale models (GEMs) of A. baumannii would be useful for improving treatment options. We present an updated and improved genome-scale reconstruction of A. baumannii AYE, named iCN718, that improves and standardizes previous A. baumannii AYE reconstructions. iCN718 has 80% accuracy for predicting gene essentiality data and additionally can predict large-scale phenotypic data with as much as 89% accuracy, a new capability for an A. baumannii reconstruction. We further demonstrate that iCN718 can be used to analyze conserved metabolic functions in the A. baumannii core genome and to build strain-specific GEMs of 74 other A. baumannii strains from genome sequence alone. iCN718 will serve as a resource to integrate and synthesize new experimental data being generated for this urgent threat pathogen.
Traceability, reproducibility and wiki-exploration for “à-la-carte” reconstructions of genome-scale metabolic models

PubMed Central

Got, Jeanne; Cortés, María Paz; Maass, Alejandro

2018-01-01

Genome-scale metabolic models have become the tool of choice for the global analysis of microorganism metabolism, and their reconstruction has attained high standards of quality and reliability. Improvements in this area have been accompanied by the development of some major platforms and databases, and an explosion of individual bioinformatics methods. Consequently, many recent models result from “à la carte” pipelines, combining the use of platforms, individual tools and biological expertise to enhance the quality of the reconstruction. Although very useful, introducing heterogeneous tools, that hardly interact with each other, causes loss of traceability and reproducibility in the reconstruction process. This represents a real obstacle, especially when considering less studied species whose metabolic reconstruction can greatly benefit from the comparison to good quality models of related organisms. This work proposes an adaptable workspace, AuReMe, for sustainable reconstructions or improvements of genome-scale metabolic models involving personalized pipelines. At each step, relevant information related to the modifications brought to the model by a method is stored. This ensures that the process is reproducible and documented regardless of the combination of tools used. Additionally, the workspace establishes a way to browse metabolic models and their metadata through the automatic generation of ad-hoc local wikis dedicated to monitoring and facilitating the process of reconstruction. AuReMe supports exploration and semantic query based on RDF databases. We illustrate how this workspace allowed handling, in an integrated way, the metabolic reconstructions of non-model organisms such as an extremophile bacterium or eukaryote algae. Among relevant applications, the latter reconstruction led to putative evolutionary insights of a metabolic pathway. PMID:29791443
redGEM: Systematic reduction and analysis of genome-scale metabolic reconstructions for development of consistent core metabolic models

PubMed Central

Ataman, Meric

2017-01-01

Genome-scale metabolic reconstructions have proven to be valuable resources in enhancing our understanding of metabolic networks as they encapsulate all known metabolic capabilities of the organisms from genes to proteins to their functions. However the complexity of these large metabolic networks often hinders their utility in various practical applications. Although reduced models are commonly used for modeling and in integrating experimental data, they are often inconsistent across different studies and laboratories due to different criteria and detail, which can compromise transferability of the findings and also integration of experimental data from different groups. In this study, we have developed a systematic semi-automatic approach to reduce genome-scale models into core models in a consistent and logical manner focusing on the central metabolism or subsystems of interest. The method minimizes the loss of information using an approach that combines graph-based search and optimization methods. The resulting core models are shown to be able to capture key properties of the genome-scale models and preserve consistency in terms of biomass and by-product yields, flux and concentration variability and gene essentiality. The development of these “consistently-reduced” models will help to clarify and facilitate integration of different experimental data to draw new understanding that can be directly extendable to genome-scale models. PMID:28727725
Automation on the generation of genome-scale metabolic models.

PubMed

Reyes, R; Gamermann, D; Montagud, A; Fuente, D; Triana, J; Urchueguía, J F; de Córdoba, P Fernández

2012-12-01

Nowadays, the reconstruction of genome-scale metabolic models is a nonautomatized and interactive process based on decision making. This lengthy process usually requires a full year of one person's work in order to satisfactorily collect, analyze, and validate the list of all metabolic reactions present in a specific organism. In order to write this list, one manually has to go through a huge amount of genomic, metabolomic, and physiological information. Currently, there is no optimal algorithm that allows one to automatically go through all this information and generate the models taking into account probabilistic criteria of unicity and completeness that a biologist would consider. This work presents the automation of a methodology for the reconstruction of genome-scale metabolic models for any organism. The methodology that follows is the automatized version of the steps implemented manually for the reconstruction of the genome-scale metabolic model of a photosynthetic organism, Synechocystis sp. PCC6803. The steps for the reconstruction are implemented in a computational platform (COPABI) that generates the models from the probabilistic algorithms that have been developed. For validation of the developed algorithm robustness, the metabolic models of several organisms generated by the platform have been studied together with published models that have been manually curated. Network properties of the models, like connectivity and average shortest path length of the different models, have been compared and analyzed.
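Note on the network properties compared in the record above: connectivity and average shortest path are standard graph statistics. The sketch below computes them for a metabolite graph using breadth-first search. It assumes an undirected, unweighted graph in which two metabolites are linked when they share a reaction; that representation and the three-node example are assumptions for illustration, not details taken from COPABI.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_degree(graph):
    """Average number of neighbours per node (a simple connectivity measure)."""
    return sum(len(neighbours) for neighbours in graph.values()) / len(graph)


def average_shortest_path(graph):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in graph:
        for target, d in bfs_distances(graph, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical metabolite graph: A - B - C
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
```

Unreachable pairs are simply skipped, so on a disconnected network the average describes only the connected pairs.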
Systems metabolic engineering: genome-scale models and beyond.

PubMed

Blazeck, John; Alper, Hal

2010-07-01

The advent of high throughput genome-scale bioinformatics has led to an exponential increase in available cellular system data. Systems metabolic engineering attempts to use data-driven approaches--based on the data collected with high throughput technologies--to identify gene targets and optimize phenotypical properties on a systems level. Current systems metabolic engineering tools are limited for predicting and defining complex phenotypes such as chemical tolerances and other global, multigenic traits. The most pragmatic systems-based tool for metabolic engineering to arise is the in silico genome-scale metabolic reconstruction. This tool has seen wide adoption for modeling cell growth and predicting beneficial gene knockouts, and we examine here how this approach can be expanded for novel organisms. This review will highlight advances of the systems metabolic engineering approach with a focus on de novo development and use of genome-scale metabolic reconstructions for metabolic engineering applications. We will then discuss the challenges and prospects for this emerging field to enable model-based metabolic engineering. Specifically, we argue that current state-of-the-art systems metabolic engineering techniques represent a viable first step for improving product yield that still must be followed by combinatorial techniques or random strain mutagenesis to achieve optimal cellular systems.
Genome-scale reconstruction of the Streptococcus pyogenes M49 metabolic network reveals growth requirements and indicates potential drug targets.

PubMed

Levering, Jennifer; Fiedler, Tomas; Sieg, Antje; van Grinsven, Koen W A; Hering, Silvio; Veith, Nadine; Olivier, Brett G; Klett, Lara; Hugenholtz, Jeroen; Teusink, Bas; Kreikemeyer, Bernd; Kummer, Ursula

2016-08-20

Genome-scale metabolic models comprise stoichiometric relations between metabolites, as well as associations between genes and metabolic reactions and facilitate the analysis of metabolism. We computationally reconstructed the metabolic network of the lactic acid bacterium Streptococcus pyogenes M49. Initially, we based the reconstruction on genome annotations and already existing and curated metabolic networks of Bacillus subtilis, Escherichia coli, Lactobacillus plantarum and Lactococcus lactis. This initial draft was manually curated with the final reconstruction accounting for 480 genes associated with 576 reactions and 558 metabolites. In order to constrain the model further, we performed growth experiments of wild type and arcA deletion strains of S. pyogenes M49 in a chemically defined medium and calculated nutrient uptake and production fluxes. We additionally performed amino acid auxotrophy experiments to test the consistency of the model. The established genome-scale model can be used to understand the growth requirements of the human pathogen S. pyogenes and define optimal and suboptimal conditions, but also to describe differences and similarities between S. pyogenes and related lactic acid bacteria such as L. lactis in order to find strategies to reduce the growth of the pathogen and propose drug targets. Copyright © 2016 Elsevier B.V. All rights reserved.
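Note on the gene-reaction associations described in the record above: these are commonly written as Boolean rules, where "and" joins subunits of one enzyme complex and "or" joins alternative isozymes. The sketch below evaluates such a rule for a given set of deleted genes, which is the basic step behind in silico deletion studies like the arcA comparison mentioned in the abstract. The rule syntax and the gene names g1 to g3 are hypothetical, and treating a reaction with no rule as always active is an assumption.

```python
import re


def reaction_active(rule, deleted):
    """Return True if the reaction can still be catalysed after removing
    the genes in `deleted`. `rule` uses 'and', 'or' and parentheses."""
    tokens = re.findall(r"\(|\)|[^\s()]+", rule)
    if not tokens:
        return True  # assumption: no gene association means always active
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "or":
            pos += 1
            rhs = parse_and()  # always parse, then combine
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "and":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            pos += 1  # consume the closing parenthesis
            return value
        return token not in deleted  # a gene is "on" unless deleted

    return parse_or()
```

For the rule "(g1 and g2) or g3", deleting g1 alone leaves the reaction active through the isozyme g3, while deleting both g1 and g3 switches it off.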
Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

PubMed

diCenzo, George C; Finan, Turlough M

2018-01-01

The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.
Reconstruction of metabolic pathways for the cattle genome

PubMed Central

Seo, Seongwon; Lewin, Harris A

2009-01-01

Background Metabolic reconstruction of microbial, plant and animal genomes is a necessary step toward understanding the evolutionary origins of metabolism and species-specific adaptive traits. The aims of this study were to reconstruct conserved metabolic pathways in the cattle genome and to identify metabolic pathways with missing genes and proteins. The MetaCyc database and PathwayTools software suite were chosen for this work because they are widely used and easy to implement. Results An amalgamated cattle genome database was created using the NCBI and Ensembl cattle genome databases (based on build 3.1) as data sources. PathwayTools was used to create a cattle-specific pathway genome database, which was followed by comprehensive manual curation for the reconstruction of metabolic pathways. The curated database, CattleCyc 1.0, consists of 217 metabolic pathways. A total of 64 mammalian-specific metabolic pathways were modified from the reference pathways in MetaCyc, and two pathways previously identified but missing from MetaCyc were added. Comparative analysis of metabolic pathways revealed the absence of mammalian genes for 22 metabolic enzymes whose activity was reported in the literature. We also identified six human metabolic protein-coding genes for which the cattle ortholog is missing from the sequence assembly. Conclusion CattleCyc is a powerful tool for understanding the biology of ruminants and other cetartiodactyl species. In addition, the approach used to develop CattleCyc provides a framework for the metabolic reconstruction of other newly sequenced mammalian genomes. It is clear that metabolic pathway analysis strongly reflects the quality of the underlying genome annotations. Thus, having well-annotated genomes from many mammalian species hosted in BioCyc will facilitate the comparative analysis of metabolic pathways among different species and a systems approach to comparative physiology. PMID:19284618

Metabolism and evolution: A comparative study of reconstructed genome-level metabolic networks

NASA Astrophysics Data System (ADS)

Almaas, Eivind

2008-03-01

The availability of high-quality annotations of sequenced genomes has made it possible to generate organism-specific comprehensive maps of cellular metabolism. Currently, more than twenty such metabolic reconstructions are publicly available, with the majority focused on bacteria. A typical metabolic reconstruction for a bacterium results in a complex network containing hundreds of metabolites (nodes) and reactions (links), while some even contain more than a thousand. The constraint-based optimization approach of flux-balance analysis (FBA) is used to investigate the functional characteristics of such large-scale metabolic networks, making it possible to estimate an organism's growth behavior in a wide variety of nutrient environments, as well as its robustness to gene loss. We have recently completed the genome-level metabolic reconstruction of Yersinia pseudotuberculosis, as well as the three Yersinia pestis biovars Antiqua, Mediaevalis, and Orientalis. While Y. pseudotuberculosis typically only causes fever and abdominal pain that can mimic appendicitis, the evolutionarily closely related Y. pestis strains are the aetiological agents of the bubonic plague. In this presentation, I will discuss our results and conclusions from a comparative study on the evolution of metabolic function in the four Yersiniae networks using FBA and related techniques, and I will give particular focus to the interplay between metabolic network topology and evolutionary flexibility.
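As an illustration of the steady-state constraint that flux-balance analysis builds on, the following minimal sketch checks whether a flux vector v satisfies S·v = 0 for a stoichiometric matrix S. The three-reaction toy network and the function names are assumptions made for this example; they are not taken from the Yersinia reconstructions, and no optimization step is performed here.

```python
# Hypothetical toy network: uptake -> A, A -> B, B -> biomass drain.
# Rows are metabolites, columns are reactions.
S = [
    [1, -1, 0],   # A: produced by uptake, consumed by conversion
    [0, 1, -1],   # B: produced by conversion, consumed by the drain
]


def net_production(S, v):
    """Return S*v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if no metabolite accumulates or is depleted under fluxes v."""
    return all(abs(rate) <= tol for rate in net_production(S, v))


balanced = [10, 10, 10]    # every metabolite is consumed as fast as it is made
unbalanced = [10, 5, 5]    # A is made faster than it is consumed

print(is_steady_state(S, balanced))
print(is_steady_state(S, unbalanced))
```

A full FBA would additionally maximize an objective flux (for example the biomass drain) over all flux vectors that pass this check and respect flux bounds, which requires a linear programming solver.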
Zea mays iRS1563: A Comprehensive Genome-Scale Metabolic Reconstruction of Maize Metabolism

PubMed Central

Saha, Rajib; Suthers, Patrick F.; Maranas, Costas D.

2011-01-01

The scope and breadth of genome-scale metabolic reconstructions have continued to expand over the last decade. Herein, we introduce a genome-scale model for a plant with direct applications to food and bioenergy production (i.e., maize). Maize annotation is still underway, which introduces significant challenges in the association of metabolic functions to genes. The developed model is designed to meet rigorous standards on gene-protein-reaction (GPR) associations, elementally and charge balanced reactions and a biomass reaction abstracting the relative contribution of all biomass constituents. The metabolic network contains 1,563 genes and 1,825 metabolites involved in 1,985 reactions from primary and secondary maize metabolism. For approximately 42% of the reactions direct literature evidence for the participation of the reaction in maize was found. As many as 445 reactions and 369 metabolites are unique to the maize model compared to the AraGEM model for A. thaliana. 674 metabolites and 893 reactions are present in Zea mays iRS1563 that are not accounted for in maize C4GEM. All reactions are elementally and charge balanced and localized into six different compartments (i.e., cytoplasm, mitochondrion, plastid, peroxisome, vacuole and extracellular). GPR associations are also established based on the functional annotation information and homology prediction accounting for monofunctional, multifunctional and multimeric proteins, isozymes and protein complexes. We describe results from performing flux balance analysis under different physiological conditions (i.e., photosynthesis, photorespiration and respiration) of a C4 plant and also explore model predictions against experimental observations for two naturally occurring mutants (i.e., bm1 and bm3). The developed model corresponds to the largest and most complete effort to date at cataloguing metabolism for a plant species. PMID:21755001
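Gene-protein-reaction associations of the kind mentioned above are commonly treated as Boolean rules, with AND for the subunits of a protein complex and OR for isozymes. The sketch below evaluates such a rule against a set of available genes. The nested-tuple encoding, the gene names g1 to g3, and the function name are assumptions made for illustration; they do not come from the iRS1563 model.

```python
def gpr_active(rule, available_genes):
    """Evaluate a GPR rule given as a gene name or a nested ("and"/"or", ...) tuple."""
    if isinstance(rule, str):
        # A single gene: the reaction needs this gene to be present.
        return rule in available_genes
    operator, *operands = rule
    if operator == "and":
        # Protein complex: every subunit is required.
        return all(gpr_active(r, available_genes) for r in operands)
    if operator == "or":
        # Isozymes: any one alternative is sufficient.
        return any(gpr_active(r, available_genes) for r in operands)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical rule: a two-subunit complex (g1 AND g2) or an isozyme g3.
rule = ("or", ("and", "g1", "g2"), "g3")

print(gpr_active(rule, {"g1", "g2"}))   # g3 deleted, complex still intact
print(gpr_active(rule, {"g2"}))         # g1 and g3 deleted
```

In a simulated gene deletion, a reaction whose rule evaluates to false would typically have its flux bounds set to zero before the model is solved again.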
A refined genome-scale reconstruction of Chlamydomonas metabolism provides a platform for systems-level analyses

DOE PAGES

Imam, Saheed; Schäuble, Sascha; Valenzuela, Jacob; ...

2015-10-20

Microalgae have reemerged as organisms of prime biotechnological interest due to their ability to synthesize a suite of valuable chemicals. To harness the capabilities of these organisms, we need a comprehensive systems-level understanding of their metabolism, which can be fundamentally achieved through large-scale mechanistic models of metabolism. In this study, we present a revised and significantly improved genome-scale metabolic model for the widely-studied microalga, Chlamydomonas reinhardtii. The model, iCre1355, represents a major advance over previous models, both in content and predictive power. iCre1355 encompasses a broad range of metabolic functions encoded across the nuclear, chloroplast and mitochondrial genomes accounting for 1355 genes (1460 transcripts), 2394 reactions and 1133 metabolites. We found improved performance over the previous metabolic model based on comparisons of predictive accuracy across 306 phenotypes (from 81 mutants), lipid yield analysis and growth rates derived from chemostat-grown cells (under three conditions). Measurement of macronutrient uptake revealed carbon and phosphate to be good predictors of growth rate, while nitrogen consumption appeared to be in excess. We analyzed high-resolution time series transcriptomics data using iCre1355 to uncover dynamic pathway-level changes that occur in response to nitrogen starvation and changes in light intensity. This approach enabled accurate prediction of growth rates, the cessation of growth and accumulation of triacylglycerols during nitrogen starvation, and the temporal response of different growth-associated pathways to increased light intensity. Thus, iCre1355 represents an experimentally validated genome-scale reconstruction of C. reinhardtii metabolism that should serve as a useful resource for studying the metabolic processes of this and related microalgae.
A genome-scale metabolic network reconstruction of tomato (Solanum lycopersicum L.) and its application to photorespiratory metabolism.

PubMed

Yuan, Huili; Cheung, C Y Maurice; Poolman, Mark G; Hilbers, Peter A J; van Riel, Natal A W

2016-01-01

Tomato (Solanum lycopersicum L.) has been studied extensively due to its high economic value in the market, and high content in health-promoting antioxidant compounds. Tomato is also considered an excellent model organism for studying the development and metabolism of fleshy fruits. However, the growth, yield and fruit quality of tomatoes can be affected by drought stress, a common abiotic stress for tomato. To investigate the potential metabolic response of tomato plants to drought, we reconstructed iHY3410, a genome-scale metabolic model of tomato leaf, and used this metabolic network to simulate tomato leaf metabolism. The resulting model includes 3410 genes and 2143 biochemical and transport reactions distributed across five intracellular organelles including cytosol, plastid, mitochondrion, peroxisome and vacuole. The model successfully described the known metabolic behaviour of tomato leaf under heterotrophic and phototrophic conditions. The in silico investigation of the metabolic characteristics for photorespiration and other relevant metabolic processes under drought stress suggested that: (i) the flux distributions through the mevalonate (MVA) pathway under drought were distinct from those under normal conditions; and (ii) the changes in fluxes through core metabolic pathways with varying flux ratio of RubisCO carboxylase to oxygenase may contribute to the adaptive stress response of plants. In addition, we improved on previous studies of reaction essentiality analysis for leaf metabolism by including potential alternative routes for compensating reaction knockouts. Altogether, the genome-scale model provides a sound framework for investigating tomato metabolism and gives valuable insights into the functional consequences of abiotic stresses. © 2015 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
A refined genome-scale reconstruction of Chlamydomonas metabolism provides a platform for systems-level analyses.

PubMed

Imam, Saheed; Schäuble, Sascha; Valenzuela, Jacob; López García de Lomana, Adrián; Carter, Warren; Price, Nathan D; Baliga, Nitin S

2015-12-01

Microalgae have reemerged as organisms of prime biotechnological interest due to their ability to synthesize a suite of valuable chemicals. To harness the capabilities of these organisms, we need a comprehensive systems-level understanding of their metabolism, which can be fundamentally achieved through large-scale mechanistic models of metabolism. In this study, we present a revised and significantly improved genome-scale metabolic model for the widely-studied microalga, Chlamydomonas reinhardtii. The model, iCre1355, represents a major advance over previous models, both in content and predictive power. iCre1355 encompasses a broad range of metabolic functions encoded across the nuclear, chloroplast and mitochondrial genomes accounting for 1355 genes (1460 transcripts), 2394 reactions and 1133 metabolites. We found improved performance over the previous metabolic model based on comparisons of predictive accuracy across 306 phenotypes (from 81 mutants), lipid yield analysis and growth rates derived from chemostat-grown cells (under three conditions). Measurement of macronutrient uptake revealed carbon and phosphate to be good predictors of growth rate, while nitrogen consumption appeared to be in excess. We analyzed high-resolution time series transcriptomics data using iCre1355 to uncover dynamic pathway-level changes that occur in response to nitrogen starvation and changes in light intensity. This approach enabled accurate prediction of growth rates, the cessation of growth and accumulation of triacylglycerols during nitrogen starvation, and the temporal response of different growth-associated pathways to increased light intensity. Thus, iCre1355 represents an experimentally validated genome-scale reconstruction of C. reinhardtii metabolism that should serve as a useful resource for studying the metabolic processes of this and related microalgae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Reconstruction and Validation of a Genome-Scale Metabolic Model for the Filamentous Fungus Neurospora crassa Using FARM

PubMed Central

Hood, Heather M.; Ocasio, Linda R.; Sachs, Matthew S.; Galagan, James E.

2013-01-01

The filamentous fungus Neurospora crassa played a central role in the development of twentieth-century genetics, biochemistry and molecular biology, and continues to serve as a model organism for eukaryotic biology. Here, we have reconstructed a genome-scale model of its metabolism. This model consists of 836 metabolic genes, 257 pathways, 6 cellular compartments, and is supported by extensive manual curation of 491 literature citations. To aid our reconstruction, we developed three optimization-based algorithms, which together comprise Fast Automated Reconstruction of Metabolism (FARM). These algorithms are: LInear MEtabolite Dilution Flux Balance Analysis (limed-FBA), which predicts flux while linearly accounting for metabolite dilution; One-step functional Pruning (OnePrune), which removes blocked reactions with a single compact linear program; and Consistent Reproduction Of growth/no-growth Phenotype (CROP), which reconciles differences between in silico and experimental gene essentiality faster than previous approaches. Against an independent test set of more than 300 essential/non-essential genes that were not used to train the model, the model displays 93% sensitivity and specificity. We also used the model to simulate the biochemical genetics experiments originally performed on Neurospora by comprehensively predicting nutrient rescue of essential genes and synthetic lethal interactions, and we provide detailed pathway-based mechanistic explanations of our predictions. Our model provides a reliable computational framework for the integration and interpretation of ongoing experimental efforts in Neurospora, and we anticipate that our methods will substantially reduce the manual effort required to develop high-quality genome-scale metabolic models for other organisms. PMID:23935467
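Sensitivity and specificity figures such as those reported above are computed by comparing predicted and observed gene essentiality. The sketch below shows that calculation on a small made-up data set. The gene labels, the dictionaries, and the function name are hypothetical and do not reproduce the Neurospora test set.

```python
def sensitivity_specificity(predicted, observed):
    """Compare essentiality calls; both arguments map gene -> True if essential."""
    tp = fn = tn = fp = 0
    for gene, is_essential in observed.items():
        call = predicted[gene]
        if is_essential and call:
            tp += 1          # essential gene correctly predicted essential
        elif is_essential and not call:
            fn += 1          # essential gene missed by the model
        elif not is_essential and not call:
            tn += 1          # non-essential gene correctly predicted
        else:
            fp += 1          # non-essential gene wrongly predicted essential
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Made-up example data (not from the paper).
observed = {"a": True, "b": True, "c": False, "d": False, "e": False}
predicted = {"a": True, "b": False, "c": False, "d": False, "e": True}

sens, spec = sensitivity_specificity(predicted, observed)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The sketch assumes both classes are present in the observed data; a data set with no essential or no non-essential genes would need a guard against division by zero.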
Improved annotation through genome-scale metabolic modeling of Aspergillus oryzae

PubMed Central

Vongsangnak, Wanwipa; Olsen, Peter; Hansen, Kim; Krogsgaard, Steen; Nielsen, Jens

2008-01-01

Background Since ancient times the filamentous fungus Aspergillus oryzae has been used in the fermentation industry for the production of fermented sauces and the production of industrial enzymes. Recently, the genome sequence of A. oryzae with 12,074 annotated genes was released but the number of hypothetical proteins accounted for more than 50% of the annotated genes. Considering the industrial importance of this fungus, it is therefore valuable to improve the annotation and further integrate genomic information with biochemical and physiological information available for this microorganism and other related fungi. Here we performed gene prediction by constructing an A. oryzae Expressed Sequence Tag (EST) library, followed by sequencing and assembly. We enhanced the function assignment using an annotation strategy that we developed. The resulting improved annotation was used to reconstruct the metabolic network, leading to a genome-scale metabolic model of A. oryzae. Results From our assembled EST sequences we identified 1,046 newly predicted genes in the A. oryzae genome. Furthermore, it was possible to assign putative protein functions to 398 of the newly predicted genes. Notably, our annotation strategy resulted in assignment of new putative functions to 1,469 hypothetical proteins already present in the A. oryzae genome database. Using the substantially improved annotated genome we reconstructed the metabolic network of A. oryzae. This network contains 729 enzymes, 1,314 enzyme-encoding genes, 1,073 metabolites and 1,846 (1,053 unique) biochemical reactions. The metabolic reactions are compartmentalized into the cytosol, the mitochondria, the peroxisome and the extracellular space. Transport steps between the compartments and the extracellular space represent 281 reactions, of which 161 are unique. The metabolic model was validated and shown to correctly describe the phenotypic behavior of A. oryzae grown on different carbon sources. Conclusion A much enhanced annotation of the A
Advances in the integration of transcriptional regulatory information into genome-scale metabolic models.

PubMed

Vivek-Ananth, R P; Samal, Areejit

2016-09-01

A major goal of systems biology is to build predictive computational models of cellular metabolism. Availability of complete genome sequences and wealth of legacy biochemical information has led to the reconstruction of genome-scale metabolic networks in the last 15 years for several organisms across the three domains of life. Due to paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation comprising mRNA transcript levels and genome-wide binding profile of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

PubMed Central

Seaver, Samuel M. D.; Bradbury, Louis M. T.; Frelin, Océane; Zarecki, Raphy; Ruppin, Eytan; Hanson, Andrew D.; Henry, Christopher S.

2015-01-01

There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes. PMID:25806041
Improved evidence-based genome-scale metabolic models for maize leaf, embryo, and endosperm

DOE PAGES

Seaver, Samuel M.D.; Bradbury, Louis M.T.; Frelin, Océane; ...

2015-03-10

There is a growing demand for genome-scale metabolic reconstructions for plants, fueled by the need to understand the metabolic basis of crop yield and by progress in genome and transcriptome sequencing. Methods are also required to enable the interpretation of plant transcriptome data to study how cellular metabolic activity varies under different growth conditions or even within different organs, tissues, and developmental stages. Such methods depend extensively on the accuracy with which genes have been mapped to the biochemical reactions in the plant metabolic pathways. Errors in these mappings lead to metabolic reconstructions with an inflated number of reactions and possible generation of unreliable metabolic phenotype predictions. Here we introduce a new evidence-based genome-scale metabolic reconstruction of maize, with significant improvements in the quality of the gene-reaction associations included within our model. We also present a new approach for applying our model to predict active metabolic genes based on transcriptome data. This method includes a minimal set of reactions associated with low expression genes to enable activity of a maximum number of reactions associated with high expression genes. We apply this method to construct an organ-specific model for the maize leaf, and tissue specific models for maize embryo and endosperm cells. We validate our models using fluxomics data for the endosperm and embryo, demonstrating an improved capacity of our models to fit the available fluxomics data. All models are publicly available via the DOE Systems Biology Knowledgebase and PlantSEED, and our new method is generally applicable for analysis of transcript profiles from any plant, paving the way for further in silico studies with a wide variety of plant genomes.
Genome-Scale Reconstruction and Analysis of the Metabolic Network in the Hyperthermophilic Archaeon Sulfolobus Solfataricus

PubMed Central

Ulas, Thomas; Riemer, S. Alexander; Zaparty, Melanie; Siebers, Bettina; Schomburg, Dietmar

2012-01-01

We describe the reconstruction of a genome-scale metabolic model of the crenarchaeon Sulfolobus solfataricus, a hyperthermoacidophilic microorganism. It grows in terrestrial volcanic hot springs with growth occurring at pH 2–4 (optimum 3.5) and a temperature of 75–80°C (optimum 80°C). The genome of Sulfolobus solfataricus P2 contains 2,992,245 bp on a single circular chromosome and encodes 2,977 proteins and a number of RNAs. The network comprises 718 metabolic and 58 transport/exchange reactions and 705 unique metabolites, based on the annotated genome and available biochemical data. Using the model in conjunction with constraint-based methods, we simulated the metabolic fluxes induced by different environmental and genetic conditions. The predictions were compared to experimental measurements and phenotypes of S. solfataricus. Furthermore, the performance of the network for 35 different carbon sources known for S. solfataricus from the literature was simulated. Comparing the growth on different carbon sources revealed that glycerol is the carbon source with the highest biomass flux per imported carbon atom (75% higher than glucose). Experimental data was also used to fit the model to phenotypic observations. In addition to the commonly known heterotrophic growth of S. solfataricus, the crenarchaeon is also able to grow autotrophically using the hydroxypropionate-hydroxybutyrate cycle for bicarbonate fixation. We integrated this pathway into our model and compared bicarbonate fixation with growth on glucose as sole carbon source. Finally, we tested the robustness of the metabolism with respect to gene deletions using the method of Minimization of Metabolic Adjustment (MOMA), which predicted that 18% of all possible single gene deletions would be lethal for the organism. PMID:22952675
Genome-scale modeling of human metabolism - a systems biology approach.

PubMed

Mardinoglu, Adil; Gatto, Francesco; Nielsen, Jens

2013-09-01

Altered metabolism is linked to the appearance of various human diseases and a better understanding of disease-associated metabolic changes may lead to the identification of novel prognostic biomarkers and the development of new therapies. Genome-scale metabolic models (GEMs) have been employed for studying human metabolism in a systematic manner, as well as for understanding complex human diseases. In the past decade, such metabolic models - one of the fundamental aspects of systems biology - have started contributing to the understanding of the mechanistic relationship between genotype and phenotype. In this review, we focus on the construction of the Human Metabolic Reaction database, the generation of healthy cell type- and cancer-specific GEMs using different procedures, and the potential applications of these developments in the study of human metabolism and in the identification of metabolic changes associated with various disorders. We further examine how in silico genome-scale reconstructions can be employed to simulate metabolic flux distributions and how high-throughput omics data can be analyzed in a context-dependent fashion. Insights yielded from this mechanistic modeling approach can be used for identifying new therapeutic agents and drug targets as well as for the discovery of novel biomarkers. Finally, recent advancements in genome-scale modeling and the future challenge of developing a model of whole-body metabolism are presented. The emergent contribution of GEMs to personalized and translational medicine is also discussed. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The RAVEN Toolbox and Its Use for Generating a Genome-scale Metabolic Model for Penicillium chrysogenum

PubMed Central

Agren, Rasmus; Liu, Liming; Shoaie, Saeed; Vongsangnak, Wanwipa; Nookaew, Intawat; Nielsen, Jens

2013-01-01

We present the RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox: a software suite that allows for semi-automated reconstruction of genome-scale models. It makes use of published models and/or the KEGG database, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology. The RAVEN Toolbox workflow was applied in order to reconstruct a genome-scale metabolic model for the important microbial cell factory Penicillium chrysogenum Wisconsin54-1255. The model was validated in a bibliomic study of in total 440 references, and it comprises 1471 unique biochemical reactions and 1006 ORFs. It was then used to study the roles of ATP and NADPH in the biosynthesis of penicillin, and to identify potential metabolic engineering targets for maximization of penicillin production. PMID:23555215
Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach

PubMed Central

Ponce-de-Leon, Miguel; Calle-Espinosa, Jorge; Peretó, Juli; Montero, Francisco

2015-01-01

Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. To detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies, a metamodel was constructed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach, and it was then used as a reference network to perform gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information, and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed us to conclude that: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstruction process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts; and 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion, consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information. PMID:26629901
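One simple class of gap metabolite can be found by looking for metabolites that are only ever produced or only ever consumed, since any reaction touching them cannot carry flux at steady state. The sketch below applies that check to a toy network. It assumes all reactions are irreversible, and the reaction names, metabolite names, and function name are hypothetical; it is not the unconnected modules approach used in the paper.

```python
def dead_end_metabolites(reactions):
    """Return metabolites that are only produced or only consumed.

    `reactions` maps a reaction name to {metabolite: coefficient}, where a
    negative coefficient means consumption and a positive one production.
    All reactions are treated as irreversible (an assumption of this sketch).
    """
    produced, consumed = set(), set()
    for stoichiometry in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            if coefficient > 0:
                produced.add(metabolite)
            elif coefficient < 0:
                consumed.add(metabolite)
    # Symmetric difference: in exactly one of the two sets.
    return sorted(produced ^ consumed)


# Hypothetical toy network: A is imported and converted to C via B,
# but nothing consumes or exports C.
reactions = {
    "EX_A": {"A": 1},
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
}

print(dead_end_metabolites(reactions))
```

Here C has no consuming reaction, so R2 is blocked at steady state, and because R2 is the only outlet for B the blockage propagates upstream to R1 and EX_A. Detecting such propagated blocked reactions in general requires flux-based methods rather than this purely topological check.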
Reconstruction of Tissue-Specific Metabolic Networks Using CORDA

PubMed Central

Schultz, André; Qutub, Amina A.

2016-01-01

Human metabolism involves thousands of reactions and metabolites. To interpret this complexity, computational modeling becomes an essential experimental tool. One of the most popular techniques to study human metabolism as a whole is genome scale modeling. A key challenge to applying genome scale modeling is identifying critical metabolic reactions across diverse human tissues. Here we introduce a novel algorithm called Cost Optimization Reaction Dependency Assessment (CORDA) to build genome scale models in a tissue-specific manner. CORDA performs more efficiently computationally, shows better agreement to experimental data, and displays better model functionality and capacity when compared to previous algorithms. CORDA also returns reaction associations that can greatly assist in any manual curation to be performed following the automated reconstruction process. Using CORDA, we developed a library of 76 healthy and 20 cancer tissue-specific reconstructions. These reconstructions identified which metabolic pathways are shared across diverse human tissues. Moreover, we identified changes in reactions and pathways that are differentially included and present different capacity profiles in cancer compared to healthy tissues, including up-regulation of folate metabolism, the down-regulation of thiamine metabolism, and tight regulation of oxidative phosphorylation. PMID:26942765
A Protocol for Generating and Exchanging (Genome-Scale) Metabolic Resource Allocation Models.

PubMed

Reimers, Alexandra-M; Lindhorst, Henning; Waldherr, Steffen

2017-09-06

In this article, we present a protocol for generating a complete (genome-scale) metabolic resource allocation model, as well as a proposal for how to represent such models in the systems biology markup language (SBML). Such models are used to investigate enzyme levels and achievable growth rates in large-scale metabolic networks. Although the idea of metabolic resource allocation studies has been present in the field of systems biology for some years, no guidelines for generating such a model have been published up to now. This paper presents step-by-step instructions for building a (dynamic) resource allocation model, starting with prerequisites such as a genome-scale metabolic reconstruction, through building protein and noncatalytic biomass synthesis reactions and assigning turnover rates for each reaction. In addition, we explain how one can use SBML level 3 in combination with the flux balance constraints and our resource allocation modeling annotation to represent such models.
A Genome-Scale Metabolic Reconstruction of Mycoplasma genitalium, iPS189

PubMed Central

Suthers, Patrick F.; Dasika, Madhukar S.; Kumar, Vinay Satish; Denisov, Gennady; Glass, John I.; Maranas, Costas D.

2009-01-01

With a genome size of ∼580 kb and approximately 480 protein coding regions, Mycoplasma genitalium is one of the smallest known self-replicating organisms and, additionally, has extremely fastidious nutrient requirements. The reduced genomic content of M. genitalium has led researchers to suggest that the molecular assembly contained in this organism may be a close approximation to the minimal set of genes required for bacterial growth. Here, we introduce a systematic approach for the construction and curation of a genome-scale in silico metabolic model for M. genitalium. Key challenges included estimation of biomass composition, handling of enzymes with broad specificities, and the lack of a defined medium. Computational tools were subsequently employed to identify and resolve connectivity gaps in the model as well as growth prediction inconsistencies with gene essentiality experimental data. The curated model, M. genitalium iPS189 (262 reactions, 274 metabolites), is 87% accurate in recapitulating in vivo gene essentiality results for M. genitalium. Approaches and tools described herein provide a roadmap for the automated construction of in silico metabolic models of other organisms. PMID:19214212
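The accuracy figure quoted above compares model predictions with experimental gene essentiality calls. A minimal sketch of such a comparison follows; the gene identifiers and essentiality values are hypothetical toy data, and the scoring is plain agreement over shared genes, which is an assumption rather than the authors' exact procedure.

```python
def essentiality_accuracy(predicted, observed):
    """Fraction of shared genes whose predicted essentiality matches experiment."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no genes in common between prediction and experiment")
    agree = sum(predicted[g] == observed[g] for g in shared)
    return agree / len(shared)


# Hypothetical toy data: True means the gene is essential for growth.
predicted = {"g1": True, "g2": True, "g3": False, "g4": False}
observed = {"g1": True, "g2": False, "g3": False, "g4": False, "g5": True}

# g5 has no prediction, so it is left out of the comparison.
accuracy = essentiality_accuracy(predicted, observed)
print(f"agreement over shared genes: {accuracy:.2f}")
```

Genes missing from either data set are ignored here; a real evaluation would also need to decide how to treat genes that are absent from the model.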
Genome-scale model reveals metabolic basis of biomass partitioning in a model diatom

DOE PAGES

Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L.; ...

2016-05-06

Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. Furthermore, the model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.
Next-generation genome-scale models for metabolic engineering.

PubMed

King, Zachary A; Lloyd, Colton J; Feist, Adam M; Palsson, Bernhard O

2015-12-01

Constraint-based reconstruction and analysis (COBRA) methods have become widely used tools for metabolic engineering in both academic and industrial laboratories. By employing a genome-scale in silico representation of the metabolic network of a host organism, COBRA methods can be used to predict optimal genetic modifications that improve the rate and yield of chemical production. A new generation of COBRA models and methods is now being developed, encompassing many biological processes and simulation strategies, and next-generation models enable new types of predictions. Here, three key examples of applying COBRA methods to strain optimization are presented and discussed. Then, an outlook is provided on the next generation of COBRA models and the new types of predictions they will enable for systems metabolic engineering. Copyright © 2014 Elsevier Ltd. All rights reserved.
Analysis of Aspergillus nidulans metabolism at the genome-scale

PubMed Central

David, Helga; Özçelik, İlknur Ş; Hofmann, Gerald; Nielsen, Jens

2008-01-01

Background Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of developmental biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned a function. Results In this work, we have manually assigned functions to 472 orphan genes in the metabolism of A. nidulans, by using a pathway-driven approach and by employing comparative genomics tools based on sequence similarity. The central metabolism of A. nidulans, as well as biosynthetic pathways of relevant secondary metabolites, was reconstructed based on detailed metabolic reconstructions available for A. niger and Saccharomyces cerevisiae, and information on the genetics, biochemistry and physiology of A. nidulans. Thereby, it was possible to identify metabolic functions without a gene associated, and to look for candidate ORFs in the genome of A. nidulans by comparing its sequence to sequences of well-characterized genes in other species encoding the function of interest. A classification system, based on defined criteria, was developed for evaluating and selecting the ORFs among the candidates, in an objective and systematic manner. The functional assignments served as a basis to develop a mathematical model, linking 666 genes (both previously and newly annotated) to metabolic roles. The model was used to simulate metabolic behavior and additionally to integrate, analyze and interpret large-scale gene expression data concerning a study on glucose repression, thereby providing a means of upgrading the information content of experimental data and getting further

Genome-scale metabolic network of Cordyceps militaris useful for comparative analysis of entomopathogenic fungi.

PubMed

Vongsangnak, Wanwipa; Raethong, Nachon; Mujchariyakul, Warasinee; Nguyen, Nam Ninh; Leong, Hon Wai; Laoteng, Kobkul

2017-08-30

The first genome-scale metabolic network of Cordyceps militaris (iWV1170) was constructed to represent its whole metabolism; it consists of 894 metabolites and 1,267 metabolic reactions across five compartments, including the plasma membrane, cytoplasm, mitochondria, peroxisome and extracellular space. iWV1170 could be exploited to explain the organism's growth phenotypes and its production of cordycepin and other metabolites on various substrates. A high number of genes encoding extracellular enzymes for degradation of complex carbohydrates, lipids and proteins are present in the C. militaris genome. By comparative genome-scale analysis, the adenine metabolic pathway towards putative cordycepin biosynthesis was reconstructed, indicating their evolutionary relationships across eleven species of entomopathogenic fungi. The overall metabolic routes involved in the putative cordycepin biosynthesis were also identified in C. militaris, including central carbon metabolism, amino acid metabolism (glycine, l-glutamine and l-aspartate) and nucleotide metabolism (adenosine and adenine). Interestingly, a lack of the sequence coding for ribonucleotide reductase inhibitor was observed in C. militaris that might contribute to its over-production of cordycepin. Copyright © 2017. Published by Elsevier B.V.
iAK692: A genome-scale metabolic model of Spirulina platensis C1

PubMed Central

2012-01-01

Background Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. Results In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. Conclusions This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a predictive metabolic platform
iAK692: a genome-scale metabolic model of Spirulina platensis C1.

PubMed

Klanchui, Amornpan; Khannapho, Chiraphan; Phodee, Atchara; Cheevadhanarak, Supapon; Meechai, Asawin

2012-06-15

Spirulina (Arthrospira) platensis is a well-known filamentous cyanobacterium used in the production of many industrial products, including high value compounds, healthy food supplements, animal feeds, pharmaceuticals and cosmetics, for example. It has been increasingly studied around the world for scientific purposes, especially for its genome, biology, physiology, and also for the analysis of its small-scale metabolic network. However, the overall description of the metabolic and biotechnological capabilities of S. platensis requires the development of a whole cellular metabolism model. Recently, the S. platensis C1 (Arthrospira sp. PCC9438) genome sequence has become available, allowing systems-level studies of this commercial cyanobacterium. In this work, we present the genome-scale metabolic network analysis of S. platensis C1, iAK692, its topological properties, and its metabolic capabilities and functions. The network was reconstructed from the S. platensis C1 annotated genomic sequence using Pathway Tools software to generate a preliminary network. Then, manual curation was performed based on a collective knowledge base and a combination of genomic, biochemical, and physiological information. The genome-scale metabolic model consists of 692 genes, 837 metabolites, and 875 reactions. We validated iAK692 by conducting fermentation experiments and simulating the model under autotrophic, heterotrophic, and mixotrophic growth conditions using COBRA toolbox. The model predictions under these growth conditions were consistent with the experimental results. The iAK692 model was further used to predict the unique active reactions and essential genes for each growth condition. Additionally, the metabolic states of iAK692 during autotrophic and mixotrophic growths were described by phenotypic phase plane (PhPP) analysis. This study proposes the first genome-scale model of S. platensis C1, iAK692, which is a predictive metabolic platform for a global understanding of
Toward the automated generation of genome-scale metabolic networks in the SEED.

PubMed

DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron

2007-04-26

Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. Our method sets the
A genome-scale metabolic model of the lipid-accumulating yeast Yarrowia lipolytica

PubMed Central

2012-01-01

Background Yarrowia lipolytica is an oleaginous yeast which has emerged as an important microorganism for several biotechnological processes, such as the production of organic acids, lipases and proteases. It is also considered a good candidate for single-cell oil production. Although some of its metabolic pathways are well studied, its metabolic engineering is hindered by the lack of a genome-scale model that integrates the current knowledge about its metabolism. Results Combining in silico tools and expert manual curation, we have produced an accurate genome-scale metabolic model for Y. lipolytica. Using a scaffold derived from a functional metabolic model of the well-studied but phylogenetically distant yeast S. cerevisiae, we mapped conserved reactions, rewrote gene associations, added species-specific reactions and inserted specialized copies of scaffold reactions to account for species-specific expansion of protein families. We used physiological measures obtained under lab conditions to validate our predictions. Conclusions Y. lipolytica iNL895 represents the first well-annotated metabolic model of an oleaginous yeast, providing a base for future metabolic improvement, and a starting point for the metabolic reconstruction of other species in the Yarrowia clade and other oleaginous yeasts. PMID:22558935
A metabolite-centric view on flux distributions in genome-scale metabolic models

PubMed Central

2013-01-01

Background Genome-scale metabolic models are important tools in systems biology. They permit the in-silico prediction of cellular phenotypes via mathematical optimisation procedures, most importantly flux balance analysis. Current studies on metabolic models mostly consider reaction fluxes in isolation. Based on a recently proposed metabolite-centric approach, we here describe a set of methods that enable the analysis and interpretation of flux distributions in an integrated metabolite-centric view. We demonstrate how this framework can be used for the refinement of genome-scale metabolic models. Results We applied the metabolite-centric view developed here to the most recent metabolic reconstruction of Escherichia coli. By compiling the balance sheets of a small number of currency metabolites, we were able to fully characterise the energy metabolism as predicted by the model and to identify a possibility for model refinement in NADPH metabolism. Selected branch points were examined in detail in order to demonstrate how a metabolite-centric view allows identifying functional roles of metabolites. Fructose 6-phosphate aldolase and the sedoheptulose bisphosphate bypass were identified as enzymatic reactions that can carry high fluxes in the model but are unlikely to exhibit significant activity in vivo. Performing a metabolite essentiality analysis, unconstrained import and export of iron ions could be identified as potentially problematic for the quality of model predictions. Conclusions The system-wide analysis of split ratios and branch points allows a much deeper insight into the metabolic network than reaction-centric analyses. Extending an earlier metabolite-centric approach, the methods introduced here establish an integrated metabolite-centric framework for the interpretation of flux distributions in genome-scale metabolic networks that can complement the classical reaction-centric framework. Analysing fluxes and their metabolic context simultaneously opens
Investigating host-pathogen behavior and their interaction using genome-scale metabolic network models.

PubMed

Sadhukhan, Priyanka P; Raghunathan, Anu

2014-01-01

Genome-scale metabolic modeling methods represent one way to compute whole cell function starting from the genome sequence of an organism and contribute towards understanding and predicting the genotype-phenotype relationship. About 80 models spanning all the kingdoms of life from archaea to eukaryotes have been built to date and used to interrogate cell phenotype under varying conditions. These models have been used not only to understand the flux distribution in evolutionarily conserved pathways like glycolysis and the Krebs cycle but also in applications ranging from value added product formation in Escherichia coli to predicting inborn errors of Homo sapiens metabolism. This chapter describes a protocol that delineates the process of genome scale metabolic modeling for analysing host-pathogen behavior and interaction using flux balance analysis (FBA). The steps discussed in the process include (1) reconstruction of a metabolic network from the genome sequence, (2) its representation in a precise mathematical framework, (3) its translation to a model, and (4) the analysis using linear algebra and optimization. The methods for biological interpretations of computed cell phenotypes in the context of individual host and pathogen models and their integration are also discussed.
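Step (2) above, the mathematical representation, is usually a stoichiometric matrix S, and flux balance analysis looks for flux vectors v that satisfy the steady-state condition S·v = 0. The sketch below builds S for a hypothetical four-reaction toy network and checks that condition for two candidate flux vectors. It illustrates the constraint only; the optimization step of FBA needs a linear programming solver and is not shown.

```python
# Hypothetical toy network, for illustration only:
#   R_in:  -> A      R1: A -> B      R2: B -> C      R_out: C ->
reactions = {
    "R_in": {"A": 1},
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
    "R_out": {"C": -1},
}


def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction_ids, S), where S[i][j] is the
    coefficient of metabolite i in reaction j."""
    rxn_ids = list(reactions)
    mets = sorted({m for stoich in reactions.values() for m in stoich})
    S = [[reactions[r].get(m, 0) for r in rxn_ids] for m in mets]
    return mets, rxn_ids, S


def net_production(S, v):
    """Compute S . v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True if no metabolite accumulates or is depleted under fluxes v."""
    return all(abs(x) <= tol for x in net_production(S, v))


mets, rxn_ids, S = stoichiometric_matrix(reactions)
balanced = [2.0, 2.0, 2.0, 2.0]    # same flux through the whole chain
unbalanced = [2.0, 2.0, 1.0, 1.0]  # B is made faster than it is used

print(is_steady_state(S, balanced))
print(is_steady_state(S, unbalanced))
```

Rows of S follow the sorted metabolite names and columns follow the insertion order of the reaction dictionary, so the flux vectors must list fluxes in that same reaction order.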
Genome scale engineering techniques for metabolic engineering.

PubMed

Liu, Rongming; Bassalo, Marcelo C; Zeitoun, Ramsey I; Gill, Ryan T

2015-11-01

Metabolic engineering has expanded from a focus on designs requiring a small number of genetic modifications to increasingly complex designs driven by advances in genome-scale engineering technologies. Metabolic engineering has been generally defined by the use of iterative cycles of rational genome modifications, strain analysis and characterization, and a synthesis step that fuels additional hypothesis generation. This cycle mirrors the Design-Build-Test-Learn cycle followed throughout various engineering fields that has recently become a defining aspect of synthetic biology. This review will attempt to summarize recent genome-scale design, build, test, and learn technologies and relate their use to a range of metabolic engineering applications. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Genome-Scale Metabolic Modeling of Archaea Lends Insight into Diversity of Metabolic Function

PubMed Central

2017-01-01

Decades of biochemical, bioinformatic, and sequencing data are currently being systematically compiled into genome-scale metabolic reconstructions (GEMs). Such reconstructions are knowledge-bases useful for engineering, modeling, and comparative analysis. Here we review the fifteen GEMs of archaeal species that have been constructed to date. They represent primarily members of the Euryarchaeota, with three-quarters being representatives of methanogens. Unlike other reviews on GEMs, we focus specifically on archaea. We briefly review the GEM construction process and the genealogy of the archaeal models. The major insights gained during the construction of these models are then reviewed with specific focus on novel metabolic pathway predictions and growth characteristics. Metabolic pathway usage is discussed in the context of the composition of each organism's biomass and their specific energy and growth requirements. We show how the metabolic models can be used to study the evolution of metabolism in archaea. Conservation of particular metabolic pathways can be studied by comparing reactions using the genes associated with their enzymes. This demonstrates the utility of GEMs to evolutionary studies, far beyond their original purpose of metabolic modeling; however, much needs to be done before archaeal models are as extensively complete as those for bacteria. PMID:28133437
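One simple way to compare reaction content across models, in the spirit of the gene-based comparison mentioned above, is to keep only gene-associated reactions in each model and score the overlap between organisms. The sketch below uses the Jaccard index as the similarity measure; that choice, and all organism, reaction, and gene identifiers, are hypothetical and not taken from the review.

```python
from itertools import combinations


def gene_associated(model):
    """Keep only reactions that have at least one associated gene."""
    return {rxn for rxn, genes in model.items() if genes}


def jaccard(a, b):
    """Shared reactions divided by all reactions found in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical models: reaction id -> set of associated gene ids.
models = {
    "org_1": {"rxn_a": {"g1"}, "rxn_b": {"g2", "g3"}, "rxn_c": set()},
    "org_2": {"rxn_a": {"h1"}, "rxn_b": {"h2"}, "rxn_d": {"h3"}},
    "org_3": {"rxn_d": {"k1"}, "rxn_e": {"k2"}},
}

content = {name: gene_associated(m) for name, m in models.items()}
similarity = {
    (x, y): jaccard(content[x], content[y])
    for x, y in combinations(sorted(content), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 2))
```

This assumes the models share one reaction namespace; in practice, identifiers from different reconstructions usually have to be reconciled before such a comparison is meaningful.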
Reconstruction and in silico analysis of an Actinoplanes sp. SE50/110 genome-scale metabolic model for acarbose production

PubMed Central

Wang, Yali; Xu, Nan; Ye, Chao; Liu, Liming; Shi, Zhongping; Wu, Jing

2015-01-01

Actinoplanes sp. SE50/110 produces the α-glucosidase inhibitor acarbose, which is used to treat type 2 diabetes mellitus. To obtain a comprehensive understanding of its cellular metabolism, a genome-scale metabolic model of strain SE50/110, iYLW1028, was reconstructed on the basis of the genome annotation, biochemical databases, and extensive literature mining. Model iYLW1028 comprises 1028 genes, 1128 metabolites, and 1219 reactions. One hundred and twenty-two and eighty-one genes were essential for cell growth on acarbose synthesis and sucrose media, respectively, and the acarbose biosynthetic pathway in SE50/110 was expounded completely. Based on model predictions, the addition of arginine and histidine to the media increased acarbose production by 78 and 59%, respectively. Additionally, dissolved oxygen has a great effect on acarbose production based on model predictions. Furthermore, genes to be overexpressed for the overproduction of acarbose were identified, and the deletion of treY eliminated the formation of by-product component C. Model iYLW1028 is a useful platform for optimizing and systems metabolic engineering for acarbose production in Actinoplanes sp. SE50/110. PMID:26161077
Solving gap metabolites and blocked reactions in genome-scale models: application to the metabolic network of Blattabacterium cuenoti.

PubMed

Ponce-de-León, Miguel; Montero, Francisco; Peretó, Juli

2013-10-31

Metabolic reconstruction is the computational process that aims to elucidate the network of metabolites interconnected through reactions catalyzed by activities assigned to one or more genes. Reconstructed models may contain inconsistencies that appear as gap metabolites and blocked reactions. Although automatic methods for solving this problem have been previously developed, there are many situations where manual curation is still needed. We introduce a general definition of gap metabolite that allows its detection in a straightforward manner. Moreover, a method for the detection of Unconnected Modules, defined as isolated sets of blocked reactions connected through gap metabolites, is proposed. The method has been successfully applied to the curation of iCG238, the genome-scale metabolic model for the bacterium Blattabacterium cuenoti, obligate endosymbiont of cockroaches. We found the proposed approach to be a valuable tool for the curation of genome-scale metabolic models. The outcome of its application to the genome-scale model B. cuenoti iCG238 is a more accurate model version named B. cuenoti iMP240.
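To make the idea of gap metabolites and blocked reactions concrete, the sketch below applies a purely topological rule to a hypothetical toy network: a metabolite is flagged as a gap if the remaining reactions can only produce it or only consume it, and every reaction touching a gap metabolite is flagged as blocked, repeating until nothing changes. This simple dead-end rule is an assumption made for illustration and is not the general definition proposed in the paper.

```python
# Hypothetical toy network: reaction id -> (stoichiometry, reversible).
reactions = {
    "EX_A": ({"A": 1}, False),          # uptake of A
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "EX_C": ({"C": -1}, False),         # secretion of C
    "R3": ({"B": -1, "D": 1}, False),
    "R4": ({"D": -1, "E": 1}, False),   # E is never consumed
}


def find_gaps(reactions):
    """Iteratively flag dead-end metabolites and the reactions they block."""
    active = dict(reactions)
    gap_mets, blocked = set(), set()
    while True:
        produced, consumed = set(), set()
        for stoich, reversible in active.values():
            for met, coeff in stoich.items():
                if coeff > 0 or reversible:
                    produced.add(met)
                if coeff < 0 or reversible:
                    consumed.add(met)
        # Dead ends are produced but never consumed, or the reverse.
        dead = produced ^ consumed
        if not dead:
            return gap_mets, blocked
        gap_mets |= dead
        newly_blocked = {
            rxn for rxn, (stoich, _) in active.items()
            if any(met in dead for met in stoich)
        }
        blocked |= newly_blocked
        for rxn in newly_blocked:
            del active[rxn]


gap_mets, blocked = find_gaps(reactions)
print(sorted(gap_mets), sorted(blocked))
```

In this toy case the dead end at E propagates back to D, so R3 and R4 end up flagged together. A topological pass like this can miss reactions that are blocked only under flux constraints, which is one reason flux-based checks are commonly used alongside it.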
Metabolic reconstruction, constraint-based analysis and game theory to probe genome-scale metabolic networks.

PubMed

Ruppin, Eytan; Papin, Jason A; de Figueiredo, Luis F; Schuster, Stefan

2010-08-01

With the advent of modern omics technologies, it has become feasible to reconstruct (quasi-) whole-cell metabolic networks and characterize them in more and more detail. Computer simulations of the dynamic behavior of such networks are difficult due to a lack of kinetic data and to computational limitations. In contrast, network analysis based on appropriate constraints such as the steady-state condition (constraint-based analysis) is feasible and allows one to derive conclusions about the system's metabolic capabilities. Here, we review methods for the reconstruction of metabolic networks, modeling techniques such as flux balance analysis and elementary flux modes and current progress in their development and applications. Game-theoretical methods for studying metabolic networks are discussed as well. Copyright © 2010 Elsevier Ltd. All rights reserved.
Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models

DOE PAGES

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...

2014-10-16

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary
Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models

PubMed Central

Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.

2014-01-01

Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to
Genome-scale reconstruction of the metabolic network in Yersinia pestis, strain 91001

DOE Office of Scientific and Technical Information (OSTI.GOV)

Navid, A; Almaas, E

2009-01-13

The gram-negative bacterium Yersinia pestis, the aetiological agent of bubonic plague, is one of the deadliest pathogens known to man. Despite its historical reputation, plague is a modern disease which annually afflicts thousands of people. Public safety considerations greatly limit clinical experimentation on this organism and thus development of theoretical tools to analyze the capabilities of this pathogen is of utmost importance. Here, we report the first genome-scale metabolic model of Yersinia pestis biovar Mediaevalis based both on its recently annotated genome, and physiological and biochemical data from literature. Our model demonstrates excellent agreement with Y. pestis known metabolic needs and capabilities. Since Y. pestis is a meiotrophic organism, we have developed CryptFind, a systematic approach to identify all candidate cryptic genes responsible for known and theoretical meiotrophic phenomena. In addition to uncovering every known cryptic gene for Y. pestis, our analysis of the rhamnose fermentation pathway suggests that betB is the responsible cryptic gene. Despite all of our medical advances, we still do not have a vaccine for bubonic plague. Recent discoveries of antibiotic resistant strains of Yersinia pestis coupled with the threat of plague being used as a bioterrorism weapon compel us to develop new tools for studying the physiology of this deadly pathogen. Using our theoretical model, we can study the cell's phenotypic behavior under different circumstances and identify metabolic weaknesses which may be harnessed for the development of therapeutics. Additionally, the automatic identification of cryptic genes expands the usage of genomic data for pharmaceutical purposes.
The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

PubMed Central

2011-01-01

Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models. PMID:21208457
Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

PubMed Central

Zuñiga, Cristal; Li, Chien-Ting; Zielinski, Daniel C.; Guarnieri, Michael T.; Antoniewicz, Maciek R.; Zengler, Karsten

2016-01-01

The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. PMID:27372244
Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zuñiga, Cristal; Li, Chien-Ting; Huelsman, Tyler

The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.
Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

DOE PAGES

Zuñiga, Cristal; Li, Chien-Ting; Huelsman, Tyler; ...

2016-07-02

The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.
Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions.

PubMed

Zuñiga, Cristal; Li, Chien-Ting; Huelsman, Tyler; Levering, Jennifer; Zielinski, Daniel C; McConnell, Brian O; Long, Christopher P; Knoshaug, Eric P; Guarnieri, Michael T; Antoniewicz, Maciek R; Betenbaugh, Michael J; Zengler, Karsten

2016-09-01

The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. © 2016 American Society of Plant Biologists. All rights reserved.

Identifying all moiety conservation laws in genome-scale metabolic networks.

PubMed

De Martino, Andrea; De Martino, Daniele; Mulet, Roberto; Pagnani, Andrea

2014-01-01

The stoichiometry of a metabolic network gives rise to a set of conservation laws for the aggregate level of specific pools of metabolites, which, on one hand, pose dynamical constraints that cross-link the variations of metabolite concentrations and, on the other, provide key insight into a cell's metabolic production capabilities. When the conserved quantity identifies with a chemical moiety, extracting all such conservation laws from the stoichiometry amounts to finding all non-negative integer solutions of a linear system, a programming problem known to be NP-hard. We present an efficient strategy to compute the complete set of integer conservation laws of a genome-scale stoichiometric matrix, also providing a certificate for correctness and maximality of the solution. Our method is deployed for the analysis of moiety conservation relationships in two large-scale reconstructions of the metabolism of the bacterium E. coli, in six tissue-specific human metabolic networks, and, finally, in the human reactome as a whole, revealing that bacterial metabolism could be evolutionarily designed to cover broader production spectra than human metabolism. Convergence to the full set of moiety conservation laws in each case is achieved in extremely reduced computing times. In addition, we uncover a scaling relation that links the size of the independent pool basis to the number of metabolites, for which we present an analytical explanation.
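As a minimal sketch of what a moiety conservation law looks like in practice, the Python snippet below checks whether a non-negative integer vector l satisfies l^T S = 0 for a small stoichiometric matrix S (rows are metabolites, columns are reactions). The toy network, the metabolite names and the helper `is_conservation_law` are illustrative assumptions and are not taken from the paper. The check only verifies a candidate law; it does not reproduce the authors' strategy for enumerating all such laws.

```python
# Toy closed network, assumed purely for illustration:
#   R1: Glc + ATP -> G6P + ADP
#   R2: ADP -> ATP   (lumped regeneration step)
METABOLITES = ["ATP", "ADP", "Glc", "G6P"]
S = [
    [-1,  1],  # ATP
    [ 1, -1],  # ADP
    [-1,  0],  # Glc
    [ 1,  0],  # G6P
]


def is_conservation_law(S, weights):
    """True if weights is a non-zero, non-negative integer vector with weights^T S = 0."""
    if any(w < 0 or w != int(w) for w in weights) or not any(weights):
        return False
    n_reactions = len(S[0])
    # Each reaction (column) must leave the weighted metabolite pool unchanged.
    return all(
        sum(w * row[j] for w, row in zip(weights, S)) == 0
        for j in range(n_reactions)
    )


print(is_conservation_law(S, [1, 1, 0, 0]))  # candidate pool: ATP + ADP
print(is_conservation_law(S, [0, 0, 1, 1]))  # candidate pool: Glc + G6P
print(is_conservation_law(S, [1, 0, 0, 0]))  # candidate pool: ATP alone
```

Verifying one candidate is cheap; the hard part described in the abstract is finding the complete set of such integer vectors for a genome-scale matrix.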
Genome-scale metabolic reconstructions and theoretical investigation of methane conversion in Methylomicrobium buryatense strain 5G(B1).

PubMed

de la Torre, Andrea; Metivier, Aisha; Chu, Frances; Laurens, Lieve M L; Beck, David A C; Pienkos, Philip T; Lidstrom, Mary E; Kalyuzhnaya, Marina G

2015-11-25

Methane-utilizing bacteria (methanotrophs) are capable of growth on methane and are attractive systems for bio-catalysis. However, the application of natural methanotrophic strains to large-scale production of value-added chemicals/biofuels requires a number of physiological and genetic alterations. An accurate metabolic model coupled with flux balance analysis can provide a solid interpretative framework for experimental data analyses and integration. A stoichiometric flux balance model of Methylomicrobium buryatense strain 5G(B1) was constructed and used for evaluating metabolic engineering strategies for biofuels and chemical production with a methanotrophic bacterium as the catalytic platform. The initial metabolic reconstruction was based on whole-genome predictions. Each metabolic step was manually verified, gapfilled, and modified in accordance with genome-wide expression data. The final model incorporates a total of 841 reactions (in 167 metabolic pathways). Of these, up to 400 reactions were recruited to produce 118 intracellular metabolites. The flux balance simulations suggest that only the transfer of electrons from methanol oxidation to methane oxidation steps can support measured growth and methane/oxygen consumption parameters, while the scenario employing NADH as a possible source of electrons for particulate methane monooxygenase cannot. Direct coupling between methane oxidation and methanol oxidation accounts for most of the membrane-associated methane monooxygenase activity. However the best fit to experimental results is achieved only after assuming that the efficiency of direct coupling depends on growth conditions and additional NADH input (about 0.1-0.2 mol of incremental NADH per one mol of methane oxidized). The additional input is proposed to cover loss of electrons through inefficiency and to sustain methane oxidation at perturbations or support uphill electron transfer. Finally, the model was used for testing the carbon conversion
Genome-scale modelling of microbial metabolism with temporal and spatial resolution.

PubMed

Henson, Michael A

2015-12-01

Most natural microbial systems have evolved to function in environments with temporal and spatial variations. A major limitation to understanding such complex systems is the lack of mathematical modelling frameworks that connect the genomes of individual species and temporal and spatial variations in the environment to system behaviour. The goal of this review is to introduce the emerging field of spatiotemporal metabolic modelling based on genome-scale reconstructions of microbial metabolism. The extension of flux balance analysis (FBA) to account for both temporal and spatial variations in the environment is termed spatiotemporal FBA (SFBA). Following a brief overview of FBA and its established dynamic extension, the SFBA problem is introduced and recent progress is described. Three case studies are reviewed to illustrate the current state-of-the-art and possible future research directions are outlined. The author posits that SFBA is the next frontier for microbial metabolic modelling and a rapid increase in methods development and system applications is anticipated. © 2015 Authors; published by Portland Press Limited.
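For orientation, the standard FBA problem and its usual dynamic extension can be written as below. This is a textbook-style sketch rather than the review's own notation: the symbols S (stoichiometric matrix), v (flux vector), c (objective weights), X (biomass concentration), C_i (extracellular concentration of metabolite i), \mu (growth rate) and v_i^{ex} (exchange flux) are assumed here for illustration.

```latex
% Steady-state FBA: a linear program over the flux vector v
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}

% Dynamic FBA: the LP is re-solved along the trajectory, with uptake
% bounds depending on the current extracellular concentrations C(t)
\begin{aligned}
\frac{dX}{dt} &= \mu\, X, \\
\frac{dC_i}{dt} &= v_i^{\mathrm{ex}}\, X
\end{aligned}
```

A spatiotemporal formulation would additionally let X and C_i depend on position and add transport terms to these balances; the specific form depends on the system being modelled.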
Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets

PubMed Central

Kotera, Masaaki; Tabei, Yasuo; Yamanishi, Yoshihiro; Tokimatsu, Toshiaki; Goto, Susumu

2013-01-01

Motivation: The metabolic pathway is an important biochemical reaction network involving enzymatic reactions among chemical compounds. However, it is assumed that a large number of metabolic pathways remain unknown, and many reactions are still missing even in known pathways. Therefore, the most important challenge in metabolomics is the automated de novo reconstruction of metabolic pathways, which includes the elucidation of previously unknown reactions to bridge the metabolic gaps. Results: In this article, we develop a novel method to reconstruct metabolic pathways from a large compound set in the reaction-filling framework. We define feature vectors representing the chemical transformation patterns of compound–compound pairs in enzymatic reactions using chemical fingerprints. We apply a sparsity-induced classifier to learn what we refer to as ‘enzymatic-reaction likeness’, i.e. whether compound pairs are possibly converted to each other by enzymatic reactions. The originality of our method lies in the search for potential reactions among many compounds at a time, in the extraction of reaction-related chemical transformation patterns and in the large-scale applicability owing to the computational efficiency. In the results, we demonstrate the usefulness of our proposed method on the de novo reconstruction of 134 metabolic pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG). Our comprehensively predicted reaction networks of 15 698 compounds enable us to suggest many potential pathways and to increase research productivity in metabolomics. Availability: Software is available on request. Supplementary materials are available at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2013/. Contact: goto@kuicr.kyoto-u.ac.jp PMID:23812977
Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism

PubMed Central

Chang, Roger L; Ghamsari, Lila; Manichaikul, Ani; Hom, Erik F Y; Balaji, Santhanam; Fu, Weiqi; Shen, Yun; Hao, Tong; Palsson, Bernhard Ø; Salehi-Ashtiani, Kourosh; Papin, Jason A

2011-01-01

Metabolic network reconstruction encompasses existing knowledge about an organism's metabolism and genome annotation, providing a platform for omics data analysis and phenotype prediction. The model alga Chlamydomonas reinhardtii is employed to study diverse biological processes from photosynthesis to phototaxis. Recent heightened interest in this species results from an international movement to develop algal biofuels. Integrating biological and optical data, we reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology. PMID:21811229
Creation of a Genome-Wide Metabolic Pathway Database for Populus trichocarpa Using a New Approach for Reconstruction and Curation of Metabolic Pathways for Plants

PubMed Central

Zhang, Peifen; Dreher, Kate; Karthikeyan, A.; Chi, Anjo; Pujar, Anuradha; Caspi, Ron; Karp, Peter; Kirkup, Vanessa; Latendresse, Mario; Lee, Cynthia; Mueller, Lukas A.; Muller, Robert; Rhee, Seung Yon

2010-01-01

Metabolic networks reconstructed from sequenced genomes or transcriptomes can help visualize and analyze large-scale experimental data, predict metabolic phenotypes, discover enzymes, engineer metabolic pathways, and study metabolic pathway evolution. We developed a general approach for reconstructing metabolic pathway complements of plant genomes. Two new reference databases were created and added to the core of the infrastructure: a comprehensive, all-plant reference pathway database, PlantCyc, and a reference enzyme sequence database, RESD, for annotating metabolic functions of protein sequences. PlantCyc (version 3.0) includes 714 metabolic pathways and 2,619 reactions from over 300 species. RESD (version 1.0) contains 14,187 literature-supported enzyme sequences from across all kingdoms. We used RESD, PlantCyc, and MetaCyc (an all-species reference metabolic pathway database), in conjunction with the pathway prediction software Pathway Tools, to reconstruct a metabolic pathway database, PoplarCyc, from the recently sequenced genome of Populus trichocarpa. PoplarCyc (version 1.0) contains 321 pathways with 1,807 assigned enzymes. Comparing PoplarCyc (version 1.0) with AraCyc (version 6.0, Arabidopsis [Arabidopsis thaliana]) showed comparable numbers of pathways distributed across all domains of metabolism in both databases, except for a higher number of AraCyc pathways in secondary metabolism and a 1.5-fold increase in carbohydrate metabolic enzymes in PoplarCyc. Here, we introduce these new resources and demonstrate the feasibility of using them to identify candidate enzymes for specific pathways and to analyze metabolite profiling data through concrete examples. These resources can be searched by text or BLAST, browsed, and downloaded from our project Web site (http://plantcyc.org). PMID:20522724
13C metabolic flux analysis at a genome-scale.

PubMed

Gopalakrishnan, Saratram; Maranas, Costas D

2015-11-01

Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling up mapping models to genome scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon was used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and genome-scale models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM, the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non
MEMOSys: Bioinformatics platform for genome-scale metabolic models

PubMed Central

2011-01-01

Background Recent advances in genomic sequencing have enabled the use of genome sequencing in standard biological and biotechnological research projects. The challenge is how to integrate the large amount of data in order to gain novel biological insights. One way to leverage sequence data is to use genome-scale metabolic models. We have therefore designed and implemented a bioinformatics platform which supports the development of such metabolic models. Results MEMOSys (MEtabolic MOdel research and development System) is a versatile platform for the management, storage, and development of genome-scale metabolic models. It supports the development of new models by providing a built-in version control system which offers access to the complete developmental history. Moreover, the integrated web board, the authorization system, and the definition of user roles allow collaborations across departments and institutions. Research on existing models is facilitated by a search system, references to external databases, and a feature-rich comparison mechanism. MEMOSys provides customizable data exchange mechanisms using the SBML format to enable analysis in external tools. The web application is based on the Java EE framework and offers an intuitive user interface. It currently contains six annotated microbial metabolic models. Conclusions We have developed a web-based system designed to provide researchers a novel application facilitating the management and development of metabolic models. The system is freely available at http://www.icbi.at/MEMOSys. PMID:21276275
Network Thermodynamic Curation of Human and Yeast Genome-Scale Metabolic Models

PubMed Central

Martínez, Verónica S.; Quek, Lake-Ee; Nielsen, Lars K.

2014-01-01

Genome-scale models are used for an ever-widening range of applications. Although there has been much focus on specifying the stoichiometric matrix, the predictive power of genome-scale models equally depends on reaction directions. Two-thirds of reactions in the two eukaryotic reconstructions Homo sapiens Recon 1 and Yeast 5 are specified as irreversible. However, these specifications are mainly based on biochemical textbooks or on their similarity to other organisms and are rarely underpinned by detailed thermodynamic analysis. In this study, a to our knowledge new workflow combining network-embedded thermodynamic and flux variability analysis was used to evaluate existing irreversibility constraints in Recon 1 and Yeast 5 and to identify new ones. A total of 27 and 16 new irreversible reactions were identified in Recon 1 and Yeast 5, respectively, whereas only four reactions were found with directions incorrectly specified against thermodynamics (three in Yeast 5 and one in Recon 1). The workflow further identified for both models several isolated internal loops that require further curation. The framework also highlighted the need for substrate channeling (in human) and ATP hydrolysis (in yeast) for the essential reaction catalyzed by phosphoribosylaminoimidazole carboxylase in purine metabolism. Finally, the framework highlighted differences in proline metabolism between yeast (cytosolic anabolism and mitochondrial catabolism) and humans (exclusively mitochondrial metabolism). We conclude that network-embedded thermodynamics facilitates the specification and validation of irreversibility constraints in compartmentalized metabolic models, at the same time providing further insight into network properties. PMID:25028891
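The idea of testing an irreversibility constraint against thermodynamics can be illustrated with a simplified, per-reaction check. The sketch below assumes the usual relation ΔG = ΔG° + RT Σ s_i ln c_i and asks whether the sign of ΔG is fixed over a box of allowed metabolite concentrations. The function names, the example ΔG° values and the concentration bounds are hypothetical. Unlike the network-embedded analysis described in the abstract, this sketch treats each reaction in isolation and ignores the coupling of concentrations across the network.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def delta_g_range(dg0, stoich, conc_bounds, temperature=298.15):
    """Min and max Gibbs energy (kJ/mol) over independent concentration bounds.

    stoich maps metabolite -> stoichiometric coefficient (negative = substrate);
    conc_bounds maps metabolite -> (min, max) concentration in mol/L.
    """
    rt = R_KJ * temperature
    lo = hi = dg0
    for met, coeff in stoich.items():
        c_min, c_max = conc_bounds[met]
        terms = (rt * coeff * math.log(c_min), rt * coeff * math.log(c_max))
        lo += min(terms)
        hi += max(terms)
    return lo, hi


def feasible_direction(dg0, stoich, conc_bounds, temperature=298.15):
    """Classify a reaction as 'forward', 'reverse' or 'reversible'."""
    lo, hi = delta_g_range(dg0, stoich, conc_bounds, temperature)
    if hi < 0:
        return "forward"   # Gibbs energy is negative for every allowed concentration
    if lo > 0:
        return "reverse"   # Gibbs energy is positive for every allowed concentration
    return "reversible"    # the sign depends on concentrations


# Hypothetical reaction A -> B with an assumed physiological concentration range
bounds = {"A": (1e-6, 1e-2), "B": (1e-6, 1e-2)}
for dg0 in (-30.0, -5.0, 30.0):
    print(dg0, feasible_direction(dg0, {"A": -1, "B": 1}, bounds))
```

A reaction annotated as irreversible but classified here as "reversible" would be a candidate for closer inspection, which is the kind of question the workflow in the abstract addresses with a more rigorous network-wide treatment.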
An enhanced genome-scale metabolic reconstruction of Streptomyces clavuligerus identifies novel strain improvement strategies.

PubMed

Toro, León; Pinilla, Laura; Avignone-Rossa, Claudio; Ríos-Estepa, Rigoberto

2018-05-01

In this work, we expanded and updated a genome-scale metabolic model of Streptomyces clavuligerus. The model includes 1021 genes and 1494 biochemical reactions; genome-reaction information was curated and new features related to clavam metabolism and to the biomass synthesis equation were incorporated. The model was validated using experimental data from the literature and simulations were performed to predict cellular growth and clavulanic acid biosynthesis. Flux balance analysis (FBA) showed that limiting concentrations of phosphate and an excess of ammonia accumulation are unfavorable for growth and clavulanic acid biosynthesis. The evaluation of different objective functions for FBA showed that maximization of ATP yields the best predictions for cellular behavior in continuous cultures, while the maximization of growth rate provides better predictions for batch cultures. Through gene essentiality analysis, 130 essential genes were found using a limited in silico media, while 100 essential genes were identified in amino acid-supplemented media. Finally, a strain design was carried out to identify candidate genes to be overexpressed or knocked out so as to maximize antibiotic biosynthesis. Interestingly, potential metabolic engineering targets, identified in this study, have not been tested experimentally.
Dynamic genome-scale metabolic modeling of the yeast Pichia pastoris.

PubMed

Saitua, Francisco; Torres, Paulina; Pérez-Correa, José Ricardo; Agosin, Eduardo

2017-02-21

Pichia pastoris shows physiological advantages in producing recombinant proteins, compared to other commonly used cell factories. This yeast is mostly grown in dynamic cultivation systems, where the cell's environment is continuously changing and many variables influence process productivity. In this context, a model capable of explaining and predicting cell behavior for the rational design of bioprocesses is highly desirable. Currently, there are five genome-scale metabolic reconstructions of P. pastoris which have been used to predict extracellular cell behavior in stationary conditions. In this work, we assembled a dynamic genome-scale metabolic model for glucose-limited, aerobic cultivations of Pichia pastoris. Starting from an initial model structure for batch and fed-batch cultures, we performed pre/post regression diagnostics to ensure that model parameters were identifiable, significant and sensitive. Once identified, the non-relevant ones were iteratively fixed until a priori robust modeling structures were found for each type of cultivation. Next, the robustness of these reduced structures was confirmed by calibrating the model with new datasets, where no sensitivity, identifiability or significance problems appeared in their parameters. Afterwards, the model was validated for the prediction of batch and fed-batch dynamics in the studied conditions. Lastly, the model was employed as a case study to analyze the metabolic flux distribution of a fed-batch culture and to unravel genetic and process engineering strategies to improve the production of recombinant Human Serum Albumin (HSA). Simulation of single knock-outs indicated that deviation of carbon towards cysteine and tryptophan formation improves HSA production. The deletion of methylene tetrahydrofolate dehydrogenase could increase the HSA volumetric productivity by 630%. Moreover, given specific bioprocess limitations and strain characteristics, the model suggests that implementation of a decreasing
Exploring metabolic pathways in genome-scale networks via generating flux modes.

PubMed

Rezola, A; de Figueiredo, L F; Brock, M; Pey, J; Podhorski, A; Wittmann, C; Schuster, S; Bockmayr, A; Planes, F J

2011-02-15

The reconstruction of metabolic networks at the genome scale has allowed the analysis of metabolic pathways at an unprecedented level of complexity. Elementary flux modes (EFMs) are an appropriate concept for such analysis. However, their number grows in a combinatorial fashion as the size of the metabolic network increases, which renders the application of EFMs approach to large metabolic networks difficult. Novel methods are expected to deal with such complexity. In this article, we present a novel optimization-based method for determining a minimal generating set of EFMs, i.e. a convex basis. We show that a subset of elements of this convex basis can be effectively computed even in large metabolic networks. Our method was applied to examine the structure of pathways producing lysine in Escherichia coli. We obtained a more varied and informative set of pathways in comparison with existing methods. In addition, an alternative pathway to produce lysine was identified using a detour via propionyl-CoA, which shows the predictive power of our novel approach. The source code in C++ is available upon request.
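To make the notion of an elementary flux mode concrete, the following Python sketch checks two properties on a toy branched pathway: that a flux vector satisfies the steady-state condition S v = 0, and that its support (the set of active reactions) does not strictly contain the support of another known mode. The network and the helper names are assumptions for illustration. The elementarity test is only relative to a supplied list of modes, so it is not the optimization-based method of the paper and not a full EFM enumeration.

```python
# Toy network (assumed): internal metabolites A, B, C and five irreversible reactions
#   R1: -> A    R2: A -> B    R3: A -> C    R4: B ->    R5: C ->
S = [
    [1, -1, -1,  0,  0],  # A
    [0,  1,  0, -1,  0],  # B
    [0,  0,  1,  0, -1],  # C
]


def is_steady_state(S, v):
    """True if S v = 0 and all (irreversible) fluxes are non-negative."""
    balanced = all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)
    return balanced and all(x >= 0 for x in v)


def support(v):
    """Indices of reactions carrying non-zero flux."""
    return frozenset(i for i, x in enumerate(v) if x != 0)


def is_elementary_among(v, modes):
    """True if no mode in `modes` uses a strict subset of v's active reactions."""
    return not any(support(m) < support(v) for m in modes)


via_b = [1, 1, 0, 1, 0]   # route through B
via_c = [1, 0, 1, 0, 1]   # route through C
mixed = [2, 1, 1, 1, 1]   # sum of both routes: steady state, but not elementary

for name, v in [("via_b", via_b), ("via_c", via_c), ("mixed", mixed)]:
    print(name, is_steady_state(S, v), is_elementary_among(v, [via_b, via_c, mixed]))
```

In this toy case the two single-route modes already generate every steady-state flux distribution as non-negative combinations, which is the sense in which a small generating set can stand in for a much larger collection of modes.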
Network thermodynamic curation of human and yeast genome-scale metabolic models.

PubMed

Martínez, Verónica S; Quek, Lake-Ee; Nielsen, Lars K

2014-07-15

Genome-scale models are used for an ever-widening range of applications. Although there has been much focus on specifying the stoichiometric matrix, the predictive power of genome-scale models equally depends on reaction directions. Two-thirds of reactions in the two eukaryotic reconstructions Homo sapiens Recon 1 and Yeast 5 are specified as irreversible. However, these specifications are mainly based on biochemical textbooks or on their similarity to other organisms and are rarely underpinned by detailed thermodynamic analysis. In this study, a to our knowledge new workflow combining network-embedded thermodynamic and flux variability analysis was used to evaluate existing irreversibility constraints in Recon 1 and Yeast 5 and to identify new ones. A total of 27 and 16 new irreversible reactions were identified in Recon 1 and Yeast 5, respectively, whereas only four reactions were found with directions incorrectly specified against thermodynamics (three in Yeast 5 and one in Recon 1). The workflow further identified for both models several isolated internal loops that require further curation. The framework also highlighted the need for substrate channeling (in human) and ATP hydrolysis (in yeast) for the essential reaction catalyzed by phosphoribosylaminoimidazole carboxylase in purine metabolism. Finally, the framework highlighted differences in proline metabolism between yeast (cytosolic anabolism and mitochondrial catabolism) and humans (exclusively mitochondrial metabolism). We conclude that network-embedded thermodynamics facilitates the specification and validation of irreversibility constraints in compartmentalized metabolic models, at the same time providing further insight into network properties. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Characterizing the optimal flux space of genome-scale metabolic reconstructions through modified latin-hypercube sampling.

PubMed

Chaudhary, Neha; Tøndel, Kristin; Bhatnagar, Rakesh; dos Santos, Vítor A P Martins; Puchałka, Jacek

2016-03-01

Genome-Scale Metabolic Reconstructions (GSMRs), along with optimization-based methods, predominantly Flux Balance Analysis (FBA) and its derivatives, are widely applied for assessing and predicting the behavior of metabolic networks upon perturbation, thereby enabling identification of potential novel drug targets and biotechnologically relevant pathways. The abundance of alternate flux profiles has led to the evolution of methods to explore the complete solution space aiming to increase the accuracy of predictions. Herein we present a novel, generic algorithm to characterize the entire flux space of a GSMR upon application of FBA, leading to the optimal value of the objective (the optimal flux space). Our method employs Modified Latin-Hypercube Sampling (LHS) to effectively border the optimal space, followed by Principal Component Analysis (PCA) to identify and explain the major sources of variability within it. The approach was validated with the elementary mode analysis of a smaller network of Saccharomyces cerevisiae and applied to the GSMR of Pseudomonas aeruginosa PAO1 (iMO1086). It is shown to surpass the commonly used Monte Carlo Sampling (MCS) in providing a more uniform coverage for a much larger network with fewer samples. Results show that although many fluxes are identified as variable upon fixing the objective value, the majority of the variability can be reduced to several main patterns arising from a few alternative pathways. In iMO1086, initial variability of 211 reactions could almost entirely be explained by 7 alternative pathway groups. These findings imply that the possibilities to reroute greater portions of flux may be limited within metabolic networks of bacteria. Furthermore, the optimal flux space is subject to change with environmental conditions. Our method may be a useful device to validate the predictions made by FBA-based tools, by describing the optimal flux space associated with these predictions, and thus to improve them.
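For readers unfamiliar with Latin-hypercube sampling, a minimal sketch of the plain (unmodified) scheme follows. Each dimension is cut into as many equal strata as there are samples, and every stratum is used exactly once per dimension, which is what gives more even coverage than independent Monte Carlo draws. This is not the modified algorithm of the abstract: it only samples a box of flux bounds and ignores the steady-state and optimality constraints. The function name and the bounds are assumptions made for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points from a box so that each dimension is stratified.

    bounds is a list of (lower, upper) pairs, one per dimension.
    """
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # Use every stratum index exactly once, in a random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Pick a uniform point inside each stratum and scale it to [lo, hi).
        columns.append(
            [lo + (k + rng.random()) / n_samples * (hi - lo) for k in strata]
        )
    # Transpose the per-dimension columns into sample points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical flux bounds for three reactions.
flux_bounds = [(-10.0, 10.0), (0.0, 5.0), (0.0, 1000.0)]
for point in latin_hypercube(4, flux_bounds):
    print(point)
```

In a real application each sampled point would still have to be projected onto, or rejected against, the feasible flux space of the model.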
Metabolic reconstruction and flux analysis of industrial Pichia yeasts.

PubMed

Chung, Bevan Kai-Sheng; Lakshmanan, Meiyappan; Klement, Maximilian; Ching, Chi Bun; Lee, Dong-Yup

2013-03-01

Pichia yeasts have been recognized as important microbial cell factories in the biotechnological industry. Notably, the Pichia pastoris and Pichia stipitis species have attracted much research interest due to their unique cellular physiology and metabolic capability: P. pastoris has the ability to utilize methanol for cell growth and recombinant protein production, while P. stipitis is capable of assimilating xylose to produce ethanol under oxygen-limited conditions. To harness these characteristics for biotechnological applications, it is essential to characterize their metabolic behavior. Recently, following the genome sequencing of these two Pichia species, genome-scale metabolic networks have been reconstructed to model the yeasts' metabolism from a systems perspective. To date, there are three genome-scale models available for each of P. pastoris and P. stipitis. In this mini-review, we provide an overview of the models, discuss certain limitations of previous studies, and propose potential future works that can be conducted to better understand and engineer Pichia yeasts for industrial applications.
Probing the genome-scale metabolic landscape of Bordetella pertussis, the causative agent of whooping cough.

PubMed

Branco Dos Santos, Filipe; Olivier, Brett G; Boele, Joost; Smessaert, Vincent; De Rop, Philippe; Krumpochova, Petra; Klau, Gunnar W; Giera, Martin; Dehottay, Philippe; Teusink, Bas; Goffin, Philippe

2017-08-25

Whooping cough is a highly contagious respiratory disease caused by Bordetella pertussis. Despite vaccination, its incidence has been rising alarmingly, and yet, the physiology of B. pertussis remains poorly understood. We combined genome-scale metabolic reconstruction, a novel optimization algorithm and experimental data to probe the full metabolic potential of this pathogen, using strain Tohama I as a reference. Experimental validation showed that B. pertussis secretes a significant proportion of nitrogen as arginine and purine nucleosides, which may contribute to modulation of the host response. We also found that B. pertussis can be unexpectedly versatile, being able to metabolize many compounds while displaying minimal nutrient requirements. It can grow without cysteine - using inorganic sulfur sources such as thiosulfate - and it can grow on organic acids such as citrate or lactate as sole carbon sources, providing in vivo demonstration that its TCA cycle is functional. Although the metabolic reconstruction of eight additional strains indicates that the structural genes underlying this metabolic flexibility are widespread, experimental validation suggests a role of strain-specific regulatory mechanisms in shaping metabolic capabilities. Among five alternative strains tested, three were shown to grow on substrate combinations requiring a functional TCA cycle, but only one could use thiosulfate. Finally, the metabolic model was used to rationally design growth media with over two-fold improvements in pertussis toxin production. This study thus provides novel insights into B. pertussis physiology, and highlights the potential, but also limitations of models solely based on metabolic gene content. IMPORTANCE The metabolic capabilities of Bordetella pertussis - the causative agent of whooping cough - were investigated from a systems-level perspective. We constructed a comprehensive genome-scale metabolic model for B. pertussis, and challenged its predictions
Probing the Genome-Scale Metabolic Landscape of Bordetella pertussis, the Causative Agent of Whooping Cough

PubMed Central

Olivier, Brett G.; Boele, Joost; Smessaert, Vincent; De Rop, Philippe; Krumpochova, Petra; Klau, Gunnar W.; Giera, Martin; Dehottay, Philippe; Goffin, Philippe

2017-01-01

ABSTRACT Whooping cough is a highly contagious respiratory disease caused by Bordetella pertussis. Despite widespread vaccination, its incidence has been rising alarmingly, and yet, the physiology of B. pertussis remains poorly understood. We combined genome-scale metabolic reconstruction, a novel optimization algorithm, and experimental data to probe the full metabolic potential of this pathogen, using B. pertussis strain Tohama I as a reference. Experimental validation showed that B. pertussis secretes a significant proportion of nitrogen as arginine and purine nucleosides, which may contribute to modulation of the host response. We also found that B. pertussis can be unexpectedly versatile, being able to metabolize many compounds while displaying minimal nutrient requirements. It can grow without cysteine, using inorganic sulfur sources, such as thiosulfate, and it can grow on organic acids, such as citrate or lactate, as sole carbon sources, providing in vivo demonstration that its tricarboxylic acid (TCA) cycle is functional. Although the metabolic reconstruction of eight additional strains indicates that the structural genes underlying this metabolic flexibility are widespread, experimental validation suggests a role of strain-specific regulatory mechanisms in shaping metabolic capabilities. Among five alternative strains tested, three strains were shown to grow on substrate combinations requiring a functional TCA cycle, but only one strain could use thiosulfate. Finally, the metabolic model was used to rationally design growth media with >2-fold improvements in pertussis toxin production. This study thus provides novel insights into B. pertussis physiology and highlights the potential, but also the limitations, of models based solely on metabolic gene content. IMPORTANCE The metabolic capabilities of Bordetella pertussis, the causative agent of whooping cough, were investigated from a systems-level perspective. We constructed a comprehensive genome-scale
Grohar: Automated Visualization of Genome-Scale Metabolic Models and Their Pathways.

PubMed

Moškon, Miha; Zimic, Nikolaj; Mraz, Miha

2018-05-01

Genome-scale metabolic models (GEMs) have become a powerful tool for the investigation of the entire metabolism of the organism in silico. These models are, however, often extremely hard to reconstruct and also difficult to apply to the selected problem. Visualization of the GEM allows us to comprehend the model more easily, to perform its graphical analysis, to find and correct the faulty relations, to identify the parts of the system with a designated function, etc. Even though several approaches for the automatic visualization of GEMs have been proposed, metabolic maps are still manually drawn or at least require a large amount of manual curation. We present Grohar, a computational tool for automatic identification and visualization of GEM (sub)networks and their metabolic fluxes. These (sub)networks can be specified directly by listing the metabolites of interest or indirectly by providing reference metabolic pathways from different sources, such as KEGG, SBML, or Matlab files. These pathways are identified within the GEM using three different pathway alignment algorithms. Grohar also supports the visualization of the model adjustments (e.g., activation or inhibition of metabolic reactions) after perturbations are induced.
Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

NASA Astrophysics Data System (ADS)

Tartakovsky, G. D.; Tartakovsky, A. M.; Scheibe, T. D.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

2013-09-01

Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model
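The Monod-type formulation that the abstract contrasts with the genome-scale model can be sketched in a few lines. The example below is a well-mixed batch culture with a single limiting substrate, integrated with an explicit Euler step. It is not the authors' pore-scale model, and every parameter value (maximum rate, half-saturation constant, yield, initial concentrations) is hypothetical.

```python
def monod_rate(s, mu_max, k_s):
    """Specific growth rate for substrate concentration s (Monod kinetics)."""
    return mu_max * s / (k_s + s)


def simulate_batch(s0, x0, mu_max, k_s, yield_xs, dt=0.01, steps=2000):
    """Explicit-Euler integration of dX/dt = mu*X and dS/dt = -mu*X/Y.

    Returns the final substrate and biomass concentrations.
    """
    s, x = s0, x0
    for _ in range(steps):
        growth = monod_rate(s, mu_max, k_s) * x * dt  # biomass formed this step
        x += growth
        s -= growth / yield_xs                        # substrate consumed this step
    return s, x


# Hypothetical parameters: acetate as the limiting electron donor.
s_end, x_end = simulate_batch(s0=10.0, x0=0.01, mu_max=0.5, k_s=0.5, yield_xs=0.1)
print(f"substrate left: {s_end:.4f}, biomass: {x_end:.4f}")
```

The two quantities the abstract reports adjusting when fitting an equivalent Monod model, the reaction stoichiometry and the biomass yield coefficient, correspond here to the single parameter `yield_xs`, which fixes how much substrate is consumed per unit of biomass formed.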
Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tartakovsky, Guzel D.; Tartakovsky, Alexandre M.; Scheibe, Timothy D.

2013-09-07

Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale

Pore-scale simulation of microbial growth using a genome-scale metabolic model: Implications for Darcy-scale reactive transport

NASA Astrophysics Data System (ADS)

Scheibe, T. D.; Tartakovsky, G.; Tartakovsky, A. M.; Fang, Y.; Mahadevan, R.; Lovley, D. R.

2012-12-01

Recent advances in microbiology have enabled the quantitative simulation of microbial metabolism and growth based on genome-scale characterization of metabolic pathways and fluxes. We have incorporated a genome-scale metabolic model of the iron-reducing bacteria Geobacter sulfurreducens into a pore-scale simulation of microbial growth based on coupling of iron reduction to oxidation of a soluble electron donor (acetate). In our model, fluid flow and solute transport is governed by a combination of the Navier-Stokes and advection-diffusion-reaction equations. Microbial growth occurs only on the surface of soil grains where solid-phase mineral iron oxides are available. Mass fluxes of chemical species associated with microbial growth are described by the genome-scale microbial model, implemented using a constraint-based metabolic model, and provide the Robin-type boundary condition for the advection-diffusion equation at soil grain surfaces. Conventional models of microbially-mediated subsurface reactions use a lumped reaction model that does not consider individual microbial reaction pathways, and describe reactions rates using empirically-derived rate formulations such as the Monod-type kinetics. We have used our pore-scale model to explore the relationship between genome-scale metabolic models and Monod-type formulations, and to assess the manifestation of pore-scale variability (microenvironments) in terms of apparent Darcy-scale microbial reaction rates. The genome-scale model predicted lower biomass yield, and different stoichiometry for iron consumption, in comparison to prior Monod formulations based on energetics considerations. We were able to fit an equivalent Monod model, by modifying the reaction stoichiometry and biomass yield coefficient, that could effectively match results of the genome-scale simulation of microbial behaviors under excess nutrient conditions, but predictions of the fitted Monod model deviated from those of the genome-scale model
Identification of functional differences in metabolic networks using comparative genomics and constraint-based models.

PubMed

Hamilton, Joshua J; Reed, Jennifer L

2012-01-01

Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. Using CONGA, we can identify differences in reaction and gene content which give rise to different
Identification of Functional Differences in Metabolic Networks Using Comparative Genomics and Constraint-Based Models

PubMed Central

Hamilton, Joshua J.; Reed, Jennifer L.

2012-01-01

Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. Using CONGA, we can identify differences in reaction and gene content which give rise to different
Genome-scale reconstruction of the metabolic network in Yersinia pestis CO92

NASA Astrophysics Data System (ADS)

Navid, Ali; Almaas, Eivind

2007-03-01

The gram-negative bacterium Yersinia pestis is the causative agent of bubonic plague. Using publicly available genomic, biochemical and physiological data, we have developed a constraint-based flux balance model of metabolism in the CO92 strain (biovar Orientalis) of this organism. The metabolic reactions were appropriately compartmentalized, and the model accounts for the exchange of metabolites, as well as the import of nutrients and export of waste products. We have characterized the metabolic capabilities and phenotypes of this organism, after comparing the model predictions with available experimental observations to evaluate accuracy and completeness. We have also begun preliminary studies into how cellular metabolism affects virulence.
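The core constraint behind a flux balance model of this kind is that, at steady state, production and consumption of every internal metabolite cancel, with exchange reactions carrying nutrients in and waste products out. The sketch below checks that condition (S·v = 0 together with flux bounds) for a toy three-reaction network. The network, the reaction names and the bounds are invented for illustration and have nothing to do with the Y. pestis model itself.

```python
def mass_balance_residuals(s_matrix, fluxes):
    """Return S*v: the net production rate of each metabolite."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in s_matrix]


def is_feasible(s_matrix, fluxes, bounds, tol=1e-9):
    """True if the flux vector is at steady state and within its bounds."""
    balanced = all(abs(r) <= tol for r in mass_balance_residuals(s_matrix, fluxes))
    in_bounds = all(lo - tol <= v <= hi + tol for v, (lo, hi) in zip(fluxes, bounds))
    return balanced and in_bounds


# Toy network. Columns: EX_A (import of A), R1 (A -> B), EX_B (export of B).
# Rows: metabolites A and B.
S = [
    [1, -1, 0],   # A: produced by EX_A, consumed by R1
    [0, 1, -1],   # B: produced by R1, consumed by EX_B
]
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0)]

print(is_feasible(S, [2.0, 2.0, 2.0], bounds))   # balanced and within bounds
print(is_feasible(S, [2.0, 1.0, 1.0], bounds))   # A accumulates
print(is_feasible(S, [20.0, 20.0, 20.0], bounds))  # import exceeds its bound
```

Flux balance analysis then searches this feasible set for the flux vector that maximizes an objective such as biomass production, which requires a linear programming solver and is not shown here.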
IMGMD: A platform for the integration and standardisation of In silico Microbial Genome-scale Metabolic Models.

PubMed

Ye, Chao; Xu, Nan; Dong, Chuan; Ye, Yuannong; Zou, Xuan; Chen, Xiulai; Guo, Fengbiao; Liu, Liming

2017-04-07

Genome-scale metabolic models (GSMMs) constitute a platform that combines genome sequences and detailed biochemical information to quantify microbial physiology at the system level. To improve the unity, integrity, correctness, and format of data in published GSMMs, a consensus IMGMD database was built in the LAMP (Linux + Apache + MySQL + PHP) system by integrating and standardizing 328 GSMMs constructed for 139 microorganisms. The IMGMD database can help microbial researchers download manually curated GSMMs, rapidly reconstruct standard GSMMs, design pathways, and identify metabolic targets for strategies on strain improvement. Moreover, the IMGMD database facilitates the integration of wet-lab and in silico data to gain an additional insight into microbial physiology. The IMGMD database is freely available, without any registration requirements, at http://imgmd.jiangnan.edu.cn/database.
Evaluation of a Genome-Scale In Silico Metabolic Model for Geobacter metallireducens Using Proteomic Data from a Field Biostimulation Experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fang, Yilin; Wilkins, Michael J.; Yabusaki, Steven B.

2012-12-12

Biomass and shotgun global proteomics data that reflected relative protein abundances from samples collected during the 2008 experiment at the U.S. Department of Energy Integrated Field-Scale Subsurface Research Challenge site in Rifle, Colorado, provided an unprecedented opportunity to validate a genome-scale metabolic model of Geobacter metallireducens and assess its performance with respect to prediction of metal reduction, biomass yield, and growth rate under dynamic field conditions. Reconstructed from annotated genomic sequence, biochemical, and physiological data, the constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low fluxes through amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
Metabolic network reconstruction and genome-scale model of butanol-producing strain Clostridium beijerinckii NCIMB 8052

PubMed Central

2011-01-01

Background Solventogenic clostridia offer a sustainable alternative to petroleum-based production of butanol--an important chemical feedstock and potential fuel additive or replacement. C. beijerinckii is an attractive microorganism for strain design to improve butanol production because it (i) naturally produces the highest recorded butanol concentrations as a byproduct of fermentation; and (ii) can co-ferment pentose and hexose sugars (the primary products from lignocellulosic hydrolysis). Interrogating C. beijerinckii metabolism from a systems viewpoint using constraint-based modeling allows for simulation of the global effect of genetic modifications. Results We present the first genome-scale metabolic model (iCM925) for C. beijerinckii, containing 925 genes, 938 reactions, and 881 metabolites. To build the model we employed a semi-automated procedure that integrated genome annotation information from KEGG, BioCyc, and The SEED, and utilized computational algorithms with manual curation to improve model completeness. Interestingly, we found only a 34% overlap in reactions collected from the three databases--highlighting the importance of evaluating the predictive accuracy of the resulting genome-scale model. To validate iCM925, we conducted fermentation experiments using the NCIMB 8052 strain, and evaluated the ability of the model to simulate measured substrate uptake and product production rates. Experimentally observed fermentation profiles were found to lie within the solution space of the model; however, under an optimal growth objective, additional constraints were needed to reproduce the observed profiles--suggesting the existence of selective pressures other than optimal growth. Notably, a significantly enriched fraction of actively utilized reactions in simulations--constrained to reflect experimental rates--originated from the set of reactions that overlapped between all three databases (P = 3.52 × 10^-9, Fisher's exact test). Inhibition of the
Reconstruction of the metabolic network of Pseudomonas aeruginosa to interrogate virulence factor synthesis

NASA Astrophysics Data System (ADS)

Bartell, Jennifer A.; Blazier, Anna S.; Yen, Phillip; Thøgersen, Juliane C.; Jelsbak, Lars; Goldberg, Joanna B.; Papin, Jason A.

2017-03-01

Virulence-linked pathways in opportunistic pathogens are putative therapeutic targets that may be associated with less potential for resistance than targets in growth-essential pathways. However, efficacy of virulence-linked targets may be affected by the contribution of virulence-related genes to metabolism. We evaluate the complex interrelationships between growth and virulence-linked pathways using a genome-scale metabolic network reconstruction of Pseudomonas aeruginosa strain PA14 and an updated, expanded reconstruction of P. aeruginosa strain PAO1. The PA14 reconstruction accounts for the activity of 112 virulence-linked genes and virulence factor synthesis pathways that produce 17 unique compounds. We integrate eight published genome-scale mutant screens to validate gene essentiality predictions in rich media, contextualize intra-screen discrepancies and evaluate virulence-linked gene distribution across essentiality datasets. Computational screening further elucidates interconnectivity between inhibition of virulence factor synthesis and growth. Successful validation of selected gene perturbations using PA14 transposon mutants demonstrates the utility of model-driven screening of therapeutic targets.
A multi-tissue type genome-scale metabolic network for analysis of whole-body systems physiology

PubMed Central

2011-01-01

Background Genome-scale metabolic reconstructions provide a biologically meaningful mechanistic basis for the genotype-phenotype relationship. The global human metabolic network, termed Recon 1, has recently been reconstructed allowing the systems analysis of human metabolic physiology and pathology. Utilizing high-throughput data, Recon 1 has recently been tailored to different cells and tissues, including the liver, kidney, brain, and alveolar macrophage. These models have shown utility in the study of systems medicine. However, no integrated analysis between human tissues has been done. Results To describe tissue-specific functions, Recon 1 was tailored to describe metabolism in three human cells: adipocytes, hepatocytes, and myocytes. These cell-specific networks were manually curated and validated based on known cellular metabolic functions. To study intercellular interactions, a novel multi-tissue type modeling approach was developed to integrate the metabolic functions for the three cell types, and subsequently used to simulate known integrated metabolic cycles. In addition, the multi-tissue model was used to study diabetes: a pathology with systemic properties. High-throughput data was integrated with the network to determine differential metabolic activity between obese and type II obese gastric bypass patients in a whole-body context. Conclusion The multi-tissue type modeling approach presented provides a platform to study integrated metabolic states. As more cell and tissue-specific models are released, it is critical to develop a framework in which to study their interdependencies. PMID:22041191
A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks.

PubMed

Röhl, Annika; Bockmayr, Alexander

2017-01-03

Constraint-based analysis has become a widely used method to study metabolic networks. While some of the associated algorithms can be applied to genome-scale network reconstructions with several thousands of reactions, others are limited to small or medium-sized models. In 2015, Erdrich et al. introduced a method called NetworkReducer, which reduces large metabolic networks to smaller subnetworks, while preserving a set of biological requirements that can be specified by the user. Already in 2001, Burgard et al. developed a mixed-integer linear programming (MILP) approach for computing minimal reaction sets under a given growth requirement. Here we present an MILP approach for computing minimum subnetworks with the given properties. The minimality (with respect to the number of active reactions) is not guaranteed by NetworkReducer, while the method by Burgard et al. does not allow specifying the different biological requirements. Our procedure is about 5-10 times faster than NetworkReducer and can enumerate all minimum subnetworks when several exist. This allows identifying common reactions that are present in all subnetworks, and reactions appearing in alternative pathways. Applying complex analysis methods to genome-scale metabolic networks is often not possible in practice. Thus it may become necessary to reduce the size of the network while keeping important functionalities. We propose an MILP solution to this problem. Compared to previous work, our approach is more efficient and allows computing not only one, but even all minimum subnetworks satisfying the required properties.
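The idea of enumerating all minimum subnetworks, and then separating reactions common to all of them from those that belong to alternative pathways, can be shown on a toy network by brute force. The sketch below is deliberately not an MILP and does not test steady-state flux feasibility: it substitutes a simpler, purely topological requirement (the target metabolite must be reachable from the seed metabolites) and tries reaction subsets in order of increasing size. The network and all names are hypothetical, and exhaustive enumeration is only practical for a handful of reactions.

```python
from itertools import combinations


def producible(reactions, active, seeds, target):
    """True if target can be reached from seeds using only the active reactions."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for name in active:
            substrates, products = reactions[name]
            # Fire a reaction once all of its substrates are available.
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return target in available


def minimum_subnetworks(reactions, seeds, target):
    """Return every smallest set of reactions that still produces the target."""
    names = sorted(reactions)
    for size in range(len(names) + 1):
        hits = [set(combo) for combo in combinations(names, size)
                if producible(reactions, combo, seeds, target)]
        if hits:
            return hits   # the first non-empty size is the minimum
    return []


# Toy network: two alternative routes from A to C, then one step from C to E.
reactions = {
    "R1": ({"A"}, {"B"}),
    "R2": ({"B"}, {"C"}),
    "R3": ({"A"}, {"D"}),
    "R4": ({"D"}, {"C"}),
    "R5": ({"C"}, {"E"}),
}
subnets = minimum_subnetworks(reactions, seeds={"A"}, target="E")
common = set.intersection(*subnets)
alternative = set.union(*subnets) - common
print(subnets, common, alternative)
```

In this toy case the two minimum subnetworks share only the final step, so R5 is the common reaction and R1 to R4 are the reactions that appear in alternative pathways.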
Computational modelling of genome-scale metabolic networks and its application to CHO cell cultures.

PubMed

Rejc, Živa; Magdevska, Lidija; Tršelič, Tilen; Osolin, Timotej; Vodopivec, Rok; Mraz, Jakob; Pavliha, Eva; Zimic, Nikolaj; Cvitanović, Tanja; Rozman, Damjana; Moškon, Miha; Mraz, Miha

2017-09-01

Genome-scale metabolic models (GEMs) have become increasingly important in recent years. Currently, GEMs are the most accurate in silico representation of the genotype-phenotype link. They allow us to study complex networks from the systems perspective. Their application may drastically reduce the amount of experimental and clinical work, improve diagnostic tools and increase our understanding of complex biological phenomena. GEMs have also demonstrated high potential for the optimisation of bio-based production of recombinant proteins. Herein, we review the basic concepts, methods, resources and software tools used for the reconstruction and application of GEMs. We overview the evolution of the modelling efforts devoted to the metabolism of Chinese Hamster Ovary (CHO) cells. We present a case study on CHO cell metabolism under different amino acid depletions. This leads us to the identification of the most influential as well as essential amino acids in selected CHO cell lines. Copyright © 2017 Elsevier Ltd. All rights reserved.
Modeling cancer metabolism on a genome scale

PubMed Central

Yizhak, Keren; Chaneton, Barbara; Gottlieb, Eyal; Ruppin, Eytan

2015-01-01

Cancer cells have fundamentally altered cellular metabolism that is associated with their tumorigenicity and malignancy. In addition to the widely studied Warburg effect, several new key metabolic alterations in cancer have been established over the last decade, leading to the recognition that altered tumor metabolism is one of the hallmarks of cancer. Deciphering the full scope and functional implications of the dysregulated metabolism in cancer requires both the advancement of a variety of omics measurements and the advancement of computational approaches for the analysis and contextualization of the accumulated data. Encouragingly, while the metabolic network is highly interconnected and complex, it is at the same time probably the best characterized cellular network. This review then discusses the challenges that genome-scale modeling of cancer metabolism has been facing. We survey several recent studies demonstrating the first strides that have been made, testifying to the value of this approach in portraying a network-level view of cancer metabolism and in identifying novel drug targets and biomarkers. Finally, we outline a few new steps that may further advance this field. PMID:26130389
Multiscale Metabolic Modeling of C4 Plants: Connecting Nonlinear Genome-Scale Models to Leaf-Scale Metabolism in Developing Maize Leaves

PubMed Central

Bogart, Eli; Myers, Christopher R.

2016-01-01

C4 plants, such as maize, concentrate carbon dioxide in a specialized compartment surrounding the veins of their leaves to improve the efficiency of carbon dioxide assimilation. Nonlinear relationships between carbon dioxide and oxygen levels and reaction rates are key to their physiology but cannot be handled with standard techniques of constraint-based metabolic modeling. We demonstrate that incorporating these relationships as constraints on reaction rates and solving the resulting nonlinear optimization problem yields realistic predictions of the response of C4 systems to environmental and biochemical perturbations. Using a new genome-scale reconstruction of maize metabolism, we build an 18000-reaction, nonlinearly constrained model describing mesophyll and bundle sheath cells in 15 segments of the developing maize leaf, interacting via metabolite exchange, and use RNA-seq and enzyme activity measurements to predict spatial variation in metabolic state by a novel method that optimizes correlation between fluxes and expression data. Though such correlations are known to be weak in general, we suggest that developmental gradients may be particularly suited to the inference of metabolic fluxes from expression data, and we demonstrate that our method predicts fluxes that achieve high correlation with the data, successfully capture the experimentally observed base-to-tip transition between carbon-importing tissue and carbon-exporting tissue, and include a nonzero growth rate, in contrast to prior results from similar methods in other systems. PMID:26990967
Caveat emptor: limitations of the automated reconstruction of metabolic pathways in Plasmodium.

PubMed

Ginsburg, Hagai

2009-01-01

The functional reconstruction of metabolic pathways from an annotated genome is a tedious and demanding enterprise. Automation of this endeavor using bioinformatics algorithms could cope with the ever-increasing number of sequenced genomes and accelerate the process. Here, the manual reconstruction of metabolic pathways in the functional genomic database of Plasmodium falciparum--Malaria Parasite Metabolic Pathways--is described and compared with pathways generated automatically as they appear in PlasmoCyc, metaSHARK and the Kyoto Encyclopedia for Genes and Genomes. A critical evaluation of this comparison discloses that the automatic reconstruction of pathways generates manifold paths that need an expert manual verification to accept some and reject most others based on manually curated gene annotation.
Enumeration of Smallest Intervention Strategies in Genome-Scale Metabolic Networks

PubMed Central

von Kamp, Axel; Klamt, Steffen

2014-01-01

One ultimate goal of metabolic network modeling is the rational redesign of biochemical networks to optimize the production of certain compounds by cellular systems. Although several constraint-based optimization techniques have been developed for this purpose, methods for systematic enumeration of intervention strategies in genome-scale metabolic networks are still lacking. In principle, Minimal Cut Sets (MCSs; inclusion-minimal combinations of reaction or gene deletions that lead to the fulfilment of a given intervention goal) provide an exhaustive enumeration approach. However, their disadvantage is the combinatorial explosion in larger networks and the requirement to first compute the elementary modes (EMs), which is itself impractical in genome-scale networks. We present MCSEnumerator, a new method for effective enumeration of the smallest MCSs (with fewest interventions) in genome-scale metabolic network models. For this we combine two approaches, namely (i) the mapping of MCSs to EMs in a dual network, and (ii) a modified algorithm by which shortest EMs can be effectively determined in large networks. In this way, we can identify the smallest MCSs by calculating the shortest EMs in the dual network. Realistic application examples demonstrate that our algorithm is able to list thousands of the most efficient intervention strategies in genome-scale networks for various intervention problems. For instance, for the first time we could enumerate all synthetic lethals in E. coli with combinations of up to 5 reactions. We also applied the new algorithm exemplarily to compute strain designs for growth-coupled synthesis of different products (ethanol, fumarate, serine) by E. coli. We found numerous new engineering strategies partially requiring fewer knockouts and guaranteeing higher product yields (even without the assumption of optimal growth) than reported previously. The strength of the presented approach is that smallest intervention strategies can be quickly
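
As a small illustration of the cut-set idea in the record above: when the elementary modes to be blocked are already known, the minimal cut sets are the inclusion-minimal sets of reactions that hit every such mode. The sketch below enumerates them by brute force, smallest first. It is not the MCSEnumerator algorithm (which avoids computing EMs by working in a dual network); the function name and the toy modes are hypothetical, and the approach only scales to tiny examples.

```python
from itertools import combinations

def smallest_cut_sets(target_modes, max_size):
    """Enumerate inclusion-minimal hitting sets of the target modes, smallest first.

    target_modes: list of sets of reaction names (the modes to be disabled).
    max_size: largest number of simultaneous deletions to consider.
    """
    reactions = sorted(set().union(*target_modes))
    found = []
    for size in range(1, max_size + 1):
        for candidate in combinations(reactions, size):
            cand = set(candidate)
            # Skip supersets of a cut set found earlier: they are not minimal.
            if any(prev <= cand for prev in found):
                continue
            # A cut set must remove at least one reaction from every target mode.
            if all(cand & mode for mode in target_modes):
                found.append(cand)
    return found

# Hypothetical toy example: three modes that all lead to an unwanted product.
modes = [{"R1", "R2"}, {"R1", "R3"}, {"R2", "R3", "R4"}]
for cut in smallest_cut_sets(modes, max_size=2):
    print(sorted(cut))
```

In this toy case no single deletion blocks all three modes, so the smallest cut sets are pairs. A fuller treatment would also protect desired modes (constrained MCSs), which this sketch omits.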
Optimal knockout strategies in genome-scale metabolic networks using particle swarm optimization.

PubMed

Nair, Govind; Jungreuthmayer, Christian; Zanghellini, Jürgen

2017-02-01

Knockout strategies, particularly the concept of constrained minimal cut sets (cMCSs), are an important part of the arsenal of tools used in manipulating metabolic networks. Given a specific design, cMCSs can be calculated even in genome-scale networks. We would however like to find not only the optimal intervention strategy for a given design but the best possible design too. Our solution (PSOMCS) is to use particle swarm optimization (PSO) along with the direct calculation of cMCSs from the stoichiometric matrix to obtain optimal designs satisfying multiple objectives. To illustrate the working of PSOMCS, we apply it to a toy network. Next we show its superiority by comparing its performance against other comparable methods on a medium-sized E. coli core metabolic network. PSOMCS not only finds solutions comparable to previously published results but is also orders of magnitude faster. Finally, we use PSOMCS to predict knockouts satisfying multiple objectives in a genome-scale metabolic model of E. coli and compare it with OptKnock and RobustKnock. PSOMCS finds competitive knockout strategies and designs compared to other current methods and is in some cases significantly faster. It can be used in identifying knockouts which will force optimal desired behaviors in large and genome-scale metabolic networks. It will be even more useful as larger metabolic models of industrially relevant organisms become available.
Genome-scale reconstruction of the sigma factor network in Escherichia coli: topology and functional states

PubMed Central

2014-01-01

Background At the beginning of the transcription process, the RNA polymerase (RNAP) core enzyme requires a σ-factor to recognize the genomic location at which the process initiates. Although the crucial role of σ-factors has long been appreciated and characterized for many individual promoters, we do not yet have a genome-scale assessment of their function. Results Using multiple genome-scale measurements, we elucidated the network of σ-factor and promoter interactions in Escherichia coli. The reconstructed network includes 4,724 σ-factor-specific promoters corresponding to transcription units (TUs), representing an increase of more than 300% over what has been previously reported. The reconstructed network was used to investigate competition between alternative σ-factors (the σ70 and σ38 regulons), confirming the competition model of σ substitution and negative regulation by alternative σ-factors. Comparison with σ-factor binding in Klebsiella pneumoniae showed that transcriptional regulation of conserved genes in closely related species is unexpectedly divergent. Conclusions The reconstructed network reveals the regulatory complexity of the promoter architecture in prokaryotic genomes, and opens a path to the direct determination of the systems biology of their transcriptional regulatory networks. PMID:24461193
Comprehensive reconstruction and in silico analysis of Aspergillus niger genome-scale metabolic network model that accounts for 1210 ORFs.

PubMed

Lu, Hongzhong; Cao, Weiqiang; Ouyang, Liming; Xia, Jianye; Huang, Mingzhi; Chu, Ju; Zhuang, Yingping; Zhang, Siliang; Noorman, Henk

2017-03-01

Aspergillus niger is one of the most important cell factories for industrial enzymes and organic acids production. A comprehensive genome-scale metabolic network model (GSMM) with high quality is crucial for efficient strain improvement and process optimization. The lack of accurate reaction equations and gene-protein-reaction associations (GPRs) in the current best model of A. niger named GSMM iMA871, however, limits its application scope. To overcome these limitations, we updated the A. niger GSMM by combining the latest genome annotation and literature mining technology. Compared with iMA871, the number of reactions in iHL1210 was increased from 1,380 to 1,764, and the number of unique ORFs from 871 to 1,210. With the aid of our transcriptomics analysis, the existence of 63% of ORFs and 68% of reactions in iHL1210 can be verified when glucose was used as the only carbon source. Physiological data from chemostat cultivations, 13C-labeled and molecular experiments from the published literature were further used to check the performance of iHL1210. The average correlation coefficients between the predicted fluxes and estimated fluxes from 13C-labeling data were sufficiently high (above 0.89) and the prediction of cell growth on most of the reported carbon and nitrogen sources was consistent. Using the updated genome-scale model, we evaluated gene essentiality on synthetic and yeast extract medium, as well as the effects of NADPH supply on glucoamylase production in A. niger. In summary, the new A. niger GSMM iHL1210 contains significant improvements with respect to the metabolic coverage and prediction performance, which paves the way for systematic metabolic engineering of A. niger. Biotechnol. Bioeng. 2017;114: 685-695. © 2016 Wiley Periodicals, Inc.
Genome-scale resources for Thermoanaerobacterium saccharolyticum.

PubMed

Currie, Devin H; Raman, Babu; Gowen, Christopher M; Tschaplinski, Timothy J; Land, Miriam L; Brown, Steven D; Covalla, Sean F; Klingeman, Dawn M; Yang, Zamin K; Engle, Nancy L; Johnson, Courtney M; Rodriguez, Miguel; Shaw, A Joe; Kenealy, William R; Lynd, Lee R; Fong, Stephen S; Mielenz, Jonathan R; Davison, Brian H; Hogsett, David A; Herring, Christopher D

2015-06-26

Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. A major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources was recognized early on as a key limitation. Here we present a set of genome-scale resources to enable the systems level investigation and development of this potentially important industrial organism. Resources include a complete genome sequence for strain JW/SL-YS485, a genome-scale reconstruction of metabolism, tiled microarray data showing transcription units, mRNA expression data from 71 different growth conditions or timepoints and GC/MS-based metabolite analysis data from 42 different conditions or timepoints. Growth conditions include hemicellulose hydrolysate, the inhibitors HMF, furfural, diamide, and ethanol, as well as high levels of cellulose, xylose, cellobiose or maltodextrin. The genome consists of a 2.7 Mbp chromosome and a 110 Kbp megaplasmid. An active prophage was also detected, and the expression levels of CRISPR genes were observed to increase in association with those of the phage. Hemicellulose hydrolysate elicited a response of carbohydrate transport and catabolism genes, as well as poorly characterized genes suggesting a redox challenge. In some conditions, a time series of combined transcription and metabolite measurements was made to allow careful study of microbial physiology under process conditions. As a demonstration of the potential utility of the metabolic reconstruction, the OptKnock algorithm was used to predict a set of gene knockouts that maximize growth-coupled ethanol production. The predictions validated intuitive strain designs and matched previous experimental results. These data will be a useful asset for efforts to develop T. saccharolyticum for efficient industrial production of biofuels. The resources presented herein may also be
Construction of a Genome-Scale Metabolic Model of Arthrospira platensis NIES-39 and Metabolic Design for Cyanobacterial Bioproduction

PubMed Central

Yoshikawa, Katsunori; Aikawa, Shimpei; Kojima, Yuta; Toya, Yoshihiro; Furusawa, Chikara; Kondo, Akihiko; Shimizu, Hiroshi

2015-01-01

Arthrospira (Spirulina) platensis is a promising feedstock and host strain for bioproduction because of its high accumulation of glycogen and superior characteristics for industrial production. Metabolic simulation using a genome-scale metabolic model and flux balance analysis is a powerful method that can be used to design metabolic engineering strategies for the improvement of target molecule production. In this study, we constructed a genome-scale metabolic model of A. platensis NIES-39 including 746 metabolic reactions and 673 metabolites, and developed novel strategies to improve the production of valuable metabolites, such as glycogen and ethanol. The simulation results obtained using the metabolic model showed high consistency with experimental results for growth rates under several trophic conditions and growth capabilities on various organic substrates. The metabolic model was further applied to design a metabolic network to improve the autotrophic production of glycogen and ethanol. Decreased flux of reactions related to the TCA cycle and phosphoenolpyruvate reaction were found to improve glycogen production. Furthermore, in silico knockout simulation indicated that deletion of genes related to the respiratory chain, such as NAD(P)H dehydrogenase and cytochrome-c oxidase, could enhance ethanol production by using ammonium as a nitrogen source. PMID:26640947

Integration and Validation of the Genome-Scale Metabolic Models of Pichia pastoris: A Comprehensive Update of Protein Glycosylation Pathways, Lipid and Energy Metabolism.

PubMed

Tomàs-Gamisans, Màrius; Ferrer, Pau; Albiol, Joan

2016-01-01

Genome-scale metabolic models (GEMs) are tools that allow predicting a phenotype from a genotype under certain environmental conditions. GEMs have been developed in the last ten years for a broad range of organisms, and are used for multiple purposes such as discovering new properties of metabolic networks, predicting new targets for metabolic engineering, as well as optimizing the cultivation conditions for biochemicals or recombinant protein production. Pichia pastoris is one of the most widely used organisms for heterologous protein expression. There are different GEMs for this methylotrophic yeast of which the most relevant and complete in the published literature are iPP668, PpaMBEL1254 and iLC915. However, these three models differ regarding certain pathways, terminology for metabolites and reactions and annotations. Moreover, GEMs for some species are typically built based on the reconstructed models of related model organisms. In these cases, some organism-specific pathways could be missing or misrepresented. In order to provide an updated and more comprehensive GEM for P. pastoris, we have reconstructed and validated a consensus model integrating and merging all three existing models. In this step a comprehensive review and integration of the metabolic pathways included in each one of these three versions was performed. In addition, the resulting iMT1026 model includes a new description of some metabolic processes. In particular, new information described in recently published literature is included, mainly related to fatty acid and sphingolipid metabolism, glycosylation and cell energetics. Finally, the reconstructed model was tested and validated by comparing the simulation results with available empirical physiological datasets obtained from a wide range of experimental conditions, such as different carbon sources, distinct oxygen availability conditions, and the production of two different recombinant proteins. In these simulations, the
Context-specific metabolic networks are consistent with experiments.

PubMed

Becker, Scott A; Palsson, Bernhard O

2008-05-16

Reconstructions of cellular metabolism are publicly available for a variety of different microorganisms and some mammalian genomes. To date, these reconstructions are "genome-scale" and strive to include all reactions implied by the genome annotation, as well as those with direct experimental evidence. Clearly, many of the reactions in a genome-scale reconstruction will not be active under particular conditions or in a particular cell type. Methods to tailor these comprehensive genome-scale reconstructions into context-specific networks will aid predictive in silico modeling for a particular situation. We present a method called Gene Inactivity Moderated by Metabolism and Expression (GIMME) to achieve this goal. The GIMME algorithm uses quantitative gene expression data and one or more presupposed metabolic objectives to produce the context-specific reconstruction that is most consistent with the available data. Furthermore, the algorithm provides a quantitative inconsistency score indicating how consistent a set of gene expression data is with a particular metabolic objective. We show that this algorithm produces results consistent with biological experiments and intuition for adaptive evolution of bacteria, rational design of metabolic engineering strains, and human skeletal muscle cells. This work represents progress towards producing constraint-based models of metabolism that are specific to the conditions where the expression profiling data is available.
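
To make the "inconsistency score" in the record above concrete, the sketch below evaluates such a score for one given flux distribution. It assumes the penalty form commonly attributed to GIMME: each reaction whose expression falls below a cutoff is penalized by (cutoff minus expression) times the absolute flux it carries. The actual method finds the flux distribution minimizing this quantity with a linear program, subject to the required metabolic objective; that optimization is not shown. Reaction names, numbers, and the treatment of reactions without data are illustrative assumptions.

```python
def inconsistency_score(fluxes, expression, threshold):
    """Sum of expression shortfall times absolute flux over all reactions.

    fluxes: dict reaction -> flux value (sign gives direction).
    expression: dict reaction -> expression level mapped to that reaction.
    threshold: expression cutoff below which a reaction is considered 'off'.
    """
    score = 0.0
    for rxn, flux in fluxes.items():
        level = expression.get(rxn)
        if level is None:
            continue  # assumption: reactions without data carry no penalty
        shortfall = max(0.0, threshold - level)
        score += shortfall * abs(flux)
    return score

# Hypothetical flux distribution and expression values.
fluxes = {"PGI": 8.0, "PFK": 8.0, "LDH": -2.0, "ACK": 0.0}
expression = {"PGI": 12.0, "PFK": 9.0, "LDH": 4.0, "ACK": 1.0}
print(inconsistency_score(fluxes, expression, threshold=10.0))
```

A score of zero would mean the flux distribution uses only reactions whose expression is at or above the cutoff; larger values mean more flux through poorly expressed reactions.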
Predicting growth of the healthy infant using a genome scale metabolic model.

PubMed

Nilsson, Avlant; Mardinoglu, Adil; Nielsen, Jens

2017-01-01

An estimated 165 million children globally have stunted growth, and extensive growth data are available. Genome scale metabolic models allow the simulation of molecular flux over each metabolic enzyme, and are well adapted to analyze biological systems. We used a human genome scale metabolic model to simulate the mechanisms of growth and integrate data about breast-milk intake and composition with the infant's biomass and energy expenditure of major organs. The model predicted daily metabolic fluxes from birth to age 6 months, and accurately reproduced standard growth curves and changes in body composition. The model corroborates the finding that essential amino and fatty acids do not limit growth, but that energy is the main growth limiting factor. Disruptions to the supply and demand of energy markedly affected the predicted growth, indicating that elevated energy expenditure may be detrimental. The model was used to simulate the metabolic effect of mineral deficiencies, and showed the greatest growth reduction for deficiencies in copper, iron, and magnesium ions which affect energy production through oxidative phosphorylation. The model and simulation method were integrated to a platform and shared with the research community. The growth model constitutes another step towards the complete representation of human metabolism, and may further help improve the understanding of the mechanisms underlying stunting.
Quantitative Assessment of Thermodynamic Constraints on the Solution Space of Genome-Scale Metabolic Models

PubMed Central

Hamilton, Joshua J.; Dwivedi, Vivek; Reed, Jennifer L.

2013-01-01

Constraint-based methods provide powerful computational techniques to allow understanding and prediction of cellular behavior. These methods rely on physicochemical constraints to eliminate infeasible behaviors from the space of available behaviors. One such constraint is thermodynamic feasibility, the requirement that intracellular flux distributions obey the laws of thermodynamics. The past decade has seen several constraint-based methods that interpret this constraint in different ways, including those that are limited to small networks, rely on predefined reaction directions, and/or neglect the relationship between reaction free energies and metabolite concentrations. In this work, we utilize one such approach, thermodynamics-based metabolic flux analysis (TMFA), to make genome-scale, quantitative predictions about metabolite concentrations and reaction free energies in the absence of prior knowledge of reaction directions, while accounting for uncertainties in thermodynamic estimates. We applied TMFA to a genome-scale network reconstruction of Escherichia coli and examined the effect of thermodynamic constraints on the flux space. We also assessed the predictive performance of TMFA against gene essentiality and quantitative metabolomics data, under both aerobic and anaerobic, and optimal and suboptimal growth conditions. Based on these results, we propose that TMFA is a useful tool for validating phenotypes and generating hypotheses, and that additional types of data and constraints can improve predictions of metabolite concentrations. PMID:23870272
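
The link between reaction free energies and metabolite concentrations mentioned in the record above is the standard relation dG = dG0 + RT ln Q, and a flux is thermodynamically feasible only if it runs in the direction of negative dG. The sketch below checks that relation for a single reaction. It is not TMFA itself, which embeds such constraints in a mixed-integer program over the whole network and accounts for uncertainty in dG0; the reaction, the dG0 value, and the concentrations are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def reaction_free_energy(dg0, stoich, conc, temperature=298.15):
    """Return dG = dG0 + R*T*ln(Q) in kJ/mol.

    stoich: dict metabolite -> coefficient (negative for substrates).
    conc: dict metabolite -> concentration in mol/L.
    """
    ln_q = sum(coef * math.log(conc[met]) for met, coef in stoich.items())
    return dg0 + R * temperature * ln_q

def direction_is_feasible(flux, dg):
    """A nonzero flux must run downhill: forward needs dG < 0, reverse dG > 0."""
    return flux == 0 or flux * dg < 0

# Hypothetical reaction A -> B with an unfavourable standard free energy.
stoich = {"A": -1, "B": 1}
conc = {"A": 1e-2, "B": 1e-4}
dg = reaction_free_energy(5.0, stoich, conc)
print(round(dg, 2), direction_is_feasible(1.0, dg))
```

Here the hundredfold concentration gradient outweighs the positive dG0, so forward flux is allowed; at equal concentrations it would not be.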
Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates.

PubMed

Nakatani, Yoichiro; Takeda, Hiroyuki; Kohara, Yuji; Morishita, Shinichi

2007-09-01

Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleost fishes, amphibians, reptiles, and marsupials.
MultiMetEval: Comparative and Multi-Objective Analysis of Genome-Scale Metabolic Models

PubMed Central

Gevorgyan, Albert; Kierzek, Andrzej M.; Breitling, Rainer; Takano, Eriko

2012-01-01

Comparative metabolic modelling is emerging as a novel field, supported by the development of reliable and standardized approaches for constructing genome-scale metabolic models in high throughput. New software solutions are needed to allow efficient comparative analysis of multiple models in the context of multiple cellular objectives. Here, we present the user-friendly software framework Multi-Metabolic Evaluator (MultiMetEval), built upon SurreyFBA, which allows the user to compose collections of metabolic models that together can be subjected to flux balance analysis. Additionally, MultiMetEval implements functionalities for multi-objective analysis by calculating the Pareto front between two cellular objectives. Using a previously generated dataset of 38 actinobacterial genome-scale metabolic models, we show how these approaches can lead to exciting novel insights. Firstly, after incorporating several pathways for the biosynthesis of natural products into each of these models, comparative flux balance analysis predicted that species like Streptomyces that harbour the highest diversity of secondary metabolite biosynthetic gene clusters in their genomes do not necessarily have the metabolic network topology most suitable for compound overproduction. Secondly, multi-objective analysis of biomass production and natural product biosynthesis in these actinobacteria shows that the well-studied occurrence of discrete metabolic switches during the change of cellular objectives is inherent to their metabolic network architecture. Comparative and multi-objective modelling can lead to insights that could not be obtained by normal flux balance analyses. MultiMetEval provides a powerful platform that makes these analyses straightforward for biologists. Sources and binaries of MultiMetEval are freely available from https://github.com/PiotrZakrzewski/MetEval/downloads. PMID:23272111
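
For readers unfamiliar with the term, the Pareto front between two objectives is the set of solutions that cannot be improved in one objective without losing in the other. The sketch below filters a list of candidate points accordingly. It does not reproduce how MultiMetEval computes the front (that relies on flux balance analysis of the models); the function name and the (biomass, product) values are hypothetical.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is a (objective_1, objective_2) tuple; both are maximized.
    A point is dominated if another, different point is at least as good
    in both objectives.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)

# Hypothetical (biomass yield, product yield) pairs for candidate flux states.
candidates = [(1.0, 0.0), (0.8, 0.3), (0.5, 0.5), (0.6, 0.2), (0.0, 0.9), (0.3, 0.4)]
print(pareto_front(candidates))
```

The points (0.6, 0.2) and (0.3, 0.4) drop out because another candidate beats them on both objectives; the remaining four trace the trade-off curve.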
Yeast 5 – an expanded reconstruction of the Saccharomyces cerevisiae metabolic network

PubMed Central

2012-01-01

Background Efforts to improve the computational reconstruction of the Saccharomyces cerevisiae biochemical reaction network and to refine the stoichiometrically constrained metabolic models that can be derived from such a reconstruction have continued since the first stoichiometrically constrained yeast genome scale metabolic model was published in 2003. Continuing this ongoing process, we have constructed an update to the Yeast Consensus Reconstruction, Yeast 5. The Yeast Consensus Reconstruction is a product of efforts to forge a community-based reconstruction emphasizing standards compliance and biochemical accuracy via evidence-based selection of reactions. It draws upon models published by a variety of independent research groups as well as information obtained from biochemical databases and primary literature. Results Yeast 5 refines the biochemical reactions included in the reconstruction, particularly reactions involved in sphingolipid metabolism; updates gene-reaction annotations; and emphasizes the distinction between reconstruction and stoichiometrically constrained model. Although it was not a primary goal, this update also improves the accuracy of model prediction of viability and auxotrophy phenotypes and increases the number of epistatic interactions. This update maintains an emphasis on standards compliance, unambiguous metabolite naming, and computer-readable annotations available through a structured document format. Additionally, we have developed MATLAB scripts to evaluate the model’s predictive accuracy and to demonstrate basic model applications such as simulating aerobic and anaerobic growth. These scripts, which provide an independent tool for evaluating the performance of various stoichiometrically constrained yeast metabolic models using flux balance analysis, are included as Additional files 1, 2 and 3. Conclusions Yeast 5 expands and refines the computational reconstruction of yeast metabolism and improves the predictive accuracy of a
Genome-scale metabolic analysis of Clostridium thermocellum for bioethanol production

PubMed Central

2010-01-01

Background Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and establish a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum and highlight remaining gaps
Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

DOE PAGES

Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.; ...

2017-01-18

Currently, Constraint-Based Reconstruction and Analysis (COBRA) is the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Furthermore, standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We also developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.
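
The numerical issue described in the record above can be seen on a single balance constraint whose terms differ by many orders of magnitude: in double precision the small term is absorbed by the large ones. The sketch below contrasts double-precision evaluation with exact rational arithmetic from Python's standard library. It only illustrates the problem; it is not the DQQ procedure or quadruple-precision MINOS, and the coefficients are made up for the demonstration.

```python
from fractions import Fraction

def mass_balance_residual(coefficients, fluxes):
    """Evaluate one row of S*v, i.e. the net production of one metabolite."""
    return sum(c * v for c, v in zip(coefficients, fluxes))

# Hypothetical row mixing very large and order-one coefficients.
coeffs = [1e17, 1.0, -1e17]
fluxes = [1.0, 1.0, 1.0]

# Double precision: the order-one term is lost when added to 1e17.
float_residual = mass_balance_residual(coeffs, fluxes)

# Exact rational arithmetic on the same inputs keeps every term.
exact_residual = mass_balance_residual(
    [Fraction(c) for c in coeffs],
    [Fraction(v) for v in fluxes],
)

print(float_residual, exact_residual)
```

The double-precision evaluation reports the constraint as satisfied although the exact residual is 1, which is the kind of discrepancy that can lead a solver to accept an infeasible point or reject a feasible one.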
Quantitative assessment of thermodynamic constraints on the solution space of genome-scale metabolic models.

PubMed

Hamilton, Joshua J; Dwivedi, Vivek; Reed, Jennifer L

2013-07-16

Constraint-based methods provide powerful computational techniques to allow understanding and prediction of cellular behavior. These methods rely on physicochemical constraints to eliminate infeasible behaviors from the space of available behaviors. One such constraint is thermodynamic feasibility, the requirement that intracellular flux distributions obey the laws of thermodynamics. The past decade has seen several constraint-based methods that interpret this constraint in different ways, including those that are limited to small networks, rely on predefined reaction directions, and/or neglect the relationship between reaction free energies and metabolite concentrations. In this work, we utilize one such approach, thermodynamics-based metabolic flux analysis (TMFA), to make genome-scale, quantitative predictions about metabolite concentrations and reaction free energies in the absence of prior knowledge of reaction directions, while accounting for uncertainties in thermodynamic estimates. We applied TMFA to a genome-scale network reconstruction of Escherichia coli and examined the effect of thermodynamic constraints on the flux space. We also assessed the predictive performance of TMFA against gene essentiality and quantitative metabolomics data, under both aerobic and anaerobic, and optimal and suboptimal growth conditions. Based on these results, we propose that TMFA is a useful tool for validating phenotypes and generating hypotheses, and that additional types of data and constraints can improve predictions of metabolite concentrations. Copyright © 2013 Biophysical Society. Published by Elsevier Inc. All rights reserved.
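The link between reaction free energies, metabolite concentrations and reaction direction can be sketched with the standard relation dG = dG0 + R*T*sum(s_i * ln c_i), where s_i are stoichiometric coefficients. The snippet below is a minimal sketch of that feasibility criterion only; TMFA itself is formulated as an optimization problem, which is not reproduced here. The function names, the reaction A -> B, the free-energy values and the concentration bounds are all hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def delta_g_bounds(dg0, dg0_err, stoich, conc_bounds, temperature=298.15):
    """Return (min, max) of dG = dG0 + R*T*sum(s_i * ln c_i).

    dg0 is an estimated standard free energy with uncertainty +/- dg0_err,
    and each concentration may vary independently within its bounds (mol/L).
    """
    lowest = dg0 - dg0_err
    highest = dg0 + dg0_err
    for metabolite, coeff in stoich.items():
        c_min, c_max = conc_bounds[metabolite]
        at_min = R * temperature * coeff * math.log(c_min)
        at_max = R * temperature * coeff * math.log(c_max)
        lowest += min(at_min, at_max)
        highest += max(at_min, at_max)
    return lowest, highest


def feasible_directions(dg_min, dg_max):
    """Forward flux needs dG < 0 somewhere in the range; reverse flux needs dG > 0."""
    directions = []
    if dg_min < 0:
        directions.append("forward")
    if dg_max > 0:
        directions.append("reverse")
    return directions


# Hypothetical reaction A -> B with concentrations between 10 uM and 10 mM.
stoich = {"A": -1, "B": 1}
bounds = {"A": (1e-5, 1e-2), "B": (1e-5, 1e-2)}

near_equilibrium = feasible_directions(*delta_g_bounds(5.0, 2.0, stoich, bounds))
strongly_downhill = feasible_directions(*delta_g_bounds(-30.0, 2.0, stoich, bounds))
print(near_equilibrium, strongly_downhill)
```

With these assumed numbers, the first reaction can run in either direction depending on concentrations, while the second is forward-only. No reaction direction has to be fixed in advance for this check.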
Modeling Neisseria meningitidis metabolism: from genome to metabolic fluxes

PubMed Central

Baart, Gino JE; Zomer, Bert; de Haan, Alex; van der Pol, Leo A; Beuvery, E Coen; Tramper, Johannes; Martens, Dirk E

2007-01-01

Background Neisseria meningitidis is a human pathogen that can infect diverse sites within the human host. The major diseases caused by N. meningitidis are responsible for death and disability, especially in young infants. In general, most of the recent work on N. meningitidis focuses on potential antigens and their functions, immunogenicity, and pathogenicity mechanisms. Very little work has been carried out on Neisseria primary metabolism over the past 25 years. Results Using the genomic database of N. meningitidis serogroup B together with biochemical and physiological information in the literature we constructed a genome-scale flux model for the primary metabolism of N. meningitidis. The validity of a simplified metabolic network derived from the genome-scale metabolic network was checked using flux-balance analysis in chemostat cultures. Several useful predictions were obtained from in silico experiments, including substrate preference. A minimal medium for growth of N. meningitidis was designed and tested successfully in batch and chemostat cultures. Conclusion The verified metabolic model describes the primary metabolism of N. meningitidis in a chemostat in steady state. The genome-scale model is valuable because it offers a framework to study N. meningitidis metabolism as a whole, or certain aspects of it, and it can also be used for the purpose of vaccine process development (for example, the design of growth media). The flux distribution of the main metabolic pathways (that is, the pentose phosphate pathway and the Entner-Doudoroff pathway) indicates that the major part of pyruvate (69%) is synthesized through the ED-cleavage, a finding that is in good agreement with literature. PMID:17617894
An integrated approach to reconstructing genome-scale transcriptional regulatory networks

DOE PAGES

Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

2015-02-27

Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall were also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of
Integration and Validation of the Genome-Scale Metabolic Models of Pichia pastoris: A Comprehensive Update of Protein Glycosylation Pathways, Lipid and Energy Metabolism

PubMed Central

Tomàs-Gamisans, Màrius; Ferrer, Pau; Albiol, Joan

2016-01-01

Motivation Genome-scale metabolic models (GEMs) are tools that allow predicting a phenotype from a genotype under certain environmental conditions. GEMs have been developed in the last ten years for a broad range of organisms, and are used for multiple purposes such as discovering new properties of metabolic networks, predicting new targets for metabolic engineering, as well as optimizing the cultivation conditions for biochemicals or recombinant protein production. Pichia pastoris is one of the most widely used organisms for heterologous protein expression. There are different GEMs for this methylotrophic yeast, of which the most relevant and complete in the published literature are iPP668, PpaMBEL1254 and iLC915. However, these three models differ regarding certain pathways, terminology for metabolites and reactions, and annotations. Moreover, GEMs for some species are typically built based on the reconstructed models of related model organisms. In these cases, some organism-specific pathways could be missing or misrepresented. Results In order to provide an updated and more comprehensive GEM for P. pastoris, we have reconstructed and validated a consensus model integrating and merging all three existing models. In this step a comprehensive review and integration of the metabolic pathways included in each one of these three versions was performed. In addition, the resulting iMT1026 model includes a new description of some metabolic processes. In particular, new information described in recently published literature is included, mainly related to fatty acid and sphingolipid metabolism, glycosylation and cell energetics. Finally, the reconstructed model was tested and validated by comparing the results of the simulations with available empirical physiological datasets obtained from a wide range of experimental conditions, such as different carbon sources, distinct oxygen availability conditions, and the production of two different recombinant proteins. In
MapMaker and PathTracer for tracking carbon in genome-scale metabolic models

PubMed Central

Tervo, Christopher J.; Reed, Jennifer L.

2016-01-01

Constraint-based reconstruction and analysis (COBRA) modeling results can be difficult to interpret given the large numbers of reactions in genome-scale models. While paths in metabolic networks can be found, existing methods are not easily combined with constraint-based approaches. To address this limitation, two tools (MapMaker and PathTracer) were developed to find paths (including cycles) between metabolites, where each step transfers carbon from reactant to product. MapMaker predicts carbon transfer maps (CTMs) between metabolites using only information on molecular formulae and reaction stoichiometry, effectively determining which reactants and products share carbon atoms. MapMaker correctly assigned CTMs for over 97% of the 2,251 reactions in an Escherichia coli metabolic model (iJO1366). Using CTMs as inputs, PathTracer finds paths between two metabolites. PathTracer was applied to iJO1366 to investigate the importance of using CTMs and COBRA constraints when enumerating paths, to find active and high flux paths in flux balance analysis (FBA) solutions, to identify paths for putrescine utilization, and to elucidate a potential CO2 fixation pathway in E. coli. These results illustrate how MapMaker and PathTracer can be used in combination with constraint-based models to identify feasible, active, and high flux paths between metabolites. PMID:26771089
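The role of carbon transfer maps in path finding can be sketched as a graph search in which an edge exists only between a reactant and a product that share carbon atoms. The code below is a minimal breadth-first search under that assumption; it is not the MapMaker or PathTracer implementation, and the reaction names, metabolite identifiers and the `find_carbon_path` helper are hypothetical.

```python
from collections import deque


def find_carbon_path(ctms, source, target):
    """Return the shortest list of (reaction, metabolite) steps from source to
    target, following only reactant-product pairs that share carbon.

    ctms maps a reaction name to a list of (reactant, product) pairs.
    Returns None when no carbon-transferring path exists.
    """
    graph = {}
    for reaction, pairs in ctms.items():
        for reactant, product in pairs:
            graph.setdefault(reactant, []).append((reaction, product))

    queue = deque([(source, [])])
    visited = {source}
    while queue:
        metabolite, path = queue.popleft()
        if metabolite == target:
            return path
        for reaction, nxt in graph.get(metabolite, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(reaction, nxt)]))
    return None


# Toy carbon transfer maps. In HEX1 (glc + atp -> g6p + adp) glucose carbon
# ends up in g6p, while atp carbon ends up in adp.
toy_ctms = {
    "HEX1": [("glc", "g6p"), ("atp", "adp")],
    "PGI": [("g6p", "f6p")],
}

carbon_path = find_carbon_path(toy_ctms, "glc", "f6p")
no_path = find_carbon_path(toy_ctms, "glc", "adp")
print(carbon_path, no_path)
```

A search on plain stoichiometry would connect glucose to ADP through HEX1, even though no carbon moves between them. Restricting edges to carbon-sharing pairs removes that kind of spurious path.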
RegPrecise 3.0--a resource for genome-scale exploration of transcriptional regulation in bacteria.

PubMed

Novichkov, Pavel S; Kazakov, Alexey E; Ravcheev, Dmitry A; Leyn, Semen A; Kovaleva, Galina Y; Sutormin, Roman A; Kazanov, Marat D; Riehl, William; Arkin, Adam P; Dubchak, Inna; Rodionov, Dmitry A

2013-11-01

Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualizations of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in
Genome-scale metabolic modeling of Mucor circinelloides and comparative analysis with other oleaginous species.

PubMed

Vongsangnak, Wanwipa; Klanchui, Amornpan; Tawornsamretkit, Iyarest; Tatiyaborwornchai, Witthawin; Laoteng, Kobkul; Meechai, Asawin

2016-06-01

We present a novel genome-scale metabolic model iWV1213 of Mucor circinelloides, which is an oleaginous fungus for industrial applications. The model contains 1213 genes, 1413 metabolites and 1326 metabolic reactions across different compartments. We demonstrate that iWV1213 is able to accurately predict the growth rates of M. circinelloides on various nutrient sources and culture conditions using Flux Balance Analysis and Phenotypic Phase Plane analysis. Comparative analysis of three oleaginous genome-scale models, including M. circinelloides (iWV1213), Mortierella alpina (iCY1106) and Yarrowia lipolytica (iYL619_PCP) revealed that iWV1213 possesses a higher number of genes involved in carbohydrate, amino acid, and lipid metabolisms that might contribute to its versatility in nutrient utilization. Moreover, the identification of unique and common active reactions among the Zygomycetes oleaginous models using Flux Variability Analysis unveiled a set of gene/enzyme candidates as metabolic engineering targets for cellular improvement. Thus, iWV1213 offers a powerful metabolic engineering tool for multi-level omics analysis, enabling strain optimization as a cell factory platform of lipid-based production. Copyright © 2016 Elsevier B.V. All rights reserved.
A Method to Constrain Genome-Scale Models with 13C Labeling Data

PubMed Central

García Martín, Héctor; Kumar, Vinay Satish; Weaver, Daniel; Ghosh, Amit; Chubukov, Victor; Mukhopadhyay, Aindrila; Arkin, Adam; Keasling, Jay D.

2015-01-01

Current limitations in quantitatively predicting biological behavior hinder our efforts to engineer biological systems to produce biofuels and other desired chemicals. Here, we present a new method for calculating metabolic fluxes, key targets in metabolic engineering, that incorporates data from 13C labeling experiments and genome-scale models. The data from 13C labeling experiments provide strong flux constraints that eliminate the need to assume an evolutionary optimization principle such as the growth rate optimization assumption used in Flux Balance Analysis (FBA). This effective constraining is achieved by making the simple but biologically relevant assumption that flux flows from core to peripheral metabolism and does not flow back. The new method is significantly more robust than FBA with respect to errors in genome-scale model reconstruction. Furthermore, it can provide a comprehensive picture of metabolite balancing and predictions for unmeasured extracellular fluxes as constrained by 13C labeling data. A comparison shows that the results of this new method are similar to those found through 13C Metabolic Flux Analysis (13C MFA) for central carbon metabolism but, additionally, it provides flux estimates for peripheral metabolism. The extra validation gained by matching 48 relative labeling measurements is used to identify where and why several existing COnstraint Based Reconstruction and Analysis (COBRA) flux prediction algorithms fail. We demonstrate how to use this knowledge to refine these methods and improve their predictive capabilities. This method provides a reliable base upon which to improve the design of biological systems. PMID:26379153
Comparative genome-scale modelling of Staphylococcus aureus strains identifies strain-specific metabolic capabilities linked to pathogenicity

PubMed Central

Bosi, Emanuele; Monk, Jonathan M.; Aziz, Ramy K.; Fondi, Marco; Nizet, Victor; Palsson, Bernhard Ø.

2016-01-01

Staphylococcus aureus is a preeminent bacterial pathogen capable of colonizing diverse ecological niches within its human host. We describe here the pangenome of S. aureus based on analysis of genome sequences from 64 strains of S. aureus spanning a range of ecological niches, host types, and antibiotic resistance profiles. Based on this set, S. aureus is expected to have an open pangenome composed of 7,411 genes and a core genome composed of 1,441 genes. Metabolism was highly conserved in this core genome; however, differences were identified in amino acid and nucleotide biosynthesis pathways between the strains. Genome-scale models (GEMs) of metabolism were constructed for the 64 strains of S. aureus. These GEMs enabled a systems approach to characterizing the core metabolic and panmetabolic capabilities of the S. aureus species. All models were predicted to be auxotrophic for the vitamins niacin (vitamin B3) and thiamin (vitamin B1), whereas strain-specific auxotrophies were predicted for riboflavin (vitamin B2), guanosine, leucine, methionine, and cysteine, among others. GEMs were used to systematically analyze growth capabilities in more than 300 different growth-supporting environments. The results identified metabolic capabilities linked to pathogenic traits and virulence acquisitions. Such traits can be used to differentiate strains responsible for mild vs. severe infections and preference for hosts (e.g., animals vs. humans). Genome-scale analysis of multiple strains of a species can thus be used to identify metabolic determinants of virulence and increase our understanding of why certain strains of this deadly pathogen have spread rapidly throughout the world. PMID:27286824
Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks

PubMed Central

Prigent, Sylvain; Frioux, Clémence; Dittami, Simon M.; Larhlimi, Abdelhalim; Collet, Guillaume; Gutknecht, Fabien; Got, Jeanne; Eveillard, Damien; Bourdon, Jérémie; Plewniak, Frédéric; Tonon, Thierry; Siegel, Anne

2017-01-01

Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understand their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism largely suffer from incompleteness. Appropriate gap-filling methods to identify and add missing reactions are therefore required to address this issue. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often not available or at least prone to errors for newly-explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting constraints raised by the stoichiometry of a metabolic network considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium Candidatus Phaeomarinobacter ectocarpi revealed several candidate metabolic pathways for algal-bacterial interactions. Then Meneco was used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to
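Topological gap-filling of the kind described in this abstract can be sketched without any stoichiometric balancing. The example below assumes a common qualitative criterion: a metabolite is producible if it is a seed or the product of a reaction whose reactants are all producible. It then searches a reaction database by brute force for the smallest additions that make every target producible. Meneco itself solves the problem with Answer Set Programming; the brute-force search, the function names and the toy network here are illustrative assumptions only.

```python
from itertools import combinations


def scope(reactions, seeds):
    """Return all metabolites producible from the seeds.

    reactions maps a name to (reactants, products); a reaction fires once
    all of its reactants are producible. Stoichiometry is ignored.
    """
    reachable = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if set(reactants) <= reachable and not set(products) <= reachable:
                reachable |= set(products)
                changed = True
    return reachable


def minimal_completions(draft, database, seeds, targets):
    """Return every smallest set of database reactions that, once added to
    the draft network, makes all targets producible from the seeds."""
    targets = set(targets)
    names = sorted(database)
    for size in range(len(names) + 1):
        hits = []
        for subset in combinations(names, size):
            network = dict(draft)
            network.update({name: database[name] for name in subset})
            if targets <= scope(network, seeds):
                hits.append(subset)
        if hits:
            return hits
    return []


# Toy draft network with a gap between B and C (all names hypothetical).
draft = {"r1": (["A"], ["B"]), "r3": (["C"], ["D"])}
database = {
    "r2": (["B"], ["C"]),
    "r4": (["X"], ["D"]),  # useless here: X is never producible
    "r5": (["A"], ["C"]),
}

completions = minimal_completions(draft, database, seeds={"A"}, targets={"D"})
print(completions)
```

The brute-force enumeration grows exponentially with the database size, which is one reason a dedicated combinatorial solver is used on genome-scale networks.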

Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks.

PubMed

Prigent, Sylvain; Frioux, Clémence; Dittami, Simon M; Thiele, Sven; Larhlimi, Abdelhalim; Collet, Guillaume; Gutknecht, Fabien; Got, Jeanne; Eveillard, Damien; Bourdon, Jérémie; Plewniak, Frédéric; Tonon, Thierry; Siegel, Anne

2017-01-01

Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understand their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism largely suffer from incompleteness. Appropriate gap-filling methods to identify and add missing reactions are therefore required to address this issue. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often not available or at least prone to errors for newly-explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting constraints raised by the stoichiometry of a metabolic network considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium Candidatus Phaeomarinobacter ectocarpi revealed several candidate metabolic pathways for algal-bacterial interactions. Then Meneco was used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to
Genomic Reconstruction of Carbohydrate Utilization Capacities in Microbial-Mat Derived Consortia

PubMed Central

Leyn, Semen A.; Maezato, Yukari; Romine, Margaret F.; Rodionov, Dmitry A.

2017-01-01

Two nearly identical unicyanobacterial consortia (UCC) were previously isolated from benthic microbial mats that occur in a heliothermal saline lake in northern Washington State. Carbohydrates are a primary source of carbon and energy for most heterotrophic bacteria. Since CO2 is the only carbon source provided, the cyanobacterium must provide a source of carbon to the heterotrophs. Available genomic sequences for all members of the UCC provide an opportunity to investigate the metabolic routes of carbon transfer between autotroph and heterotrophs. Here, we applied a subsystem-based comparative genomics approach to reconstruct carbohydrate utilization pathways and identify glycohydrolytic enzymes, carbohydrate transporters and pathway-specific transcriptional regulators in 17 heterotrophic members of the UCC. The reconstructed metabolic pathways include 800 genes, nearly one-fourth of which encode enzymes, transporters and regulators with newly assigned metabolic functions, resulting in discovery of novel functional variants of carbohydrate utilization pathways. The in silico analysis revealed the utilization capabilities for 40 carbohydrates and their derivatives. Two Halomonas species demonstrated the largest number of sugar catabolic pathways. Trehalose, sucrose, maltose, glucose, and beta-glucosides are the most commonly utilized saccharides in this community. Reconstructed regulons for global regulators HexR and CceR include central carbohydrate metabolism genes in the members of Gammaproteobacteria and Alphaproteobacteria, respectively. Genomics analyses were supplemented by experimental characterization of metabolic phenotypes in four isolates derived from the consortia. Measurements of isolate growth on the defined medium supplied with individual carbohydrates confirmed most of the predicted catabolic phenotypes. Not all consortia members use carbohydrates and only a few use complex polysaccharides suggesting a hierarchical carbon flow from cyanobacteria to
Genome-scale metabolic model for Lactococcus lactis MG1363 and its application to the analysis of flavor formation.

PubMed

Flahaut, Nicolas A L; Wiersma, Anne; van de Bunt, Bert; Martens, Dirk E; Schaap, Peter J; Sijtsma, Lolke; Dos Santos, Vitor A Martins; de Vos, Willem M

2013-10-01

Lactococcus lactis subsp. cremoris MG1363 is a paradigm strain for lactococci used in industrial dairy fermentations. However, despite its importance for process development, no genome-scale metabolic model has been reported thus far. Moreover, current models for other lactococci only focus on growth and sugar degradation. A metabolic model that includes nitrogen metabolism and flavor-forming pathways is instrumental for understanding and designing new industrial applications of these lactic acid bacteria. A genome-scale, constraint-based model of the metabolism and transport in L. lactis MG1363, accounting for 518 genes, 754 reactions, and 650 metabolites, was developed and experimentally validated. Fifty-nine reactions are directly or indirectly involved in flavor formation. Flux Balance Analysis and Flux Variability Analysis were used to investigate flux distributions within the whole metabolic network. Anaerobic carbon-limited continuous cultures were used for estimating the energetic parameters. A thorough model-driven analysis showing a highly flexible nitrogen metabolism, e.g., branched-chain amino acid catabolism, which is coupled with the redox balance, is pivotal for the prediction of the formation of different flavor compounds. Furthermore, the model predicted the formation of volatile sulfur compounds as a result of the fermentation. These products were subsequently identified in the experimental fermentations carried out. Thus, the genome-scale metabolic model couples the carbon and nitrogen metabolism in L. lactis MG1363 with complete known catabolic pathways leading to flavor formation. The model provided valuable insights into the metabolic networks underlying flavor formation and has the potential to contribute to new developments in dairy industries and cheese-flavor research.
Capturing the response of Clostridium acetobutylicum to chemical stressors using a regulated genome-scale metabolic model

DOE PAGES

Dash, Satyakam; Mueller, Thomas J.; Venkataramanan, Keerthi P.; ...

2014-10-14

Clostridia are anaerobic Gram-positive Firmicutes containing broad and flexible systems for substrate utilization, which have been used successfully to produce a range of industrial compounds. Clostridium acetobutylicum has been used to produce butanol on an industrial scale through acetone-butanol-ethanol (ABE) fermentation. A genome-scale metabolic (GSM) model is a powerful tool for understanding the metabolic capacities of an organism and developing metabolic engineering strategies for strain development. The integration of stress related specific transcriptomics information with the GSM model provides opportunities for elucidating the focal points of regulation.
Genome-reconstruction for eukaryotes from complex natural microbial communities.

PubMed

West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F

2018-04-01

Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities. © 2018 West et al.; Published by Cold Spring Harbor Laboratory Press.
Dissecting metabolic behavior of lipid over-producing strain of Mucor circinelloides through genome-scale metabolic network and multi-level data integration.

PubMed

Vongsangnak, Wanwipa; Kingkaw, Amornthep; Yang, Junhuan; Song, Yuanda; Laoteng, Kobkul

2018-09-05

Lipid accumulation is an important cellular process of oleaginous microorganisms. To dissect the metabolic behavior of oleaginous Zygomycetes, the lipid over-producing strain Mucor circinelloides WJ11 was subjected to omics-scale analysis. The genome annotation was improved and used for construction of a genome-scale metabolic network of the WJ11 strain. Then, the quality of the metabolic network was enhanced by incorporating gene and protein expression data. In addition to the known oleaginous genes, our results showed a number of newly identified unique genes of the WJ11 strain, which are involved in central carbon metabolism and in lipid, amino acid and nitrogen metabolism. The systematic compilations indicated additional metabolic routes involved in supplying precursors (acetyl-CoA, NADPH and fatty acyl substrate) for fatty acid and lipid biosynthesis. Interestingly, amino acid metabolism played a substantial role in the response of the fungal cells to nutrient imbalance through lipogenesis, as indicated by the finding of reporter metabolites (l-methionine, l-glutamate, l-aspartate, l-asparagine and l-glutamine) at the lipid-accumulating stage. The cooperative function of certain lipid-degrading enzymes at a particular growth stage was elucidated by integrating the metabolic networks with gene expression data. The unique feature of the carotenoid biosynthetic route in the WJ11 strain was also identified by protein domain analysis. Taken together, there were cross-functional metabolisms in regulating lipid biosynthesis and retaining a high level of cellular lipids in the representative of lipid over-producing strains. Copyright © 2018 Elsevier B.V. All rights reserved.
Metabolic Network Modeling of Microbial Communities

PubMed Central

Biggs, Matthew B.; Medlock, Gregory L.; Kolling, Glynis L.

2015-01-01

Genome-scale metabolic network reconstructions and constraint-based analysis are powerful methods that have the potential to make functional predictions about microbial communities. Current use of genome-scale metabolic networks to characterize the metabolic functions of microbial communities includes species compartmentalization, separating species-level and community-level objectives, dynamic analysis, the “enzyme-soup” approach, multi-scale modeling, and others. There are many challenges inherent to the field, including a need for tools that accurately assign high-level omics signals to individual community members, new automated reconstruction methods that rival manual curation, and novel algorithms for integrating omics data and engineering communities. As technologies and modeling frameworks improve, we expect that there will be proportional advances in the fields of ecology, health science, and microbial community engineering. PMID:26109480
SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models

PubMed Central

Aziz, Ramy K.; Devoid, Scott; Disz, Terrence; Edwards, Robert A.; Henry, Christopher S.; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Stevens, Rick L.; Vonstein, Veronika; Xia, Fangfang

2012-01-01

The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential
Reconstruction of the lipid metabolism for the microalga Monoraphidium neglectum from its genome sequence reveals characteristics suitable for biofuel production

PubMed Central

2013-01-01

Background Microalgae are gaining importance as sustainable production hosts in the fields of biotechnology and bioenergy. A robust biomass accumulating strain of the genus Monoraphidium (SAG 48.87) was investigated in this work as a potential feedstock for biofuel production. The genome was sequenced, annotated, and key enzymes for triacylglycerol formation were elucidated. Results Monoraphidium neglectum was identified as an oleaginous species with favourable growth characteristics as well as a high potential for crude oil production, based on neutral lipid contents of approximately 21% (dry weight) under nitrogen starvation, composed of predominantly C18:1 and C16:0 fatty acids. Further characterization revealed growth in a relatively wide pH range and salt concentrations of up to 1.0% NaCl, in which the cells exhibited larger structures. This first full genome sequencing of a member of the Selenastraceae revealed a diploid, approximately 68 Mbp genome with a G + C content of 64.7%. The circular chloroplast genome was assembled to a 135,362 bp single contig, containing 67 protein-coding genes. The assembly of the mitochondrial genome resulted in two contigs with an approximate total size of 94 kb, the largest known mitochondrial genome within algae. 16,761 protein-coding genes were assigned to the nuclear genome. Comparison of gene sets with respect to functional categories revealed a higher gene number assigned to the category “carbohydrate metabolic process” and in “fatty acid biosynthetic process” in M. neglectum when compared to Chlamydomonas reinhardtii and Nannochloropsis gaditana, indicating a higher metabolic diversity for applications in carbohydrate conversions of biotechnological relevance. Conclusions The genome of M. neglectum, as well as the metabolic reconstruction of crucial lipid pathways, provides new insights into the diversity of the lipid metabolism in microalgae. The results of this work provide a platform to encourage the
Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes.

PubMed

Papudeshi, Bhavya; Haggerty, J Matthew; Doane, Michael; Morris, Megan M; Walsh, Kevin; Beattie, Douglas T; Pande, Dnyanada; Zaeri, Parisa; Silva, Genivaldo G Z; Thompson, Fabiano; Edwards, Robert A; Dinsdale, Elizabeth A

2017-11-28

Microbiome/host interactions describe characteristics that affect the host's health. Shotgun metagenomics includes sequencing a random subset of the microbiome to analyze its taxonomic and metabolic potential. Reconstruction of DNA fragments into genomes from metagenomes (called metagenome-assembled genomes) assigns unknown fragments to taxa/function and facilitates discovery of novel organisms. Genome reconstruction incorporates sequence assembly and sorting of assembled sequences into bins, characteristic of a genome. However, the microbial community composition, including taxonomic and phylogenetic diversity, may influence genome reconstruction. We determine the optimal reconstruction method for four microbiome projects that had variable sequencing platforms (IonTorrent and Illumina), diversity (high or low), and environment (coral reefs and kelp forests), using a set of parameters to select for optimal assembly and binning tools. We tested the effects of the assembly and binning processes on population genome reconstruction using 105 marine metagenomes from 4 projects. Reconstructed genomes were obtained from each project using 3 assemblers (IDBA, MetaVelvet, and SPAdes) and 2 binning tools (GroopM and MetaBat). We assessed the efficiency of assemblers using statistics including contig continuity and contig chimerism, and the effectiveness of binning tools using genome completeness and taxonomic identification. We concluded that SPAdes assembled more contigs (143,718 ± 124 contigs) of longer length (N50 = 1632 ± 108 bp), and incorporated the most sequences (sequences-assembled = 19.65%). The microbial richness and evenness were maintained across the assembly, suggesting low contig chimeras. SPAdes assembly was responsive to the biological and technological variations within the project, compared with other assemblers. Among binning tools, we conclude that MetaBat produced bins with less variation in GC content (average standard deviation: 1
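The contig-continuity statistic quoted in this record, N50, can be illustrated with a short sketch. This uses the common definition (the contig length at which the length-sorted contigs first cover at least half of the total assembled bases); the function name `n50` is an assumption for illustration and is not taken from the paper or from any assembler's code.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are sorted from longest to shortest and summed until the
    running total reaches at least half of the assembly size; the
    length of the contig that crosses that threshold is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids rounding when the total is odd.
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")
```

Because N50 depends on both the number and the lengths of contigs, it is usually read alongside the total contig count and the fraction of sequences assembled, as in the record above.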
Reconstructing metabolic flux vectors from extreme pathways: defining the alpha-spectrum.

PubMed

Wiback, Sharon J; Mahadevan, Radhakrishnan; Palsson, Bernhard Ø

2003-10-07

The move towards genome-scale analysis of cellular functions has necessitated the development of analytical (in silico) methods to understand such large and complex biochemical reaction networks. One such method is extreme pathway analysis that uses stoichiometry and thermodynamic irreversibility to define mathematically unique, systemic metabolic pathways. These extreme pathways form the edges of a high-dimensional convex cone in the flux space that contains all the attainable steady state solutions, or flux distributions, for the metabolic network. By definition, any steady state flux distribution can be described as a nonnegative linear combination of the extreme pathways. To date, much effort has been focused on calculating, defining, and understanding these extreme pathways. However, little work has been performed to determine how these extreme pathways contribute to a given steady state flux distribution. This study represents an initial effort aimed at defining how physiological steady state solutions can be reconstructed from a network's extreme pathways. In general, there is not a unique set of nonnegative weightings on the extreme pathways that produce a given steady state flux distribution but rather a range of possible values. This range can be determined using linear optimization to maximize and minimize the weightings of a particular extreme pathway in the reconstruction, resulting in what we have termed the alpha-spectrum. The alpha-spectrum defines which extreme pathways can and cannot be included in the reconstruction of a given steady state flux distribution and to what extent they individually contribute to the reconstruction. It is shown that accounting for transcriptional regulatory constraints can considerably shrink the alpha-spectrum. The alpha-spectrum is computed and interpreted for two cases; first, optimal states of a skeleton representation of core metabolism that include transcriptional regulation, and second for human red blood cell
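The optimization described in this record can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the abstract: $\mathbf{v}$ is the given steady state flux distribution, $\mathbf{p}_1, \dots, \mathbf{p}_n$ are the extreme pathways collected as columns of a matrix $P$, and $\alpha_i$ is the weighting on pathway $i$.

```latex
% Decomposition of a steady state flux distribution (nonnegative weightings)
\mathbf{v} = \sum_{i=1}^{n} \alpha_i \, \mathbf{p}_i = P \boldsymbol{\alpha},
\qquad \alpha_i \ge 0 .

% Range of allowable weightings for each pathway i, found by two linear programs
\alpha_i^{\max} = \max_{\boldsymbol{\alpha}} \; \alpha_i
\quad \text{and} \quad
\alpha_i^{\min} = \min_{\boldsymbol{\alpha}} \; \alpha_i
\qquad \text{subject to} \quad P \boldsymbol{\alpha} = \mathbf{v}, \;\; \boldsymbol{\alpha} \ge \mathbf{0} .
```

Under this reading, the alpha-spectrum is the set of intervals $[\alpha_i^{\min}, \alpha_i^{\max}]$ over all pathways: a pathway with $\alpha_i^{\max} = 0$ cannot contribute to the reconstruction of $\mathbf{v}$, while one with $\alpha_i^{\min} > 0$ must contribute.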
Reconstruction of an Integrated Genome-Scale Co-Expression Network Reveals Key Modules Involved in Lung Adenocarcinoma

PubMed Central

Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

2013-01-01

Our goal in this study was to reconstruct a “genome-scale co-expression network” and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named “genome-scale co-expression network”. As the next step, 23 key modules were disclosed through clustering. In this study a number of genes have been identified for the first time to be implicated in lung adenocarcinoma by analyzing the modules. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena, namely cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9 that play a crucial role in cell cycle progression. In addition to the mentioned genes, there are some other genes (e.g., DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS) in the modules. PMID:23874428
Reconstruction of an integrated genome-scale co-expression network reveals key modules involved in lung adenocarcinoma.

PubMed

Bidkhori, Gholamreza; Narimani, Zahra; Hosseini Ashtiani, Saman; Moeini, Ali; Nowzari-Dalini, Abbas; Masoudi-Nejad, Ali

2013-01-01

Our goal in this study was to reconstruct a "genome-scale co-expression network" and find important modules in lung adenocarcinoma so that we could identify the genes involved in lung adenocarcinoma. We integrated gene mutation, GWAS, CGH, array-CGH and SNP array data in order to identify important genes and loci at genome scale. Afterwards, on the basis of the identified genes, a co-expression network was reconstructed from the co-expression data. The reconstructed network was named "genome-scale co-expression network". As the next step, 23 key modules were disclosed through clustering. In this study a number of genes have been identified for the first time to be implicated in lung adenocarcinoma by analyzing the modules. The genes EGFR, PIK3CA, TAF15, XIAP, VAPB, Appl1, Rab5a, ARF4, CLPTM1L, SP4, ZNF124, LPP, FOXP1, SOX18, MSX2, NFE2L2, SMARCC1, TRA2B, CBX3, PRPF6, ATP6V1C1, MYBBP1A, MACF1, GRM2, TBXA2R, PRKAR2A, PTK2, PGF and MYO10 are among the genes that belong to modules 1 and 22. All these genes, being implicated in at least one of the phenomena, namely cell survival, proliferation and metastasis, have an over-expression pattern similar to that of EGFR. A few modules contain genes such as CCNA2 (Cyclin A2), CCNB2 (Cyclin B2), CDK1, CDK5, CDC27, CDCA5, CDCA8, ASPM, BUB1, KIF15, KIF2C, NEK2, NUSAP1, PRC1, SMC4, SYCE2, TFDP1, CDC42 and ARHGEF9 that play a crucial role in cell cycle progression. In addition to the mentioned genes, there are some other genes (e.g., DLGAP5, BIRC5, PSMD2, Src, TTK, SENP2, DOK2 and FUS) in the modules.
Reconstructing the Backbone of the Saccharomycotina Yeast Phylogeny Using Genome-Scale Data

PubMed Central

Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek; Kurtzman, Cletus P.; Hittinger, Chris Todd; Rokas, Antonis

2016-01-01

Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. However, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast. PMID:27672114
Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek

Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. Furthermore, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data

DOE PAGES

Shen, Xing-Xing; Zhou, Xiaofan; Kominek, Jacek; ...

2016-09-26

Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multilocus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages that were often selected with limited considerations for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing nine of the 11 known major lineages and 10 nonyeast fungal outgroups to generate a 1233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships between the nine major lineages. Aside from the lineage comprised by the family Pichiaceae, all other lineages were monophyletic. Most interrelationships among yeast species were robust across the two methods and data matrices. Furthermore, eight of the 93 internodes conflicted between analyses or data sets, including the placements of: the clade defined by species that have reassigned the CUG codon to encode serine, instead of leucine; the clade defined by a whole genome duplication; and the species Ascoidea rubescens. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
Environmental versatility promotes modularity in genome-scale metabolic networks.

PubMed

Samal, Areejit; Wagner, Andreas; Martin, Olivier C

2011-08-24

The ubiquity of modules in biological networks may result from an evolutionary benefit of a modular organization. For instance, modularity may increase the rate of adaptive evolution, because modules can be easily combined into new arrangements that may benefit their carrier. Conversely, modularity may emerge as a by-product of some trait. We here ask whether this last scenario may play a role in genome-scale metabolic networks that need to sustain life in one or more chemical environments. For such networks, we define a network module as a maximal set of reactions that are fully coupled, i.e., whose fluxes can only vary in fixed proportions. This definition overcomes limitations of purely graph based analyses of metabolism by exploiting the functional links between reactions. We call a metabolic network viable in a given chemical environment if it can synthesize all of an organism's biomass compounds from nutrients in this environment. An organism's metabolism is highly versatile if it can sustain life in many different chemical environments. We here ask whether versatility affects the modularity of metabolic networks. Using recently developed techniques to randomly sample large numbers of viable metabolic networks from a vast space of metabolic networks, we use flux balance analysis to study in silico metabolic networks that differ in their versatility. We find that highly versatile networks are also highly modular. They contain more modules and more reactions that are organized into modules. Most or all reactions in a module are associated with the same biochemical pathways. Modules that arise in highly versatile networks generally involve reactions that process nutrients or closely related chemicals. We also observe that the metabolism of E. coli is significantly more modular than even our most versatile networks. Our work shows that modularity in metabolic networks can be a by-product of functional constraints, e.g., the need to sustain life in multiple
Integrating Kinetic Model of E. coli with Genome Scale Metabolic Fluxes Overcomes Its Open System Problem and Reveals Bistability in Central Metabolism

PubMed Central

Mannan, Ahmad A.; Toya, Yoshihiro; Shimizu, Kazuyuki; McFadden, Johnjoe; Kierzek, Andrzej M.; Rocco, Andrea

2015-01-01

An understanding of the dynamics of the metabolic profile of a bacterial cell is sought from a dynamical systems analysis of kinetic models. This modelling formalism relies on a deterministic mathematical description of enzyme kinetics and their metabolite regulation. However, it is severely impeded by the lack of available kinetic information, limiting the size of the system that can be modelled. Furthermore, the subsystem of the metabolic network whose dynamics can be modelled is faced with three problems: how to parameterize the model with mostly incomplete steady state data, how to close what is now an inherently open system, and how to account for the impact on growth. In this study we address these challenges of kinetic modelling by capitalizing on multi-‘omics’ steady state data and a genome-scale metabolic network model. We use these to generate parameters that integrate knowledge embedded in the genome-scale metabolic network model, into the most comprehensive kinetic model of the central carbon metabolism of E. coli realized to date. As an application, we performed a dynamical systems analysis of the resulting enriched model. This revealed bistability of the central carbon metabolism and thus its potential to express two distinct metabolic states. Furthermore, since our model-informing technique ensures both stable states are constrained by the same thermodynamically feasible steady state growth rate, the ensuing bistability represents a temporal coexistence of the two states, and by extension, reveals the emergence of a phenotypically heterogeneous population. PMID:26469081
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

PubMed Central

Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

2014-01-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

PubMed

Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

2014-07-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.

Characterizing steady states of genome-scale metabolic networks in continuous cell cultures.

PubMed

Fernandez-de-Cossio-Diaz, Jorge; Leon, Kalet; Mulet, Roberto

2017-11-01

In the continuous mode of cell culture, a constant flow carrying fresh media replaces culture fluid, cells, nutrients and secreted metabolites. Here we present a model for continuous cell culture coupling intra-cellular metabolism to extracellular variables describing the state of the bioreactor, taking into account the growth capacity of the cell and the impact of toxic byproduct accumulation. We provide a method to determine the steady states of this system that is tractable for metabolic networks of arbitrary complexity. We demonstrate our approach in a toy model first, and then in a genome-scale metabolic network of the Chinese hamster ovary cell line, obtaining results that are in qualitative agreement with experimental observations. We derive a number of consequences from the model that are independent of parameter values. The ratio between cell density and dilution rate is an ideal control parameter to fix a steady state with desired metabolic properties. This conclusion is robust even in the presence of multi-stability, which is explained in our model by a negative feedback loop due to toxic byproduct accumulation. A complex landscape of steady states emerges from our simulations, including multiple metabolic switches, which also explain why cell-line and media benchmarks carried out in batch culture cannot be extrapolated to perfusion. On the other hand, we predict invariance laws between continuous cell cultures with different parameters. A practical consequence is that the chemostat is an ideal experimental model for large-scale high-density perfusion cultures, where the complex landscape of metabolic transitions is faithfully reproduced.
Characterizing steady states of genome-scale metabolic networks in continuous cell cultures

PubMed Central

Leon, Kalet; Mulet, Roberto

2017-01-01

In the continuous mode of cell culture, a constant flow carrying fresh media replaces culture fluid, cells, nutrients and secreted metabolites. Here we present a model for continuous cell culture coupling intra-cellular metabolism to extracellular variables describing the state of the bioreactor, taking into account the growth capacity of the cell and the impact of toxic byproduct accumulation. We provide a method to determine the steady states of this system that is tractable for metabolic networks of arbitrary complexity. We demonstrate our approach in a toy model first, and then in a genome-scale metabolic network of the Chinese hamster ovary cell line, obtaining results that are in qualitative agreement with experimental observations. We derive a number of consequences from the model that are independent of parameter values. The ratio between cell density and dilution rate is an ideal control parameter to fix a steady state with desired metabolic properties. This conclusion is robust even in the presence of multi-stability, which is explained in our model by a negative feedback loop due to toxic byproduct accumulation. A complex landscape of steady states emerges from our simulations, including multiple metabolic switches, which also explain why cell-line and media benchmarks carried out in batch culture cannot be extrapolated to perfusion. On the other hand, we predict invariance laws between continuous cell cultures with different parameters. A practical consequence is that the chemostat is an ideal experimental model for large-scale high-density perfusion cultures, where the complex landscape of metabolic transitions is faithfully reproduced. PMID:29131817
Genome-scale metabolic model of Pichia pastoris with native and humanized glycosylation of recombinant proteins.

PubMed

Irani, Zahra Azimzadeh; Kerkhoven, Eduard J; Shojaosadati, Seyed Abbas; Nielsen, Jens

2016-05-01

Pichia pastoris is used for commercial production of human therapeutic proteins, and genome-scale models of P. pastoris metabolism have been generated in the past to study the metabolism and associated protein production by this yeast. A major challenge with clinical usage of recombinant proteins produced by P. pastoris is the difference in N-glycosylation of proteins produced by humans and this yeast. However, through metabolic engineering, a P. pastoris strain capable of producing humanized N-glycosylated proteins was constructed. The current genome-scale models of P. pastoris address neither native nor humanized N-glycosylation, and we therefore developed ihGlycopastoris, an extension to the iLC915 model with both native and humanized N-glycosylation for recombinant protein production, as well as an estimation of N-glycosylation of P. pastoris native proteins. This new model gives a better prediction of protein yield, demonstrates the effect of the different types of N-glycosylation on protein yield, and can be used to predict potential targets for strain improvement. The model represents a step towards a more complete description of protein production in P. pastoris, which is required for using these models to understand and optimize protein production processes. © 2015 Wiley Periodicals, Inc.
Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

DOE PAGES

Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.; ...

2016-12-20

The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. In this paper, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Finally, coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology.
Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

PubMed Central

Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.; Du, Niu; Mih, Nathan; Diamond, Spencer; Lee, Jenny J.; Golden, Susan S.; Palsson, Bernhard O.

2016-01-01

The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. Here, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology. PMID:27911809
Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.

The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. In this paper, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Finally, coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology.
Systematic assignment of thermodynamic constraints in metabolic network models

PubMed Central

Kümmel, Anne; Panke, Sven; Heinemann, Matthias

2006-01-01

Background: The availability of genome sequences for many organisms enabled the reconstruction of several genome-scale metabolic network models. Currently, significant efforts are put into the automated reconstruction of such models. For this, several computational tools have been developed that particularly assist in identifying and compiling the organism-specific lists of metabolic reactions. In contrast, the last step of the model reconstruction process, which is the definition of the thermodynamic constraints in terms of reaction directionalities, still needs to be done manually. No computational method exists that allows for an automated and systematic assignment of reaction directions in genome-scale models. Results: We present an algorithm that – based on thermodynamics, network topology and heuristic rules – automatically assigns reaction directions in metabolic models such that the reaction network is thermodynamically feasible with respect to the production of energy equivalents. It first exploits all available experimentally derived Gibbs energies of formation to identify irreversible reactions. As these thermodynamic data are not available for all metabolites, in a next step, further reaction directions are assigned on the basis of network topology considerations and thermodynamics-based heuristic rules. Briefly, the algorithm identifies reaction subsets from the metabolic network that are able to convert low-energy co-substrates into their high-energy counterparts and thus net produce energy. Our algorithm aims at disabling such thermodynamically infeasible cyclic operation of reaction subnetworks by assigning reaction directions based on a set of thermodynamics-derived heuristic rules. We demonstrate our algorithm on a genome-scale metabolic model of E. coli. The introduced systematic direction assignment yielded 130 irreversible reactions (out of 920 total reactions), which corresponds to about 70% of all irreversible reactions that are required to
Genome-based Modeling and Design of Metabolic Interactions in Microbial Communities

PubMed Central

Mahadevan, Radhakrishnan; Henson, Michael A.

2012-01-01

Biotechnology research is traditionally focused on individual microbial strains that are perceived to have the necessary metabolic functions, or the capability to have these functions introduced, to achieve a particular task. For many important applications, the development of such omnipotent microbes is an extremely challenging if not impossible task. By contrast, nature employs a radically different strategy based on synergistic combinations of different microbial species that collectively achieve the desired task. These natural communities have evolved to exploit the native metabolic capabilities of each species and are highly adaptive to changes in their environments. However, microbial communities have proven difficult to study due to a lack of suitable experimental and computational tools. With the advent of genome sequencing, omics technologies, bioinformatics and genome-scale modeling, researchers now have unprecedented capabilities to analyze and engineer the metabolism of microbial communities. The goal of this review is to summarize recent applications of genome-scale metabolic modeling to microbial communities. A brief introduction to lumped community models is used to motivate the need for genome-level descriptions of individual species and their metabolic interactions. The review of genome-scale models begins with static modeling approaches, which are appropriate for communities where the extracellular environment can be assumed to be time invariant or slowly varying. Dynamic extensions of the static modeling approach are described, and then applications of genome-scale models for design of synthetic microbial communities are reviewed. The review concludes with a summary of metagenomic tools for analyzing community metabolism and an outlook for future research. PMID:24688668
Genome-based Modeling and Design of Metabolic Interactions in Microbial Communities.

PubMed

Mahadevan, Radhakrishnan; Henson, Michael A

2012-01-01

Biotechnology research is traditionally focused on individual microbial strains that are perceived to have the necessary metabolic functions, or the capability to have these functions introduced, to achieve a particular task. For many important applications, the development of such omnipotent microbes is an extremely challenging if not impossible task. By contrast, nature employs a radically different strategy based on synergistic combinations of different microbial species that collectively achieve the desired task. These natural communities have evolved to exploit the native metabolic capabilities of each species and are highly adaptive to changes in their environments. However, microbial communities have proven difficult to study due to a lack of suitable experimental and computational tools. With the advent of genome sequencing, omics technologies, bioinformatics and genome-scale modeling, researchers now have unprecedented capabilities to analyze and engineer the metabolism of microbial communities. The goal of this review is to summarize recent applications of genome-scale metabolic modeling to microbial communities. A brief introduction to lumped community models is used to motivate the need for genome-level descriptions of individual species and their metabolic interactions. The review of genome-scale models begins with static modeling approaches, which are appropriate for communities where the extracellular environment can be assumed to be time invariant or slowly varying. Dynamic extensions of the static modeling approach are described, and then applications of genome-scale models for design of synthetic microbial communities are reviewed. The review concludes with a summary of metagenomic tools for analyzing community metabolism and an outlook for future research.
Proteiniphilum saccharofermentans str. M3/6T isolated from a laboratory biogas reactor is versatile in polysaccharide and oligopeptide utilization as deduced from genome-based metabolic reconstructions.

PubMed

Tomazetto, Geizecler; Hahnke, Sarah; Wibberg, Daniel; Pühler, Alfred; Klocke, Michael; Schlüter, Andreas

2018-06-01

Proteiniphilum saccharofermentans str. M3/6T is a recently described species within the family Porphyromonadaceae (phylum Bacteroidetes), which was isolated from a mesophilic laboratory-scale biogas reactor. The genome of the strain was completely sequenced and manually annotated to reconstruct its metabolic potential regarding biomass degradation and fermentation pathways. The P. saccharofermentans str. M3/6T genome consists of a 4,414,963 bp chromosome featuring an average GC-content of 43.63%. Genome analyses revealed that the strain possesses 3396 protein-coding sequences. Among them are 158 genes assigned to the carbohydrate-active-enzyme families as defined by the CAZy database, including 116 genes encoding glycosyl hydrolases (GHs) involved in pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), starch, fructan and chitin degradation. The strain also features several transporter genes, some of which are located in polysaccharide utilization loci (PUL). PUL gene products are involved in glycan binding, transport and utilization at the cell surface. In the genome of strain M3/6T, 64 PUL are present and most of them in association with genes encoding carbohydrate-active enzymes. Accordingly, the strain was predicted to metabolize several sugars yielding carbon dioxide, hydrogen, acetate, formate, propionate and isovalerate as end-products of the fermentation process. Moreover, P. saccharofermentans str. M3/6T encodes extracellular and intracellular proteases and transporters predicted to be involved in protein and oligopeptide degradation. Comparative analyses between P. saccharofermentans str. M3/6T and its closest described relative P. acetatigenes str. DSM 18083T indicate that both strains share a similar metabolism regarding decomposition of complex carbohydrates and fermentation of sugars.
A genome-scale Escherichia coli kinetic metabolic model k-ecoli457 satisfying flux data for multiple mutant strains

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khodayari, Ali; Maranas, Costas D.

Kinetic models of metabolism at a genome scale that faithfully recapitulate the effect of multiple genetic interventions would be transformative in our ability to reliably design novel overproducing microbial strains. Here, we introduce k-ecoli457, a genome-scale kinetic model of Escherichia coli metabolism that satisfies fluxomic data for wild-type and 25 mutant strains under different substrates and growth conditions. The k-ecoli457 model contains 457 model reactions, 337 metabolites and 295 substrate-level regulatory interactions. Parameterization is carried out using a genetic algorithm by simultaneously imposing all available fluxomic data (about 30 measured fluxes per mutant). Furthermore, the Pearson correlation coefficient between experimental data and predicted product yields for 320 engineered strains spanning 24 product metabolites is 0.84. This is substantially higher than that using flux balance analysis, minimization of metabolic adjustment or maximization of product yield exhibiting systematic errors with correlation coefficients of, respectively, 0.18, 0.37 and 0.47.
A genome-scale Escherichia coli kinetic metabolic model k-ecoli457 satisfying flux data for multiple mutant strains

DOE PAGES

Khodayari, Ali; Maranas, Costas D.

2016-12-20

Kinetic models of metabolism at a genome scale that faithfully recapitulate the effect of multiple genetic interventions would be transformative in our ability to reliably design novel overproducing microbial strains. Here, we introduce k-ecoli457, a genome-scale kinetic model of Escherichia coli metabolism that satisfies fluxomic data for wild-type and 25 mutant strains under different substrates and growth conditions. The k-ecoli457 model contains 457 model reactions, 337 metabolites and 295 substrate-level regulatory interactions. Parameterization is carried out using a genetic algorithm by simultaneously imposing all available fluxomic data (about 30 measured fluxes per mutant). Furthermore, the Pearson correlation coefficient between experimental data and predicted product yields for 320 engineered strains spanning 24 product metabolites is 0.84. This is substantially higher than that using flux balance analysis, minimization of metabolic adjustment or maximization of product yield exhibiting systematic errors with correlation coefficients of, respectively, 0.18, 0.37 and 0.47.
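For readers unfamiliar with the statistic quoted in this record, the Pearson correlation coefficient between measured and predicted yields is the covariance of the two series divided by the product of their standard deviations. The sketch below computes it from scratch; the function name and the yield values are invented for illustration and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measured vs. predicted product yields (made-up numbers).
measured = [0.10, 0.25, 0.40, 0.55]
predicted = [0.12, 0.20, 0.45, 0.50]
print(round(pearson_r(measured, predicted), 3))
```

A value near 1 means the predictions rank and scale with the measurements; a value near 0 means they carry little linear information about them.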
optGpSampler: an improved tool for uniformly sampling the solution-space of genome-scale metabolic networks.

PubMed

Megchelenbrink, Wout; Huynen, Martijn; Marchiori, Elena

2014-01-01

Constraint-based models of metabolic networks are typically underdetermined, because they contain more reactions than metabolites. Therefore the solutions to this system do not consist of unique flux rates for each reaction, but rather a space of possible flux rates. By uniformly sampling this space, an estimated probability distribution for each reaction's flux in the network can be obtained. However, sampling a high dimensional network is time-consuming. Furthermore, the constraints imposed on the network give rise to an irregularly shaped solution space. Therefore more tailored, efficient sampling methods are needed. We propose an efficient sampling algorithm (called optGpSampler), which implements the Artificial Centering Hit-and-Run algorithm in a different manner than the sampling algorithm implemented in the COBRA Toolbox for metabolic network analysis, here called gpSampler. Results of extensive experiments on different genome-scale metabolic networks show that optGpSampler is up to 40 times faster than gpSampler. Application of existing convergence diagnostics on small network reconstructions indicates that optGpSampler converges roughly ten times faster than gpSampler towards similar sampling distributions. For networks of higher dimension (i.e. containing more than 500 reactions), we observed significantly better convergence of optGpSampler and a large deviation between the samples generated by the two algorithms. optGpSampler for Matlab and Python is available for non-commercial use at: http://cs.ru.nl/~wmegchel/optGpSampler/.
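The idea of sampling a constrained flux space can be sketched with plain hit-and-run on a toy polytope. This is not optGpSampler, gpSampler, or the Artificial Centering variant: it omits the centering step and the steady-state equality constraints of a real metabolic network, and the two-flux example (non-negative fluxes sharing a capacity of 10) is an assumption made for illustration.

```python
import math
import random


def hit_and_run(A, b, start, n_samples, seed=0):
    """Sample the bounded polytope {x : A x <= b} by plain hit-and-run."""
    rng = random.Random(seed)
    x = list(start)  # must be an interior point of the polytope
    samples = []
    for _ in range(n_samples):
        # Draw a direction uniformly on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        # Find the interval of step sizes t keeping x + t*d feasible.
        t_lo, t_hi = -math.inf, math.inf
        for row, bound in zip(A, b):
            slope = sum(a * c for a, c in zip(row, d))
            slack = bound - sum(a * c for a, c in zip(row, x))
            if slope > 1e-12:
                t_hi = min(t_hi, slack / slope)
            elif slope < -1e-12:
                t_lo = max(t_lo, slack / slope)
        # Move to a uniformly chosen point on that chord.
        t = rng.uniform(t_lo, t_hi)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


# Toy flux space: v1 >= 0, v2 >= 0, v1 + v2 <= 10 (hypothetical bounds).
A = [[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b = [0.0, 0.0, 10.0]
samples = hit_and_run(A, b, start=[1.0, 1.0], n_samples=2000)
mean_v1 = sum(s[0] for s in samples) / len(samples)
print(len(samples), round(mean_v1, 2))
```

Each step stays inside the feasible region by construction, and the chain of points approximates a uniform distribution over it, from which a per-flux histogram can be read off.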
Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes

PubMed Central

Nakatani, Yoichiro; McLysaght, Aoife

2017-01-01

Motivation: It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plant. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleost and vertebrate, impeded by extensive gene shuffling and gene losses. Results: Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision to previous views on genome structure evolution in teleost and vertebrate. Conclusions: We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as documents on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history. Availability and implementation: The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip, and the software written in Java is available upon request. Contact: yoichiro.nakatani@tcd.ie or aoife.mclysaght@tcd.ie Supplementary information
Genomes as documents of evolutionary history: a probabilistic macrosynteny model for the reconstruction of ancestral genomes.

PubMed

Nakatani, Yoichiro; McLysaght, Aoife

2017-07-15

It has been argued that whole-genome duplication (WGD) exerted a profound influence on the course of evolution. For the purpose of fully understanding the impact of WGD, several formal algorithms have been developed for reconstructing pre-WGD gene order in yeast and plant. However, to the best of our knowledge, those algorithms have never been successfully applied to WGD events in teleost and vertebrate, impeded by extensive gene shuffling and gene losses. Here, we present a probabilistic model of macrosynteny (i.e. conserved linkage or chromosome-scale distribution of orthologs), develop a variational Bayes algorithm for inferring the structure of pre-WGD genomes, and study estimation accuracy by simulation. Then, by applying the method to the teleost WGD, we demonstrate effectiveness of the algorithm in a situation where gene-order reconstruction algorithms perform relatively poorly due to a high rate of rearrangement and extensive gene losses. Our high-resolution reconstruction reveals previously overlooked small-scale rearrangements, necessitating a revision to previous views on genome structure evolution in teleost and vertebrate. We have reconstructed the structure of a pre-WGD genome by employing a variational Bayes approach that was originally developed for inferring topics from millions of text documents. Interestingly, comparison of the macrosynteny and topic model algorithms suggests that macrosynteny can be regarded as documents on ancestral genome structure. From this perspective, the present study would seem to provide a textbook example of the prevalent metaphor that genomes are documents of evolutionary history. The analysis data are available for download at http://www.gen.tcd.ie/molevol/supp_data/MacrosyntenyTGD.zip, and the software written in Java is available upon request. Contact: yoichiro.nakatani@tcd.ie or aoife.mclysaght@tcd.ie. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All
Systems biology study of mucopolysaccharidosis using a human metabolic reconstruction network.

PubMed

Salazar, Diego A; Rodríguez-López, Alexander; Herreño, Angélica; Barbosa, Hector; Herrera, Juliana; Ardila, Andrea; Barreto, George E; González, Janneth; Alméciga-Díaz, Carlos J

2016-02-01

Mucopolysaccharidosis (MPS) is a group of lysosomal storage diseases (LSD), characterized by the deficiency of a lysosomal enzyme responsible for the degradation of glycosaminoglycans (GAG). This deficiency leads to the lysosomal accumulation of partially degraded GAG. Nevertheless, deficiency of a single lysosomal enzyme has been associated with impairment in other cell mechanisms, such as apoptosis and redox balance. Although GAG analysis represents the main biomarker for MPS diagnosis, it has several limitations that can lead to a misdiagnosis, whereby the identification of new biomarkers represents an important issue for MPS. In this study, we used a systems biology approach, through the use of a genome-scale human metabolic reconstruction to understand the effect of metabolism alterations in cell homeostasis and to identify potential new biomarkers in MPS. In-silico MPS models were generated by silencing of MPS-related enzymes, and were analyzed through a flux balance and variability analysis. We found that MPS models used approximately 2286 reactions to satisfy the objective function. Impaired reactions were mainly involved in cellular respiration, mitochondrial process, amino acid and lipid metabolism, and ion exchange. Metabolic changes were similar for MPS I and II, and MPS III A to C; while the remaining MPS showed unique metabolic profiles. Eight and thirteen potential high-confidence biomarkers were identified for MPS IVB and VII, respectively, which were associated with the secondary pathologic process of LSD. In vivo evaluation of predicted intermediate confidence biomarkers (β-hexosaminidase and β-glucuronidase) for MPS IVA and VI correlated with the in-silico prediction. These results show the potential of a computational human metabolic reconstruction to understand the molecular mechanisms of this group of diseases, which can be used to identify new biomarkers for MPS. Copyright © 2015. Published by Elsevier Inc.
Comparative functional pan-genome analyses to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon metabolism in the genus Mycobacterium.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Blom, Jochen; Kim, Sung-Kwan; Kim, Bong-Soo; Baek, Dong-Heon; Park, Su Inn; Sutherland, John B; Cerniglia, Carl E

2015-02-14

The bacterial genus Mycobacterium is of great interest in the medical and biotechnological fields. Despite a flood of genome sequencing and functional genomics data, significant gaps in knowledge between genome and phenome seriously hinder efforts toward the treatment of mycobacterial diseases and practical biotechnological applications. In this study, we propose the use of systematic, comparative functional pan-genomic analysis to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon (PAH) metabolism in the genus Mycobacterium. Phylogenetic, phenotypic, and genomic information for 27 completely genome-sequenced mycobacteria was systematically integrated to reconstruct a mycobacterial phenotype network (MPN) with a pan-genomic concept at a network level. In the MPN, mycobacterial phenotypes show typical scale-free relationships. PAH degradation is an isolated phenotype with the lowest connection degree, consistent with phylogenetic and environmental isolation of PAH degraders. A series of functional pan-genomic analyses provide conserved and unique types of genomic evidence for strong epistatic and pleiotropic impacts on evolutionary trajectories of the PAH-degrading phenotype. Under strong natural selection, the detailed gene gain/loss patterns from horizontal gene transfer (HGT)/deletion events hypothesize a plausible evolutionary path, an epistasis-based birth and pleiotropy-dependent death, for PAH metabolism in the genus Mycobacterium. This study generated a practical mycobacterial compendium of phenotypic and genomic changes, focusing on the PAH-degrading phenotype, with a pan-genomic perspective of the evolutionary events and the environmental challenges. Our findings suggest that when selection acts on PAH metabolism, only a small fraction of possible trajectories is likely to be observed, owing mainly to a combination of the ambiguous phenotypic effects of PAHs and the corresponding pleiotropy- and epistasis
Proteome- and transcriptome-driven reconstruction of the human myocyte metabolic network and its use for identification of markers for diabetes.

PubMed

Väremo, Leif; Scheele, Camilla; Broholm, Christa; Mardinoglu, Adil; Kampf, Caroline; Asplund, Anna; Nookaew, Intawat; Uhlén, Mathias; Pedersen, Bente Klarlund; Nielsen, Jens

2015-05-12

Skeletal myocytes are metabolically active and susceptible to insulin resistance and are thus implicated in type 2 diabetes (T2D). This complex disease involves systemic metabolic changes, and their elucidation at the systems level requires genome-wide data and biological networks. Genome-scale metabolic models (GEMs) provide a network context for the integration of high-throughput data. We generated myocyte-specific RNA-sequencing data and investigated their correlation with proteome data. These data were then used to reconstruct a comprehensive myocyte GEM. Next, we performed a meta-analysis of six studies comparing muscle transcription in T2D versus healthy subjects. Transcriptional changes were mapped on the myocyte GEM, revealing extensive transcriptional regulation in T2D, particularly around pyruvate oxidation, branched-chain amino acid catabolism, and tetrahydrofolate metabolism, connected through the downregulated dihydrolipoamide dehydrogenase. Strikingly, the gene signature underlying this metabolic regulation successfully classifies the disease state of individual samples, suggesting that regulation of these pathways is a ubiquitous feature of myocytes in response to T2D. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans.

PubMed

Tully, Benjamin J; Graham, Elaina D; Heidelberg, John F

2018-01-16

Microorganisms play a crucial role in mediating global biogeochemical cycles in the marine environment. By reconstructing the genomes of environmental organisms through metagenomics, researchers are able to study the metabolic potential of Bacteria and Archaea that are resistant to isolation in the laboratory. Utilizing the large metagenomic dataset generated from 234 samples collected during the Tara Oceans circumnavigation expedition, we were able to assemble 102 billion paired-end reads into 562 million contigs, which in turn were co-assembled and consolidated into 7.2 million contigs ≥2 kb in length. Approximately 1 million of these contigs were binned to reconstruct draft genomes. In total, 2,631 draft genomes with an estimated completion of ≥50% were generated (1,491 draft genomes >70% complete; 603 genomes >90% complete). A majority of the draft genomes were manually assigned phylogeny based on sets of concatenated phylogenetic marker genes and/or 16S rRNA gene sequences. The draft genomes are now publicly available for the research community at large.
The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans

PubMed Central

Tully, Benjamin J.; Graham, Elaina D.; Heidelberg, John F.

2018-01-01

Microorganisms play a crucial role in mediating global biogeochemical cycles in the marine environment. By reconstructing the genomes of environmental organisms through metagenomics, researchers are able to study the metabolic potential of Bacteria and Archaea that are resistant to isolation in the laboratory. Utilizing the large metagenomic dataset generated from 234 samples collected during the Tara Oceans circumnavigation expedition, we were able to assemble 102 billion paired-end reads into 562 million contigs, which in turn were co-assembled and consolidated into 7.2 million contigs ≥2 kb in length. Approximately 1 million of these contigs were binned to reconstruct draft genomes. In total, 2,631 draft genomes with an estimated completion of ≥50% were generated (1,491 draft genomes >70% complete; 603 genomes >90% complete). A majority of the draft genomes were manually assigned phylogeny based on sets of concatenated phylogenetic marker genes and/or 16S rRNA gene sequences. The draft genomes are now publicly available for the research community at large. PMID:29337314

Integration of Plant Metabolomics Data with Metabolic Networks: Progresses and Challenges.

PubMed

Töpfer, Nadine; Seaver, Samuel M D; Aharoni, Asaph

2018-01-01

In the last decade, plant genome-scale modeling has developed rapidly and modeling efforts have advanced from representing metabolic behavior of plant heterotrophic cell suspensions to studying the complex interplay of cell types, tissues, and organs. A crucial driving force for such developments is the availability and integration of "omics" data (e.g., transcriptomics, proteomics, and metabolomics) which enable the reconstruction, extraction, and application of context-specific metabolic networks. In this chapter, we demonstrate a workflow to integrate gas chromatography coupled to mass spectrometry (GC-MS)-based metabolomics data of tomato fruit pericarp (flesh) tissue, at five developmental stages, with a genome-scale reconstruction of tomato metabolism. This method allows for the extraction of context-specific networks reflecting changing activities of metabolic pathways throughout fruit development and maturation.
Coordinating Environmental Genomics and Geochemistry Reveals Metabolic Transitions in a Hot Spring Ecosystem

PubMed Central

Swingley, Wesley D.; Meyer-Dombard, D’Arcy R.; Shock, Everett L.; Alsop, Eric B.; Falenski, Heinz D.; Havig, Jeff R.; Raymond, Jason

2012-01-01

We have constructed a conceptual model of biogeochemical cycles and metabolic and microbial community shifts within a hot spring ecosystem via coordinated analysis of the “Bison Pool” (BP) Environmental Genome and a complementary contextual geochemical dataset of ∼75 geochemical parameters. 2,321 16S rRNA clones and 470 megabases of environmental sequence data were produced from biofilms at five sites along the outflow of BP, an alkaline hot spring in Sentinel Meadow (Lower Geyser Basin) of Yellowstone National Park. This channel acts as a >22 m gradient of decreasing temperature, increasing dissolved oxygen, and changing availability of biologically important chemical species, such as those containing nitrogen and sulfur. Microbial life at BP transitions from a 92°C chemotrophic streamer biofilm community in the BP source pool to a 56°C phototrophic mat community. We improved automated annotation of the BP environmental genomes using BLAST-based Markov clustering. We have also assigned environmental genome sequences to individual microbial community members by complementing traditional homology-based assignment with nucleotide word-usage algorithms, allowing more than 70% of all reads to be assigned to source organisms. This assignment yields high genome coverage in dominant community members, facilitating reconstruction of nearly complete metabolic profiles and in-depth analysis of the relation between geochemical and metabolic changes along the outflow. We show that changes in environmental conditions and energy availability are associated with dramatic shifts in microbial communities and metabolic function. We have also identified an organism constituting a novel phylum in a metabolic “transition” community, located physically between the chemotroph- and phototroph-dominated sites. The complementary analysis of biogeochemical and environmental genomic data from BP has allowed us to build ecosystem-based conceptual models for this hot spring
Genome-scale metabolic modeling of responses to polymyxins in Pseudomonas aeruginosa.

PubMed

Zhu, Yan; Czauderna, Tobias; Zhao, Jinxin; Klapperstueck, Matthias; Maifiah, Mohd Hafidz Mahamad; Han, Mei-Ling; Lu, Jing; Sommer, Björn; Velkov, Tony; Lithgow, Trevor; Song, Jiangning; Schreiber, Falk; Li, Jian

2018-04-01

Pseudomonas aeruginosa often causes multidrug-resistant infections in immunocompromised patients, and polymyxins are often used as the last-line therapy. Alarmingly, resistance to polymyxins has been increasingly reported worldwide recently. To rescue this last-resort class of antibiotics, it is necessary to systematically understand how P. aeruginosa alters its metabolism in response to polymyxin treatment, thereby facilitating the development of effective therapies. To this end, a genome-scale metabolic model (GSMM) was used to analyze bacterial metabolic changes at the systems level. A high-quality GSMM iPAO1 was constructed for P. aeruginosa PAO1 for antimicrobial pharmacological research. Model iPAO1 encompasses an additional periplasmic compartment and contains 3022 metabolites, 4265 reactions, and 1458 genes in total. Growth prediction on 190 carbon and 95 nitrogen sources achieved an accuracy of 89.1%, outperforming all reported P. aeruginosa models. Notably, prediction of the essential genes for growth achieved a high accuracy of 87.9%. Metabolic simulation showed that lipid A modifications associated with polymyxin resistance exert a limited impact on bacterial growth and metabolism but remarkably change the physiochemical properties of the outer membrane. Modeling with transcriptomics constraints revealed a broad range of metabolic responses to polymyxin treatment, including reduced biomass synthesis, upregulated amino acid catabolism, induced flux through the tricarboxylic acid cycle, and increased redox turnover. Overall, iPAO1 represents the most comprehensive GSMM constructed to date for Pseudomonas. It provides a powerful systems pharmacology platform for the elucidation of complex killing mechanisms of antibiotics.
Patterns of Metabolite Changes Identified from Large-Scale Gene Perturbations in Arabidopsis Using a Genome-Scale Metabolic Network

PubMed Central

Kim, Taehyong; Dreher, Kate; Nilo-Poyanco, Ricardo; Lee, Insuk; Fiehn, Oliver; Lange, Bernd Markus; Nikolau, Basil J.; Sumner, Lloyd; Welti, Ruth; Wurtele, Eve S.; Rhee, Seung Y.

2015-01-01

Metabolomics enables quantitative evaluation of metabolic changes caused by genetic or environmental perturbations. However, little is known about how perturbing a single gene changes the metabolic system as a whole and which network and functional properties are involved in this response. To answer this question, we investigated the metabolite profiles from 136 mutants with single gene perturbations of functionally diverse Arabidopsis (Arabidopsis thaliana) genes. Fewer than 10 metabolites were changed significantly relative to the wild type in most of the mutants, indicating that the metabolic network was robust to perturbations of single metabolic genes. These changed metabolites were closer to each other in a genome-scale metabolic network than expected by chance, supporting the notion that the genetic perturbations changed the network more locally than globally. Surprisingly, the changed metabolites were close to the perturbed reactions in only 30% of the mutants of the well-characterized genes. To determine the factors that contributed to the distance between the observed metabolic changes and the perturbation site in the network, we examined nine network and functional properties of the perturbed genes. Only the isozyme number affected the distance between the perturbed reactions and changed metabolites. This study revealed patterns of metabolic changes from large-scale gene perturbations and relationships between characteristics of the perturbed genes and metabolic changes. PMID:25670818
Constraining Genome-Scale Models to Represent the Bow Tie Structure of Metabolism for 13C Metabolic Flux Analysis

PubMed Central

Ando, David; Singh, Jahnavi; Keasling, Jay D.; García Martín, Héctor

2018-01-01

Determination of internal metabolic fluxes is crucial for fundamental and applied biology because they map how carbon and electrons flow through metabolism to enable cell function. 13C Metabolic Flux Analysis (13C MFA) and Two-Scale 13C Metabolic Flux Analysis (2S-13C MFA) are two techniques used to determine such fluxes. Both operate on the simplifying approximation that metabolic flux from peripheral metabolism into central “core” carbon metabolism is minimal, and can be omitted when modeling isotopic labeling in core metabolism. The validity of this “two-scale” or “bow tie” approximation is supported both by the ability to accurately model experimental isotopic labeling data, and by experimentally verified metabolic engineering predictions using these methods. However, the boundaries of core metabolism that satisfy this approximation can vary across species, and across cell culture conditions. Here, we present a set of algorithms that (1) systematically calculate flux bounds for any specified “core” of a genome-scale model so as to satisfy the bow tie approximation and (2) automatically identify an updated set of core reactions that can satisfy this approximation more efficiently. First, we leverage linear programming to simultaneously identify the lowest fluxes from peripheral metabolism into core metabolism compatible with the observed growth rate and extracellular metabolite exchange fluxes. Second, we use Simulated Annealing to identify an updated set of core reactions that allow for a minimum of fluxes into core metabolism to satisfy these experimental constraints. Together, these methods accelerate and automate the identification of a biologically reasonable set of core reactions for use with 13C MFA or 2S-13C MFA, as well as provide for a substantially lower set of flux bounds for fluxes into the core as compared with previous methods. We provide an open source Python implementation of these algorithms at https
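
The first step described in the abstract, a linear program that finds the lowest fluxes from peripheral into core metabolism, can be written down roughly as follows. This is only a sketch of one plausible formulation, not the authors' exact problem statement. The symbols are assumptions introduced here: S is the stoichiometric matrix, v the flux vector with bounds v^lb and v^ub, P the set of reactions carrying flux from peripheral into core metabolism, mu^obs the observed growth rate, and e_k^obs the measured exchange fluxes.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{j \in P} v_j \\
\text{s.t.}\quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \\
& v_{\mathrm{growth}} = \mu^{\mathrm{obs}}, \\
& v_k = e_k^{\mathrm{obs}} \quad \text{for each measured exchange reaction } k, \\
& v_j \ge 0 \quad \text{for } j \in P .
\end{aligned}
```

Under these assumptions, the optimal values of the fluxes in P could serve as upper bounds for those fluxes in a subsequent labeling model; how the published implementation sets its bounds is not stated in the abstract.
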
Evaluation of a genome-scale in silico metabolic model for Geobacter metallireducens by using proteomic data from a field biostimulation experiment.

PubMed

Fang, Yilin; Wilkins, Michael J; Yabusaki, Steven B; Lipton, Mary S; Long, Philip E

2012-12-01

Accurately predicting the interactions between microbial metabolism and the physical subsurface environment is necessary to enhance subsurface energy development, soil and groundwater cleanup, and carbon management. This study was an initial attempt to confirm the metabolic functional roles within an in silico model using environmental proteomic data collected during field experiments. Shotgun global proteomics data collected during a subsurface biostimulation experiment were used to validate a genome-scale metabolic model of Geobacter metallireducens, specifically the ability of the metabolic model to predict metal reduction, biomass yield, and growth rate under dynamic field conditions. The constraint-based in silico model of G. metallireducens relates an annotated genome sequence to the physiological functions with 697 reactions controlled by 747 enzyme-coding genes. Proteomic analysis showed that 180 of the 637 G. metallireducens proteins detected during the 2008 experiment were associated with specific metabolic reactions in the in silico model. When the field-calibrated Fe(III) terminal electron acceptor process reaction in a reactive transport model for the field experiments was replaced with the genome-scale model, the model predicted that the largest metabolic fluxes through the in silico model reactions generally correspond to the highest abundances of proteins that catalyze those reactions. Central metabolism predicted by the model agrees well with protein abundance profiles inferred from proteomic analysis. Model discrepancies with the proteomic data, such as the relatively low abundances of proteins associated with amino acid transport and metabolism, revealed pathways or flux constraints in the in silico model that could be updated to more accurately predict metabolic processes that occur in the subsurface environment.
Genome-scale estimate of the metabolic turnover of E. coli from the energy balance analysis

NASA Astrophysics Data System (ADS)

De Martino, D.

2016-02-01

In this article the notion of metabolic turnover is revisited in the light of recent results of out-of-equilibrium thermodynamics. By means of Monte Carlo methods we perform an exact sampling of the enzymatic fluxes in a genome-scale metabolic network of E. coli in stationary growth conditions, from which we infer the metabolite turnover times. However, the latter are inferred from net fluxes, and we argue that this approximation is not valid for enzymes working near thermodynamic equilibrium. We recalculate turnover times from total fluxes by performing an energy balance analysis of the network and resorting to the fluctuation theorem. We find in many cases values one order of magnitude lower, implying a faster picture of intermediate metabolism.
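
The difference between net and total flux can be illustrated with a short sketch. It assumes the standard flux-force relation, in which the ratio of forward to backward flux equals exp(-ΔG/RT); whether the article uses exactly this form is an assumption, and the function names and numbers below are hypothetical.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def total_flux(net_flux, delta_g, temperature=298.15):
    """Total (forward + backward) flux from the net flux and reaction free energy.

    Assumes v_plus / v_minus = exp(-delta_g / (R * T)), with delta_g in kJ/mol
    and delta_g < 0 for a reaction running forward with a positive net flux.
    """
    if delta_g >= 0:
        raise ValueError("expected delta_g < 0 for a positive net flux")
    ratio = math.exp(-delta_g / (R * temperature))  # v_plus / v_minus, > 1
    return net_flux * (ratio + 1.0) / (ratio - 1.0)


def turnover_time(concentration, flux):
    """Turnover time of a metabolite pool drained by the given flux."""
    return concentration / flux


# Hypothetical numbers: a reaction close to equilibrium (-0.5 kJ/mol)
# versus one far from equilibrium (-30 kJ/mol), both with unit net flux.
near = total_flux(1.0, -0.5)
far = total_flux(1.0, -30.0)
print(turnover_time(1.0, 1.0), turnover_time(1.0, near), turnover_time(1.0, far))
```

In this toy case the near-equilibrium reaction has a total flux roughly ten times its net flux, so the turnover time computed from total flux is about ten times shorter, while the far-from-equilibrium reaction is essentially unaffected.
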
Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

PubMed

Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

2017-01-01

Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named logical transformation of model (LTM) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named as gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.
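
To see why reaction knockouts do not translate directly into gene manipulations, consider a minimal sketch that evaluates Boolean gene-reaction rules. This is not the LTM or gModel transformation from the abstract; the rule syntax, gene names, and helper functions are hypothetical and serve only to illustrate the underlying problem.

```python
import ast
from itertools import combinations


def reaction_active(gpr_rule, knocked_out):
    """Evaluate a Boolean gene-reaction rule such as "(g1 and g2) or g3".

    'and' joins subunits of one enzyme complex, 'or' joins isozymes.
    Gene identifiers must be valid Python names in this sketch.
    """
    def evaluate(node):
        if isinstance(node, ast.BoolOp):
            values = [evaluate(child) for child in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Name):
            return node.id not in knocked_out
        raise ValueError("unsupported token in gene-reaction rule")

    return evaluate(ast.parse(gpr_rule, mode="eval").body)


def minimal_gene_knockouts(gpr_rule, genes):
    """Brute-force search for the smallest gene sets that disable a reaction."""
    for size in range(1, len(genes) + 1):
        hits = [set(combo) for combo in combinations(sorted(genes), size)
                if not reaction_active(gpr_rule, set(combo))]
        if hits:
            return hits
    return []


rule = "(g1 and g2) or g3"  # hypothetical rule: one complex plus one isozyme
print(minimal_gene_knockouts(rule, {"g1", "g2", "g3"}))
```

Here a single reaction knockout corresponds to at least two gene deletions, and there is more than one way to achieve it, which is the kind of complexity the model transformations aim to hide from the user.
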
Extreme-Scale De Novo Genome Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Comparative genomics of transcriptional regulation of methionine metabolism in proteobacteria

DOE PAGES

Leyn, Semen A.; Suvorova, Inna A.; Kholina, Tatiana D.; ...

2014-11-20

Methionine metabolism and uptake genes in Proteobacteria are controlled by a variety of RNA and DNA regulatory systems. We have applied comparative genomics to reconstruct regulons for three known transcription factors, MetJ, MetR, and SahR, and three known riboswitch motifs, SAH, SAM-SAH, and SAM_alpha, in ~200 genomes from 22 taxonomic groups of Proteobacteria. We also identified two novel regulons: a SahR-like transcription factor SamR controlling various methionine biosynthesis genes in the Xanthomonadales group, and a potential RNA regulatory element with terminator-antiterminator mechanism controlling the metX or metZ genes in beta-proteobacteria. For each analyzed regulator we identified the core, taxon-specific and genome-specific regulon members. By analyzing the distribution of these regulators in bacterial genomes and by comparing their regulon contents we elucidated possible evolutionary scenarios for the regulation of the methionine metabolism genes in Proteobacteria.
Reconstruction of metabolic pathways by combining probabilistic graphical model-based and knowledge-based methods

PubMed Central

2014-01-01

Automatic reconstruction of metabolic pathways for an organism from genomics and transcriptomics data has been a challenging and important problem in bioinformatics. Traditionally, known reference pathways can be mapped into organism-specific ones based on the organism's genome annotation and protein homology. However, this simple knowledge-based mapping method might produce incomplete pathways and generally cannot predict unknown new relations and reactions. In contrast, ab initio metabolic network construction methods can predict novel reactions and interactions, but their accuracy tends to be low, leading to many false positives. Here we combine existing pathway knowledge and a new ab initio Bayesian probabilistic graphical model in a novel fashion to improve automatic reconstruction of metabolic networks. Specifically, we built a knowledge database containing known, individual gene/protein interactions and metabolic reactions extracted from existing reference pathways. Known reactions and interactions were then used as constraints for Bayesian network learning methods to predict metabolic pathways. Using individual reactions and interactions extracted from different pathways of many organisms to guide pathway construction is new and improves both the coverage and accuracy of metabolic pathway construction. We applied this probabilistic knowledge-based approach to construct the metabolic networks from yeast gene expression data and compared its results with 62 known metabolic networks in the KEGG database. The experiment showed that the method improved the coverage of metabolic network construction over the traditional reference pathway mapping method and was more accurate than pure ab initio methods. PMID:25374614
Computational Modeling of Human Metabolism and Its Application to Systems Biomedicine.

PubMed

Aurich, Maike K; Thiele, Ines

2016-01-01

Modern high-throughput techniques offer immense opportunities to investigate whole-systems behavior, such as those underlying human diseases. However, the complexity of the data presents challenges in interpretation, and new avenues are needed to address the complexity of both diseases and data. Constraint-based modeling is one formalism applied in systems biology. It relies on a genome-scale reconstruction that captures extensive biochemical knowledge regarding an organism. The human genome-scale metabolic reconstruction is increasingly used to understand normal cellular and disease states because metabolism is an important factor in many human diseases. The application of human genome-scale reconstruction ranges from mere querying of the model as a knowledge base to studies that take advantage of the model's topology and, most notably, to functional predictions based on cell- and condition-specific metabolic models built based on omics data. An increasing number and diversity of biomedical questions are being addressed using constraint-based modeling and metabolic models. One of the most successful biomedical applications to date is cancer metabolism, but constraint-based modeling also holds great potential for inborn errors of metabolism or obesity. In addition, it offers great prospects for individualized approaches to diagnostics and the design of disease prevention and intervention strategies. Metabolic models support this endeavor by providing easy access to complex high-throughput datasets. Personalized metabolic models have been introduced. Finally, constraint-based modeling can be used to model whole-body metabolism, which will enable the elucidation of metabolic interactions between organs and disturbances of these interactions as either causes or consequence of metabolic diseases. This chapter introduces constraint-based modeling and describes some of its contributions to systems biomedicine.
Comparative genomics of metabolic capacities of regulons controlled by cis-regulatory RNA motifs in bacteria.

PubMed

Sun, Eric I; Leyn, Semen A; Kazanov, Marat D; Saier, Milton H; Novichkov, Pavel S; Rodionov, Dmitry A

2013-09-02

In silico comparative genomics approaches have been efficiently used for functional prediction and reconstruction of metabolic and regulatory networks. Riboswitches are metabolite-sensing structures often found in bacterial mRNA leaders controlling gene expression on transcriptional or translational levels. An increasing number of riboswitches and other cis-regulatory RNAs have been recently classified into numerous RNA families in the Rfam database. High conservation of these RNA motifs provides a unique advantage for their genomic identification and comparative analysis. A comparative genomics approach implemented in the RegPredict tool was used for reconstruction and functional annotation of regulons controlled by RNAs from 43 Rfam families in diverse taxonomic groups of Bacteria. The inferred regulons include ~5200 cis-regulatory RNAs and more than 12000 target genes in 255 microbial genomes. All predicted RNA-regulated genes were classified into specific and overall functional categories. Analysis of taxonomic distribution of these categories allowed us to establish major functional preferences for each analyzed cis-regulatory RNA motif family. Overall, most RNA motif regulons showed predictable functional content in accordance with their experimentally established effector ligands. Our results suggest that some RNA motifs (including thiamin pyrophosphate and cobalamin riboswitches that control the cofactor metabolism) are widespread and likely originated from the last common ancestor of all bacteria. However, many more analyzed RNA motifs are restricted to a narrow taxonomic group of bacteria and likely represent more recent evolutionary innovations. The reconstructed regulatory networks for major known RNA motifs substantially expand the existing knowledge of transcriptional regulation in bacteria. The inferred regulons can be used for genetic experiments, functional annotations of genes, metabolic reconstruction and evolutionary analysis. The obtained genome
Fast ancestral gene order reconstruction of genomes with unequal gene content.

PubMed

Feijão, Pedro; Araujo, Eloi

2016-11-11

During evolution, genomes are modified by large scale structural events, such as rearrangements, deletions or insertions of large blocks of DNA. Of particular interest, in order to better understand how this type of genomic evolution happens, is the reconstruction of ancestral genomes, given a phylogenetic tree with extant genomes at its leaves. One way of solving this problem is to assume a rearrangement model, such as Double Cut and Join (DCJ), and find a set of ancestral genomes that minimizes the number of events on the input tree. Since this problem is NP-hard for most rearrangement models, exact solutions are practical only for small instances, and heuristics have to be used for larger datasets. This type of approach can be called event-based. Another common approach is based on finding conserved structures between the input genomes, such as adjacencies between genes, possibly also assigning weights that indicate a measure of confidence or probability that this particular structure is present on each ancestral genome, and then finding a set of non-conflicting adjacencies that optimize some given function, usually trying to maximize total weight and minimize character changes in the tree. We call this type of method homology-based. In previous work, we proposed an ancestral reconstruction method that combines homology- and event-based ideas, using the concept of intermediate genomes, which arise in DCJ rearrangement scenarios. This method showed a better rate of correctly reconstructed adjacencies than other methods, while also being faster, since the use of intermediate genomes greatly reduces the search space. Here, we generalize the intermediate genome concept to genomes with unequal gene content, extending our method to account for gene insertions and deletions of any length. In many of the simulated datasets, our proposed method had better results than MLGO and MGRA, two state-of-the-art algorithms for ancestral reconstruction with unequal gene content
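
The homology-based idea of collecting gene adjacencies and weighting them by how often they are conserved can be sketched as follows. This is a simplified illustration, not the authors' method. It assumes linear chromosomes written as lists of signed integers, no duplicated genes, and a plain count of supporting genomes as the weight; all names and toy genomes are hypothetical.

```python
from collections import Counter


def adjacencies(genome):
    """Return the set of adjacencies of a genome.

    Each gene g has a tail (g, "t") and a head (g, "h"). A gene in forward
    orientation reads tail then head, a reversed gene reads head then tail.
    An adjacency is the unordered pair of extremities that touch.
    """
    found = set()
    for chromosome in genome:
        for left, right in zip(chromosome, chromosome[1:]):
            left_end = (abs(left), "h" if left > 0 else "t")
            right_end = (abs(right), "t" if right > 0 else "h")
            found.add(frozenset((left_end, right_end)))
    return found


def adjacency_weights(genomes):
    """Count in how many input genomes each adjacency is present."""
    weights = Counter()
    for genome in genomes:
        weights.update(adjacencies(genome))
    return weights


# Three hypothetical extant genomes, one chromosome each.
leaves = [[[1, 2, 3, 4]], [[1, 2, -3, 4]], [[1, 2, 3, -4]]]
for adjacency, support in adjacency_weights(leaves).most_common():
    print(sorted(adjacency), support)
```

A reconstruction method would then select a non-conflicting subset of these adjacencies (each extremity used at most once) for every ancestral node; that selection step is deliberately left out of this sketch.
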
Refining metabolic models and accounting for regulatory effects.

PubMed

Kim, Joonhoon; Reed, Jennifer L

2014-10-01

Advances in genome-scale metabolic modeling allow us to investigate and engineer metabolism at a systems level. Metabolic network reconstructions have been made for many organisms and computational approaches have been developed to convert these reconstructions into predictive models. However, due to incomplete knowledge these reconstructions often have missing or extraneous components and interactions, which can be identified by reconciling model predictions with experimental data. Recent studies have provided methods to further improve metabolic model predictions by incorporating transcriptional regulatory interactions and high-throughput omics data to yield context-specific metabolic models. Here we discuss recent approaches for resolving model-data discrepancies and building context-specific metabolic models. Once developed, highly accurate metabolic models can be used in a variety of biotechnology applications.
MOST-visualization: software for producing automated textbook-style maps of genome-scale metabolic networks.

PubMed

Kelley, James J; Maor, Shay; Kim, Min Kyung; Lane, Anatoliy; Lun, Desmond S

2017-08-15

Visualization of metabolites, reactions and pathways in genome-scale metabolic networks (GEMs) can assist in understanding cellular metabolism. Three attributes are desirable in software used for visualizing GEMs: (i) automation, since GEMs can be quite large; (ii) production of understandable maps that provide ease in identification of pathways, reactions and metabolites; and (iii) visualization of the entire network to show how pathways are interconnected. No software currently exists for visualizing GEMs that satisfies all three characteristics, but MOST-Visualization, an extension of the software package MOST (Metabolic Optimization and Simulation Tool), satisfies (i), and by using a pre-drawn overview map of metabolism based on the Roche map satisfies (ii) and comes close to satisfying (iii). MOST is distributed for free under the GNU General Public License. The software and full documentation are available at http://most.ccib.rutgers.edu/. Supplementary data are available at Bioinformatics online.
CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data.

PubMed

Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang

2016-12-23

A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertexes and edges stand for genes and their regulatory interactions respectively. Reconstruction of gene regulatory networks, in particular, genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive; they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but struggle to construct GRNs at genome scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by the conditional mutual information measurement using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, running tests on synthetic datasets demonstrate that CMIP can handle large datasets, especially genome-wide datasets, within an acceptable time period. In addition, successful application on a real genomic dataset confirms the practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to the biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/ .
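
Conditional mutual information, the quantity the abstract uses to measure gene interaction, can be illustrated with a small plug-in estimator for discrete data. This is a sketch under assumptions: CMIP itself may well use a different (for example continuous) estimator, expression values would first need to be discretized, and the function name and toy vectors are hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in nats for discrete samples.

    Uses I(X;Y|Z) = sum p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ),
    with probabilities replaced by empirical frequencies.
    """
    n = len(xs)
    count_xyz = Counter(zip(xs, ys, zs))
    count_xz = Counter(zip(xs, zs))
    count_yz = Counter(zip(ys, zs))
    count_z = Counter(zs)
    total = 0.0
    for (x, y, z), c in count_xyz.items():
        total += (c / n) * math.log(
            count_z[z] * c / (count_xz[(x, z)] * count_yz[(y, z)])
        )
    return total


# Gene X and gene Y both simply follow regulator Z: no direct X-Y link.
print(conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]))
# X and Y co-vary while Z is constant: the dependence is not explained by Z.
print(conditional_mutual_information([0, 1, 0, 1], [0, 1, 0, 1], [0, 0, 0, 0]))
```

In a network inference setting, an edge between two genes would be kept only if this value stays above a threshold after conditioning on other genes, which is how indirect associations are pruned.
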
Starch biosynthesis in cassava: a genome-based pathway reconstruction and its exploitation in data integration

PubMed Central

2013-01-01

Background: Cassava is a well-known starchy root crop utilized for food, feed and biofuel production. However, the comprehension underlying the process of starch production in cassava is not yet available. Results: In this work, we exploited the recently released genome information and utilized the post-genomic approaches to reconstruct the metabolic pathway of starch biosynthesis in cassava using multiple plant templates. The quality of pathway reconstruction was assured by the employed parsimonious reconstruction framework and the collective validation steps. Our reconstructed pathway is presented in the form of an informative map, which describes all important information of the pathway, and an interactive map, which facilitates the integration of omics data into the metabolic pathway. Additionally, to demonstrate the advantage of the reconstructed pathways beyond just the schematic presentation, the pathway could be used for incorporating the gene expression data obtained from various developmental stages of cassava roots. Our results exhibited the distinct activities of the starch biosynthesis pathway in different stages of root development at the transcriptional level whereby the activity of the pathway is higher toward the development of mature storage roots. Conclusions: To expand its applications, the interactive map of the reconstructed starch biosynthesis pathway is available for download at the SBI group’s website (http://sbi.pdti.kmutt.ac.th/?page_id=33). This work is considered a big step in the quantitative modeling pipeline aiming to investigate the dynamic regulation of starch biosynthesis in cassava roots. PMID:23938102
Starch biosynthesis in cassava: a genome-based pathway reconstruction and its exploitation in data integration.

PubMed

Saithong, Treenut; Rongsirikul, Oratai; Kalapanulak, Saowalak; Chiewchankaset, Porntip; Siriwat, Wanatsanan; Netrphan, Supatcharee; Suksangpanomrung, Malinee; Meechai, Asawin; Cheevadhanarak, Supapon

2013-08-10

Cassava is a well-known starchy root crop utilized for food, feed and biofuel production. However, the comprehension underlying the process of starch production in cassava is not yet available. In this work, we exploited the recently released genome information and utilized the post-genomic approaches to reconstruct the metabolic pathway of starch biosynthesis in cassava using multiple plant templates. The quality of pathway reconstruction was assured by the employed parsimonious reconstruction framework and the collective validation steps. Our reconstructed pathway is presented in the form of an informative map, which describes all important information of the pathway, and an interactive map, which facilitates the integration of omics data into the metabolic pathway. Additionally, to demonstrate the advantage of the reconstructed pathways beyond just the schematic presentation, the pathway could be used for incorporating the gene expression data obtained from various developmental stages of cassava roots. Our results exhibited the distinct activities of the starch biosynthesis pathway in different stages of root development at the transcriptional level whereby the activity of the pathway is higher toward the development of mature storage roots. To expand its applications, the interactive map of the reconstructed starch biosynthesis pathway is available for download at the SBI group's website (http://sbi.pdti.kmutt.ac.th/?page_id=33). This work is considered a big step in the quantitative modeling pipeline aiming to investigate the dynamic regulation of starch biosynthesis in cassava roots.
FMM: a web server for metabolic pathway reconstruction and comparative analysis.

PubMed

Chou, Chih-Hung; Chang, Wen-Chi; Chiu, Chih-Min; Huang, Chih-Chang; Huang, Hsien-Da

2009-07-01

Synthetic Biology, a multidisciplinary field, is growing rapidly. Improving the understanding of biological systems through mimicry and producing bio-orthogonal systems with new functions are two complementary pursuits in this field. A web server called FMM (From Metabolite to Metabolite) was developed for this purpose. FMM can reconstruct metabolic pathways from one metabolite to another metabolite among different species, based mainly on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and other integrated biological databases. Novel presentation for connecting different KEGG maps is newly provided. Both local and global graphical views of the metabolic pathways are designed. FMM has many applications in Synthetic Biology and Metabolic Engineering. For example, the reconstruction of metabolic pathways to produce valuable metabolites or secondary metabolites in bacteria or yeast is a promising strategy for drug production. FMM provides a highly effective way to elucidate the genes from which species should be cloned into those microorganisms based on FMM pathway comparative analysis. Consequently, FMM is an effective tool for applications in synthetic biology to produce both drugs and biofuels. This novel and innovative resource is now freely available at http://FMM.mbc.nctu.edu.tw/.
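The idea of reconstructing a route from one metabolite to another can be illustrated with a generic breadth-first search over a directed metabolite graph. This is only a minimal sketch: the abstract does not describe FMM's actual algorithm, and the graph below (the `REACTIONS` dictionary, loosely following upper glycolysis) and the function name `shortest_route` are hypothetical, chosen for illustration.

```python
from collections import deque

# Hypothetical toy graph: an edge A -> B means some reaction converts A into B.
# This is illustrative data, not content taken from KEGG or FMM.
REACTIONS = {
    "glucose": ["glucose-6-phosphate"],
    "glucose-6-phosphate": ["fructose-6-phosphate", "6-phosphogluconolactone"],
    "fructose-6-phosphate": ["fructose-1,6-bisphosphate"],
    "fructose-1,6-bisphosphate": ["glyceraldehyde-3-phosphate"],
    "6-phosphogluconolactone": [],
}


def shortest_route(graph, source, target):
    """Return the shortest metabolite-to-metabolite route as a list, or None."""
    queue = deque([[source]])  # each queue entry is a full path from the source
    seen = {source}            # metabolites already reached, to avoid cycles
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target is not reachable from source


route = shortest_route(REACTIONS, "glucose", "fructose-1,6-bisphosphate")
print(route)
```

Breadth-first search returns a route with the fewest conversion steps; a real tool would also need to track which enzymes and species supply each step, which this sketch omits.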

Genome-scale biological models for industrial microbial systems.

PubMed

Xu, Nan; Ye, Chao; Liu, Liming

2018-04-01

The primary aims and challenges associated with microbial fermentation include achieving faster cell growth, higher productivity, and more robust production processes. Genome-scale biological models, predicting the formation of an interaction among genetic materials, enzymes, and metabolites, constitute a systematic and comprehensive platform to analyze and optimize the microbial growth and production of biological products. Genome-scale biological models can help optimize microbial growth-associated traits by simulating biomass formation, predicting growth rates, and identifying the requirements for cell growth. With regard to microbial product biosynthesis, genome-scale biological models can be used to design product biosynthetic pathways, accelerate production efficiency, and reduce metabolic side effects, leading to improved production performance. The present review discusses the development of microbial genome-scale biological models since their emergence and emphasizes their pertinent application in improving industrial microbial fermentation of biological products.
Physiologically Shrinking the Solution Space of a Saccharomyces cerevisiae Genome-Scale Model Suggests the Role of the Metabolic Network in Shaping Gene Expression Noise.

PubMed

Chi, Baofang; Tao, Shiheng; Liu, Yanlin

2015-01-01

Sampling the solution space of genome-scale models is generally conducted to determine the feasible region for metabolic flux distribution. Because the region for actual metabolic states resides only in a small fraction of the entire space, it is necessary to shrink the solution space to improve the predictive power of a model. A common strategy is to constrain models by integrating extra datasets such as high-throughput datasets and C13-labeled flux datasets. However, studies refining these approaches by performing a meta-analysis of massive experimental metabolic flux measurements, which are closely linked to cellular phenotypes, are limited. In the present study, experimentally identified metabolic flux data from 96 published reports were systematically reviewed. Several strong associations among metabolic flux phenotypes were observed. These phenotype-phenotype associations at the flux level were quantified and integrated into a Saccharomyces cerevisiae genome-scale model as extra physiological constraints. By sampling the shrunken solution space of the model, the metabolic flux fluctuation level, which is an intrinsic trait of metabolic reactions determined by the network, was estimated and utilized to explore its relationship to gene expression noise. Although no correlation was observed in all enzyme-coding genes, a relationship between metabolic flux fluctuation and expression noise of genes associated with enzyme-dosage sensitive reactions was detected, suggesting that the metabolic network plays a role in shaping gene expression noise. Such correlation was mainly attributed to the genes corresponding to non-essential reactions, rather than essential ones. This was, at least partially, due to regulations underlying the flux phenotype-phenotype associations. Altogether, this study proposes a new approach in shrinking the solution space of a genome-scale model, of which sampling provides new insights into gene expression noise.
Flux Balance Analysis of Cyanobacterial Metabolism: The Metabolic Network of Synechocystis sp. PCC 6803

PubMed Central

Knoop, Henning; Gründel, Marianne; Zilliges, Yvonne; Lehmann, Robert; Hoffmann, Sabrina; Lockau, Wolfgang; Steuer, Ralf

2013-01-01

Cyanobacteria are versatile unicellular phototrophic microorganisms that are highly abundant in many environments. Owing to their capability to utilize solar energy and atmospheric carbon dioxide for growth, cyanobacteria are increasingly recognized as a prolific resource for the synthesis of valuable chemicals and various biofuels. To fully harness the metabolic capabilities of cyanobacteria necessitates an in-depth understanding of the metabolic interconversions taking place during phototrophic growth, as provided by genome-scale reconstructions of microbial organisms. Here we present an extended reconstruction and analysis of the metabolic network of the unicellular cyanobacterium Synechocystis sp. PCC 6803. Building upon several recent reconstructions of cyanobacterial metabolism, unclear reaction steps are experimentally validated and the functional consequences of unknown or dissenting pathway topologies are discussed. The updated model integrates novel results with respect to the cyanobacterial TCA cycle, an alleged glyoxylate shunt, and the role of photorespiration in cellular growth. Going beyond conventional flux-balance analysis, we extend the computational analysis to diurnal light/dark cycles of cyanobacterial metabolism. PMID:23843751
Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints

DOE PAGES

Sánchez, Benjamín J.; Zhang, Cheng; Nilsson, Avlant; ...

2017-03-08

Genome-scale metabolic models (GEMs) are widely used to calculate metabolic phenotypes. They rely on defining a set of constraints, the most common of which is that the production of metabolites and/or growth are limited by the carbon source uptake rate. However, enzyme abundances and kinetics, which act as limitations on metabolic fluxes, are not taken into account. Here, we present GECKO, a method that enhances a GEM to account for enzymes as part of reactions, thereby ensuring that each metabolic flux does not exceed its maximum capacity, equal to the product of the enzyme's abundance and turnover number. We applied GECKO to a Saccharomyces cerevisiae GEM and demonstrated that the new model could correctly describe phenotypes that the previous model could not, particularly under high enzymatic pressure conditions, such as yeast growing on different carbon sources in excess, coping with stress, or overexpressing a specific pathway. GECKO also allows to directly integrate quantitative proteomics data; by doing so, we significantly reduced flux variability of the model, in over 60% of metabolic reactions. Additionally, the model gives insight into the distribution of enzyme usage between and within metabolic pathways. The developed method and model are expected to increase the use of model-based design in metabolic engineering.
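The capacity rule stated in this abstract (a flux may not exceed the product of the enzyme's abundance and its turnover number) can be written as a short calculation. The sketch below is an illustration only and is not the GECKO implementation: the function names, the units (turnover number in 1/s, abundance in mmol/gDW, flux in mmol/gDW/h) and the numeric values are assumptions made for the example.

```python
SECONDS_PER_HOUR = 3600.0


def enzyme_capacity(kcat_per_s, enzyme_mmol_per_gdw):
    """Maximum flux in mmol/gDW/h: turnover number times enzyme abundance.

    Assumed units: kcat in 1/s, abundance in mmol/gDW.
    """
    return kcat_per_s * SECONDS_PER_HOUR * enzyme_mmol_per_gdw


def cap_flux(flux, kcat_per_s, enzyme_mmol_per_gdw):
    """Clip a proposed flux so that it never exceeds the enzyme capacity."""
    return min(flux, enzyme_capacity(kcat_per_s, enzyme_mmol_per_gdw))


# Hypothetical numbers: kcat = 100 per second, abundance = 1e-5 mmol/gDW.
v_max = enzyme_capacity(100.0, 1e-5)   # roughly 3.6 mmol/gDW/h
capped = cap_flux(10.0, 100.0, 1e-5)   # a proposed flux of 10 is limited to v_max
print(v_max, capped)
```

In a constraint-based model the same quantity would enter as an upper bound on the reaction rather than as a clipping step after the fact; the clipping form is used here only to keep the example free of a solver.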
Improving the phenotype predictions of a yeast genome-scale metabolic model by incorporating enzymatic constraints

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sánchez, Benjamín J.; Zhang, Cheng; Nilsson, Avlant

Genome-scale metabolic models (GEMs) are widely used to calculate metabolic phenotypes. They rely on defining a set of constraints, the most common of which is that the production of metabolites and/or growth are limited by the carbon source uptake rate. However, enzyme abundances and kinetics, which act as limitations on metabolic fluxes, are not taken into account. Here, we present GECKO, a method that enhances a GEM to account for enzymes as part of reactions, thereby ensuring that each metabolic flux does not exceed its maximum capacity, equal to the product of the enzyme's abundance and turnover number. We applied GECKO to a Saccharomyces cerevisiae GEM and demonstrated that the new model could correctly describe phenotypes that the previous model could not, particularly under high enzymatic pressure conditions, such as yeast growing on different carbon sources in excess, coping with stress, or overexpressing a specific pathway. GECKO also allows to directly integrate quantitative proteomics data; by doing so, we significantly reduced flux variability of the model, in over 60% of metabolic reactions. Additionally, the model gives insight into the distribution of enzyme usage between and within metabolic pathways. The developed method and model are expected to increase the use of model-based design in metabolic engineering.
The OME Framework for genome-scale systems biology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Palsson, Bernhard O.; Ebrahim, Ali; Federowicz, Steve

The life sciences are undergoing continuous and accelerating integration with computational and engineering sciences. The biology that many in the field have been trained on may be hardly recognizable in ten to twenty years. One of the major drivers for this transformation is the blistering pace of advancements in DNA sequencing and synthesis. These advances have resulted in unprecedented amounts of new data, information, and knowledge. Many software tools have been developed to deal with aspects of this transformation and each is sorely needed [1-3]. However, few of these tools have been forced to deal with the full complexity of genome-scale models along with high throughput genome-scale data. This particular situation represents a unique challenge, as it is simultaneously necessary to deal with the vast breadth of genome-scale models and the dizzying depth of high-throughput datasets. It has been observed time and again that as the pace of data generation continues to accelerate, the pace of analysis significantly lags behind [4]. It is also evident that, given the plethora of databases and software efforts [5-12], it is still a significant challenge to work with genome-scale metabolic models, let alone next-generation whole cell models [13-15]. We work at the forefront of model creation and systems scale data generation [16-18]. The OME Framework was born out of a practical need to enable genome-scale modeling and data analysis under a unified framework to drive the next generation of genome-scale biological models. Here we present the OME Framework. It exists as a set of Python classes. However, we want to emphasize the importance of the underlying design as an addition to the discussions on specifications of a digital cell. A great deal of work and valuable progress has been made by a number of communities [13, 19-24] towards interchange formats and implementations designed to achieve similar goals. While many software tools exist for handling
BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

PubMed Central

King, Zachary A.; Lu, Justin; Dräger, Andreas; Miller, Philip; Federowicz, Stephen; Lerman, Joshua A.; Ebrahim, Ali; Palsson, Bernhard O.; Lewis, Nathan E.

2016-01-01

Genome-scale metabolic models are mathematically-structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data. PMID:26476456
Acidithiobacillus ferrooxidans metabolism: from genome sequence to industrial applications

PubMed Central

Valdés, Jorge; Pedroso, Inti; Quatrini, Raquel; Dodson, Robert J; Tettelin, Herve; Blake, Robert; Eisen, Jonathan A; Holmes, David S

2008-01-01

Background Acidithiobacillus ferrooxidans is a major participant in consortia of microorganisms used for the industrial recovery of copper (bioleaching or biomining). It is a chemolithoautotrophic, γ-proteobacterium using energy from the oxidation of iron- and sulfur-containing minerals for growth. It thrives at extremely low pH (pH 1–2) and fixes both carbon and nitrogen from the atmosphere. It solubilizes copper and other metals from rocks and plays an important role in nutrient and metal biogeochemical cycling in acid environments. The lack of a well-developed system for genetic manipulation has prevented thorough exploration of its physiology. Also, confusion has been caused by prior metabolic models constructed based upon the examination of multiple, and sometimes distantly related, strains of the microorganism. Results The genome of the type strain A. ferrooxidans ATCC 23270 was sequenced and annotated to identify general features and provide a framework for in silico metabolic reconstruction. Earlier models of iron and sulfur oxidation, biofilm formation, quorum sensing, inorganic ion uptake, and amino acid metabolism are confirmed and extended. Initial models are presented for central carbon metabolism, anaerobic metabolism (including sulfur reduction, hydrogen metabolism and nitrogen fixation), stress responses, DNA repair, and metal and toxic compound fluxes. Conclusion Bioinformatics analysis provides a valuable platform for gene discovery and functional prediction that helps explain the activity of A. ferrooxidans in industrial bioleaching and its role as a primary producer in acidic environments. An analysis of the genome of the type strain provides a coherent view of its gene content and metabolic potential. PMID:19077236
Acidithiobacillus ferrooxidans metabolism: from genome sequence to industrial applications.

PubMed

Valdés, Jorge; Pedroso, Inti; Quatrini, Raquel; Dodson, Robert J; Tettelin, Herve; Blake, Robert; Eisen, Jonathan A; Holmes, David S

2008-12-11

Acidithiobacillus ferrooxidans is a major participant in consortia of microorganisms used for the industrial recovery of copper (bioleaching or biomining). It is a chemolithoautotrophic, gamma-proteobacterium using energy from the oxidation of iron- and sulfur-containing minerals for growth. It thrives at extremely low pH (pH 1-2) and fixes both carbon and nitrogen from the atmosphere. It solubilizes copper and other metals from rocks and plays an important role in nutrient and metal biogeochemical cycling in acid environments. The lack of a well-developed system for genetic manipulation has prevented thorough exploration of its physiology. Also, confusion has been caused by prior metabolic models constructed based upon the examination of multiple, and sometimes distantly related, strains of the microorganism. The genome of the type strain A. ferrooxidans ATCC 23270 was sequenced and annotated to identify general features and provide a framework for in silico metabolic reconstruction. Earlier models of iron and sulfur oxidation, biofilm formation, quorum sensing, inorganic ion uptake, and amino acid metabolism are confirmed and extended. Initial models are presented for central carbon metabolism, anaerobic metabolism (including sulfur reduction, hydrogen metabolism and nitrogen fixation), stress responses, DNA repair, and metal and toxic compound fluxes. Bioinformatics analysis provides a valuable platform for gene discovery and functional prediction that helps explain the activity of A. ferrooxidans in industrial bioleaching and its role as a primary producer in acidic environments. An analysis of the genome of the type strain provides a coherent view of its gene content and metabolic potential.
Modeling and Simulation of Optimal Resource Management during the Diurnal Cycle in Emiliania huxleyi by Genome-Scale Reconstruction and an Extended Flux Balance Analysis Approach.

PubMed

Knies, David; Wittmüß, Philipp; Appel, Sebastian; Sawodny, Oliver; Ederer, Michael; Feuer, Ronny

2015-10-28

The coccolithophorid unicellular alga Emiliania huxleyi is known to form large blooms, which have a strong effect on the marine carbon cycle. As a photosynthetic organism, it is subjected to a circadian rhythm due to the changing light conditions throughout the day. For a better understanding of the metabolic processes under these periodically-changing environmental conditions, a genome-scale model based on a genome reconstruction of the E. huxleyi strain CCMP 1516 was created. It comprises 410 reactions and 363 metabolites. Biomass composition is variable based on the differentiation into functional biomass components and storage metabolites. The model is analyzed with a flux balance analysis approach called diurnal flux balance analysis (diuFBA) that was designed for organisms with a circadian rhythm. It allows storage metabolites to accumulate or be consumed over the diurnal cycle, while keeping the structure of a classical FBA problem. A feature of this approach is that the production and consumption of storage metabolites is not defined externally via the biomass composition, but the result of optimal resource management adapted to the diurnally-changing environmental conditions. The model in combination with this approach is able to simulate the variable biomass composition during the diurnal cycle in proximity to literature data.
COBRApy: COnstraints-Based Reconstruction and Analysis for Python.

PubMed

Ebrahim, Ali; Lerman, Joshua A; Palsson, Bernhard O; Hyduke, Daniel R

2013-08-08

COnstraint-Based Reconstruction and Analysis (COBRA) methods are widely used for genome-scale modeling of metabolic networks in both prokaryotes and eukaryotes. Due to the successes with metabolism, there is an increasing effort to apply COBRA methods to reconstruct and analyze integrated models of cellular processes. The COBRA Toolbox for MATLAB is a leading software package for genome-scale analysis of metabolism; however, it was not designed to elegantly capture the complexity inherent in integrated biological networks and lacks an integration framework for the multiomics data used in systems biology. The openCOBRA Project is a community effort to promote constraints-based research through the distribution of freely available software. Here, we describe COBRA for Python (COBRApy), a Python package that provides support for basic COBRA methods. COBRApy is designed in an object-oriented fashion that facilitates the representation of the complex biological processes of metabolism and gene expression. COBRApy does not require MATLAB to function; however, it includes an interface to the COBRA Toolbox for MATLAB to facilitate use of legacy codes. For improved performance, COBRApy includes parallel processing support for computationally intensive processes. COBRApy is an object-oriented framework designed to meet the computational challenges associated with the next generation of stoichiometric constraint-based models and high-density omics data sets. http://opencobra.sourceforge.net/
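Constraint-based methods of the kind this abstract refers to rest on a stoichiometric matrix S and the steady-state condition S v = 0, meaning that every internal metabolite is produced as fast as it is consumed. The sketch below checks that condition in plain Python for a hypothetical three-reaction chain; it deliberately does not use the COBRApy API, and the matrix, flux vectors and function names are assumptions made for illustration.

```python
# Rows are internal metabolites (A, B); columns are reactions:
#   v1: uptake -> A,   v2: A -> B,   v3: B -> secretion
S = [
    [1, -1, 0],   # metabolite A: made by v1, used by v2
    [0, 1, -1],   # metabolite B: made by v2, used by v3
]


def mass_balance(stoich, fluxes):
    """Return S·v, the net production rate of each internal metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """True when every internal metabolite is balanced within the tolerance."""
    return all(abs(residual) <= tol for residual in mass_balance(stoich, fluxes))


balanced = [2.0, 2.0, 2.0]     # the same flux passes through every step
unbalanced = [2.0, 1.0, 1.0]   # A accumulates at a rate of 1 unit

print(mass_balance(S, balanced), is_steady_state(S, balanced))
print(mass_balance(S, unbalanced), is_steady_state(S, unbalanced))
```

Flux balance analysis then searches among all flux vectors that satisfy this condition and the reaction bounds for one that maximizes an objective, which requires a linear programming solver and is therefore left out of this sketch.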
Genome-Scale Metabolic Reconstructions and Theoretical Investigation of Methane Conversion in Methylomicrobium buryatense Strain 5G(B1)

DOE PAGES

de la Torre, Andrea; Metivier, Aisha; Chu, Frances; ...

2015-11-25

Methane-utilizing bacteria (methanotrophs) are capable of growth on methane and are attractive systems for bio-catalysis. However, the application of natural methanotrophic strains to large-scale production of value-added chemicals/biofuels requires a number of physiological and genetic alterations. An accurate metabolic model coupled with flux balance analysis can provide a solid interpretative framework for experimental data analyses and integration.
The Use of Weighted Graphs for Large-Scale Genome Analysis

PubMed Central

Zhou, Fang; Toivonen, Hannu; King, Ross D.

2014-01-01

There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061
C4GEM, a Genome-Scale Metabolic Model to Study C4 Plant Metabolism

PubMed Central

de Oliveira Dal’Molin, Cristiana Gomes; Quek, Lake-Ee; Palfreyman, Robin William; Brumbley, Stevens Michael; Nielsen, Lars Keld

2010-01-01

Leaves of C4 grasses (such as maize [Zea mays], sugarcane [Saccharum officinarum], and sorghum [Sorghum bicolor]) form a classical Kranz leaf anatomy. Unlike C3 plants, where photosynthetic CO2 fixation proceeds in the mesophyll (M), the fixation process in C4 plants is distributed between two cell types, the M cell and the bundle sheath (BS) cell. Here, we develop a C4 genome-scale model (C4GEM) for the investigation of flux distribution in M and BS cells during C4 photosynthesis. C4GEM, to our knowledge, is the first large-scale metabolic model that encapsulates metabolic interactions between two different cell types. C4GEM is based on the Arabidopsis (Arabidopsis thaliana) model (AraGEM) but has been extended by adding reactions and transporters responsible to represent three different C4 subtypes (NADP-ME [for malic enzyme], NAD-ME, and phosphoenolpyruvate carboxykinase). C4GEM has been validated for its ability to synthesize 47 biomass components and consists of 1,588 unique reactions, 1,755 metabolites, 83 interorganelle transporters, and 29 external transporters (including transport through plasmodesmata). Reactions in the common C4 model have been associated with well-annotated C4 species (NADP-ME subtypes): 3,557 genes in sorghum, 11,623 genes in maize, and 3,881 genes in sugarcane. The number of essential reactions not assigned to genes is 131, 135, and 156 in sorghum, maize, and sugarcane, respectively. Flux balance analysis was used to assess the metabolic activity in M and BS cells during C4 photosynthesis. Our simulations were consistent with chloroplast proteomic studies, and C4GEM predicted the classical C4 photosynthesis pathway and its major effect in organelle function in M and BS. The model also highlights differences in metabolic activities around photosystem I and photosystem II for three different C4 subtypes. Effects of CO2 leakage were also explored. C4GEM is a viable framework for in silico analysis of cell cooperation between M and BS
Reconstruction of Ancestral Genomes in Presence of Gene Gain and Loss.

PubMed

Avdeyev, Pavel; Jiang, Shuai; Aganezov, Sergey; Hu, Fei; Alekseyev, Max A

2016-03-01

Since most dramatic genomic changes are caused by genome rearrangements as well as gene duplications and gain/loss events, it becomes crucial to understand their mechanisms and reconstruct ancestral genomes of the given genomes. This problem was shown to be NP-complete even in the "simplest" case of three genomes, thus calling for heuristic rather than exact algorithmic solutions. At the same time, a larger number of input genomes may actually simplify the problem in practice as it was earlier illustrated with MGRA, a state-of-the-art software tool for reconstruction of ancestral genomes of multiple genomes. One of the key obstacles for MGRA and other similar tools is presence of breakpoint reuses when the same breakpoint region is broken by several different genome rearrangements in the course of evolution. Furthermore, such tools are often limited to genomes composed of the same genes with each gene present in a single copy in every genome. This limitation makes these tools inapplicable for many biological datasets and degrades the resolution of ancestral reconstructions in diverse datasets. We address these deficiencies by extending the MGRA algorithm to genomes with unequal gene contents. The developed next-generation tool MGRA2 can handle gene gain/loss events and shares the ability of MGRA to reconstruct ancestral genomes uniquely in the case of limited breakpoint reuse. Furthermore, MGRA2 employs a number of novel heuristics to cope with higher breakpoint reuse and process datasets inaccessible for MGRA. In practical experiments, MGRA2 shows superior performance for simulated and real genomes as compared to other ancestral genome reconstruction tools.
Development of Computational Tools for Metabolic Model Curation, Flux Elucidation and Strain Design

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maranas, Costas D

An overarching goal of the Department of Energy mission is the efficient deployment and engineering of microbial and plant systems to enable biomass conversion in pursuit of high energy density liquid biofuels. This has spurred the pace at which new organisms are sequenced and annotated. This torrent of genomic information has opened the door to understanding metabolism in not just skeletal pathways and a handful of microorganisms but for truly genome-scale reconstructions derived for hundreds of microbes and plants. Understanding and redirecting metabolism is crucial because metabolic fluxes are unique descriptors of cellular physiology that directly assess the current cellular state and quantify the effect of genetic engineering interventions. At the same time, however, trying to keep pace with the rate of genomic data generation has ushered in a number of modeling and computational challenges related to (i) the automated assembly, testing and correction of genome-scale metabolic models, (ii) metabolic flux elucidation using labeled isotopes, and (iii) comprehensive identification of engineering interventions leading to the desired metabolism redirection.
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

PubMed Central

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas; Hebbelmann, Inga; Rolletschek, Hardy; Schwender, Jorg

2014-01-01

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). Using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch and oil content. PMID
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). In conclusion, using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch
Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

DOE PAGES

Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas; ...

2014-12-19

The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). In conclusion, using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch
Lipid Metabolic Versatility in Malassezia spp. Yeasts Studied through Metabolic Modeling

PubMed Central

Triana, Sergio; de Cock, Hans; Ohm, Robin A.; Danies, Giovanna; Wösten, Han A. B.; Restrepo, Silvia; González Barrios, Andrés F.; Celis, Adriana

2017-01-01

Malassezia species are lipophilic and lipid-dependent yeasts belonging to the human and animal microbiota. Typically, they are isolated from regions rich in sebaceous glands. They have been associated with dermatological diseases such as seborrheic dermatitis, pityriasis versicolor, atopic dermatitis, and folliculitis. The genomes of Malassezia globosa, Malassezia sympodialis, and Malassezia pachydermatis lack the genes related to fatty acid synthesis. Here, the lipid-synthesis pathways of these species, as well as of Malassezia furfur, and of an atypical M. furfur variant were reconstructed using genome data and Constraints Based Reconstruction and Analysis. To this end, the genomes of M. furfur CBS 1878 and the atypical M. furfur 4DS were sequenced and annotated. The resulting Enzyme Commission numbers and predicted reactions were similar to the other Malassezia strains despite the differences in their genome size. Proteomic profiling was utilized to validate flux distributions. Flux differences were observed in the production of steroids in M. furfur and in the metabolism of butanoate in M. pachydermatis. The predictions obtained via these metabolic reconstructions also suggested defects in the assimilation of palmitic acid in M. globosa, M. sympodialis, M. pachydermatis, and the atypical variant of M. furfur, but not in M. furfur. These predictions were validated via physiological characterization, showing the predictive power of metabolic network reconstructions to provide new clues about the metabolic versatility of Malassezia. PMID:28959251

Lipid Metabolic Versatility in Malassezia spp. Yeasts Studied through Metabolic Modeling.

PubMed

Triana, Sergio; de Cock, Hans; Ohm, Robin A; Danies, Giovanna; Wösten, Han A B; Restrepo, Silvia; González Barrios, Andrés F; Celis, Adriana

2017-01-01

Malassezia species are lipophilic and lipid-dependent yeasts belonging to the human and animal microbiota. Typically, they are isolated from regions rich in sebaceous glands. They have been associated with dermatological diseases such as seborrheic dermatitis, pityriasis versicolor, atopic dermatitis, and folliculitis. The genomes of Malassezia globosa, Malassezia sympodialis, and Malassezia pachydermatis lack the genes related to fatty acid synthesis. Here, the lipid-synthesis pathways of these species, as well as of Malassezia furfur, and of an atypical M. furfur variant were reconstructed using genome data and Constraints Based Reconstruction and Analysis. To this end, the genomes of M. furfur CBS 1878 and the atypical M. furfur 4DS were sequenced and annotated. The resulting Enzyme Commission numbers and predicted reactions were similar to the other Malassezia strains despite the differences in their genome size. Proteomic profiling was utilized to validate flux distributions. Flux differences were observed in the production of steroids in M. furfur and in the metabolism of butanoate in M. pachydermatis. The predictions obtained via these metabolic reconstructions also suggested defects in the assimilation of palmitic acid in M. globosa, M. sympodialis, M. pachydermatis, and the atypical variant of M. furfur, but not in M. furfur. These predictions were validated via physiological characterization, showing the predictive power of metabolic network reconstructions to provide new clues about the metabolic versatility of Malassezia.
Rapid Countermeasure Discovery against Francisella tularensis Based on a Metabolic Network Reconstruction

DTIC Science & Technology

2013-05-21

minimum inhibitory concentrations and mammalian cell cytotoxicities. The most promising compound had a low molecular weight, was non-toxic, and abolished... molecular weight, was non-toxic, and abolished bacterial growth at 13 mM, with putative activity against pantetheine-phosphate adenylyltransferase, an...time period. Metabolic genome-scale models of bacteria have provided a computational framework for in silico simulations to evaluate how metabolic
Metabolic network modeling with model organisms.

PubMed

Yilmaz, L Safak; Walhout, Albertha Jm

2017-02-01

Flux balance analysis (FBA) with genome-scale metabolic network models (GSMNM) allows systems level predictions of metabolism in a variety of organisms. Different types of predictions with different accuracy levels can be made depending on the applied experimental constraints ranging from measurement of exchange fluxes to the integration of gene expression data. Metabolic network modeling with model organisms has pioneered method development in this field. In addition, model organism GSMNMs are useful for basic understanding of metabolism, and in the case of animal models, for the study of metabolic human diseases. Here, we discuss GSMNMs of most highly used model organisms with the emphasis on recent reconstructions. Published by Elsevier Ltd.
Metabolic network modeling with model organisms

PubMed Central

Yilmaz, L. Safak; Walhout, Albertha J.M.

2017-01-01

Flux balance analysis (FBA) with genome-scale metabolic network models (GSMNM) allows systems level predictions of metabolism in a variety of organisms. Different types of predictions with different accuracy levels can be made depending on the applied experimental constraints ranging from measurement of exchange fluxes to the integration of gene expression data. Metabolic network modeling with model organisms has pioneered method development in this field. In addition, model organism GSMNMs are useful for basic understanding of metabolism, and in the case of animal models, for the study of metabolic human diseases. Here, we discuss GSMNMs of most highly used model organisms with the emphasis on recent reconstructions. PMID:28088694
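For orientation, the optimisation problem behind FBA is conventionally written as the linear program below. The notation is the textbook convention rather than anything stated in the abstract: S is the stoichiometric matrix (metabolites by reactions), v the vector of reaction fluxes, c the objective weights (often selecting a biomass reaction), and the lower and upper bounds carry the experimental constraints.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}}_{j} \le v_{j} \le v^{\mathrm{ub}}_{j} \quad \text{for each reaction } j .
\end{aligned}
```

In this formulation a measured exchange flux enters by tightening the bounds on the corresponding exchange reaction. How gene expression data are integrated varies by method and is not specified here.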
A Computational Solution to Automatically Map Metabolite Libraries in the Context of Genome Scale Metabolic Networks.

PubMed

Merlet, Benjamin; Paulhe, Nils; Vinson, Florence; Frainay, Clément; Chazalviel, Maxime; Poupin, Nathalie; Gloaguen, Yoann; Giacomoni, Franck; Jourdan, Fabien

2016-01-01

This article describes a generic programmatic method for mapping chemical compound libraries on organism-specific metabolic networks from various databases (KEGG, BioCyc) and flat file formats (SBML and Matlab files). We show how this pipeline was successfully applied to decipher the coverage of chemical libraries set up by two metabolomics facilities MetaboHub (French National infrastructure for metabolomics and fluxomics) and Glasgow Polyomics (GP) on the metabolic networks available in the MetExplore web server. The present generic protocol is designed to formalize and reduce the volume of information transfer between the library and the network database. Matching of metabolites between libraries and metabolic networks is based on InChIs or InChIKeys and therefore requires that these identifiers are specified in both libraries and networks. In addition to providing covering statistics, this pipeline also allows the visualization of mapping results in the context of metabolic networks. In order to achieve this goal, we tackled issues on programmatic interaction between two servers, improvement of metabolite annotation in metabolic networks and automatic loading of a mapping in genome scale metabolic network analysis tool MetExplore. It is important to note that this mapping can also be performed on a single or a selection of organisms of interest and is thus not limited to large facilities.
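The InChIKey-based matching step described above can be sketched as follows. This is a minimal illustration, not the MetExplore pipeline itself: the function names, the toy library and network, and the definition of coverage (fraction of library entries with at least one match in the network) are assumptions made for the example. The optional connectivity-only mode compares just the first 14-character block of the InChIKey, which ignores stereochemistry and protonation.

```python
from typing import Dict, List, Tuple


def connectivity_block(inchikey: str) -> str:
    """Return the first (connectivity) block of an InChIKey."""
    return inchikey.strip().upper().split("-")[0]


def map_library(
    library: Dict[str, str],
    network: Dict[str, str],
    connectivity_only: bool = False,
) -> Tuple[Dict[str, List[str]], float]:
    """Match library compounds to network metabolites by InChIKey.

    library: compound name -> InChIKey
    network: metabolite identifier -> InChIKey
    Returns (mapping, coverage); coverage is the fraction of library
    compounds that matched at least one network metabolite.
    """
    if connectivity_only:
        normalise = connectivity_block
    else:
        normalise = lambda key: key.strip().upper()

    # Index the network once so each library lookup is a dict access.
    index: Dict[str, List[str]] = {}
    for met_id, key in network.items():
        index.setdefault(normalise(key), []).append(met_id)

    mapping = {
        name: sorted(index[normalise(key)])
        for name, key in library.items()
        if normalise(key) in index
    }
    coverage = len(mapping) / len(library) if library else 0.0
    return mapping, coverage


# Hypothetical toy data; the identifiers on the network side are invented,
# and the InChIKeys should be checked against a chemical database before reuse.
LIBRARY = {
    "water": "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
    "ethanol": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
    "D-glucose": "WQZGKKKJIJFFOK-GASJEMHNSA-N",
}
NETWORK = {
    "M_h2o_c": "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
    "M_h2o_e": "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
    "M_glc_bD_c": "WQZGKKKJIJFFOK-VFUOTHLCSA-N",
    "M_pyr_c": "LCTONWCANYUPML-UHFFFAOYSA-N",
}

if __name__ == "__main__":
    for mode in (False, True):
        mapping, coverage = map_library(LIBRARY, NETWORK, connectivity_only=mode)
        print("connectivity_only =", mode, mapping, round(coverage, 2))
```

A compound can map to several network metabolites (for example the same molecule in different compartments), which is why the mapping values are lists.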
Simultaneous prediction of enzyme orthologs from chemical transformation patterns for de novo metabolic pathway reconstruction

PubMed Central

Tabei, Yasuo; Yamanishi, Yoshihiro; Kotera, Masaaki

2016-01-01

Motivation: Metabolic pathways are an important class of molecular networks consisting of compounds, enzymes and their interactions. The understanding of global metabolic pathways is extremely important for various applications in ecology and pharmacology. However, large parts of metabolic pathways remain unknown, and most organism-specific pathways contain many missing enzymes. Results: In this study we propose a novel method to predict the enzyme orthologs that catalyze the putative reactions to facilitate the de novo reconstruction of metabolic pathways from metabolome-scale compound sets. The algorithm detects the chemical transformation patterns of substrate–product pairs using chemical graph alignments, and constructs a set of enzyme-specific classifiers to simultaneously predict all the enzyme orthologs that could catalyze the putative reactions of the substrate–product pairs in the joint learning framework. The originality of the method lies in its ability to make predictions for thousands of enzyme orthologs simultaneously, as well as its extraction of enzyme-specific chemical transformation patterns of substrate–product pairs. We demonstrate the usefulness of the proposed method by applying it to some ten thousands of metabolic compounds, and analyze the extracted chemical transformation patterns that provide insights into the characteristics and specificities of enzymes. The proposed method will open the door to both primary (central) and secondary metabolism in genomics research, increasing research productivity to tackle a wide variety of environmental and public health matters. Availability and Implementation: Contact: maskot@bio.titech.ac.jp PMID:27307627
Randomizing Genome-Scale Metabolic Networks

PubMed Central

Samal, Areejit; Martin, Olivier C.

2011-01-01

Networks coming from protein-protein interactions, transcriptional regulation, signaling, or metabolism may appear to have “unusual” properties. To quantify this, it is appropriate to randomize the network and test the hypothesis that the network is not statistically different from expected in a motivated ensemble. However, when dealing with metabolic networks, the randomization of the network using edge exchange generates fictitious reactions that are biochemically meaningless. Here we provide several natural ensembles of randomized metabolic networks. A first constraint is to use valid biochemical reactions. Further constraints correspond to imposing appropriate functional constraints. We explain how to perform these randomizations with the help of Markov Chain Monte Carlo (MCMC) and show that they allow one to approach the properties of biological metabolic networks. The implication of the present work is that the observed global structural properties of real metabolic networks are likely to be the consequence of simple biochemical and functional constraints. PMID:21779409
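The idea of randomizing over valid reactions under a functional constraint can be sketched with a reaction-swap Markov chain. This is a simplified illustration under stated assumptions, not the authors' implementation: the toy reaction universe, the function names, and the viability test (the target metabolite must be reachable from the seed metabolites by iterative expansion) are hypothetical stand-ins for whatever functional constraint a real study would impose, such as a flux-based growth test. Each step swaps one reaction in the network for one outside it, so the number of reactions is conserved and every reaction remains a member of the valid universe.

```python
import random
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# A reaction is a pair (substrates, products); all reactions are treated
# as irreversible in this toy example.
Reaction = Tuple[FrozenSet[str], FrozenSet[str]]


def producible(reactions: Iterable[str], universe: Dict[str, Reaction],
               seeds: Set[str]) -> Set[str]:
    """Metabolites reachable from the seeds by iterative network expansion."""
    known = set(seeds)
    changed = True
    while changed:
        changed = False
        for rid in reactions:
            substrates, products = universe[rid]
            if substrates <= known and not products <= known:
                known |= products
                changed = True
    return known


def is_viable(reactions: Set[str], universe: Dict[str, Reaction],
              seeds: Set[str], target: str) -> bool:
    """Functional constraint: the target must be producible from the seeds."""
    return target in producible(reactions, universe, seeds)


def randomize(start: Set[str], universe: Dict[str, Reaction], seeds: Set[str],
              target: str, steps: int, rng: random.Random) -> Set[str]:
    """Reaction-swap chain that keeps network size and viability fixed."""
    current = set(start)
    if not is_viable(current, universe, seeds, target):
        raise ValueError("starting network violates the functional constraint")
    for _ in range(steps):
        outside = sorted(set(universe) - current)
        if not outside:
            break
        removed = rng.choice(sorted(current))
        added = rng.choice(outside)
        proposal = (current - {removed}) | {added}
        # Accept only proposals that satisfy the constraint; otherwise stay put.
        if is_viable(proposal, universe, seeds, target):
            current = proposal
    return current


def rxn(substrates: str, products: str) -> Reaction:
    return frozenset(substrates.split()), frozenset(products.split())


# Hypothetical universe of valid reactions with two routes from A to C.
UNIVERSE = {
    "R1": rxn("A", "B"), "R2": rxn("B", "C"),
    "R3": rxn("A", "D"), "R4": rxn("D", "C"),
    "R5": rxn("C", "E"),
    "R6": rxn("A", "F"), "R7": rxn("F", "G"),
}

if __name__ == "__main__":
    sample = randomize({"R1", "R2", "R5", "R6"}, UNIVERSE, {"A"}, "E",
                       steps=200, rng=random.Random(0))
    print(sorted(sample))
```

Because the swap proposal is symmetric (the network size is fixed), accepting every viable proposal and rejecting the rest targets a uniform distribution over the viable networks that can be reached from the starting network.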
BiGG Models: A platform for integrating, standardizing and sharing genome-scale models

DOE PAGES

King, Zachary A.; Lu, Justin; Drager, Andreas; ...

2015-10-17

In this study, genome-scale metabolic models are mathematically structured knowledge bases that can be used to predict metabolic pathway usage and growth phenotypes. Furthermore, they can generate and test hypotheses when integrated with experimental data. To maximize the value of these models, centralized repositories of high-quality models must be established, models must adhere to established standards and model components must be linked to relevant databases. Tools for model visualization further enhance their utility. To meet these needs, we present BiGG Models (http://bigg.ucsd.edu), a completely redesigned Biochemical, Genetic and Genomic knowledge base. BiGG Models contains more than 75 high-quality, manually-curated genome-scale metabolic models. On the website, users can browse, search and visualize models. BiGG Models connects genome-scale models to genome annotations and external databases. Reaction and metabolite identifiers have been standardized across models to conform to community standards and enable rapid comparison across models. Furthermore, BiGG Models provides a comprehensive application programming interface for accessing BiGG Models with modeling and analysis tools. As a resource for highly curated, standardized and accessible models of metabolism, BiGG Models will facilitate diverse systems biology studies and support knowledge-based analysis of diverse experimental data.
Reconstruction of the genome-scale co-expression network for the Hippo signaling pathway in colorectal cancer.

PubMed

Dehghanian, Fariba; Hojati, Zohreh; Hosseinkhan, Nazanin; Mousavian, Zaynab; Masoudi-Nejad, Ali

2018-05-26

The Hippo signaling pathway (HSP) has been identified as an essential and complex signaling pathway for tumor suppression that coordinates proliferation, differentiation, cell death, cell growth and stemness. In the present study, we conducted a genome-scale co-expression analysis to reconstruct the HSP in colorectal cancer (CRC). Five key modules were detected through network clustering, and a detailed discussion of two modules containing respectively 18 and 13 over and down-regulated members of HSP was provided. Our results suggest new potential regulatory factors in the HSP. The detected modules also suggest novel genes contributing to CRC. Moreover, differential expression analysis confirmed the differential expression pattern of HSP members and new suggested regulatory factors between tumor and normal samples. These findings can further reveal the importance of HSP in CRC. Copyright © 2018 Elsevier Ltd. All rights reserved.
A genome-scale metabolic flux model of Escherichia coli K–12 derived from the EcoCyc database

PubMed Central

2014-01-01

Background Constraint-based models of Escherichia coli metabolic flux have played a key role in computational studies of cellular metabolism at the genome scale. We sought to develop a next-generation constraint-based E. coli model that achieved improved phenotypic prediction accuracy while being frequently updated and easy to use. We also sought to compare model predictions with experimental data to highlight open questions in E. coli biology. Results We present EcoCyc–18.0–GEM, a genome-scale model of the E. coli K–12 MG1655 metabolic network. The model is automatically generated from the current state of EcoCyc using the MetaFlux software, enabling the release of multiple model updates per year. EcoCyc–18.0–GEM encompasses 1445 genes, 2286 unique metabolic reactions, and 1453 unique metabolites. We demonstrate a three-part validation of the model that breaks new ground in breadth and accuracy: (i) Comparison of simulated growth in aerobic and anaerobic glucose culture with experimental results from chemostat culture and simulation results from the E. coli modeling literature. (ii) Essentiality prediction for the 1445 genes represented in the model, in which EcoCyc–18.0–GEM achieves an improved accuracy of 95.2% in predicting the growth phenotype of experimental gene knockouts. (iii) Nutrient utilization predictions under 431 different media conditions, for which the model achieves an overall accuracy of 80.7%. The model’s derivation from EcoCyc enables query and visualization via the EcoCyc website, facilitating model reuse and validation by inspection. We present an extensive investigation of disagreements between EcoCyc–18.0–GEM predictions and experimental data to highlight areas of interest to E. coli modelers and experimentalists, including 70 incorrect predictions of gene essentiality on glucose, 80 incorrect predictions of gene essentiality on glycerol, and 83 incorrect predictions of nutrient utilization. Conclusion Significant
ReacKnock: Identifying Reaction Deletion Strategies for Microbial Strain Optimization Based on Genome-Scale Metabolic Network

PubMed Central

Xu, Zixiang; Zheng, Ping; Sun, Jibin; Ma, Yanhe

2013-01-01

Gene knockout has been used as a common strategy to improve microbial strains for producing chemicals. Several algorithms are available to predict the target reactions to be deleted. Most of them apply mixed integer bi-level linear programming (MIBLP) based on metabolic networks, and use duality theory to transform bi-level optimization problem of large-scale MIBLP to single-level programming. However, the validity of the transformation was not proved. Solution of MIBLP depends on the structure of inner problem. If the inner problem is continuous, Karush-Kuhn-Tucker (KKT) method can be used to reformulate the MIBLP to a single-level one. We adopt KKT technique in our algorithm ReacKnock to attack the intractable problem of the solution of MIBLP, demonstrated with the genome-scale metabolic network model of E. coli for producing various chemicals such as succinate, ethanol, threonine and etc. Compared to the previous methods, our algorithm is fast, stable and reliable to find the optimal solutions for all the chemical products tested, and able to provide all the alternative deletion strategies which lead to the same industrial objective. PMID:24348984
Genome scale transcriptomics of baculovirus-insect interactions.

PubMed

Nguyen, Quan; Nielsen, Lars K; Reid, Steven

2013-11-12

Baculovirus-insect cell technologies are applied in the production of complex proteins, veterinary and human vaccines, gene delivery vectors, and biopesticides. Better understanding of how baculoviruses and insect cells interact would facilitate baculovirus-based production. While complete genomic sequences are available for over 58 baculovirus species, little insect genomic information is known. The release of the Bombyx mori and Plutella xylostella genomes, the accumulation of EST sequences for several Lepidopteran species, and especially the availability of two genome-scale analysis tools, namely oligonucleotide microarrays and next generation sequencing (NGS), have facilitated expression studies to generate a rich picture of insect gene responses to baculovirus infections. This review presents current knowledge on the interaction dynamics of the baculovirus-insect system, which is relatively well studied in relation to nucleocapsid transportation, apoptosis, and heat shock responses, but is still poorly understood regarding responses involved in pro-survival pathways, DNA damage pathways, protein degradation, translation, signaling pathways, RNAi pathways, and importantly metabolic pathways for energy, nucleotide and amino acid production. We discuss how the two genome-scale transcriptomic tools can be applied for studying such pathways and suggest that proteomics and metabolomics can produce complementary findings to transcriptomic studies.
Genome-centric resolution of microbial diversity, metabolism and interactions in anaerobic digestion.

PubMed

Vanwonterghem, Inka; Jensen, Paul D; Rabaey, Korneel; Tyson, Gene W

2016-09-01

Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Beyond Linear Sequence Comparisons: The use of genome-level characters for phylogenetic reconstruction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boore, Jeffrey L.

2004-11-27

Although the phylogenetic relationships of many organisms have been convincingly resolved by the comparisons of nucleotide or amino acid sequences, others have remained equivocal despite great effort. Now that large-scale genome sequencing projects are sampling many lineages, it is becoming feasible to compare large data sets of genome-level features and to develop this as a tool for phylogenetic reconstruction that has advantages over conventional sequence comparisons. Although it is unlikely that these will address a large number of evolutionary branch points across the broad tree of life due to the infeasibility of such sampling, they have great potential for convincingly resolving many critical, contested relationships for which no other data seems promising. However, it is important that we recognize potential pitfalls, establish reasonable standards for acceptance, and employ rigorous methodology to guard against a return to earlier days of scenario-driven evolutionary reconstructions.
Ethanol production improvement driven by genome-scale metabolic modeling and sensitivity analysis in Scheffersomyces stipitis

PubMed Central

2017-01-01

The yeast Scheffersomyces stipitis naturally produces ethanol from xylose; however, reaching high ethanol yields is strongly dependent on aeration conditions. It has been reported that changes in the availability of NAD(H/+) cofactors can improve fermentation in some microorganisms. In this work, genome-scale metabolic modeling and phenotypic phase plane analysis were used to characterize metabolic response on a range of uptake rates. Sensitivity analysis was used to assess the effect of ARC on ethanol production, indicating that ethanol production can be improved by modifying ARC through inhibition of the respiratory chain. This was shown experimentally in batch culture using rotenone as an inhibitor of the mitochondrial NADH dehydrogenase complex I (CINADH), which increased ethanol yield by 18%. Furthermore, trajectories for uptake rates, specific productivity and specific growth rate were determined by modeling the batch culture, to calculate the ARC associated with the addition of the CINADH inhibitor. Results showed that the increment in ethanol production via respiratory inhibition is due to an excess in ARC, which generates an increase in ethanol production. Thus, ethanol production improvement could be predicted by a change in ARC. PMID:28658270
Ethanol production improvement driven by genome-scale metabolic modeling and sensitivity analysis in Scheffersomyces stipitis.

PubMed

Acevedo, Alejandro; Conejeros, Raúl; Aroca, Germán

2017-01-01

The yeast Scheffersomyces stipitis naturally produces ethanol from xylose; however, reaching high ethanol yields is strongly dependent on aeration conditions. It has been reported that changes in the availability of NAD(H/+) cofactors can improve fermentation in some microorganisms. In this work, genome-scale metabolic modeling and phenotypic phase plane analysis were used to characterize metabolic response on a range of uptake rates. Sensitivity analysis was used to assess the effect of ARC on ethanol production, indicating that ethanol production can be improved by modifying ARC through inhibition of the respiratory chain. This was shown experimentally in batch culture using rotenone as an inhibitor of the mitochondrial NADH dehydrogenase complex I (CINADH), which increased ethanol yield by 18%. Furthermore, trajectories for uptake rates, specific productivity and specific growth rate were determined by modeling the batch culture, to calculate the ARC associated with the addition of the CINADH inhibitor. Results showed that the increment in ethanol production via respiratory inhibition is due to an excess in ARC, which generates an increase in ethanol production. Thus, ethanol production improvement could be predicted by a change in ARC.
Genome-Scale Architecture of Small Molecule Regulatory Networks and the Fundamental Trade-Off between Regulation and Enzymatic Activity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reznik, Ed; Christodoulou, Dimitris; Goldford, Joshua E.

Metabolic flux is in part regulated by endogenous small molecules that modulate the catalytic activity of an enzyme, e.g., allosteric inhibition. In contrast to transcriptional regulation of enzymes, technical limitations have hindered the production of a genome-scale atlas of small molecule-enzyme regulatory interactions. Here, we develop a framework leveraging the vast, but fragmented, biochemical literature to reconstruct and analyze the small molecule regulatory network (SMRN) of the model organism Escherichia coli, including the primary metabolite regulators and enzyme targets. Using metabolic control analysis, we prove a fundamental trade-off between regulation and enzymatic activity, and we combine it with metabolomic measurements and the SMRN to make inferences on the sensitivity of enzymes to their regulators. By generalizing the analysis to other organisms, we identify highly conserved regulatory interactions across evolutionarily divergent species, further emphasizing a critical role for small molecule interactions in the maintenance of metabolic homeostasis.
Genome-Scale Architecture of Small Molecule Regulatory Networks and the Fundamental Trade-Off between Regulation and Enzymatic Activity

DOE PAGES

Reznik, Ed; Christodoulou, Dimitris; Goldford, Joshua E.; ...

2017-09-12

Metabolic flux is in part regulated by endogenous small molecules that modulate the catalytic activity of an enzyme, e.g., allosteric inhibition. In contrast to transcriptional regulation of enzymes, technical limitations have hindered the production of a genome-scale atlas of small molecule-enzyme regulatory interactions. Here, we develop a framework leveraging the vast, but fragmented, biochemical literature to reconstruct and analyze the small molecule regulatory network (SMRN) of the model organism Escherichia coli, including the primary metabolite regulators and enzyme targets. Using metabolic control analysis, we prove a fundamental trade-off between regulation and enzymatic activity, and we combine it with metabolomic measurements and the SMRN to make inferences on the sensitivity of enzymes to their regulators. By generalizing the analysis to other organisms, we identify highly conserved regulatory interactions across evolutionarily divergent species, further emphasizing a critical role for small molecule interactions in the maintenance of metabolic homeostasis.
Integration of Genome-Scale Modeling and Transcript Profiling Reveals Metabolic Pathways Underlying Light and Temperature Acclimation in Arabidopsis

PubMed Central

Töpfer, Nadine; Caldana, Camila; Grimbs, Sergio; Willmitzer, Lothar; Fernie, Alisdair R.; Nikoloski, Zoran

2013-01-01

Understanding metabolic acclimation of plants to challenging environmental conditions is essential for dissecting the role of metabolic pathways in growth and survival. As stresses involve simultaneous physiological alterations across all levels of cellular organization, a comprehensive characterization of the role of metabolic pathways in acclimation necessitates integration of genome-scale models with high-throughput data. Here, we present an integrative optimization-based approach, which, by coupling a plant metabolic network model and transcriptomics data, can predict the metabolic pathways affected in a single, carefully controlled experiment. Moreover, we propose three optimization-based indices that characterize different aspects of metabolic pathway behavior in the context of the entire metabolic network. We demonstrate that the proposed approach and indices facilitate quantitative comparisons and characterization of the plant metabolic response under eight different light and/or temperature conditions. The predictions of the metabolic functions involved in metabolic acclimation of Arabidopsis thaliana to the changing conditions are in line with experimental evidence and result in a hypothesis about the role of homocysteine-to-Cys interconversion and Asn biosynthesis. The approach can also be used to reveal the role of particular metabolic pathways in other scenarios, while taking into consideration the entirety of characterized plant metabolism. PMID:23613196
Managing microbial communities for sequentially reconstruct genomes from complex metagenomes

NASA Astrophysics Data System (ADS)

Delmont, Tom O.; Vogel, Timothy M.; Simonet, Pascal

2013-04-01

Global understanding of environmental microbial communities is currently limited by the bottleneck of genome reconstruction. Soil is a typical example where individual cells are currently mostly uncultured and metagenomic datasets unassembled. In this study, the microbial community composition of a natural grassland soil was managed under several controlled selective pressures to test a "multi-evenness" stratagem for sequentially reconstructing genomes from a complex metagenome. While poorly represented in the natural community, several newly dominant genomes (an enrichment attaining 10^5 in some cases) were successfully reconstructed under various "harsh" tested conditions. These genomes belong to several genera including (but not restricted to) Leifsonia, Rhodanobacter, Bacillus, Ktedonobacter, Xanthomonas, Streptomyces and Burkholderia. So far, from 10 to 78% of generated metagenomic datasets were reconstructed, providing access to more than 88 000 genes of known or unknown functions and to their genetic environment. Adaptive genes directly related to selective pressures were found, mostly in large plasmids. Functions of potential industrial interest (e.g., novel polyketide synthase modules in Streptomyces) were also discovered. Furthermore, an important phage infection snapshot (>1500X of coverage for the most represented phage) was observed among the Streptomyces population (three distinct genomes reconstructed) of a particular enrichment (mercury, 0.02g/kg) during the fourth month of incubation. This "divide and conquer" strategy could be applied to other environments, using auxiliary sequencing approaches such as single-cell sequencing to detect, connect and mine taxa and functions of interest while creating an extensive set of reference genomes from across the planet. The next limit could turn out to be our imagination in defining novel selective pressures to sequentially make dominant the 10^30 cells of the biosphere.

Systems approach to characterize the metabolism of liver cancer stem cells expressing CD133

NASA Astrophysics Data System (ADS)

Hur, Wonhee; Ryu, Jae Yong; Kim, Hyun Uk; Hong, Sung Woo; Lee, Eun Byul; Lee, Sang Yup; Yoon, Seung Kew

2017-04-01

Liver cancer stem cells (LCSCs) have attracted attention because they cause therapeutic resistance in hepatocellular carcinoma (HCC). Understanding the metabolism of LCSCs can be a key to developing therapeutic strategy, but metabolic characteristics have not yet been studied. Here, we systematically analyzed and compared the global metabolic phenotype between LCSCs and non-LCSCs using transcriptome and metabolome data. We also reconstructed genome-scale metabolic models (GEMs) for LCSC and non-LCSC to comparatively examine differences in their metabolism at genome-scale. We demonstrated that LCSCs exhibited an increased proliferation rate through enhancing glycolysis compared with non-LCSCs. We also confirmed that MYC, a central point of regulation in cancer metabolism, was significantly up-regulated in LCSCs compared with non-LCSCs. Moreover, LCSCs tend to have less active fatty acid oxidation. In this study, the metabolic characteristics of LCSCs were identified using integrative systems analysis, and these characteristics could be potential cures for the resistance of liver cancer cells to anticancer treatments.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

DOE PAGES

Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; ...

2017-04-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we will need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.
In silico method for modelling metabolism and gene product expression at genome scale

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem

2012-07-03

Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and improving the genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.
Optimizing eukaryotic cell hosts for protein production through systems biotechnology and genome-scale modeling.

PubMed

Gutierrez, Jahir M; Lewis, Nathan E

2015-07-01

Eukaryotic cell lines, including Chinese hamster ovary cells, yeast, and insect cells, are invaluable hosts for the production of many recombinant proteins. With the advent of genomic resources, one can now leverage genome-scale computational modeling of cellular pathways to rationally engineer eukaryotic host cells. Genome-scale models of metabolism include all known biochemical reactions occurring in a specific cell. By describing these mathematically and using tools such as flux balance analysis, the models can simulate cell physiology and provide targets for cell engineering that could lead to enhanced cell viability, titer, and productivity. Here we review examples in which metabolic models in eukaryotic cell cultures have been used to rationally select targets for genetic modification, improve cellular metabolic capabilities, design media supplementation, and interpret high-throughput omics data. As more comprehensive models of metabolism and other cellular processes are developed for eukaryotic cell culture, these will enable further exciting developments in cell line engineering, thus accelerating recombinant protein production and biotechnology in the years to come. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
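As a brief aside on the flux balance analysis mentioned in this abstract, the textbook formulation can be sketched as a linear program. The symbols below (S for the stoichiometric matrix, v for the flux vector, c for the objective weights, and the lower and upper bounds on each flux) are generic notation assumed for illustration; they are not taken from the review itself.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\mathrm{lb}}_{j} \le v_{j} \le v^{\mathrm{ub}}_{j} \quad \text{for every reaction } j .
\end{aligned}
```

Here S v = 0 expresses the steady-state assumption that each internal metabolite is produced as fast as it is consumed, and the objective c typically selects a biomass or product-formation reaction.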
Bioreactor microbial ecosystems with differentiated methanogenic phenol biodegradation and competitive metabolic pathways unraveled with genome-resolved metagenomics.

PubMed

Ju, Feng; Wang, Yubo; Zhang, Tong

2018-01-01

Methanogenic biodegradation of aromatic compounds depends on syntrophic metabolism. However, metabolic enzymes and pathways of uncultured microorganisms and their ecological interactions with methanogenic consortia are unknown because of their resistance to isolation and limited genomic information. Genome-resolved metagenomics approaches were used to reconstruct and dissect 23 prokaryotic genomes from 37 and 20 °C methanogenic phenol-degrading reactors. Comparative genomic evidence suggests that temperature difference leads to the colonization of two distinct cooperative sub-communities that can respire sulfate/sulfite/sulfur or nitrate/nitrite compounds and compete for uptake of methanogenic substrates (e.g., acetate and hydrogen). This competition may differentiate methanogenesis. The uncultured ε-Proteobacterium G1, whose close relatives have broad ecological niches including the deep-sea vents, aquifers, sediment, limestone caves, spring, and anaerobic digesters, is implicated as a Sulfurovum-like facultative anaerobic diazotroph with metabolic versatility and remarkable environmental adaptability. We provide first genomic evidence for butyrate, alcohol, and carbohydrate utilization by a Chloroflexi T78 clade bacterium, and phenol carboxylation and assimilatory sulfite reduction in a Cryptanaerobacter bacterium. Genome-resolved metagenomics enriches our view on the differentiation of microbial community composition, metabolic pathways, and ecological interactions in temperature-differentiated methanogenic phenol-degrading bioreactors. These findings suggest optimization strategies for methanogenesis on phenol, such as temperature control, protection from light, feed desulfurization, and hydrogen sulfide removal from bioreactors. Moreover, decoding genome-borne properties (e.g., antibiotic, arsenic, and heavy metal resistance) of uncultured bacteria helps to bring up alternative schemes to isolate them.
GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes.

PubMed

Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru

2006-03-23

Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.
Transcriptional regulation of NAD metabolism in bacteria: genomic reconstruction of NiaR (YrxA) regulon

PubMed Central

Rodionov, Dmitry A.; Li, Xiaoqing; Rodionova, Irina A.; Yang, Chen; Sorci, Leonardo; Dervyn, Etienne; Martynowski, Dariusz; Zhang, Hong; Gelfand, Mikhail S.; Osterman, Andrei L.

2008-01-01

A comparative genomic approach was used to reconstruct transcriptional regulation of NAD biosynthesis in bacteria containing orthologs of Bacillus subtilis gene yrxA, a previously identified niacin-responsive repressor of NAD de novo synthesis. Members of YrxA family (re-named here NiaR) are broadly conserved in the Bacillus/Clostridium group and in the deeply branching Fusobacteria and Thermotogales lineages. We analyzed upstream regions of genes associated with NAD biosynthesis to identify candidate NiaR-binding DNA motifs and assess the NiaR regulon content in these species. Representatives of the two distinct types of candidate NiaR-binding sites, characteristic of the Firmicutes and Thermotogales, were verified by an electrophoretic mobility shift assay. In addition to transcriptional control of the nadABC genes, the NiaR regulon in some species extends to niacin salvage (the pncAB genes) and includes uncharacterized membrane proteins possibly involved in niacin transport. The involvement in niacin uptake proposed for one of these proteins (re-named NiaP), encoded by the B. subtilis gene yceI, was experimentally verified. In addition to bacteria, members of the NiaP family are conserved in multicellular eukaryotes, including humans, pointing to possible NiaP involvement in niacin utilization in these organisms. Overall, the analysis of the NiaR and NrtR regulons (described in the accompanying paper) revealed mechanisms of transcriptional regulation of NAD metabolism in nearly a hundred diverse bacteria. PMID:18276644
Revisiting the chlorophyll biosynthesis pathway using genome scale metabolic model of Oryza sativa japonica

PubMed Central

Chatterjee, Ankita; Kundu, Sudip

2015-01-01

Chlorophyll is one of the most important pigments present in green plants and rice is one of the major food crops consumed worldwide. We curated the existing genome scale metabolic model (GSM) of rice leaf by incorporating new compartment, reactions and transporters. We used this modified GSM to elucidate how the chlorophyll is synthesized in a leaf through a series of bio-chemical reactions spanned over different organelles using inorganic macronutrients and light energy. We predicted the essential reactions and the associated genes of chlorophyll synthesis and validated against the existing experimental evidences. Further, ammonia is known to be the preferred source of nitrogen in rice paddy fields. The ammonia entering into the plant is assimilated in the root and leaf. The focus of the present work is centered on rice leaf metabolism. We studied the relative importance of ammonia transporters through the chloroplast and the cytosol and their interlink with other intracellular transporters. Ammonia assimilation in the leaves takes place by the enzyme glutamine synthetase (GS) which is present in the cytosol (GS1) and chloroplast (GS2). Our results provided possible explanation why GS2 mutants show normal growth under minimum photorespiration and appear chlorotic when exposed to air. PMID:26443104
Wholly Rickettsia! Reconstructed Metabolic Profile of the Quintessential Bacterial Parasite of Eukaryotic Cells

PubMed Central

Driscoll, Timothy P.; Verhoeve, Victoria I.; Guillotte, Mark L.; Lehman, Stephanie S.; Rennoll, Sherri A.; Beier-Sexton, Magda; Rahman, M. Sayeedur; Azad, Abdu F.

2017-01-01

ABSTRACT Reductive genome evolution has purged many metabolic pathways from obligate intracellular Rickettsia (Alphaproteobacteria; Rickettsiaceae). While some aspects of host-dependent rickettsial metabolism have been characterized, the array of host-acquired metabolites and their cognate transporters remains unknown. This dearth of information has thwarted efforts to obtain an axenic Rickettsia culture, a major impediment to conventional genetic approaches. Using phylogenomics and computational pathway analysis, we reconstructed the Rickettsia metabolic and transport network, identifying 51 host-acquired metabolites (only 21 previously characterized) needed to compensate for degraded biosynthesis pathways. In the absence of glycolysis and the pentose phosphate pathway, cell envelope glycoconjugates are synthesized from three imported host sugars, with a range of additional host-acquired metabolites fueling the tricarboxylic acid cycle. Fatty acid and glycerophospholipid pathways also initiate from host precursors, and import of both isoprenes and terpenoids is required for the synthesis of ubiquinone and the lipid carrier of lipid I and O-antigen. Unlike metabolite-provisioning bacterial symbionts of arthropods, rickettsiae cannot synthesize B vitamins or most other cofactors, accentuating their parasitic nature. Six biosynthesis pathways contain holes (missing enzymes); similar patterns in taxonomically diverse bacteria suggest alternative enzymes that await discovery. A paucity of characterized and predicted transporters emphasizes the knowledge gap concerning how rickettsiae import host metabolites, some of which are large and not known to be transported by bacteria. Collectively, our reconstructed metabolic network offers clues to how rickettsiae hijack host metabolic pathways. This blueprint for growth determinants is an important step toward the design of axenic media to rescue rickettsiae from the eukaryotic cell. PMID:28951473
Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.

PubMed

Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; Kim, Taehyong; Banf, Michael; Chae, Lee; Dreher, Kate; Chavali, Arvind K; Nilo-Poyanco, Ricardo; Bernard, Thomas; Kahn, Daniel; Rhee, Seung Y

2017-04-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. © 2017 American Society of Plant Biologists. All Rights Reserved.
Pathgroups, a dynamic data structure for genome reconstruction problems.

PubMed

Zheng, Chunfang

2010-07-01

Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html. Contact: chunfang313@gmail.com. Supplementary data are available at Bioinformatics online.
Exploring metabolic pathway reconstruction and genome-wide expression profiling in Lactobacillus reuteri to define functional probiotic features.

PubMed

Saulnier, Delphine M; Santos, Filipe; Roos, Stefan; Mistretta, Toni-Ann; Spinler, Jennifer K; Molenaar, Douwe; Teusink, Bas; Versalovic, James

2011-04-29

The genomes of four Lactobacillus reuteri strains isolated from human breast milk and the gastrointestinal tract have been recently sequenced as part of the Human Microbiome Project. Preliminary genome comparisons suggested that these strains belong to two different clades, previously shown to differ with respect to antimicrobial production, biofilm formation, and immunomodulation. To explain possible mechanisms of survival in the host and probiosis, we completed a detailed genomic comparison of two breast milk-derived isolates representative of each group: an established probiotic strain (L. reuteri ATCC 55730) and a strain with promising probiotic features (L. reuteri ATCC PTA 6475). Transcriptomes of L. reuteri strains in different growth phases were monitored using strain-specific microarrays, and compared using a pan-metabolic model representing all known metabolic reactions present in these strains. Both strains contained candidate genes involved in the survival and persistence in the gut such as mucus-binding proteins and enzymes scavenging reactive oxygen species. A large operon predicted to encode the synthesis of an exopolysaccharide was identified in strain 55730. Both strains were predicted to produce health-promoting factors, including antimicrobial agents and vitamins (folate, vitamin B(12)). Additionally, a complete pathway for thiamine biosynthesis was predicted in strain 55730 for the first time in this species. Candidate genes responsible for immunomodulatory properties of each strain were identified by transcriptomic comparisons. The production of bioactive metabolites by human-derived probiotics may be predicted using metabolic modeling and transcriptomics. Such strategies may facilitate selection and optimization of probiotics for health promotion, disease prevention and amelioration.
EnzDP: improved enzyme annotation for metabolic network reconstruction based on domain composition profiles.

PubMed

Nguyen, Nam-Ninh; Srihari, Sriganesh; Leong, Hon Wai; Chong, Ket-Fah

2015-10-01

Determining the entire complement of enzymes and their enzymatic functions is a fundamental step for reconstructing the metabolic network of cells. High quality enzyme annotation helps in enhancing metabolic networks reconstructed from the genome, especially by reducing gaps and increasing the enzyme coverage. Currently, structure-based and network-based approaches can only cover a limited number of enzyme families, and the accuracy of homology-based approaches can be further improved. The bottom-up homology-based approach improves the coverage by rebuilding Hidden Markov Model (HMM) profiles for all known enzymes. However, its clustering procedure relies firmly on the BLAST similarity score, ignoring protein domains/patterns, and is sensitive to changes in cut-off thresholds. Here, we use functional domain architecture to score the association between domain families and enzyme families (Domain-Enzyme Association Scoring, DEAS). The DEAS score is used to calculate the similarity between proteins, which is then used in the clustering procedure instead of a sequence similarity score. We improve the enzyme annotation protocol using a stringent classification procedure, and by choosing optimal threshold settings and checking for active sites. Our analysis shows that our stringent protocol EnzDP can cover up to 90% of enzyme families available in Swiss-Prot. It achieves a high accuracy of 94.5% based on five-fold cross-validation. EnzDP outperforms existing methods across several testing scenarios. Thus, EnzDP serves as a reliable automated tool for enzyme annotation and metabolic network reconstruction. Available at: www.comp.nus.edu.sg/~nguyennn/EnzDP.
Metabolic Reconstruction of Setaria italica: A Systems Biology Approach for Integrating Tissue-Specific Omics and Pathway Analysis of Bioenergy Grasses.

PubMed

de Oliveira Dal'Molin, Cristiana G; Orellana, Camila; Gebbie, Leigh; Steen, Jennifer; Hodson, Mark P; Chrysanthopoulos, Panagiotis; Plan, Manuel R; McQualter, Richard; Palfreyman, Robin W; Nielsen, Lars K

2016-01-01

The urgent need for major gains in industrial crop productivity and in biofuel production from bioenergy grasses has reinforced attention on understanding C4 photosynthesis. Systems biology studies of C4 model plants may reveal important features of C4 metabolism. Here we chose foxtail millet (Setaria italica) as a C4 model plant and developed protocols to perform systems biology studies. As part of the systems approach, we have developed and used a genome-scale metabolic reconstruction in combination with multi-omics technologies to gain more insight into the metabolism of S. italica. mRNA, protein, and metabolite abundances were measured in mature and immature stem/leaf phytomers, and the multi-omics data were integrated into the metabolic reconstruction framework to capture key metabolic features in different developmental stages of the plant. RNA-Seq reads were mapped to the S. italica genome, resulting in 83% coverage of the protein-coding genes of S. italica. Besides revealing similarities and differences in central metabolism of mature and immature tissues, transcriptome analysis indicates significant gene expression of two malic enzyme isoforms (NADP-ME and NAD-ME). Although much greater expression levels of NADP-ME genes are observed and confirmed by the corresponding protein abundances in the samples, the expression of multiple genes combined with the significant abundance of metabolites that participate in C4 metabolism of NAD-ME and NADP-ME subtypes suggests that S. italica may use mixed decarboxylation modes of C4 photosynthetic pathways under different plant developmental stages. The overall analysis also indicates different levels of regulation in mature and immature tissues in carbon fixation, glycolysis, TCA cycle, amino acids, fatty acids, lignin, and cellulose syntheses. Altogether, the multi-omics analysis reveals different biological entities and their interrelation and regulation over plant development. With this study, we demonstrated
Metabolic Reconstruction of Setaria italica: A Systems Biology Approach for Integrating Tissue-Specific Omics and Pathway Analysis of Bioenergy Grasses

PubMed Central

de Oliveira Dal'Molin, Cristiana G.; Orellana, Camila; Gebbie, Leigh; Steen, Jennifer; Hodson, Mark P.; Chrysanthopoulos, Panagiotis; Plan, Manuel R.; McQualter, Richard; Palfreyman, Robin W.; Nielsen, Lars K.

2016-01-01

The urgent need for major gains in industrial crop productivity and in biofuel production from bioenergy grasses has reinforced attention on understanding C4 photosynthesis. Systems biology studies of C4 model plants may reveal important features of C4 metabolism. Here we chose foxtail millet (Setaria italica) as a C4 model plant and developed protocols to perform systems biology studies. As part of the systems approach, we have developed and used a genome-scale metabolic reconstruction in combination with multi-omics technologies to gain more insight into the metabolism of S. italica. mRNA, protein, and metabolite abundances were measured in mature and immature stem/leaf phytomers, and the multi-omics data were integrated into the metabolic reconstruction framework to capture key metabolic features in different developmental stages of the plant. RNA-Seq reads were mapped to the S. italica genome, resulting in 83% coverage of the protein-coding genes of S. italica. Besides revealing similarities and differences in central metabolism of mature and immature tissues, transcriptome analysis indicates significant gene expression of two malic enzyme isoforms (NADP-ME and NAD-ME). Although much greater expression levels of NADP-ME genes are observed and confirmed by the corresponding protein abundances in the samples, the expression of multiple genes combined with the significant abundance of metabolites that participate in C4 metabolism of NAD-ME and NADP-ME subtypes suggests that S. italica may use mixed decarboxylation modes of C4 photosynthetic pathways under different plant developmental stages. The overall analysis also indicates different levels of regulation in mature and immature tissues in carbon fixation, glycolysis, TCA cycle, amino acids, fatty acids, lignin, and cellulose syntheses. Altogether, the multi-omics analysis reveals different biological entities and their interrelation and regulation over plant development. With this study, we demonstrated
Reconstructing the Genomic Content of Microbiome Taxa through Shotgun Metagenomic Deconvolution

PubMed Central

Carr, Rogan; Shen-Orr, Shai S.; Borenstein, Elhanan

2013-01-01

Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. With the accumulation of metagenomic
Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering.

PubMed

Garst, Andrew D; Bassalo, Marcelo C; Pines, Gur; Lynch, Sean A; Halweg-Edwards, Andrea L; Liu, Rongming; Liang, Liya; Wang, Zhiwen; Zeitoun, Ramsey; Alexander, William G; Gill, Ryan T

2017-01-01

Improvements in DNA synthesis and sequencing have underpinned comprehensive assessment of gene function in bacteria and eukaryotes. Genome-wide analyses require high-throughput methods to generate mutations and analyze their phenotypes, but approaches to date have been unable to efficiently link the effects of mutations in coding regions or promoter elements in a highly parallel fashion. We report that CRISPR-Cas9 gene editing in combination with massively parallel oligomer synthesis can enable trackable editing on a genome-wide scale. Our method, CRISPR-enabled trackable genome engineering (CREATE), links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype-phenotype relationships. We apply CREATE to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria. We provide preliminary evidence that CREATE will work in yeast. We also provide a webtool to design multiplex CREATE libraries.
Comprehensive analysis of a Metabolic Model for lipid production in Rhodosporidium toruloides.

PubMed

Castañeda, María Teresita; Nuñez, Sebastián; Garelli, Fabricio; Voget, Claudio; Battista, Hernán De

2018-05-19

The yeast Rhodosporidium toruloides has been extensively studied for its application in biolipid production. The knowledge of its metabolism capabilities and the application of constraint-based flux analysis methodology provide useful information for process prediction and optimization. The accuracy of the resulting predictions is highly dependent on metabolic models. A metabolic reconstruction for R. toruloides metabolism has been recently published. On the basis of this model, we developed a curated version that unblocks the central nitrogen metabolism and, in addition, completes charge and mass balances in some reactions neglected in the former model. Then, a comprehensive analysis of network capability was performed with the curated model and compared with the published metabolic reconstruction. The flux distribution obtained by lipid optimization with Flux Balance Analysis was able to replicate the internal biochemical changes that lead to lipogenesis in oleaginous microorganisms. These results motivate the development of a genome-scale model for complete elucidation of R. toruloides metabolism. Copyright © 2018 Elsevier B.V. All rights reserved.
Genome-scale analysis of anaerobic benzoate and phenol metabolism in the hyperthermophilic archaeon Ferroglobus placidus

PubMed Central

Holmes, Dawn E; Risso, Carla; Smith, Jessica A; Lovley, Derek R

2012-01-01

Insight into the mechanisms for the anaerobic metabolism of aromatic compounds by the hyperthermophilic archaeon Ferroglobus placidus is expected to improve understanding of the degradation of aromatics in hot (>80° C) environments and to identify enzymes that might have biotechnological applications. Analysis of the F. placidus genome revealed genes predicted to encode enzymes homologous to those previously identified as having a role in benzoate and phenol metabolism in mesophilic bacteria. Surprisingly, F. placidus lacks genes for an ATP-independent class II benzoyl-CoA (coenzyme A) reductase (BCR) found in all strictly anaerobic bacteria, but has instead genes coding for a bzd-type ATP-consuming class I BCR, similar to those found in facultative bacteria. The lower portion of the benzoate degradation pathway appears to be more similar to that found in the phototroph Rhodopseudomonas palustris, than the pathway reported for all heterotrophic anaerobic benzoate degraders. Many of the genes predicted to be involved in benzoate metabolism were found in one of two gene clusters. Genes for phenol carboxylation proceeding through a phenylphosphate intermediate were identified in a single gene cluster. Analysis of transcript abundance with a whole-genome microarray and quantitative reverse transcriptase polymerase chain reaction demonstrated that most of the genes predicted to be involved in benzoate or phenol metabolism had higher transcript abundance during growth on those substrates vs growth on acetate. These results suggest that the general strategies for benzoate and phenol metabolism are highly conserved between microorganisms living in moderate and hot environments, and that anaerobic metabolism of aromatic compounds might be analyzed in a wide range of environments with similar molecular targets. PMID:21776029

Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

PubMed Central

Zhang, Peifen; Kim, Taehyong; Banf, Michael; Chavali, Arvind K.; Nilo-Poyanco, Ricardo; Bernard, Thomas

2017-01-01

Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. PMID:28228535
Reverse engineering and analysis of large genome-scale gene networks

PubMed Central

Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

2013-01-01

Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
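The core idea in the record above, scoring a gene pair by mutual information and judging significance by permutation, can be illustrated with a minimal sketch. This is not TINGe's implementation: it assumes a simple equal-width histogram estimator in place of the B-spline formulation, it is serial rather than parallel, and the function names (discretize, mutual_information, permutation_p_value) are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, bins=4):
    """Map continuous expression values to equal-width bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant profile
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Plug-in MI estimate (in nats) from a joint histogram.

    Assumption: simple binning, not the B-spline estimator named in the abstract.
    """
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    mi = 0.0
    for (a, b), count in pxy.items():
        mi += (count / n) * math.log((count * n) / (px[a] * py[b]))
    return mi


def permutation_p_value(x, y, n_perm=200, bins=4, seed=0):
    """Return (observed MI, p-value) by shuffling y to break any dependence."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if mutual_information(x, y_perm, bins) >= observed:
            hits += 1
    # The +1 terms keep the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    gene_a = [rng.gauss(0, 1) for _ in range(200)]
    gene_b = [v + 0.1 * rng.gauss(0, 1) for v in gene_a]  # strongly coupled to gene_a
    print(permutation_p_value(gene_a, gene_b))
```

In a network setting, an edge would be kept between two genes only when the permutation p-value falls below a chosen threshold; the threshold and the number of permutations are tuning choices not specified here.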
Comparative genomics approaches to understanding and manipulating plant metabolism.

PubMed

Bradbury, Louis M T; Niehaus, Tom D; Hanson, Andrew D

2013-04-01

Over 3000 genomes, including numerous plant genomes, are now sequenced. However, their annotation remains problematic as illustrated by the many conserved genes with no assigned function, vague annotations such as 'kinase', or even wrong ones. Around 40% of genes of unknown function that are conserved between plants and microbes are probably metabolic enzymes or transporters; finding functions for these genes is a major challenge. Comparative genomics has correctly predicted functions for many such genes by analyzing genomic context, and gene fusions, distributions and co-expression. Comparative genomics complements genetic and biochemical approaches to dissect metabolism, continues to increase in power and decrease in cost, and has a pivotal role in modeling and engineering by helping identify functions for all metabolic genes. Copyright © 2012 Elsevier Ltd. All rights reserved.
Genome-enabled Modeling of Microbial Biogeochemistry using a Trait-based Approach. Does Increasing Metabolic Complexity Increase Predictive Capabilities?

NASA Astrophysics Data System (ADS)

King, E.; Karaoz, U.; Molins, S.; Bouskill, N.; Anantharaman, K.; Beller, H. R.; Banfield, J. F.; Steefel, C. I.; Brodie, E.

2015-12-01

The biogeochemical functioning of ecosystems is shaped in part by genomic information stored in the subsurface microbiome. Cultivation-independent approaches allow us to extract this information through reconstruction of thousands of genomes from a microbial community. Analysis of these genomes, in turn, gives an indication of the organisms present and their functional roles. However, metagenomic analyses can currently deliver thousands of different genomes that range in abundance/importance, requiring the identification and assimilation of key physiologies and metabolisms to be represented as traits for successful simulation of subsurface processes. Here we focus on incorporating -omics information into BioCrunch, a genome-informed trait-based model that represents the diversity of microbial functional processes within a reactive transport framework. This approach models the rate of nutrient uptake and the thermodynamics of coupled electron donors and acceptors for a range of microbial metabolisms including heterotrophs and chemolithotrophs. Metabolism of exogenous substrates fuels catabolic and anabolic processes, with the proportion of energy used for cellular maintenance, respiration, biomass development, and enzyme production based upon dynamic intracellular and environmental conditions. This internal resource partitioning represents a trade-off against biomass formation and results in microbial community emergence across a fitness landscape. BioCrunch was used here in simulations that included organisms and metabolic pathways derived from a dataset of ~1200 non-redundant genomes reflecting a microbial community in a floodplain aquifer. Metagenomic data was directly used to parameterize trait values related to growth and to identify trait linkages associated with respiration, fermentation, and key enzymatic functions such as plant polymer degradation. Simulations spanned a range of metabolic complexities and highlight benefits originating from simulations
Improving 3D Genome Reconstructions Using Orthologous and Functional Constraints

PubMed Central

Diament, Alon; Tuller, Tamir

2015-01-01

The study of the 3D architecture of chromosomes has been advancing rapidly in recent years. While a number of methods for 3D reconstruction of genomic models based on Hi-C data were proposed, most of the analyses in the field have been performed on different 3D representation forms (such as graphs). Here, we reproduce most of the previous results on the 3D genomic organization of the eukaryote Saccharomyces cerevisiae using analysis of 3D reconstructions. We show that many of these results can be reproduced in sparse reconstructions, generated from a small fraction of the experimental data (5% of the data), and study the properties of such models. Finally, we propose for the first time a novel approach for improving the accuracy of 3D reconstructions by introducing additional predicted physical interactions to the model, based on orthologous interactions in an evolutionary-related organism and based on predicted functional interactions between genes. We demonstrate that this approach indeed leads to the reconstruction of improved models. PMID:26000633
Genomic insights into the metabolic potential and interactions between marine methanotrophic ANME archaea and associated bacteria

NASA Astrophysics Data System (ADS)

Orphan, V. J.; Skennerton, C.; Chadwick, G.; Haroon, F.; Tyson, G. W.; Leu, A.; Hatzenpichler, R.; Woyke, T.; Malmstrom, R.; Yu, H.; Scheller, S.

2015-12-01

Cooperative metabolic interactions between multiple groups of methanotrophic 'ANME' archaea and sulfate-reducing bacteria represent the primary sink for methane within continental margin sediments. These syntrophic associations are frequently observed as structured multi-celled consortia in methane seeps, often comprising a substantial proportion of the microbial biomass within near seafloor seep sediments. Since their discovery nearly 15 years ago, a number of distinct ANME groups and multiple sulfate-reducing bacterial partners have been described from seep environments worldwide. Attempts to reconstruct the genomes of some ANME organisms have been reported; however, the ecological physiology and metabolic interactions of distinct ANME lineages and their bacterial partners remain poorly understood. Here, we used a fluorescence azide-alkyne click chemistry technique known as BONCAT combined with FAC sorting to examine patterns in microbial membership and the genomes of single, metabolically active ANME-bacterial consortia recovered from methane seep sediments. This targeted consortia-level sequencing approach revealed significant diversity in the ANME-bacterial associations in situ as well as insights into the potential syntrophic mechanisms underpinning these enigmatic methane-fueled partnerships.
Microbial Community Metabolic Modeling: A Community Data-Driven Network Reconstruction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Henry, Christopher S.; Bernstein, Hans C.; Weisenhorn, Pamela

Metabolic network modeling of microbial communities provides an in-depth understanding of community-wide metabolic and regulatory processes. Compared to single organism analyses, community metabolic network modeling is more complex because it needs to account for interspecies interactions. To date, most approaches focus on reconstruction of high-quality individual networks so that, when combined, they can predict community behaviors as a result of interspecies interactions. However, this conventional method becomes ineffective for communities whose members are not well characterized and cannot be experimentally interrogated in isolation. Here, we tested a new approach that uses community-level data as a critical input for the network reconstruction process. This method focuses on directly predicting interspecies metabolic interactions in a community, when axenic information is insufficient. We validated our method through the case study of a bacterial photoautotroph-heterotroph consortium that was used to provide data needed for a community-level metabolic network reconstruction. Resulting simulations provided experimentally validated predictions of how a photoautotrophic cyanobacterium supports the growth of an obligate heterotrophic species by providing organic carbon and nitrogen sources.
Genome-Enabled Modeling of Biogeochemical Processes Predicts Metabolic Dependencies that Connect the Relative Fitness of Microbial Functional Guilds

NASA Astrophysics Data System (ADS)

Brodie, E.; King, E.; Molins, S.; Karaoz, U.; Steefel, C. I.; Banfield, J. F.; Beller, H. R.; Anantharaman, K.; Ligocki, T. J.; Trebotich, D.

2015-12-01

Pore-scale processes mediated by microorganisms underlie a range of critical ecosystem services, regulating carbon stability, nutrient flux, and the purification of water. Advances in cultivation-independent approaches now provide us with the ability to reconstruct thousands of genomes from microbial populations from which functional roles may be assigned. With this capability to reveal microbial metabolic potential, the next step is to put these microbes back where they belong to interact with their natural environment, i.e. the pore scale. At this scale, microorganisms communicate, cooperate and compete across their fitness landscapes with communities emerging that feedback on the physical and chemical properties of their environment, ultimately altering the fitness landscape and selecting for new microbial communities with new properties and so on. We have developed a trait-based model of microbial activity that simulates coupled functional guilds that are parameterized with unique combinations of traits that govern fitness under dynamic conditions. Using a reactive transport framework, we simulate the thermodynamics of coupled electron donor-acceptor reactions to predict energy available for cellular maintenance, respiration, biomass development, and enzyme production. From metagenomics, we directly estimate some trait values related to growth and identify the linkage of key traits associated with respiration and fermentation, macromolecule depolymerizing enzymes, and other key functions such as nitrogen fixation. Our simulations were carried out to explore abiotic controls on community emergence such as seasonally fluctuating water table regimes across floodplain organic matter hotspots. Simulations and metagenomic/metatranscriptomic observations highlighted the many dependencies connecting the relative fitness of functional guilds and the importance of chemolithoautotrophic lifestyles. Using an X-Ray microCT-derived soil microaggregate physical model combined
Use of randomized sampling for analysis of metabolic networks.

PubMed

Schellenberger, Jan; Palsson, Bernhard Ø

2009-02-27

Genome-scale metabolic network reconstructions in microorganisms have been formulated and studied for about 8 years. The constraint-based approach has shown great promise in analyzing the systemic properties of these network reconstructions. Notably, constraint-based models have been used successfully to predict the phenotypic effects of knock-outs and for metabolic engineering. The inherent uncertainty in both parameters and variables of large-scale models is significant and is well suited to study by Monte Carlo sampling of the solution space. These techniques have been applied extensively to the reaction rate (flux) space of networks, with more recent work focusing on dynamic/kinetic properties. Monte Carlo sampling as an analysis tool has many advantages, including the ability to work with missing data, the ability to apply post-processing techniques, and the ability to quantify uncertainty and to optimize experiments to reduce uncertainty. We present an overview of this emerging area of research in systems biology.
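As a rough illustration of what sampling the flux solution space means, the sketch below draws uniform samples from the steady-state flux space of a three-reaction toy network. Everything in it is an assumption made for illustration: the network (uptake v1 feeding two branches v2 and v3, so that v1 = v2 + v3), the bounds of 0 to 10, and the use of plain rejection sampling. Genome-scale studies typically rely on more efficient samplers such as hit-and-run; rejection sampling is used here only because it is easy to verify.

```python
import random
import statistics


def sample_fluxes(n_samples, upper=10.0, seed=0):
    """Uniformly sample steady-state flux vectors (v1, v2, v3) of a toy network.

    Toy network (hypothetical): uptake v1 produces A; A is consumed by v2 and v3.
    Steady state for A requires v1 = v2 + v3, and every flux lies in [0, upper].
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        # v2 and v3 are the free fluxes; v1 is fixed by the mass balance.
        v2 = rng.uniform(0.0, upper)
        v3 = rng.uniform(0.0, upper)
        v1 = v2 + v3
        if v1 <= upper:  # reject points that violate the uptake bound
            samples.append((v1, v2, v3))
    return samples


def summarize(samples):
    """Return (mean, standard deviation) for each flux across the samples."""
    return [
        (statistics.mean(column), statistics.stdev(column))
        for column in zip(*samples)
    ]


if __name__ == "__main__":
    for name, (mean, sd) in zip(("v1", "v2", "v3"), summarize(sample_fluxes(2000))):
        print(f"{name}: mean={mean:.2f} sd={sd:.2f}")
```

The per-flux spread is the kind of quantity such sampling is used to report: it shows how tightly the constraints alone pin down each reaction rate, without choosing an objective function.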
Two-Scale 13C Metabolic Flux Analysis for Metabolic Engineering.

PubMed

Ando, David; Garcia Martin, Hector

2018-01-01

Accelerating the Design-Build-Test-Learn (DBTL) cycle in synthetic biology is critical to achieving rapid and facile bioengineering of organisms for the production of, e.g., biofuels and other chemicals. The Learn phase involves using data obtained from the Test phase to inform the next Design phase. As part of the Learn phase, mathematical models of metabolic fluxes give a mechanistic level of comprehension to cellular metabolism, isolating the principal drivers of metabolic behavior from the peripheral ones, and directing future experimental designs and engineering methodologies. Furthermore, the measurement of intracellular metabolic fluxes is specifically noteworthy as providing a rapid and easy-to-understand picture of how carbon and energy flow throughout the cell. Here, we present a detailed guide to performing metabolic flux analysis in the Learn phase of the DBTL cycle, where we show how one can take the isotope labeling data from a 13C labeling experiment and immediately turn it into a determination of cellular fluxes that points in the direction of genetic engineering strategies that will advance the metabolic engineering process. For our modeling purposes we use the Joint BioEnergy Institute (JBEI) Quantitative Metabolic Modeling (jQMM) library, which provides an open-source, python-based framework for modeling internal metabolic fluxes and making actionable predictions on how to modify cellular metabolism for specific bioengineering goals. It presents a complete toolbox for performing different types of flux analysis such as Flux Balance Analysis, 13C Metabolic Flux Analysis, and it introduces the capability to use 13C labeling experimental data to constrain comprehensive genome-scale models through a technique called two-scale 13C Metabolic Flux Analysis (2S-13C MFA) [1]. In addition to several other capabilities, the jQMM is also able to predict the effects of knockouts using the MoMA and ROOM methodologies. The use of the jQMM library is
Wholly Rickettsia! Reconstructed Metabolic Profile of the Quintessential Bacterial Parasite of Eukaryotic Cells.

PubMed

Driscoll, Timothy P; Verhoeve, Victoria I; Guillotte, Mark L; Lehman, Stephanie S; Rennoll, Sherri A; Beier-Sexton, Magda; Rahman, M Sayeedur; Azad, Abdu F; Gillespie, Joseph J

2017-09-26

Reductive genome evolution has purged many metabolic pathways from obligate intracellular Rickettsia (Alphaproteobacteria; Rickettsiaceae). While some aspects of host-dependent rickettsial metabolism have been characterized, the array of host-acquired metabolites and their cognate transporters remains unknown. This dearth of information has thwarted efforts to obtain an axenic Rickettsia culture, a major impediment to conventional genetic approaches. Using phylogenomics and computational pathway analysis, we reconstructed the Rickettsia metabolic and transport network, identifying 51 host-acquired metabolites (only 21 previously characterized) needed to compensate for degraded biosynthesis pathways. In the absence of glycolysis and the pentose phosphate pathway, cell envelope glycoconjugates are synthesized from three imported host sugars, with a range of additional host-acquired metabolites fueling the tricarboxylic acid cycle. Fatty acid and glycerophospholipid pathways also initiate from host precursors, and import of both isoprenes and terpenoids is required for the synthesis of ubiquinone and the lipid carrier of lipid I and O-antigen. Unlike metabolite-provisioning bacterial symbionts of arthropods, rickettsiae cannot synthesize B vitamins or most other cofactors, accentuating their parasitic nature. Six biosynthesis pathways contain holes (missing enzymes); similar patterns in taxonomically diverse bacteria suggest alternative enzymes that await discovery. A paucity of characterized and predicted transporters emphasizes the knowledge gap concerning how rickettsiae import host metabolites, some of which are large and not known to be transported by bacteria. Collectively, our reconstructed metabolic network offers clues to how rickettsiae hijack host metabolic pathways. This blueprint for growth determinants is an important step toward the design of axenic media to rescue rickettsiae from the eukaryotic cell. IMPORTANCE A hallmark of obligate intracellular
Using genome-scale metabolic models to compare serovars of the foodborne pathogen Listeria monocytogenes.

PubMed

Metz, Zachary P; Ding, Tong; Baumler, David J

2018-01-01

Listeria monocytogenes is a microorganism of great concern for the food industry and the cause of human foodborne disease. Therefore, novel methods of control are needed, and systems biology is one such approach to identify them. Using a combination of computational techniques and laboratory methods, genome-scale metabolic models (GEMs) can be created, validated, and used to simulate growth environments and discern metabolic capabilities of microbes of interest, including L. monocytogenes. The objective of the work presented here was to generate GEMs for six different strains of L. monocytogenes, and to both qualitatively and quantitatively validate these GEMs with experimental data to examine the diversity of metabolic capabilities of numerous strains from the three different serovar groups most associated with foodborne outbreaks and human disease. Following qualitative validation, 57 of the 95 carbon sources tested experimentally were present in the GEMs and, therefore, these were the compounds from which comparisons could be drawn. Of these 57 compounds, agreement between in silico predictions and in vitro results for carbon source utilization ranged from 80.7% to 91.2% between strains. Comparisons of nutrient utilization between in silico predictions and in vitro results were also conducted for numerous nitrogen, phosphorous, and sulfur sources. Additionally, quantitative validation showed that the L. monocytogenes GEMs were able to generate in silico predictions for growth rate and growth yield that were strongly and significantly (p < 0.0013 and p < 0.0015, respectively) correlated with experimental results. These findings are significant because they show that these GEMs for L. monocytogenes are comparable to published GEMs of other organisms for agreement between in silico predictions and in vitro results. Therefore, as with the other GEMs, namely those for Escherichia coli, Staphylococcus aureus, Vibrio vulnificus, and Salmonella spp., they can be used to
A multi-objective constraint-based approach for modeling genome-scale microbial ecosystems.

PubMed

Budinich, Marko; Bourdon, Jérémie; Larhlimi, Abdelhalim; Eveillard, Damien

2017-01-01

Interplay within microbial communities impacts ecosystems on several scales, and elucidation of the consequent effects is a difficult task in ecology. In particular, the integration of genome-scale data within quantitative models of microbial ecosystems remains elusive. This study advocates the use of constraint-based modeling to build predictive models from recent high-resolution -omics datasets. Following recent studies that have demonstrated the accuracy of constraint-based models (CBMs) for simulating single-strain metabolic networks, we sought to study microbial ecosystems as a combination of single-strain metabolic networks that exchange nutrients. This study presents two multi-objective extensions of CBMs for modeling communities: multi-objective flux balance analysis (MO-FBA) and multi-objective flux variability analysis (MO-FVA). Both methods were applied to a hot spring mat model ecosystem. As a result, multiple trade-offs between nutrients and growth rates, as well as thermodynamically favorable relative abundances at community level, were emphasized. We expect this approach to be used for integrating genomic information in microbial ecosystems. Subsequent models will provide insights about behaviors (including diversity) that take place at the ecosystem scale.
PathFinder: reconstruction and dynamic visualization of metabolic pathways.

PubMed

Goesmann, Alexander; Haubrock, Martin; Meyer, Folker; Kalinowski, Jörn; Giegerich, Robert

2002-01-01

Beyond methods for gene-wise annotation and analysis of sequenced genomes, new automated methods for functional analysis on a higher level are needed. The identification of realized metabolic pathways provides valuable information on gene expression and regulation. Detection of incomplete pathways helps to improve a constantly evolving genome annotation or discover alternative biochemical pathways. To utilize automated genome analysis on the level of metabolic pathways, new methods for the dynamic representation and visualization of pathways are needed. PathFinder is a tool for the dynamic visualization of metabolic pathways based on annotation data. Pathways are represented as directed acyclic graphs; graph layout algorithms accomplish the dynamic drawing and visualization of the metabolic maps. A more detailed analysis of the input data on the level of biochemical pathways helps to identify genes and detect improper parts of annotations. As a Relational Database Management System (RDBMS) based internet application, PathFinder reads a list of EC-numbers or a given annotation in EMBL- or Genbank-format and dynamically generates pathway graphs.
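The representation described above, a pathway as a directed acyclic graph whose steps are matched against a list of EC numbers, can be sketched in a few lines. This is an illustrative assumption about the data structure, not PathFinder's code: the edge-list layout and the function names (metabolite_order, missing_steps) are hypothetical, and the example pathway is the first four steps of glycolysis with their standard EC numbers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step is (substrate, product, EC number of the catalysing enzyme).
PATHWAY = [
    ("glucose", "glucose-6-phosphate", "2.7.1.1"),                       # hexokinase
    ("glucose-6-phosphate", "fructose-6-phosphate", "5.3.1.9"),          # glucose-6-phosphate isomerase
    ("fructose-6-phosphate", "fructose-1,6-bisphosphate", "2.7.1.11"),   # 6-phosphofructokinase
    ("fructose-1,6-bisphosphate", "glyceraldehyde-3-phosphate", "4.1.2.13"),  # aldolase
]


def metabolite_order(pathway):
    """Order metabolites so that every substrate precedes its products."""
    sorter = TopologicalSorter()
    for substrate, product, _ec in pathway:
        sorter.add(product, substrate)  # the product depends on the substrate
    return list(sorter.static_order())  # raises CycleError if the graph has a cycle


def missing_steps(pathway, annotated_ecs):
    """Return the steps whose EC number is absent from the genome annotation."""
    return [step for step in pathway if step[2] not in annotated_ecs]


if __name__ == "__main__":
    annotation = {"2.7.1.1", "5.3.1.9", "4.1.2.13"}  # hypothetical annotation lacking one enzyme
    print(metabolite_order(PATHWAY))
    print(missing_steps(PATHWAY, annotation))
```

A step reported as missing flags either a genuinely absent reaction or an annotation gap worth re-examining, which is the use case the abstract describes for detecting incomplete pathways.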
Parallel Mutual Information Based Construction of Genome-Scale Networks on the Intel® Xeon Phi™ Coprocessor.

PubMed

Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas

2015-01-01

Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can, require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first ever inference of a plant whole genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.
Functional phylogenomics analysis of bacteria and archaea using consistent genome annotation with UniFam

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chai, Juanjuan; Kora, Guruprasad; Ahn, Tae-Hyuk

2014-10-09

To supply some background, phylogenetic studies have provided detailed knowledge on the evolutionary mechanisms of genes and species in Bacteria and Archaea. However, the evolution of cellular functions, represented by metabolic pathways and biological processes, has not been systematically characterized. Many clades in the prokaryotic tree of life have now been covered by sequenced genomes in GenBank. This enables a large-scale functional phylogenomics study of many computationally inferred cellular functions across all sequenced prokaryotes. Our results show a total of 14,727 GenBank prokaryotic genomes were re-annotated using a new protein family database, UniFam, to obtain consistent functional annotations for accurate comparison. The functional profile of a genome was represented by the biological process Gene Ontology (GO) terms in its annotation. The GO term enrichment analysis differentiated the functional profiles between selected archaeal taxa. 706 prokaryotic metabolic pathways were inferred from these genomes using Pathway Tools and MetaCyc. The consistency between the distribution of metabolic pathways in the genomes and the phylogenetic tree of the genomes was measured using parsimony scores and retention indices. The ancestral functional profiles at the internal nodes of the phylogenetic tree were reconstructed to track the gains and losses of metabolic pathways in evolutionary history. In conclusion, our functional phylogenomics analysis shows divergent functional profiles of taxa and clades. Such function-phylogeny correlation stems from a set of clade-specific cellular functions with low parsimony scores. On the other hand, many cellular functions are sparsely dispersed across many clades with high parsimony scores. These different types of cellular functions have distinct evolutionary patterns reconstructed from the prokaryotic tree.
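To make the parsimony score and retention index mentioned above concrete, the following sketch scores a presence/absence pattern of one pathway on a small tree. It is an illustration under stated assumptions, not the authors' pipeline: it uses the Fitch algorithm on a rooted binary tree written as nested tuples, the four-taxon tree and the function names are hypothetical, and the retention index is computed as (g - s) / (g - m), where s is the parsimony score, m the minimum possible number of changes and g the maximum.

```python
def fitch_score(tree, states):
    """Return (candidate state set, number of changes) for a nested-tuple binary tree.

    Leaves are taxon names (str); states maps each leaf to 0 (absent) or 1 (present).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_steps = fitch_score(tree[0], states)
    right_set, right_steps = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # Disjoint child sets imply one gain or loss on this part of the tree.
    return left_set | right_set, left_steps + right_steps + 1


def retention_index(tree, states):
    """Retention index of a binary character, or None if it is uninformative."""
    _, s = fitch_score(tree, states)
    present = sum(states.values())
    g = min(present, len(states) - present)  # most changes any tree could need
    m = 1 if g > 0 else 0                    # fewest changes any tree could need
    if g == m:
        return None
    return (g - s) / (g - m)


if __name__ == "__main__":
    tree = (("A", "B"), ("C", "D"))
    clade_specific = {"A": 1, "B": 1, "C": 0, "D": 0}
    scattered = {"A": 1, "B": 0, "C": 1, "D": 0}
    for name, pattern in (("clade-specific", clade_specific), ("scattered", scattered)):
        print(name, fitch_score(tree, pattern)[1], retention_index(tree, pattern))
```

A pathway confined to one clade needs few changes on the tree (low parsimony score, high retention index), whereas a pathway scattered across clades needs many, which is the contrast the abstract draws between the two types of cellular functions.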
Metabolic Reconstruction and Modeling Microbial Electrosynthesis.

PubMed

Marshall, Christopher W; Ross, Daniel E; Handley, Kim M; Weisenhorn, Pamela B; Edirisinghe, Janaka N; Henry, Christopher S; Gilbert, Jack A; May, Harold D; Norman, R Sean

2017-08-21

Microbial electrosynthesis is a renewable energy and chemical production platform that relies on microbial cells to capture electrons from a cathode and fix carbon. Yet despite the promise of this technology, the metabolic capacity of the microbes that inhabit the electrode surface and catalyze electron transfer in these systems remains largely unknown. We assembled thirteen draft genomes from a microbial electrosynthesis system producing primarily acetate from carbon dioxide, and their transcriptional activity was mapped to genomes from cells on the electrode surface and in the supernatant. This allowed us to create a metabolic model of the predominant community members belonging to Acetobacterium, Sulfurospirillum, and Desulfovibrio. According to the model, the Acetobacterium was the primary carbon fixer, and a keystone member of the community. Transcripts of soluble hydrogenases and ferredoxins from Acetobacterium and hydrogenases, formate dehydrogenase, and cytochromes of Desulfovibrio were found in high abundance near the electrode surface. Cytochrome c oxidases of facultative members of the community were highly expressed in the supernatant despite completely sealed reactors and constant flushing with anaerobic gases. These molecular discoveries and metabolic modeling now serve as a foundation for future examination and development of electrosynthetic microbial communities.
Genome resolved analysis of a premature infant gut microbial community reveals a Varibaculum cambriense genome and a shift towards fermentation-based metabolism during the third week of life.

PubMed

Brown, Christopher T; Sharon, Itai; Thomas, Brian C; Castelle, Cindy J; Morowitz, Michael J; Banfield, Jillian F

2013-12-17

The premature infant gut has low individual but high inter-individual microbial diversity compared with adults. Based on prior 16S rRNA gene surveys, many species from this environment are expected to be similar to those previously detected in the human microbiota. However, the level of genomic novelty and metabolic variation of strains found in the infant gut remains relatively unexplored. To study the stability and function of early microbial colonizers of the premature infant gut, nine stool samples were taken during the third week of life of a premature male infant delivered via Caesarean section. Metagenomic sequences were assembled and binned into near-complete and partial genomes, enabling strain-level genomic analysis of the microbial community. We reconstructed eleven near-complete and six partial bacterial genomes representative of the key members of the microbial community. Twelve of these genomes share >90% putative ortholog amino acid identity with reference genomes. Manual curation of the assembly of one particularly novel genome resulted in the first essentially complete genome sequence (in three pieces, the order of which could not be determined due to a repeat) for Varibaculum cambriense (strain Dora), a medically relevant species that has been implicated in abscess formation. During the period studied, the microbial community undergoes a compositional shift, in which obligate anaerobes (fermenters) overtake Escherichia coli as the most abundant species. Other species remain stable, probably due to their ability to either respire anaerobically or grow by fermentation, and their capacity to tolerate fluctuating levels of oxygen. Metabolic predictions for V. cambriense suggest that, like other members of the microbial community, this organism is able to process various sugar substrates and make use of multiple different electron acceptors during anaerobic respiration. Genome comparisons within the family Actinomycetaceae reveal important differences
Genome resolved analysis of a premature infant gut microbial community reveals a Varibaculum cambriense genome and a shift towards fermentation-based metabolism during the third week of life

PubMed Central

2013-01-01

Background The premature infant gut has low individual but high inter-individual microbial diversity compared with adults. Based on prior 16S rRNA gene surveys, many species from this environment are expected to be similar to those previously detected in the human microbiota. However, the level of genomic novelty and metabolic variation of strains found in the infant gut remains relatively unexplored. Results To study the stability and function of early microbial colonizers of the premature infant gut, nine stool samples were taken during the third week of life of a premature male infant delivered via Caesarean section. Metagenomic sequences were assembled and binned into near-complete and partial genomes, enabling strain-level genomic analysis of the microbial community. We reconstructed eleven near-complete and six partial bacterial genomes representative of the key members of the microbial community. Twelve of these genomes share >90% putative ortholog amino acid identity with reference genomes. Manual curation of the assembly of one particularly novel genome resulted in the first essentially complete genome sequence (in three pieces, the order of which could not be determined due to a repeat) for Varibaculum cambriense (strain Dora), a medically relevant species that has been implicated in abscess formation. During the period studied, the microbial community undergoes a compositional shift, in which obligate anaerobes (fermenters) overtake Escherichia coli as the most abundant species. Other species remain stable, probably due to their ability to either respire anaerobically or grow by fermentation, and their capacity to tolerate fluctuating levels of oxygen. Metabolic predictions for V. cambriense suggest that, like other members of the microbial community, this organism is able to process various sugar substrates and make use of multiple different electron acceptors during anaerobic respiration. Genome comparisons within the family Actinomycetaceae reveal
Genome Sequence of “Candidatus Walczuchella monophlebidarum” the Flavobacterial Endosymbiont of Llaveia axin axin (Hemiptera: Coccoidea: Monophlebidae)

PubMed Central

Rosas-Pérez, Tania; Rosenblueth, Mónica; Rincón-Rosales, Reiner; Mora, Jaime; Martínez-Romero, Esperanza

2014-01-01

Scale insects (Hemiptera: Coccoidea) constitute a very diverse group of sap-feeding insects with a large diversity of symbiotic associations with bacteria. Here, we present the complete genome sequence, metabolic reconstruction, and comparative genomics of the flavobacterial endosymbiont of the giant scale insect Llaveia axin axin. The gene repertoire of its 309,299 bp genome was similar to that of other flavobacterial insect endosymbionts though not syntenic. According to its genetic content, essential amino acid biosynthesis is likely to be the flavobacterial endosymbiont's principal contribution to the symbiotic association with its insect host. We also report the presence of a γ-proteobacterial symbiont that may be involved in waste nitrogen recycling and also has amino acid biosynthetic capabilities that may provide metabolic precursors to the flavobacterial endosymbiont. We propose “Candidatus Walczuchella monophlebidarum” as the name of the flavobacterial endosymbiont of insects from the Monophlebidae family. PMID:24610838

Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities

PubMed Central

2011-01-01

Background Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. Results This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by
Metabolic Potential of Microbial Genomes Reconstructed from a Deep-Sea Oligotrophic Sediment Metagenome

NASA Astrophysics Data System (ADS)

Tully, B. J.; Huber, J. A.; Heidelberg, J. F.

2016-02-01

The South Pacific Gyre (SPG) possesses the lowest rates of sedimentation, surface chlorophyll concentration and primary productivity in the global oceans, making it one of the most oligotrophic environments on earth. As a direct result of the low-standing biomass in surface waters, deep-sea sediments are thin and contain small amounts of labile organic carbon. It was recently shown that the sediment column within the SPG is fully oxic through to the underlying basalt basement and may be representative of 9-37% of the global marine environment. In addition, it appears that approximately 50% of the total organic carbon is removed from the oligotrophic sediments within the first 20 centimeters beneath the sea floor (cmbsf). To understand the microbial processes that contribute to the removal of the labile organic matter, metagenomic sequencing and analysis were carried out on a sample of sediment collected from 0-5 cmbsf from SPG site 10 (U1369). Analysis of 9 partially reconstructed environmental genomes revealed that the members of the SPG surface sediment microbial community are phylogenetically distinct from surface/upper ocean organisms, with deep branches within the Alpha- and Gammaproteobacteria, Nitrospirae, Nitrospina, the phylum NC10, and several unique phylogenetic groups. Within these partially complete genomes there is evidence for microbially mediated metal (iron/manganese) oxidation and carbon fixation linked to nitrification. Additionally, despite low sedimentation and hypothesized energy-limitation, members of the SPG microbial community had motility and chemotactic genes and possessed mechanisms for the utilization of high molecular weight organic matter, including exoproteases and peptide specific membrane transporters. Simultaneously, the SPG genomes showed a limited potential for the degradation of recalcitrant carbon compounds. Finally, the presence of putative genes with functions involved with denitrification and the consumption of C1
Integrated Approach to Reconstruction of Microbial Regulatory Networks

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodionov, Dmitry A; Novichkov, Pavel S

2013-11-04

The goal of this project was to develop an integrated bioinformatics platform for genome-scale inference and visualization of transcriptional regulatory networks (TRNs) in bacterial genomes. The work was done at Sanford-Burnham Medical Research Institute (SBMRI, P.I. D.A. Rodionov) and Lawrence Berkeley National Laboratory (LBNL, co-P.I. P.S. Novichkov). The developed computational resources include: (1) the RegPredict web platform for TRN inference and regulon reconstruction in microbial genomes, and (2) the RegPrecise database for collection, visualization and comparative analysis of transcriptional regulons reconstructed by comparative genomics. These analytical resources were selected as key components in the DOE Systems Biology KnowledgeBase (SBKB). The high-quality data accumulated in RegPrecise will provide essential datasets of reference regulons in diverse microbes to enable automatic reconstruction of draft TRNs in newly sequenced genomes. We outline our progress toward the three aims of this grant proposal, which were: develop an integrated platform for genome-scale regulon reconstruction; infer regulatory annotations in several groups of bacteria and build reference collections of microbial regulons; and develop a KnowledgeBase on microbial transcriptional regulation.
Sugar Lego: gene composition of bacterial carbohydrate metabolism genomic loci.

PubMed

Kaznadzey, Anna; Shelyakin, Pavel; Gelfand, Mikhail S

2017-11-25

Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but it is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism. We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized in bacterial genomes with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. At the same time, in most cases such co-localization does not originate from local duplication events. Overall, we describe a complex web formed by evolutionary relationships of bacterial
Large-scale genome-wide association studies in East Asians identify new genetic loci influencing metabolic traits.

PubMed

Kim, Young Jin; Go, Min Jin; Hu, Cheng; Hong, Chang Bum; Kim, Yun Kyoung; Lee, Ji Young; Hwang, Joo-Yeon; Oh, Ji Hee; Kim, Dong-Joon; Kim, Nam Hee; Kim, Soeui; Hong, Eun Jung; Kim, Ji-Hyun; Min, Haesook; Kim, Yeonjung; Zhang, Rong; Jia, Weiping; Okada, Yukinori; Takahashi, Atsushi; Kubo, Michiaki; Tanaka, Toshihiro; Kamatani, Naoyuki; Matsuda, Koichi; Park, Taesung; Oh, Bermseok; Kimm, Kuchan; Kang, Daehee; Shin, Chol; Cho, Nam H; Kim, Hyung-Lae; Han, Bok-Ghee; Lee, Jong-Young; Cho, Yoon Shin

2011-09-11

To identify the genetic bases for nine metabolic traits, we conducted a meta-analysis combining Korean genome-wide association results from the KARE project (n = 8,842) and the HEXA shared control study (n = 3,703). We verified the associations of the loci selected from the discovery meta-analysis in the replication stage (30,395 individuals from the BioBank Japan genome-wide association study and individuals comprising the Health2 and Shanghai Jiao Tong University Diabetes cohorts). We identified ten genome-wide significant signals newly associated with traits from an overall meta-analysis. The most compelling associations involved 12q24.11 (near MYL2) and 12q24.13 (in C12orf51) for high-density lipoprotein cholesterol, 2p21 (near SIX2-SIX3) for fasting plasma glucose, 19q13.33 (in RPS11) and 6q22.33 (in RSPO3) for renal traits, and 12q24.11 (near MYL2), 12q24.13 (in C12orf51 and near OAS1), 4q31.22 (in ZNF827) and 7q11.23 (near TBL2-BCL7B) for hepatic traits. These findings highlight previously unknown biological pathways for metabolic traits investigated in this study.
Systematic Construction of Kinetic Models from Genome-Scale Metabolic Networks

PubMed Central

Smallbone, Kieran; Klipp, Edda; Mendes, Pedro; Liebermeister, Wolfram

2013-01-01

The quantitative effects of environmental and genetic perturbations on metabolism can be studied in silico using kinetic models. We present a strategy for large-scale model construction based on a logical layering of data such as reaction fluxes, metabolite concentrations, and kinetic constants. The resulting models contain realistic standard rate laws and plausible parameters, adhere to the laws of thermodynamics, and reproduce a predefined steady state. These features have not been simultaneously achieved by previous workflows. We demonstrate the advantages and limitations of the workflow by translating the yeast consensus metabolic network into a kinetic model. Despite crudely selected data, the model shows realistic control behaviour, a stable dynamic, and realistic response to perturbations in extracellular glucose concentrations. The paper concludes by outlining how new data can continuously be fed into the workflow and how iterative model building can assist in directing experiments. PMID:24324546
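The abstract above asks that rate laws respect thermodynamics and reproduce a predefined steady state. The actual rate laws of the workflow are not given here, so the following is only a minimal sketch that swaps in a reversible mass-action law, v = k_f (s - p / K_eq), for a single reaction S <-> P. The function names and all numbers are hypothetical.

```python
def calibrate_rate_constant(v_ref, s_ref, p_ref, k_eq):
    """Choose k_f so that the rate law returns v_ref at the reference state.

    The rate law v = k_f * (s - p / k_eq) vanishes at equilibrium
    (p / s == k_eq), so it cannot drive flux against the thermodynamic
    driving force. A reference flux pointing the wrong way is rejected.
    """
    driving_force = s_ref - p_ref / k_eq
    if driving_force * v_ref <= 0:
        raise ValueError("reference flux direction is thermodynamically infeasible")
    return v_ref / driving_force


def rate(k_f, s, p, k_eq):
    """Reversible mass-action rate for S <-> P."""
    return k_f * (s - p / k_eq)


# Hypothetical reference state: flux 2.0, [S] = 1.0, [P] = 0.5, K_eq = 10.
k_f = calibrate_rate_constant(v_ref=2.0, s_ref=1.0, p_ref=0.5, k_eq=10.0)
print(rate(k_f, 1.0, 0.5, 10.0))   # flux at the reference state
print(rate(k_f, 1.0, 10.0, 10.0))  # flux at equilibrium
```

The design choice is to fix the equilibrium constant first and fit only the remaining free parameter to the reference flux, so thermodynamic consistency holds by construction instead of being checked afterwards.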
Reconciled rat and human metabolic networks for comparative toxicogenomics and biomarker predictions

PubMed Central

Blais, Edik M.; Rawls, Kristopher D.; Dougherty, Bonnie V.; Li, Zhuo I.; Kolling, Glynis L.; Ye, Ping; Wallqvist, Anders; Papin, Jason A.

2017-01-01

The laboratory rat has been used as a surrogate to study human biology for more than a century. Here we present the first genome-scale network reconstruction of Rattus norvegicus metabolism, iRno, and a significantly improved reconstruction of human metabolism, iHsa. These curated models comprehensively capture metabolic features known to distinguish rats from humans including vitamin C and bile acid synthesis pathways. After reconciling network differences between iRno and iHsa, we integrate toxicogenomics data from rat and human hepatocytes, to generate biomarker predictions in response to 76 drugs. We validate comparative predictions for xanthine derivatives with new experimental data and literature-based evidence delineating metabolite biomarkers unique to humans. Our results provide mechanistic insights into species-specific metabolism and facilitate the selection of biomarkers consistent with rat and human biology. These models can serve as powerful computational platforms for contextualizing experimental data and making functional predictions for clinical and basic science applications. PMID:28176778
Bayesian reconstruction of transmission within outbreaks using genomic variants.

PubMed

De Maio, Nicola; Worby, Colin J; Wilson, Daniel J; Stoesser, Nicole

2018-04-01

Pathogen genome sequencing can reveal details of transmission histories and is a powerful tool in the fight against infectious disease. In particular, within-host pathogen genomic variants identified through heterozygous nucleotide base calls are a potential source of information to identify linked cases and infer direction and time of transmission. However, using such data effectively to model disease transmission presents a number of challenges, including differentiating genuine variants from those observed due to sequencing error, as well as the specification of a realistic model for within-host pathogen population dynamics. Here we propose a new Bayesian approach to transmission inference, BadTrIP (BAyesian epiDemiological TRansmission Inference from Polymorphisms), that explicitly models evolution of pathogen populations in an outbreak, transmission (including transmission bottlenecks), and sequencing error. BadTrIP enables the inference of host-to-host transmission from pathogen sequencing data and epidemiological data. By assuming that genomic variants are unlinked, our method does not require the computationally intensive and unreliable reconstruction of individual haplotypes. Using simulations we show that BadTrIP is robust in most scenarios and can accurately infer transmission events by efficiently combining information from genetic and epidemiological sources; thanks to its realistic model of pathogen evolution and the inclusion of epidemiological data, BadTrIP is also more accurate than existing approaches. BadTrIP is distributed as an open source package (https://bitbucket.org/nicofmay/badtrip) for the phylogenetic software BEAST2. We apply our method to reconstruct transmission history at the early stages of the 2014 Ebola outbreak, showcasing the power of within-host genomic variants to reconstruct transmission events.
Metabolic Reconstruction and Modeling Microbial Electrosynthesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marshall, Christopher W.; Ross, Daniel E.; Handley, Kim M.

Microbial electrosynthesis is a renewable energy and chemical production platform that relies on microbial cells to capture electrons from a cathode and fix carbon. Yet despite the promise of this technology, the metabolic capacity of the microbes that inhabit the electrode surface and catalyze electron transfer in these systems remains largely unknown. Here, we assembled thirteen draft genomes from a microbial electrosynthesis system producing primarily acetate from carbon dioxide, and their transcriptional activity was mapped to genomes from cells on the electrode surface and in the supernatant. This allowed us to create a metabolic model of the predominant community members belonging to Acetobacterium, Sulfurospirillum, and Desulfovibrio. According to the model, the Acetobacterium was the primary carbon fixer, and a keystone member of the community. Transcripts of soluble hydrogenases and ferredoxins from Acetobacterium and hydrogenases, formate dehydrogenase, and cytochromes of Desulfovibrio were found in high abundance near the electrode surface. Cytochrome c oxidases of facultative members of the community were highly expressed in the supernatant despite completely sealed reactors and constant flushing with anaerobic gases. The resulting molecular discoveries and metabolic modeling now serve as a foundation for future examination and development of electrosynthetic microbial communities.
Metabolic Reconstruction and Modeling Microbial Electrosynthesis

DOE PAGES

Marshall, Christopher W.; Ross, Daniel E.; Handley, Kim M.; ...

2017-08-21

Microbial electrosynthesis is a renewable energy and chemical production platform that relies on microbial cells to capture electrons from a cathode and fix carbon. Yet despite the promise of this technology, the metabolic capacity of the microbes that inhabit the electrode surface and catalyze electron transfer in these systems remains largely unknown. Here, we assembled thirteen draft genomes from a microbial electrosynthesis system producing primarily acetate from carbon dioxide, and their transcriptional activity was mapped to genomes from cells on the electrode surface and in the supernatant. This allowed us to create a metabolic model of the predominant community members belonging to Acetobacterium, Sulfurospirillum, and Desulfovibrio. According to the model, the Acetobacterium was the primary carbon fixer, and a keystone member of the community. Transcripts of soluble hydrogenases and ferredoxins from Acetobacterium and hydrogenases, formate dehydrogenase, and cytochromes of Desulfovibrio were found in high abundance near the electrode surface. Cytochrome c oxidases of facultative members of the community were highly expressed in the supernatant despite completely sealed reactors and constant flushing with anaerobic gases. The resulting molecular discoveries and metabolic modeling now serve as a foundation for future examination and development of electrosynthetic microbial communities.
Joint scaling laws in functional and evolutionary categories in prokaryotic genomes

PubMed Central

Grilli, J.; Bassetti, B.; Maslov, S.; Cosentino Lagomarsino, M.

2012-01-01

We propose and study a class-expansion/innovation/loss model of genome evolution taking into account biological roles of genes and their constituent domains. In our model, numbers of genes in different functional categories are coupled to each other. For example, an increase in the number of metabolic enzymes in a genome is usually accompanied by addition of new transcription factors regulating these enzymes. Such coupling can be thought of as a proportional ‘recipe’ for genome composition of the type ‘a spoonful of sugar for each egg yolk’. The model jointly reproduces two known empirical laws: the distribution of family sizes and the non-linear scaling of the number of genes in certain functional categories (e.g. transcription factors) with genome size. In addition, it allows us to derive a novel relation between the exponents characterizing these two scaling laws, establishing a direct quantitative connection between evolutionary and functional categories. It predicts that functional categories that grow faster than linearly with genome size are characterized by flatter-than-average family size distributions. This relation is confirmed by our bioinformatics analysis of prokaryotic genomes. This shows that the joint quantitative trends of functional and evolutionary classes can be understood in terms of evolutionary growth with proportional recipes. PMID:21937509
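The non-linear scaling of a functional category with genome size, as described in the abstract above, is usually summarized by a power-law exponent. As a minimal sketch, and assuming a simple least-squares fit in log-log space (the abstract does not state the fitting procedure), the exponent can be estimated as follows. The genome sizes and gene counts are synthetic, not taken from the paper.

```python
import math


def fit_power_law(sizes, counts):
    """Fit counts ~ prefactor * sizes**exponent by least squares on logs.

    Returns (exponent, prefactor).
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic data: total gene number per genome and two functional categories.
genome_sizes = [1000, 2000, 4000, 8000]
transcription_factors = [1e-5 * g ** 2 for g in genome_sizes]  # quadratic by construction
metabolic_enzymes = [0.2 * g for g in genome_sizes]            # linear by construction

for label, counts in [("transcription factors", transcription_factors),
                      ("metabolic enzymes", metabolic_enzymes)]:
    exponent, prefactor = fit_power_law(genome_sizes, counts)
    print(f"{label}: exponent ~ {exponent:.2f}")
```

An exponent above 1 marks a category that grows faster than linearly with genome size, which is the class the model predicts to have flatter family size distributions.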
Enabling comparative modeling of closely related genomes: Example genus Brucella

DOE PAGES

Faria, José P.; Edirisinghe, Janaka N.; Davis, James J.; ...

2014-03-08

For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this study, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from being trivial due to annotation inconsistencies. We are proposing a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella, which were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration.
Enabling comparative modeling of closely related genomes: Example genus Brucella

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faria, José P.; Edirisinghe, Janaka N.; Davis, James J.

For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this study, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from being trivial due to annotation inconsistencies. We are proposing a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella, which were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the prediction of accurate phenotypes such as pathogenicity, media requirements or type of respiration.
Ontogenetic scaling of metabolism, growth, and assimilation: testing metabolic scaling theory with Manduca sexta larvae.

PubMed

Sears, Katie E; Kerkhoff, Andrew J; Messerman, Arianne; Itagaki, Haruhiko

2012-01-01

Metabolism, growth, and the assimilation of energy and materials are essential processes that are intricately related and depend heavily on animal size. However, models that relate the ontogenetic scaling of energy assimilation and metabolism to growth rely on assumptions that have yet to be rigorously tested. Based on detailed daily measurements of metabolism, growth, and assimilation in tobacco hornworms, Manduca sexta, we provide a first experimental test of the core assumptions of a metabolic scaling model of ontogenetic growth. Metabolic scaling parameters changed over development, in violation of the model assumptions. At the same time, the scaling of growth rate matches that of metabolic rate, with similar scaling exponents both across and within developmental instars. Rates of assimilation were much higher than expected during the first two instars and did not match the patterns of scaling of growth and metabolism, which suggests high costs of biosynthesis early in development. The rapid increase in size and discrete instars observed in larval insect development provide an ideal system for understanding how patterns of growth and metabolism emerge from fundamental cellular processes and the exchange of materials and energy between an organism and its environment.
Genome-driven evolutionary game theory helps understand the rise of metabolic interdependencies in microbial communities.

PubMed

Zomorrodi, Ali R; Segrè, Daniel

2017-11-16

Metabolite exchanges in microbial communities give rise to ecological interactions that govern ecosystem diversity and stability. It is unclear, however, how the rise of these interactions varies across metabolites and organisms. Here we address this question by integrating genome-scale models of metabolism with evolutionary game theory. Specifically, we use microbial fitness values estimated by metabolic models to infer evolutionarily stable interactions in multi-species microbial "games". We first validate our approach using a well-characterized yeast cheater-cooperator system. We next perform over 80,000 in silico experiments to infer how metabolic interdependencies mediated by amino acid leakage in Escherichia coli vary across 189 amino acid pairs. While most pairs display shared patterns of inter-species interactions, multiple deviations are caused by pleiotropy and epistasis in metabolism. Furthermore, simulated invasion experiments reveal possible paths to obligate cross-feeding. Our study provides genomically driven insight into the rise of ecological interactions, with implications for microbiome research and synthetic ecology.
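The idea in the abstract above of feeding model-estimated fitness values into a game can be sketched with a two-player payoff table. The abstract does not give the game formulation, so this example assumes a two-player game with pure strategies only and simply enumerates pure-strategy Nash equilibria. The strategy labels and payoff values are made up for illustration; in the study the payoffs would come from genome-scale metabolic models.

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (i, j) where neither player gains by deviating alone.

    payoff_a[i][j] and payoff_b[i][j] are the fitness values of players A and B
    when A plays strategy i and B plays strategy j.
    """
    n_rows = len(payoff_a)
    n_cols = len(payoff_a[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        a_is_best = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(n_rows))
        b_is_best = all(payoff_b[i][j] >= payoff_b[i][l] for l in range(n_cols))
        if a_is_best and b_is_best:
            equilibria.append((i, j))
    return equilibria


# Strategy 0 = leak a metabolite (cooperate), strategy 1 = do not leak (cheat).
# Hypothetical payoffs: leaking is costly, so cheating dominates.
dilemma_a = [[3, 0], [5, 1]]
dilemma_b = [[3, 5], [0, 1]]
print(pure_nash_equilibria(dilemma_a, dilemma_b))

# Hypothetical payoffs: growth collapses if nobody leaks, so one partner leaks.
snowdrift_a = [[3, 1], [5, 0]]
snowdrift_b = [[3, 5], [1, 0]]
print(pure_nash_equilibria(snowdrift_a, snowdrift_b))
```

With the first payoff table mutual cheating is the only equilibrium, while the second has two asymmetric equilibria in which exactly one partner leaks.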
Perspectives in metabolic engineering: understanding cellular regulation towards the control of metabolic routes.

PubMed

Zadran, Sohila; Levine, Raphael D

2013-01-01

Metabolic engineering seeks to redirect metabolic pathways through the modification of specific biochemical reactions or the introduction of new ones with the use of recombinant technology. Many chemicals are synthesized via the introduction of product-specific enzymes or the reconstruction of entire metabolic pathways in engineered hosts that can sustain production and synthesize high yields of the desired product. Because yields of natural product-derived compounds are frequently low, and chemical processes can be expensive in both energy and materials, current endeavors have focused on using biologically derived processes as alternatives to chemical synthesis. Such economically favorable manufacturing processes pursue goals related to sustainable development and "green chemistry". Metabolic engineering is a multidisciplinary approach, involving chemical engineering, molecular biology, biochemistry, and analytical chemistry. Recent advances in molecular biology, genome-scale models, theoretical understanding, and kinetic modeling have increased interest in using metabolic engineering to redirect metabolic fluxes for industrial and therapeutic purposes. The use of metabolic engineering has increased the productivity of industrially pertinent small molecules, alcohol-based biofuels, and biodiesel. Here, we highlight developments in the practical and theoretical strategies and technologies available for the metabolic engineering of simple systems and address current limitations.
ocsESTdb: a database of oil crop seed EST sequences for comparative analysis and investigation of a global metabolic network and oil accumulation metabolism.

PubMed

Ke, Tao; Yu, Jingyin; Dong, Caihua; Mao, Han; Hua, Wei; Liu, Shengyi

2015-01-21

Oil crop seeds are important sources of fatty acids (FAs) for human and animal nutrition. Despite their importance, there is a lack of an essential bioinformatics resource on gene transcription of oil crops from a comparative perspective. In this study, we developed ocsESTdb, the first database of expressed sequence tag (EST) information on seeds of four large-scale oil crops with an emphasis on global metabolic networks and oil accumulation metabolism that target the involved unigenes. A total of 248,522 ESTs and 106,835 unigenes were collected from the cDNA libraries of rapeseed (Brassica napus), soybean (Glycine max), sesame (Sesamum indicum) and peanut (Arachis hypogaea). These unigenes were annotated by a sequence similarity search against databases including TAIR, NR protein database, Gene Ontology, COG, Swiss-Prot, TrEMBL and Kyoto Encyclopedia of Genes and Genomes (KEGG). Five genome-scale metabolic networks that contain different numbers of metabolites and gene-enzyme reaction-association entries were analysed and constructed using Cytoscape and yEd programs. Details of unigene entries, deduced amino acid sequences and putative annotation are available from our database to browse, search and download. Intuitive and graphical representations of EST/unigene sequences, functional annotations, metabolic pathways and metabolic networks are also available. ocsESTdb will be updated regularly and can be freely accessed at http://ocri-genomics.org/ocsESTdb/ . ocsESTdb may serve as a valuable and unique resource for comparative analysis of acyl lipid synthesis and metabolism in oilseed plants. It also may provide vital insights into improving oil content in seeds of oil crop species by transcriptional reconstruction of the metabolic network.
Microalgal Metabolic Network Model Refinement through High-Throughput Functional Metabolic Profiling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chaiboonchoe, Amphun; Dohai, Bushra Saeed; Cai, Hong

2014-12-10

Metabolic modeling provides the means to define metabolic processes at a systems level; however, genome-scale metabolic models often remain incomplete in their description of metabolic networks and may include reactions that are experimentally unverified. This shortcoming is exacerbated in reconstructed models of newly isolated algal species, as there may be little to no biochemical evidence available for the metabolism of such isolates. The phenotype microarray (PM) technology (Biolog, Hayward, CA, USA) provides an efficient, high-throughput method to functionally define cellular metabolic activities in response to a large array of entry metabolites. The platform can experimentally verify many of the unverified reactions in a network model as well as identify missing or new reactions in the reconstructed metabolic model. The PM technology has been used for metabolic phenotyping of non-photosynthetic bacteria and fungi, but it has not been reported for the phenotyping of microalgae. Here, we introduce the use of PM assays in a systematic way to the study of microalgae, applying it specifically to the green microalgal model species Chlamydomonas reinhardtii. The results obtained in this study validate a number of existing annotated metabolic reactions and identify a number of novel and unexpected metabolites. The obtained information was used to expand and refine the existing COBRA-based C. reinhardtii metabolic network model iRC1080. Over 254 reactions were added to the network, and the effects of these additions on flux distribution within the network are described. The novel reactions include the support of metabolism by a number of d-amino acids, l-dipeptides, and l-tripeptides as nitrogen sources, as well as support of cellular respiration by cysteamine-S-phosphate as a phosphorus source. The protocol developed here can be used as a foundation to functionally profile other microalgae such as known microalgae mutants and novel isolates.
Microalgal Metabolic Network Model Refinement through High-Throughput Functional Metabolic Profiling

PubMed Central

Chaiboonchoe, Amphun; Dohai, Bushra Saeed; Cai, Hong; Nelson, David R.; Jijakli, Kenan; Salehi-Ashtiani, Kourosh

2014-01-01

Metabolic modeling provides the means to define metabolic processes at a systems level; however, genome-scale metabolic models often remain incomplete in their description of metabolic networks and may include reactions that are experimentally unverified. This shortcoming is exacerbated in reconstructed models of newly isolated algal species, as there may be little to no biochemical evidence available for the metabolism of such isolates. The phenotype microarray (PM) technology (Biolog, Hayward, CA, USA) provides an efficient, high-throughput method to functionally define cellular metabolic activities in response to a large array of entry metabolites. The platform can experimentally verify many of the unverified reactions in a network model as well as identify missing or new reactions in the reconstructed metabolic model. The PM technology has been used for metabolic phenotyping of non-photosynthetic bacteria and fungi, but it has not been reported for the phenotyping of microalgae. Here, we introduce the use of PM assays in a systematic way to the study of microalgae, applying it specifically to the green microalgal model species Chlamydomonas reinhardtii. The results obtained in this study validate a number of existing annotated metabolic reactions and identify a number of novel and unexpected metabolites. The obtained information was used to expand and refine the existing COBRA-based C. reinhardtii metabolic network model iRC1080. Over 254 reactions were added to the network, and the effects of these additions on flux distribution within the network are described. The novel reactions include the support of metabolism by a number of d-amino acids, l-dipeptides, and l-tripeptides as nitrogen sources, as well as support of cellular respiration by cysteamine-S-phosphate as a phosphorus source. The protocol developed here can be used as a foundation to functionally profile other microalgae such as known microalgae mutants and novel isolates. PMID:25540776
Microalgal Metabolic Network Model Refinement through High-Throughput Functional Metabolic Profiling.

PubMed

Chaiboonchoe, Amphun; Dohai, Bushra Saeed; Cai, Hong; Nelson, David R; Jijakli, Kenan; Salehi-Ashtiani, Kourosh

2014-01-01

Metabolic modeling provides the means to define metabolic processes at a systems level; however, genome-scale metabolic models often remain incomplete in their description of metabolic networks and may include reactions that are experimentally unverified. This shortcoming is exacerbated in reconstructed models of newly isolated algal species, as there may be little to no biochemical evidence available for the metabolism of such isolates. The phenotype microarray (PM) technology (Biolog, Hayward, CA, USA) provides an efficient, high-throughput method to functionally define cellular metabolic activities in response to a large array of entry metabolites. The platform can experimentally verify many of the unverified reactions in a network model as well as identify missing or new reactions in the reconstructed metabolic model. The PM technology has been used for metabolic phenotyping of non-photosynthetic bacteria and fungi, but it has not been reported for the phenotyping of microalgae. Here, we introduce the use of PM assays in a systematic way to the study of microalgae, applying it specifically to the green microalgal model species Chlamydomonas reinhardtii. The results obtained in this study validate a number of existing annotated metabolic reactions and identify a number of novel and unexpected metabolites. The obtained information was used to expand and refine the existing COBRA-based C. reinhardtii metabolic network model iRC1080. Over 254 reactions were added to the network, and the effects of these additions on flux distribution within the network are described. The novel reactions include the support of metabolism by a number of d-amino acids, l-dipeptides, and l-tripeptides as nitrogen sources, as well as support of cellular respiration by cysteamine-S-phosphate as a phosphorus source. The protocol developed here can be used as a foundation to functionally profile other microalgae such as known microalgae mutants and novel isolates.

Network reconstruction of platelet metabolism identifies metabolic signature for aspirin resistance

NASA Astrophysics Data System (ADS)

Thomas, Alex; Rahmanian, Sorena; Bordbar, Aarash; Palsson, Bernhard Ø.; Jamshidi, Neema

2014-01-01

To date, there has been no systematic, objective assessment of the metabolic capabilities of the human platelet. A manually curated, functionally tested, and validated biochemical reaction network of platelet metabolism, iAT-PLT-636, was reconstructed using 33 proteomic datasets and 354 literature references. The network contains enzymes mapping to 403 diseases and 231 FDA approved drugs, alluding to an expansive scope of biochemical transformations that may affect or be affected by disease processes in multiple organ systems. The effect of aspirin (ASA) resistance on platelet metabolism was evaluated using constraint-based modeling, which revealed a redirection of glycolytic, fatty acid, and nucleotide metabolism reaction fluxes in order to accommodate eicosanoid synthesis and reactive oxygen species stress. These results were confirmed with independent proteomic data. The construction and availability of iAT-PLT-636 should stimulate further data-driven, systems analysis of platelet metabolism towards the understanding of pathophysiological conditions including, but not strictly limited to, coagulopathies.
Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.

Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins vary across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.
Comparison of environmental and isolate Sulfobacillus genomes reveals diverse carbon, sulfur, nitrogen, and hydrogen metabolisms

DOE PAGES

Justice, Nicholas B.; Norman, Anders; Brown, Christopher T.; ...

2014-12-15

Bacteria of the genus Sulfobacillus are found worldwide as members of microbial communities that accelerate sulfide mineral dissolution in acid mine drainage environments (AMD), acid-rock drainage environments (ARD), as well as in industrial bioleaching operations. Despite their frequent identification in these environments, their role in biogeochemical cycling is poorly understood. Here we report draft genomes of five species of the Sulfobacillus genus (AMDSBA1-5) reconstructed by cultivation-independent sequencing of biofilms sampled from the Richmond Mine (Iron Mountain, CA). Three of these species (AMDSBA2, AMDSBA3, and AMDSBA4) have no cultured representatives while AMDSBA1 is a strain of S. benefaciens, and AMDSBA5 a strain of S. thermosulfidooxidans. We analyzed the diversity of energy conservation and central carbon metabolisms for these genomes and previously published Sulfobacillus genomes. Pathways of sulfur oxidation vary considerably across the genus, including the number and type of subunits of putative heterodisulfide reductase complexes likely involved in sulfur oxidation. The number and type of nickel-iron hydrogenase proteins vary across the genus, as does the presence of different central carbon pathways. Only the AMDSBA3 genome encodes a dissimilatory nitrate reductase and only the AMDSBA5 and S. thermosulfidooxidans genomes encode assimilatory nitrate reductases. Lastly, within the genus, AMDSBA4 is unusual in that its electron transport chain includes a cytochrome bc type complex, a unique cytochrome c oxidase, and two distinct succinate dehydrogenase complexes. Overall, the results significantly expand our understanding of carbon, sulfur, nitrogen, and hydrogen metabolism within the Sulfobacillus genus.
Discovering hotspots in functional genomic data superposed on 3D chromatin configuration reconstructions.

PubMed

Capurso, Daniel; Bengtsson, Henrik; Segal, Mark R

2016-03-18

The spatial organization of the genome influences cellular function, notably gene regulation. Recent studies have assessed the three-dimensional (3D) co-localization of functional annotations (e.g. centromeres, long terminal repeats) using 3D genome reconstructions from Hi-C (genome-wide chromosome conformation capture) data; however, corresponding assessments for continuous functional genomic data (e.g. chromatin immunoprecipitation-sequencing (ChIP-seq) peak height) are lacking. Here, we demonstrate that applying bump hunting via the patient rule induction method (PRIM) to ChIP-seq data superposed on a Saccharomyces cerevisiae 3D genome reconstruction can discover 'functional 3D hotspots', regions in 3-space for which the mean ChIP-seq peak height is significantly elevated. For the transcription factor Swi6, the top hotspot by P-value contains MSB2 and ERG11 - known Swi6 target genes on different chromosomes. We verify this finding in a number of ways. First, this top hotspot is relatively stable under PRIM across parameter settings. Second, this hotspot is among the top hotspots by mean outcome identified by an alternative algorithm, k-Nearest Neighbor (k-NN) regression. Third, the distance between MSB2 and ERG11 is smaller than expected (by resampling) in two other 3D reconstructions generated via different normalization and reconstruction algorithms. This analytic approach can discover functional 3D hotspots and potentially reveal novel regulatory interactions. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Discovering hotspots in functional genomic data superposed on 3D chromatin configuration reconstructions

PubMed Central

Capurso, Daniel; Bengtsson, Henrik; Segal, Mark R.

2016-01-01

The spatial organization of the genome influences cellular function, notably gene regulation. Recent studies have assessed the three-dimensional (3D) co-localization of functional annotations (e.g. centromeres, long terminal repeats) using 3D genome reconstructions from Hi-C (genome-wide chromosome conformation capture) data; however, corresponding assessments for continuous functional genomic data (e.g. chromatin immunoprecipitation-sequencing (ChIP-seq) peak height) are lacking. Here, we demonstrate that applying bump hunting via the patient rule induction method (PRIM) to ChIP-seq data superposed on a Saccharomyces cerevisiae 3D genome reconstruction can discover ‘functional 3D hotspots’, regions in 3-space for which the mean ChIP-seq peak height is significantly elevated. For the transcription factor Swi6, the top hotspot by P-value contains MSB2 and ERG11 – known Swi6 target genes on different chromosomes. We verify this finding in a number of ways. First, this top hotspot is relatively stable under PRIM across parameter settings. Second, this hotspot is among the top hotspots by mean outcome identified by an alternative algorithm, k-Nearest Neighbor (k-NN) regression. Third, the distance between MSB2 and ERG11 is smaller than expected (by resampling) in two other 3D reconstructions generated via different normalization and reconstruction algorithms. This analytic approach can discover functional 3D hotspots and potentially reveal novel regulatory interactions. PMID:26869583
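The k-NN regression check mentioned in the abstract above can be sketched as follows. This is only an illustration of the idea: each locus gets the mean outcome of its k nearest neighbours in 3-space, and loci are ranked by that mean. The coordinates and peak heights are synthetic, the function name is hypothetical, and neither PRIM nor any significance testing is reproduced here.

```python
import math


def knn_mean_outcome(coords, values, k):
    """k-NN regression estimate at each point.

    For every point, average the outcome over its k nearest neighbours
    in 3D Euclidean distance (the point itself counts as a neighbour).
    """
    estimates = []
    for p in coords:
        # Indices of all points, ordered by distance to p.
        order = sorted(range(len(coords)), key=lambda j: math.dist(p, coords[j]))
        nearest = order[:k]
        estimates.append(sum(values[j] for j in nearest) / k)
    return estimates


# Synthetic example: two spatial clusters of loci with different peak heights.
coords = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),  # cluster with high signal
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),  # cluster with low signal
]
peak_heights = [9.0, 8.0, 10.0, 1.0, 2.0, 1.5]

estimates = knn_mean_outcome(coords, peak_heights, k=3)
# Locus whose neighbourhood has the highest mean outcome (first one on ties).
top_locus = max(range(len(estimates)), key=estimates.__getitem__)
```

Ranking by neighbourhood mean is the simplest possible criterion; a real analysis would also need a resampling step to judge whether a high mean is unexpected.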
Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates.

PubMed

Weng, Mao-Lun; Blazier, John C; Govindu, Madhumita; Jansen, Robert K

2014-03-01

Geraniaceae plastid genomes are highly rearranged, and each of the four genera already sequenced in the family has a distinct genome organization. This study reports plastid genome sequences of six additional species, Francoa sonchifolia, Melianthus villosus, and Viviania marifolia from Geraniales, and Pelargonium alternans, California macrophylla, and Hypseocharis bilobata from Geraniaceae. These genome sequences, combined with previously published species, provide sufficient taxon sampling to reconstruct the ancestral plastid genome organization of Geraniaceae and the rearrangements unique to each genus. The ancestral plastid genome of Geraniaceae has a 4 kb inversion and a reduced, Pelargonium-like small single copy region. Our ancestral genome reconstruction suggests that a few minor rearrangements occurred in the stem branch of Geraniaceae followed by independent rearrangements in each genus. The genomic comparison demonstrates that a series of inverted repeat boundary shifts and inversions played a major role in shaping genome organization in the family. The distribution of repeats is strongly associated with breakpoints in the rearranged genomes, and the proportion and the number of large repeats (>20 bp and >60 bp) are significantly correlated with the degree of genome rearrangements. Increases in the degree of plastid genome rearrangements are correlated with the acceleration in nonsynonymous substitution rates (dN) but not with synonymous substitution rates (dS). Possible mechanisms that might contribute to this correlation, including DNA repair system and selection, are discussed.
Integrated metabolism in sponge-microbe symbiosis revealed by genome-centered metatranscriptomics.

PubMed

Moitinho-Silva, Lucas; Díez-Vives, Cristina; Batani, Giampiero; Esteves, Ana Is; Jahn, Martin T; Thomas, Torsten

2017-07-01

Despite an increased understanding of functions in sponge microbiomes, the interactions among the symbionts and between symbionts and host are not well characterized. Here we reconstructed the metabolic interactions within the sponge Cymbastela concentrica microbiome in the context of functional features of symbiotic diatoms and the host. Three genome bins (CcPhy, CcNi and CcThau) were recovered from metagenomic data of C. concentrica, belonging to the proteobacterial family Phyllobacteriaceae, the Nitrospira genus and the thaumarchaeal order Nitrosopumilales. Gene expression was estimated by mapping C. concentrica metatranscriptomic reads. Our analyses indicated that CcPhy is heterotrophic, while CcNi and CcThau are chemolithoautotrophs. CcPhy expressed many transporters for the acquisition of dissolved organic compounds, likely available through the sponge's filtration activity and symbiotic carbon fixation. Coupled nitrification by CcThau and CcNi was reconstructed, supported by the observed close proximity of the cells in fluorescence in situ hybridization. CcPhy facultative anaerobic respiration and assimilation by diatoms may consume the resulting nitrate. Transcriptional analysis of diatom and sponge functions indicated that these organisms are likely sources of organic compounds, for example, creatine/creatinine and dissolved organic carbon, for other members of the symbiosis. Our results suggest that organic nitrogen compounds, for example, creatine, creatinine, urea and cyanate, fuel the nitrogen cycle within the sponge. This study provides an unprecedented view of the metabolic interactions within sponge-microbe symbiosis, bridging the gap between cell- and community-level knowledge.
Insights into the ecology, evolution, and metabolism of the widespread Woesearchaeotal lineages.

PubMed

Liu, Xiaobo; Li, Meng; Castelle, Cindy J; Probst, Alexander J; Zhou, Zhichao; Pan, Jie; Liu, Yang; Banfield, Jillian F; Gu, Ji-Dong

2018-06-08

As a recently discovered member of the DPANN superphylum, Woesearchaeota account for a wide diversity of 16S rRNA gene sequences, but their ecology, evolution, and metabolism remain largely unknown. Here, we assembled 133 global clone libraries/studies and 19 publicly available genomes to profile these patterns for Woesearchaeota. Phylogenetic analysis shows a high diversity with 26 proposed subgroups for this recently discovered archaeal phylum, which are widely distributed in different biotopes but primarily in inland anoxic environments. Ecological patterns analysis and ancestor state reconstruction for specific subgroups reveal that oxic status of the environments is the key factor driving the distribution and evolutionary diversity of Woesearchaeota. A selective distribution to different biotopes and an adaptive colonization from anoxic to oxic environments can be proposed and supported by evidence of the presence of ferredoxin-dependent pathways in the genomes only from anoxic biotopes but not from oxic biotopes. Metabolic reconstructions support an anaerobic heterotrophic lifestyle with conspicuous metabolic deficiencies, suggesting the requirement for metabolic complementarity with other microbes. Both lineage abundance distribution and co-occurrence network analyses across diverse biotopes confirmed metabolic complementation and revealed a potential syntrophic relationship between Woesearchaeota and methanogens, which is supported by metabolic modeling. If correct, Woesearchaeota may impact methanogenesis in inland ecosystems. The findings provide an ecological and evolutionary framework for Woesearchaeota at a global scale and indicate their potential ecological roles, especially in methanogenesis.
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

PubMed Central

Caspi, Ron; Altman, Tomer; Dale, Joseph M.; Dreher, Kate; Fulcher, Carol A.; Gilham, Fred; Kaipa, Pallavi; Karthikeyan, Athikkattuvalasu S.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Paley, Suzanne; Popescu, Liviu; Pujar, Anuradha; Shearer, Alexander G.; Zhang, Peifen; Karp, Peter D.

2010-01-01

The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. With more than 1400 pathways, MetaCyc is the largest collection of metabolic pathways currently available. Pathways reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations. BioCyc (BioCyc.org) is a collection of more than 500 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs also contain additional features, such as predicted operons, transport systems, and pathway hole-fillers. The BioCyc Web site offers several tools for the analysis of the PGDBs, including Omics Viewers that enable visualization of omics datasets on two different genome-scale diagrams and tools for comparative analysis. The BioCyc PGDBs generated by SRI are offered for adoption by any party interested in curation of metabolic, regulatory, and genome-related information about an organism. PMID:19850718
Reconciling theories for metabolic scaling.

PubMed

Maino, James L; Kearney, Michael R; Nisbet, Roger M; Kooijman, Sebastiaan A L M

2014-01-01

Metabolic theory specifies constraints on the metabolic organisation of individual organisms. These constraints have important implications for biological processes ranging from the scale of molecules all the way to the level of populations, communities and ecosystems, with their application to the latter emerging as the field of metabolic ecology. While ecologists continue to use individual metabolism to identify constraints in ecological processes, the topic of metabolic scaling remains controversial. Much of the current interest and controversy in metabolic theory relates to recent ideas about the role of supply networks in constraining energy supply to cells. We show that an alternative explanation for physicochemical constraints on individual metabolism, as formalised by dynamic energy budget (DEB) theory, can contribute to the theoretical underpinning of metabolic ecology, while increasing coherence between intra- and interspecific scaling relationships. In particular, we emphasise how the DEB theory considers constraints on the storage and use of assimilated nutrients and derive an equation for the scaling of metabolic rate for adult heterotrophs without relying on optimisation arguments or implying cellular nutrient supply limitation. Using realistic data on growth and reproduction from the literature, we parameterise the curve for respiration and compare the a priori prediction against a mammalian data set for respiration. Because the DEB theory mechanism for metabolic scaling is based on the universal process of acquiring and using pools of stored metabolites (a basal feature of life), it applies to all organisms irrespective of the nature of metabolic transport to cells. Although the DEB mechanism does not necessarily contradict insight from transport-based models, the mechanism offers an explanation for differences between the intra- and interspecific scaling of biological rates with mass, suggesting novel tests of the respective hypotheses.
The Genomic HyperBrowser: an analysis web server for genome-scale data

PubMed Central

Sandve, Geir K.; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K.; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalaš, Matúš; Lien, Tonje; Rye, Morten B.; Frigessi, Arnoldo; Hovig, Eivind

2013-01-01

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens for a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome. PMID:23632163
The Genomic HyperBrowser: an analysis web server for genome-scale data.

PubMed

Sandve, Geir K; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalas, Matús; Lien, Tonje; Rye, Morten B; Frigessi, Arnoldo; Hovig, Eivind

2013-07-01

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens for a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.
Pangolin genomes and the evolution of mammalian scales and immunity

PubMed Central

Rayko, Mike; Tan, Tze King; Hari, Ranjeev; Komissarov, Aleksey; Wee, Wei Yee; Yurchenko, Andrey A.; Kliver, Sergey; Tamazian, Gaik; Antunes, Agostinho; Wilson, Richard K.; Warren, Wesley C.; Koepfli, Klaus-Peter; Minx, Patrick; Krasheninnikova, Ksenia; Kotze, Antoinette; Dalton, Desire L.; Vermaak, Elaine; Paterson, Ian C.; Dobrynin, Pavel; Sitam, Frankie Thomas; Rovie-Ryan, Jeffrine J.; Johnson, Warren E.; Yusoff, Aini Mohamed; Luo, Shu-Jin; Karuppannan, Kayal Vizi; Fang, Gang; Zheng, Deyou; Gerstein, Mark B.; Lipovich, Leonard; O'Brien, Stephen J.; Wong, Guat Jah

2016-01-01

Pangolins, unique mammals with scales over most of their body, no teeth, poor vision, and an acute olfactory system, comprise the only placental order (Pholidota) without a whole-genome map. To investigate pangolin biology and evolution, we developed genome assemblies of the Malayan (Manis javanica) and Chinese (M. pentadactyla) pangolins. Strikingly, we found that interferon epsilon (IFNE), exclusively expressed in epithelial cells and important in skin and mucosal immunity, is pseudogenized in all African and Asian pangolin species that we examined, perhaps impacting resistance to infection. We propose that scale development was an innovation that provided protection against injuries or stress and reduced pangolin vulnerability to infection. Further evidence of specialized adaptations was evident from positively selected genes involving immunity-related pathways, inflammation, energy storage and metabolism, muscular and nervous systems, and scale/hair development. Olfactory receptor gene families are significantly expanded in pangolins, reflecting their well-developed olfaction system. This study provides insights into mammalian adaptation and functional diversification, new research tools and questions, and perhaps a new natural IFNE-deficient animal model for studying mammalian immunity. PMID:27510566
Swimming in Light: A Large-Scale Computational Analysis of the Metabolism of Dinoroseobacter shibae

PubMed Central

Rex, Rene; Bill, Nelli; Schmidt-Hohagen, Kerstin; Schomburg, Dietmar

2013-01-01

The Roseobacter clade is a ubiquitous group of marine α-proteobacteria. To gain insight into the versatile metabolism of this clade, we took a constraint-based approach and created a genome-scale metabolic model (iDsh827) of Dinoroseobacter shibae DFL12T. Our model is the first accounting for the energy demand of motility, the light-driven ATP generation and experimentally determined specific biomass composition. To cover a large variety of environmental conditions, as well as plasmid and single gene knock-out mutants, we simulated 391,560 different physiological states using flux balance analysis. We analyzed our results with regard to energy metabolism, validated them experimentally, and revealed a pronounced metabolic response to the availability of light. Furthermore, we introduced the energy demand of motility as an important parameter in genome-scale metabolic models. The results of our simulations also gave insight into the changing usage of the two degradation routes for dimethylsulfoniopropionate, an abundant compound in the ocean. A side product of dimethylsulfoniopropionate degradation is dimethyl sulfide, which seeds cloud formation and thus enhances the reflection of sunlight. By our exhaustive simulations, we were able to identify single-gene knock-out mutants, which show an increased production of dimethyl sulfide. In addition to the single-gene knock-out simulations we studied the effect of plasmid loss on the metabolism. Moreover, we explored the possible use of a functioning phosphofructokinase for D. shibae. PMID:24098096
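Flux balance analysis rests on a steady-state mass balance: for a stoichiometric matrix S and flux vector v, every internal metabolite must satisfy S·v = 0. The sketch below checks only that constraint on a hypothetical three-reaction toy network; it is not the iDsh827 model and it does not perform the optimization step of flux balance analysis.

```python
def net_production(stoichiometry, fluxes):
    """Return S.v, the net production rate of each metabolite."""
    return [
        sum(coef * flux for coef, flux in zip(row, fluxes))
        for row in stoichiometry
    ]


# Toy network (hypothetical): uptake (-> A), conversion (A -> B), biomass (B ->).
# Rows are metabolites A and B; columns are the three reactions.
S = [
    [1, -1, 0],
    [0, 1, -1],
]

balanced = net_production(S, [10, 10, 10])    # steady state: all zeros
unbalanced = net_production(S, [10, 8, 8])    # metabolite A accumulates
print(balanced, unbalanced)
```

In a full flux balance analysis, a linear programming solver would search among all flux vectors that satisfy this balance and the flux bounds for one that maximizes an objective such as biomass production.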
Parallel labeling experiments for pathway elucidation and (13)C metabolic flux analysis.

PubMed

Antoniewicz, Maciek R

2015-12-01

Metabolic pathway models provide the foundation for quantitative studies of cellular physiology through the measurement of intracellular metabolic fluxes. For model organisms metabolic models are well established, with many manually curated genome-scale model reconstructions, gene knockout studies and stable-isotope tracing studies. However, for non-model organisms a similar level of knowledge is often lacking. Compartmentation of cellular metabolism in eukaryotic systems also presents significant challenges for quantitative (13)C-metabolic flux analysis ((13)C-MFA). Recently, innovative (13)C-MFA approaches have been developed based on parallel labeling experiments, the use of multiple isotopic tracers and integrated data analysis, that allow more rigorous validation of pathway models and improved quantification of metabolic fluxes. Applications of these approaches open new research directions in metabolic engineering, biotechnology and medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.
Metabolic 'engines' of flight drive genome size reduction in birds.

PubMed

Wright, Natalie A; Gregory, T Ryan; Witt, Christopher C

2014-03-22

The tendency for flying organisms to possess small genomes has been interpreted as evidence of natural selection acting on the physical size of the genome. Nonetheless, the flight-genome link and its mechanistic basis have yet to be well established by comparative studies within a volant clade. Is there a particular functional aspect of flight such as brisk metabolism, lift production or maneuverability that impinges on the physical genome? We measured genome sizes, wing dimensions and heart, flight muscle and body masses from a phylogenetically diverse set of bird species. In phylogenetically controlled analyses, we found that genome size was negatively correlated with relative flight muscle size and heart index (i.e. ratio of heart to body mass), but positively correlated with body mass and wing loading. The proportional masses of the flight muscles and heart were the most important parameters explaining variation in genome size in multivariate models. Hence, the metabolic intensity of powered flight appears to have driven genome size reduction in birds.
The Genome-Based Metabolic Systems Engineering to Boost Levan Production in a Halophilic Bacterial Model.

PubMed

Aydin, Busra; Ozer, Tugba; Oner, Ebru Toksoy; Arga, Kazim Yalcin

2018-03-01

Metabolic systems engineering is being used to redirect microbial metabolism for the overproduction of chemicals of interest with the aim of transforming microbial hosts into cellular factories. In this study, a genome-based metabolic systems engineering approach was designed and performed to improve biopolymer biosynthesis capability of a moderately halophilic bacterium Halomonas smyrnensis AAD6T producing levan, which is a fructose homopolymer with many potential uses in various industries and medicine. For this purpose, the genome-scale metabolic model for AAD6T was used to characterize the metabolic resource allocation, specifically to design metabolic engineering strategies for engineered bacteria with enhanced levan production capability. Simulations were performed in silico to determine optimal gene knockout strategies to develop new strains with enhanced levan production capability. The majority of the gene knockout strategies emphasized the vital role of the fructose uptake mechanism, and pointed out the fructose-specific phosphotransferase system (PTSfru) as the most promising target for further metabolic engineering studies. Therefore, the PTSfru of AAD6T was restructured with insertional mutagenesis and triparental mating techniques to construct a novel, engineered H. smyrnensis strain, BMA14. Fermentation experiments were carried out to demonstrate the high efficiency of the mutant strain BMA14 in terms of final levan concentration, sucrose consumption rate, and sucrose conversion efficiency, when compared to AAD6T. The genome-based metabolic systems engineering approach presented in this study might be considered an efficient framework to redirect microbial metabolism for the overproduction of chemicals of interest, and the novel strain BMA14 might be considered a potential microbial cell factory for further studies aimed to design levan production processes with lower production costs.
Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer.

PubMed

Siezen, Roland J; van Hylckama Vlieg, Johan E T

2011-08-30

In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a "natural metabolic engineer".
Metabolic Imaging in Multiple Time Scales

PubMed Central

Ramanujan, V Krishnan

2013-01-01

We report here a novel combination of time-resolved imaging methods for probing mitochondrial metabolism at multiple time scales at the level of single cells. By exploiting the fluorescence of a mitochondrial membrane potential reporter, we demonstrate single-cell metabolic dynamics on time scales ranging from milliseconds to seconds to minutes in response to glucose metabolism and mitochondrial perturbations in real time. Our results show that, in comparison with normal human mammary epithelial cells, breast cancer cells display significant alterations in metabolic responses at all measured time scales, as shown by single-cell kinetics, fluorescence recovery after photobleaching and scaling analysis of time-series data obtained from mitochondrial fluorescence fluctuations. Furthermore, scaling analysis of time-series data in living cells with distinct mitochondrial dysfunction also revealed significant metabolic differences, thereby suggesting the broader applicability (e.g. in mitochondrial myopathies and other metabolic disorders) of the proposed strategies beyond the scope of cancer metabolism. We discuss the scope of these findings in the context of developing portable, real-time metabolic measurement systems that can find applications in preclinical and clinical diagnostics. PMID:24013043
Integrating Cellular Metabolism into a Multiscale Whole-Body Model

PubMed Central

Krauss, Markus; Schaller, Stephan; Borchers, Steffen; Findeisen, Rolf; Lippert, Jörg; Kuepfer, Lars

2012-01-01

Cellular metabolism continuously processes an enormous range of external compounds into endogenous metabolites and is as such a key element in human physiology. The multifaceted physiological role of the metabolic network fulfilling the catalytic conversions can only be fully understood from a whole-body perspective where the causal interplay of the metabolic states of individual cells, the surrounding tissue and the whole organism are simultaneously considered. We here present an approach relying on dynamic flux balance analysis that allows the integration of metabolic networks at the cellular scale into standardized physiologically-based pharmacokinetic models at the whole-body level. To evaluate our approach we integrated a genome-scale network reconstruction of a human hepatocyte into the liver tissue of a physiologically-based pharmacokinetic model of a human adult. The resulting multiscale model was used to investigate hyperuricemia therapy, ammonia detoxification and paracetamol-induced toxication at a systems level. The specific models simultaneously integrate multiple layers of biological organization and offer mechanistic insights into pathology and medication. The approach presented may in future support a mechanistic understanding in diagnostics and drug development. PMID:23133351

Reconstruction of biological pathways and metabolic networks from in silico labeled metabolites.

PubMed

Hadadi, Noushin; Hafner, Jasmin; Soh, Keng Cher; Hatzimanikatis, Vassily

2017-01-01

Reaction atom mappings track the positional changes of all of the atoms between the substrates and the products as they undergo the biochemical transformation. However, information on atom transitions in the context of metabolic pathways is not widely available in the literature. The understanding of metabolic pathways at the atomic level is of great importance as it can deconvolute the overlapping catabolic/anabolic pathways resulting in the observed metabolic phenotype. The automated identification of atom transitions within a metabolic network is a very challenging task since the degree of complexity of metabolic networks dramatically increases when we transit from metabolite-level studies to atom-level studies. Despite being studied extensively in various approaches, the field of atom mapping of metabolic networks is lacking an automated approach, which (i) accounts for the information of reaction mechanism for atom mapping and (ii) is extendable from individual atom-mapped reactions to atom-mapped reaction networks. Hereby, we introduce a computational framework, iAM.NICE (in silico Atom Mapped Network Integrated Computational Explorer), for the systematic atom-level reconstruction of metabolic networks from in silico labelled substrates. iAM.NICE is to our knowledge the first automated atom-mapping algorithm that is based on the underlying enzymatic biotransformation mechanisms, and its application goes beyond individual reactions and it can be used for the reconstruction of atom-mapped metabolic networks. We illustrate the applicability of our method through the reconstruction of atom-mapped reactions of the KEGG database and we provide an example of an atom-level representation of the core metabolic network of E. coli. Copyright © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automated multiplex genome-scale engineering in yeast

PubMed Central

Si, Tong; Chao, Ran; Min, Yuhao; Wu, Yuying; Ren, Wen; Zhao, Huimin

2017-01-01

Genome-scale engineering is indispensable in understanding and engineering microorganisms, but the current tools are mainly limited to bacterial systems. Here we report an automated platform for multiplex genome-scale engineering in Saccharomyces cerevisiae, an important eukaryotic model and widely used microbial cell factory. Standardized genetic parts encoding overexpression and knockdown mutations of >90% yeast genes are created in a single step from a full-length cDNA library. With the aid of CRISPR-Cas, these genetic parts are iteratively integrated into the repetitive genomic sequences in a modular manner using robotic automation. This system allows functional mapping and multiplex optimization on a genome scale for diverse phenotypes including cellulase expression, isobutanol production, glycerol utilization and acetic acid tolerance, and may greatly accelerate future genome-scale engineering endeavours in yeast. PMID:28469255
Sequential computation of elementary modes and minimal cut sets in genome-scale metabolic networks using alternate integer linear programming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Hyun-Seob; Goldberg, Noam; Mahajan, Ashutosh

Elementary (flux) modes (EMs) have served as a valuable tool for investigating structural and functional properties of metabolic networks. Identification of the full set of EMs in genome-scale networks remains challenging due to combinatorial explosion of EMs in complex networks. It is often, however, that only a small subset of relevant EMs needs to be known, for which optimization-based sequential computation is a useful alternative. Most of the currently available methods along this line are based on the iterative use of mixed integer linear programming (MILP), the effectiveness of which significantly deteriorates as the number of iterations builds up. To alleviate the computational burden associated with the MILP implementation, we here present a novel optimization algorithm termed alternate integer linear programming (AILP). Results: Our algorithm was designed to iteratively solve a pair of integer programming (IP) and linear programming (LP) to compute EMs in a sequential manner. In each step, the IP identifies a minimal subset of reactions, the deletion of which disables all previously identified EMs. Thus, a subsequent LP solution subject to this reaction deletion constraint becomes a distinct EM. In cases where no feasible LP solution is available, IP-derived reaction deletion sets represent minimal cut sets (MCSs). Despite the additional computation of MCSs, AILP achieved significant time reduction in computing EMs by orders of magnitude. The proposed AILP algorithm not only offers a computational advantage in the EM analysis of genome-scale networks, but also improves the understanding of the linkage between EMs and MCSs.
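The IP step described above, finding a smallest set of reactions whose deletion disables every previously identified EM, is a minimum hitting set problem over the EM supports. The sketch below replaces the integer program with brute-force enumeration, which is practical only for toy inputs; the function name and the example EMs are hypothetical, and the LP step that would produce the next EM is not shown.

```python
from itertools import combinations


def minimal_deletion_set(em_supports, reactions):
    """Return a smallest set of reactions that intersects every EM support.

    Brute-force stand-in for the integer program: candidate sets are tried
    in order of increasing size, so the first hit is of minimum size.
    Returns None if no set of the given reactions disables every EM.
    """
    for size in range(len(reactions) + 1):
        for candidate in combinations(reactions, size):
            chosen = set(candidate)
            if all(chosen & support for support in em_supports):
                return chosen
    return None


# Hypothetical example: three known EMs, each given by its active reactions.
reactions = ["R1", "R2", "R3", "R4"]
known_ems = [{"R1", "R2"}, {"R2", "R3"}, {"R4"}]
deletions = minimal_deletion_set(known_ems, reactions)
print(sorted(deletions))
```

Deleting R2 and R4 removes at least one reaction from each known EM, so any feasible flux distribution found afterwards cannot be one of them.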
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints.

PubMed

Glusman, Gustavo; Mauldin, Denise E; Hood, Leroy E; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into "genome fingerprints" via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics.
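The timing figures quoted above can be cross-checked with a short calculation, assuming that "all-against-all" means every unordered pair of the 2504 genomes is compared exactly once.

```python
from math import comb

n_genomes = 2504
seconds_per_comparison = 21e-6  # 21 microseconds, as quoted in the abstract

n_pairs = comb(n_genomes, 2)    # number of unordered genome pairs
total_seconds = n_pairs * seconds_per_comparison
print(n_pairs, round(total_seconds, 1))
```

Under that assumption there are 3,133,756 pairs, and 21 μs per comparison gives roughly 66 s, consistent with the reported 67 s once rounding of the per-comparison time is taken into account.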
Ultrafast Comparison of Personal Genomes via Precomputed Genome Fingerprints

PubMed Central

Glusman, Gustavo; Mauldin, Denise E.; Hood, Leroy E.; Robinson, Max

2017-01-01

We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into “genome fingerprints” via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. For example, we could compute all-against-all pairwise comparisons among the 2504 genomes in the 1000 Genomes data set in 67 s at high quality (21 μs per comparison, on a single processor), and achieved a lower quality approximation in just 11 s. Efficient computation enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative sequenced genomes in a set, population reconstruction, and many others. The original genome representation cannot be reconstructed from its fingerprint, effectively decoupling genome comparison from genome interpretation; the method thus has significant implications for privacy-preserving genome analytics. PMID:29018478
Toward Genome-Based Metabolic Engineering in Bacteria.

PubMed

Oesterle, Sabine; Wuethrich, Irene; Panke, Sven

2017-01-01

Prokaryotes modified stably on the genome are of great importance for production of fine and commodity chemicals. Traditional methods for genome engineering have long suffered from imprecision and low efficiencies, making construction of suitable high-producer strains laborious. Here, we review the recent advances in discovery and refinement of molecular precision engineering tools for genome-based metabolic engineering in bacteria for chemical production, with focus on the λ-Red recombineering and the clustered regularly interspaced short palindromic repeats/Cas9 nuclease systems. In conjunction, they enable the integration of in vitro-synthesized DNA segments into specified locations on the chromosome and allow for enrichment of rare mutants by elimination of unmodified wild-type cells. Combination with concurrently developing improvements in important accessory technologies such as DNA synthesis, high-throughput screening methods, regulatory element design, and metabolic pathway optimization tools has resulted in novel efficient microbial producer strains and given access to new metabolic products. These new tools have made and will likely continue to make a big impact on the bioengineering strategies that transform the chemical industry. Copyright © 2017 Elsevier Inc. All rights reserved.
Genomic diversity and versatility of Lactobacillus plantarum, a natural metabolic engineer

PubMed Central

2011-01-01

In the past decade it has become clear that the lactic acid bacterium Lactobacillus plantarum occupies a diverse range of environmental niches and has an enormous diversity in phenotypic properties, metabolic capacity and industrial applications. In this review, we describe how genome sequencing, comparative genome hybridization and comparative genomics have provided insight into the underlying genomic diversity and versatility of L. plantarum. One of the main features appears to be genomic life-style islands consisting of numerous functional gene cassettes, in particular for carbohydrate utilization, which can be acquired, shuffled, substituted or deleted in response to niche requirements. In this sense, L. plantarum can be considered a “natural metabolic engineer”. PMID:21995294
Cyanobacterial Biofuels: Strategies and Developments on Network and Modeling.

PubMed

Klanchui, Amornpan; Raethong, Nachon; Prommeenate, Peerada; Vongsangnak, Wanwipa; Meechai, Asawin

Cyanobacteria, the phototrophic microorganisms, have attracted much attention recently as a promising source for environmentally sustainable biofuels production. However, barriers to commercial markets for cyanobacteria-based biofuels concern economic feasibility. Miscellaneous strategies for improving the production performance of cyanobacteria have thus been developed. Among these, simple ad hoc strategies, which fail to fully optimize cell growth coupled with desired product yield, are explored. With the advancement of genomics and systems biology, a new paradigm toward systems metabolic engineering has been recognized. In particular, genome-scale metabolic network reconstruction and modeling is a crucial systems-based tool for whole-cell-wide investigation and prediction. In this review, the cyanobacterial genome-scale metabolic models, which offer a system-level understanding of cyanobacterial metabolism, are described. The main processes of metabolic network reconstruction and modeling of cyanobacteria are summarized. Strategies and developments on genome-scale network and modeling through the systems metabolic engineering approach are advanced and employed for efficient cyanobacterial-based biofuels production.
Genomic Footprints of Selective Sweeps from Metabolic Resistance to Pyrethroids in African Malaria Vectors Are Driven by Scale up of Insecticide-Based Vector Control.

PubMed

Barnes, Kayla G; Weedall, Gareth D; Ndula, Miranda; Irving, Helen; Mzihalowa, Themba; Hemingway, Janet; Wondji, Charles S

2017-02-01

Insecticide resistance in mosquito populations threatens recent successes in malaria prevention. Elucidating patterns of genetic structure in malaria vectors to predict the speed and direction of the spread of resistance is essential to get ahead of the 'resistance curve' and to avert a public health catastrophe. Here, applying a combination of microsatellite analysis, whole genome sequencing and targeted sequencing of a resistance locus, we elucidated the continent-wide population structure of a major African malaria vector, Anopheles funestus. We identified a major selective sweep in a genomic region controlling cytochrome P450-based metabolic resistance conferring high resistance to pyrethroids. This selective sweep occurred since 2002, likely as a direct consequence of scaled up vector control as revealed by whole genome and fine-scale sequencing of pre- and post-intervention populations. Fine-scaled analysis of the pyrethroid resistance locus revealed that a resistance-associated allele of the cytochrome P450 monooxygenase CYP6P9a has swept through southern Africa to near fixation, in contrast to high polymorphism levels before interventions, conferring high levels of pyrethroid resistance linked to control failure. Population structure analysis revealed a barrier to gene flow between southern Africa and other areas, which may prevent or slow the spread of the southern mechanism of pyrethroid resistance to other regions. By identifying a genetic signature of pyrethroid-based interventions, we have demonstrated the intense selective pressure that control interventions exert on mosquito populations. If this level of selection and spread of resistance continues unabated, our ability to control malaria with current interventions will be compromised.
SWARM: a scientific workflow for supporting Bayesian approaches to improve metabolic models.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, X.; Stevens, R.; Mathematics and Computer Science

2008-01-01

With the exponential growth of complete genome sequences, the analysis of these sequences is becoming a powerful approach to build genome-scale metabolic models. These models can be used to study individual molecular components and their relationships, and eventually study cells as systems. However, constructing genome-scale metabolic models manually is time-consuming and labor-intensive. As a result, far fewer genome-scale metabolic models are available than the hundreds of genome sequences available. To tackle this problem, we design SWARM, a scientific workflow that can be utilized to improve genome-scale metabolic models in a high-throughput fashion. SWARM deals with a range of issues including the integration of data across distributed resources, data format conversions, data update, and data provenance. Taken together, SWARM streamlines the whole modeling process, which includes extracting data from various resources, deriving training datasets to train a set of predictors and applying Bayesian techniques to assemble the predictors, inferring on the ensemble of predictors to insert missing data, and eventually improving draft metabolic networks automatically. By enhancing metabolic model construction, SWARM enables scientists to generate many genome-scale metabolic models within a short period of time and with less effort.
Genome-scale metabolic modeling to provide insight into the production of storage compounds during feast-famine cycles of activated sludge.

PubMed

Tajparast, Mohammad; Frigon, Dominic

2013-01-01

Studying storage metabolism during feast-famine cycles of activated sludge treatment systems provides profound insight in terms of both operational issues (e.g., foaming and bulking) and process optimization for the production of value added by-products (e.g., bioplastics). We examined the storage metabolism (including poly-β-hydroxybutyrate [PHB], glycogen, and triacylglycerols [TAGs]) during feast-famine cycles using two genome-scale metabolic models: Rhodococcus jostii RHA1 (iMT1174) and Escherichia coli K-12 (iAF1260) for growth on glucose, acetate, and succinate. The goal was to develop the proper objective function (OF) for the prediction of the main storage compound produced in activated sludge for given feast-famine cycle conditions. For the flux balance analysis, combinations of three OFs were tested. For all of them, the main OF was to maximize growth rates. Two additional sub-OFs were used: (1) minimization of biochemical fluxes, and (2) minimization of metabolic adjustments (MoMA) between the feast and famine periods. All (sub-)OFs predicted identical substrate-storage associations for the feast-famine growth of the above-mentioned metabolic models on a given substrate when glucose and acetate were set as sole carbon sources (i.e., glucose-glycogen and acetate-PHB), in agreement with experimental observations. However, in the case of succinate as substrate, the predictions depended on the network structure of the metabolic models such that the E. coli model predicted glycogen accumulation and the R. jostii model predicted PHB accumulation. While the accumulation of both PHB and glycogen was observed experimentally, PHB showed higher dynamics during an activated sludge feast-famine growth cycle with succinate as substrate. These results suggest that new modeling insights between metabolic predictions and population ecology will be necessary to properly predict metabolisms likely to emerge within the niches of activated sludge communities. Nonetheless
Recovery of community genomes to assess subsurface metabolic potential: exploiting the capacity of next generation sequencing-based metagenomics

NASA Astrophysics Data System (ADS)

Wrighton, K. C.; Thomas, B.; Miller, C. S.; Sharon, I.; Wilkins, M. J.; VerBerkmoes, N. C.; Handley, K. M.; Lipton, M. S.; Hettich, R. L.; Williams, K. H.; Long, P. E.; Banfield, J. F.

2011-12-01

With the goal of developing a deterministic understanding of the microbiological and geochemical processes controlling subsurface environments, groundwater bacterial communities were collected from the Rifle Integrated Field Research Challenge (IFRC) site. Biomass from three temporal acetate-stimulated groundwater samples were collected during a period of dominant Fe(III)-reduction, in a region of the aquifer that had previously received acetate amendment the year prior. Phylogenetic analysis revealed a diverse Bacterial community, notably devoid of Archaea with 249 taxa from 9 Bacterial phyla including the dominance of uncultured candidate divisions, BD1-5, OD1, and OP11. We have reconstructed 86 partial to near-complete genomes and have performed a detailed characterization of the underlying metabolic potential of the ecosystem. We assessed the natural variation and redundancy in multi-heme c-type cytochromes, sulfite reductases, and central carbon metabolic pathways. Deep genomic sampling indicated the community contained various metabolic pathways: sulfur oxidation coupled to microaerophilic conditions, nitrate reduction with both acetate and inorganic compounds as donors, carbon and nitrogen fixation, antibiotic warfare, and heavy-metal detoxification. Proteomic investigations using predicted proteins from metagenomics corroborated that acetate oxidation is coupled to reduction of oxygen, sulfur, nitrogen, and iron across the samples. Of particular interest was the detection of acetate oxidizing and sulfate reducing proteins from a Desulfotalea-like bacterium in all three time points, suggesting that aqueous sulfide produced by active sulfate-reducing bacteria could contribute to abiotic iron reduction during the dominant iron reduction phase. Additionally, proteogenomic analysis verified that a large portion of the community, including members of the uncultivated BD1-5, are obligate fermenters, characterized by the presence of hydrogen-evolving hydrogenases
Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D

DOE PAGES

Preciat Gonzalez, German A.; El Assal, Lemmer R. P.; Noronha, Alberto; ...

2017-06-14

The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
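The notion of an atom mapping and of prediction accuracy against curated mappings can be sketched as follows. This is an illustrative simplification and not the evaluation procedure of the study: atoms are identified by list position, a mapping is a dictionary from substrate atom index to product atom index, all names and data are hypothetical, and chemically equivalent atoms (which the abstract notes some algorithms can identify) are ignored, so two mappings count as matching only if they are identical.

```python
def is_valid_mapping(substrate_elements, product_elements, mapping):
    """True if the mapping is one-to-one and pairs atoms of the same element."""
    # Every substrate atom must be mapped exactly once.
    if sorted(mapping) != list(range(len(substrate_elements))):
        return False
    # Every product atom must be reached exactly once.
    if sorted(mapping.values()) != list(range(len(product_elements))):
        return False
    return all(
        substrate_elements[s] == product_elements[p]
        for s, p in mapping.items()
    )


def accuracy(predicted, curated):
    """Fraction of curated reactions whose predicted mapping is identical."""
    matches = sum(1 for rxn, m in curated.items() if predicted.get(rxn) == m)
    return matches / len(curated)


# Hypothetical reaction with atoms listed by element symbol.
substrate = ["C", "O", "O"]
product = ["O", "C", "O"]
good = {0: 1, 1: 0, 2: 2}   # C -> C, O -> O, O -> O
bad = {0: 0, 1: 1, 2: 2}    # would map C onto O

curated = {"rxnA": good, "rxnB": {0: 0}}
predicted = {"rxnA": good, "rxnB": {0: 1}}
score = accuracy(predicted, curated)
print(is_valid_mapping(substrate, product, good), score)
```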
Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D.

PubMed

Preciat Gonzalez, German A; El Assal, Lemmer R P; Noronha, Alberto; Thiele, Ines; Haraldsdóttir, Hulda S; Fleming, Ronan M T

2017-06-14

The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.
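The definition of an atom mapping given in this abstract can be illustrated with a minimal sketch. It assumes a balanced reaction, so that the mapping is one-to-one between substrate and product atoms; the function name, the atom identifiers and the toy reaction are hypothetical and are not taken from the paper or from any of the evaluated algorithms.

```python
def is_valid_atom_mapping(substrate_atoms, product_atoms, mapping):
    """Check a proposed atom mapping for a balanced reaction.

    substrate_atoms, product_atoms: dict of atom id -> element symbol.
    mapping: dict of substrate atom id -> product atom id.
    All names here are illustrative assumptions.
    """
    # Every substrate atom must be mapped.
    if set(mapping) != set(substrate_atoms):
        return False
    # Every product atom must be the target of exactly one substrate atom.
    if sorted(mapping.values()) != sorted(product_atoms):
        return False
    # Each mapped pair must consist of atoms of the same element.
    return all(substrate_atoms[s] == product_atoms[p] for s, p in mapping.items())


# Toy three-atom rearrangement (hypothetical, for illustration only).
substrate = {"s1": "C", "s2": "O", "s3": "H"}
product = {"p1": "C", "p2": "O", "p3": "H"}

good = {"s1": "p1", "s2": "p2", "s3": "p3"}
bad = {"s1": "p2", "s2": "p1", "s3": "p3"}  # maps C to O and O to C

print(is_valid_atom_mapping(substrate, product, good))
print(is_valid_atom_mapping(substrate, product, bad))
```

A check of this kind only verifies that a mapping is well formed; it says nothing about whether the mapping reflects the true reaction mechanism, which is what the comparison against manually curated mappings addresses.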
Effect of amino acid supplementation on titer and glycosylation distribution in hybridoma cell cultures-Systems biology-based interpretation using genome-scale metabolic flux balance model and multivariate data analysis.

PubMed

Reimonn, Thomas M; Park, Seo-Young; Agarabi, Cyrus D; Brorson, Kurt A; Yoon, Seongkyu

2016-09-01

Genome-scale flux balance analysis (FBA) is a powerful systems biology tool to characterize intracellular reaction fluxes during cell cultures. FBA estimates intracellular reaction rates by optimizing an objective function, subject to the constraints of a metabolic model and media uptake/excretion rates. A dynamic extension to FBA, dynamic flux balance analysis (DFBA), can calculate intracellular reaction fluxes as they change during cell cultures. In a previous study by Read et al. (2013), a series of informed amino acid supplementation experiments were performed on twelve parallel murine hybridoma cell cultures, and this data was leveraged for further analysis (Read et al., Biotechnol Prog. 2013;29:745-753). In order to understand the effects of media changes on the model murine hybridoma cell line, a systems biology approach is applied in the current study. Dynamic flux balance analysis was performed using a genome-scale mouse metabolic model, and multivariate data analysis was used for interpretation. The calculated reaction fluxes were examined using partial least squares and partial least squares discriminant analysis. The results indicate media supplementation increases product yield because it raises nutrient levels extending the growth phase, and the increased cell density allows for greater culture performance. At the same time, the directed supplementation does not change the overall metabolism of the cells. This supports the conclusion that product quality, as measured by glycoform assays, remains unchanged because the metabolism remains in a similar state. Additionally, the DFBA shows that metabolic state varies more at the beginning of the culture but less by the middle of the growth phase, possibly due to stress on the cells during inoculation. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1163-1173, 2016. © 2016 American Institute of Chemical Engineers.
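The optimization that this abstract describes in words is commonly written as a linear program. The following is a sketch of that standard form; the symbols are assumptions introduced here and do not appear in the abstract: S is the stoichiometric matrix of the metabolic model, v the vector of reaction fluxes, c the weights defining the objective function, and v^lb, v^ub the flux bounds, through which measured media uptake and excretion rates would enter as bounds on exchange reactions.

```latex
\begin{aligned}
\max_{v} \quad & c^{\mathsf{T}} v \\
\text{subject to} \quad & S\,v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}.
\end{aligned}
```

The equality constraint expresses the steady-state assumption that every intracellular metabolite is produced and consumed at equal rates.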
Ensembl Genomes 2013: scaling up access to genome-wide data

USDA-ARS's Scientific Manuscript database

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...
Reconstruction of a composite comparative map composed of ten legume genomes.

PubMed

Lee, Chaeyoung; Yu, Dongwoon; Choi, Hong-Kyu; Kim, Ryan W

2017-01-01

The Fabaceae (legume family) is the third largest family of flowering plants and the second most important agriculturally. In this study, we report the reconstruction of a composite comparative map composed of ten legume genomes, including seven species from the galegoid clade (Medicago truncatula, Medicago sativa, Lens culinaris, Pisum sativum, Lotus japonicus, Cicer arietinum, Vicia faba) and three species from the phaseoloid clade (Vigna radiata, Phaseolus vulgaris, Glycine max). To accomplish this comparison, a total of 209 cross-species gene-derived markers were employed. The comparative analysis resulted in a single extensive genetic/genomic network composed of 93 chromosomes or linkage groups, from which 110 synteny blocks and other evolutionary events (e.g., 13 inversions) were identified. This comparative map also allowed us to deduce several large-scale evolutionary events, such as chromosome fusion/fission, which might explain differences in chromosome numbers among the compared species or between the two clades. As a result, useful properties of cross-species genic markers were re-verified as an efficient tool for cross-species translation of genomic information, and similar approaches, combined with a high-throughput bioinformatic marker design program, should be effective for applying the knowledge of trait-associated genes to other important crop species for breeding purposes. Here, we provide a basic comparative framework for the ten legume species, which we expect to be useful for crop improvement in legume breeding.
Genomic analysis reveals the biotechnological and industrial potential of levan producing halophilic extremophile, Halomonas smyrnensis AAD6T.

PubMed

Diken, Elif; Ozer, Tugba; Arikan, Muzaffer; Emrence, Zeliha; Oner, Ebru Toksoy; Ustek, Duran; Arga, Kazim Yalcin

2015-01-01

Halomonas smyrnensis AAD6T is a gram negative, aerobic, and moderately halophilic bacterium, and is known to produce high levels of levan with many potential uses in foods, feeds, cosmetics, pharmaceutical and chemical industries due to its outstanding properties. Here, the whole-genome analysis was performed to gain more insight about the biological mechanisms, and the whole-genome organization of the bacterium. Industrially crucial genes, including the levansucrase, were detected and the genome-scale metabolic model of H. smyrnensis AAD6T was reconstructed. The bacterium was found to have many potential applications in biotechnology not only being a levan producer, but also because of its capacity to produce Pel exopolysaccharide, polyhydroxyalkanoates, and osmoprotectants. The genomic information presented here will not only provide additional information to enhance our understanding of the genetic and metabolic network of halophilic bacteria, but also accelerate the research on systematical design of engineering strategies for biotechnology applications.
The PhytoClust tool for metabolic gene clusters discovery in plant genomes

PubMed Central

Fuchs, Lisa-Maria

2017-01-01

The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust, a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGC types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGC discovery which will in turn impact the exploration of plant metabolism. PMID:28486689

Genome-Scale Model and Omics Analysis of Metabolic Capacities of Akkermansia muciniphila Reveal a Preferential Mucin-Degrading Lifestyle

PubMed Central

Suarez-Diez, Maria; Boeren, Sjef; Schaap, Peter J.; Martins dos Santos, Vitor A. P.; Smidt, Hauke; Belzer, Clara

2017-01-01

ABSTRACT The composition and activity of the microbiota in the human gastrointestinal tract are primarily shaped by nutrients derived from either food or the host. Bacteria colonizing the mucus layer have evolved to use mucin as a carbon and energy source. One of the members of the mucosa-associated microbiota is Akkermansia muciniphila, which is capable of producing an extensive repertoire of mucin-degrading enzymes. To further study the substrate utilization abilities of A. muciniphila, we constructed a genome-scale metabolic model to test amino acid auxotrophy, vitamin biosynthesis, and sugar-degrading capacities. The model-supported predictions were validated by in vitro experiments, which showed A. muciniphila to be able to utilize the mucin-derived monosaccharides fucose, galactose, and N-acetylglucosamine. Growth was also observed on N-acetylgalactosamine, even though the metabolic model did not predict this. The uptake of these sugars, as well as the nonmucin sugar glucose, was enhanced in the presence of mucin, indicating that additional mucin-derived components are needed for optimal growth. An analysis of whole-transcriptome sequencing (RNA-Seq) comparing the gene expression of A. muciniphila grown on mucin with that of the same bacterium grown on glucose confirmed the activity of the genes involved in mucin degradation and revealed most of these to be upregulated in the presence of mucin. The transcriptional response was confirmed by a proteome analysis, altogether revealing a hierarchy in the use of sugars and reflecting the adaptation of A. muciniphila to the mucosal environment. In conclusion, these findings provide molecular insights into the lifestyle of A. muciniphila and further confirm its role as a mucin specialist in the gut. IMPORTANCE Akkermansia muciniphila is among the most abundant mucosal bacteria in humans and in a wide range of other animals. Recently, A. muciniphila has attracted considerable attention because of its capacity to
Flux balance analysis of primary metabolism in Chlamydomonas reinhardtii.

PubMed

Boyle, Nanette R; Morgan, John A

2009-01-07

Photosynthetic organisms convert atmospheric carbon dioxide into numerous metabolites along the pathways to make new biomass. Aquatic photosynthetic organisms, which fix almost half of global inorganic carbon, have great potential: as a carbon dioxide fixation method, for the economical production of chemicals, or as a source for lipids and starch which can then be converted to biofuels. To harness this potential through metabolic engineering and to maximize production, a more thorough understanding of photosynthetic metabolism must first be achieved. A model algal species, C. reinhardtii, was chosen and the metabolic network reconstructed. Intracellular fluxes were then calculated using flux balance analysis (FBA). The metabolic network of primary metabolism for a green alga, C. reinhardtii, was reconstructed using genomic and biochemical information. The reconstructed network accounts for the intracellular localization of enzymes to three compartments and includes 484 metabolic reactions and 458 intracellular metabolites. Based on BLAST searches, one newly annotated enzyme (fructose-1,6-bisphosphatase) was added to the Chlamydomonas reinhardtii database. FBA was used to predict metabolic fluxes under three growth conditions, autotrophic, heterotrophic and mixotrophic growth. Biomass yields ranged from 28.9 g per mole C for autotrophic growth to 15 g per mole C for heterotrophic growth. The flux balance analysis model of central and intermediary metabolism in C. reinhardtii is the first such model for algae and the first model to include three metabolically active compartments. In addition to providing estimates of intracellular fluxes, metabolic reconstruction and modelling efforts also provide a comprehensive method for annotation of genome databases. As a result of our reconstruction, one new enzyme was annotated in the database and several others were found to be missing; implying new pathways or non-conserved enzymes. The use of FBA to estimate intracellular
The PhytoClust tool for metabolic gene clusters discovery in plant genomes.

PubMed

Töpfer, Nadine; Fuchs, Lisa-Maria; Aharoni, Asaph

2017-07-07

The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust, a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGC types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage PhytoClust to enhance novel MGC discovery which will in turn impact the exploration of plant metabolism. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
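The idea of mining a genome for physically co-localized metabolic enzymes can be sketched as follows. This is not PhytoClust's algorithm: it is a simplified illustration that assumes the hidden Markov model search has already been run and has produced a list of hits, each giving a chromosome, a gene index along that chromosome and an enzyme family label. The function name, the thresholds and the family labels are all hypothetical.

```python
def find_candidate_clusters(hits, max_gap=2, min_genes=3, min_families=2):
    """Group enzyme-family hits into candidate clusters by genomic proximity.

    hits: list of (chromosome, gene_index, enzyme_family) tuples.
    Two consecutive hits join the same run when they lie on the same
    chromosome and their gene indices differ by at most max_gap.
    Thresholds are illustrative assumptions, not published defaults.
    """
    runs = []
    current = []
    for hit in sorted(hits):
        # Start a new run on a chromosome change or when the gap is too large.
        if current and (hit[0] != current[-1][0] or hit[1] - current[-1][1] > max_gap):
            runs.append(current)
            current = []
        current.append(hit)
    if current:
        runs.append(current)
    # Keep runs with enough genes and enough distinct enzyme families.
    return [
        run for run in runs
        if len(run) >= min_genes and len({family for _, _, family in run}) >= min_families
    ]


# Hypothetical hits; family labels are placeholders.
hits = [
    ("chr1", 10, "terpene_synthase"),
    ("chr1", 11, "CYP450"),
    ("chr1", 13, "CYP450"),
    ("chr1", 40, "UGT"),
    ("chr2", 5, "terpene_synthase"),
    ("chr2", 6, "terpene_synthase"),
    ("chr2", 7, "terpene_synthase"),
]

clusters = find_candidate_clusters(hits)
for run in clusters:
    print(run[0][0], [index for _, index, _ in run])
```

In this toy input only the first run on chr1 qualifies: the isolated hit at index 40 is too short, and the chr2 run contains a single enzyme family. Requiring several distinct families is one simple way to separate candidate pathway clusters from tandem duplications of one gene.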
Genome-scale model-driven strain design for dicarboxylic acid production in Yarrowia lipolytica.

PubMed

Mishra, Pranjul; Lee, Na-Rae; Lakshmanan, Meiyappan; Kim, Minsuk; Kim, Byung-Gee; Lee, Dong-Yup

2018-03-19

Recently, there have been several attempts to produce long-chain dicarboxylic acids (DCAs) in various microbial hosts. Of these, Yarrowia lipolytica has great potential due to its oleaginous characteristics and unique ability to utilize hydrophobic substrates. However, Y. lipolytica should be further engineered to make it more competitive: the current approaches are mostly intuitive and cumbersome, thus limiting its industrial application. In this study, we proposed model-guided metabolic engineering strategies for enhanced production of DCAs in Y. lipolytica. At the outset, we reconstructed a genome-scale metabolic model (GSMM) of Y. lipolytica (iYLI647) by substantially expanding the previous models. Subsequently, the model was validated using three sets of published culture experiment data. It was finally exploited to identify genetic engineering targets for overexpression, knockout, and cofactor modification by applying several in silico strain design methods, which potentially give rise to high-yield production of the industrially relevant long-chain DCAs, e.g., dodecanedioic acid (DDDA). The resultant targets include (1) malate dehydrogenase and malic enzyme genes and (2) the glutamate dehydrogenase gene, in silico overexpression of which generated additional NADPH required for fatty acid synthesis, increasing DDDA fluxes by 48% and 22%, respectively, compared to wild-type. We further investigated the effect of supplying branched-chain amino acids on the turnover rate of acetyl-CoA, a key metabolite for fatty acid synthesis, suggesting their significance for production of DDDA in Y. lipolytica. In silico model-based strain design strategies allowed us to identify several metabolic engineering targets for overproducing DCAs in the lipid-accumulating yeast Y. lipolytica. Thus, the current study can provide a methodological framework that is applicable to other oleaginous yeasts for value-added biochemical production.
Genomic islands link secondary metabolism to functional adaptation in marine Actinobacteria

PubMed Central

Penn, Kevin; Jenkins, Caroline; Nett, Markus; Udwary, Daniel W.; Gontang, Erin A.; McGlinchey, Ryan P.; Foster, Brian; Lapidus, Alla; Podell, Sheila; Allen, Eric E.; Moore, Bradley S.; Jensen, Paul R.

2009-01-01

Genomic islands have been shown to harbor functional traits that differentiate ecologically distinct populations of environmental bacteria. A comparative analysis of the complete genome sequences of the marine Actinobacteria Salinispora tropica and S. arenicola reveals that 75% of the species-specific genes are located in 21 genomic islands. These islands are enriched in genes associated with secondary metabolite biosynthesis providing evidence that secondary metabolism is linked to functional adaptation. Secondary metabolism accounts for 8.8% and 10.9% of the genes in the S. tropica and S. arenicola genomes, respectively, and represents the major functional category of annotated genes that differentiates the two species. Genomic islands harbor all 25 of the species-specific biosynthetic pathways, the majority of which occur in S. arenicola and may contribute to the cosmopolitan distribution of this species. Genome evolution is dominated by gene duplication and acquisition, which in the case of secondary metabolism provide immediate opportunities for the production of new bioactive products. Evidence that secondary metabolic pathways are exchanged horizontally, coupled with prior evidence for fixation among globally distributed populations, supports a functional role and suggests that the acquisition of natural product biosynthetic gene clusters represents a previously unrecognized force driving bacterial diversification. Species-specific differences observed in CRISPR (clustered regularly interspaced short palindromic repeat) sequences suggest that S. arenicola may possess a higher level of phage immunity, while a highly duplicated family of polymorphic membrane proteins provides evidence of a new mechanism of marine adaptation in Gram-positive bacteria. PMID:19474814
Combining Genome-Scale Experimental and Computational Methods To Identify Essential Genes in Rhodobacter sphaeroides

DOE PAGES

Burger, Brian T.; Imam, Saheed; Scarborough, Matthew J.; ...

2017-06-06

Rhodobacter sphaeroides is one of the best-studied alphaproteobacteria from biochemical, genetic, and genomic perspectives. To gain a better systems-level understanding of this organism, we generated a large transposon mutant library and used transposon sequencing (Tn-seq) to identify genes that are essential under several growth conditions. Using newly developed Tn-seq analysis software (TSAS), we identified 493 genes as essential for aerobic growth on a rich medium. We then used the mutant library to identify conditionally essential genes under two laboratory growth conditions, identifying 85 additional genes required for aerobic growth in a minimal medium and 31 additional genes required for photosynthetic growth. In all instances, our analyses confirmed essentiality for many known genes and identified genes not previously considered to be essential. We used the resulting Tn-seq data to refine and improve a genome-scale metabolic network model (GEM) for R. sphaeroides. Together, we demonstrate how genetic, genomic, and computational approaches can be combined to obtain a systems-level understanding of the genetic framework underlying metabolic diversity in bacterial species.
Demographic History of the Genus Pan Inferred from Whole Mitochondrial Genome Reconstructions

PubMed Central

Tucci, Serena; de Manuel, Marc; Ghirotto, Silvia; Benazzo, Andrea; Prado-Martinez, Javier; Lorente-Galdos, Belen; Nam, Kiwoong; Dabad, Marc; Hernandez-Rodriguez, Jessica; Comas, David; Navarro, Arcadi; Schierup, Mikkel H.; Andres, Aida M.; Barbujani, Guido; Hvilsom, Christina; Marques-Bonet, Tomas

2016-01-01

The genus Pan is the closest genus to our own and it includes two species, Pan paniscus (bonobos) and Pan troglodytes (chimpanzees). The latter comprises four subspecies, all highly endangered. The study of the genus Pan has been persistently complicated by the intricate relationship among subspecies and the statistical limitations imposed by the reduced number of samples or genomic markers analyzed. Here, we present a new method to reconstruct complete mitochondrial genomes (mitogenomes) from whole genome shotgun (WGS) datasets, mtArchitect, showing that its reconstructions are highly accurate and consistent with long-range PCR mitogenomes. We used this approach to build the mitochondrial genomes of 20 newly sequenced samples which, together with available genomes, allowed us to analyze the hitherto most complete Pan mitochondrial genome dataset including 156 chimpanzee and 44 bonobo individuals, with a proportional contribution from all chimpanzee subspecies. We estimated the separation time between chimpanzees and bonobos around 1.15 million years ago (Mya) [0.81–1.49]. Further, we found that under the most probable genealogical model the two clades of chimpanzees, Western + Nigeria-Cameroon and Central + Eastern, separated at 0.59 Mya [0.41–0.78] with further internal separations at 0.32 Mya [0.22–0.43] and 0.16 Mya [0.17–0.34], respectively. Finally, for a subset of our samples, we compared nuclear versus mitochondrial genomes and we found that chimpanzee subspecies have different patterns of nuclear and mitochondrial diversity, which could be a result of either processes affecting the mitochondrial genome, such as hitchhiking or background selection, or a result of population dynamics. PMID:27345955
Genome size evolution in relation to leaf strategy and metabolic rates revisited.

PubMed

Beaulieu, Jeremy M; Leitch, Ilia J; Knight, Charles A

2007-03-01

It has been proposed that having too much DNA may carry physiological consequences for plants. The strong correlation between DNA content, cell size and cell division rate could lead to predictable morphological variation in plants, including a negative relationship with leaf mass per unit area (LMA). In addition, the possible increased demand for resources in species with high DNA content may have downstream effects on maximal metabolic efficiency, including decreased metabolic rates. Tests were made for genome size-dependent variation in LMA and metabolic rates (mass-based photosynthetic rate and dark respiration rate) using our own measurements and data from a plant functional trait database (Glopnet). These associations were tested using two metrics of genome size: bulk DNA amount (2C DNA) and monoploid genome size (1Cx DNA). The data were analysed using an evolutionary framework that included a regression analysis and independent contrasts using a phylogenetic tree with estimates of molecular diversification times. A contribution index for the LMA data set was also calculated to determine which divergences have the greatest influence on the relationship between genome size and LMA. A significant negative association was found between bulk DNA amount and LMA in angiosperms. This was primarily a result of influential divergences that may represent early shifts in growth form. However, divergences in bulk DNA amount were positively associated with divergences in LMA, suggesting that the relationship may be indirect and mediated through other traits directly related to genome size. There was a significant negative association between genome size and metabolic rates that was driven by a basal divergence between angiosperms and gymnosperms; no significant independent contrast results were found. Therefore, it is concluded that genome size-dependent constraints acting on metabolic efficiency may not exist within seed plants.
Genome-Based Metabolic Mapping and 13C Flux Analysis Reveal Systematic Properties of an Oleaginous Microalga Chlorella protothecoides

DOE PAGES

Wu, Chao; Xiong, Wei; Dai, Junbiao; ...

2014-12-15

We report integrated, genome-based flux balance analysis, metabolomics, and 13C-label profiling of phototrophic and heterotrophic metabolism in Chlorella protothecoides, an oleaginous green alga for biofuel. The green alga Chlorella protothecoides, capable of autotrophic and heterotrophic growth with rapid lipid synthesis, is a promising candidate for biofuel production. Based on the newly available genome knowledge of the alga, we reconstructed the compartmentalized metabolic network consisting of 272 metabolic reactions, 270 enzymes, and 461 encoding genes and simulated the growth in different cultivation conditions with flux balance analysis. Phenotype-phase plane analysis shows conditions achieving the theoretical maximum of the biomass and corresponding fatty acid-producing rate for phototrophic cells (the ratio of photon uptake rate to CO2 uptake rate equals 8.4) and heterotrophic ones (the ratio of glucose uptake rate to O2 consumption rate reaches 2.4), respectively. Isotope-assisted liquid chromatography-mass spectrometry/mass spectrometry reveals higher metabolite concentrations in the glycolytic pathway and the tricarboxylic acid cycle in heterotrophic cells compared with autotrophic cells. We also observed enhanced levels of ATP, reduced nicotinamide adenine dinucleotide (phosphate), acetyl-Coenzyme A, and malonyl-Coenzyme A in heterotrophic cells, consistent with a strong activity of lipid synthesis. To profile the flux map in experimental conditions, we applied nonstationary 13C metabolic flux analysis as a complementing strategy to flux balance analysis. The result reveals negligible photorespiratory fluxes and a tricarboxylic acid cycle with low metabolic activity in phototrophic C. protothecoides. In comparison, high throughput of amphibolic reactions and the tricarboxylic acid cycle with no glyoxylate shunt activities were measured for heterotrophic cells. Lastly, taken together, the metabolic network modeling
Genomic analysis of methanogenic archaea reveals a shift towards energy conservation

DOE PAGES

Gilmore, Sean P.; Henske, John K.; Sexton, Jessica A.; ...

2017-08-21

The metabolism of archaeal methanogens drives methane release into the environment and is critical to understanding global carbon cycling. Methanogenesis operates at a very low reducing potential compared to other forms of respiration and is therefore critical to many anaerobic environments. Harnessing or altering methanogen metabolism has the potential to mitigate global warming and even be utilized for energy applications. Here, we report draft genome sequences for the isolated methanogens Methanobacterium bryantii, Methanosarcina spelaei, Methanosphaera cuniculi, and Methanocorpusculum parvum. These anaerobic, methane-producing archaea represent a diverse set of isolates, capable of methylotrophic, acetoclastic, and hydrogenotrophic methanogenesis. Assembly and analysis of the genomes allowed for simple and rapid reconstruction of metabolism in the four methanogens. Comparison of the distribution of Clusters of Orthologous Groups (COG) proteins to a sample of genomes from the RefSeq database revealed a trend towards energy conservation in genome composition of all methanogens sequenced. Further analysis of the predicted membrane proteins and transporters distinguished differing energy conservation methods utilized during methanogenesis, such as chemiosmotic coupling in Msar. spelaei and electron bifurcation linked to chemiosmotic coupling in Mbac. bryantii and Msph. cuniculi. Methanogens occupy a unique ecological niche, acting as the terminal electron acceptors in anaerobic environments, and their genomes display a significant shift towards energy conservation. The genome-enabled reconstructed metabolisms reported here have significance to diverse anaerobic communities and have led to proposed substrate utilization not previously reported in isolation, such as formate and methanol metabolism in Mbac. bryantii and CO2 metabolism in Msph. cuniculi. The newly proposed substrates establish an important foundation with which to decipher how methanogens
Genomic analysis of methanogenic archaea reveals a shift towards energy conservation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gilmore, Sean P.; Henske, John K.; Sexton, Jessica A.

The metabolism of archaeal methanogens drives methane release into the environment and is critical to understanding global carbon cycling. Methanogenesis operates at a very low reducing potential compared to other forms of respiration and is therefore critical to many anaerobic environments. Harnessing or altering methanogen metabolism has the potential to mitigate global warming and even be utilized for energy applications. Here, we report draft genome sequences for the isolated methanogens Methanobacterium bryantii, Methanosarcina spelaei, Methanosphaera cuniculi, and Methanocorpusculum parvum. These anaerobic, methane-producing archaea represent a diverse set of isolates, capable of methylotrophic, acetoclastic, and hydrogenotrophic methanogenesis. Assembly and analysis of the genomes allowed for simple and rapid reconstruction of metabolism in the four methanogens. Comparison of the distribution of Clusters of Orthologous Groups (COG) proteins to a sample of genomes from the RefSeq database revealed a trend towards energy conservation in genome composition of all methanogens sequenced. Further analysis of the predicted membrane proteins and transporters distinguished differing energy conservation methods utilized during methanogenesis, such as chemiosmotic coupling in Msar. spelaei and electron bifurcation linked to chemiosmotic coupling in Mbac. bryantii and Msph. cuniculi. Methanogens occupy a unique ecological niche, acting as the terminal electron acceptors in anaerobic environments, and their genomes display a significant shift towards energy conservation. The genome-enabled reconstructed metabolisms reported here have significance to diverse anaerobic communities and have led to proposed substrate utilization not previously reported in isolation, such as formate and methanol metabolism in Mbac. bryantii and CO2 metabolism in Msph. cuniculi. The newly proposed substrates establish an important foundation with which to decipher how methanogens
Genome-wide Reconstruction of OxyR and SoxRS Transcriptional Regulatory Networks under Oxidative Stress in Escherichia coli K-12 MG1655.

PubMed

Seo, Sang Woo; Kim, Donghyuk; Szubin, Richard; Palsson, Bernhard O

2015-08-25

Three transcription factors (TFs), OxyR, SoxR, and SoxS, play a critical role in transcriptional regulation of the defense system for oxidative stress in bacteria. However, their full genome-wide regulatory potential is unknown. Here, we perform a genome-scale reconstruction of the OxyR, SoxR, and SoxS regulons in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 68 genes in 51 transcription units (TUs) belong to these regulons. Among them, 48 genes showed more than 2-fold changes in expression level under single-TF-knockout conditions. This reconstruction expands the genome-wide roles of these factors to include direct activation of genes related to amino acid biosynthesis (methionine and aromatic amino acids), cell wall synthesis (lipid A biosynthesis and peptidoglycan growth), and divalent metal ion transport (Mn(2+), Zn(2+), and Mg(2+)). Investigating the co-regulation of these genes with other stress-response TFs reveals that they are independently regulated by stress-specific TFs. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Genome reconstructions indicate the partitioning of ecological functions inside a phytoplankton bloom in the Amundsen Sea, Antarctica

PubMed Central

Delmont, Tom O.; Eren, A. Murat; Vineis, Joseph H.; Post, Anton F.

2015-01-01

Antarctica polynyas support intense phytoplankton blooms, impacting their environment by a substantial depletion of inorganic carbon and nutrients. These blooms are dominated by the colony-forming haptophyte Phaeocystis antarctica and they are accompanied by a distinct bacterial population. Yet, the ecological role these bacteria may play in P. antarctica blooms awaits elucidation of their functional gene pool and of the geochemical activities they support. Here, we report on a metagenome (~160 million reads) analysis of the microbial community associated with a P. antarctica bloom event in the Amundsen Sea polynya (West Antarctica). Genomes of the most abundant Bacteroidetes and Proteobacteria populations have been reconstructed and a network analysis indicates a strong functional partitioning of these bacterial taxa. Three of them (SAR92, and members of the Oceanospirillaceae and Cryomorphaceae) are found in close association with P. antarctica colonies. Distinct features of their carbohydrate, nitrogen, sulfur and iron metabolisms may serve to support mutualistic relationships with P. antarctica. The SAR92 genome indicates a specialization in the degradation of fatty acids and dimethylsulfoniopropionate (compounds released by P. antarctica) into dimethyl sulfide, an aerosol precursor. The Oceanospirillaceae genome carries genes that may enhance algal physiology (cobalamin synthesis). Finally, the Cryomorphaceae genome is enriched in genes that function in cell or colony invasion. A novel pico-eukaryote, Micromonas related genome (19.6 Mb, ~94% completion) was also recovered. It contains the gene for an anti-freeze protein, which is lacking in Micromonas at lower latitudes. These draft genomes are representative for abundant microbial taxa across the Southern Ocean surface. PMID:26579075
Ensembl Genomes 2013: scaling up access to genome-wide data.

PubMed

Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

2014-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.
Sequential computation of elementary modes and minimal cut sets in genome-scale metabolic networks using alternate integer linear programming.

PubMed

Song, Hyun-Seob; Goldberg, Noam; Mahajan, Ashutosh; Ramkrishna, Doraiswami

2017-08-01

Elementary (flux) modes (EMs) have served as a valuable tool for investigating structural and functional properties of metabolic networks. Identification of the full set of EMs in genome-scale networks remains challenging due to combinatorial explosion of EMs in complex networks. Often, however, only a small subset of relevant EMs needs to be known, for which optimization-based sequential computation is a useful alternative. Most of the currently available methods along this line are based on the iterative use of mixed integer linear programming (MILP), the effectiveness of which significantly deteriorates as the number of iterations builds up. To alleviate the computational burden associated with the MILP implementation, we here present a novel optimization algorithm termed alternate integer linear programming (AILP). Our algorithm was designed to iteratively solve a pair of integer programming (IP) and linear programming (LP) problems to compute EMs in a sequential manner. In each step, the IP identifies a minimal subset of reactions, the deletion of which disables all previously identified EMs. Thus, a subsequent LP solution subject to this reaction deletion constraint becomes a distinct EM. In cases where no feasible LP solution is available, IP-derived reaction deletion sets represent minimal cut sets (MCSs). Despite the additional computation of MCSs, AILP achieved significant time reduction in computing EMs by orders of magnitude. The proposed AILP algorithm not only offers a computational advantage in the EM analysis of genome-scale networks, but also improves the understanding of the linkage between EMs and MCSs. The software is implemented in Matlab and is provided as supplementary information. Contact: hyunseob.song@pnnl.gov. Supplementary data are available at Bioinformatics online. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
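The IP step described in this abstract can be read as a hitting-set problem: choose as few reactions as possible so that every previously identified EM loses at least one of its reactions. The Python sketch below illustrates that reading only. It assumes "minimal" means minimum cardinality, replaces the integer program with brute-force enumeration, and uses hypothetical reaction names and EM supports; it is not the authors' Matlab implementation.

```python
from itertools import combinations

def minimal_deletion_set(em_supports):
    """Return a smallest set of reactions that intersects every EM support.

    em_supports: list of sets, each holding the reactions that carry flux
    in one previously identified elementary mode (hypothetical toy input).
    """
    reactions = sorted(set().union(*em_supports))
    # Try candidate sets in order of increasing size, so the first hit
    # has minimum cardinality.
    for size in range(len(reactions) + 1):
        for candidate in combinations(reactions, size):
            chosen = set(candidate)
            if all(chosen & support for support in em_supports):
                return chosen
    return None  # only reached if some EM support is empty

# Toy supports of three known EMs (reaction names are illustrative).
known_ems = [{"R1", "R2", "R3"}, {"R1", "R4"}, {"R2", "R4", "R5"}]
deletion_set = minimal_deletion_set(known_ems)
print(sorted(deletion_set))
```

In the scheme the abstract describes, an LP would then be solved with these reactions constrained to zero flux: a feasible solution gives a new EM, while infeasibility marks the deletion set as an MCS.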
Reconstruction and evolutionary history of eutherian chromosomes

PubMed Central

Kim, Jaebum; Auvil, Loretta; Capitanu, Boris; Larkin, Denis M.; Ma, Jian; Lewin, Harris A.

2017-01-01

Whole-genome assemblies of 19 placental mammals and two outgroup species were used to reconstruct the order and orientation of syntenic fragments in chromosomes of the eutherian ancestor and six other descendant ancestors leading to human. For ancestral chromosome reconstructions, we developed an algorithm (DESCHRAMBLER) that probabilistically determines the adjacencies of syntenic fragments using chromosome-scale and fragmented genome assemblies. The reconstructed chromosomes of the eutherian, boreoeutherian, and euarchontoglires ancestor each included >80% of the entire length of the human genome, whereas reconstructed chromosomes of the most recent common ancestor of simians, catarrhini, great apes, and humans and chimpanzees included >90% of human genome sequence. These high-coverage reconstructions permitted reliable identification of chromosomal rearrangements over ∼105 My of eutherian evolution. Orangutan was found to have eight chromosomes that were completely conserved in homologous sequence order and orientation with the eutherian ancestor, the largest number for any species. Ruminant artiodactyls had the highest frequency of intrachromosomal rearrangements, and interchromosomal rearrangements dominated in murid rodents. A total of 162 chromosomal breakpoints in evolution of the eutherian ancestral genome to the human genome were identified; however, the rate of rearrangements was significantly lower (0.80/My) during the first ∼60 My of eutherian evolution, then increased to greater than 2.0/My along the five primate lineages studied. Our results significantly expand knowledge of eutherian genome evolution and will facilitate greater understanding of the role of chromosome rearrangements in adaptation, speciation, and the etiology of inherited and spontaneously occurring diseases. PMID:28630326
A genome scale metabolic network for rice and accompanying analysis of tryptophan, auxin and serotonin biosynthesis regulation under biotic stress

USDA-ARS's Scientific Manuscript database

Functional annotations of large plant genome projects mostly provide information on gene function and gene families based on the presence of protein domains and gene homology, but not necessarily in association with gene expression or metabolic and regulatory networks. These additional annotations a...
Linkage of Organic Anion Transporter-1 to Metabolic Pathways through Integrated “Omics”-driven Network and Functional Analysis

PubMed Central

Ahn, Sun-Young; Jamshidi, Neema; Mo, Monica L.; Wu, Wei; Eraly, Satish A.; Dnyanmote, Ankur; Bush, Kevin T.; Gallegos, Tom F.; Sweet, Douglas H.; Palsson, Bernhard Ø.; Nigam, Sanjay K.

2011-01-01

The main kidney transporter of many commonly prescribed drugs (e.g. penicillins, diuretics, antivirals, methotrexate, and non-steroidal anti-inflammatory drugs) is organic anion transporter-1 (OAT1), originally identified as NKT (Lopez-Nieto, C. E., You, G., Bush, K. T., Barros, E. J., Beier, D. R., and Nigam, S. K. (1997) J. Biol. Chem. 272, 6471–6478). Targeted metabolomics in knockouts have shown that OAT1 mediates the secretion or reabsorption of many important metabolites, including intermediates in carbohydrate, fatty acid, and amino acid metabolism. This observation raises the possibility that OAT1 helps regulate broader metabolic activities. We therefore examined the potential roles of OAT1 in metabolic pathways using Recon 1, a functionally tested genome-scale reconstruction of human metabolism. A computational approach was used to analyze in vivo metabolomic as well as transcriptomic data from wild-type and OAT1 knock-out animals, resulting in the implication of several metabolic pathways, including the citric acid cycle, polyamine, and fatty acid metabolism. Validation by in vitro and ex vivo analysis using Xenopus oocyte, cell culture, and kidney tissue assays demonstrated interactions between OAT1 and key intermediates in these metabolic pathways, including previously unknown substrates, such as polyamines (e.g. spermine and spermidine). A genome-scale metabolic network reconstruction generated some experimentally supported predictions for metabolic pathways linked to OAT1-related transport. The data support the possibility that the SLC22 and other families of transporters, known to be expressed in many tissues and primarily known for drug and toxin clearance, are integral to a number of endogenous pathways and may be involved in a larger remote sensing and signaling system (Ahn, S. Y., and Nigam, S. K. (2009) Mol. Pharmacol. 76, 481–490, and Wu, W., Dnyanmote, A. V., and Nigam, S. K. (2011) Mol. Pharmacol. 79, 795–805). Drugs may alter
How accurate is automated gap filling of metabolic models?

PubMed

Karp, Peter D; Weaver, Daniel; Latendresse, Mario

2018-06-19

Reaction gap filling is a computational technique for proposing the addition of reactions to genome-scale metabolic models to permit those models to run correctly. Gap filling completes what are otherwise incomplete models that lack fully connected metabolic networks. The models are incomplete because they are derived from annotated genomes in which not all enzymes have been identified. Here we compare the results of applying an automated likelihood-based gap filler within the Pathway Tools software with the results of manually gap filling the same metabolic model. Both gap-filling exercises were applied to the same genome-derived qualitative metabolic reconstruction for Bifidobacterium longum subsp. longum JCM 1217, and to the same modeling conditions - anaerobic growth under four nutrients producing 53 biomass metabolites. The solution computed by the gap-filling program GenDev contained 12 reactions, but closer examination showed that solution was not minimal; two of the twelve reactions can be removed to yield a set of ten reactions that enable model growth. The manually curated solution contained 13 reactions, eight of which were shared with the 12-reaction computed solution. Thus, GenDev achieved recall of 61.5% and precision of 66.6%. These results suggest that although computational gap fillers are populating metabolic models with significant numbers of correct reactions, automatically gap-filled metabolic models also contain significant numbers of incorrect reactions. Our conclusion is that manual curation of gap-filler results is needed to obtain high-accuracy models. Many of the differences between the manual and automatic solutions resulted from using expert biological knowledge to direct the choice of reactions within the curated solution, such as reactions specific to the anaerobic lifestyle of B. longum.
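The recall and precision quoted in this abstract follow directly from the three counts it gives (8 shared reactions, 12 computed, 13 manually curated). A short Python check, treating the manual solution as the reference set; the function and parameter names are illustrative:

```python
def precision_recall(n_shared, n_predicted, n_reference):
    """Precision and recall of a predicted reaction set against a reference."""
    precision = n_shared / n_predicted   # fraction of predictions that are correct
    recall = n_shared / n_reference      # fraction of the reference recovered
    return precision, recall

# Counts taken from the abstract: 8 shared, 12 computed, 13 manually curated.
precision, recall = precision_recall(n_shared=8, n_predicted=12, n_reference=13)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

The recall matches the reported 61.5%. The precision is 8/12, which rounds to 66.7%; the abstract's 66.6% appears to be the same value truncated rather than rounded.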
Capturing prokaryotic dark matter genomes.

PubMed

Gasc, Cyrielle; Ribière, Céline; Parisot, Nicolas; Beugnot, Réjane; Defois, Clémence; Petit-Biderre, Corinne; Boucher, Delphine; Peyretaillade, Eric; Peyret, Pierre

2015-12-01

Prokaryotes are the most diverse and abundant cellular life forms on Earth. Most of them, identified by indirect molecular approaches, belong to microbial dark matter. The advent of metagenomic and single-cell genomic approaches has highlighted the metabolic capabilities of numerous members of this dark matter through genome reconstruction. Thus, linking functions back to the species has revolutionized our understanding of how ecosystem function is sustained by the microbial world. This review will present discoveries acquired through the illumination of prokaryotic dark matter genomes by these innovative approaches. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

DeCoSTAR: Reconstructing the Ancestral Organization of Genes or Genomes Using Reconciled Phylogenies

PubMed Central

Anselmetti, Yoann; Patterson, Murray; Ponty, Yann; Bérard, Sèverine; Chauve, Cedric; Scornavacca, Celine; Daubin, Vincent; Tannier, Eric

2017-01-01

DeCoSTAR is a software that aims at reconstructing the organization of ancestral genes or genomes in the form of sets of neighborhood relations (adjacencies) between pairs of ancestral genes or gene domains. It can also improve the assembly of fragmented genomes by proposing evolutionary-induced adjacencies between scaffolding fragments. Ancestral genes or domains are deduced from reconciled phylogenetic trees under an evolutionary model that considers gains, losses, speciations, duplications, and transfers as possible events for gene evolution. Reconciliations are either given as input or computed with the ecceTERA package, into which DeCoSTAR is integrated. DeCoSTAR computes adjacency evolutionary scenarios using a scoring scheme based on a weighted sum of adjacency gains and breakages. Solutions, both optimal and near-optimal, are sampled according to the Boltzmann–Gibbs distribution centered around parsimonious solutions, and statistical supports on ancestral and extant adjacencies are provided. DeCoSTAR supports the features of previously contributed tools that reconstruct ancestral adjacencies, namely DeCo, DeCoLT, ART-DeCo, and DeClone. In a few minutes, DeCoSTAR can reconstruct the evolutionary history of domains inside genes, of gene fusion and fission events, or of gene order along chromosomes, for large data sets including dozens of whole genomes from all kingdoms of life. We illustrate the potential of DeCoSTAR with several applications: ancestral reconstruction of gene orders for Anopheles mosquito genomes, multidomain proteins in Drosophila, and gene fusion and fission detection in Actinobacteria. Availability: http://pbil.univ-lyon1.fr/software/DeCoSTAR (Last accessed April 24, 2017). PMID:28402423
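The scoring and sampling ideas mentioned in this abstract can be illustrated on a toy scale. The Python sketch below scores a few candidate scenarios by a weighted sum of adjacency gains and breakages and converts the scores into Boltzmann-Gibbs probabilities. The weights, the temperature parameter `kT`, and the scenario counts are all hypothetical, and enumerating scenarios explicitly is an assumption made for illustration; this is not DeCoSTAR's actual procedure.

```python
import math

def adjacency_cost(gains, breakages, gain_weight=1.0, break_weight=1.0):
    """Weighted sum of adjacency gains and breakages (weights are assumed)."""
    return gain_weight * gains + break_weight * breakages

def boltzmann_probabilities(costs, kT=1.0):
    """Probability of each scenario, proportional to exp(-cost / kT)."""
    min_cost = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - min_cost) / kT) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical scenarios given as (gains, breakages).
scenarios = [(2, 1), (3, 1), (2, 3)]
costs = [adjacency_cost(g, b) for g, b in scenarios]
probs = boltzmann_probabilities(costs, kT=0.5)
for cost, p in zip(costs, probs):
    print(f"cost {cost:.1f}: probability {p:.3f}")
```

Lowering `kT` concentrates the probability on the most parsimonious scenario, while raising it gives near-optimal scenarios more weight.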
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology

PubMed Central

Latendresse, Mario; Paley, Suzanne M.; Krummenacker, Markus; Ong, Quang D.; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M.; Caspi, Ron

2016-01-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. PMID:26454094
Metabolic Engineering for Probiotics and their Genome-Wide Expression Profiling.

PubMed

Yadav, Ruby; Singh, Puneet K; Shukla, Pratyoosh

2018-01-01

Probiotic supplements have attracted considerable attention in the food industry, and the field has grown remarkably. Metabolic engineering (ME) approaches help to elucidate their mechanisms of action and increase the possibility of designing probiotic strains with desired functions. Probiotic microorganisms are generally industrially important lactic acid bacteria (LAB), which are involved in fermenting dairy products, food, and beverages and produce lactic acid as the final product. This review describes a number of examples of metabolic engineering in industrial probiotic bacteria, including transcriptomic studies of Lactobacillus reuteri and improvement of exopolysaccharide (EPS) biosynthesis yield in Lactobacillus casei LC2W. It summarizes various metabolic engineering approaches for exploring metabolic pathways. These approaches enable evaluation of the cellular metabolic state and effective editing of the microbial genome or introduction of novel enzymes to redirect carbon fluxes. In addition, various systems biology tools, such as in silico design, that are commonly used for improving strain performance are also discussed. Finally, we discuss the integration of metabolic engineering and genome profiling, which offers a new way to explore metabolic interactions, fluxomics and probiogenomics using probiotic bacteria such as Bifidobacterium spp. and Lactobacillus spp. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Development of constraint-based system-level models of microbial metabolism.

PubMed

Navid, Ali

2012-01-01

Genome-scale models of metabolism are valuable tools for using genomic information to predict microbial phenotypes. System-level mathematical models of metabolic networks have been developed for a number of microbes and have been used to gain new insights into the biochemical conversions that occur within organisms and permit their survival and proliferation. Utilizing these models, computational biologists can (1) examine network structures, (2) predict metabolic capabilities and resolve unexplained experimental observations, (3) generate and test new hypotheses, (4) assess the nutritional requirements of the organism and approximate its environmental niche, (5) identify missing enzymatic functions in the annotated genome, and (6) engineer desired metabolic capabilities in model organisms. This chapter details the protocol for developing genome-scale models of metabolism in microbes as well as tips for accelerating the model building process.
Reconstruction of metabolic networks from high-throughput metabolite profiling data: in silico analysis of red blood cell metabolism.

PubMed

Nemenman, Ilya; Escola, G Sean; Hlavacek, William S; Unkefer, Pat J; Unkefer, Clifford J; Wall, Michael E

2007-12-01

We investigate the ability of algorithms developed for reverse engineering of transcriptional regulatory networks to reconstruct metabolic networks from high-throughput metabolite profiling data. For benchmarking purposes, we generate synthetic metabolic profiles based on a well-established model for red blood cell metabolism. A variety of data sets are generated, accounting for different properties of real metabolic networks, such as experimental noise, metabolite correlations, and temporal dynamics. These data sets are made available online. We use ARACNE, a mainstream algorithm for reverse engineering of transcriptional regulatory networks from gene expression data, to predict metabolic interactions from these data sets. We find that the performance of ARACNE on metabolic data is comparable to that on gene expression data.
Genome-wide association studies of obesity and metabolic syndrome.

PubMed

Fall, Tove; Ingelsson, Erik

2014-01-25

Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was mapped in 2007 to a gene whose function was unknown at the time. The association with this gene, now known as fat mass and obesity associated (FTO), has been replicated repeatedly in several ethnicities, and the gene affects obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution, and metabolic syndrome. This systematic review of obesity GWAS summarizes genome-wide significant findings for obesity and metabolic syndrome and briefly suggests what is to be expected in the next few years. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Comparative analysis of rosaceous genomes and the reconstruction of a putative ancestral genome for the family

PubMed Central

2011-01-01

Background Comparative genome mapping studies in Rosaceae have been conducted until now by aligning genetic maps within the same genus, or closely related genera and using a limited number of common markers. The growing body of genomics resources and sequence data for both Prunus and Fragaria permits detailed comparisons between these genera and the recently released Malus × domestica genome sequence. Results We generated a comparative analysis using 806 molecular markers that are anchored genetically to the Prunus and/or Fragaria reference maps, and physically to the Malus genome sequence. Markers in common for Malus and Prunus, and Malus and Fragaria, respectively were 784 and 148. The correspondence between marker positions was high and conserved syntenic blocks were identified among the three genera in the Rosaceae. We reconstructed a proposed ancestral genome for the Rosaceae. Conclusions A genome containing nine chromosomes is the most likely candidate for the ancestral Rosaceae progenitor. The number of chromosomal translocations observed between the three genera investigated was low. However, the number of inversions identified among Malus and Prunus was much higher than any reported genome comparisons in plants, suggesting that small inversions have played an important role in the evolution of these two genera or of the Rosaceae. PMID:21226921
Genome size and metabolic intensity in tetrapods: a tale of two lines

PubMed Central

Vinogradov, Alexander E; Anatskaya, Olga V

2005-01-01

We show the negative link between genome size and metabolic intensity in tetrapods, using the heart index (relative heart mass) as a unified indicator of metabolic intensity in poikilothermal and homeothermal animals. We found two separate regression lines of heart index on genome size for reptiles–birds and amphibians–mammals (the slope of regression is steeper in reptiles–birds). We also show a negative correlation between GC content and nucleosome formation potential in vertebrate DNA, and, consistent with this relationship, a positive correlation between genome GC content and nuclear size (independent of genome size). It is known that there are two separate regression lines of genome GC content on genome size for reptiles–birds and amphibians–mammals: reptiles–birds have the relatively higher GC content (for their genome sizes) compared to amphibians–mammals. Our results suggest uniting all these data into one concept. The slope of negative regression between GC content and nucleosome formation potential is steeper in exons than in non-coding DNA (where nucleosome formation potential is generally higher), which indicates a special role of non-coding DNA for orderly chromatin organization. The chromatin condensation and nuclear size are supposed to be key parameters that accommodate the effects of both genome size and GC content and connect them with metabolic intensity. Our data suggest that the reptilian–birds clade evolved special relationships among these parameters, whereas mammals preserved the amphibian-like relationships. Surprisingly, mammals, although acquiring a more complex general organization, seem to retain certain genome-related properties that are similar to amphibians. At the same time, the slope of regression between nucleosome formation potential and GC content is steeper in poikilothermal than in homeothermal genomes, which suggests that mammals and birds acquired certain common features of genomic organization. PMID:16519230
Engineering Escherichia coli for poly-(3-hydroxybutyrate) production guided by genome-scale metabolic network analysis.

PubMed

Zheng, Yangyang; Yuan, Qianqian; Yang, Xiaoyan; Ma, Hongwu

2017-11-01

Poly-(3-hydroxybutyrate) (P3HB) is a promising biodegradable plastic synthesized from acetyl-CoA. One important factor affecting the P3HB production cost is the P3HB yield. Through flux balance analysis of an extended genome-scale metabolic network of E. coli, we found that the introduction of the non-oxidative glycolysis (NOG) pathway, a previously reported pathway enabling complete carbon conservation, can increase the theoretical carbon yield from 67% to 89%, equivalent to an increase in the theoretical mass yield from 0.48 g P3HB/g glucose to 0.64 g P3HB/g glucose. Based on this analysis result, we introduced phosphoketolase and enhanced the NOG pathway in E. coli. The mass yield in the engineered strain was increased from 0.16 g P3HB/g glucose to 0.24 g P3HB/g glucose. We further overexpressed pntAB to enhance the NADPH availability and down-regulated the TCA cycle to divert more acetyl-CoA toward P3HB. The final construct accumulated 5.7 g/L P3HB and reached a carbon yield of 0.43 (a mass yield of 0.31 g P3HB/g glucose) in shake flask cultures. The introduction of the NOG pathway could also be useful for improving yields of many other biochemicals derived from acetyl-CoA. Copyright © 2017 Elsevier Inc. All rights reserved.
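The paired carbon and mass yields in this abstract can be cross-checked with a short conversion. The Python sketch below assumes the mass yield is computed on the P3HB repeat unit (C4H6O2) per glucose (C6H12O6) using standard atomic masses; the function name is illustrative, and the authors' exact calculation is not stated in the abstract.

```python
GLUCOSE_MW = 180.156    # g/mol, C6H12O6
MONOMER_MW = 86.090     # g/mol, 3-hydroxybutyrate repeat unit C4H6O2
GLUCOSE_CARBONS = 6
MONOMER_CARBONS = 4

def mass_yield_from_carbon_yield(carbon_yield):
    """Convert a carbon yield (mol C in P3HB per mol C in glucose)
    to a mass yield (g P3HB per g glucose)."""
    monomers_per_glucose = carbon_yield * GLUCOSE_CARBONS / MONOMER_CARBONS
    return monomers_per_glucose * MONOMER_MW / GLUCOSE_MW

# Carbon yields quoted in the abstract: theoretical without NOG, theoretical
# with NOG, and the measured value for the final construct.
for carbon_yield in (0.67, 0.89, 0.43):
    mass_yield = mass_yield_from_carbon_yield(carbon_yield)
    print(f"carbon yield {carbon_yield:.2f} -> {mass_yield:.2f} g P3HB/g glucose")
```

Under these assumptions the three carbon yields map to about 0.48, 0.64, and 0.31 g P3HB/g glucose, in agreement with the mass yields reported alongside them.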
Community genomic analyses constrain the distribution of metabolic traits across the Chloroflexi phylum and indicate roles in sediment carbon cycling

PubMed Central

2013-01-01

Background Sediments are massive reservoirs of carbon compounds and host a large fraction of microbial life. Microorganisms within terrestrial aquifer sediments control buried organic carbon turnover, degrade organic contaminants, and impact drinking water quality. Recent 16S rRNA gene profiling indicates that members of the bacterial phylum Chloroflexi are common in sediment. Only the role of the class Dehalococcoidia, which degrade halogenated solvents, is well understood. Genomic sampling is available for only six of the approximate 30 Chloroflexi classes, so little is known about the phylogenetic distribution of reductive dehalogenation or about the broader metabolic characteristics of Chloroflexi in sediment. Results We used metagenomics to directly evaluate the metabolic potential and diversity of Chloroflexi in aquifer sediments. We sampled genomic sequence from 86 Chloroflexi representing 15 distinct lineages, including members of eight classes previously characterized only by 16S rRNA sequences. Unlike in the Dehalococcoidia, genes for organohalide respiration are rare within the Chloroflexi genomes sampled here. Near-complete genomes were reconstructed for three Chloroflexi. One, a member of an unsequenced lineage in the Anaerolinea, is an aerobe with the potential for respiring diverse carbon compounds. The others represent two genomically unsampled classes sibling to the Dehalococcoidia, and are anaerobes likely involved in sugar and plant-derived-compound degradation to acetate. Both fix CO2 via the Wood-Ljungdahl pathway, a pathway not previously documented in Chloroflexi. The genomes each encode unique traits apparently acquired from Archaea, including mechanisms of motility and ATP synthesis. Conclusions Chloroflexi in the aquifer sediments are abundant and highly diverse. Genomic analyses provide new evolutionary boundaries for obligate organohalide respiration. We expand the potential roles of Chloroflexi in sediment carbon cycling beyond
The genome of black cottonwood, Populus trichocarpa (Torr. & Gray)

Treesearch

G.A. Tuskan; S. DiFazio; S. Jansson; J. Bohlmann; I. Grigoriev; U. Hellsten; N. Putnam; S. Ralph; S. Rombauts; A. Salamov; J. Schein; L. Sterck; A. Aerts; R.R. Bhalerao; R.P. Bhalerao; D. Blaudez; W. Boerjan; A. Brun; A. Brunner; V. Busov; M. Campbell; J. Carlson; M. Chalot; J. Chapman; G.-L. Chen; D. Cooper; P.M. Coutinho; J. Couturier; S. Covert; Q. Cronk; R. Cunningham; J. Davis; S. Degroeve; A. Dejardin; C. dePamphilis; J. Detter; B. Dirks; U. Dubchak; S. Duplessis; J. Ehlting; B. Ellis; K. Gendler; D. Goodstein; M. Gribskov; J. Grimwood; A. Groover; L. Gunter; B. Hamberger; B. Heinze; Y. Helariutta; B. Henrissat; D. Holligan; R. Holt; W. Huang; N. Islam-Faridi; S. Jones; M. Jones-Rhoades; R. Jorgensen; C. Joshi; J. Kangasjarvi; J. Karlsson; C. Kelleher; R. Kirkpatrick; M. Kirst; A. Kohler; U. Kalluri; F. Larimer; J. Leebens-Mack; J.-C. Leple; P. Locascio; Y. Lou; S. Lucas; F. Martin; B. Montanini; C. Napoli; D.R. Nelson; C. Nelson; K. Nieminen; O. Nilsson; V. Pereda; G. Peter; R. Philippe; G. Pilate; A. Poliakov; J. Razumovskaya; P. Richardson; C. Rinaldi; K. Ritland; P. Rouze; D. Ryaboy; J. Schumtz; J. Schrader; B. Segerman; H. Shin; A. Siddiqui; F. Sterky; A. Terry; C.-J. Tsai; E. Uberbacher; P. Unneberg; J. Vahala; K. Wall; S. Wessler; G. Yang; T. Yin; C. Douglas; M. Marra; G. Sandberg; Y. Van de Peer; D. Rokhsar

2006-01-01

We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs...
Proteomics and comparative genomics of Nitrososphaera viennensis reveal the core genome and adaptations of archaeal ammonia oxidizers

PubMed Central

Kerou, Melina; Offre, Pierre; Valledor, Luis; Abby, Sophie S.; Melcher, Michael; Nagler, Matthias; Weckwerth, Wolfram; Schleper, Christa

2016-01-01

Ammonia-oxidizing archaea (AOA) are among the most abundant microorganisms and key players in the global nitrogen and carbon cycles. They share a common energy metabolism but represent a heterogeneous group with respect to their environmental distribution and adaptions, growth requirements, and genome contents. We report here the genome and proteome of Nitrososphaera viennensis EN76, the type species of the archaeal class Nitrososphaeria of the phylum Thaumarchaeota encompassing all known AOA. N. viennensis is a soil organism with a 2.52-Mb genome and 3,123 predicted protein-coding genes. Proteomic analysis revealed that nearly 50% of the predicted genes were translated under standard laboratory growth conditions. Comparison with genomes of closely related species of the predominantly terrestrial Nitrososphaerales as well as the more streamlined marine Nitrosopumilales [Candidatus (Ca.) order] and the acidophile “Ca. Nitrosotalea devanaterra” revealed a core genome of AOA comprising 860 genes, which allowed for the reconstruction of central metabolic pathways common to all known AOA and expressed in the N. viennensis and “Ca. Nitrosopelagicus brevis” proteomes. Concomitantly, we were able to identify candidate proteins for as yet unidentified crucial steps in central metabolisms. In addition to unraveling aspects of core AOA metabolism, we identified specific metabolic innovations associated with the Nitrososphaerales mediating growth and survival in the soil milieu, including the capacity for biofilm formation, cell surface modifications and cell adhesion, and carbohydrate conversions as well as detoxification of aromatic compounds and drugs. PMID:27864514
Reconstruction and flux analysis of coupling between metabolic pathways of astrocytes and neurons: application to cerebral hypoxia

PubMed Central

Çakır, Tunahan; Alsan, Selma; Saybaşılı, Hale; Akın, Ata; Ülgen, Kutlu Ö

2007-01-01

Background It is a daunting task to identify all the metabolic pathways of brain energy metabolism and develop a dynamic simulation environment that will cover a time scale ranging from seconds to hours. To simplify this task and make it more practicable, we undertook stoichiometric modeling of brain energy metabolism with the major aim of including the main interacting pathways in and between astrocytes and neurons. Model The constructed model includes central metabolism (glycolysis, pentose phosphate pathway, TCA cycle), lipid metabolism, reactive oxygen species (ROS) detoxification, amino acid metabolism (synthesis and catabolism), the well-known glutamate-glutamine cycle, other coupling reactions between astrocytes and neurons, and neurotransmitter metabolism. This is, to our knowledge, the most comprehensive attempt at stoichiometric modeling of brain metabolism to date in terms of its coverage of a wide range of metabolic pathways. We then attempted to model the basal physiological behaviour and hypoxic behaviour of the brain cells where astrocytes and neurons are tightly coupled. Results The reconstructed stoichiometric reaction model included 217 reactions (184 internal, 33 exchange) and 216 metabolites (183 internal, 33 external) distributed in and between astrocytes and neurons. Flux balance analysis (FBA) techniques were applied to the reconstructed model to elucidate the underlying cellular principles of neuron-astrocyte coupling. Simulation of resting conditions under the constraints of maximization of glutamate/glutamine/GABA cycle fluxes between the two cell types with subsequent minimization of Euclidean norm of fluxes resulted in a flux distribution in accordance with literature-based findings. As a further validation of our model, the effect of oxygen deprivation (hypoxia) on fluxes was simulated using an FBA-derivative approach, known as minimization of metabolic adjustment (MOMA). The results show the power of the constructed model to simulate
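As a small illustration of the stoichiometric modeling described in the record above, the following Python sketch checks the steady-state mass balance S·v = 0 that flux balance analysis imposes, and computes the squared Euclidean norm of a flux vector (the quantity minimized in the secondary step mentioned in the abstract). The three-reaction network, the function names, and the flux values are hypothetical and are not taken from the paper; no optimization is performed here.

```python
# Toy stoichiometric model (hypothetical, not from the paper):
#   R_in:  -> A      R_conv: A -> B      R_out: B ->
# Rows are metabolites (A, B); columns are reactions (R_in, R_conv, R_out).
S = [
    [1.0, -1.0, 0.0],   # metabolite A
    [0.0, 1.0, -1.0],   # metabolite B
]


def mass_balance(stoich, fluxes):
    """Return S·v: the net production rate of each metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """True if every metabolite is balanced (S·v = 0 within tol)."""
    return all(abs(rate) <= tol for rate in mass_balance(stoich, fluxes))


def squared_norm(fluxes):
    """Squared Euclidean norm of a flux vector."""
    return sum(v * v for v in fluxes)


balanced = [2.0, 2.0, 2.0]      # uptake, conversion and secretion all equal
unbalanced = [2.0, 1.0, 1.0]    # A accumulates at rate 1

print(is_steady_state(S, balanced))     # balanced flux vector
print(mass_balance(S, unbalanced))      # nonzero entry marks the imbalance
print(squared_norm(balanced))
```

In a full flux balance analysis, a linear programming solver would search among all flux vectors that satisfy this balance and the flux bounds; the check above only verifies a candidate vector.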
The Sequenced Angiosperm Genomes and Genome Databases.

PubMed

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.
The Sequenced Angiosperm Genomes and Genome Databases

PubMed Central

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of humans, animals, and the planet Earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a timely updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology. PMID:29706973
Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger

PubMed Central

Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P

2010-01-01

Identification of missing genes or proteins that participate as enzymes in metabolic pathways is of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in another organism. We successfully identified the missing enzymes and then reconstructed the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searches through BLAST homology search with ortholog detection through the COG and KEGG databases. Conservation of protein domains and motifs was searched through the CDD, PFAM and PROSITE databases. Predictions regarding how proteins act in the pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and then reconstructed the metabolic pathway in the target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger; the results were later validated experimentally through chromatography and spectroscopy techniques. PMID:20978605
Genomic reconstruction to improve bioethanol and ergosterol production of industrial yeast Saccharomyces cerevisiae.

PubMed

Zhang, Ke; Tong, Mengmeng; Gao, Kehui; Di, Yanan; Wang, Pinmei; Zhang, Chunfang; Wu, Xuechang; Zheng, Daoqiong

2015-02-01

Baker's yeast (Saccharomyces cerevisiae) is the common yeast used in the fields of bread making, brewing, and bioethanol production. Growth rate, stress tolerance, ethanol titer, and byproduct yields are some of the most important agronomic traits of S. cerevisiae for industrial applications. Here, we developed a novel method of constructing S. cerevisiae strains for co-producing bioethanol and ergosterol. The genome of an industrial S. cerevisiae strain, ZTW1, was first reconstructed through treatment with an antimitotic drug followed by sporulation and hybridization. A total of 140 mutants were selected for ethanol fermentation testing, and a significant positive correlation between ergosterol content and ethanol production was observed. The highest performing mutant, ZG27, produced 7.9% more ethanol and 43.2% more ergosterol than ZTW1 at the end of fermentation. Chromosomal karyotyping and proteome analysis of ZG27 and ZTW1 suggested that this breeding strategy caused large-scale genome structural variations and global gene expression diversities in the mutants. Genetic manipulation further demonstrated that the altered expression activity of some genes (such as ERG1, ERG9, and ERG11) involved in ergosterol synthesis partly explained the trait improvement in ZG27.
Genome-scale model guided design of Propionibacterium for enhanced propionic acid production.

PubMed

Navone, Laura; McCubbin, Tim; Gonzalez-Garcia, Ricardo A; Nielsen, Lars K; Marcellin, Esteban

2018-06-01

Production of propionic acid by fermentation of propionibacteria has gained increasing attention in the past few years. However, biomanufacturing of propionic acid cannot compete with the current oxo-petrochemical synthesis process due to its well-established infrastructure, low oil prices and the high downstream purification costs of microbial production. Strain improvement to increase propionic acid yield is the best alternative to reduce downstream purification costs. The recent generation of genome-scale models for a number of Propionibacterium species facilitates the rational design of metabolic engineering strategies and provides a new opportunity to explore the metabolic potential of the Wood-Werkman cycle. Previous strategies for strain improvement have individually targeted acid tolerance, rate of propionate production or minimisation of by-products. Here we used the P. freudenreichii subsp. shermanii and the pan-Propionibacterium genome-scale metabolic models (GEMs) to simultaneously target these combined issues. This was achieved by focussing on strategies which yield higher energies and directly suppress acetate formation. Using P. freudenreichii subsp. shermanii, two strategies were assessed. The first tested the ability to manipulate the redox balance to favour propionate production by over-expressing the first two enzymes of the pentose-phosphate pathway (PPP), Zwf (glucose-6-phosphate 1-dehydrogenase) and Pgl (6-phosphogluconolactonase). Results showed a 4-fold increase in propionate to acetate ratio during the exponential growth phase. Secondly, the ability to enhance the energy yield from propionate production by over-expressing an ATP-dependent phosphoenolpyruvate carboxykinase (PEPCK) and sodium-pumping methylmalonyl-CoA decarboxylase (MMD) was tested, which extended the exponential growth phase. Together, these strategies demonstrate that in silico design strategies are predictive and can be used to reduce by-product formation in
Systems-level modeling of mycobacterial metabolism for the identification of new (multi-)drug targets.

PubMed

Rienksma, Rienk A; Suarez-Diez, Maria; Spina, Lucie; Schaap, Peter J; Martins dos Santos, Vitor A P

2014-12-01

Systems-level metabolic network reconstructions and the derived constraint-based (CB) mathematical models are efficient tools to explore bacterial metabolism. Approximately one-fourth of the Mycobacterium tuberculosis (Mtb) genome contains genes that encode proteins directly involved in its metabolism. These represent potential drug targets that can be systematically probed with CB models through the prediction of genes essential (or the combination thereof) for the pathogen to grow. However, gene essentiality depends on the growth conditions and, so far, no in vitro model precisely mimics the host at the different stages of mycobacterial infection, limiting model predictions. These limitations can be circumvented by combining expression data from in vivo samples with a validated CB model, creating an accurate description of pathogen metabolism in the host. To this end, we present here a thoroughly curated and extended genome-scale CB metabolic model of Mtb quantitatively validated using 13C measurements. We describe some of the efforts made in integrating CB models and high-throughput data to generate condition-specific models, and we discuss the challenges ahead. This knowledge and the framework presented herein will enable the identification of potential new drug targets and will foster the development of optimal therapeutic strategies. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
Genome-centric metatranscriptomes and ecological roles of the active microbial populations during cellulosic biomass anaerobic digestion.

PubMed

Jia, Yangyang; Ng, Siu-Kin; Lu, Hongyuan; Cai, Mingwei; Lee, Patrick K H

2018-01-01

Although anaerobic digestion for biogas production is used worldwide in treatment processes to recover energy from carbon-rich waste such as cellulosic biomass, the activities and interactions among the microbial populations that perform anaerobic digestion deserve further investigation, especially at the population genome level. To understand the cellulosic biomass-degrading potentials in two full-scale digesters, this study examined five methanogenic enrichment cultures derived from the digesters that anaerobically digested cellulose or xylan for more than 2 years under 35 or 55 °C conditions. Metagenomics and metatranscriptomics were used to capture the active microbial populations in each enrichment culture and reconstruct their meta-metabolic network and ecological roles. 107 population genomes were reconstructed from the five enrichment cultures using a differential coverage binning approach, of which only a subset was highly transcribed in the metatranscriptomes. Phylogenetic and functional convergence of communities by enrichment condition and phase of fermentation was observed for the highly transcribed populations in the metatranscriptomes. In the 35 °C cultures grown on cellulose, Clostridium cellulolyticum-related and Ruminococcus-related bacteria were identified as major hydrolyzers and primary fermenters in the early growth phase, while Clostridium leptum-related bacteria were major secondary fermenters and potential fatty acid scavengers in the late growth phase. While the meta-metabolism and trophic roles of the cultures were similar, the bacterial populations performing each function were distinct between the enrichment conditions. Overall, a population genome-centric view of the meta-metabolism and functional roles of key active players in anaerobic digestion of cellulosic biomass was obtained. This study represents a major step forward towards understanding the microbial functions and interactions at population genome level during the

Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

PubMed

Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

2016-09-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Electron transfer to nitrogenase in different genomic and metabolic backgrounds.

PubMed

Poudel, Saroj; Colman, Daniel R; Fixen, Kathryn R; Ledbetter, Rhesa N; Zheng, Yanning; Pence, Natasha; Seefeldt, Lance C; Peters, John W; Harwood, Caroline S; Boyd, Eric S

2018-02-26

Nitrogenase catalyzes the reduction of dinitrogen (N2) using low potential electrons from ferredoxin (Fd) or flavodoxin (Fld) through an ATP dependent process. Since its emergence in an anaerobic chemoautotroph, this oxygen (O2) sensitive enzyme complex has evolved to operate in a variety of genomic and metabolic backgrounds including those of aerobes, anaerobes, chemotrophs, and phototrophs. However, whether pathways of electron delivery to nitrogenase are influenced by these different metabolic backgrounds is not well understood. Here, we report the distribution of homologs of Fds, Flds, and Fd/Fld-reducing enzymes in 359 genomes of putative N2 fixers (diazotrophs). Six distinct lineages of nitrogenase were identified and their distributions largely corresponded to differences in the host cells' ability to integrate O2 or light into energy metabolism. Predicted pathways of electron transfer to nitrogenase in aerobes, facultative anaerobes, and phototrophs varied from those in anaerobes at the level of Fds/Flds used to reduce nitrogenase, the enzymes that generate reduced Fds/Flds, and the putative substrates of these enzymes. Proteins that putatively reduce Fd with hydrogen or pyruvate were enriched in anaerobes, while those that reduce Fd with NADH/NADPH were enriched in aerobes, facultative anaerobes, and anoxygenic phototrophs. The energy metabolism of aerobic, facultatively anaerobic, and anoxygenic phototrophic diazotrophs often yields reduced NADH/NADPH that is not sufficiently reduced to drive N2 reduction. At least two mechanisms have been acquired by these taxa to overcome this limitation and to generate electrons with potentials capable of reducing Fd. These include the bifurcation of electrons or the coupling of Fd reduction to reverse ion translocation. IMPORTANCE Nitrogen fixation supplies fixed nitrogen to cells from a variety of genomic and metabolic backgrounds including those of aerobes, facultative anaerobes, chemotrophs, and phototrophs
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

PubMed

Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David

2012-05-01

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, including more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
Genome resequencing in Populus: Revealing large-scale genome variation and implications on specialized-trait genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Muchero, Wellington; Labbe, Jessy L; Priya, Ranjan

2014-01-01

To date, Populus ranks among a few plant species with a complete genome sequence and other highly developed genomic resources. With the first genome sequence among all tree species, Populus has been adopted as a suitable model organism for genomic studies in trees. However, far from being just a model species, Populus is a key renewable economic resource that plays a significant role in providing raw materials for the biofuel and pulp and paper industries. Therefore, aside from leading frontiers of basic tree molecular biology and ecological research, Populus leads frontiers in addressing global economic challenges related to fuel and fiber production. The latter fact suggests that research aimed at improving quality and quantity of Populus as a raw material will likely drive the pursuit of more targeted and deeper research in order to unlock the economic potential tied in molecular biology processes that drive this tree species. Advances in genome sequence-driven technologies, such as resequencing individual genotypes, which in turn facilitates large scale SNP discovery and identification of large scale polymorphisms are key determinants of future success in these initiatives. In this treatise we discuss implications of genome sequence-enabled technologies on Populus genomic and genetic studies of complex and specialized traits.
Compartmentalized metabolic network reconstruction of microbial communities to determine the effect of agricultural intervention on soils

PubMed Central

Álvarez-Yela, Astrid Catalina; Gómez-Cano, Fabio; Zambrano, María Mercedes; Husserl, Johana; Danies, Giovanna; Restrepo, Silvia; González-Barrios, Andrés Fernando

2017-01-01

Soil microbial communities are responsible for a wide range of ecological processes and have an important economic impact in agriculture. Determining the metabolic processes performed by microbial communities is crucial for understanding and managing ecosystem properties. Metagenomic approaches allow the elucidation of the main metabolic processes that determine the performance of microbial communities under different environmental conditions and perturbations. Here we present the first compartmentalized metabolic reconstruction at a metagenomics scale of a microbial ecosystem. This systematic approach conceives a meta-organism without boundaries between individual organisms and allows the in silico evaluation of the effect of agricultural intervention on soils at a metagenomics level. To characterize the microbial ecosystems, topological properties, taxonomic and metabolic profiles, as well as a Flux Balance Analysis (FBA) were considered. Furthermore, topological and optimization algorithms were implemented to carry out the curation of the models, to ensure the continuity of the fluxes between the metabolic pathways, and to confirm the metabolite exchange between subcellular compartments. The proposed models provide specific information about ecosystems that are generally overlooked in non-compartmentalized or non-curated networks, like the influence of transport reactions in the metabolic processes, especially the important effect on mitochondrial processes, as well as provide more accurate results of the fluxes used to optimize the metabolic processes within the microbial community. PMID:28767679
On distributed wavefront reconstruction for large-scale adaptive optics systems.

PubMed

de Visser, Cornelis C; Brunner, Elisabeth; Verhaegen, Michel

2016-05-01

The distributed-spline-based aberration reconstruction (D-SABRE) method is proposed for distributed wavefront reconstruction with applications to large-scale adaptive optics systems. D-SABRE decomposes the wavefront sensor domain into any number of partitions and solves a local wavefront reconstruction problem on each partition using multivariate splines. D-SABRE accuracy is within 1% of a global approach with a speedup that scales quadratically with the number of partitions. The D-SABRE is compared to the distributed cumulative reconstruction (CuRe-D) method in open-loop and closed-loop simulations using the YAO adaptive optics simulation tool. D-SABRE accuracy exceeds CuRe-D for low levels of decomposition, and D-SABRE proved to be more robust to variations in the loop gain.
Industrial Acetogenic Biocatalysts: A Comparative Metabolic and Genomic Analysis

PubMed Central

Bengelsdorf, Frank R.; Poehlein, Anja; Linder, Sonja; Erz, Catarina; Hummel, Tim; Hoffmeister, Sabrina; Daniel, Rolf; Dürre, Peter

2016-01-01

Synthesis gas (syngas) fermentation by anaerobic acetogenic bacteria employing the Wood–Ljungdahl pathway is a bioprocess for production of biofuels and biocommodities. The major fermentation products of the most relevant biocatalytic strains (Clostridium ljungdahlii, C. autoethanogenum, C. ragsdalei, and C. coskatii) are acetic acid and ethanol. A comparative metabolic and genomic analysis using the mentioned biocatalysts might offer targets for metabolic engineering and thus improve the production of compounds apart from ethanol. Autotrophic growth and product formation of the four wild type (WT) strains were compared in uncontrolled batch experiments. The genomes of C. ragsdalei and C. coskatii were sequenced and the genome sequences of all four biocatalytic strains analyzed in a comparative manner. Growth and product spectra (acetate, ethanol, 2,3-butanediol) of C. autoethanogenum, C. ljungdahlii, and C. ragsdalei were rather similar. In contrast, C. coskatii produced significantly less ethanol and its genome sequence lacks two genes encoding aldehyde:ferredoxin oxidoreductases (AOR). Comparative genome sequence analysis of the four WT strains revealed high average nucleotide identity (ANI) of C. ljungdahlii and C. autoethanogenum (99.3%) and C. coskatii (98.3%). In contrast, C. ljungdahlii WT and C. ragsdalei WT showed an ANI-based similarity of only 95.8%. Additionally, recombinant C. ljungdahlii strains were constructed that harbor an artificial acetone synthesis operon (ASO) consisting of the following genes: adc, ctfA, ctfB, and thlA (encoding acetoacetate decarboxylase, acetoacetyl-CoA:acetate/butyrate:CoA-transferase subunits A and B, and thiolase) under the control of the thlA promoter (PthlA) from C. acetobutylicum or the native pta-ack promoter (Ppta-ack) from C. ljungdahlii. Respective recombinant strains produced 2-propanol rather than acetone, due to the presence of an NADPH-dependent primary-secondary alcohol dehydrogenase that converts acetone to 2
Genome-scale strain designs based on regulatory minimal cut sets.

PubMed

Mahadevan, Radhakrishnan; von Kamp, Axel; Klamt, Steffen

2015-09-01

Stoichiometric and constraint-based methods of computational strain design have become an important tool for rational metabolic engineering. One of those relies on the concept of constrained minimal cut sets (cMCSs). However, like most other techniques, cMCSs may consider only reaction (or gene) knockouts to achieve a desired phenotype. We generalize the cMCSs approach to constrained regulatory MCSs (cRegMCSs), where up/downregulation of reaction rates can be combined along with reaction deletions. We show that flux up/downregulations can virtually be treated as cuts allowing their direct integration into the algorithmic framework of cMCSs. Because of vastly enlarged search spaces in genome-scale networks, we developed strategies to (optionally) preselect suitable candidates for flux regulation and novel algorithmic techniques to further enhance efficiency and speed of cMCSs calculation. We illustrate the cRegMCSs approach by a simple example network and then apply it to identify strain designs for ethanol production in a genome-scale metabolic model of Escherichia coli. The results clearly show that cRegMCSs combining reaction deletions and flux regulations provide a much larger number of suitable strain designs, many of which are significantly smaller relative to cMCSs involving only knockouts. Furthermore, with cRegMCSs, one may also enable the fine tuning of desired behaviours in a narrower range. The new cRegMCSs approach may thus accelerate the implementation of model-based strain designs for the bio-based production of fuels and chemicals. MATLAB code and the examples can be downloaded at http://www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html. Contact: krishna.mahadevan@utoronto.ca or klamt@mpi-magdeburg.mpg.de. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
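To make the cut-set idea in the record above concrete, here is a brute-force Python sketch on a toy network. It assumes the commonly used formulation in which a cut set is a set of knocked-out reactions that hits every undesired (target) elementary mode, and a constrained cut set must additionally leave at least one desired mode untouched. The reaction names, the modes, and both function names are hypothetical; this exhaustive enumeration is only workable for tiny examples and is not the algorithm from the paper, nor does it cover the regulatory (up/downregulation) extension.

```python
from itertools import combinations


def minimal_cut_sets(target_modes, reactions):
    """Enumerate minimal reaction sets that hit every target mode."""
    found = []
    # Growing sizes guarantee that any superset of an earlier hit is non-minimal.
    for size in range(1, len(reactions) + 1):
        for combo in combinations(sorted(reactions), size):
            cut = frozenset(combo)
            if any(prev <= cut for prev in found):
                continue  # contains a smaller cut set already found
            if all(cut & mode for mode in target_modes):
                found.append(cut)
    return found


def constrained_mcs(target_modes, desired_modes, reactions):
    """Keep only minimal cut sets that leave at least one desired mode intact."""
    return [
        cut
        for cut in minimal_cut_sets(target_modes, reactions)
        if any(not (cut & mode) for mode in desired_modes)
    ]


# Hypothetical toy network: modes are given as the sets of reactions they use.
reactions = {"R1", "R2", "R3", "R4"}
target_modes = [frozenset({"R1", "R2"}), frozenset({"R1", "R3"})]
desired_modes = [frozenset({"R1", "R4"})]

for cut in constrained_mcs(target_modes, desired_modes, reactions):
    print(sorted(cut))
```

In this toy case, knocking out R1 alone would block both target modes but would also destroy the only desired mode, so the only constrained cut set that remains is the pair R2 and R3.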
Reconstructions of solar irradiance on centennial time scales

NASA Astrophysics Data System (ADS)

Krivova, Natalie; Solanki, Sami K.; Dasi Espuig, Maria; Kok Leng, Yeo

Solar irradiance is the main external source of energy to Earth's climate system. The record of direct measurements covering less than 40 years is too short to study solar influence on Earth's climate, which calls for reconstructions of solar irradiance into the past with the help of appropriate models. An obvious requirement for a competitive model is its ability to reproduce observed irradiance changes, and a successful example of such a model is presented by the SATIRE family of models. Like most state-of-the-art models, SATIRE assumes that irradiance changes on time scales longer than approximately a day are caused by the evolving distribution of dark and bright magnetic features on the solar surface. The surface coverage by such features as a function of time is derived from solar observations. The choice of these depends on the time scale in question. Most accurate is the version of the model that employs full-disc spatially-resolved solar magnetograms and reproduces over 90% of the measured irradiance variation, including the overall decreasing trend in the total solar irradiance over the last four cycles. Since such magnetograms are only available for about four decades, reconstructions on time scales of centuries have to rely on disc-integrated proxies of solar magnetic activity, such as sunspot areas and numbers. Employing a surface flux transport model and sunspot observations as input, we have been able to produce synthetic magnetograms since 1700. This improves the temporal resolution of the irradiance reconstructions on centennial time scales. The most critical aspect of such reconstructions remains the uncertainty in the magnitude of the secular change.
Metabolism of halophilic archaea

PubMed Central

Falb, Michaela; Müller, Kerstin; Königsmaier, Lisa; Oberwinkler, Tanja; Horn, Patrick; von Gronau, Susanne; Gonzalez, Orland; Pfeiffer, Friedhelm; Bornberg-Bauer, Erich

2008-01-01

In spite of their common hypersaline environment, halophilic archaea are surprisingly different in their nutritional demands and metabolic pathways. The metabolic diversity of halophilic archaea was investigated at the genomic level through systematic metabolic reconstruction and comparative analysis of four completely sequenced species: Halobacterium salinarum, Haloarcula marismortui, Haloquadratum walsbyi, and the haloalkaliphile Natronomonas pharaonis. The comparative study reveals different sets of enzyme genes amongst halophilic archaea, e.g. in glycerol degradation, pentose metabolism, and folate synthesis. The carefully assessed metabolic data represent a reliable resource for future system biology approaches as it also links to current experimental data on (halo)archaea from the literature. Electronic supplementary material The online version of this article (doi:10.1007/s00792-008-0138-x) contains supplementary material, which is available to authorized users. PMID:18278431
Phylogenomic Reconstruction of the Oomycete Phylogeny Derived from 37 Genomes

PubMed Central

McCarthy, Charley G. P.

2017-01-01

ABSTRACT The oomycetes are a class of microscopic, filamentous eukaryotes within the Stramenopiles-Alveolata-Rhizaria (SAR) supergroup which includes ecologically significant animal and plant pathogens, most infamously the causative agent of potato blight Phytophthora infestans. Single-gene and concatenated phylogenetic studies both of individual oomycete genera and of members of the larger class have resulted in conflicting conclusions concerning species phylogenies within the oomycetes, particularly for the large Phytophthora genus. Genome-scale phylogenetic studies have successfully resolved many eukaryotic relationships by using supertree methods, which combine large numbers of potentially disparate trees to determine evolutionary relationships that cannot be inferred from individual phylogenies alone. With a sufficient amount of genomic data now available, we have undertaken the first whole-genome phylogenetic analysis of the oomycetes using data from 37 oomycete species and 6 SAR species. In our analysis, we used established supertree methods to generate phylogenies from 8,355 homologous oomycete and SAR gene families and have complemented those analyses with both phylogenomic network and concatenated supermatrix analyses. Our results show that a genome-scale approach to oomycete phylogeny resolves oomycete classes and individual clades within the problematic Phytophthora genus. Support for the resolution of the inferred relationships between individual Phytophthora clades varies depending on the methodology used. Our analysis represents an important first step in large-scale phylogenomic analysis of the oomycetes. IMPORTANCE The oomycetes are a class of eukaryotes and include ecologically significant animal and plant pathogens. Single-gene and multigene phylogenetic studies of individual oomycete genera and of members of the larger classes have resulted in conflicting conclusions concerning interspecies relationships among these species, particularly for the
Polycyclic aromatic hydrocarbon metabolic network in Mycobacterium vanbaalenii PYR-1.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Holland, Ricky D; Chen, Hongyan; Kim, Dae-Wi; Gao, Yuan; Yu, Li-Rong; Baek, Songjoon; Baek, Dong-Heon; Ahn, Hongsik; Cerniglia, Carl E

2011-09-01

This study investigated a metabolic network (MN) from Mycobacterium vanbaalenii PYR-1 for polycyclic aromatic hydrocarbons (PAHs) from the perspective of structure, behavior, and evolution, in which multilayer omics data are integrated. Initially, we utilized a high-throughput proteomic analysis to assess the protein expression response of M. vanbaalenii PYR-1 to seven different aromatic compounds. A total of 3,431 proteins (57.38% of the genome-predicted proteins) were identified, which included 160 proteins that seemed to be involved in the degradation of aromatic hydrocarbons. Based on the proteomic data and the previous metabolic, biochemical, physiological, and genomic information, we reconstructed an experiment-based system-level PAH-MN. The structure of PAH-MN, with 183 metabolic compounds and 224 chemical reactions, has a typical scale-free nature. The behavior and evolution of the PAH-MN reveals a hierarchical modularity with funnel effects in structure/function and intimate association with evolutionary modules of the functional modules, which are the ring cleavage process (RCP), side chain process (SCP), and central aromatic process (CAP). The 189 commonly upregulated proteins in all aromatic hydrocarbon treatments provide insights into the global adaptation to facilitate the PAH metabolism. Taken together, the findings of our study provide the hierarchical viewpoint from genes/proteins/metabolites to the network via functional modules of the PAH-MN equipped with the engineering-driven approaches of modularization and rationalization, which may expand our understanding of the metabolic potential of M. vanbaalenii PYR-1 for bioremediation applications.
Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments.

PubMed

Zhuang, Kai; Izallalen, Mounir; Mouser, Paula; Richter, Hanno; Risso, Carla; Mahadevan, Radhakrishnan; Lovley, Derek R

2011-02-01

The advent of rapid complete genome sequencing, and the potential to capture this information in genome-scale metabolic models, provide the possibility of comprehensively modeling microbial community interactions. For example, Rhodoferax and Geobacter species are acetate-oxidizing Fe(III)-reducers that compete in anoxic subsurface environments and this competition may have an influence on the in situ bioremediation of uranium-contaminated groundwater. Therefore, genome-scale models of Geobacter sulfurreducens and Rhodoferax ferrireducens were used to evaluate how Geobacter and Rhodoferax species might compete under diverse conditions found in a uranium-contaminated aquifer in Rifle, CO. The model predicted that at the low rates of acetate flux expected under natural conditions at the site, Rhodoferax will outcompete Geobacter as long as sufficient ammonium is available. The model also predicted that when high concentrations of acetate are added during in situ bioremediation, Geobacter species would predominate, consistent with field-scale observations. This can be attributed to the higher expected growth yields of Rhodoferax and the ability of Geobacter to fix nitrogen. The modeling predicted relative proportions of Geobacter and Rhodoferax in geochemically distinct zones of the Rifle site that were comparable to those that were previously documented with molecular techniques. The model also predicted that under nitrogen fixation, higher carbon and electron fluxes would be diverted toward respiration rather than biomass formation in Geobacter, providing a potential explanation for enhanced in situ U(VI) reduction in low-ammonium zones. These results show that genome-scale modeling can be a useful tool for predicting microbial interactions in subsurface environments and show promise for designing bioremediation strategies.
Historical contingency and the gradual evolution of metabolic properties in central carbon and genome-scale metabolisms

PubMed Central

2014-01-01

Background A metabolism can evolve through changes in its biochemical reactions that are caused by processes such as horizontal gene transfer and gene deletion. While such changes need to preserve an organism’s viability in its environment, they can modify other important properties, such as a metabolism’s maximal biomass synthesis rate and its robustness to genetic and environmental change. Whether such properties can be modulated in evolution depends on whether all or most viable metabolisms – those that can synthesize all essential biomass precursors – are connected in a space of all possible metabolisms. Connectedness means that any two viable metabolisms can be converted into one another through a sequence of single reaction changes that leave viability intact. If the set of viable metabolisms is disconnected and highly fragmented, then historical contingency becomes important and restricts the alteration of metabolic properties, as well as the number of novel metabolic phenotypes accessible in evolution. Results We here computationally explore two vast spaces of possible metabolisms to ask whether viable metabolisms are connected. We find that for all but the simplest metabolisms, most viable metabolisms can be transformed into one another by single viability-preserving reaction changes. Where this is not the case, alternative essential metabolic pathways consisting of multiple reactions are responsible, but such pathways are not common. Conclusions Metabolism is thus highly evolvable, in the sense that its properties could be fine-tuned by successively altering individual reactions. Historical contingency does not strongly restrict the origin of novel metabolic phenotypes. PMID:24758311
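The connectedness question can be illustrated on a deliberately tiny, made-up example. The sketch below is not the authors' pipeline: it assumes a universe of four hypothetical reactions, a viability rule in which either of two alternative two-reaction pathways suffices, and single-reaction swaps (one reaction out, one in, size held fixed) as the elementary change. All names are illustrative.

```python
from itertools import combinations

REACTIONS = ("A", "B", "C", "D")                  # hypothetical reaction universe
PATHWAYS = (frozenset("AB"), frozenset("CD"))     # two alternative essential pathways


def is_viable(metabolism):
    """Viable if at least one complete pathway is present (toy rule)."""
    return any(pathway <= metabolism for pathway in PATHWAYS)


def neighbours(metabolism):
    """All metabolisms reachable by swapping one reaction for another."""
    for removed in metabolism:
        for added in REACTIONS:
            if added not in metabolism:
                yield (metabolism - {removed}) | {added}


def viable_components(size):
    """Connected components of the viable metabolisms with `size` reactions."""
    unseen = {frozenset(c) for c in combinations(REACTIONS, size)
              if is_viable(frozenset(c))}
    components = []
    while unseen:
        start = unseen.pop()
        component, stack = {start}, [start]
        while stack:
            for nb in neighbours(stack.pop()):
                if nb in unseen:          # viable and not yet visited
                    unseen.remove(nb)
                    component.add(nb)
                    stack.append(nb)
        components.append(component)
    return components


for size in (2, 3):
    print(size, len(viable_components(size)))
```

With only two reactions, {A, B} and {C, D} are both viable but cannot be converted into one another without passing through an inviable intermediate, so the viable set splits into two components. With three reactions there is room to build the alternative pathway before dismantling the old one, and the viable set becomes a single component, which mirrors the abstract's point that disconnection arises from alternative multi-reaction essential pathways in the simplest metabolisms.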
Family-specific scaling laws in bacterial genomes.

PubMed

De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco

2017-07-27

Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The size of these functional categories change, on average, as power-laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
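A family-specific scaling law of the form n_family ≈ a · N^β (N being the total number of protein-coding genes) can be estimated by linear regression in log-log space. The sketch below illustrates that fit with ordinary least squares; it is an assumption about how such an exponent might be estimated, not the authors' procedure, and the genome sizes and family counts are synthetic numbers invented for the example.

```python
import math


def fit_power_law(genome_sizes, family_counts):
    """Least-squares fit of log(count) = log(a) + beta * log(size).

    Returns (a, beta). Counts must be strictly positive.
    """
    xs = [math.log(size) for size in genome_sizes]
    ys = [math.log(count) for count in family_counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    prefactor = math.exp(mean_y - beta * mean_x)
    return prefactor, beta


# Synthetic example: a family whose size grows quadratically with gene count.
sizes = [1000, 2000, 4000, 8000]
counts = [1e-6 * n ** 2 for n in sizes]   # 1, 4, 16, 64 members
a, beta = fit_power_law(sizes, counts)
print(a, beta)
```

Comparing the fitted exponents β across families within one functional category is one simple way to see whether that category scales homogeneously.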
Metabolic cancer biology: structural-based analysis of cancer as a metabolic disease, new sights and opportunities for disease treatment.

PubMed

Masoudi-Nejad, Ali; Asgari, Yazdan

2015-02-01

The discovery of altered cancer cell metabolism, the Warburg effect, goes back to 1924, when Otto Warburg first observed that cancer cells metabolize differently from normal cells. With the advent of high-throughput technologies and computational systems biology, cancer cell metabolism has undergone a renaissance, and many attempts have been made to revisit the Warburg effect. The development of experimental and analytical tools that generate high-throughput biological data could lead to the application of computational models in biological discovery and clinical medicine, especially for cancer. Due to the recent availability of tissue-specific reconstructed models, new opportunities for studying metabolic alterations in various kinds of cancer are opening up. Structural approaches at the genome scale seem to be suitable for developing diagnostic and prognostic molecular signatures, as well as for identifying new drug targets. In this review, we consider these recent advances in the structure-based analysis of cancer as a metabolic disease. Two different structural approaches are described here: topological and constraint-based methods. The ultimate goal of this type of systems analysis is not only the discovery of novel drug targets but also the development of new systems-based therapy strategies.
Multi-scale modeling of Arabidopsis thaliana response to different CO2 conditions: From gene expression to metabolic flux.

PubMed

Liu, Lin; Shen, Fangzhou; Xin, Changpeng; Wang, Zhuo

2016-01-01

Multi-scale investigation from gene transcript level to metabolic activity is important for uncovering plant responses to environmental perturbation. Here we integrated a genome-scale constraint-based metabolic model with transcriptome data to explore the response of Arabidopsis thaliana to both elevated and low CO2 conditions. The four condition-specific models from low to high CO2 concentrations show differences in active reaction sets, enriched pathways for increased/decreased fluxes, and putative post-transcriptional regulation, which indicates that condition-specific models are necessary to reflect physiological metabolic states. The simulated CO2 fixation flux at different CO2 concentrations is consistent with the measured assimilation versus intercellular CO2 curve. Interestingly, we found that reactions in primary metabolism are affected most significantly by CO2 perturbation, whereas secondary metabolic reactions are affected far less. The changes predicted in key pathways are consistent with existing knowledge. Another interesting point is that Arabidopsis must make stronger metabolic adjustments to adapt to the more severe low CO2 stress than to elevated CO2. The challenges of identifying post-transcriptional regulation could also be addressed by the integrative model. In conclusion, this innovative application of multi-scale modeling in plants demonstrates potential to uncover the mechanisms of metabolic response to different conditions.
Genome-Wide Identification of Regulatory Elements and Reconstruction of Gene Regulatory Networks of the Green Alga Chlamydomonas reinhardtii under Carbon Deprivation

PubMed Central

Vischi Winck, Flavia; Arvidsson, Samuel; Riaño-Pachón, Diego Mauricio; Hempel, Sabrina; Koseska, Aneta; Nikoloski, Zoran; Urbina Gomez, David Alejandro; Rupprecht, Jens; Mueller-Roeber, Bernd

2013-01-01

The unicellular green alga Chlamydomonas reinhardtii is a long-established model organism for studies on photosynthesis and carbon metabolism-related physiology. Under conditions of air-level carbon dioxide concentration [CO2], a carbon concentrating mechanism (CCM) is induced to facilitate cellular carbon uptake. CCM increases the availability of carbon dioxide at the site of cellular carbon fixation. To improve our understanding of the transcriptional control of the CCM, we employed FAIRE-seq (formaldehyde-assisted isolation of regulatory elements, followed by deep sequencing) to determine nucleosome-depleted chromatin regions of algal cells subjected to carbon deprivation. Our FAIRE data recapitulated the positions of known regulatory elements in the promoter of the periplasmic carbonic anhydrase (Cah1) gene, which is upregulated during CCM induction, and revealed new candidate regulatory elements at a genome-wide scale. In addition, time series expression patterns of 130 transcription factor (TF) and transcription regulator (TR) genes were obtained for cells cultured under photoautotrophic condition and subjected to a shift from high to low [CO2]. Groups of co-expressed genes were identified and a putative directed gene-regulatory network underlying the CCM was reconstructed from the gene expression data using the recently developed IOTA (inner composition alignment) method. Among the candidate regulatory genes, two members of the MYB-related TF family, Lcr1 (Low-CO2 response regulator 1) and Lcr2 (Low-CO2 response regulator 2), may play an important role in down-regulating the expression of a particular set of TF and TR genes in response to low [CO2]. The results obtained provide new insights into the transcriptional control of the CCM and revealed more than 60 new candidate regulatory genes. Deep sequencing of nucleosome-depleted genomic regions indicated the presence of new, previously unknown regulatory elements in the C. reinhardtii genome. Our work can
Genome-wide analysis of starch metabolism genes in potato (Solanum tuberosum L.).

PubMed

Van Harsselaar, Jessica K; Lorenz, Julia; Senning, Melanie; Sonnewald, Uwe; Sonnewald, Sophia

2017-01-05

Starch is the principal constituent of potato tubers and is of considerable importance for food and non-food applications. Its metabolism has been the subject of extensive research over the past decades. Despite its importance, a description of the complete inventory of genes involved in starch metabolism and their genome organization in potato plants is still missing. Moreover, mechanisms regulating the expression of starch genes in leaves and tubers remain elusive with regard to differences between transitory and storage starch metabolism, respectively. This study aimed to identify and map the complete set of potato starch genes, and to study their expression pattern in leaves and tubers using different sets of transcriptome data. Moreover, we wanted to uncover transcription factors co-regulated with starch accumulation in tubers in order to gain insight into the regulation of starch metabolism. We identified 77 genomic loci encoding enzymes involved in starch metabolism. Novel isoforms of many enzymes were found. Their analysis will help to elucidate mechanisms of starch biosynthesis and degradation. Expression analysis of starch genes led to the identification of tissue-specific isoenzymes, suggesting differences in the transcriptional regulation of starch metabolism between potato leaf and tuber tissues. Selection of genes predominantly expressed in developing potato tubers and exhibiting an expression pattern indicative of a role in starch biosynthesis enabled the identification of possible transcriptional regulators of tuber starch biosynthesis by co-expression analysis. This study provides the annotation of the complete set of starch metabolic genes in potato plants and their genomic localizations. Novel, so far undescribed, enzyme isoforms were revealed. Comparative transcriptome analysis enabled the identification of tuber- and leaf-specific isoforms of starch genes. This finding suggests distinct regulatory mechanisms in transitory and storage starch
Multi-thread parallel algorithm for reconstructing 3D large-scale porous structures

NASA Astrophysics Data System (ADS)

Ju, Yang; Huang, Yaohui; Zheng, Jiangtao; Qian, Xu; Xie, Heping; Zhao, Xi

2017-04-01

Geomaterials inherently contain many discontinuous, multi-scale, geometrically irregular pores, forming a complex porous structure that governs their mechanical and transport properties. The development of an efficient reconstruction method for representing porous structures can significantly contribute toward providing a better understanding of the governing effects of porous structures on the properties of porous materials. In order to improve the efficiency of reconstructing large-scale porous structures, a multi-thread parallel scheme was incorporated into the simulated annealing reconstruction method. In the method, four correlation functions, which include the two-point probability function, the linear-path functions for the pore phase and the solid phase, and the fractal system function for the solid phase, were employed for better reproduction of the complex well-connected porous structures. In addition, a random sphere packing method and a self-developed pre-conditioning method were incorporated to cast the initial reconstructed model and select independent interchanging pairs for parallel multi-thread calculation, respectively. The accuracy of the proposed algorithm was evaluated by examining the similarity between the reconstructed structure and a prototype in terms of their geometrical, topological, and mechanical properties. Comparisons of the reconstruction efficiency of porous models with various scales indicated that the parallel multi-thread scheme significantly shortened the execution time for reconstruction of a large-scale well-connected porous model compared to a sequential single-thread procedure.
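Of the correlation functions named above, the two-point probability function is the simplest to write down: S2(r) is the probability that two points a distance r apart both fall in the pore phase. The sketch below is a simplified illustration, not the authors' parallel code. It assumes a 2D binary image (1 = pore, 0 = solid), measures separations along rows only, uses no periodic boundaries, and defines the annealing "energy" as the summed squared mismatch in S2 between a candidate and a reference image. The function names and the tiny image are hypothetical.

```python
def two_point_probability(image, r):
    """Fraction of horizontal pixel pairs at separation r that are both pore."""
    hits = total = 0
    for row in image:
        for x in range(len(row) - r):
            total += 1
            hits += row[x] and row[x + r]   # 1 only if both pixels are pore
    return hits / total


def energy(candidate, reference, separations):
    """Squared S2 mismatch, the quantity a simulated annealing loop would minimise."""
    return sum(
        (two_point_probability(candidate, r) - two_point_probability(reference, r)) ** 2
        for r in separations
    )


# Tiny made-up binary image: 1 = pore, 0 = solid.
image = [
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
]
porosity = two_point_probability(image, 0)   # S2(0) equals the pore fraction
s2_at_1 = two_point_probability(image, 1)
print(porosity, s2_at_1)
```

In an annealing loop, a proposed swap of one pore pixel with one solid pixel would be accepted or rejected according to how it changes this energy; swaps that touch disjoint regions are the natural candidates for evaluation in parallel threads.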

Farnesoid X Receptor Signaling Shapes the Gut Microbiota and Controls Hepatic Lipid Metabolism

PubMed Central

Zhang, Limin; Xie, Cen; Nichols, Robert G.; Chan, Siu H. J.; Jiang, Changtao; Hao, Ruixin; Smith, Philip B.; Cai, Jingwei; Simons, Margaret N.; Hatzakis, Emmanuel; Maranas, Costas D.; Gonzalez, Frank J.

2016-01-01

ABSTRACT The gut microbiota modulates obesity and associated metabolic phenotypes in part through intestinal farnesoid X receptor (FXR) signaling. Glycine-β-muricholic acid (Gly-MCA), an intestinal FXR antagonist, has been reported to prevent or reverse high-fat diet (HFD)-induced and genetic obesity, insulin resistance, and fatty liver; however, the mechanism by which these phenotypes are improved is not fully understood. The current study investigated the influence of FXR activity on the gut microbiota community structure and function and its impact on hepatic lipid metabolism. Predictions about the metabolic contribution of the gut microbiota to the host were made using 16S rRNA-based PICRUSt (phylogenetic investigation of communities by reconstruction of unobserved states), then validated using 1H nuclear magnetic resonance-based metabolomics, and results were summarized by using genome-scale metabolic models. Oral Gly-MCA administration altered the gut microbial community structure, notably reducing the ratio of Firmicutes to Bacteroidetes and its PICRUSt-predicted metabolic function, including reduced production of short-chain fatty acids (substrates for hepatic gluconeogenesis and de novo lipogenesis) in the ceca of HFD-fed mice. Metabolic improvement was intestinal FXR dependent, as revealed by the lack of changes in HFD-fed intestine-specific Fxr-null (FxrΔIE) mice treated with Gly-MCA. Integrative analyses based on genome-scale metabolic models demonstrated an important link between Lactobacillus and Clostridia bile salt hydrolase activity and bacterial fermentation. Hepatic metabolite levels after Gly-MCA treatment correlated with altered levels of gut bacterial species. In conclusion, modulation of the gut microbiota by inhibition of intestinal FXR signaling alters host liver lipid metabolism and improves obesity-related metabolic dysfunction. IMPORTANCE The farnesoid X receptor (FXR) plays an important role in mediating the dialog between the host
Metabolic network visualization eliminating node redundance and preserving metabolic pathways

PubMed Central

Bourqui, Romain; Cottret, Ludovic; Lacroix, Vincent; Auber, David; Mary, Patrick; Sagot, Marie-France; Jourdan, Fabien

2007-01-01

Background The tools that are available to draw and to manipulate the representations of metabolism are usually restricted to metabolic pathways. This limitation becomes problematic when studying processes that span several pathways. The various attempts that have been made to draw genome-scale metabolic networks suffer from two shortcomings: (1) they do not use contextual information, which leads to dense, hard-to-interpret drawings, and (2) they impose very constrained standards, which implies, in particular, duplicating nodes, making topological analysis considerably more difficult. Results We propose a method, called MetaViz, which makes it possible to draw a genome-scale metabolic network while taking into account its structuring into pathways. This method consists of two steps: a clustering step, which addresses the pathway overlapping problem, and a drawing step, which draws the clustered graph and each cluster. Conclusion The method we propose is original and addresses new drawing issues arising from the no-duplication constraint. We do not propose a single drawing but rather several alternative ways of presenting metabolism depending on the pathway on which one wishes to focus. We believe that this provides a valuable tool to explore the pathway structure of metabolism. PMID:17608928
DNA Precursor Metabolism and Mitochondrial Genome Stability

DTIC Science & Technology

2003-04-01

mitochondrial DNA replication, to learn how the pool sizes are regulated, and to understand how perturbations of normal dNTP metabolism within the...mitochondria raises the possibility, however unlikely, that it is serving a function in addition to its role in DNA replication. The literature on non-DNA... Subject terms: Mitochondria, Genome stability, DNA precursors, Mitochondrial DNA
MetReS, an Efficient Database for Genomic Applications.

PubMed

Vilaplana, Jordi; Alves, Rui; Solsona, Francesc; Mateo, Jordi; Teixidó, Ivan; Pifarré, Marc

2018-02-01

MetReS (Metabolic Reconstruction Server) is a genomic database that is shared between two software applications that address important biological problems. Biblio-MetReS is a data-mining tool that enables the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the processes of interest and their function. The main goal of this work was to identify the areas where the performance of the MetReS database could be improved and to test whether this improvement would scale to larger datasets and more complex types of analysis. The study started with a relational database, MySQL, which is the current database server used by the applications. We also tested the performance of an alternative data-handling framework, Apache Hadoop, which is currently used for large-scale data processing. We found that this data-handling framework is likely to greatly improve the efficiency of the MetReS applications as the dataset and the processing needs increase by several orders of magnitude, as is expected to happen in the near future.
Whole-Genome Duplication and the Functional Diversification of Teleost Fish Hemoglobins

PubMed Central

Opazo, Juan C.; Butts, G. Tyler; Nery, Mariana F.; Storz, Jay F.; Hoffmann, Federico G.

2013-01-01

Subsequent to the two rounds of whole-genome duplication that occurred in the common ancestor of vertebrates, a third genome duplication occurred in the stem lineage of teleost fishes. This teleost-specific genome duplication (TGD) is thought to have provided genetic raw materials for the physiological, morphological, and behavioral diversification of this highly speciose group. The extreme physiological versatility of teleost fish is manifest in their diversity of blood–gas transport traits, which reflects the myriad solutions that have evolved to maintain tissue O2 delivery in the face of changing metabolic demands and environmental O2 availability during different ontogenetic stages. During the course of development, regulatory changes in blood–O2 transport are mediated by the expression of multiple, functionally distinct hemoglobin (Hb) isoforms that meet the particular O2-transport challenges encountered by the developing embryo or fetus (in viviparous or oviparous species) and in free-swimming larvae and adults. The main objective of the present study was to assess the relative contributions of whole-genome duplication, large-scale segmental duplication, and small-scale gene duplication in producing the extraordinary functional diversity of teleost Hbs. To accomplish this, we integrated phylogenetic reconstructions with analyses of conserved synteny to characterize the genomic organization and evolutionary history of the globin gene clusters of teleosts. These results were then integrated with available experimental data on functional properties and developmental patterns of stage-specific gene expression. Our results indicate that multiple α- and β-globin genes were present in the common ancestor of gars (order Lepisosteiformes) and teleosts. The comparative genomic analysis revealed that teleosts possess a dual set of TGD-derived globin gene clusters, each of which has undergone lineage-specific changes in gene content via repeated duplication and
Large-Scale Phylogenetic Classification of Fungal Chitin Synthases and Identification of a Putative Cell-Wall Metabolism Gene Cluster in Aspergillus Genomes

PubMed Central

Pacheco-Arjona, Jose Ramon; Ramirez-Prado, Jorge Humberto

2014-01-01

The cell wall is a protective and versatile structure distributed in all fungi. The component responsible for its rigidity is chitin, a product of chitin synthase (Chsp) enzymes. There are seven classes of chitin synthase genes (CHS), and the number and type encoded in fungal genomes vary considerably from one species to another. Previous Chsp sequence analyses focused on their study as individual units, regardless of genomic context. The identification of blocks of conserved genes between genomes can provide important clues about the interactions and localization of chitin synthases. In the present study, we carried out an in silico search of all putative Chsp encoded in 54 full fungal genomes, encompassing 21 orders from five phyla. Phylogenetic studies of these Chsp were able to confidently classify 347 out of the 369 Chsp identified (94%). Patterns in the distribution of Chsp related to taxonomy were identified, the most prominent being related to the type of fungal growth. More importantly, a synteny analysis for genomic blocks centered on class IV Chsp (the most abundant and widely distributed Chsp class) identified a putative cell wall metabolism gene cluster in members of the genus Aspergillus, the first such association reported for any fungal genome. PMID:25148134
Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA

PubMed Central

2017-01-01

Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository. PMID:28263984
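The ensemble idea in this abstract (many alternative network structures, one pooled prediction) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the reaction names, the three toy networks, the majority-vote threshold, and above all the growth test, which is a simple reachability check standing in for flux balance analysis (real FBA requires a linear-programming solver). It is not the authors' implementation.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Reaction = Tuple[FrozenSet[str], FrozenSet[str]]  # (substrates, products)
Network = Dict[str, Reaction]                     # reaction id -> reaction


def can_grow(network: Network, media: Set[str], biomass: str) -> bool:
    """Toy stand-in for FBA: expand reachable metabolites to a fixed point."""
    available = set(media)
    changed = True
    while changed:
        changed = False
        for substrates, products in network.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return biomass in available


def ensemble_growth(ensemble: List[Network], media: Set[str],
                    biomass: str, threshold: float = 0.5) -> bool:
    """Predict growth if more than `threshold` of the members can grow."""
    votes = [can_grow(net, media, biomass) for net in ensemble]
    return sum(votes) / len(votes) > threshold


def ensemble_essential(ensemble: List[Network], reaction_id: str,
                       media: Set[str], biomass: str,
                       threshold: float = 0.5) -> bool:
    """Call a reaction essential if its removal stops growth in most members."""
    votes = []
    for net in ensemble:
        knocked_out = {rid: rxn for rid, rxn in net.items() if rid != reaction_id}
        votes.append(not can_grow(knocked_out, media, biomass))
    return sum(votes) / len(votes) > threshold


# Hypothetical three-member ensemble; member 2 contains a bypass reaction R3.
R1 = (frozenset({"glc"}), frozenset({"pyr"}))
R2 = (frozenset({"pyr"}), frozenset({"biomass"}))
R3 = (frozenset({"glc"}), frozenset({"biomass"}))
ENSEMBLE = [{"R1": R1, "R2": R2},
            {"R1": R1, "R2": R2, "R3": R3},
            {"R1": R1, "R2": R2}]
MEDIA = {"glc"}
```

In this toy ensemble, R1 is called essential because two of the three members cannot grow without it, even though the member containing the bypass R3 can; pooling over members is what lets a prediction survive uncertainty about whether R3 is really present.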
Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

PubMed

Biggs, Matthew B; Papin, Jason A

2017-03-01

Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.
The evolution of metabolic networks of E. coli

PubMed Central

2011-01-01

Background Despite the availability of numerous complete genome sequences from E. coli strains, published genome-scale metabolic models exist only for two commensal E. coli strains. These models have proven useful for many applications, such as engineering strains for desired product formation, and we sought to explore how constructing and evaluating additional metabolic models for E. coli strains could enhance these efforts. Results We used the genomic information from 16 E. coli strains to generate an E. coli pangenome metabolic network by evaluating their collective 76,990 ORFs. Each of these ORFs was assigned to one of 17,647 ortholog groups including ORFs associated with reactions in the most recent metabolic model for E. coli K-12. For orthologous groups that contain an ORF already represented in the MG1655 model, the gene to protein to reaction associations represented in this model could then be easily propagated to other E. coli strain models. All remaining orthologous groups were evaluated to see if new metabolic reactions could be added to generate a pangenome-scale metabolic model (iEco1712_pan). The pangenome model included reactions from a metabolic model update for E. coli K-12 MG1655 (iEco1339_MG1655) and enabled development of five additional strain-specific genome-scale metabolic models. These additional models include a second K-12 strain (iEco1335_W3110) and four pathogenic strains (two enterohemorrhagic E. coli O157:H7 and two uropathogens). When compared to the E. coli K-12 models, the metabolic models for the enterohemorrhagic (iEco1344_EDL933 and iEco1345_Sakai) and uropathogenic strains (iEco1288_CFT073 and iEco1301_UTI89) contained numerous lineage-specific gene and reaction differences. All six E. coli models were evaluated by comparing model predictions to carbon source utilization measurements under aerobic and anaerobic conditions, and to batch growth profiles in minimal media with 0.2% (w/v) glucose. An ancestral genome-scale
New mouse models for metabolic bone diseases generated by genome-wide ENU mutagenesis.

PubMed

Sabrautzki, Sibylle; Rubio-Aliaga, Isabel; Hans, Wolfgang; Fuchs, Helmut; Rathkolb, Birgit; Calzada-Wack, Julia; Cohrs, Christian M; Klaften, Matthias; Seedorf, Hartwig; Eck, Sebastian; Benet-Pagès, Ana; Favor, Jack; Esposito, Irene; Strom, Tim M; Wolf, Eckhard; Lorenz-Depiereux, Bettina; Hrabě de Angelis, Martin

2012-08-01

Metabolic bone disorders arise as primary diseases or may be secondary due to a multitude of organ malfunctions. Animal models are required to understand the molecular mechanisms responsible for the imbalances of bone metabolism in disturbed bone mineralization diseases. Here we present the isolation of mutant mouse models for metabolic bone diseases by phenotyping blood parameters that target bone turnover within the large-scale genome-wide Munich ENU Mutagenesis Project. A screening panel of three clinical parameters, also commonly used as biochemical markers in patients with metabolic bone diseases, was chosen. Total alkaline phosphatase activity and total calcium and inorganic phosphate levels in plasma samples of F1 offspring produced from ENU-mutagenized C3HeB/FeJ male mice were measured. Screening of 9,540 mice led to the identification of 257 phenodeviants, of which 190 were tested by genetic confirmation crosses. Seventy-one new dominant mutant lines showing alterations of at least one of the biochemical parameters of interest were confirmed. Fifteen mutations among three genes (Phex, Casr, and Alpl) have been identified by positional-candidate gene approaches, and one mutation of the Asgr1 gene was identified by next-generation sequencing. All new mutant mouse lines are offered as a resource for the scientific community.
Mining Genomes of Three Marine Sponge-Associated Actinobacterial Isolates for Secondary Metabolism.

PubMed

Horn, Hannes; Hentschel, Ute; Abdelmohsen, Usama Ramadan

2015-10-01

Here, we report the draft genome sequences of three actinobacterial isolates, Micromonospora sp. RV43, Rubrobacter sp. RV113, and Nocardiopsis sp. RV163, that had previously been isolated from Mediterranean sponges. The draft genomes were analyzed for the presence of gene clusters indicative of secondary metabolism using the antiSMASH 3.0 and NaPDoS pipelines. Our findings demonstrated the chemical richness of sponge-associated actinomycetes and the efficacy of genome mining in exploring the genomic potential of sponge-derived actinomycetes. Copyright © 2015 Horn et al.
Genome-scale dynamic modeling of the competition between Rhodoferax and Geobacter in anoxic subsurface environments

PubMed Central

Zhuang, Kai; Izallalen, Mounir; Mouser, Paula; Richter, Hanno; Risso, Carla; Mahadevan, Radhakrishnan; Lovley, Derek R

2011-01-01

The advent of rapid complete genome sequencing, and the potential to capture this information in genome-scale metabolic models, provide the possibility of comprehensively modeling microbial community interactions. For example, Rhodoferax and Geobacter species are acetate-oxidizing Fe(III)-reducers that compete in anoxic subsurface environments, and this competition may have an influence on the in situ bioremediation of uranium-contaminated groundwater. Therefore, genome-scale models of Geobacter sulfurreducens and Rhodoferax ferrireducens were used to evaluate how Geobacter and Rhodoferax species might compete under diverse conditions found in a uranium-contaminated aquifer in Rifle, CO. The model predicted that at the low rates of acetate flux expected under natural conditions at the site, Rhodoferax will outcompete Geobacter as long as sufficient ammonium is available. The model also predicted that when high concentrations of acetate are added during in situ bioremediation, Geobacter species would predominate, consistent with field-scale observations. This can be attributed to the higher expected growth yields of Rhodoferax and the ability of Geobacter to fix nitrogen. The modeling predicted relative proportions of Geobacter and Rhodoferax in geochemically distinct zones of the Rifle site that were comparable to those that were previously documented with molecular techniques. The model also predicted that under nitrogen fixation, higher carbon and electron fluxes would be diverted toward respiration rather than biomass formation in Geobacter, providing a potential explanation for enhanced in situ U(VI) reduction in low-ammonium zones. These results show that genome-scale modeling can be a useful tool for predicting microbial interactions in subsurface environments and show promise for designing bioremediation strategies. PMID:20668487
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets.

PubMed

Bicer, Tekin; Gürsoy, Doğa; Andrade, Vincent De; Kettimuthu, Rajkumar; Scullin, William; Carlo, Francesco De; Foster, Ian T

2017-01-01

Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used imaging techniques that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments result in rapid data generation, the analysis and reconstruction of the collected data may require hours or even days of computation time with a medium-sized workstation, which hinders the scientific progress that relies on the results of analysis. We present Trace, a data-intensive computing engine that we have developed to enable high-performance implementation of iterative tomographic reconstruction algorithms for parallel computers. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations that we apply to the replicated reconstruction objects and evaluate them using tomography datasets collected at the Advanced Photon Source. Our experimental evaluations show that our optimizations and parallelization techniques can provide 158× speedup using 32 compute nodes (384 cores) over a single-core configuration and decrease the end-to-end processing time of a large sinogram (with 4501 × 1 × 22,400 dimensions) from 12.5 h to <5 min per iteration. The proposed tomographic reconstruction engine can efficiently process large-scale tomographic data using many compute nodes and minimize reconstruction times.
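The scaling figures quoted in this abstract can be put in perspective with a short calculation. The sketch below uses only the numbers stated in the abstract (158× speedup, 384 cores, 12.5 h, 5 min); the function name is illustrative, and because the abstract reports "<5 min", the time ratio computed here is a lower bound rather than a measured value.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


# Reported: 158x speedup on 384 cores relative to a single core.
efficiency = parallel_efficiency(158, 384)

# Reported: 12.5 h per iteration reduced to under 5 min per iteration,
# so the reduction factor is at least (12.5 h * 60 min/h) / 5 min.
min_time_ratio = (12.5 * 60) / 5

print(f"parallel efficiency: {efficiency:.1%}")
print(f"end-to-end time reduced by at least {min_time_ratio:.0f}x")
```

A parallel efficiency of roughly 41% indicates that a substantial share of the 384 cores' capacity goes to communication, synchronization, or other overheads rather than to reconstruction work itself.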
Delineation of metabolic gene clusters in plant genomes by chromatin signatures

PubMed Central

Yu, Nan; Nützmann, Hans-Wilhelm; MacDonald, James T.; Moore, Ben; Field, Ben; Berriri, Souha; Trick, Martin; Rosser, Susan J.; Kumar, S. Vinod; Freemont, Paul S.; Osbourn, Anne

2016-01-01

Plants are a tremendous source of diverse chemicals, including many natural product-derived drugs. It has recently become apparent that the genes for the biosynthesis of numerous different types of plant natural products are organized as metabolic gene clusters, thereby unveiling a highly unusual form of plant genome architecture and offering novel avenues for discovery and exploitation of plant specialized metabolism. Here we show that these clustered pathways are characterized by distinct chromatin signatures of histone H3 lysine 27 trimethylation (H3K27me3) and histone H2A variant H2A.Z, associated with cluster repression and activation, respectively, and represent discrete windows of co-regulation in the genome. We further demonstrate that knowledge of these chromatin signatures along with chromatin mutants can be used to mine genomes for cluster discovery. The roles of H3K27me3 and H2A.Z in repression and activation of single genes in plants are well known. However, our discovery of highly localized operon-like co-regulated regions of chromatin modification is unprecedented in plants. Our findings raise intriguing parallels with groups of physically linked multi-gene complexes in animals and with clustered pathways for specialized metabolism in filamentous fungi. PMID:26895889
Intramolecular stable isotope distributions detect plant metabolic responses on century time scales

NASA Astrophysics Data System (ADS)

Schleucher, Jürgen; Ehlers, Ina; Augusti, Angela; Betson, Tatiana

2014-05-01

vast majority of crop species. To access century time scales, we traced this metabolic signal in historic material of two crop species during the past 100 years and find the same response as predicted from the greenhouse experiments. This allows estimating how much photorespiration has been reduced due to the anthropogenic CO2 emission during the 20th century, and shows that plants have not acclimated to increasing [CO2] during more than 100 generations. In summary, we demonstrate that metabolic responses of plants to environmental changes create intramolecular isotope signals. These signals can be identified in manipulation experiments and can be retrieved from plant archives. The isotope abundance of each intramolecular position is set by specific isotope fractionations, such as enzyme isotope effects or hydrogen exchange with xylem water (Augusti et al., Chem. Geol. 2008). Therefore it may be possible to simultaneously reconstruct several physiologic or climate signals from an archive of a single molecule. The principles governing intramolecular isotope distributions are general for all metabolites and isotopes (D, 13C), therefore intramolecular isotope distributions can multiply the information content of paleo archives. In particular, they allow extraction of metabolic information on long time scales, thereby connecting plant physiology with paleo research.
A novel untargeted metabolomics correlation-based network analysis incorporating human metabolic reconstructions

PubMed Central

2013-01-01

Background Metabolomics has become increasingly popular in the study of disease phenotypes and molecular pathophysiology. One branch of metabolomics that encompasses the high-throughput screening of cellular metabolism is metabolic profiling. In the present study, the metabolic profiles of different tumour cells from colorectal carcinoma and breast adenocarcinoma were exposed to hypoxic and normoxic conditions and these have been compared to reveal the potential metabolic effects of hypoxia on the biochemistry of the tumour cells; this may contribute to their survival in oxygen compromised environments. In an attempt to analyse the complex interactions between metabolites beyond routine univariate and multivariate data analysis methods, correlation analysis has been integrated with a human metabolic reconstruction to reveal connections between pathways that are associated with normoxic or hypoxic oxygen environments. Results Correlation analysis has revealed statistically significant connections between metabolites, where differences in correlations between cells exposed to different oxygen levels have been highlighted as markers of hypoxic metabolism in cancer. Network mapping onto reconstructed human metabolic models is a novel addition to correlation analysis. Correlated metabolites have been mapped onto the Edinburgh human metabolic network (EHMN) with the aim of interlinking metabolites found to be regulated in a similar fashion in response to oxygen. This revealed novel pathways within the metabolic network that may be key to tumour cell survival at low oxygen. Results show that the metabolic responses to lowering oxygen availability can be conserved or specific to a particular cell line. Network-based correlation analysis identified conserved metabolites including malate, pyruvate, 2-oxoglutarate, glutamate and fructose-6-phosphate. In this way, this method has revealed metabolites not previously linked, or less well recognised, with respect to hypoxia
Version 6 of the consensus yeast metabolic network refines biochemical coverage and improves model performance

PubMed Central

Heavner, Benjamin D.; Smallbone, Kieran; Price, Nathan D.; Walker, Larry P.

2013-01-01

Updates to maintain a state-of-the-art reconstruction of the yeast metabolic network are essential to reflect our understanding of yeast metabolism and functional organization, to eliminate any inaccuracies identified in earlier iterations, to improve predictive accuracy and to continue to expand into novel subsystems to extend the comprehensiveness of the model. Here, we present version 6 of the consensus yeast metabolic network (Yeast 6) as an update to the community effort to computationally reconstruct the genome-scale metabolic network of Saccharomyces cerevisiae S288c. Yeast 6 comprises 1458 metabolites participating in 1888 reactions, which are annotated with 900 yeast genes encoding the catalyzing enzymes. Compared with Yeast 5, Yeast 6 demonstrates improved sensitivity, specificity and positive and negative predictive values for predicting gene essentiality in glucose-limited aerobic conditions when analyzed with flux balance analysis. Additionally, Yeast 6 improves the accuracy of predicting the likelihood that a mutation will cause auxotrophy. The network reconstruction is available as a Systems Biology Markup Language (SBML) file enriched with Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM)-compliant annotations. Small- and macromolecules in the network are referenced to authoritative databases such as Uniprot or ChEBI. Molecules and reactions are also annotated with appropriate publications that contain supporting evidence. Yeast 6 is freely available at http://yeast.sf.net/ as three separate SBML files: a model using the SBML level 3 Flux Balance Constraint package, a model compatible with the MATLAB® COBRA Toolbox for backward compatibility and a reconstruction containing only reactions for which there is experimental evidence (without the non-biological reactions necessary for simulating growth). Database URL: http://yeast.sf.net/ PMID:23935056
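The four performance measures named in this abstract follow from a confusion matrix of predicted versus observed gene essentiality. The sketch below shows the standard definitions; the counts are invented purely for illustration and are not Yeast 6 results, and treating "essential" as the positive class is an assumption, since the abstract does not state the convention used.

```python
from typing import Dict


def essentiality_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard confusion-matrix metrics.

    Assumed convention: a "positive" is a gene predicted to be essential.
      tp: predicted essential, observed essential
      fp: predicted essential, observed non-essential
      tn: predicted non-essential, observed non-essential
      fn: predicted non-essential, observed essential
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of essential genes recovered
        "specificity": tn / (tn + fp),  # share of non-essential genes recovered
        "ppv": tp / (tp + fp),          # reliability of an "essential" call
        "npv": tn / (tn + fn),          # reliability of a "non-essential" call
    }


# Hypothetical counts, for illustration only.
metrics = essentiality_metrics(tp=40, fp=10, tn=140, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all four values matters for essentiality screens because essential genes are usually a minority: a model can reach high specificity and negative predictive value simply by predicting that most knockouts are viable.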
Single-cell genomics reveals co-metabolic interactions within uncultivated Marine Group A bacteria

NASA Astrophysics Data System (ADS)

Hawley, A. K.; Hallam, S. J.

2016-02-01

Marine Group A (MGA) bacteria represent a ubiquitous and abundant candidate phylum enriched in oxygen minimum zones (OMZs) and the deep ocean. Despite MGA prevalence, little is known about their ecology and biogeochemistry. Here we chart the metabolic potential of 26 MGA single-cell amplified genomes (SAGs) sourced from different environments spanning ecothermodynamic gradients including open ocean waters, OMZs and methanogenic environments including a terephthalate-degrading bioreactor. Metagenomic contig recruitment to SAGs combined with tetra-nucleotide frequency distribution patterns resolved nine MGA population genome bins. All population genomes exhibited genomic streamlining, with open ocean MGA being the most reduced. Different strategies for carbohydrate utilization, carbon fixation, energy metabolism and respiratory pathways were identified between population genome bins, including various roles in the nitrogen and sulfur cycles. MGA inhabiting OMZ oxyclines encoded genes for partial denitrification with potential to feed into anammox and nitrification as well as a polysulfide reductase with a potential role in the cryptic sulfur cycle. MGA inhabiting anoxic waters encoded NiFe hydrogenase and nitrous oxide reductase with the potential to complete partial denitrification pathways previously linked to sulfur oxidation in SUP05 bacteria. MGA from methanogenic environments encoded genes mediating cascading syntrophic interactions with fatty acid degraders and methanogens including reverse electron transport potential. The MGA phylum appears to have evolved alternative metabolic innovations adapting specific subgroups to occupy specific niches along ecothermodynamic gradients. Additionally, expression of MGA genes from different OMZ environments supports that these subgroups manifest an increasing propensity for co-metabolic interactions under energy limiting conditions that mandates a cooperative mode of existence with important implications for C, N and S cycling in
Identifying metabolic enzymes with multiple types of association evidence

PubMed Central

Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M

2006-01-01

Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as a top candidate within 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
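The evaluation described in this abstract (is the true gene among the top 10 candidates once several evidence types are combined?) can be sketched as follows. The weighted sum used here is an assumed, deliberately simple combination rule chosen for illustration; the abstract does not say how the evidence types are combined, and the gene names, evidence labels, scores and weights below are all hypothetical.

```python
from typing import Dict, List


def rank_candidates(evidence: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[str]:
    """Rank candidate genes by a weighted sum of their evidence scores."""
    combined = {
        gene: sum(weights[kind] * score for kind, score in scores.items())
        for gene, scores in evidence.items()
    }
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(combined, key=lambda gene: (-combined[gene], gene))


def in_top_k(ranking: List[str], gene: str, k: int = 10) -> bool:
    """True if `gene` appears among the first k entries of the ranking."""
    return gene in ranking[:k]


# Hypothetical scores in [0, 1] for three candidate genes.
EVIDENCE = {
    "geneA": {"chromosome_clustering": 0.9, "phylogenetic_profile": 0.2, "coexpression": 0.1},
    "geneB": {"chromosome_clustering": 0.1, "phylogenetic_profile": 0.8, "coexpression": 0.9},
    "geneC": {"chromosome_clustering": 0.2, "phylogenetic_profile": 0.1, "coexpression": 0.3},
}
WEIGHTS = {"chromosome_clustering": 0.5, "phylogenetic_profile": 0.3, "coexpression": 0.2}

ranking = rank_candidates(EVIDENCE, WEIGHTS)
```

Applied over every enzymatic function with a known gene, the fraction of cases where `in_top_k` returns true gives a top-k recovery rate of the kind the abstract reports (60% within the top 10 of 3551 candidates).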
Genomic reconstruction of novel sediment phyla enlightens roles in sedimentary biogeochemical cycling

NASA Astrophysics Data System (ADS)

Baker, B.; Lazar, C.; Seitz, K.; Teske, A.; Hinrichs, K. U.; Dick, G.

2015-12-01

Estuaries are among the most productive habitats on the planet. Microbes in estuary sediments control the turnover of organic carbon, and the anaerobic cycling of nitrogen and sulfur. These communities are complex and primarily made up of uncultured lineages, thus little is known about how ecological and metabolic processes are partitioned in sediments. We reconstructed 82 bacterial and 24 archaeal high-quality genomes from different redox regimes (sulfate-rich, sulfate-methane transition zone, and methane-rich zones) of estuary sediments. These bacteria belong to 23 distinct groups, including uncultured candidate phyla (e.g., KSB1, TA06, and KD3-62), and three newly described phyla (WOR-1, -2, and -3). The archaea encompass 8 widespread sediment lineages including MBG-D, RC-III and IV, Z7ME43, Parvarchaeota, Lokiarchaeota (MBG-B), SAGMEG, Bathyarchaeota (groups MCG-1, -6, -7, and -15) and the previously unrecognized, deeply branching phylum "Thorarchaeota". The uncultured phyla mediate essential biogeochemical processes of the estuarine environment. Z7ME43 archaea have genes for S disproportionation (S0 reduction and thiosulfate reduction and oxidation). SAGMEG appear to be strict anaerobes capable of coupling CO/H2 oxidation to either S0 or nitrite reduction and have novel RubisCO genes for carbon fixation. Thorarchaeota contain pathways for acetate production from the degradation of detrital proteins and intermediate S cycling. Furthermore, the gene content of this group revealed links in the evolutionary histories of archaea and eukaryotes. This dataset extends our knowledge of the metabolic potential of several uncultured phyla. We were able to chart the flow of carbon and nutrients through the multiple layers of bacterial processing and reveal potential ecological interactions within the communities.

The gut microbiota modulates host amino acid and glutathione metabolism in mice

PubMed Central

Mardinoglu, Adil; Shoaie, Saeed; Bergentall, Mattias; Ghaffari, Pouyan; Zhang, Cheng; Larsson, Erik; Bäckhed, Fredrik; Nielsen, Jens

2015-01-01

The gut microbiota has been proposed as an environmental factor that promotes the progression of metabolic diseases. Here, we investigated how the gut microbiota modulates the global metabolic differences in duodenum, jejunum, ileum, colon, liver, and two white adipose tissue depots obtained from conventionally raised (CONV-R) and germ-free (GF) mice using gene expression data and tissue-specific genome-scale metabolic models (GEMs). We created a generic mouse metabolic reaction (MMR) GEM, reconstructed 28 tissue-specific GEMs based on proteomics data, and manually curated GEMs for small intestine, colon, liver, and adipose tissues. We used these functional models to determine the global metabolic differences between CONV-R and GF mice. Based on gene expression data, we found that the gut microbiota affects the host amino acid (AA) metabolism, which leads to modifications in glutathione metabolism. To validate our predictions, we measured the level of AAs and N-acetylated AAs in the hepatic portal vein of CONV-R and GF mice. Finally, we simulated the metabolic differences between the small intestine of the CONV-R and GF mice accounting for the content of the diet and relative gene expression differences. Our analyses revealed that the gut microbiota influences host amino acid and glutathione metabolism in mice. PMID:26475342
Genome-wide association studies for the identification of biomarkers in metabolic diseases.

PubMed

Pattin, Kristine A; Moore, Jason H

2010-01-01

The genetics of metabolic disorders such as obesity and type II diabetes is complicated, and the medical research community is making great strides toward understanding the biological and genetic underpinnings of these diseases, with the hope of improving therapeutic, diagnostic and preventive strategies. Although research on metabolic disorders has been continuing for decades, the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005 gave rise to an abundance of research tools, such as genome-wide genotyping, which allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to such complex diseases. This review discusses the complex nature of metabolic disorders, specifically obesity and type II diabetes, as well as the limitations of GWAS as applied to these disorders. While acknowledging these limitations, the review aims to provide insight into how GWAS can be adapted to, and be advantageous in, the clinical setting, enhancing prevention, diagnosis and treatment of these diseases. Using GWAS in a clinical setting is a complex challenge, yet it is hoped that this tool will ultimately allow the development of pharmaceutical options that target the cause of metabolic disorders, not just the symptoms.
The large-scale organization of metabolic networks

NASA Astrophysics Data System (ADS)

Jeong, H.; Tombor, B.; Albert, R.; Oltvai, Z. N.; Barabási, A.-L.

2000-10-01

In a cell or microorganism, the processes that generate mass, energy, information transfer and cell-fate specification are seamlessly integrated through a complex network of cellular constituents and reactions. However, despite the key role of these networks in sustaining cellular functions, their large-scale structure is essentially unknown. Here we present a systematic comparative mathematical analysis of the metabolic networks of 43 organisms representing all three domains of life. We show that, despite significant variation in their individual constituents and pathways, these metabolic networks have the same topological scaling properties and show striking similarities to the inherent organization of complex non-biological systems. This may indicate that metabolic organization is not only identical for all living organisms, but also complies with the design principles of robust and error-tolerant scale-free networks, and may represent a common blueprint for the large-scale organization of interactions among all cellular constituents.
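The topological scaling analysis described in this abstract starts from a degree distribution P(k): the fraction of metabolites that take part in exactly k reactions. The sketch below computes it for a four-reaction toy network. It is a simplification made for illustration (undirected participation counts, with no distinction between substrates and products), not a reproduction of the published analysis, and the reaction list is a hypothetical fragment of glycolysis.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Reaction = Tuple[List[str], List[str]]  # (substrates, products)


def metabolite_degrees(reactions: Iterable[Reaction]) -> Counter:
    """Count how many reactions each metabolite participates in."""
    degrees: Counter = Counter()
    for substrates, products in reactions:
        for metabolite in set(substrates) | set(products):
            degrees[metabolite] += 1
    return degrees


def degree_distribution(degrees: Counter) -> Dict[int, float]:
    """Return P(k): the fraction of metabolites with degree exactly k."""
    total = len(degrees)
    counts = Counter(degrees.values())
    return {k: n / total for k, n in sorted(counts.items())}


# Hypothetical toy network.
TOY_REACTIONS: List[Reaction] = [
    (["glc", "atp"], ["g6p", "adp"]),
    (["g6p"], ["f6p"]),
    (["f6p", "atp"], ["fbp", "adp"]),
    (["pep", "adp"], ["pyr", "atp"]),
]

degrees = metabolite_degrees(TOY_REACTIONS)
distribution = degree_distribution(degrees)
```

Even in this tiny example the cofactors ATP and ADP have the highest degree, which hints at the hub structure of scale-free networks: most metabolites take part in few reactions while a handful take part in many. Testing for power-law scaling, P(k) proportional to k^(-γ), would require a genome-scale network and a proper fit.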
The amphioxus genome and the evolution of the chordate karyotype

DOE Office of Scientific and Technical Information (OSTI.GOV)

Putnam, Nicholas H.; Butts, Thomas; Ferrier, David E.K.

2008-04-01

Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage with a fossil record dating back to the Cambrian. We describe the structure and gene content of the highly polymorphic ~520 million base pair genome of the Florida lancelet Branchiostoma floridae, and analyze it in the context of chordate evolution. Whole genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets, and vertebrates), and allow reconstruction of not only the gene complement of the last common chordate ancestor, but also a partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.
Multi-scale structural community organisation of the human genome.

PubMed

Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

2017-04-11

Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction, the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.
Reconstructing a hydrogen-driven microbial metabolic network in Opalinus Clay rock

DOE PAGES

Bagnoud, Alexandre; Chourey, Karuna; Hettich, Robert L.; ...

2016-10-14

A significant fraction (~20%) of microbial life is found in the terrestrial deep subsurface, yet the metabolic processes extant in those environments are poorly understood. Here we show that H2, injected into the Opalinus Clay formation in a borehole located 300 meters below the surface, fuels a community of microorganisms with interconnected metabolisms. Metagenomic binning and metaproteomic analysis reveal a complete carbon cycle, driven by autotrophic hydrogen oxidizers. Dead biomass from these organisms is a substrate for a fermenting bacterium that produces acetate as a product. In turn, complete oxidizer heterotrophic sulfate-reducing bacteria utilize acetate and oxidize it to CO2, closing the cycle. This metabolic reconstruction sheds light onto a hydrogen-driven carbon cycle, and a sunlight-independent ecosystem in the deep subsurface.
1-CMDb: A Curated Database of Genomic Variations of the One-Carbon Metabolism Pathway.

PubMed

Bhat, Manoj K; Gadekar, Veerendra P; Jain, Aditya; Paul, Bobby; Rai, Padmalatha S; Satyamoorthy, Kapaettu

2017-01-01

The one-carbon metabolism pathway is vital in maintaining tissue homeostasis by driving the critical reactions of folate and methionine cycles. A myriad of genetic and epigenetic events mark the rate of reactions in a tissue-specific manner. Integration of these to predict and provide personalized health management requires robust computational tools that can process multiomics data. The DNA sequences that may determine the chain of biological events and the endpoint reactions within one-carbon metabolism genes remain to be comprehensively recorded. Hence, we designed the one-carbon metabolism database (1-CMDb) as a platform to interrogate its association with a host of human disorders. DNA sequence and network information of a total of 48 genes were extracted from a literature survey and KEGG pathway that are involved in the one-carbon folate-mediated pathway. The information generated, collected, and compiled for all these genes from the UCSC genome browser included the single nucleotide polymorphisms (SNPs), CpGs, copy number variations (CNVs), and miRNAs, and a comprehensive database was created. Furthermore, a significant correlation analysis was performed for SNPs in the pathway genes. Detailed data of SNPs, CNVs, CpG islands, and miRNAs for 48 folate pathway genes were compiled. The SNPs in CNVs (9670), CpGs (984), and miRNAs (14) were also compiled for all pathway genes. The SIFT score, the prediction and PolyPhen score, as well as the prediction for each of the SNPs were tabulated and represented for folate pathway genes. Also included in the database for folate pathway genes were the links to 124 various phenotypes and disease associations as reported in the literature and from publicly available information. A comprehensive database was generated consisting of genomic elements within and among SNPs, CNVs, CpGs, and miRNAs of one-carbon metabolism pathways to facilitate (a) single source of information and (b) integration into large-genome scale network
A Large-Scale Comparative Metagenomic Study Reveals the Functional Interactions in Six Bloom-Forming Microcystis-Epibiont Communities

PubMed Central

Li, Qi; Lin, Feibi; Yang, Chen; Wang, Juanping; Lin, Yan; Shen, Mengyuan; Park, Min S.; Li, Tao; Zhao, Jindong

2018-01-01

Cyanobacterial blooms are worldwide issues of societal concern and scientific interest. Lake Taihu and Lake Dianchi, two of the largest lakes in China, have been suffering from annual Microcystis-based blooms over the past two decades. These two eutrophic lakes differ in both nutrient load and environmental parameters, where Microcystis microbiota consisting of different Microcystis morphospecies and associated bacteria (epibionts) have dominated. We conducted a comprehensive metagenomic study that analyzed species diversity, community structure, functional components, metabolic pathways and networks to investigate functional interactions among the members of six Microcystis-epibiont communities in these two lakes. Our integrated metagenomic pipeline consisted of efficient assembly, binning, annotation, and quality assurance methods that ensured high-quality genome reconstruction. This study provides a total of 68 reconstructed genomes including six complete Microcystis genomes and 28 high quality bacterial genomes of epibionts belonging to 14 distinct taxa. This metagenomic dataset constitutes the largest reference genome catalog available for genome-centric studies of the Microcystis microbiome. Epibiont community composition appears to be dynamic rather than fixed, and the functional profiles of communities were related to the environment of origin. This study demonstrates mutualistic interactions between Microcystis and epibionts at genetic and metabolic levels. Metabolic pathway reconstruction provided evidence for functional complementation in nitrogen and sulfur cycles, fatty acid catabolism, vitamin synthesis, and aromatic compound degradation among community members. Thus, bacterial social interactions within Microcystis-epibiont communities not only shape species composition, but also stabilize the communities' functional profiles. These interactions appear to play an important role in environmental adaptation of Microcystis colonies. PMID:29731741
Lipid metabolism in Rhodnius prolixus: Lessons from the genome.

PubMed

Majerowicz, David; Calderón-Fernández, Gustavo M; Alves-Bezerra, Michele; De Paula, Iron F; Cardoso, Lívia S; Juárez, M Patricia; Atella, Georgia C; Gondim, Katia C

2017-01-05

The kissing bug Rhodnius prolixus is both an important vector of Chagas' disease and an interesting model for investigation into the field of physiology, including lipid metabolism. The publication of this insect genome will bring a huge amount of new molecular biology data to be used in future experiments. Although this work represents a promising scenario, a preliminary analysis of the sequence data is necessary to identify and annotate the genes involved in lipid metabolism. Here, we used bioinformatics tools and gene expression analysis to explore genes from different gene families and pathways, including genes for fat breakdown, such as lipases and phospholipases, and enzymes from β-oxidation, fatty acid metabolism, and acyl-CoA and glycerolipid synthesis. The R. prolixus genome encodes 31 putative lipase genes, including 21 neutral lipases and 5 acid lipases. The expression profiles of some of these genes were analyzed. We were able to identify nine phospholipase A2 genes. A variety of gene families that participate in fatty acid synthesis and modification were studied, including fatty acid synthase, elongase, desaturase and reductase. Concerning the synthesis of glycerolipids, we found a second isoform of glycerol-3-phosphate acyltransferase that was ubiquitously expressed throughout the organs. Finally, all genes involved in fatty acid β-oxidation were identified, but not a long-chain acyl-CoA dehydrogenase. These results provide fundamental data to be used in future research on insect lipid metabolism and its possible relevance to Chagas' disease transmission. Copyright © 2016 Elsevier B.V. All rights reserved.
Delineation of metabolic gene clusters in plant genomes by chromatin signatures.

PubMed

Yu, Nan; Nützmann, Hans-Wilhelm; MacDonald, James T; Moore, Ben; Field, Ben; Berriri, Souha; Trick, Martin; Rosser, Susan J; Kumar, S Vinod; Freemont, Paul S; Osbourn, Anne

2016-03-18

Plants are a tremendous source of diverse chemicals, including many natural product-derived drugs. It has recently become apparent that the genes for the biosynthesis of numerous different types of plant natural products are organized as metabolic gene clusters, thereby unveiling a highly unusual form of plant genome architecture and offering novel avenues for discovery and exploitation of plant specialized metabolism. Here we show that these clustered pathways are characterized by distinct chromatin signatures of histone 3 lysine 27 trimethylation (H3K27me3) and histone 2 variant H2A.Z, associated with cluster repression and activation, respectively, and represent discrete windows of co-regulation in the genome. We further demonstrate that knowledge of these chromatin signatures along with chromatin mutants can be used to mine genomes for cluster discovery. The roles of H3K27me3 and H2A.Z in repression and activation of single genes in plants are well known. However, our discovery of highly localized operon-like co-regulated regions of chromatin modification is unprecedented in plants. Our findings raise intriguing parallels with groups of physically linked multi-gene complexes in animals and with clustered pathways for specialized metabolism in filamentous fungi. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Phylogenomic reconstruction of archaeal fatty acid metabolism

PubMed Central

Dibrova, Daria V.; Galperin, Michael Y.; Mulkidjanian, Armen Y.

2014-01-01

While certain archaea appear to synthesize and/or metabolize fatty acids, the respective pathways still remain obscure. By analyzing the genomic distribution of the key lipid-related enzymes, we were able to identify the likely components of the archaeal pathway of fatty acid metabolism, namely, a combination of the enzymes of bacterial-type β-oxidation of fatty acids (acyl-CoA-dehydrogenase, enoyl-CoA hydratase, and 3-hydroxyacyl-CoA dehydrogenase) with paralogs of the archaeal acetyl-CoA C-acetyltransferase, an enzyme of the mevalonate biosynthesis pathway. These three β-oxidation enzymes working in the reverse direction could potentially catalyze biosynthesis of fatty acids, with paralogs of acetyl-CoA C-acetyltransferase performing addition of C2 fragments. The presence in archaea of the genes for energy-transducing membrane enzyme complexes, such as cytochrome bc complex, cytochrome c oxidase, and diverse rhodopsins, was found to correlate with the presence of the proposed system of fatty acid biosynthesis. We speculate that because these membrane complexes functionally depend on fatty acid chains, their genes could have been acquired via lateral gene transfer from bacteria only by those archaea that already possessed a system of fatty acid biosynthesis. The proposed pathway of archaeal fatty acid metabolism operates in extreme conditions and therefore might be of interest in the context of biofuel production and other industrial applications. PMID:24818264
Differential retention of metabolic genes following whole-genome duplication.

PubMed

Gout, Jean-François; Duret, Laurent; Kahn, Daniel

2009-05-01

Classical studies in Metabolic Control Theory have shown that metabolic fluxes usually exhibit little sensitivity to changes in individual enzyme activity, yet remain sensitive to global changes of all enzymes in a pathway. Therefore, little selective pressure is expected on the dosage or expression of individual metabolic genes, yet entire pathways should still be constrained. However, a direct estimate of this selective pressure had not been evaluated. Whole-genome duplications (WGDs) offer a good opportunity to address this question by analyzing the fates of metabolic genes during the massive gene losses that follow. Here, we take advantage of the successive rounds of WGD that occurred in the Paramecium lineage. We show that metabolic genes exhibit different gene retention patterns than nonmetabolic genes. Contrary to what was expected for individual genes, metabolic genes appeared more retained than other genes after the recent WGD, which was best explained by selection for gene expression operating on entire pathways. Metabolic genes also tend to be less retained when present at high copy number before WGD, contrary to other genes that show a positive correlation between gene retention and preduplication copy number. This is rationalized on the basis of the classical concave relationship relating metabolic fluxes with enzyme expression.
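The contrast drawn above between changing one enzyme and changing a whole pathway can be illustrated with a toy calculation. The sketch below assumes a linear pathway whose flux behaves like resistances in series, J = 1 / Σ(1/e_i); that functional form, the function name `pathway_flux`, and the ten-step pathway with equal enzyme levels are illustrative assumptions and are not taken from the study.

```python
def pathway_flux(enzyme_levels):
    """Toy flux of a linear pathway: J = 1 / sum(1 / e_i).

    Assumed simplified form (each step acts like a resistance in series);
    it is not the model used in the study.
    """
    return 1.0 / sum(1.0 / e for e in enzyme_levels)


baseline = [1.0] * 10             # ten steps, equal enzyme levels (hypothetical)
one_doubled = [2.0] + [1.0] * 9   # double the dosage of a single enzyme
all_doubled = [2.0] * 10          # double every enzyme, as after a WGD

j0 = pathway_flux(baseline)
gain_single = pathway_flux(one_doubled) / j0
gain_global = pathway_flux(all_doubled) / j0

print(f"single-enzyme doubling: x{gain_single:.3f}")
print(f"whole-pathway doubling: x{gain_global:.3f}")
```

In this toy setting, doubling one of ten enzymes raises the flux by only about 5%, whereas doubling all ten doubles it, which is the concave flux-versus-expression behaviour the abstract appeals to.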
Sparse deconvolution for the large-scale ill-posed inverse problem of impact force reconstruction

NASA Astrophysics Data System (ADS)

Qiao, Baijie; Zhang, Xingwu; Gao, Jiawei; Liu, Ruonan; Chen, Xuefeng

2017-01-01

Most previous regularization methods for solving the inverse problem of force reconstruction are to minimize the l2-norm of the desired force. However, these traditional regularization methods such as Tikhonov regularization and truncated singular value decomposition, commonly fail to solve the large-scale ill-posed inverse problem in moderate computational cost. In this paper, taking into account the sparse characteristic of impact force, the idea of sparse deconvolution is first introduced to the field of impact force reconstruction and a general sparse deconvolution model of impact force is constructed. Second, a novel impact force reconstruction method based on the primal-dual interior point method (PDIPM) is proposed to solve such a large-scale sparse deconvolution model, where minimizing the l2-norm is replaced by minimizing the l1-norm. Meanwhile, the preconditioned conjugate gradient algorithm is used to compute the search direction of PDIPM with high computational efficiency. Finally, two experiments including the small-scale or medium-scale single impact force reconstruction and the relatively large-scale consecutive impact force reconstruction are conducted on a composite wind turbine blade and a shell structure to illustrate the advantage of PDIPM. Compared with Tikhonov regularization, PDIPM is more efficient, accurate and robust whether in the single impact force reconstruction or in the consecutive impact force reconstruction.
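The idea of replacing the l2-norm penalty with an l1-norm penalty can be sketched on a small synthetic problem. The example below is a minimal illustration only: it uses the iterative shrinkage-thresholding algorithm (ISTA) as a simple stand-in, not the primal-dual interior point method described in the abstract, and the impulse response, spike positions, regularization weight `lam`, and all function names are hypothetical.

```python
def convolve(h, f):
    """Causal convolution truncated to len(f): y[i] = sum_k h[k] * f[i - k]."""
    return [sum(h[k] * f[i - k] for k in range(min(len(h), i + 1)))
            for i in range(len(f))]


def correlate(h, r):
    """Adjoint of convolve: g[j] = sum_k h[k] * r[j + k]."""
    n = len(r)
    return [sum(h[k] * r[j + k] for k in range(min(len(h), n - j)))
            for j in range(n)]


def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    return max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0)


def ista_deconvolve(h, y, lam, iterations=500):
    """Minimize 0.5 * ||h * f - y||^2 + lam * ||f||_1 with ISTA."""
    # (sum |h|)^2 bounds the Lipschitz constant of the gradient,
    # so its reciprocal is a safe step size.
    step = 1.0 / sum(abs(v) for v in h) ** 2
    f = [0.0] * len(y)
    for _ in range(iterations):
        residual = [a - b for a, b in zip(convolve(h, f), y)]
        grad = correlate(h, residual)
        f = [soft_threshold(fi - step * gi, step * lam)
             for fi, gi in zip(f, grad)]
    return f


# Hypothetical test signal: two impacts seen through a decaying response.
impulse_response = [1.0, 0.6, 0.3, 0.1]
true_force = [0.0] * 30
true_force[5], true_force[18] = 2.0, 1.0
measured = convolve(impulse_response, true_force)

recovered = ista_deconvolve(impulse_response, measured, lam=0.01)
peaks = sorted(range(len(recovered)), key=lambda i: -abs(recovered[i]))[:2]
print(sorted(peaks))
```

The soft-thresholding step is what drives most entries of the estimate to exactly zero; an l2 penalty would instead shrink all entries smoothly and spread the impact energy over neighbouring samples.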
Unraveling the Light-Specific Metabolic and Regulatory Signatures of Rice through Combined in Silico Modeling and Multiomics Analysis

PubMed Central

Lim, Sun-Hyung; Kim, Jae Kwang; Ha, Sun-Hwa

2015-01-01

Light quality is an important signaling component upon which plants orchestrate various morphological processes, including seed germination and seedling photomorphogenesis. However, it is still unclear how plants, especially food crops, sense various light qualities and modulate their cellular growth and other developmental processes. Therefore, in this work, we initially profiled the transcripts of a model crop, rice (Oryza sativa), under four different light treatments (blue, green, red, and white) as well as in the dark. Concurrently, we reconstructed a fully compartmentalized genome-scale metabolic model of rice cells, iOS2164, containing 2,164 unique genes, 2,283 reactions, and 1,999 metabolites. We then combined the model with transcriptome profiles to elucidate the light-specific transcriptional signatures of rice metabolism. Clearly, light signals mediated rice gene expressions, differentially regulating numerous metabolic pathways: photosynthesis and secondary metabolism were up-regulated in blue light, whereas reserve carbohydrates degradation was pronounced in the dark. The topological analysis of gene expression data with the rice genome-scale metabolic model further uncovered that phytohormones, such as abscisate, ethylene, gibberellin, and jasmonate, are the key biomarkers of light-mediated regulation, and subsequent analysis of the associated genes’ promoter regions identified several light-specific transcription factors. Finally, the transcriptional control of rice metabolism by red and blue light signals was assessed by integrating the transcriptome and metabolome data with constraint-based modeling. The biological insights gained from this integrative systems biology approach offer several potential applications, such as improving the agronomic traits of food crops and designing light-specific synthetic gene circuits in microbial and mammalian systems. PMID:26453433
Jatropha curcas, a biofuel crop: Functional genomics for understanding metabolic pathways and genetic improvement

PubMed Central

Maghuly, Fatemeh; Laimer, Margit

2013-01-01

Jatropha curcas is currently attracting much attention as an oilseed crop for biofuel, as Jatropha can grow under climate and soil conditions that are unsuitable for food production. However, little is known about Jatropha, and there are a number of challenges to be overcome. In fact, Jatropha has not really been domesticated; most of the Jatropha accessions are toxic, which renders the seedcake unsuitable for use as animal feed. The seeds of Jatropha contain high levels of polyunsaturated fatty acids, which negatively impact the biofuel quality. Fruiting of Jatropha is fairly continuous, thus increasing costs of harvesting. Therefore, before starting any improvement program using conventional or molecular breeding techniques, understanding gene function and the genome scale of Jatropha are prerequisites. This review presents currently available and relevant information on the latest technologies (genomics, transcriptomics, proteomics and metabolomics) to decipher important metabolic pathways within Jatropha, such as oil and toxin synthesis. Further, it discusses future directions for biotechnological approaches in Jatropha breeding and improvement. PMID:24092674
WEbcoli: an interactive and asynchronous web application for in silico design and analysis of genome-scale E.coli model.

PubMed

Jung, Tae-Sung; Yeo, Hock Chuan; Reddy, Satty G; Cho, Wan-Sup; Lee, Dong-Yup

2009-11-01

WEbcoli is a WEb application for in silico designing, analyzing and engineering Escherichia coli metabolism. It is devised and implemented using advanced web technologies, thereby leading to enhanced usability and dynamic web accessibility. As a main feature, the WEbcoli system provides a user-friendly rich web interface, allowing users to virtually design and synthesize mutant strains derived from the genome-scale wild-type E.coli model and to customize pathways of interest through a graph editor. In addition, constraints-based flux analysis can be conducted for quantifying metabolic fluxes and characterizing the physiological and metabolic states under various genetic and/or environmental conditions. WEbcoli is freely accessible at http://webcoli.org.
Find_tfSBP: find thermodynamics-feasible and smallest balanced pathways with high yield from large-scale metabolic networks.

PubMed

Xu, Zixiang; Sun, Jibin; Wu, Qiaqing; Zhu, Dunming

2017-12-11

Biologically meaningful metabolic pathways are important references in the design of industrial bacteria. At present, the constraint-based method is the only way to model and simulate a genome-scale metabolic network under steady-state criteria. Because they assume a one-to-one gene-enzyme-reaction association, are computationally difficult, or ignore the yield from substrate to product, previous pathway-finding approaches cannot be effectively applied to find high-yield pathways that are mass balanced in stoichiometry. In addition, the shortest pathways may not be the pathways with high yield. At the same time, a pathway that exists in stoichiometry may not be thermodynamically feasible. Using a mixed integer programming strategy, we put forward an algorithm to identify all the smallest balanced pathways that convert the source compound to the target compound in large-scale metabolic networks. The pathways resulting from our method satisfy the stoichiometric constraints and the non-decomposability condition. In particular, high yield and thermodynamic feasibility are taken into account in our approach. This tool is tailored to direct metabolic engineering practice toward enlarging the metabolic potential of industrial strains by integrating the extensive metabolic network information built from systems biology datasets.
Whole genome sequencing of Saccharomyces cerevisiae: from genotype to phenotype for improved metabolic engineering applications.

PubMed

Otero, José Manuel; Vongsangnak, Wanwipa; Asadollahi, Mohammad A; Olivares-Hernandes, Roberto; Maury, Jérôme; Farinelli, Laurent; Barlocher, Loïc; Osterås, Magne; Schalk, Michel; Clark, Anthony; Nielsen, Jens

2010-12-22

Rapid and efficient microbial cell factory design and construction are possible through the enabling technology, metabolic engineering, which is now being facilitated by systems biology approaches. Metabolic engineering is often complemented by directed evolution, where selective pressure is applied to a partially genetically engineered strain to confer a desirable phenotype. The exact genetic modification or resulting genotype that leads to the improved phenotype is often not identified or understood to enable further metabolic engineering. In this work, whole genome high-throughput sequencing and annotation were used to identify single nucleotide polymorphisms (SNPs) between Saccharomyces cerevisiae strains S288c and CEN.PK113-7D. The yeast strain S288c was the first eukaryote sequenced, serving as the reference genome for the Saccharomyces Genome Database, while CEN.PK113-7D is a preferred laboratory strain for industrial biotechnology research. A total of 13,787 high-quality SNPs were detected between both strains (reference strain: S288c). Considering only metabolic genes (782 of 5,596 annotated genes), a total of 219 metabolism specific SNPs are distributed across 158 metabolic genes, with 85 of the SNPs being nonsynonymous (e.g., encoding amino acid modifications). Amongst metabolic SNPs detected, there was pathway enrichment in the galactose uptake pathway (GAL1, GAL10) and ergosterol biosynthetic pathway (ERG8, ERG9). Physiological characterization confirmed a strong deficiency in galactose uptake and metabolism in S288c compared to CEN.PK113-7D, and similarly, ergosterol content in CEN.PK113-7D was significantly higher in both glucose and galactose supplemented cultivations compared to S288c. Furthermore, DNA microarray profiling of S288c and CEN.PK113-7D in both glucose and galactose batch cultures did not provide a clear hypothesis for major phenotypes observed, suggesting that genotype to phenotype correlations are manifested
Whole genome sequencing of Saccharomyces cerevisiae: from genotype to phenotype for improved metabolic engineering applications

PubMed Central

2010-01-01

Background: Rapid and efficient microbial cell factory design and construction are possible through the enabling technology, metabolic engineering, which is now being facilitated by systems biology approaches. Metabolic engineering is often complemented by directed evolution, where selective pressure is applied to a partially genetically engineered strain to confer a desirable phenotype. The exact genetic modification or resulting genotype that leads to the improved phenotype is often not identified or understood to enable further metabolic engineering. Results: In this work, whole genome high-throughput sequencing and annotation were used to identify single nucleotide polymorphisms (SNPs) between Saccharomyces cerevisiae strains S288c and CEN.PK113-7D. The yeast strain S288c was the first eukaryote sequenced, serving as the reference genome for the Saccharomyces Genome Database, while CEN.PK113-7D is a preferred laboratory strain for industrial biotechnology research. A total of 13,787 high-quality SNPs were detected between both strains (reference strain: S288c). Considering only metabolic genes (782 of 5,596 annotated genes), a total of 219 metabolism specific SNPs are distributed across 158 metabolic genes, with 85 of the SNPs being nonsynonymous (e.g., encoding amino acid modifications). Amongst metabolic SNPs detected, there was pathway enrichment in the galactose uptake pathway (GAL1, GAL10) and ergosterol biosynthetic pathway (ERG8, ERG9). Physiological characterization confirmed a strong deficiency in galactose uptake and metabolism in S288c compared to CEN.PK113-7D, and similarly, ergosterol content in CEN.PK113-7D was significantly higher in both glucose and galactose supplemented cultivations compared to S288c. Furthermore, DNA microarray profiling of S288c and CEN.PK113-7D in both glucose and galactose batch cultures did not provide a clear hypothesis for major phenotypes observed, suggesting that genotype to phenotype
The transcriptional regulatory network of Corynebacterium jeikeium K411 and its interaction with metabolic routes contributing to human body odor formation.

PubMed

Barzantny, Helena; Schröder, Jasmin; Strotmeier, Jasmin; Fredrich, Eugenie; Brune, Iris; Tauch, Andreas

2012-06-15

Lipophilic corynebacteria are involved in the generation of volatile odorous products in the process of human body odor formation by degrading skin lipids and specific odor precursors. Therefore, these bacteria represent appropriate model systems for the cosmetic industry to examine axillary malodor formation on the molecular level. To understand the transcriptional control of metabolic pathways involved in this process, the transcriptional regulatory network of the lipophilic axilla isolate Corynebacterium jeikeium K411 was reconstructed from the complete genome sequence. This bioinformatic approach detected a gene-regulatory repertoire of 83 candidate proteins, including 56 DNA-binding transcriptional regulators, nine two-component systems, nine sigma factors, and nine regulators with diverse physiological functions. Furthermore, a cross-genome comparison among selected corynebacterial species of the taxonomic cluster 3 revealed a common gene-regulatory repertoire of 44 transcriptional regulators, including the MarR-like regulator Jk0257, which is exclusively encoded in the genomes of this taxonomical subline. The current network reconstruction comprises 48 transcriptional regulators and 674 gene-regulatory interactions that were assigned to five interconnected functional modules. Most genes involved in lipid degradation are under the combined control of the global cAMP-sensing transcriptional regulator GlxR and the LuxR-family regulator RamA, probably reflecting the essential role of lipid degradation in C. jeikeium. This study provides the first genome-scale in silico analysis of the transcriptional regulation of metabolism in a lipophilic bacterium involved in the formation of human body odor. Copyright © 2012 Elsevier B.V. All rights reserved.

Glucocorticoids shift arachidonic acid metabolism toward endocannabinoid synthesis: a non-genomic anti-inflammatory switch

PubMed Central

Malcher-Lopes, Renato; Franco, Alier; Tasker, Jeffrey G.

2008-01-01

Glucocorticoids are capable of exerting both genomic and non-genomic actions in target cells of multiple tissues, including the brain, which trigger an array of electrophysiological, metabolic, secretory and inflammatory regulatory responses. Here, we have attempted to show how glucocorticoids may generate a rapid anti-inflammatory response by promoting arachidonic acid-derived endocannabinoid biosynthesis. According to our hypothesized model, non-genomic action of glucocorticoids results in the global shift of membrane lipid metabolism, subverting metabolic pathways toward the synthesis of the anti-inflammatory endocannabinoids, anandamide (AEA) and 2-arachidonoyl-glycerol (2-AG), and away from arachidonic acid production. Post-transcriptional inhibition of cyclooxygenase-2 (COX2) synthesis by glucocorticoids assists this mechanism by suppressing the synthesis of pro-inflammatory prostaglandins as well as endocannabinoid-derived prostanoids. In the central nervous system (CNS) this may represent a major neuroprotective system, which may cross-talk with leptin signaling in the hypothalamus allowing for the coordination between energy homeostasis and the inflammatory response. PMID:18295199
Genome-scale fluxes predicted under the guidance of enzyme abundance using a novel hyper-cube shrink algorithm.

PubMed

Xie, Zhengwei; Zhang, Tianyu; Ouyang, Qi

2018-02-01

One of the long-expected goals of genome-scale metabolic modelling is to evaluate the influence of the perturbed enzymes on flux distribution. Both ordinary differential equation (ODE) models and constraint-based models, like Flux balance analysis (FBA), lack the capacity to perform metabolic control analysis (MCA) for large-scale networks. In this study, we developed a hyper-cube shrink algorithm (HCSA) to incorporate the enzymatic properties into the FBA model by introducing a pseudo reaction V constrained by enzymatic parameters. Our algorithm uses the enzymatic information quantitatively rather than qualitatively. We first demonstrate the concept by applying HCSA to a simple three-node network, whereby we obtained a good correlation between flux and enzyme abundance. We then validate its prediction by comparison with ODE and with a synthetic network producing violacein and analogues in Saccharomyces cerevisiae. We show that HCSA can mimic the steady-state results of ODE. Finally, we show its capability of predicting the flux distribution in genome-scale networks by applying it to sporulation in yeast. We show the ability of HCSA to operate without biomass flux and perform MCA to determine rate-limiting reactions. The algorithm was implemented in MATLAB and C++. The code is available at https://github.com/kekegg/HCSA. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.
Guided genome halving: hardness, heuristics and the history of the Hemiascomycetes.

PubMed

Zheng, Chunfang; Zhu, Qian; Adam, Zaky; Sankoff, David

2008-07-01

Some present day species have incurred a whole genome doubling event in their evolutionary history, and this is reflected today in patterns of duplicated segments scattered throughout their chromosomes. These duplications may be used as data to 'halve' the genome, i.e. to reconstruct the ancestral genome at the moment of doubling, but the solution is often highly nonunique. To resolve this problem, we take account of outgroups, external reference genomes, to guide and narrow down the search. We improve on a previous, computationally costly, 'brute force' method by adapting the genome halving algorithm of El-Mabrouk and Sankoff so that it rapidly and accurately constructs an ancestor close to the outgroups, prior to a local optimization heuristic. We apply this to reconstruct the predoubling ancestor of Saccharomyces cerevisiae and Candida glabrata, guided by the genomes of three other yeasts that diverged before the genome doubling event. We analyze the results in terms of (1) the minimum evolution criterion, (2) how close the genome halving result is to the final (local) minimum and (3) how close the final result is to an ancestor manually constructed by an expert with access to additional information. We also visualize the set of reconstructed ancestors using classic multidimensional scaling to see what aspects of the two doubled and three unduplicated genomes influence the differences among the reconstructions. The experimental software is available on request.
Revealing Less Derived Nature of Cartilaginous Fish Genomes with Their Evolutionary Time Scale Inferred with Nuclear Genes

PubMed Central

Renz, Adina J.; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon. PMID:23825540
Revealing less derived nature of cartilaginous fish genomes with their evolutionary time scale inferred with nuclear genes.

PubMed

Renz, Adina J; Meyer, Axel; Kuraku, Shigehiro

2013-01-01

Cartilaginous fishes, divided into Holocephali (chimaeras) and Elasmobranchii (sharks, rays and skates), occupy a key phylogenetic position among extant vertebrates in reconstructing their evolutionary processes. Their accurate evolutionary time scale is indispensable for better understanding of the relationship between phenotypic and molecular evolution of cartilaginous fishes. However, our current knowledge on the time scale of cartilaginous fish evolution largely relies on estimates using mitochondrial DNA sequences. In this study, making the best use of the still partial, but large-scale sequencing data of cartilaginous fish species, we estimate the divergence times between the major cartilaginous fish lineages employing nuclear genes. By rigorous orthology assessment based on available genomic and transcriptomic sequence resources for cartilaginous fishes, we selected 20 protein-coding genes in the nuclear genome, spanning 2973 amino acid residues. Our analysis based on the Bayesian inference resulted in the mean divergence time of 421 Ma, the late Silurian, for the Holocephali-Elasmobranchii split, and 306 Ma, the late Carboniferous, for the split between sharks and rays/skates. By applying these results and other documented divergence times, we measured the relative evolutionary rate of the Hox A cluster sequences in the cartilaginous fish lineages, which resulted in a lower substitution rate with a factor of at least 2.4 in comparison to tetrapod lineages. The obtained time scale enables mapping phenotypic and molecular changes in a quantitative framework. It is of great interest to corroborate the less derived nature of cartilaginous fish at the molecular level as a genome-wide phenomenon.
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

DOE PAGES

Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De; ...

2017-01-28

Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is Computed Tomography, which can generate tens of GB/s depending on x-ray range. A large-scale tomographic dataset, such as mouse brain, may require hours of computation time with a medium size workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread level) shared memory and (process level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide 158x speedup (using 32 compute nodes) over single core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.
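As a quick consistency check on the figures quoted in this abstract, the short sketch below recomputes the per-iteration time implied by the reported speedup. It is only arithmetic on the two numbers stated above; the variable names are illustrative and nothing here reflects how Trace itself measures performance.

```python
# Figures quoted in the abstract above.
baseline_hours = 12.5   # single-core reconstruction time per iteration
speedup = 158.0         # reported speedup using 32 compute nodes

# Time per iteration implied by the speedup, in minutes.
parallel_minutes = baseline_hours * 60.0 / speedup
print(f"{parallel_minutes:.2f} minutes per iteration")
```

The result is about 4.75 minutes, which agrees with the stated "less than 5 minutes per iteration".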
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De

Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is Computed Tomography, which can generate tens of GB/s depending on x-ray range. A large-scale tomographic dataset, such as mouse brain, may require hours of computation time with a medium size workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread level) shared memory and (process level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide 158x speedup (using 32 compute nodes) over single core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.
In silico metabolic engineering of Clostridium ljungdahlii for synthesis gas fermentation.

PubMed

Chen, Jin; Henson, Michael A

2016-11-01

Synthesis gas fermentation is one of the most promising routes to convert synthesis gas (syngas; mainly comprised of H2 and CO) to renewable liquid fuels and chemicals by specialized bacteria. The most commonly studied syngas fermenting bacterium is Clostridium ljungdahlii, which produces acetate and ethanol as its primary metabolic byproducts. Engineering of C. ljungdahlii metabolism to overproduce ethanol, enhance the synthesis of the native byproducts lactate and 2,3-butanediol, and introduce the synthesis of non-native products such as butanol and butyrate has substantial commercial value. We performed in silico metabolic engineering studies using a genome-scale reconstruction of C. ljungdahlii metabolism and the OptKnock computational framework to identify gene knockouts that were predicted to enhance the synthesis of these native products and non-native products, introduced through insertion of the necessary heterologous pathways. The OptKnock derived strategies were often difficult to assess because increased product synthesis was invariably accompanied by decreased growth. Therefore, the OptKnock strategies were further evaluated using a spatiotemporal metabolic model of a syngas bubble column reactor, a popular technology for large-scale gas fermentation. Unlike flux balance analysis, the bubble column model accounted for the complex tradeoffs between increased product synthesis and reduced growth rates of engineered mutants within the spatially varying column environment. The two-stage methodology for deriving and evaluating metabolic engineering strategies was shown to yield new C. ljungdahlii gene targets that offer the potential for increased product synthesis under realistic syngas fermentation conditions. Copyright © 2016 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Intracellular metabolic pathway distribution in diatoms and tools for genome-enabled experimental diatom research.

PubMed

Gruber, Ansgar; Kroth, Peter G

2017-09-05

Diatoms are important primary producers in the oceans and can also dominate other aquatic habitats. One reason for the success of this phylogenetically relatively young group of unicellular organisms could be the impressive redundancy and diversity of metabolic isoenzymes in diatoms. This redundancy is a result of the evolutionary origin of diatom plastids by a eukaryote-eukaryote endosymbiosis, a process that implies temporary redundancy of functionally complete eukaryotic genomes. During the establishment of the plastids, this redundancy was partially reduced via gene losses, and was partially retained via gene transfer to the nucleus of the respective host cell. These gene transfers required re-assignment of intracellular targeting signals, a process that simultaneously altered the intracellular distribution of metabolic enzymes compared with the ancestral cells. Genome annotation, the correct assignment of the gene products and the prediction of putative function, strongly depends on the correct prediction of the intracellular targeting of a gene product. Here again diatoms are very peculiar, because the targeting systems for organelle import are partially different to those in land plants. In this review, we describe methods of predicting intracellular enzyme locations, highlight findings of metabolic peculiarities in diatoms and present genome-enabled approaches to study their metabolism. This article is part of the themed issue 'The peculiar carbon metabolism in diatoms'. © 2017 The Author(s).
Reconstruction and Analysis of Human Kidney-Specific Metabolic Network Based on Omics Data

PubMed Central

Zhang, Ai-Di; Dai, Shao-Xing; Huang, Jing-Fei

2013-01-01

With the advent of high-throughput data production, recent studies of tissue-specific metabolic networks have largely advanced our understanding of the metabolic basis of various physiological and pathological processes. However, for kidney, which plays an essential role in the body, the available kidney-specific model remains incomplete. This paper reports the reconstruction and characterization of the human kidney metabolic network based on transcriptome and proteome data. In silico simulations revealed that house-keeping genes were more essential than kidney-specific genes in maintaining kidney metabolism. Importantly, a total of 267 potential metabolic biomarkers for kidney-related diseases were successfully explored using this model. Furthermore, we found that the discrepancies in metabolic processes of different tissues correspond directly to the tissues' functions. Finally, the phenotypes of the differentially expressed genes in diabetic kidney disease were characterized, suggesting that these genes may affect disease development through altering kidney metabolism. Thus, the human kidney-specific model constructed in this study may provide valuable information for the metabolism of kidney and offer excellent insights into complex kidney diseases. PMID:24222897
Reconstructing relative genome size of vascular plants through geological time.

PubMed

Lomax, Barry H; Hilton, Jason; Bateman, Richard M; Upchurch, Garland R; Lake, Janice A; Leitch, Ilia J; Cromwell, Avery; Knight, Charles A

2014-01-01

The strong positive relationship evident between cell and genome size in both animals and plants forms the basis of using the size of stomatal guard cells as a proxy to track changes in plant genome size through geological time. We report for the first time a taxonomic fine-scale investigation into changes in stomatal guard-cell length and use these data to infer changes in genome size through the evolutionary history of land plants. Our data suggest that many of the earliest land plants had exceptionally large genome sizes and that a predicted overall trend of increasing genome size within individual lineages through geological time is not supported. However, maximum genome size steadily increases from the Mississippian (c. 360 million yr ago (Ma)) to the present. We hypothesise that the functional relationship between stomatal size, genome size and atmospheric CO2 may contribute to the dichotomy reported between preferential extinction of neopolyploids and the prevalence of palaeopolyploidy observed in DNA sequence data of extant vascular plants. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Reconstructing rare soil microbial genomes using in situ enrichments and metagenomics

PubMed Central

Delmont, Tom O.; Eren, A. Murat; Maccario, Lorrie; Prestat, Emmanuel; Esen, Özcan C.; Pelletier, Eric; Le Paslier, Denis; Simonet, Pascal; Vogel, Timothy M.

2015-01-01

Despite extensive direct sequencing efforts and advanced analytical tools, reconstructing microbial genomes from soil using metagenomics has been challenging due to the tremendous diversity and relatively uniform distribution of genomes found in this system. Here we used enrichment techniques in an attempt to decrease the complexity of a soil microbiome prior to sequencing by submitting it to a range of physical and chemical stresses in 23 separate microcosms for 4 months. The metagenomic analysis of these microcosms at the end of the treatment yielded 540 Mb of assembly using standard de novo assembly techniques (a total of 559,555 genes and 29,176 functions), from which we could recover novel bacterial genomes, plasmids and phages. The recovered genomes belonged to Leifsonia (n = 2), Rhodanobacter (n = 5), Acidobacteria (n = 2), Sporolactobacillus (n = 2, novel nitrogen fixing taxon), Ktedonobacter (n = 1, second representative of the family Ktedonobacteraceae), Streptomyces (n = 3, novel polyketide synthase modules), and Burkholderia (n = 2, includes mega-plasmids conferring mercury resistance). Assembled genomes averaged to 5.9 Mb, with relative abundances ranging from rare (<0.0001%) to relatively abundant (>0.01%) in the original soil microbiome. Furthermore, we detected them in samples collected from geographically distant locations, particularly more in temperate soils compared to samples originating from high-latitude soils and deserts. To the best of our knowledge, this study is the first successful attempt to assemble multiple bacterial genomes directly from a soil sample. Our findings demonstrate that developing pertinent enrichment conditions can stimulate environmental genomic discoveries that would have been impossible to achieve with canonical approaches that focus solely upon post-sequencing data treatment. PMID:25983722
Genome Sequence of Candidatus Nitrososphaera evergladensis from Group I.1b Enriched from Everglades Soil Reveals Novel Genomic Features of the Ammonia-Oxidizing Archaea

PubMed Central

Zhalnina, Kateryna V.; Dias, Raquel; Leonard, Michael T.; Dorr de Quadros, Patricia; Camargo, Flavio A. O.; Drew, Jennifer C.; Farmerie, William G.; Daroub, Samira H.; Triplett, Eric W.

2014-01-01

The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases, seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group. PMID:24999826
A geographically-diverse collection of 418 human gut microbiome pathway genome databases

PubMed Central

Hahn, Aria S.; Altman, Tomer; Konwar, Kishori M.; Hanson, Niels W.; Kim, Dongjae; Relman, David A.; Dill, David L.; Hallam, Steven J.

2017-01-01

Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn’s disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools. PMID:28398290
Multi-scale modularity and motif distributional effect in metabolic networks.

PubMed

Gao, Shang; Chen, Alan; Rahmani, Ali; Zeng, Jia; Tan, Mehmet; Alhajj, Reda; Rokne, Jon; Demetrick, Douglas; Wei, Xiaohui

2016-01-01

Metabolism is a set of fundamental processes that play important roles in a plethora of biological and medical contexts. It is understood that the topological information of reconstructed metabolic networks, such as modular organization, has crucial implications on biological functions. Recent interpretations of modularity in network settings provide a view of multiple network partitions induced by different resolution parameters. Here we ask the question: How do multiple network partitions affect the organization of metabolic networks? Since network motifs are often interpreted as the super families of evolved units, we further investigate their impact under multiple network partitions and investigate how the distribution of network motifs influences the organization of metabolic networks. We studied Homo sapiens, Saccharomyces cerevisiae and Escherichia coli metabolic networks; we analyzed the relationship between different community structures and motif distribution patterns. Further, we quantified the degree to which motifs participate in the modular organization of metabolic networks.
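The role of a resolution parameter in producing multiple network partitions can be illustrated with a small sketch. The code below computes the widely used resolution-weighted form of Newman modularity, Q = (1/2m) Σ_ij [A_ij − γ k_i k_j / 2m] δ(c_i, c_j), on a toy graph. This is a generic formulation chosen for illustration and is not necessarily the variant used in the study above; the toy graph, node labels and function name are assumptions.

```python
from itertools import product

def modularity(edges, communities, gamma=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    gamma is the resolution parameter: larger values favour smaller modules.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    m = len(edges)
    edge_set = {frozenset(e) for e in edges}
    q = 0.0
    # Sum over all ordered node pairs that share a community.
    for i, j in product(degree, repeat=2):
        if communities[i] != communities[j]:
            continue
        a_ij = 1.0 if i != j and frozenset((i, j)) in edge_set else 0.0
        q += a_ij - gamma * degree[i] * degree[j] / (2.0 * m)
    return q / (2.0 * m)

# Toy graph (hypothetical): two triangles joined by a single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

for gamma in (0.1, 1.0):
    print(gamma, modularity(edges, split, gamma), modularity(edges, merged, gamma))
```

On this toy graph the two-module partition scores higher at gamma = 1.0, while the single-module partition scores higher at gamma = 0.1, which is the sense in which different resolution values induce different partitions of the same network.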
Draft genome sequence of Micrococcus luteus strain O'Kane implicates metabolic versatility and the potential to degrade polyhydroxybutyrates.

PubMed

Hanafy, Radwa A; Couger, M B; Baker, Kristina; Murphy, Chelsea; O'Kane, Shannon D; Budd, Connie; French, Donald P; Hoff, Wouter D; Youssef, Noha

2016-09-01

Micrococcus luteus is a predominant member of the skin microbiome. We here report on the genomic analysis of Micrococcus luteus strain O'Kane that was isolated from an elevator. The partial genome assembly of Micrococcus luteus strain O'Kane is 2.5 Mb with 2256 protein-coding genes and 62 RNA genes. Genomic analysis revealed metabolic versatility with genes involved in the metabolism and transport of glucose, galactose, fructose, mannose, alanine, aspartate, asparagine, glutamate, glutamine, glycine, serine, cysteine, methionine, arginine, proline, histidine, phenylalanine, and fatty acids. Genomic comparison to other M. luteus representatives identified the potential to degrade polyhydroxybutyrates, as well as several antibiotic resistance genes absent from other genomes.
Assessing the Metabolic Impact of Nitrogen Availability Using a Compartmentalized Maize Leaf Genome-Scale Model

PubMed Central

Simons, Margaret; Saha, Rajib; Amiour, Nardjis; Kumar, Akhil; Guillard, Lenaïg; Clément, Gilles; Miquel, Martine; Li, Zhenni; Mouille, Gregory; Lea, Peter J.; Hirel, Bertrand; Maranas, Costas D.

2014-01-01

Maize (Zea mays) is an important C4 plant due to its widespread use as a cereal and energy crop. A second-generation genome-scale metabolic model for the maize leaf was created to capture C4 carbon fixation and investigate nitrogen (N) assimilation by modeling the interactions between the bundle sheath and mesophyll cells. The model contains gene-protein-reaction relationships, elemental and charge-balanced reactions, and incorporates experimental evidence pertaining to the biomass composition, compartmentalization, and flux constraints. Condition-specific biomass descriptions were introduced that account for amino acids, fatty acids, soluble sugars, proteins, chlorophyll, lignocellulose, and nucleic acids as experimentally measured biomass constituents. Compartmentalization of the model is based on proteomic/transcriptomic data and literature evidence. With the incorporation of information from the MetaCrop and MaizeCyc databases, this updated model spans 5,824 genes, 8,525 reactions, and 9,153 metabolites, an increase of approximately 4 times the size of the earlier iRS1563 model. Transcriptomic and proteomic data have also been used to introduce regulatory constraints in the model to simulate an N-limited condition and mutants deficient in glutamine synthetase, gln1-3 and gln1-4. Model-predicted results achieved 90% accuracy when comparing the wild type grown under an N-complete condition with the wild type grown under an N-deficient condition. PMID:25248718
Genome-directed analysis of prophage excision, host defence systems, and central fermentative metabolism in Clostridium pasteurianum.

PubMed

Pyne, Michael E; Liu, Xuejia; Moo-Young, Murray; Chung, Duane A; Chou, C Perry

2016-09-19

Clostridium pasteurianum is emerging as a prospective host for the production of biofuels and chemicals, and has recently been shown to directly consume electric current. Despite this growing biotechnological appeal, the organism's genetics and central metabolism remain poorly understood. Here we present a concurrent genome sequence for the C. pasteurianum type strain and provide extensive genomic analysis of the organism's defence mechanisms and central fermentative metabolism. Next generation genome sequencing produced reads corresponding to spontaneous excision of a novel phage, designated φ6013, which could be induced using mitomycin C and detected using PCR and transmission electron microscopy. Methylome analysis of sequencing reads provided a near-complete glimpse into the organism's restriction-modification systems. We also unveiled the chief C. pasteurianum Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) locus, which was found to exemplify a Type I-B system. Finally, we show that C. pasteurianum possesses a highly complex fermentative metabolism whereby the metabolic pathways enlisted by the cell are governed by the degree of reductance of the substrate. Four distinct fermentation profiles, ranging from exclusively acidogenic to predominantly alcohologenic, were observed through redox consideration of the substrate. A detailed discussion of the organism's central metabolism within the context of metabolic engineering is provided.
Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis

PubMed Central

Facey, Paul D.; Méric, Guillaume; Hitchings, Matthew D.; Pachebat, Justin A.; Hegarty, Matt J.; Chen, Xiaorui; Morgan, Laura V.A.; Hoeppner, James E.; Whitten, Miranda M.A.; Kirk, William D.J.; Dyson, Paul J.; Sheppard, Sam K.; Sol, Ricardo Del

2015-01-01

Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. PMID:26185096
ITEP: an integrated toolkit for exploration of microbial pan-genomes.

PubMed

Benedict, Matthew N; Henriksen, James R; Metcalf, William W; Whitaker, Rachel J; Price, Nathan D

2014-01-03

Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitute their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network evolution. ITEP is a
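The distinction between core and variable genes mentioned in this abstract can be sketched with plain set operations. The example below does not use ITEP or its Python libraries; the function name, strain labels and protein family identifiers are all hypothetical, and it assumes each genome has already been reduced to a set of protein family IDs.

```python
def core_and_variable(genomes):
    """Split protein families into core (present in every genome) and variable.

    genomes maps a genome name to an iterable of protein family identifiers.
    """
    families = [set(f) for f in genomes.values()]
    pan = set().union(*families)          # every family seen in any genome
    core = set.intersection(*families)    # families shared by all genomes
    return core, pan - core

# Hypothetical presence/absence data for three related strains.
genomes = {
    "strain_1": {"famA", "famB", "famC"},
    "strain_2": {"famA", "famB", "famD"},
    "strain_3": {"famA", "famB", "famC", "famE"},
}
core, variable = core_and_variable(genomes)
print(sorted(core), sorted(variable))
```

Under this definition the pan-genome is simply the union of the core and variable sets.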

The BioCyc collection of microbial genomes and metabolic pathways.

PubMed

Karp, Peter D; Billington, Richard; Caspi, Ron; Fulcher, Carol A; Latendresse, Mario; Kothari, Anamika; Keseler, Ingrid M; Krummenacker, Markus; Midford, Peter E; Ong, Quang; Ong, Wai Kit; Paley, Suzanne M; Subhraveti, Pallavi

2017-08-17

BioCyc.org is a microbial genome Web portal that combines thousands of genomes with additional information inferred by computer programs, imported from other databases and curated from the biomedical literature by biologist curators. BioCyc also provides an extensive range of query tools, visualization services and analysis software. Recent advances in BioCyc include an expansion in the content of BioCyc in terms of both the number of genomes and the types of information available for each genome; an expansion in the amount of curated content within BioCyc; and new developments in the BioCyc software tools including redesigned gene/protein pages and metabolite pages; new search tools; a new sequence-alignment tool; a new tool for visualizing groups of related metabolic pathways; and a facility called SmartTables, which enables biologists to perform analyses that previously would have required a programmer's assistance. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Genome-scale analysis of positionally relocated genes

PubMed Central

Bhutkar, Arjun; Russo, Susan M.; Smith, Temple F.; Gelbart, William M.

2007-01-01

During evolution, genome reorganization includes large-scale events such as inversions, translocations, and segmental or even whole-genome duplications, as well as fine-scale events such as the relocation of individual genes. This latter category, which we will refer to as positionally relocated genes (PRGs), is the subject of this report. Assessment of the magnitude of such PRGs and of possible contributing mechanisms is aided by a comparative analysis of related genomes, where conserved chromosomal organization can aid in identifying genes that have acquired a new location in a lineage of these genomes. Here we utilize two methods to comprehensively identify relocated protein-coding genes in the recently sequenced genomes of 12 species of genus Drosophila. We use exceptions to the general rule of maintenance of chromosome arm (Muller element) association for most Drosophila genes to identify one major class of PRGs. We also identify a partially overlapping set of PRGs among “embedded genes,” located within the extents of other surrounding genes. We provide evidence that PRG movements have at least two different origins: Some events occur via retrotransposition of processed RNAs and others via a DNA-based transposition mechanism. Overall, we identify several hundred PRGs that arose within a lineage of the genus Drosophila phylogeny and provide suggestive evidence that a few thousand such events have occurred within the radiation of the insect order Diptera, thereby illustrating the magnitude of the contribution of PRG movement to chromosomal reorganization during evolution. PMID:17989252
Continental-Scale Temperature Reconstructions from the PAGES 2k Network

NASA Astrophysics Data System (ADS)

Kaufman, D. S.

2012-12-01

We present a major new synthesis of seven regional temperature reconstructions to elucidate the global pattern of variations and their association with climate-forcing mechanisms over the past two millennia. To coordinate the integration of new and existing data of all proxy types, the Past Global Changes (PAGES) project developed the 2k Network. It comprises nine working groups representing eight continental-scale regions and the oceans. The PAGES 2k Consortium, authoring this paper, presently includes 79 representatives from 25 countries. For this synthesis, each of the PAGES 2k working groups identified the proxy climate records for reconstructing past temperature and associated uncertainty using the data and methodologies that they deemed most appropriate for their region. The datasets are from 973 sites where tree rings, pollen, corals, lake and marine sediment, glacier ice, speleothems, and historical documents record changes in biologically and physically mediated processes that are sensitive to temperature change, among other climatic factors. The proxy records used for this synthesis are available through the NOAA World Data Center for Paleoclimatology. On long time scales, the temperature reconstructions display similarities among regions, and a large part of this common behavior can be explained by known climate forcings. Reconstructed temperatures in all regions show an overall long-term cooling trend until around 1900 C.E., followed by strong warming during the 20th century. On the multi-decadal time scale, we assessed the variability among the temperature reconstructions using principal component (PC) analysis of the standardized decadal mean temperatures over the period of overlap among the reconstructions (1200 to 1980 C.E.). PC1 explains 35% of the total variability and is strongly correlated with temperature reconstructions from the four Northern Hemisphere regions, and with the sum of external forcings including solar, volcanic, and greenhouse
A De-Novo Genome Analysis Pipeline (DeNoGAP) for large-scale comparative prokaryotic genomics studies.

PubMed

Thakur, Shalabh; Guttman, David S

2016-06-30

Comparative analysis of whole genome sequence data from closely related prokaryotic species or strains is becoming an increasingly important and accessible approach for addressing both fundamental and applied biological questions. While there are a number of excellent tools developed for performing this task, most scale poorly when faced with hundreds of genome sequences, and many require extensive manual curation. We have developed a de-novo genome analysis pipeline (DeNoGAP) for the automated, iterative and high-throughput analysis of data from comparative genomics projects involving hundreds of whole genome sequences. The pipeline is designed to perform reference-assisted and de novo gene prediction, homolog protein family assignment, ortholog prediction, functional annotation, and pan-genome analysis using a range of proven tools and databases. While most existing methods scale quadratically with the number of genomes since they rely on pairwise comparisons among predicted protein sequences, DeNoGAP scales linearly since the homology assignment is based on iteratively refined hidden Markov models. This iterative clustering strategy enables DeNoGAP to handle a very large number of genomes using minimal computational resources. Moreover, the modular structure of the pipeline permits easy updates as new analysis programs become available. DeNoGAP integrates bioinformatics tools and databases for comparative analysis of a large number of genomes. The pipeline offers tools and algorithms for annotation and analysis of completed and draft genome sequences. The pipeline is developed using Perl, BioPerl and SQLite on Ubuntu Linux version 12.04 LTS. Currently, the software package is accompanied by a script for automated installation of necessary external programs on Ubuntu Linux; however, the pipeline should also be compatible with other Linux and Unix systems after the necessary external programs are installed. DeNoGAP is freely available at https://sourceforge.net/projects/denogap/.
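The scaling claim in this abstract can be made concrete with a small sketch. The two cost models below are simplifying assumptions for illustration only (one comparison per unordered genome pair for all-vs-all methods, one scan per genome for an iterative model-based approach); the function names are hypothetical and are not part of DeNoGAP.

```python
def pairwise_comparisons(n_genomes: int) -> int:
    """Genome pairs compared by an all-vs-all strategy: n * (n - 1) / 2."""
    return n_genomes * (n_genomes - 1) // 2


def iterative_comparisons(n_genomes: int) -> int:
    """Scans needed if each genome is compared once against a growing set
    of family models (assumption: exactly one scan per genome)."""
    return n_genomes


if __name__ == "__main__":
    # Print how the two cost models diverge as the genome count grows.
    for n in (10, 100, 1000):
        print(n, pairwise_comparisons(n), iterative_comparisons(n))
```

Under these assumptions, 100 genomes need 4950 pairwise comparisons but only 100 model scans, which is the gap between quadratic and linear growth that the abstract describes.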
Capturing the essence of a metabolic network: a flux balance analysis approach.

PubMed

Murabito, Ettore; Simeonidis, Evangelos; Smallbone, Kieran; Swinton, Jonathan

2009-10-07

As genome-scale metabolic reconstructions emerge, tools to manage their size and complexity will be increasingly important. Flux balance analysis (FBA) is a constraint-based approach widely used to study the metabolic capabilities of cellular or subcellular systems. FBA problems are highly underdetermined and many different phenotypes can satisfy any set of constraints through which the metabolic system is represented. Two of the main concerns in FBA are exploring the space of solutions for a given metabolic network and finding a specific phenotype which is representative for a given task such as maximal growth rate. Here, we introduce a recursive algorithm suitable for overcoming both of these concerns. The method proposed is able to find the alternate optimal patterns of active reactions of an FBA problem and identify the minimal subnetwork able to perform a specific task as optimally as the whole. Our method represents an alternative to and an extension of other approaches conceived for exploring the space of solutions of an FBA problem. It may also be particularly helpful in defining a scaffold of reactions upon which to build up a dynamic model, when the important pathways of the system have not yet been well-defined.
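A toy example may help show why FBA problems are underdetermined and what a "minimal subnetwork" means. The sketch below is not the recursive algorithm of the paper; the four-reaction network, its bounds and all names are hypothetical. It checks that several flux vectors satisfy the steady-state condition S v = 0 and reach the same biomass flux while using different sets of active reactions.

```python
# Toy network (hypothetical): uptake -> A, A -> B via R1 or R2, B -> biomass.
REACTIONS = ["uptake", "R1", "R2", "biomass"]

# Stoichiometric matrix: rows are metabolites A and B, columns follow REACTIONS.
S = [
    [1, -1, -1, 0],  # A: produced by uptake, consumed by R1 and R2
    [0, 1, 1, -1],   # B: produced by R1 and R2, consumed by biomass
]

# Upper flux bounds (assumed values); all lower bounds are 0.
UPPER = [10.0, 10.0, 10.0, 1000.0]


def is_feasible(v, tol=1e-9):
    """True if v satisfies S v = 0 and 0 <= v <= UPPER."""
    balanced = all(abs(sum(s * x for s, x in zip(row, v))) <= tol for row in S)
    bounded = all(-tol <= x <= ub + tol for x, ub in zip(v, UPPER))
    return balanced and bounded


def active_reactions(v, tol=1e-9):
    """Names of the reactions carrying non-zero flux in v."""
    return {name for name, x in zip(REACTIONS, v) if abs(x) > tol}


# Three flux distributions with the same biomass flux (10, limited by uptake).
v_via_r1 = [10.0, 10.0, 0.0, 10.0]
v_via_r2 = [10.0, 0.0, 10.0, 10.0]
v_split = [10.0, 4.0, 6.0, 10.0]

if __name__ == "__main__":
    for v in (v_via_r1, v_via_r2, v_split):
        print(is_feasible(v), v[3], sorted(active_reactions(v)))
```

In this toy case the two single-route solutions use three reactions each, so either one is a smallest subnetwork that performs the task as well as the whole network, while the split solution is an alternate optimum that uses all four.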
Genome-to-Watershed Predictive Understanding of Terrestrial Environments

NASA Astrophysics Data System (ADS)

Hubbard, S. S.; Agarwal, D.; Banfield, J. F.; Beller, H. R.; Brodie, E.; Long, P.; Nico, P. S.; Steefel, C. I.; Tokunaga, T. K.; Williams, K. H.

2014-12-01

Although terrestrial environments play a critical role in cycling water, greenhouse gases, and other life-critical elements, the complexity of interactions among component microbes, plants, minerals, migrating fluids and dissolved constituents hinders predictive understanding of system behavior. The 'Sustainable Systems 2.0' project is developing genome-to-watershed scale predictive capabilities to quantify how the microbiome affects biogeochemical watershed functioning, how watershed-scale hydro-biogeochemical processes affect microbial functioning, and how these interactions co-evolve with climate and land-use changes. Development of such predictive capabilities is critical for guiding the optimal management of water resources, contaminant remediation, carbon stabilization, and agricultural sustainability - now and with global change. Initial investigations are focused on floodplains in the Colorado River Basin, and include iterative model development, experiments and observations with an early emphasis on subsurface aspects. Field experiments include local-scale experiments at Rifle CO to quantify spatiotemporal metabolic and geochemical responses to O2 and nitrate amendments as well as floodplain-scale monitoring to quantify genomic and biogeochemical response to natural hydrological perturbations. Information obtained from such experiments is represented within GEWaSC, a Genome-Enabled Watershed Simulation Capability, which is being developed to allow mechanistic interrogation of how genomic information stored in a subsurface microbiome affects biogeochemical cycling. This presentation will describe the genome-to-watershed scale approach as well as early highlights associated with the project. Highlights include: first insights into the diversity of the subsurface microbiome and metabolic roles of organisms involved in subsurface nitrogen, sulfur and hydrogen and carbon cycling; the extreme variability of subsurface DOC and hydrological controls on carbon and
Guidelines for Genome-Scale Analysis of Biological Rhythms.

PubMed

Hughes, Michael E; Abruzzi, Katherine C; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M Fernanda; Chen, Zheng; Chiu, Joanna C; Cox, Juergen; Crowell, Alexander M; DeBruyne, Jason P; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J; Duffield, Giles E; Dunlap, Jay C; Eckel-Mahan, Kristin; Esser, Karyn A; FitzGerald, Garret A; Forger, Daniel B; Francey, Lauren J; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H; Herzel, Hanspeter; Herzog, Erik D; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J; Hurley, Jennifer M; de la Iglesia, Horacio O; Johnson, Carl; Kay, Steve A; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A; Li, Jiajia; Li, Xiaodong; Liu, Andrew C; Loros, Jennifer J; Martino, Tami A; Menet, Jerome S; Merrow, Martha; Millar, Andrew J; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N; Olmedo, Maria; Nusinow, Dmitri A; Ptáček, Louis J; Rand, David; Reddy, Akhilesh B; Robles, Maria S; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D; Rund, Samuel S C; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J; Storch, Kai-Florian; Takahashi, Joseph S; Ueda, Hiroki R; Wang, Han; Weitz, Charles; Westermark, Pål O; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B

2017-10-01

Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding "big data" that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.
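As a rough illustration of what benchmarking a rhythm-detection method on synthetic data involves, the sketch below generates a cosine time series and recovers its parameters with a simple cosinor fit. It is not CircaInSilico and does not reproduce its interface; every function and parameter name is hypothetical, and the fit assumes a known period and evenly spaced samples covering whole cycles.

```python
import math
import random


def synthetic_rhythm(times_h, mesor, amplitude, period_h, peak_h,
                     noise_sd=0.0, seed=0):
    """Cosine time series with optional Gaussian noise (illustrative model)."""
    rng = random.Random(seed)
    return [
        mesor
        + amplitude * math.cos(2 * math.pi * (t - peak_h) / period_h)
        + rng.gauss(0.0, noise_sd)
        for t in times_h
    ]


def cosinor_fit(times_h, values, period_h):
    """Estimate (mesor, amplitude, peak time) for a known period.

    Assumes evenly spaced samples over whole cycles, so the cosine and sine
    regressors are orthogonal and the least-squares fit has a closed form.
    """
    n = len(values)
    mesor = sum(values) / n
    w = 2 * math.pi / period_h
    a = 2 / n * sum((y - mesor) * math.cos(w * t) for t, y in zip(times_h, values))
    b = 2 / n * sum((y - mesor) * math.sin(w * t) for t, y in zip(times_h, values))
    amplitude = math.hypot(a, b)
    peak_h = (math.atan2(b, a) / w) % period_h
    return mesor, amplitude, peak_h


if __name__ == "__main__":
    times = list(range(0, 48, 2))  # every 2 h over two 24 h cycles
    series = synthetic_rhythm(times, mesor=5.0, amplitude=2.0,
                              period_h=24.0, peak_h=6.0, noise_sd=0.5)
    print(cosinor_fit(times, series, period_h=24.0))
```

Because the true parameters are known, the gap between the fitted and simulated values gives a direct measure of how a method degrades as noise grows or sampling density falls.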
Guidelines for Genome-Scale Analysis of Biological Rhythms

PubMed Central

Hughes, Michael E.; Abruzzi, Katherine C.; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M. Fernanda; Chen, Zheng; Chiu, Joanna C.; Cox, Juergen; Crowell, Alexander M.; DeBruyne, Jason P.; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J.; Duffield, Giles E.; Dunlap, Jay C.; Eckel-Mahan, Kristin; Esser, Karyn A.; FitzGerald, Garret A.; Forger, Daniel B.; Francey, Lauren J.; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S.; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H.; Herzel, Hanspeter; Herzog, Erik D.; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J.; Hurley, Jennifer M.; de la Iglesia, Horacio O.; Johnson, Carl; Kay, Steve A.; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A.; Li, Jiajia; Li, Xiaodong; Liu, Andrew C.; Loros, Jennifer J.; Martino, Tami A.; Menet, Jerome S.; Merrow, Martha; Millar, Andrew J.; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N.; Olmedo, Maria; Nusinow, Dmitri A.; Ptáček, Louis J.; Rand, David; Reddy, Akhilesh B.; Robles, Maria S.; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D.; Rund, Samuel S.C.; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J.; Storch, Kai-Florian; Takahashi, Joseph S.; Ueda, Hiroki R.; Wang, Han; Weitz, Charles; Westermark, Pål O.; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B.

2017-01-01

Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding “big data” that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them. PMID:29098954
GenomeDiagram: a python package for the visualization of large-scale genomic data.

PubMed

Pritchard, Leighton; White, Jennifer A; Birch, Paul R J; Toth, Ian K

2006-03-01

We present GenomeDiagram, a flexible, open-source Python module for the visualization of large-scale genomic, comparative genomic and other data with reference to a single chromosome or other biological sequence. GenomeDiagram may be used to generate publication-quality vector graphics, rastered images and in-line streamed graphics for webpages. The package integrates with datatypes from the BioPython project, and is available for Windows, Linux and Mac OS X systems. GenomeDiagram is freely available as source code (under GNU Public License) at http://bioinf.scri.ac.uk/lp/programs.html, and requires Python 2.3 or higher, and recent versions of the ReportLab and BioPython packages. A user manual, example code and images are available at http://bioinf.scri.ac.uk/lp/programs.html.
CoryneRegNet: an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks.

PubMed

Baumbach, Jan; Brinkrolf, Karina; Czaja, Lisa F; Rahmann, Sven; Tauch, Andreas

2006-02-14

The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.
The scaling and temperature dependence of vertebrate metabolism

PubMed Central

White, Craig R; Phillips, Nicole F; Seymour, Roger S

2005-01-01

Body size and temperature are primary determinants of metabolic rate, and the standard metabolic rate (SMR) of animals ranging in size from unicells to mammals has been thought to be proportional to body mass (M) raised to the power of three-quarters for over 40 years. However, recent evidence from rigorously selected datasets suggests that this is not the case for birds and mammals. To determine whether the influence of body mass on the metabolic rate of vertebrates is indeed universal, we compiled SMR measurements for 938 species spanning six orders of magnitude variation in mass. When normalized to a common temperature of 38 °C, the SMR scaling exponents of fish, amphibians, reptiles, birds and mammals are significantly heterogeneous. This suggests both that there is no universal metabolic allometry and that models that attempt to explain only quarter-power scaling of metabolic rate are unlikely to succeed. PMID:17148344
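The scaling exponent discussed here is the b in SMR = a * M^b, which is commonly estimated as the slope of a straight line fitted to log-transformed data. The sketch below shows that estimate with ordinary least squares; the data are synthetic and the function name is hypothetical, and it does not reproduce the temperature normalization or the statistical tests of the study.

```python
import math


def fit_allometry(masses, rates):
    """Least-squares fit of log(rate) = log(a) + b * log(mass); returns (a, b)."""
    xs = [math.log(m) for m in masses]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                       # scaling exponent (slope)
    a = math.exp(y_mean - b * x_mean)   # normalization constant
    return a, b


if __name__ == "__main__":
    # Synthetic masses spanning six orders of magnitude, exact 3/4-power law.
    masses = [10.0 ** k for k in range(7)]
    rates = [3.0 * m ** 0.75 for m in masses]
    print(fit_allometry(masses, rates))
```

Fitting each vertebrate class separately and comparing the resulting slopes is one way to see whether the exponents are heterogeneous, as the abstract reports.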
Metabolic engineering of strains: from industrial-scale to lab-scale chemical production.

PubMed

Sun, Jie; Alper, Hal S

2015-03-01

A plethora of successful metabolic engineering case studies have been published over the past several decades. Here, we highlight a collection of microbially produced chemicals using a historical framework, starting with titers ranging from industrial scale (more than 50 g/L), to medium-scale (5-50 g/L), and lab-scale (0-5 g/L). Although engineered Escherichia coli and Saccharomyces cerevisiae emerge as prominent hosts in the literature as a result of well-developed genetic engineering tools, several novel native-producing strains are gaining attention. This review catalogs the current progress of metabolic engineering towards production of compounds such as acids, alcohols, amino acids, natural organic compounds, and others.
A Multi-scale Computational Platform to Mechanistically Assess the Effect of Genetic Variation on Drug Responses in Human Erythrocyte Metabolism.

PubMed

Mih, Nathan; Brunk, Elizabeth; Bordbar, Aarash; Palsson, Bernhard O

2016-07-01

Progress in systems medicine brings promise to addressing patient heterogeneity and individualized therapies. Recently, genome-scale models of metabolism have been shown to provide insight into the mechanistic link between drug therapies and systems-level off-target effects while being expanded to explicitly include the three-dimensional structure of proteins. The integration of these molecular-level details, such as the physical, structural, and dynamical properties of proteins, notably expands the computational description of biochemical network-level properties and the possibility of understanding and predicting whole cell phenotypes. In this study, we present a multi-scale modeling framework that describes biological processes which range in scale from atomistic details to an entire metabolic network. Using this approach, we can understand how genetic variation, which impacts the structure and reactivity of a protein, influences both native and drug-induced metabolic states. As a proof-of-concept, we study three enzymes (catechol-O-methyltransferase, glucose-6-phosphate dehydrogenase, and glyceraldehyde-3-phosphate dehydrogenase) and their respective genetic variants which have clinically relevant associations. Using all-atom molecular dynamics simulations enables the sampling of long timescale conformational dynamics of the proteins (and their mutant variants) in complex with their respective native metabolites or drug molecules. We find that changes in a protein's structure due to a mutation influence protein binding affinity to metabolites and/or drug molecules, and induce large-scale changes in metabolism.
Jatropha curcas, a biofuel crop: functional genomics for understanding metabolic pathways and genetic improvement.

PubMed

Maghuly, Fatemeh; Laimer, Margit

2013-10-01

Jatropha curcas is currently attracting much attention as an oilseed crop for biofuel, as Jatropha can grow under climate and soil conditions that are unsuitable for food production. However, little is known about Jatropha, and there are a number of challenges to be overcome. In fact, Jatropha has not really been domesticated; most of the Jatropha accessions are toxic, which renders the seedcake unsuitable for use as animal feed. The seeds of Jatropha contain high levels of polyunsaturated fatty acids, which negatively impact the biofuel quality. Fruiting of Jatropha is fairly continuous, thus increasing costs of harvesting. Therefore, before starting any improvement program using conventional or molecular breeding techniques, understanding gene function and the genome scale of Jatropha are prerequisites. This review presents currently available and relevant information on the latest technologies (genomics, transcriptomics, proteomics and metabolomics) to decipher important metabolic pathways within Jatropha, such as oil and toxin synthesis. Further, it discusses future directions for biotechnological approaches in Jatropha breeding and improvement. © 2013 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Exploring candidate biomarkers for lung and prostate cancers using gene expression and flux variability analysis.

PubMed

Asgari, Yazdan; Khosravi, Pegah; Zabihinpour, Zahra; Habibi, Mahnaz

2018-02-19

Genome-scale metabolic models have provided valuable resources for exploring changes in metabolism under normal and cancer conditions. However, metabolism itself is strongly linked to gene expression, so integration of gene expression data into metabolic models might improve the detection of genes involved in the control of tumor progression. Herein, we considered gene expression data as extra constraints to enhance the predictive powers of metabolic models. We reconstructed genome-scale metabolic models for lung and prostate, under normal and cancer conditions to detect the major genes associated with critical subsystems during tumor development. Furthermore, we utilized gene expression data in combination with an information theory-based approach to reconstruct co-expression networks of the human lung and prostate in both cohorts. Our results revealed 19 genes as candidate biomarkers for lung and prostate cancer cells. This study also revealed that the development of a complementary approach (integration of gene expression and metabolic profiles) could lead to proposing novel biomarkers and suggesting renovated cancer treatment strategies which have not been possible to detect using either of the methods alone.
The restricted metabolism of the obligate organohalide respiring bacterium Dehalobacter restrictus: lessons from tiered functional genomics

PubMed Central

Rupakula, Aamani; Kruse, Thomas; Boeren, Sjef; Holliger, Christof; Smidt, Hauke; Maillard, Julien

2013-01-01

Dehalobacter restrictus strain PER-K23 is an obligate organohalide respiring bacterium, which displays extremely narrow metabolic capabilities. It grows only via coupling energy conservation to anaerobic respiration of tetra- and trichloroethene with hydrogen as sole electron donor. Dehalobacter restrictus represents the paradigmatic member of the genus Dehalobacter, which in recent years has turned out to be a major player in the bioremediation of an increasing number of organohalides, both in situ and in laboratory studies. The recent elucidation of the D. restrictus genome revealed a rather elaborate genome with predicted pathways that were not suspected from its restricted metabolism, such as a complete corrinoid biosynthetic pathway, the Wood–Ljungdahl (WL) pathway for CO2 fixation, abundant transcriptional regulators and several types of hydrogenases. However, one important feature of the genome is the presence of 25 reductive dehalogenase genes, from which so far only one, pceA, has been characterized on genetic and biochemical levels. This study describes a multi-level functional genomics approach on D. restrictus across three different growth phases. A global proteomic analysis allowed consideration of general metabolic pathways relevant to organohalide respiration, whereas the dedicated genomic and transcriptomic analysis focused on the diversity, composition and expression of genes associated with reductive dehalogenases. PMID:23479754
Lophotrochozoan mitochondrial genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Valles, Yvonne; Boore, Jeffrey L.

2005-10-01

Progress in both molecular techniques and phylogenetic methods has challenged many of the interpretations of traditional taxonomy. One example is in the recognition of the animal superphylum Lophotrochozoa (annelids, mollusks, echiurans, platyhelminthes, brachiopods, and other phyla), although the relationships within this group and the inclusion of some phyla remain uncertain. While much of this progress in phylogenetic reconstruction has been based on comparing single gene sequences, we are beginning to see the potential of comparing large-scale features of genomes, such as the relative order of genes. Even though tremendous progress is being made on the sequence determination of whole nuclear genomes, the dataset of choice for genome-level characters for many animals across a broad taxonomic range remains mitochondrial genomes. We review here what is known about mitochondrial genomes of the lophotrochozoans and discuss the promise that this dataset will enable insight into their relationships.
Finding elementary flux modes in metabolic networks based on flux balance analysis and flux coupling analysis: application to the analysis of Escherichia coli metabolism.

PubMed

Tabe-Bordbar, Shayan; Marashi, Sayed-Amir

2013-12-01

Elementary modes (EMs) are steady-state metabolic flux vectors with a minimal set of active reactions. Each EM corresponds to a metabolic pathway. Therefore, studying EMs is helpful for analyzing the production of biotechnologically important metabolites. However, memory requirements for computing EMs may hamper their applicability as, in most genome-scale metabolic models, no EM can be computed due to running out of memory. In this study, we present a method for computing randomly sampled EMs. In this approach, a network reduction algorithm is used for EM computation, which is based on flux balance-based methods. We show that this approach can be used to recover the EMs in the medium- and genome-scale metabolic network models, while the EMs are sampled in an unbiased way. The applicability of such results is shown by computing “estimated” control-effective flux values in the Escherichia coli metabolic network.
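The definition of an elementary mode as a steady-state flux with a minimal set of active reactions can be checked directly on a small network. The sketch below uses the standard rank test (a steady-state flux is elementary when the columns of the stoichiometric matrix on its support have a one-dimensional null space). It is not the sampling method of the paper; the toy network is hypothetical, and reaction irreversibility is ignored for simplicity.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank_count = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a non-zero entry.
        pivot = next(
            (r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


def is_elementary(S, v):
    """Rank test: v must be a non-zero steady state (S v = 0) whose support
    columns of S have nullity exactly 1."""
    support = [j for j, x in enumerate(v) if x != 0]
    steady = all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)
    if not support or not steady:
        return False
    sub = [[row[j] for j in support] for row in S]
    return len(support) - rank(sub) == 1


# Toy network (hypothetical): uptake -> A, A -> B via R1 or R2, B -> export.
S_TOY = [
    [1, -1, -1, 0],
    [0, 1, 1, -1],
]

if __name__ == "__main__":
    print(is_elementary(S_TOY, [1, 1, 0, 1]))  # route through R1 only
    print(is_elementary(S_TOY, [2, 1, 1, 2]))  # mixture of both routes
```

The mixture of both routes is a valid steady state but not elementary, because it can be decomposed into the two single-route modes.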
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle

PubMed Central

Nelson, William C.; Stegen, James C.

2015-01-01

Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. “Housekeeping” genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides, and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle, or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest that the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum. PMID:26257709
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nelson, William C.; Stegen, James C.

2015-07-21

Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. ‘Housekeeping’ genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum.

Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species.

PubMed

Kersey, Paul J; Staines, Daniel M; Lawson, Daniel; Kulesha, Eugene; Derwent, Paul; Humphrey, Jay C; Hughes, Daniel S T; Keenan, Stephan; Kerhornou, Arnaud; Koscielny, Gautier; Langridge, Nicholas; McDowall, Mark D; Megy, Karine; Maheswari, Uma; Nuhn, Michael; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Wilson, Derek; Yates, Andrew; Birney, Ewan

2012-01-01

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.
Metabolism and Genetics of Helicobacter pylori: the Genome Era

PubMed Central

Marais, Armelle; Mendz, George L.; Hazell, Stuart L.; Mégraud, Francis

1999-01-01

The publication of the complete sequence of Helicobacter pylori 26695 in 1997 and more recently that of strain J99 has provided new insight into the biology of this organism. In this review, we attempt to analyze and interpret the information provided by sequence annotations and to compare these data with those provided by experimental analyses. After a brief description of the general features of the genomes of the two sequenced strains, the principal metabolic pathways are analyzed. In particular, the enzymes encoded by H. pylori involved in fermentative and oxidative metabolism, lipopolysaccharide biosynthesis, nucleotide biosynthesis, aerobic and anaerobic respiration, and iron and nitrogen assimilation are described, and the areas of controversy between the experimental data and those provided by the sequence annotation are discussed. The role of urease, particularly in pH homeostasis, and other specialized mechanisms developed by the bacterium to maintain its internal pH are also considered. The replicational, transcriptional, and translational apparatuses are reviewed, as is the regulatory network. The numerous findings on the metabolism of the bacteria and the paucity of gene expression regulation systems are indicative of the high level of adaptation to the human gastric environment. Arguments in favor of the diversity of H. pylori and molecular data reflecting possible mechanisms involved in this diversity are presented. Finally, we compare the numerous experimental data on the colonization factors and those provided from the genome sequence annotation, in particular for genes involved in motility and adherence of the bacterium to the gastric tissue. PMID:10477311
Quantum metabolism explains the allometric scaling of metabolic rates.

PubMed

Demetrius, Lloyd; Tuszynski, J A

2010-03-06

A general model explaining the origin of allometric laws of physiology is proposed based on coupled energy-transducing oscillator networks embedded in a physical d-dimensional space (d = 1, 2, 3). This approach integrates Mitchell's theory of chemi-osmosis with the Debye model of the thermal properties of solids. We derive a scaling rule that relates the energy generated by redox reactions in cells, the dimensionality of the physical space and the mean cycle time. Two major regimes are found corresponding to classical and quantum behaviour. The classical behaviour leads to allometric isometry while the quantum regime leads to scaling laws relating metabolic rate and body size that cover a broad range of exponents that depend on dimensionality and specific parameter values. The regimes are consistent with a range of behaviours encountered in micelles, plants and animals and provide a conceptual framework for a theory of the metabolic function of living systems.
Framework for network modularization and Bayesian network analysis to investigate the perturbed metabolic network

PubMed Central

2011-01-01

Background: Genome-scale metabolic network models have contributed to elucidating biological phenomena, and predicting gene targets to engineer for biotechnological applications. With their increasing importance, their precise network characterization has also been crucial for better understanding of the cellular physiology. Results: We herein introduce a framework for network modularization and Bayesian network analysis (FMB) to investigate organism’s metabolism under perturbation. FMB reveals direction of influences among metabolic modules, in which reactions with similar or positively correlated flux variation patterns are clustered, in response to specific perturbation using metabolic flux data. With metabolic flux data calculated by constraints-based flux analysis under both control and perturbation conditions, FMB, in essence, reveals the effects of specific perturbations on the biological system through network modularization and Bayesian network analysis at metabolic modular level. As a demonstration, this framework was applied to the genetically perturbed Escherichia coli metabolism, which is a lpdA gene knockout mutant, using its genome-scale metabolic network model. Conclusions: After all, it provides alternative scenarios of metabolic flux distributions in response to the perturbation, which are complementary to the data obtained from conventionally available genome-wide high-throughput techniques or metabolic flux analysis. PMID:22784571
Framework for network modularization and Bayesian network analysis to investigate the perturbed metabolic network.

PubMed

Kim, Hyun Uk; Kim, Tae Yong; Lee, Sang Yup

2011-01-01

Genome-scale metabolic network models have contributed to elucidating biological phenomena, and predicting gene targets to engineer for biotechnological applications. With their increasing importance, their precise network characterization has also been crucial for better understanding of the cellular physiology. We herein introduce a framework for network modularization and Bayesian network analysis (FMB) to investigate organism's metabolism under perturbation. FMB reveals direction of influences among metabolic modules, in which reactions with similar or positively correlated flux variation patterns are clustered, in response to specific perturbation using metabolic flux data. With metabolic flux data calculated by constraints-based flux analysis under both control and perturbation conditions, FMB, in essence, reveals the effects of specific perturbations on the biological system through network modularization and Bayesian network analysis at metabolic modular level. As a demonstration, this framework was applied to the genetically perturbed Escherichia coli metabolism, which is a lpdA gene knockout mutant, using its genome-scale metabolic network model. After all, it provides alternative scenarios of metabolic flux distributions in response to the perturbation, which are complementary to the data obtained from conventionally available genome-wide high-throughput techniques or metabolic flux analysis.
Reference-guided de novo assembly approach improves genome reconstruction for related species.

PubMed

Lischer, Heidi E L; Shimizu, Kentaro K

2017-11-10

The development of next-generation sequencing has made it possible to sequence whole genomes at a relatively low cost. However, de novo genome assemblies remain challenging due to short read length, missing data, repetitive regions, polymorphisms and sequencing errors. As more and more genomes are sequenced, reference-guided assembly approaches can be used to assist the assembly process. However, previous methods mostly focused on the assembly of other genotypes within the same species. We adapted and extended a reference-guided de novo assembly approach, which enables the usage of a related reference sequence to guide the genome assembly. In order to compare and evaluate de novo and our reference-guided de novo assembly approaches, we used a simulated data set of a repetitive and heterozygotic plant genome. The extended reference-guided de novo assembly approach almost always outperforms the corresponding de novo assembly program even when a reference of a different species is used. Similar improvements can be observed in high and low coverage situations. In addition, we show that a single evaluation metric, like the widely used N50 length, is not enough to properly rate assemblies as it does not always point to the best assembly evaluated with other criteria. Therefore, we used the summed z-scores of 36 different statistics to evaluate the assemblies. The combination of reference mapping and de novo assembly provides a powerful tool to improve genome reconstruction by integrating information of a related genome. Our extension of the reference-guided de novo assembly approach enables the application of this strategy not only within but also between related species. Finally, the evaluation of genome assemblies is often not straightforward, as the truth is not known. Thus one should always use a combination of evaluation metrics, which not only try to assess the continuity but also the accuracy of an assembly.
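The point about N50 versus a combined score can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy assemblies, the three metrics (the study summed 36 statistics, which the abstract does not list), and the helper names `n50` and `summed_z_scores`. It is not the authors' evaluation pipeline.

```python
from statistics import mean, pstdev


def n50(contig_lengths):
    """Contig length at which the longest contigs first cover half the assembly."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("empty assembly")


def summed_z_scores(table, higher_is_better):
    """Sum per-metric z-scores, flipping the sign where lower values are better.

    table maps assembly name -> {metric name: value}.
    """
    totals = {name: 0.0 for name in table}
    for metric, better_high in higher_is_better.items():
        values = [table[name][metric] for name in table]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # metric does not discriminate between assemblies
        for name in table:
            z = (table[name][metric] - mu) / sigma
            totals[name] += z if better_high else -z
    return totals


# Hypothetical contig lengths for two assemblies of the same genome.
contigs = {
    "A": [100, 80, 50, 30, 20, 10, 10],
    "B": [150, 40, 40, 30, 20, 10, 10],
}
# Hypothetical accuracy-related metrics alongside N50.
table = {
    "A": {"n50": n50(contigs["A"]), "misassemblies": 2, "genome_fraction_pct": 95},
    "B": {"n50": n50(contigs["B"]), "misassemblies": 12, "genome_fraction_pct": 90},
}
higher_is_better = {"n50": True, "misassemblies": False, "genome_fraction_pct": True}

scores = summed_z_scores(table, higher_is_better)
print(table["A"]["n50"], table["B"]["n50"], scores)
```

In this toy case assembly B has the larger N50, yet assembly A receives the higher summed score once the accuracy-related metrics are included, which is the kind of disagreement the abstract warns about.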
Reconstructing the complex evolutionary history of mobile plasmids in red algal genomes

PubMed Central

Lee, JunMo; Kim, Kyeong Mi; Yang, Eun Chan; Miller, Kathy Ann; Boo, Sung Min; Bhattacharya, Debashish; Yoon, Hwan Su

2016-01-01

The integration of foreign DNA into algal and plant plastid genomes is a rare event, with only a few known examples of horizontal gene transfer (HGT). Plasmids, which are well-studied drivers of HGT in prokaryotes, have been reported previously in red algae (Rhodophyta). However, the distribution of these mobile DNA elements and their sites of integration into the plastid (ptDNA), mitochondrial (mtDNA), and nuclear genomes of Rhodophyta remain unknown. Here we reconstructed the complex evolutionary history of plasmid-derived DNAs in red algae. Comparative analysis of 21 rhodophyte ptDNAs, including new genome data for 5 species, turned up 22 plasmid-derived open reading frames (ORFs) that showed syntenic and copy number variation among species, but were conserved within different individuals in three lineages. Several plasmid-derived homologs were found not only in ptDNA but also in mtDNA and in the nuclear genome of green plants, stramenopiles, and rhizarians. Phylogenetic and plasmid-derived ORF analyses showed that the majority of plasmid DNAs originated within red algae, whereas others were derived from cyanobacteria, other bacteria, and viruses. Our results elucidate the evolution of plasmid DNAs in red algae and suggest that they spread as parasitic genetic elements. This hypothesis is consistent with their sporadic distribution within Rhodophyta. PMID:27030297
Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks

PubMed Central

Kaltenbacher, Barbara; Hasenauer, Jan

2017-01-01

Mechanistic mathematical modeling of biochemical reaction networks using ordinary differential equation (ODE) models has improved our understanding of small- and medium-scale biological processes. While the same should in principle hold for large- and genome-scale processes, the computational methods for the analysis of ODE models which describe hundreds or thousands of biochemical species and reactions are missing so far. While individual simulations are feasible, the inference of the model parameters from experimental data is computationally too intensive. In this manuscript, we evaluate adjoint sensitivity analysis for parameter estimation in large scale biochemical reaction networks. We present the approach for time-discrete measurement and compare it to state-of-the-art methods used in systems and computational biology. Our comparison reveals a significantly improved computational efficiency and a superior scalability of adjoint sensitivity analysis. The computational complexity is effectively independent of the number of parameters, enabling the analysis of large- and genome-scale models. Our study of a comprehensive kinetic model of ErbB signaling shows that parameter estimation using adjoint sensitivity analysis requires a fraction of the computation time of established methods. The proposed method will facilitate mechanistic modeling of genome-scale cellular processes, as required in the age of omics. PMID:28114351
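Why the cost of an adjoint gradient does not grow with the number of parameters can be seen in a small Python sketch. It rests on stated assumptions: a one-species toy model dx/dt = p1 - p2*x (not the ErbB model), explicit Euler time stepping, and a discrete adjoint of that scheme, which is a simplification and not the formulation evaluated in the paper. All function names are hypothetical.

```python
# Toy model (an assumption): dx/dt = p1 - p2 * x with x(0) = 0,
# integrated with explicit Euler steps of size h.

def simulate(p1, p2, h, n_steps):
    """Return the Euler trajectory [x_0, ..., x_N]."""
    xs = [0.0]
    for _ in range(n_steps):
        xs.append(xs[-1] + h * (p1 - p2 * xs[-1]))
    return xs


def objective(p1, p2, h, n_steps, data):
    """Half the sum of squared residuals at the measured step indices."""
    xs = simulate(p1, p2, h, n_steps)
    return 0.5 * sum((xs[n] - y) ** 2 for n, y in data.items())


def adjoint_gradient(p1, p2, h, n_steps, data):
    """Gradient of the objective w.r.t. (p1, p2) from one backward sweep."""
    xs = simulate(p1, p2, h, n_steps)
    lam = 0.0  # adjoint variable; zero beyond the final time step
    grad_p1 = 0.0
    grad_p2 = 0.0
    for n in range(n_steps, 0, -1):
        residual = xs[n] - data[n] if n in data else 0.0
        # lambda_n = direct dJ/dx_n + lambda_{n+1} * dx_{n+1}/dx_n
        lam = residual + lam * (1.0 - h * p2)
        # x_n = x_{n-1} + h * (p1 - p2 * x_{n-1}) carries the parameter dependence
        grad_p1 += lam * h
        grad_p2 += lam * (-h * xs[n - 1])
    return grad_p1, grad_p2


def finite_difference_gradient(p1, p2, h, n_steps, data, eps=1e-6):
    """Central differences for comparison: two extra simulations per parameter."""
    d1 = (objective(p1 + eps, p2, h, n_steps, data)
          - objective(p1 - eps, p2, h, n_steps, data)) / (2 * eps)
    d2 = (objective(p1, p2 + eps, h, n_steps, data)
          - objective(p1, p2 - eps, h, n_steps, data)) / (2 * eps)
    return d1, d2


H, N_STEPS = 0.1, 50
truth = simulate(1.5, 0.8, H, N_STEPS)  # synthetic "measurements"
data = {n: truth[n] for n in range(10, N_STEPS + 1, 10)}

grad_adj = adjoint_gradient(2.0, 0.5, H, N_STEPS, data)
grad_fd = finite_difference_gradient(2.0, 0.5, H, N_STEPS, data)
print(grad_adj, grad_fd)
```

The adjoint version needs one forward and one backward sweep regardless of how many parameters there are, whereas the finite-difference check reruns the simulation twice per parameter. That difference is what makes the approach attractive for models with thousands of parameters.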
Genome-scale engineering for systems and synthetic biology

PubMed Central

Esvelt, Kevin M; Wang, Harris H

2013-01-01

Genome-modification technologies enable the rational engineering and perturbation of biological systems. Historically, these methods have been limited to gene insertions or mutations at random or at a few pre-defined locations across the genome. The handful of methods capable of targeted gene editing suffered from low efficiencies, significant labor costs, or both. Recent advances have dramatically expanded our ability to engineer cells in a directed and combinatorial manner. Here, we review current technologies and methodologies for genome-scale engineering, discuss the prospects for extending efficient genome modification to new hosts, and explore the implications of continued advances toward the development of flexibly programmable chassis, novel biochemistries, and safer organismal and ecological engineering. PMID:23340847
Plant metabolic clusters - from genetics to genomics.

PubMed

Nützmann, Hans-Wilhelm; Huang, Ancheng; Osbourn, Anne

2016-08-01

Plant natural products are of great value for agriculture, medicine and a wide range of other industrial applications. The discovery of new plant natural product pathways is currently being revolutionized by two key developments. First, breakthroughs in sequencing technology and reduced cost of sequencing are accelerating the ability to find enzymes and pathways for the biosynthesis of new natural products by identifying the underlying genes. Second, there are now multiple examples in which the genes encoding certain natural product pathways have been found to be grouped together in biosynthetic gene clusters within plant genomes. These advances are now making it possible to develop strategies for systematically mining multiple plant genomes for the discovery of new enzymes, pathways and chemistries. Increased knowledge of the features of plant metabolic gene clusters - architecture, regulation and assembly - will be instrumental in expediting natural product discovery. This review summarizes progress in this area. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
High precision multi-genome scale reannotation of enzyme function by EFICAz

PubMed Central

Arakaki, Adrian K; Tian, Weidong; Skolnick, Jeffrey

2006-01-01

biologically significant hypotheses and can be useful for comparative genome analysis and automated metabolic pathway reconstruction. PMID:17166279
On the effects of alternative optima in context-specific metabolic model predictions

PubMed Central

Nikoloski, Zoran

2017-01-01

The integration of experimental data into genome-scale metabolic models can greatly improve flux predictions. This is achieved by restricting predictions to a more realistic context-specific domain, like a particular cell or tissue type. Several computational approaches to integrate data have been proposed—generally obtaining context-specific (sub)models or flux distributions. However, these approaches may lead to a multitude of equally valid but potentially different models or flux distributions, due to possible alternative optima in the underlying optimization problems. Although this issue introduces ambiguity in context-specific predictions, it has not been generally recognized, especially in the case of model reconstructions. In this study, we analyze the impact of alternative optima in four state-of-the-art context-specific data integration approaches, providing both flux distributions and/or metabolic models. To this end, we present three computational methods and apply them to two particular case studies: leaf-specific predictions from the integration of gene expression data in a metabolic model of Arabidopsis thaliana, and liver-specific reconstructions derived from a human model with various experimental data sources. The application of these methods allows us to obtain the following results: (i) we sample the space of alternative flux distributions in the leaf- and the liver-specific case and quantify the ambiguity of the predictions. In addition, we show how the inclusion of ℓ1-regularization during data integration reduces the ambiguity in both cases. (ii) We generate sets of alternative leaf- and liver-specific models that are optimal to each one of the evaluated model reconstruction approaches. We demonstrate that alternative models of the same context contain a marked fraction of disparate reactions. Further, we show that a careful balance between model sparsity and metabolic functionality helps in reducing the discrepancies between alternative
On the effects of alternative optima in context-specific metabolic model predictions.

PubMed

Robaina-Estévez, Semidán; Nikoloski, Zoran

2017-05-01

The integration of experimental data into genome-scale metabolic models can greatly improve flux predictions. This is achieved by restricting predictions to a more realistic context-specific domain, like a particular cell or tissue type. Several computational approaches to integrate data have been proposed-generally obtaining context-specific (sub)models or flux distributions. However, these approaches may lead to a multitude of equally valid but potentially different models or flux distributions, due to possible alternative optima in the underlying optimization problems. Although this issue introduces ambiguity in context-specific predictions, it has not been generally recognized, especially in the case of model reconstructions. In this study, we analyze the impact of alternative optima in four state-of-the-art context-specific data integration approaches, providing both flux distributions and/or metabolic models. To this end, we present three computational methods and apply them to two particular case studies: leaf-specific predictions from the integration of gene expression data in a metabolic model of Arabidopsis thaliana, and liver-specific reconstructions derived from a human model with various experimental data sources. The application of these methods allows us to obtain the following results: (i) we sample the space of alternative flux distributions in the leaf- and the liver-specific case and quantify the ambiguity of the predictions. In addition, we show how the inclusion of ℓ1-regularization during data integration reduces the ambiguity in both cases. (ii) We generate sets of alternative leaf- and liver-specific models that are optimal to each one of the evaluated model reconstruction approaches. We demonstrate that alternative models of the same context contain a marked fraction of disparate reactions. Further, we show that a careful balance between model sparsity and metabolic functionality helps in reducing the discrepancies between alternative
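The notion of alternative optima, and how an ℓ1 penalty can narrow them, can be made concrete with a toy network in Python. The network, reaction names, and integer enumeration below are all assumptions for illustration; the study itself works with genome-scale models and optimization solvers, not with this brute-force listing.

```python
# Hypothetical toy network at steady state:
#   uptake:   -> A          (capped at uptake_limit)
#   direct:   A -> B
#   detour_1: A -> C
#   detour_2: C -> B
#   biomass:  B ->          (objective to maximize)

def optimal_flux_distributions(uptake_limit=10):
    """Enumerate integer flux splits that all reach the maximal biomass flux."""
    solutions = []
    for direct in range(uptake_limit + 1):
        detour = uptake_limit - direct  # mass balance on A
        solutions.append({
            "uptake": uptake_limit,
            "direct": direct,
            "detour_1": detour,
            "detour_2": detour,         # mass balance on C
            "biomass": uptake_limit,    # mass balance on B
        })
    return solutions


def l1_norm(fluxes):
    """Total absolute flux, the quantity an l1 penalty minimizes."""
    return sum(abs(v) for v in fluxes.values())


solutions = optimal_flux_distributions()
min_l1 = min(l1_norm(s) for s in solutions)
sparsest = [s for s in solutions if l1_norm(s) == min_l1]

print(len(solutions), "equally optimal distributions")
print(len(sparsest), "remain after minimizing total flux:", sparsest[0])
```

Every enumerated distribution gives the same biomass flux, so the objective alone cannot choose between them. Adding the total-flux criterion leaves only the distribution that avoids the longer detour, mirroring in miniature the reduction in ambiguity that the abstract attributes to ℓ1-regularization.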
Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

PubMed Central

Song, Yun S.

2012-01-01

Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and
Between Two Fern Genomes

PubMed Central

2014-01-01

Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969
A Prochlorococcus proving ground for constraint-based metabolic modeling and multi-'omics data integration

NASA Astrophysics Data System (ADS)

Casey, J.; Ji, B.; Shaoie, S.; Mardinoglu, A.; Sarathi Sen, P.; Jahn, O.; Reda, K.; Leigh, J.; Follows, M. J.; Nielsen, J.; Karl, D. M.

2016-02-01

Representatives of the oligotrophic marine cyanobacterium Prochlorococcus marinus are the smallest free-living photosynthetic organisms, both in terms of physical size and genome size, yet are the most abundant photoautotrophic microbes in the oceans and profoundly influence global biogeochemical cycles. Physiological and regulatory control of nutrient and light stress has been observed in MED4 in culture and in its closely related 'ecotype' eMED4 in the field; however, its metabolism has not been investigated in detail. We present a genome-scale metabolic network reconstruction of the high-light adapted axenic strain MED4ax ("iJCMED4") for the quantitative analysis of a range of its metabolic phenotypes. The resulting structure is a proving ground for the incorporation of enzyme kinetics, biochemical and elemental compositional data, transcriptomic, proteomic, metabolomic, and fluxomic datasets which can be implemented within a constraint-based metabolic modeling environment. The iJCMED4 stoichiometric model consists of 523 metabolic genes encoding 787 reactions with 673 unique metabolites distributed in 5 sub-cellular compartments and is mass, charge, and thermodynamically balanced. Several variants of flux balance analysis were used to simulate growth and metabolic fluxes over the diel cycle, under various stress conditions (e.g., nitrogen, phosphorus, light), and within the framework of a global biogeochemical model (DARWIN). Model simulations accurately predicted growth rates in culture under a variety of defined medium compositions and there was close agreement of photosynthetic performance, biomass and energy yields and efficiencies, and transporter fluxes for iJCMED4 and culture experiments. In addition to a nearly optimal photosynthetic quotient and central carbon metabolism efficiency, MED4 has made dramatic alterations to redox and phosphorus metabolism across biosynthetic and intermediate pathways. We propose that reductions in phosphate reaction
Genome-scale CRISPR-Cas9 knockout screening in human cells.

PubMed

Shalem, Ophir; Sanjana, Neville E; Hartenian, Ella; Shi, Xi; Scott, David A; Mikkelson, Tarjei; Heckl, Dirk; Ebert, Benjamin L; Root, David E; Doench, John G; Zhang, Feng

2014-01-03

The simplicity of programming the CRISPR (clustered regularly interspaced short palindromic repeats)-associated nuclease Cas9 to modify specific genomic loci suggests a new way to interrogate gene function on a genome-wide scale. We show that lentiviral delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enables both negative and positive selection screening in human cells. First, we used the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, we screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic RAF inhibitor. Our highest-ranking candidates include previously validated genes NF1 and MED12, as well as novel hits NF2, CUL3, TADA2B, and TADA1. We observe a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, demonstrating the promise of genome-scale screening with Cas9.
Draft Genomes, Phylogenetic Reconstruction, and Comparative Genomics of Two Novel Cohabiting Bacterial Symbionts Isolated from Frankliniella occidentalis.

PubMed

Facey, Paul D; Méric, Guillaume; Hitchings, Matthew D; Pachebat, Justin A; Hegarty, Matt J; Chen, Xiaorui; Morgan, Laura V A; Hoeppner, James E; Whitten, Miranda M A; Kirk, William D J; Dyson, Paul J; Sheppard, Sam K; Del Sol, Ricardo

2015-07-15

Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor with the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea. © The Author(s) 2015. Published by
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

USDA-ARS's Scientific Manuscript database

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic mode...
Analysis of metabolic networks of Streptomyces leeuwenhoekii C34 by means of a genome scale model: Prediction of modifications that enhance the production of specialized metabolites.

PubMed

Razmilic, Valeria; Castro, Jean F; Andrews, Barbara; Asenjo, Juan A

2018-07-01

The first genome scale model (GSM) for Streptomyces leeuwenhoekii C34 was developed to study the biosynthesis pathways of specialized metabolites and to find metabolic engineering targets for enhancing their production. The model, iVR1007, consists of 1,722 reactions, 1,463 metabolites, and 1,007 genes; it includes the biosynthesis pathways of chaxamycins, chaxalactins, desferrioxamines, ectoine, and other specialized metabolites. iVR1007 was validated using experimental information of growth on 166 different sources of carbon, nitrogen and phosphorus, showing an 83.7% accuracy. The model was used to predict metabolic engineering targets for enhancing the biosynthesis of chaxamycins and chaxalactins. Gene knockouts, such as sle03600 (L-homoserine O-acetyltransferase), and sle39090 (trehalose-phosphate synthase), that enhance the production of the specialized metabolites by increasing the pool of precursors were identified. Using the algorithm of flux scanning based on enforced objective flux (FSEOF) implemented in Python, 35 and 25 over-expression targets for increasing the production of chaxamycin A and chaxalactin A, respectively, that were not directly associated with their biosynthesis routes were identified. Nineteen over-expression targets that were common to the two specialized metabolites studied, like the over-expression of the acetyl carboxylase complex (sle47660 (accA) and any of the following genes: sle44630 (accA_1) or sle39830 (accA_2) or sle27560 (bccA) or sle59710) were identified. The predicted knockouts and over-expression targets will be used to perform metabolic engineering of S. leeuwenhoekii C34 and obtain overproducer strains. © 2018 Wiley Periodicals, Inc.

Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

DOE PAGES

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; ...

2016-10-24

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

PubMed Central

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; Sharon, Itai; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Singh, Andrea; Wilkins, Michael J.; Karaoz, Ulas; Brodie, Eoin L.; Williams, Kenneth H.; Hubbard, Susan S.; Banfield, Jillian F.

2016-01-01

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles. PMID:27774985
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

NASA Astrophysics Data System (ADS)

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; Sharon, Itai; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Singh, Andrea; Wilkins, Michael J.; Karaoz, Ulas; Brodie, Eoin L.; Williams, Kenneth H.; Hubbard, Susan S.; Banfield, Jillian F.

2016-10-01

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals.

PubMed

Su, Fei; Xu, Ping

2014-01-29

Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species.
Genomic analysis of thermophilic Bacillus coagulans strains: efficient producers for platform bio-chemicals

PubMed Central

Su, Fei; Xu, Ping

2014-01-01

Microbial strains with high substrate efficiency and excellent environmental tolerance are urgently needed for the production of platform bio-chemicals. Bacillus coagulans has these merits; however, little genetic information is available about this species. Here, we determined the genome sequences of five B. coagulans strains, and used a comparative genomic approach to reconstruct the central carbon metabolism of this species to explain their fermentation features. A novel xylose isomerase in the xylose utilization pathway was identified in these strains. Based on a genome-wide positive selection scan, the selection pressure on amino acid metabolism may have played a significant role in the thermal adaptation. We also researched the immune systems of B. coagulans strains, which provide them with acquired resistance to phages and mobile genetic elements. Our genomic analysis provides comprehensive insights into the genetic characteristics of B. coagulans and paves the way for improving and extending the uses of this species. PMID:24473268
Dynamics of Marine Microbial Metabolism and Physiology at Station ALOHA

NASA Astrophysics Data System (ADS)

Casey, John R.

Marine microbial communities influence global biogeochemical cycles by coupling the transduction of free energy to the transformation of Earth's essential bio-elements: H, C, N, O, P, and S. The web of interactions between these processes is extraordinarily complex, though fundamental physical and thermodynamic principles should describe its dynamics. In this collection of 5 studies, aspects of the complexity of marine microbial metabolism and physiology were investigated as they interact with biogeochemical cycles and direct the flow of energy within the Station ALOHA surface layer microbial community. In Chapter 1, and at the broadest level of complexity discussed, a method to relate cell size to metabolic activity was developed to evaluate allometric power laws at fine scales within picoplankton populations. Although size was predictive of metabolic rates, within-population power laws deviated from the broader size spectrum, suggesting metabolic diversity as a key determinant of microbial activity. In Chapter 2, a set of guidelines was proposed by which organic substrates are selected and utilized by the heterotrophic community based on their nitrogen content, carbon content, and energy content. A hierarchical experimental design suggested that the heterotrophic microbial community prefers high nitrogen content but low energy density substrates, while carbon content was not important. In Chapter 3, a closer look at the light-dependent dynamics of growth on a single organic substrate, glycolate, suggested that growth yields were improved by photoheterotrophy. The remaining chapters were based on the development of a genome-scale metabolic network reconstruction of the cyanobacterium Prochlorococcus to probe its metabolic capabilities and quantify metabolic fluxes. Findings described in Chapter 4 pointed to evolution of the Prochlorococcus metabolic network to optimize growth at low phosphate concentrations. Finally, in Chapter 5 and at the finest scale of
Cyanobacterial life at low O(2): community genomics and function reveal metabolic versatility and extremely low diversity in a Great Lakes sinkhole mat.

PubMed

Voorhies, A A; Biddanda, B A; Kendall, S T; Jain, S; Marcus, D N; Nold, S C; Sheldon, N D; Dick, G J

2012-05-01

Cyanobacteria are renowned as the mediators of Earth's oxygenation. However, little is known about the cyanobacterial communities that flourished under the low-O(2) conditions that characterized most of their evolutionary history. Microbial mats in the submerged Middle Island Sinkhole of Lake Huron provide opportunities to investigate cyanobacteria under such persistent low-O(2) conditions. Here, venting groundwater rich in sulfate and low in O(2) supports a unique benthic ecosystem of purple-colored cyanobacterial mats. Beneath the mat is a layer of carbonate that is enriched in calcite and to a lesser extent dolomite. In situ benthic metabolism chambers revealed that the mats are net sinks for O(2), suggesting primary production mechanisms other than oxygenic photosynthesis. Indeed, (14)C-bicarbonate uptake studies of autotrophic production show variable contributions from oxygenic and anoxygenic photosynthesis and chemosynthesis, presumably because of supply of sulfide. These results suggest the presence of either facultatively anoxygenic cyanobacteria or a mix of oxygenic/anoxygenic types of cyanobacteria. Shotgun metagenomic sequencing revealed a remarkably low-diversity mat community dominated by just one genotype most closely related to the cyanobacterium Phormidium autumnale, for which an essentially complete genome was reconstructed. Also recovered were partial genomes from a second genotype of Phormidium and several Oscillatoria. Despite the taxonomic simplicity, diverse cyanobacterial genes putatively involved in sulfur oxidation were identified, suggesting a diversity of sulfide physiologies. The dominant Phormidium genome reflects versatile metabolism and physiology that is specialized for a communal lifestyle under fluctuating redox conditions and light availability. Overall, this study provides genomic and physiologic insights into low-O(2) cyanobacterial mat ecosystems that played crucial geobiological roles over long stretches of Earth history.
Principles of proteome allocation are revealed using proteomic data and genome-scale models

PubMed Central

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; Ebrahim, Ali; Saunders, Michael A.; Palsson, Bernhard O.

2016-01-01

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. This flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models. PMID:27857205
Principles of proteome allocation are revealed using proteomic data and genome-scale models

DOE PAGES

Yang, Laurence; Yurkovich, James T.; Lloyd, Colton J.; ...

2016-11-18

Integrating omics data to refine or make context-specific models is an active field of constraint-based modeling. Proteomics now cover over 95% of the Escherichia coli proteome by mass. Genome-scale models of Metabolism and macromolecular Expression (ME) compute proteome allocation linked to metabolism and fitness. Using proteomics data, we formulated allocation constraints for key proteome sectors in the ME model. The resulting calibrated model effectively computed the “generalist” (wild-type) E. coli proteome and phenotype across diverse growth environments. Across 15 growth conditions, prediction errors for growth rate and metabolic fluxes were 69% and 14% lower, respectively. The sector-constrained ME model thus represents a generalist ME model reflecting both growth rate maximization and “hedging” against uncertain environments and stresses, as indicated by significant enrichment of these sectors for the general stress response sigma factor σS. Finally, the sector constraints represent a general formalism for integrating omics data from any experimental condition into constraint-based ME models. The constraints can be fine-grained (individual proteins) or coarse-grained (functionally-related protein groups) as demonstrated here. Furthermore, this flexible formalism provides an accessible approach for narrowing the gap between the complexity captured by omics data and governing principles of proteome allocation described by systems-level models.
Use of an uncertainty analysis for genome-scale models as a prediction tool for microbial growth processes in subsurface environments.

PubMed

Klier, Christine

2012-03-06

The integration of genome-scale, constraint-based models of microbial cell function into simulations of contaminant transport and fate in complex groundwater systems is a promising approach to help characterize the metabolic activities of microorganisms in natural environments. In constraint-based modeling, the specific uptake flux rates of external metabolites are usually determined by Michaelis-Menten kinetic theory. However, extensive data sets based on experimentally measured values are not always available. In this study, a genome-scale model of Pseudomonas putida was used to study the key issue of uncertainty arising from the parametrization of the influx of two growth-limiting substrates: oxygen and toluene. The results showed that simulated growth rates are highly sensitive to substrate affinity constants and that uncertainties in specific substrate uptake rates have a significant influence on the variability of simulated microbial growth. Michaelis-Menten kinetic theory does not, therefore, seem to be appropriate for descriptions of substrate uptake processes in the genome-scale model of P. putida. Microbial growth rates of P. putida in subsurface environments can only be accurately predicted if the processes of complex substrate transport and microbial uptake regulation are sufficiently understood in natural environments and if data-driven uptake flux constraints can be applied.
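As a rough illustration of the sensitivity described in this abstract, the sketch below evaluates a Michaelis-Menten uptake bound, v = v_max * S / (K_s + S), for several substrate affinity constants. The function name `uptake_bound` and all numeric values are hypothetical and are not taken from the study; it only shows how strongly the flux bound handed to a constraint-based model can depend on K_s at low substrate concentration.

```python
def uptake_bound(v_max, substrate, k_s):
    """Michaelis-Menten uptake rate, usable as an upper bound on an uptake flux.

    v_max:     maximum specific uptake rate (e.g. mmol/gDW/h)
    substrate: external substrate concentration (e.g. mM)
    k_s:       substrate affinity (half-saturation) constant (same unit as substrate)
    """
    if substrate < 0 or k_s <= 0:
        raise ValueError("substrate must be >= 0 and k_s must be > 0")
    return v_max * substrate / (k_s + substrate)


# Hypothetical values chosen only for illustration.
V_MAX = 10.0      # mmol/gDW/h
SUBSTRATE = 0.05  # mM

# Vary the affinity constant over two orders of magnitude.
for k_s in (0.005, 0.05, 0.5):
    bound = uptake_bound(V_MAX, SUBSTRATE, k_s)
    print(f"K_s = {k_s:5.3f} mM -> uptake bound = {bound:.3f} mmol/gDW/h")
```

With these assumed numbers the bound changes by roughly an order of magnitude across the K_s range, which is the kind of parameter uncertainty the abstract refers to.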
Reconstructing each cell's genome within complex microbial communities-dream or reality?

PubMed

Clingenpeel, Scott; Clum, Alicia; Schwientek, Patrick; Rinke, Christian; Woyke, Tanja

2014-01-01

As the vast majority of microorganisms have yet to be cultivated in a laboratory setting, access to their genetic makeup has largely been limited to cultivation-independent methods. These methods, namely metagenomics and more recently single-cell genomics, have become cornerstones for microbial ecology and environmental microbiology. One ultimate goal is the recovery of genome sequences from each cell within an environment to move toward a better understanding of community metabolic potential and to provide substrate for experimental work. As single-cell sequencing has the ability to decipher all sequence information contained in an individual cell, this method holds great promise in tackling such a challenge. Methodological limitations and inherent biases do exist, however, and are discussed here based on environmental and benchmark data to assess how far we are from reaching this goal.
A Multi-scale Computational Platform to Mechanistically Assess the Effect of Genetic Variation on Drug Responses in Human Erythrocyte Metabolism

PubMed Central

Bordbar, Aarash; Palsson, Bernhard O.

2016-01-01

Progress in systems medicine brings promise to addressing patient heterogeneity and individualized therapies. Recently, genome-scale models of metabolism have been shown to provide insight into the mechanistic link between drug therapies and systems-level off-target effects while being expanded to explicitly include the three-dimensional structure of proteins. The integration of these molecular-level details, such as the physical, structural, and dynamical properties of proteins, notably expands the computational description of biochemical network-level properties and the possibility of understanding and predicting whole cell phenotypes. In this study, we present a multi-scale modeling framework that describes biological processes which range in scale from atomistic details to an entire metabolic network. Using this approach, we can understand how genetic variation, which impacts the structure and reactivity of a protein, influences both native and drug-induced metabolic states. As a proof-of-concept, we study three enzymes (catechol-O-methyltransferase, glucose-6-phosphate dehydrogenase, and glyceraldehyde-3-phosphate dehydrogenase) and their respective genetic variants which have clinically relevant associations. Using all-atom molecular dynamic simulations enables the sampling of long timescale conformational dynamics of the proteins (and their mutant variants) in complex with their respective native metabolites or drug molecules. We find that changes in a protein’s structure due to a mutation influence protein binding affinity to metabolites and/or drug molecules, and inflict large-scale changes in metabolism. PMID:27467583
MEBS, a software platform to evaluate large (meta)genomic collections according to their metabolic machinery: unraveling the sulfur cycle

PubMed Central

Zapata-Peñasco, Icoquih; Poot-Hernandez, Augusto Cesar; Eguiarte, Luis E

2017-01-01

The increasing number of metagenomic and genomic sequences has dramatically improved our understanding of microbial diversity, yet our ability to infer metabolic capabilities in such datasets remains challenging. We describe the Multigenomic Entropy Based Score pipeline (MEBS), a software platform designed to evaluate, compare, and infer complex metabolic pathways in large “omic” datasets, including entire biogeochemical cycles. MEBS is open source and available through https://github.com/eead-csic-compbio/metagenome_Pfam_score. To demonstrate its use, we modeled the sulfur cycle by exhaustively curating the molecular and ecological elements involved (compounds, genes, metabolic pathways, and microbial taxa). This information was reduced to a collection of 112 characteristic Pfam protein domains and a list of complete-sequenced sulfur genomes. Using the mathematical framework of relative entropy (H΄), we quantitatively measured the enrichment of these domains among sulfur genomes. The entropy of each domain was used both to build up a final score that indicates whether a (meta)genomic sample contains the metabolic machinery of interest and to propose marker domains in metagenomic sequences such as DsrC (PF04358). MEBS was benchmarked with a dataset of 2107 non-redundant microbial genomes from RefSeq and 935 metagenomes from MG-RAST. Its performance, reproducibility, and robustness were evaluated using several approaches, including random sampling, linear regression models, receiver operator characteristic plots, and the area under the curve metric (AUC). Our results support the broad applicability of this algorithm to accurately classify (AUC = 0.985) hard-to-culture genomes (e.g., Candidatus Desulforudis audaxviator), previously characterized ones, and metagenomic environments such as hydrothermal vents, or deep-sea sediment. Our benchmark indicates that an entropy-based score can capture the metabolic machinery of interest and can be used to
MEBS, a software platform to evaluate large (meta)genomic collections according to their metabolic machinery: unraveling the sulfur cycle.

PubMed

De Anda, Valerie; Zapata-Peñasco, Icoquih; Poot-Hernandez, Augusto Cesar; Eguiarte, Luis E; Contreras-Moreira, Bruno; Souza, Valeria

2017-11-01

The increasing number of metagenomic and genomic sequences has dramatically improved our understanding of microbial diversity, yet our ability to infer metabolic capabilities in such datasets remains challenging. We describe the Multigenomic Entropy Based Score pipeline (MEBS), a software platform designed to evaluate, compare, and infer complex metabolic pathways in large "omic" datasets, including entire biogeochemical cycles. MEBS is open source and available through https://github.com/eead-csic-compbio/metagenome_Pfam_score. To demonstrate its use, we modeled the sulfur cycle by exhaustively curating the molecular and ecological elements involved (compounds, genes, metabolic pathways, and microbial taxa). This information was reduced to a collection of 112 characteristic Pfam protein domains and a list of complete-sequenced sulfur genomes. Using the mathematical framework of relative entropy (H΄), we quantitatively measured the enrichment of these domains among sulfur genomes. The entropy of each domain was used both to build up a final score that indicates whether a (meta)genomic sample contains the metabolic machinery of interest and to propose marker domains in metagenomic sequences such as DsrC (PF04358). MEBS was benchmarked with a dataset of 2107 non-redundant microbial genomes from RefSeq and 935 metagenomes from MG-RAST. Its performance, reproducibility, and robustness were evaluated using several approaches, including random sampling, linear regression models, receiver operator characteristic plots, and the area under the curve metric (AUC). Our results support the broad applicability of this algorithm to accurately classify (AUC = 0.985) hard-to-culture genomes (e.g., Candidatus Desulforudis audaxviator), previously characterized ones, and metagenomic environments such as hydrothermal vents, or deep-sea sediment. Our benchmark indicates that an entropy-based score can capture the metabolic machinery of interest and can be used to efficiently classify
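The entropy-based scoring idea in this abstract can be sketched as follows. This is a minimal illustration and not the MEBS implementation: it assumes a per-domain relative-entropy term of the form p * log2(p / q), where p is the fraction of genomes of interest carrying a domain and q is the fraction of background genomes carrying it, and it assumes the sample score is the sum of the terms for the domains found in the sample. The function names, domain labels, and frequencies are all hypothetical.

```python
import math


def domain_entropy(p_target, q_background):
    """Signed relative-entropy term p * log2(p / q) for a single domain.

    Positive when the domain is enriched in the target genomes (p > q),
    negative when it is depleted (p < q), zero when p == q or p == 0.
    """
    if not 0.0 <= p_target <= 1.0:
        raise ValueError("p_target must lie in [0, 1]")
    if not 0.0 < q_background <= 1.0:
        raise ValueError("q_background must lie in (0, 1]")
    if p_target == 0.0:
        return 0.0  # convention: 0 * log(0) is treated as 0
    return p_target * math.log2(p_target / q_background)


def sample_score(domains_in_sample, entropies):
    """Sum the entropy terms of the scored domains present in a sample."""
    return sum(entropies.get(domain, 0.0) for domain in domains_in_sample)


# Hypothetical (target frequency, background frequency) pairs.
frequencies = {
    "domain_A": (0.8, 0.1),  # strongly enriched in target genomes
    "domain_B": (0.5, 0.5),  # uninformative
    "domain_C": (0.1, 0.4),  # depleted in target genomes
}
entropies = {name: domain_entropy(p, q) for name, (p, q) in frequencies.items()}

# A sample that carries the enriched and the uninformative domain.
print(round(sample_score({"domain_A", "domain_B"}, entropies), 3))
```

Under these assumptions, a sample carrying mostly enriched domains receives a high score, while uninformative or depleted domains contribute nothing or lower it.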
An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens

PubMed Central

Rensing, Stefan A; Ick, Julia; Fawcett, Jeffrey A; Lang, Daniel; Zimmer, Andreas; Van de Peer, Yves; Reski, Ralf

2007-01-01

Background: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all seed plants have undergone one or more genome duplications in their evolutionary past. Results: In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Conclusion: Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants. PMID:17683536
Carbohydrate metabolism genes and pathways in insects: insights from the honey bee genome

PubMed Central

Kunieda, T; Fujiyuki, T; Kucharski, R; Foret, S; Ament, S A; Toth, A L; Ohashi, K; Takeuchi, H; Kamikouchi, A; Kage, E; Morioka, M; Beye, M; Kubo, T; Robinson, G E; Maleszka, R

2006-01-01

Carbohydrate-metabolizing enzymes may have particularly interesting roles in the honey bee, Apis mellifera, because this social insect has an extremely carbohydrate-rich diet, and nutrition plays important roles in caste determination and socially mediated behavioural plasticity. We annotated a total of 174 genes encoding carbohydrate-metabolizing enzymes and 28 genes encoding lipid-metabolizing enzymes, based on orthology to their counterparts in the fly, Drosophila melanogaster, and the mosquito, Anopheles gambiae. We found that the number of genes for carbohydrate metabolism appears to be more evolutionarily labile than for lipid metabolism. In particular, we identified striking changes in gene number or genomic organization for genes encoding glycolytic enzymes, cellulase, glucose oxidase and glucose dehydrogenases, glucose-methanol-choline (GMC) oxidoreductases, fucosyltransferases, and lysozymes. PMID:17069632
Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis.

PubMed

Harvey, Benjamin Simeon; Ji, Soo-Yeon

2017-01-01

As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring forth oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological interpretation by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation in order to initialize a threshold which will retain significantly expressed genes through the denoising process for robust classification of cancer patients. Additionally, the overall study was implemented and encompassed within CSDP environment. The utilization of cloud computing and wavelet-based thresholding for denoising was used for the classification of samples within the Global Cancer Map, Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results proved that separable 1-D parallel distributed wavelet denoising in the cloud and differential expression thresholding increased the computational performance and enabled the generation of higher quality LSCG microarray datasets, which led to more accurate classification results.
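The wavelet-based denoising step mentioned in this abstract can be illustrated with a small single-machine sketch. It is not the authors' cloud-scale distributed (CSDP) method and does not reproduce their differential expression threshold: it assumes a one-level 1-D Haar transform with hard thresholding of the detail coefficients, and the function names and input values are hypothetical.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal 1-D Haar transform (even-length input)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, rebuilding the signal pair by pair."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend(((a + d) / scale, (a - d) / scale))
    return signal


def denoise(signal, threshold):
    """Hard-threshold the detail coefficients, then reconstruct the signal."""
    approx, detail = haar_forward(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in detail]
    return haar_inverse(approx, kept)


# Hypothetical expression profile: small fluctuation in the first pair,
# a large change in the second pair.
profile = [1.0, 1.2, 5.0, 1.0]
print([round(x, 3) for x in denoise(profile, threshold=1.0)])
```

In this toy case the small difference within the first pair is smoothed away while the large change in the second pair survives, which is the general behaviour a denoising threshold is meant to achieve.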
A Helitron transposon reconstructed from bats reveals a novel mechanism of genome shuffling in eukaryotes.

PubMed

Grabundzija, Ivana; Messing, Simon A; Thomas, Jainy; Cosby, Rachel L; Bilic, Ilija; Miskey, Csaba; Gogol-Döring, Andreas; Kapitonov, Vladimir; Diem, Tanja; Dalda, Anna; Jurka, Jerzy; Pritham, Ellen J; Dyda, Fred; Izsvák, Zsuzsanna; Ivics, Zoltán

2016-03-02

Helitron transposons capture and mobilize gene fragments in eukaryotes, but experimental evidence for their transposition is lacking in the absence of an isolated active element. Here we reconstruct Helraiser, an ancient element from the bat genome, and use this transposon as an experimental tool to unravel the mechanism of Helitron transposition. A hairpin close to the 3'-end of the transposon functions as a transposition terminator. However, the 3'-end can be bypassed by the transposase, resulting in transduction of flanking sequences to new genomic locations. Helraiser transposition generates covalently closed circular intermediates, suggestive of a replicative transposition mechanism, which provides a powerful means to disseminate captured transcriptional regulatory signals across the genome. Indeed, we document the generation of novel transcripts by Helitron promoter capture both experimentally and by transcriptome analysis in bats. Our results provide mechanistic insight into Helitron transposition, and its impact on diversification of gene function by genome shuffling.
CoryneRegNet: An ontology-based data warehouse of corynebacterial transcription factors and regulatory networks

PubMed Central

Baumbach, Jan; Brinkrolf, Karina; Czaja, Lisa F; Rahmann, Sven; Tauch, Andreas

2006-01-01

Background: The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. Description: CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. Conclusion: CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation. PMID:16478536
