Tapping the promise of genomics in species with complex, nonmodel genomes.
Hirsch, Candice N; Buell, C Robin
2013-01-01
Genomics is enabling a renaissance in all disciplines of plant biology. However, many plant genomes are complex and remain recalcitrant to current genomic technologies. The complexities of these nonmodel plant genomes are attributable to gene and genome duplication, heterozygosity, ploidy, and/or repetitive sequences. Methods are available to simplify the genome and reduce these barriers, including inbreeding and genome reduction, making these species amenable to current sequencing and assembly methods. Some, but not all, of the complexities in nonmodel genomes can be bypassed by sequencing the transcriptome rather than the genome. Additionally, comparative genomics approaches, which leverage phylogenetic relatedness, can aid in the interpretation of complex genomes. Although there are limitations in accessing complex nonmodel plant genomes using current sequencing technologies, genome manipulation and resourceful analyses can allow access to even the most recalcitrant plant genomes.
High-Throughput resequencing of maize landraces at genomic regions associated with flowering time
USDA-ARS's Scientific Manuscript database
Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...
Agricultural biodiversity in the post-genomics era
USDA-ARS's Scientific Manuscript database
The toolkit available for assessing and utilizing biological diversity within agricultural systems is rapidly expanding. In particular, genome and transcriptome re-sequencing as well as genome complexity reduction techniques are gaining popularity as the cost of generating short read sequence data d...
USDA-ARS's Scientific Manuscript database
Hybridization and genomic admixture between divergent populations or species may be an important driver of plant invasiveness. Recent studies have emphasized the critical role that reductions in genome size may play in facilitating the rapid evolution of invasiveness, and small genome size has been ...
BAC sequencing using pooled methods.
Saski, Christopher A; Feltus, F Alex; Parida, Laxmi; Haiminen, Niina
2015-01-01
Shotgun sequencing and assembly of a large, complex genome can be expensive, and accurately reconstructing the true genome sequence is challenging. Repetitive DNA arrays, paralogous sequences, polyploidy, and heterozygosity are the main factors that plague de novo genome sequencing projects, which typically yield highly fragmented assemblies from which it is difficult to extract biological meaning. Targeted, sub-genomic sequencing offers complexity reduction by removing distal segments of the genome and a systematic mechanism for exploring prioritized genomic content through BAC sequencing. If one isolates and sequences the genome fraction that encodes the relevant biological information, then it is possible to reduce overall sequencing costs and efforts that target a genomic segment. This chapter describes the sub-genome assembly protocol for an organism based upon a BAC tiling path derived from a genome-scale physical map or from fine mapping using BACs to target sub-genomic regions. Methods that are described include BAC isolation and mapping, DNA sequencing, and sequence assembly.
Cuypers, Thomas D.; Hogeweg, Paulien
2012-01-01
The picture that emerges from phylogenetic gene content reconstructions is that genomes evolve in a dynamic pattern of rapid expansion and gradual streamlining. Ancestral organisms have been estimated to possess remarkably rich gene complements, although gene loss is a driving force in subsequent lineage adaptation and diversification. Here, we study genome dynamics in a model of virtual cells evolving to maintain homeostasis. We observe a pattern of an initial rapid expansion of the genome and a prolonged phase of mutational load reduction. Generally, load reduction is achieved by the deletion of redundant genes, generating a streamlining pattern. Load reduction can also occur as a result of the generation of highly neutral genomic regions. These regions can expand and contract in a neutral fashion. Our study suggests that genome expansion and streamlining are generic patterns of evolving systems. We propose that the complex genotype to phenotype mapping in virtual cells as well as in their biological counterparts drives genome size dynamics, due to an emerging interplay between adaptation, neutrality, and evolvability. PMID:22234601
Pootakham, Wirulda; Sonthirod, Chutima; Naktang, Chaiwat; Jomchai, Nukoon; Sangsrakru, Duangjai; Tangphatsornruang, Sithichoke
2016-01-01
Advances in next generation sequencing have facilitated large-scale single nucleotide polymorphism (SNP) discovery in many crop species. The genotyping-by-sequencing (GBS) approach couples next generation sequencing with genome complexity reduction techniques to simultaneously identify and genotype SNPs. Choice of enzymes used in GBS library preparation depends on several factors including the number of markers required, the desired level of multiplexing, and whether the enrichment of genic SNPs is preferred. We evaluated various combinations of methylation-sensitive (AatII, PstI, MspI) and methylation-insensitive (SphI, MseI) enzymes for their effectiveness in genome complexity reduction and enrichment of genic SNPs. We discovered that the use of two methylation-sensitive enzymes effectively reduced genome complexity and did not require a size selection step. In contrast, the genome coverage of libraries constructed with methylation-insensitive enzymes was quite high, and an additional size selection step may be required to increase the overall read depth. We also demonstrated the effectiveness of methylation-sensitive enzymes in enriching for SNPs located in genic regions. When two methylation-insensitive enzymes were used, only 16% of SNPs identified were located in genes and 18% in the vicinity (± 5 kb) of the genic regions, while most SNPs resided in the intergenic regions. In contrast, a remarkable degree of enrichment was observed when two methylation-sensitive enzymes were employed. Almost two thirds of the SNPs were located either inside (32-36%) or in the vicinity (28-31%) of the genic regions. These results provide useful information to help researchers choose appropriate GBS enzymes in oil palm and other crop species.
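The enzyme-choice reasoning in this abstract can be illustrated with a small in silico double digest. The following is only a minimal sketch, not the authors' pipeline: the function names are hypothetical, the recognition sites are the standard ones for PstI (CTGCAG), MspI (CCGG) and MseI (TTAA), and methylation sensitivity is deliberately not modeled, so the result reflects only how site frequency affects the fraction of the genome sampled.

```python
import re

# Recognition site and cut offset (top strand) for a few of the enzymes named above.
# Methylation sensitivity is NOT modeled here; this is an assumption of the sketch.
ENZYMES = {
    "PstI": ("CTGCAG", 5),  # CTGCA^G
    "MspI": ("CCGG", 1),    # C^CGG
    "MseI": ("TTAA", 1),    # T^TAA
}


def cut_sites(seq, enzyme):
    """Return (cut_position, enzyme) for every occurrence of the site in seq."""
    site, offset = ENZYMES[enzyme]
    # A lookahead lets overlapping occurrences of the site be found.
    return [(m.start() + offset, enzyme) for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enz_a, enz_b, min_len=0, max_len=None):
    """Return (start, end) of fragments flanked by one cut from each enzyme."""
    cuts = sorted(cut_sites(seq, enz_a) + cut_sites(seq, enz_b))
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        in_range = length >= min_len and (max_len is None or length <= max_len)
        # Two-enzyme libraries keep fragments whose ends come from different enzymes.
        if left != right and in_range:
            fragments.append((start, end))
    return fragments


def fraction_sampled(seq, fragments):
    """Fraction of the sequence covered by the retained fragments."""
    return sum(end - start for start, end in fragments) / len(seq)


if __name__ == "__main__":
    toy = "AAAA" + "CTGCAG" + "TTTTTTTTTT" + "CCGG" + "AAAA"
    frags = double_digest(toy, "PstI", "MspI")
    print(frags, fraction_sampled(toy, frags))
```

The `min_len` and `max_len` arguments stand in for a size selection step; leaving them at their defaults corresponds to a library prepared without size selection.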
A Lack of Parasitic Reduction in the Obligate Parasitic Green Alga Helicosporidium
Pombert, Jean-François; Blouin, Nicolas Achille; Lane, Chris; Boucias, Drion; Keeling, Patrick J.
2014-01-01
The evolution of an obligate parasitic lifestyle is often associated with genomic reduction, in particular with the loss of functions associated with increasing host-dependence. This is evident in many parasites, but perhaps the most extreme transitions are from free-living autotrophic algae to obligate parasites. The best-known examples of this are the apicomplexans such as Plasmodium, which evolved from algae with red secondary plastids. However, an analogous transition also took place independently in the Helicosporidia, where an obligate parasite of animals with an intracellular infection mechanism evolved from algae with green primary plastids. We characterised the nuclear genome of Helicosporidium to compare its transition to parasitism with that of apicomplexans. The Helicosporidium genome is small and compact, even by comparison with the relatively small genomes of the closely related green algae Chlorella and Coccomyxa, but at the functional level we find almost no evidence for reduction. Nearly all ancestral metabolic functions are retained, with the single major exception of photosynthesis, and even here reduction is not complete. The great majority of genes for light-harvesting complexes, photosystems, and pigment biosynthesis have been lost, but those for other photosynthesis-related functions, such as Calvin cycle, are retained. Rather than loss of whole function categories, the predominant reductive force in the Helicosporidium genome is a contraction of gene family complexity, but even here most losses affect families associated with genome maintenance and expression, not functions associated with host-dependence. Other gene families appear to have expanded in response to parasitism, in particular chitinases, including those predicted to digest the chitinous barriers of the insect host or remodel the cell wall of Helicosporidium. 
Overall, the Helicosporidium genome presents a fascinating picture of the early stages of a transition from free-living autotroph to parasitic heterotroph where host-independence has been unexpectedly preserved. PMID:24809511
Implications of streamlining theory for microbial ecology
Giovannoni, Stephen J; Cameron Thrash, J; Temperton, Ben
2014-01-01
Whether a small cell, a small genome or a minimal set of chemical reactions with self-replicating properties, simplicity is beguiling. As Leonardo da Vinci reportedly said, ‘simplicity is the ultimate sophistication’. Two diverging views of simplicity have emerged in accounts of symbiotic and commensal bacteria and cosmopolitan free-living bacteria with small genomes. The small genomes of obligate insect endosymbionts have been attributed to genetic drift caused by small effective population sizes (Ne). In contrast, streamlining theory attributes small cells and genomes to selection for efficient use of nutrients in populations where Ne is large and nutrients limit growth. Regardless of the cause of genome reduction, lost coding potential eventually dictates loss of function. Consequences of reductive evolution in streamlined organisms include atypical patterns of prototrophy and the absence of common regulatory systems, which have been linked to difficulty in culturing these cells. Recent evidence from metagenomics suggests that streamlining is commonplace, may broadly explain the phenomenon of the uncultured microbial majority, and might also explain the highly interdependent (connected) behavior of many microbial ecosystems. Streamlining theory is belied by the observation that many successful bacteria are large cells with complex genomes. To fully appreciate streamlining, we must look to the life histories and adaptive strategies of cells, which impose minimum requirements for complexity that vary with niche. PMID:24739623
Blackburn, Michael B; Sparks, Michael E; Gundersen-Rindal, Dawn E
2016-12-01
The genome of Chromobacterium subtsugae strain PRAA4-1, a betaproteobacterium producing insecticidal compounds, was sequenced and compared with the genome of C. violaceum ATCC 12472. The genome of C. subtsugae displayed a reduction in genes devoted to capsular and extracellular polysaccharide, possessed no genes encoding nitrate reductases, and exhibited many more phage-related sequences than were observed for C. violaceum. The genomes of both species possess a number of gene clusters predicted to encode biosynthetic complexes for secondary metabolites; these clusters suggest they produce overlapping, but distinct assortments of metabolites.
Genome expansion via lineage splitting and genome reduction in the cicada endosymbiont Hodgkinia.
Campbell, Matthew A; Van Leuven, James T; Meister, Russell C; Carey, Kaitlin M; Simon, Chris; McCutcheon, John P
2015-08-18
Comparative genomics from mitochondria, plastids, and mutualistic endosymbiotic bacteria has shown that the stable establishment of a bacterium in a host cell results in genome reduction. Although many highly reduced genomes from endosymbiotic bacteria are stable in gene content and genome structure, organelle genomes are sometimes characterized by dramatic structural diversity. Previous results from Candidatus Hodgkinia cicadicola, an endosymbiont of cicadas, revealed that some lineages of this bacterium had split into two new cytologically distinct yet genetically interdependent species. It was hypothesized that the long life cycle of cicadas in part enabled this unusual lineage-splitting event. Here we test this hypothesis by investigating the structure of the Ca. Hodgkinia genome in one of the longest-lived cicadas, Magicicada tredecim. We show that the Ca. Hodgkinia genome from M. tredecim has fragmented into multiple new chromosomes or genomes, with at least some remaining partitioned into discrete cells. We also show that this lineage-splitting process has resulted in a complex of Ca. Hodgkinia genomes that are 1.1-Mb pairs in length when considered together, an almost 10-fold increase in size from the hypothetical single-genome ancestor. These results parallel some examples of genome fragmentation and expansion in organelles, although the mechanisms that give rise to these extreme genome instabilities are likely different.
Electron transfer to nitrogenase in different genomic and metabolic backgrounds.
Poudel, Saroj; Colman, Daniel R; Fixen, Kathryn R; Ledbetter, Rhesa N; Zheng, Yanning; Pence, Natasha; Seefeldt, Lance C; Peters, John W; Harwood, Caroline S; Boyd, Eric S
2018-02-26
Nitrogenase catalyzes the reduction of dinitrogen (N2) using low potential electrons from ferredoxin (Fd) or flavodoxin (Fld) through an ATP-dependent process. Since its emergence in an anaerobic chemoautotroph, this oxygen (O2)-sensitive enzyme complex has evolved to operate in a variety of genomic and metabolic backgrounds including those of aerobes, anaerobes, chemotrophs, and phototrophs. However, whether pathways of electron delivery to nitrogenase are influenced by these different metabolic backgrounds is not well understood. Here, we report the distribution of homologs of Fds, Flds, and Fd/Fld-reducing enzymes in 359 genomes of putative N2 fixers (diazotrophs). Six distinct lineages of nitrogenase were identified and their distributions largely corresponded to differences in the host cells' ability to integrate O2 or light into energy metabolism. Predicted pathways of electron transfer to nitrogenase in aerobes, facultative anaerobes, and phototrophs varied from those in anaerobes at the level of Fds/Flds used to reduce nitrogenase, the enzymes that generate reduced Fds/Flds, and the putative substrates of these enzymes. Proteins that putatively reduce Fd with hydrogen or pyruvate were enriched in anaerobes, while those that reduce Fd with NADH/NADPH were enriched in aerobes, facultative anaerobes, and anoxygenic phototrophs. The energy metabolism of aerobic, facultatively anaerobic, and anoxygenic phototrophic diazotrophs often yields reduced NADH/NADPH that is not sufficiently reduced to drive N2 reduction. At least two mechanisms have been acquired by these taxa to overcome this limitation and to generate electrons with potentials capable of reducing Fd. These include the bifurcation of electrons or the coupling of Fd reduction to reverse ion translocation.
IMPORTANCE Nitrogen fixation supplies fixed nitrogen to cells from a variety of genomic and metabolic backgrounds including those of aerobes, facultative anaerobes, chemotrophs, and phototrophs. Here, using informatics approaches applied to genomic data, we show that pathways of electron transfer to nitrogenase in metabolically diverse diazotrophic taxa have diversified primarily in response to host cells' acquired ability to integrate O2 or light into their energy metabolism. Acquisition of two key enzyme complexes enabled aerobic and facultatively anaerobic phototrophic taxa to generate electrons of sufficiently low potential to reduce nitrogenase: the bifurcation of electrons via the Fix complex or the coupling of Fd reduction to reverse ion translocation via the Rhodobacter nitrogen fixation (Rnf) complex. Copyright © 2018 American Society for Microbiology.
Sanderson, Michael J; Copetti, Dario; Búrquez, Alberto; Bustamante, Enriquena; Charboneau, Joseph L M; Eguiarte, Luis E; Kumar, Sudhir; Lee, Hyun Oh; Lee, Junki; McMahon, Michelle; Steele, Kelly; Wing, Rod; Yang, Tae-Jin; Zwickl, Derrick; Wojciechowski, Martin F
2015-07-01
• Land-plant plastid genomes have only rarely undergone significant changes in gene content and order. Thus, discovery of additional examples adds power to tests for causes of such genome-scale structural changes.
• Using next-generation sequence data, we assembled the plastid genome of saguaro cactus and probed the nuclear genome for transferred plastid genes and functionally related nuclear genes. We combined these results with available data across Cactaceae and seed plants more broadly to infer the history of gene loss and to assess the strength of phylogenetic association between gene loss and loss of the inverted repeat (IR).
• The saguaro plastid genome is the smallest known for an obligately photosynthetic angiosperm (∼113 kb), having lost the IR and plastid ndh genes. This loss supports a statistically strong association across seed plants between the loss of ndh genes and the loss of the IR. Many nonplastid copies of plastid ndh genes were found in the nuclear genome, but none had intact reading frames; nor did three related nuclear-encoded subunits. However, nuclear pgr5, which functions in a partially redundant pathway, was intact.
• The existence of an alternative pathway redundant with the function of the plastid NADH dehydrogenase-like (NDH) complex may permit loss of the plastid ndh gene suite in photoautotrophs like saguaro. Loss of these genes may be a recurring mechanism for overall plastid genome size reduction, especially in combination with loss of the IR. © 2015 Botanical Society of America, Inc.
Mathew, Boby; Léon, Jens; Sannemann, Wiebke; Sillanpää, Mikko J.
2018-01-01
Gene-by-gene interactions, also known as epistasis, regulate many complex traits in different species. With the availability of low-cost genotyping it is now possible to study epistasis on a genome-wide scale. However, identifying genome-wide epistasis is a high-dimensional multiple regression problem and needs the application of dimensionality reduction techniques. Flowering Time (FT) in crops is a complex trait that is known to be influenced by many interacting genes and pathways in various crops. In this study, we successfully apply Sure Independence Screening (SIS) for dimensionality reduction to identify two-way and three-way epistasis for the FT trait in a Multiparent Advanced Generation Inter-Cross (MAGIC) barley population using the Bayesian multilocus model. The MAGIC barley population was generated from intercrossing among eight parental lines and thus, offered greater genetic diversity to detect higher-order epistatic interactions. Our results suggest that SIS is an efficient dimensionality reduction approach to detect high-order interactions in a Bayesian multilocus model. We also observe that many of our findings (genomic regions with main or higher-order epistatic effects) overlap with known candidate genes that have been already reported in barley and closely related species for the FT trait. PMID:29254994
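The screening step named in this abstract can be sketched in a few lines. This is a minimal illustration of the general idea behind Sure Independence Screening (rank candidate predictors by the absolute marginal correlation with the trait and keep only the top few), applied here to two-way interaction terms. It is not the authors' implementation: the function names and the -1/+1 marker coding are assumptions, and the Bayesian multilocus model that the study fits after screening is not shown.

```python
from itertools import combinations
from math import sqrt


def standardize(values):
    """Center and scale a list of numbers; a constant column maps to zeros."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]


def marginal_scores(columns, y):
    """Absolute marginal correlation of each column with the response y."""
    y_std = standardize(y)
    n = len(y)
    return [
        abs(sum(a * b for a, b in zip(standardize(col), y_std))) / n
        for col in columns
    ]


def sis_screen(columns, y, keep):
    """Indices of the `keep` columns with the largest marginal scores."""
    scores = marginal_scores(columns, y)
    ranked = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return ranked[:keep]


def pairwise_terms(markers):
    """Build two-way interaction columns as element-wise products of marker pairs."""
    pairs = list(combinations(range(len(markers)), 2))
    terms = [[a * b for a, b in zip(markers[i], markers[j])] for i, j in pairs]
    return pairs, terms


if __name__ == "__main__":
    # Toy data: three markers coded -1/+1 (hypothetical coding), eight individuals.
    markers = [
        [-1, -1, -1, -1, 1, 1, 1, 1],
        [-1, -1, 1, 1, -1, -1, 1, 1],
        [-1, 1, -1, 1, -1, 1, -1, 1],
    ]
    # The trait depends only on the interaction between markers 0 and 2.
    trait = [2 * a * b for a, b in zip(markers[0], markers[2])]
    pairs, terms = pairwise_terms(markers)
    kept = sis_screen(terms, trait, keep=1)
    print([pairs[k] for k in kept])
```

In a genome-wide setting the number of pairwise and three-way terms grows combinatorially, which is why a cheap marginal ranking of this kind is used to shrink the candidate set before a full multilocus model is fitted.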
Ogier, Jean-Claude; Pagès, Sylvie; Bisch, Gaëlle; Chiapello, Hélène; Médigue, Claudine; Rouy, Zoé; Teyssier, Corinne; Vincent, Stéphanie; Tailliez, Patrick; Givaudan, Alain; Gaudriault, Sophie
2014-01-01
Bacteria of the genus Xenorhabdus are symbionts of soil entomopathogenic nematodes of the genus Steinernema. This symbiotic association constitutes an insecticidal complex active against a wide range of insect pests. Unlike other Xenorhabdus species, Xenorhabdus poinarii is avirulent when injected into insects in the absence of its nematode host. We sequenced the genome of the X. poinarii strain G6 and the closely related but virulent X. doucetiae strain FRM16. G6 had a smaller genome (500–700 kb smaller) than virulent Xenorhabdus strains and lacked genes encoding potential virulence factors (hemolysins, type 5 secretion systems, enzymes involved in the synthesis of secondary metabolites, and toxin–antitoxin systems). The genomes of all the X. poinarii strains analyzed here had a similar small size. We did not observe the accumulation of pseudogenes, insertion sequences or decrease in coding density usually seen as a sign of genomic erosion driven by genetic drift in host-adapted bacteria. Instead, genome reduction of X. poinarii seems to have been mediated by the excision of genomic blocks from the flexible genome, as reported for the genomes of attenuated free pathogenic bacteria and some facultative mutualistic bacteria growing exclusively within hosts. This evolutionary pathway probably reflects the adaptation of X. poinarii to specific host. PMID:24904010
Sewell, Holly L; Kaster, Anne-Kristin; Spormann, Alfred M
2017-12-19
The deep marine subsurface is one of the largest unexplored biospheres on Earth and is widely inhabited by members of the phylum Chloroflexi. In this report, we investigated genomes of single cells obtained from deep-sea sediments of the Peruvian Margin, which are enriched in such Chloroflexi. 16S rRNA gene sequence analysis placed two of these single-cell-derived genomes (DscP3 and Dsc4) in a clade of subphylum I Chloroflexi which were previously recovered from deep-sea sediment in the Okinawa Trough and a third (DscP2-2) as a member of the previously reported DscP2 population from Peruvian Margin site 1230. The presence of genes encoding enzymes of a complete Wood-Ljungdahl pathway, glycolysis/gluconeogenesis, a Rhodobacter nitrogen fixation (Rnf) complex, glycosyltransferases, and formate dehydrogenases in the single-cell genomes of DscP3 and Dsc4 and the presence of an NADH-dependent reduced ferredoxin:NADP oxidoreductase (Nfn) and Rnf in the genome of DscP2-2 imply a homoacetogenic lifestyle of these abundant marine Chloroflexi. We also report here the first complete pathway for anaerobic benzoate oxidation to acetyl coenzyme A (CoA) in the phylum Chloroflexi (DscP3 and Dsc4), including a class I benzoyl-CoA reductase. Of remarkable evolutionary significance, we discovered a gene encoding a formate dehydrogenase (FdnI) with reciprocal closest identity to the formate dehydrogenase-like protein (complex iron-sulfur molybdoenzyme [CISM], DET0187) of terrestrial Dehalococcoides/Dehalogenimonas spp. This formate dehydrogenase-like protein has been shown to lack formate dehydrogenase activity in Dehalococcoides/Dehalogenimonas spp. and is instead hypothesized to couple HupL hydrogenase to a reductive dehalogenase in the catabolic reductive dehalogenation pathway. This finding of a close functional homologue provides an important missing link for understanding the origin and the metabolic core of terrestrial Dehalococcoides/Dehalogenimonas spp. and of reductive dehalogenation, as well as the biology of abundant deep-sea Chloroflexi.
IMPORTANCE The deep marine subsurface is one of the largest unexplored biospheres on Earth and is widely inhabited by members of the phylum Chloroflexi. In this report, we investigated genomes of single cells obtained from deep-sea sediments and provide evidence for a homoacetogenic lifestyle of these abundant marine Chloroflexi. Moreover, genome signature and key metabolic genes indicate an evolutionary relationship between these deep-sea sediment microbes and terrestrial, reductively dehalogenating Dehalococcoides. Copyright © 2017 Sewell et al.
NASA Astrophysics Data System (ADS)
Wrighton, K. C.; Thomas, B.; Miller, C. S.; Sharon, I.; Wilkins, M. J.; VerBerkmoes, N. C.; Handley, K. M.; Lipton, M. S.; Hettich, R. L.; Williams, K. H.; Long, P. E.; Banfield, J. F.
2011-12-01
With the goal of developing a deterministic understanding of the microbiological and geochemical processes controlling subsurface environments, groundwater bacterial communities were collected from the Rifle Integrated Field Research Challenge (IFRC) site. Biomass from three temporal acetate-stimulated groundwater samples was collected during a period of dominant Fe(III)-reduction, in a region of the aquifer that had received acetate amendment the previous year. Phylogenetic analysis revealed a diverse bacterial community, notably devoid of Archaea, with 249 taxa from 9 bacterial phyla, including dominant uncultured candidate divisions BD1-5, OD1, and OP11. We have reconstructed 86 partial to near-complete genomes and have performed a detailed characterization of the underlying metabolic potential of the ecosystem. We assessed the natural variation and redundancy in multi-heme c-type cytochromes, sulfite reductases, and central carbon metabolic pathways. Deep genomic sampling indicated the community contained various metabolic pathways: sulfur oxidation coupled to microaerophilic conditions, nitrate reduction with both acetate and inorganic compounds as donors, carbon and nitrogen fixation, antibiotic warfare, and heavy-metal detoxification. Proteomic investigations using predicted proteins from metagenomics corroborated that acetate oxidation is coupled to reduction of oxygen, sulfur, nitrogen, and iron across the samples. Of particular interest was the detection of acetate oxidizing and sulfate reducing proteins from a Desulfotalea-like bacterium in all three time points, suggesting that aqueous sulfide produced by active sulfate-reducing bacteria could contribute to abiotic iron reduction during the dominant iron reduction phase. 
Additionally, proteogenomic analysis verified that a large portion of the community, including members of the uncultivated BD1-5, are obligate fermenters, characterized by the presence of hydrogen-evolving hydrogenases, the capacity to oxidize complex organic carbon, as well as lack of membrane bound electron transport chains and an incomplete citric acid cycle. We propose that these organisms grow cryptically on residual biomass from previous biostimulation experiments and thus demonstrate that resource utilization and turnover in the aquifer can be decoupled from existing acetate amendment and external terminal electron accepting processes. In addition to the first recovery of multiple genomes from these novel candidate divisions, our community genomic approach uncovered viral diversity not yet observed at the site, with the reconstruction of six phage genomes and the presence of CRISPR loci detected in bacterial genomes from diverse lineages. These findings have implications for predictive ecosystem modeling, highlighting the importance of integrating the response, adaptation, as well as biological and geochemical feedback mechanisms existing within complex subsurface communities to long term organic carbon amendment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hug, Laura A.; Thomas, Brian C.; Sharon, Itai
Nitrogen, sulfur and carbon fluxes in the terrestrial subsurface are determined by the intersecting activities of microbial community members, yet the organisms responsible are largely unknown. Metagenomic methods can identify organisms and functions, but genome recovery is often precluded by data complexity. To address this limitation, we developed subsampling assembly methods to re-construct high-quality draft genomes from complex samples. Here, we applied these methods to evaluate the interlinked roles of the most abundant organisms in biogeochemical cycling in the aquifer sediment. Community proteomics confirmed these activities. The eight most abundant organisms belong to novel lineages, and two represent phyla with no previously sequenced genome. Four organisms are predicted to fix carbon via the Calvin Benson Bassham, Wood Ljungdahl or 3-hydroxypropionate/4-hydroxybutyrate pathways. The profiled organisms are involved in the network of denitrification, dissimilatory nitrate reduction to ammonia, ammonia oxidation and sulfate reduction/oxidation, and require substrates supplied by other community members. An ammonium-oxidizing Thaumarchaeote is the most abundant community member, despite low ammonium concentrations in the groundwater. Finally, this organism likely benefits from two other relatively abundant organisms capable of producing ammonium from nitrate, which is abundant in the groundwater. Overall, dominant members of the microbial community are interconnected through exchange of geochemical resources.
Construction of a minimal genome as a chassis for synthetic biology.
Sung, Bong Hyun; Choe, Donghui; Kim, Sun Chang; Cho, Byung-Kwan
2016-11-30
Microbial diversity and complexity pose challenges in understanding the voluminous genetic information produced from whole-genome sequences, bioinformatics and high-throughput '-omics' research. These challenges can be overcome by a core blueprint of a genome drawn with a minimal gene set, which is essential for life. Systems biology and large-scale gene inactivation studies have estimated the number of essential genes to be ∼300-500 in many microbial genomes. On the basis of the essential gene set information, minimal-genome strains have been generated using sophisticated genome engineering techniques, such as genome reduction and chemical genome synthesis. Current size-reduced genomes are not perfect minimal genomes, but chemically synthesized genomes have just been constructed. Some minimal genomes provide various desirable functions for bioindustry, such as improved genome stability, increased transformation efficacy and improved production of biomaterials. The minimal genome as a chassis genome for synthetic biology can be used to construct custom-designed genomes for various practical and industrial applications. © 2016 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Melnyk, Ryan A; Coates, John D
2015-10-26
Perchlorate is a widely distributed anion that is toxic to humans, but serves as a valuable electron acceptor for several lineages of bacteria. The ability to utilize perchlorate is conferred by a horizontally transferred piece of DNA called the perchlorate reduction genomic island (PRI). We compared genomes of perchlorate reducers using phylogenomics, SNP mapping, and differences in genomic architecture to interrogate the evolutionary history of perchlorate respiration. Here we report on the PRI of 13 genomes of perchlorate-reducing bacteria from four different classes of Phylum Proteobacteria (the Alpha-, Beta-, Gamma- and Epsilonproteobacteria). Among the different phylogenetic classes, the island varies considerably in genetic content as well as in its putative mechanism and location of integration. However, the islands of the densely sampled genera Azospira and Magnetospirillum have striking nucleotide identity despite divergent genomes, implying horizontal transfer and positive selection within narrow phylogenetic taxa. We also assess the phylogenetic origin of accessory genes in the various incarnations of the island, which can be traced to chromosomal paralogs from phylogenetically similar organisms. These observations suggest a complex phylogenetic history where the island is rarely transferred at the class level but undergoes frequent and continuous transfer within narrow phylogenetic groups. This restricted transfer is seen directly by the independent integration of near-identical islands within a genus and indirectly due to the acquisition of lineage-specific accessory genes. The genomic reversibility of perchlorate reduction may present a unique equilibrium for a metabolism that confers a competitive advantage only in the presence of an electron acceptor, which although widely distributed, is generally present at low concentrations in nature.
Haplotag: Software for Haplotype-Based Genotyping-by-Sequencing Analysis
Tinker, Nicholas A.; Bekele, Wubishet A.; Hattori, Jiro
2016-01-01
Genotyping-by-sequencing (GBS), and related methods, are based on high-throughput short-read sequencing of genomic complexity reductions followed by discovery of single nucleotide polymorphisms (SNPs) within sequence tags. This provides a powerful and economical approach to whole-genome genotyping, facilitating applications in genomics, diversity analysis, and molecular breeding. However, due to the complexity of analyzing large data sets, applications of GBS may require substantial time, expertise, and computational resources. Haplotag, the novel GBS software described here, is freely available, and operates with minimal user-investment on widely available computer platforms. Haplotag is unique in fulfilling the following set of criteria: (1) operates without a reference genome; (2) can be used in a polyploid species; (3) provides a discovery mode, and a production mode; (4) discovers polymorphisms based on a model of tag-level haplotypes within sequenced tags; (5) reports SNPs as well as haplotype-based genotypes; and (6) provides an intuitive visual “passport” for each inferred locus. Haplotag is optimized for use in a self-pollinating plant species. PMID:26818073
Efficient delivery of genome-editing proteins using bioreducible lipid nanoparticles.
Wang, Ming; Zuris, John A; Meng, Fantao; Rees, Holly; Sun, Shuo; Deng, Pu; Han, Yong; Gao, Xue; Pouli, Dimitra; Wu, Qi; Georgakoudi, Irene; Liu, David R; Xu, Qiaobing
2016-03-15
A central challenge to the development of protein-based therapeutics is the inefficiency of delivery of protein cargo across the mammalian cell membrane, including escape from endosomes. Here we report that combining bioreducible lipid nanoparticles with negatively supercharged Cre recombinase or anionic Cas9:single-guide (sg)RNA complexes drives the electrostatic assembly of nanoparticles that mediate potent protein delivery and genome editing. These bioreducible lipids efficiently deliver protein cargo into cells, facilitate the escape of protein from endosomes in response to the reductive intracellular environment, and direct protein to its intracellular target sites. The delivery of supercharged Cre protein and Cas9:sgRNA complexed with bioreducible lipids into cultured human cells enables gene recombination and genome editing with efficiencies greater than 70%. In addition, we demonstrate that these lipids are effective for functional protein delivery into mouse brain for gene recombination in vivo. Therefore, the integration of this bioreducible lipid platform with protein engineering has the potential to advance the therapeutic relevance of protein-based genome editing.
Sewell, Holly L.; Kaster, Anne-Kristin
2017-01-01
ABSTRACT The deep marine subsurface is one of the largest unexplored biospheres on Earth and is widely inhabited by members of the phylum Chloroflexi. In this report, we investigated genomes of single cells obtained from deep-sea sediments of the Peruvian Margin, which are enriched in such Chloroflexi. 16S rRNA gene sequence analysis placed two of these single-cell-derived genomes (DscP3 and Dsc4) in a clade of subphylum I Chloroflexi which were previously recovered from deep-sea sediment in the Okinawa Trough and a third (DscP2-2) as a member of the previously reported DscP2 population from Peruvian Margin site 1230. The presence of genes encoding enzymes of a complete Wood-Ljungdahl pathway, glycolysis/gluconeogenesis, a Rhodobacter nitrogen fixation (Rnf) complex, glycosyltransferases, and formate dehydrogenases in the single-cell genomes of DscP3 and Dsc4 and the presence of an NADH-dependent reduced ferredoxin:NADP oxidoreductase (Nfn) and Rnf in the genome of DscP2-2 imply a homoacetogenic lifestyle of these abundant marine Chloroflexi. We also report here the first complete pathway for anaerobic benzoate oxidation to acetyl coenzyme A (CoA) in the phylum Chloroflexi (DscP3 and Dsc4), including a class I benzoyl-CoA reductase. Of remarkable evolutionary significance, we discovered a gene encoding a formate dehydrogenase (FdnI) with reciprocal closest identity to the formate dehydrogenase-like protein (complex iron-sulfur molybdoenzyme [CISM], DET0187) of terrestrial Dehalococcoides/Dehalogenimonas spp. This formate dehydrogenase-like protein has been shown to lack formate dehydrogenase activity in Dehalococcoides/Dehalogenimonas spp. and is instead hypothesized to couple HupL hydrogenase to a reductive dehalogenase in the catabolic reductive dehalogenation pathway. This finding of a close functional homologue provides an important missing link for understanding the origin and the metabolic core of terrestrial Dehalococcoides/Dehalogenimonas spp. and of reductive dehalogenation, as well as the biology of abundant deep-sea Chloroflexi. PMID:29259088
Analysis of genetic diversity using SNP markers in oat
USDA-ARS's Scientific Manuscript database
A large-scale single nucleotide polymorphism (SNP) discovery was carried out in cultivated oat using Roche 454 sequencing methods. DNA sequences were generated from cDNAs originating from a panel of 20 diverse oat cultivars, and from Diversity Array Technology (DArT) genomic complexity reductions fr...
Hug, Laura A.; Thomas, Brian C.; Sharon, Itai; ...
2015-07-22
Nitrogen, sulfur and carbon fluxes in the terrestrial subsurface are determined by the intersecting activities of microbial community members, yet the organisms responsible are largely unknown. Metagenomic methods can identify organisms and functions, but genome recovery is often precluded by data complexity. To address this limitation, we developed subsampling assembly methods to reconstruct high-quality draft genomes from complex samples. Here, we applied these methods to evaluate the interlinked roles of the most abundant organisms in biogeochemical cycling in the aquifer sediment. Community proteomics confirmed these activities. The eight most abundant organisms belong to novel lineages, and two represent phyla with no previously sequenced genome. Four organisms are predicted to fix carbon via the Calvin-Benson-Bassham, Wood-Ljungdahl or 3-hydroxypropionate/4-hydroxybutyrate pathways. The profiled organisms are involved in the network of denitrification, dissimilatory nitrate reduction to ammonia, ammonia oxidation and sulfate reduction/oxidation, and require substrates supplied by other community members. An ammonium-oxidizing Thaumarchaeote is the most abundant community member, despite low ammonium concentrations in the groundwater. Finally, this organism likely benefits from two other relatively abundant organisms capable of producing ammonium from nitrate, which is abundant in the groundwater. Overall, dominant members of the microbial community are interconnected through exchange of geochemical resources.
Diversity arrays technology (DArT) markers in apple for genetic linkage maps.
Schouten, Henk J; van de Weg, W Eric; Carling, Jason; Khan, Sabaz Ali; McKay, Steven J; van Kaauwen, Martijn P W; Wittenberg, Alexander H J; Koehorst-van Putten, Herma J J; Noordijk, Yolanda; Gao, Zhongshan; Rees, D Jasper G; Van Dyk, Maria M; Jaccoud, Damian; Considine, Michael J; Kilian, Andrzej
2012-03-01
Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerful high-throughput method for obtaining accurate and reproducible marker data, despite the low cost per data point. This method appears to be suitable for aligning the genetic maps of different segregating populations. The standard complexity reduction method, based on the methylation-sensitive PstI restriction enzyme, resulted in a high frequency of markers, although there was 52-54% redundancy due to the repeated sampling of highly similar sequences. Sequencing of the marker clones showed that they are significantly enriched for low-copy, genic regions. The genome coverage using the standard method was 55-76%. For improved genome coverage, an alternative complexity reduction method was examined, which resulted in less redundancy and additional segregating markers. The DArT markers proved to be of high quality and were very suitable for genetic mapping at low cost for the apple, providing moderate genome coverage. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11032-011-9579-5) contains supplementary material, which is available to authorized users.
Kruse, Thomas; van de Pas, Bram A; Atteia, Ariane; Krab, Klaas; Hagen, Wilfred R; Goodwin, Lynne; Chain, Patrick; Boeren, Sjef; Maphosa, Farai; Schraa, Gosse; de Vos, Willem M; van der Oost, John; Smidt, Hauke; Stams, Alfons J M
2015-03-01
Desulfitobacterium dehalogenans is able to grow by organohalide respiration using 3-chloro-4-hydroxyphenyl acetate (Cl-OHPA) as an electron acceptor. We used a combination of genome sequencing, biochemical analysis of redox active components, and shotgun proteomics to study elements of the organohalide respiratory electron transport chain. The genome of Desulfitobacterium dehalogenans JW/IU-DC1(T) consists of a single circular chromosome of 4,321,753 bp with a GC content of 44.97%. The genome contains 4,252 genes, including six rRNA operons and six predicted reductive dehalogenases. One of the reductive dehalogenases, CprA, is encoded by a well-characterized cprTKZEBACD gene cluster. Redox active components were identified in concentrated suspensions of cells grown on formate and Cl-OHPA or formate and fumarate, using electron paramagnetic resonance (EPR), visible spectroscopy, and high-performance liquid chromatography (HPLC) analysis of membrane extracts. In cell suspensions, these components were reduced upon addition of formate and oxidized after addition of Cl-OHPA, indicating involvement in organohalide respiration. Genome analysis revealed genes that likely encode the identified components of the electron transport chain from formate to fumarate or Cl-OHPA. Data presented here suggest that the first part of the electron transport chain from formate to fumarate or Cl-OHPA is shared. Electrons are channeled from an outward-facing formate dehydrogenase via menaquinones to a fumarate reductase located at the cytoplasmic face of the membrane. When Cl-OHPA is the terminal electron acceptor, electrons are transferred from menaquinones to outward-facing CprA, via an as-yet-unidentified membrane complex, and potentially an extracellular flavoprotein acting as an electron shuttle between the quinol dehydrogenase membrane complex and CprA. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
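The GC content quoted above is the percentage of G and C bases among the sequenced bases. A minimal sketch of that calculation follows; the function name `gc_content` and the choice to skip ambiguous bases such as N are assumptions for illustration, not the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    # Toy sequence; a real chromosome would be read from a FASTA file.
    print(f"{gc_content('ATGCGCNNAT'):.2f}%")
```

Applied to the reported figures, a GC content of 44.97% on a 4,321,753 bp chromosome corresponds to about 1.94 million G or C bases.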
van de Pas, Bram A.; Atteia, Ariane; Krab, Klaas; Hagen, Wilfred R.; Goodwin, Lynne; Chain, Patrick; Boeren, Sjef; Maphosa, Farai; Schraa, Gosse; de Vos, Willem M.; van der Oost, John; Smidt, Hauke
2014-01-01
Desulfitobacterium dehalogenans is able to grow by organohalide respiration using 3-chloro-4-hydroxyphenyl acetate (Cl-OHPA) as an electron acceptor. We used a combination of genome sequencing, biochemical analysis of redox active components, and shotgun proteomics to study elements of the organohalide respiratory electron transport chain. The genome of Desulfitobacterium dehalogenans JW/IU-DC1T consists of a single circular chromosome of 4,321,753 bp with a GC content of 44.97%. The genome contains 4,252 genes, including six rRNA operons and six predicted reductive dehalogenases. One of the reductive dehalogenases, CprA, is encoded by a well-characterized cprTKZEBACD gene cluster. Redox active components were identified in concentrated suspensions of cells grown on formate and Cl-OHPA or formate and fumarate, using electron paramagnetic resonance (EPR), visible spectroscopy, and high-performance liquid chromatography (HPLC) analysis of membrane extracts. In cell suspensions, these components were reduced upon addition of formate and oxidized after addition of Cl-OHPA, indicating involvement in organohalide respiration. Genome analysis revealed genes that likely encode the identified components of the electron transport chain from formate to fumarate or Cl-OHPA. Data presented here suggest that the first part of the electron transport chain from formate to fumarate or Cl-OHPA is shared. Electrons are channeled from an outward-facing formate dehydrogenase via menaquinones to a fumarate reductase located at the cytoplasmic face of the membrane. When Cl-OHPA is the terminal electron acceptor, electrons are transferred from menaquinones to outward-facing CprA, via an as-yet-unidentified membrane complex, and potentially an extracellular flavoprotein acting as an electron shuttle between the quinol dehydrogenase membrane complex and CprA. PMID:25512312
[Advances in microbial genome reduction and modification].
Wang, Jianli; Wang, Xiaoyuan
2013-08-01
Microbial genome reduction and modification are important strategies for constructing cellular chassis used for synthetic biology. This article summarized the essential genes and the methods to identify them in microorganisms, compared various strategies for microbial genome reduction, and analyzed the characteristics of some microorganisms with the minimized genome. This review shows the important role of genome reduction in constructing cellular chassis.
Pal, Siddhartha; Kundu, Anirban; Banerjee, Tirtha Das; Mohapatra, Balaram; Roy, Ajoy; Manna, Riddha; Sar, Pinaki; Kazy, Sufia K
2017-10-01
Franconibacter pulveris strain DJ34, isolated from Duliajan oil fields, Assam, was characterized in terms of its taxonomic, metabolic and genomic properties. The bacterium showed utilization of diverse petroleum hydrocarbons and electron acceptors, metal resistance, and biosurfactant production. The genome (4,856,096 bp) of this strain contained different genes related to the degradation of various petroleum hydrocarbons, metal transport and resistance, dissimilatory nitrate, nitrite and sulfite reduction, chemotaxis, biosurfactant synthesis, etc. Genomic comparison with other Franconibacter spp. revealed a higher abundance of genes for cell motility, lipid transport and metabolism, transcription and translation in the DJ34 genome. Detailed COG analysis provides deeper insights into the genomic potential of this organism for degradation and survival in a complex, oil-contaminated habitat. This is the first report on the ecophysiology and genomic inventory of a Franconibacter sp. inhabiting a crude-oil-rich environment, which might be useful for designing bioremediation strategies for oil-contaminated environments. Copyright © 2017 Elsevier Inc. All rights reserved.
Skippington, Elizabeth; Barkman, Todd J.; Rice, Danny W.; Palmer, Jeffrey D.
2015-01-01
Despite the enormous diversity among parasitic angiosperms in form and structure, life-history strategies, and plastid genomes, little is known about the diversity of their mitogenomes. We report the sequence of the wonderfully bizarre mitogenome of the hemiparasitic aerial mistletoe Viscum scurruloideum. This genome is only 66 kb in size, making it the smallest known angiosperm mitogenome by a factor of more than three and the smallest land plant mitogenome. Accompanying this size reduction is exceptional reduction of gene content. Much of this reduction arises from the unexpected loss of respiratory complex I (NADH dehydrogenase), universally present in all 300+ other angiosperms examined, where it is encoded by nine mitochondrial and many nuclear nad genes. Loss of complex I in a multicellular organism is unprecedented. We explore the potential relationship between this loss in Viscum and its parasitic lifestyle. Despite its small size, the Viscum mitogenome is unusually rich in recombinationally active repeats, possessing unparalleled levels of predicted sublimons resulting from recombination across short repeats. Many mitochondrial gene products exhibit extraordinary levels of divergence in Viscum, indicative of highly relaxed if not positive selection. In addition, all Viscum mitochondrial protein genes have experienced a dramatic acceleration in synonymous substitution rates, consistent with the hypothesis of genomic streamlining in response to a high mutation rate but completely opposite to the pattern seen for the high-rate but enormous mitogenomes of Silene. In sum, the Viscum mitogenome possesses a unique constellation of extremely unusual features, a subset of which may be related to its parasitic lifestyle. PMID:26100885
Hsu, Jeremy L; Crawford, Jeremy Chase; Tammone, Mauro N; Ramakrishnan, Uma; Lacey, Eileen A; Hadly, Elizabeth A
2017-11-24
Marked reductions in population size can trigger corresponding declines in genetic variation. Understanding the precise genetic consequences of such reductions, however, is often challenging due to the absence of robust pre- and post-reduction datasets. Here, we use heterochronous genomic data from samples obtained before and immediately after the 2011 eruption of the Puyehue-Cordón Caulle volcanic complex in Patagonia to explore the genetic impacts of this event on two parapatric species of rodents, the colonial tuco-tuco (Ctenomys sociabilis) and the Patagonian tuco-tuco (C. haigi). Previous analyses using microsatellites revealed no post-eruption changes in genetic variation in C. haigi, but an unexpected increase in variation in C. sociabilis. To explore this outcome further, we used targeted gene capture to sequence over 2,000 putatively neutral regions for both species. Our data revealed that, contrary to the microsatellite analyses, the eruption was associated with a small but significant decrease in genetic variation in both species. We suggest that genome-level analyses provide greater power than traditional molecular markers to detect the genetic consequences of population size changes, particularly changes that are recent, short-term, or modest in size. Consequently, genomic analyses promise to generate important new insights into the effects of specific environmental events on demography and genetic variation.
Absence of Complex I Is Associated with Diminished Respiratory Chain Function in European Mistletoe.
Maclean, Andrew E; Hertle, Alexander P; Ligas, Joanna; Bock, Ralph; Balk, Janneke; Meyer, Etienne H
2018-05-21
Parasitism is a life history strategy found across all domains of life whereby nutrition is obtained from a host. It is often associated with reductive evolution of the genome, including loss of genes from the organellar genomes [1, 2]. In some unicellular parasites, the mitochondrial genome (mitogenome) has been lost entirely, with far-reaching consequences for the physiology of the organism [3, 4]. Recently, mitogenome sequences of several species of the hemiparasitic plant mistletoe (Viscum sp.) have been reported [5, 6], revealing a striking loss of genes not seen in any other multicellular eukaryotes. In particular, the nad genes encoding subunits of respiratory complex I are all absent and other protein-coding genes are also lost or highly diverged in sequence, raising the question what remains of the respiratory complexes and mitochondrial functions. Here we show that oxidative phosphorylation (OXPHOS) in European mistletoe, Viscum album, is highly diminished. Complex I activity and protein subunits of complex I could not be detected. The levels of complex IV and ATP synthase were at least 5-fold lower than in the non-parasitic model plant Arabidopsis thaliana, whereas alternative dehydrogenases and oxidases were higher in abundance. Carbon flux analysis indicates that cytosolic reactions including glycolysis are greater contributors to ATP synthesis than the mitochondrial tricarboxylic acid (TCA) cycle. Our results describe the extreme adjustments in mitochondrial functions of the first reported multicellular eukaryote without complex I. Copyright © 2018 Elsevier Ltd. All rights reserved.
Heritability of variation in glycaemic response to metformin: a genome-wide complex trait analysis
Zhou, Kaixin; Donnelly, Louise; Yang, Jian; Li, Miaoxin; Deshmukh, Harshal; Van Zuydam, Natalie; Ahlqvist, Emma; Spencer, Chris C; Groop, Leif; Morris, Andrew D; Colhoun, Helen M; Sham, Pak C; McCarthy, Mark I; Palmer, Colin N A; Pearson, Ewan R
2014-01-01
Summary Background Metformin is a first-line oral agent used in the treatment of type 2 diabetes, but glycaemic response to this drug is highly variable. Understanding the genetic contribution to metformin response might increase the possibility of personalising metformin treatment. We aimed to establish the heritability of glycaemic response to metformin using the genome-wide complex trait analysis (GCTA) method. Methods In this GCTA study, we obtained data about HbA1c concentrations before and during metformin treatment from patients in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) study, which includes a cohort of patients with type 2 diabetes and is linked to comprehensive clinical databases and genome-wide association study data. We applied the GCTA method to estimate heritability for four definitions of glycaemic response to metformin: absolute reduction in HbA1c; proportional reduction in HbA1c; adjusted reduction in HbA1c; and whether or not the target on-treatment HbA1c of less than 7% (53 mmol/mol) was achieved, with adjustment for baseline HbA1c and known clinical covariates. Chromosome-wise heritability estimation was used to obtain further information about the genetic architecture. Findings 5386 individuals were included in the final dataset, of whom 2085 had enough clinical data to define glycaemic response to metformin. The heritability of glycaemic response to metformin varied by response phenotype, with a heritability of 34% (95% CI 1–68; p=0·022) for the absolute reduction in HbA1c, adjusted for pretreatment HbA1c. Chromosome-wise heritability estimates suggest that the genetic contribution is probably from individual variants scattered across the genome, which each have a small to moderate effect, rather than from a few loci that each have a large effect. Interpretation Glycaemic response to metformin is heritable, thus glycaemic response to metformin is, in part, intrinsic to individual biological variation. 
Further genetic analysis might enable us to make better predictions for stratified medicine and to unravel new mechanisms of metformin action. Funding Wellcome Trust. PMID:24731673
Heritability of variation in glycaemic response to metformin: a genome-wide complex trait analysis.
Zhou, Kaixin; Donnelly, Louise; Yang, Jian; Li, Miaoxin; Deshmukh, Harshal; Van Zuydam, Natalie; Ahlqvist, Emma; Spencer, Chris C; Groop, Leif; Morris, Andrew D; Colhoun, Helen M; Sham, Pak C; McCarthy, Mark I; Palmer, Colin N A; Pearson, Ewan R
2014-06-01
Metformin is a first-line oral agent used in the treatment of type 2 diabetes, but glycaemic response to this drug is highly variable. Understanding the genetic contribution to metformin response might increase the possibility of personalising metformin treatment. We aimed to establish the heritability of glycaemic response to metformin using the genome-wide complex trait analysis (GCTA) method. In this GCTA study, we obtained data about HbA1c concentrations before and during metformin treatment from patients in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) study, which includes a cohort of patients with type 2 diabetes and is linked to comprehensive clinical databases and genome-wide association study data. We applied the GCTA method to estimate heritability for four definitions of glycaemic response to metformin: absolute reduction in HbA1c; proportional reduction in HbA1c; adjusted reduction in HbA1c; and whether or not the target on-treatment HbA1c of less than 7% (53 mmol/mol) was achieved, with adjustment for baseline HbA1c and known clinical covariates. Chromosome-wise heritability estimation was used to obtain further information about the genetic architecture. 5386 individuals were included in the final dataset, of whom 2085 had enough clinical data to define glycaemic response to metformin. The heritability of glycaemic response to metformin varied by response phenotype, with a heritability of 34% (95% CI 1-68; p=0·022) for the absolute reduction in HbA1c, adjusted for pretreatment HbA1c. Chromosome-wise heritability estimates suggest that the genetic contribution is probably from individual variants scattered across the genome, which each have a small to moderate effect, rather than from a few loci that each have a large effect. Glycaemic response to metformin is heritable, thus glycaemic response to metformin is, in part, intrinsic to individual biological variation. 
Further genetic analysis might enable us to make better predictions for stratified medicine and to unravel new mechanisms of metformin action. Funding: Wellcome Trust. Copyright © 2014 Elsevier Ltd. All rights reserved.
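Three of the four response definitions named in the abstract can be written down directly; the sketch below is illustrative only. The function names and the patient values are hypothetical, the "adjusted reduction" is omitted because the abstract does not define it, and the unit conversion uses the standard NGSP-to-IFCC master equation rather than anything stated by the authors.

```python
def ngsp_to_ifcc(hba1c_percent: float) -> float:
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return 10.929 * (hba1c_percent - 2.15)


def absolute_reduction(baseline: float, on_treatment: float) -> float:
    """Absolute fall in HbA1c, in the same units as the inputs."""
    return baseline - on_treatment


def proportional_reduction(baseline: float, on_treatment: float) -> float:
    """Fall in HbA1c as a fraction of the pretreatment value."""
    return (baseline - on_treatment) / baseline


def target_achieved(on_treatment: float, target: float = 7.0) -> bool:
    """True if on-treatment HbA1c (%) is below the target of 7%."""
    return on_treatment < target


if __name__ == "__main__":
    baseline, on_treatment = 8.5, 6.8  # hypothetical patient, HbA1c in %
    print(absolute_reduction(baseline, on_treatment))
    print(proportional_reduction(baseline, on_treatment))
    print(target_achieved(on_treatment))
    print(round(ngsp_to_ifcc(7.0)))  # the 7% target expressed in mmol/mol
```

The conversion shows why the abstract pairs 7% with 53 mmol/mol: 10.929 × (7.0 − 2.15) is approximately 53.0.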
Thorup, Casper; Schramm, Andreas; Findlay, Alyssa J; Finster, Kai W; Schreiber, Lars
2017-07-18
This study demonstrates that the deltaproteobacterium Desulfurivibrio alkaliphilus can grow chemolithotrophically by coupling sulfide oxidation to the dissimilatory reduction of nitrate and nitrite to ammonium. Key genes of known sulfide oxidation pathways are absent from the genome of D. alkaliphilus. Instead, the genome contains all of the genes necessary for sulfate reduction, including a gene for a reductive-type dissimilatory bisulfite reductase (DSR). Despite this, growth by sulfate reduction was not observed. Transcriptomic analysis revealed a very high expression level of sulfate-reduction genes during growth by sulfide oxidation, while inhibition experiments with molybdate pointed to elemental sulfur/polysulfides as intermediates. Consequently, we propose that D. alkaliphilus initially oxidizes sulfide to elemental sulfur, which is then either disproportionated, or oxidized by a reversal of the sulfate reduction pathway. This is the first study providing evidence that a reductive-type DSR is involved in a sulfide oxidation pathway. Transcriptome sequencing further suggests that nitrate reduction to ammonium is performed by a novel type of periplasmic nitrate reductase and an unusual membrane-anchored nitrite reductase. IMPORTANCE Sulfide oxidation and sulfate reduction, the two major branches of the sulfur cycle, are usually ascribed to distinct sets of microbes with distinct diagnostic genes. Here we show a more complex picture, as D. alkaliphilus, with the genomic setup of a sulfate reducer, grows by sulfide oxidation. The high expression of genes typically involved in the sulfate reduction pathway suggests that these genes, including the reductive-type dissimilatory bisulfite reductases, are also involved in as-yet-unresolved sulfide oxidation pathways. Finally, D. alkaliphilus is closely related to cable bacteria, which grow by electrogenic sulfide oxidation. Since there are no pure cultures of cable bacteria, D. alkaliphilus may represent an exciting model organism in which to study the physiology of this process. Copyright © 2017 Thorup et al.
Nucleolin promotes in vitro translation of feline calicivirus genomic RNA.
Hernández, Beatriz Alvarado; Sandoval-Jaime, Carlos; Sosnovtsev, Stanislav V; Green, Kim Y; Gutiérrez-Escolano, Ana Lorena
2016-02-01
Feline calicivirus depends on host-cell proteins for its replication. We previously showed that knockdown of nucleolin (NCL), a phosphoprotein involved in ribosome biogenesis, resulted in the reduction of FCV protein synthesis and virus yield. Here, we found that NCL may not be involved in FCV binding and entry into cells, but it binds to both ends of the FCV genomic RNA, and stimulates its translation in vitro. AGRO100, an aptamer that specifically binds and inactivates NCL, caused a strong reduction in FCV protein synthesis. This effect could be reversed by the addition of full-length NCL but not by a ΔrNCL, lacking the N-terminal domain. Consistent with this, FCV infection of CrFK cells stably expressing ΔrNCL led to a reduction in virus protein translation. These results suggest that NCL is part of the FCV RNA translational complex, and that the N-terminal part of the protein is required for efficient FCV replication. Copyright © 2015 Elsevier Inc. All rights reserved.
Genome-Enabled Molecular Tools for Reductive Dehalogenation
2011-11-01
Genome-Enabled Molecular Tools for Reductive Dehalogenation - A Shift in Paradigm for Bioremediation. Alfred M. Spormann, Departments of Chemical... Technical Session No. 3D, C-77. Professor Alfred Spormann, Stanford
Cost-effective cloud computing: a case study using the comparative genomics tool, roundup.
Kudtarkar, Parul; Deluca, Todd F; Fusaro, Vincent A; Tonellato, Peter J; Wall, Dennis P
2010-12-22
Comparative genomics resources, such as ortholog detection tools and repositories, are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource, Roundup, using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon's Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon's computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure.
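Why job order matters for cost can be sketched with a simple greedy scheduler. This is an illustration under stated assumptions, not the authors' model: the runtime estimator `estimate_hours` (proportional to the product of genome sizes, with an arbitrary constant `k`) and the longest-first ordering in `schedule` are hypothetical stand-ins for the runtime model and job ordering described above.

```python
import heapq
from typing import List, Sequence, Tuple

Job = Tuple[str, str, float]  # (genome_a, genome_b, estimated_hours)


def estimate_hours(size_a_mb: float, size_b_mb: float, k: float = 0.01) -> float:
    """Hypothetical runtime model: proportional to the product of genome sizes."""
    return k * size_a_mb * size_b_mb


def schedule(jobs: Sequence[Job], n_nodes: int, longest_first: bool = True) -> float:
    """Assign each job to the least-loaded node; return the makespan in hours."""
    if longest_first:
        ordered: List[Job] = sorted(jobs, key=lambda job: job[2], reverse=True)
    else:
        ordered = list(jobs)  # keep the submission order as given
    loads = [0.0] * n_nodes
    heapq.heapify(loads)
    for _, _, hours in ordered:
        lightest = heapq.heappop(loads)          # node that frees up first
        heapq.heappush(loads, lightest + hours)  # give it the next job
    return max(loads)


if __name__ == "__main__":
    # Hypothetical comparisons: four short jobs followed by one long job.
    jobs = [("g1", "g2", 1.0), ("g1", "g3", 1.0), ("g2", "g3", 1.0),
            ("g1", "g4", 1.0), ("g4", "g5", 4.0)]
    print(schedule(jobs, n_nodes=2, longest_first=False))
    print(schedule(jobs, n_nodes=2, longest_first=True))
```

When a long job is submitted last, one node runs it alone while the other sits idle but is still billed; submitting the longest estimated jobs first keeps the nodes finishing at about the same time, which is the kind of waste an ordering based on predicted runtime is meant to avoid.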
Intelligibility in microbial complex systems: Wittgenstein and the score of life
Baquero, Fernando; Moya, Andrés
2012-01-01
Knowledge in microbiology is reaching an extreme level of diversification and complexity, which paradoxically results in a strong reduction in the intelligibility of microbial life. In our days, the “score of life” metaphor is more accurate to express the complexity of living systems than the classic “book of life.” Music and life can be represented at lower hierarchical levels by music scores and genomic sequences, and such representations have a generational influence in the reproduction of music and life. If music can be considered as a representation of life, such representation remains as unthinkable as life itself. The analysis of scores and genomic sequences might provide mechanistic, phylogenetic, and evolutionary insights into music and life, but not about their real dynamics and nature, which is still maintained unthinkable, as was proposed by Wittgenstein. As complex systems, life or music is composed by thinkable and only showable parts, and a strategy of half-thinking, half-seeing is needed to expand knowledge. Complex models for complex systems, based on experiences on trans-hierarchical integrations, should be developed in order to provide a mixture of legibility and imageability of biological processes, which should lead to higher levels of intelligibility of microbial life. PMID:22919679
Silva, Lindsey; Oh, Hyung Suk; Chang, Lynne; Yan, Zhipeng; Triezenberg, Steven J.; Knipe, David M.
2012-01-01
ABSTRACT Little is known about the mechanisms of gene targeting within the nucleus and its effect on gene expression, but most studies have concluded that genes located near the nuclear periphery are silenced by heterochromatin. In contrast, we found that early herpes simplex virus (HSV) genome complexes localize near the nuclear lamina and that this localization is associated with reduced heterochromatin on the viral genome and increased viral immediate-early (IE) gene transcription. In this study, we examined the mechanism of this effect and found that input virion transactivator protein, virion protein 16 (VP16), targets sites adjacent to the nuclear lamina and is required for targeting of the HSV genome to the nuclear lamina, exclusion of heterochromatin from viral replication compartments, and reduction of heterochromatin on the viral genome. Because cells infected with the VP16 mutant virus in1814 showed a phenotype similar to that of lamin A/C−/− cells infected with wild-type virus, we hypothesized that the nuclear lamina is required for VP16 activator complex formation. In lamin A/C−/− mouse embryo fibroblasts, VP16 and Oct-1 showed reduced association with the viral IE gene promoters, the levels of VP16 and HCF-1 stably associated with the nucleus were lower than in wild-type cells, and the association of VP16 with HCF-1 was also greatly reduced. These results show that the nuclear lamina is required for stable nuclear localization and formation of the VP16 activator complex and provide evidence for the nuclear lamina being the site of assembly of the VP16 activator complex. PMID:22251972
Systems Biology Perspectives on Minimal and Simpler Cells
Xavier, Joana C.; Patil, Kiran Raosaheb
2014-01-01
SUMMARY The concept of the minimal cell has fascinated scientists for a long time, from both fundamental and applied points of view. This broad concept encompasses extreme reductions of genomes, the last universal common ancestor (LUCA), the creation of semiartificial cells, and the design of protocells and chassis cells. Here we review these different areas of research and identify common and complementary aspects of each one. We focus on systems biology, a discipline that is greatly facilitating the classical top-down and bottom-up approaches toward minimal cells. In addition, we also review the so-called middle-out approach and its contributions to the field with mathematical and computational models. Owing to the advances in genomics technologies, much of the work in this area has been centered on minimal genomes, or rather minimal gene sets, required to sustain life. Nevertheless, a fundamental expansion has been taking place in the last few years wherein the minimal gene set is viewed as a backbone of a more complex system. Complementing genomics, progress is being made in understanding the system-wide properties at the levels of the transcriptome, proteome, and metabolome. Network modeling approaches are enabling the integration of these different omics data sets toward an understanding of the complex molecular pathways connecting genotype to phenotype. We review key concepts central to the mapping and modeling of this complexity, which is at the heart of research on minimal cells. Finally, we discuss the distinction between minimizing the number of cellular components and minimizing cellular complexity, toward an improved understanding and utilization of minimal and simpler cells. PMID:25184563
Coordinated Changes in Mutation and Growth Rates Induced by Genome Reduction
Nishimura, Issei; Kurokawa, Masaomi; Liu, Liu
2017-01-01
ABSTRACT Genome size is determined during evolution, but it can also be altered by genetic engineering in laboratories. The systematic characterization of reduced genomes provides valuable insights into the cellular properties that are quantitatively described by the global parameters related to the dynamics of growth and mutation. In the present study, we analyzed a small collection of W3110 Escherichia coli derivatives containing either the wild-type genome or reduced genomes of various lengths to examine whether the mutation rate, a global parameter representing genomic plasticity, was affected by genome reduction. We found that the mutation rates of these cells increased with genome reduction. The correlation between genome length and mutation rate, which has been reported for the evolution of bacteria, was also identified, intriguingly, for genome reduction. Gene function enrichment analysis indicated that the deletion of many of the genes encoding membrane and transport proteins play a role in the mutation rate changes mediated by genome reduction. Furthermore, the increase in the mutation rate with genome reduction was highly associated with a decrease in the growth rate in a nutrition-dependent manner; thus, poorer media showed a larger change that was of higher significance. This negative correlation was strongly supported by experimental evidence that the serial transfer of the reduced genome improved the growth rate and reduced the mutation rate to a large extent. Taken together, the global parameters corresponding to the genome, growth, and mutation showed a coordinated relationship, which might be an essential working principle for balancing the cellular dynamics appropriate to the environment. PMID:28679744
Kent, Jack W
2016-02-03
New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing. The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought reduction of multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge. Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data. The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.
Microdiversification in genome-streamlined ubiquitous freshwater Actinobacteria.
Neuenschwander, Stefan M; Ghai, Rohit; Pernthaler, Jakob; Salcher, Michaela M
2018-01-01
Actinobacteria of the acI lineage are the most abundant microbes in freshwater systems, but there are so far no pure living cultures of these organisms, possibly because of metabolic dependencies on other microbes. This, in turn, has hampered an in-depth assessment of the genomic basis for their success in the environment. Here we present genomes from 16 axenic cultures of acI Actinobacteria. The isolates were not only of minute cell size, but also among the most streamlined free-living microbes, with extremely small genome sizes (1.2-1.4 Mbp) and low genomic GC content. Genome reduction in these bacteria might have led to auxotrophy for various vitamins, amino acids and reduced sulphur sources, thus creating dependencies to co-occurring organisms (the 'Black Queen' hypothesis). Genome analyses, moreover, revealed a surprising degree of inter- and intraspecific diversity in metabolic pathways, especially of carbohydrate transport and metabolism, and mainly encoded in genomic islands. The striking genotype microdiversification of acI Actinobacteria might explain their global success in highly dynamic freshwater environments with complex seasonal patterns of allochthonous and autochthonous carbon sources. We propose a new order within Actinobacteria ('Candidatus Nanopelagicales') with two new genera ('Candidatus Nanopelagicus' and 'Candidatus Planktophila') and nine new species.
Visualization of Genome Diversity in German Shepherd Dogs.
Mortlock, Sally-Anne; Booth, Rachel; Mazrier, Hamutal; Khatkar, Mehar S; Williamson, Peter
2015-01-01
A loss of genetic diversity may lead to increased disease risks in subpopulations of dogs. The canine breed structure has contributed to relatively small effective population size in many breeds and can limit the options for selective breeding strategies to maintain diversity. With the completion of the canine genome sequencing project, and the subsequent reduction in the cost of genotyping on a genomic scale, evaluating diversity in dogs has become much more accurate and accessible. This provides a potential tool for advising dog breeders and developing breeding programs within a breed. A challenge in doing this is to present complex relationship data in a form that can be readily utilized. Here, we demonstrate the use of a pipeline, known as NetView, to visualize the network of relationships in a subpopulation of German Shepherd Dogs.
Rapid cloning of genes in hexaploid wheat using cultivar-specific long-range chromosome assembly.
Thind, Anupriya Kaur; Wicker, Thomas; Šimková, Hana; Fossati, Dario; Moullet, Odile; Brabant, Cécile; Vrána, Jan; Doležel, Jaroslav; Krattinger, Simon G
2017-08-01
Cereal crops such as wheat and maize have large repeat-rich genomes that make cloning of individual genes challenging. Moreover, gene order and gene sequences often differ substantially between cultivars of the same crop species. A major bottleneck for gene cloning in cereals is the generation of high-quality sequence information from a cultivar of interest. In order to accelerate gene cloning from any cropping line, we report 'targeted chromosome-based cloning via long-range assembly' (TACCA). TACCA combines lossless genome-complexity reduction via chromosome flow sorting with Chicago long-range linkage to assemble complex genomes. We applied TACCA to produce a high-quality (N50 of 9.76 Mb) de novo chromosome assembly of the wheat line CH Campala Lr22a in only 4 months. Using this assembly we cloned the broad-spectrum Lr22a leaf-rust resistance gene, using molecular marker information and ethyl methanesulfonate (EMS) mutants, and found that Lr22a encodes an intracellular immune receptor homologous to the Arabidopsis thaliana RPM1 protein.
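The N50 statistic quoted in the abstract above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths in the usage note are illustrative assumptions, not values from the TACCA assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half
        # of the total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("no lengths given")
```

For example, hypothetical lengths of 2, 3, 4, 5 and 6 (total 20) give an N50 of 5, because the two longest sequences (6 + 5 = 11) already cover half of the total.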
Directed combinatorial mutagenesis of Escherichia coli for complex phenotype engineering
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Rongming; Liang, Liya; Garst, Andrew D.
Strain engineering for industrial production requires a targeted improvement of multiple complex traits, which range from pathway flux to tolerance to mixed sugar utilization. Here, we report the use of an iterative CRISPR EnAbled Trackable genome Engineering (iCREATE) method to engineer rapid glucose and xylose co-consumption and tolerance to hydrolysate inhibitors in E. coli. Deep mutagenesis libraries were rationally designed, constructed, and screened to target ~40,000 mutations across 30 genes. These libraries included global and high-level regulators that regulate global gene expression, transcription factors that play important roles in genome-level transcription, enzymes that function in the sugar transport system, NAD(P)H metabolism, and the aldehyde reduction system. Specific mutants that conferred increased growth in mixed sugars and hydrolysate tolerance conditions were isolated, confirmed, and evaluated for changes in genome-wide expression levels. As a result, we tested the strain with positive combinatorial mutations for 3-hydroxypropionic acid (3HP) production under high furfural and high acetate hydrolysate fermentation, which demonstrated a 7- and 8-fold increase in 3HP productivity relative to the parent strain, respectively.
Directed combinatorial mutagenesis of Escherichia coli for complex phenotype engineering
Liu, Rongming; Liang, Liya; Garst, Andrew D.; ...
2018-03-29
Strain engineering for industrial production requires a targeted improvement of multiple complex traits, which range from pathway flux to tolerance to mixed sugar utilization. Here, we report the use of an iterative CRISPR EnAbled Trackable genome Engineering (iCREATE) method to engineer rapid glucose and xylose co-consumption and tolerance to hydrolysate inhibitors in E. coli. Deep mutagenesis libraries were rationally designed, constructed, and screened to target ~40,000 mutations across 30 genes. These libraries included global and high-level regulators that regulate global gene expression, transcription factors that play important roles in genome-level transcription, enzymes that function in the sugar transport system, NAD(P)H metabolism, and the aldehyde reduction system. Specific mutants that conferred increased growth in mixed sugars and hydrolysate tolerance conditions were isolated, confirmed, and evaluated for changes in genome-wide expression levels. As a result, we tested the strain with positive combinatorial mutations for 3-hydroxypropionic acid (3HP) production under high furfural and high acetate hydrolysate fermentation, which demonstrated a 7- and 8-fold increase in 3HP productivity relative to the parent strain, respectively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, Jong -Jin; Yoo, Chang Geun; Flanagan, Amy
The development of genome editing technologies offers new prospects in improving bioenergy crops like switchgrass (Panicum virgatum). Switchgrass is an outcrossing species with an allotetraploid genome (2n = 4x = 36), a complexity which forms an impediment to generating homozygous knock-out plants. Lignin, a major component of the plant cell wall and a contributor to cellulosic feedstock’s recalcitrance to decomposition, stands as a barrier to efficient biofuel production by limiting enzyme access to cell wall polymers during the fermentation process.
Park, Jong -Jin; Yoo, Chang Geun; Flanagan, Amy; ...
2017-11-30
The development of genome editing technologies offers new prospects in improving bioenergy crops like switchgrass (Panicum virgatum). Switchgrass is an outcrossing species with an allotetraploid genome (2n = 4x = 36), a complexity which forms an impediment to generating homozygous knock-out plants. Lignin, a major component of the plant cell wall and a contributor to cellulosic feedstock’s recalcitrance to decomposition, stands as a barrier to efficient biofuel production by limiting enzyme access to cell wall polymers during the fermentation process.
Suen, Garret; Holt, Carson; Abouheif, Ehab; Bornberg-Bauer, Erich; Bouffard, Pascal; Caldera, Eric J.; Cash, Elizabeth; Cavanaugh, Amy; Denas, Olgert; Elhaik, Eran; Favé, Marie-Julie; Gadau, Jürgen; Gibson, Joshua D.; Graur, Dan; Grubbs, Kirk J.; Hagen, Darren E.; Harkins, Timothy T.; Helmkampf, Martin; Hu, Hao; Johnson, Brian R.; Kim, Jay; Marsh, Sarah E.; Moeller, Joseph A.; Muñoz-Torres, Mónica C.; Murphy, Marguerite C.; Naughton, Meredith C.; Nigam, Surabhi; Overson, Rick; Rajakumar, Rajendhran; Reese, Justin T.; Scott, Jarrod J.; Smith, Chris R.; Tao, Shu; Tsutsui, Neil D.; Viljakainen, Lumi; Wissler, Lothar; Yandell, Mark D.; Zimmer, Fabian; Taylor, James; Slater, Steven C.; Clifton, Sandra W.; Warren, Wesley C.; Elsik, Christine G.; Smith, Christopher D.; Weinstock, George M.; Gerardo, Nicole M.; Currie, Cameron R.
2011-01-01
Leaf-cutter ants are one of the most important herbivorous insects in the Neotropics, harvesting vast quantities of fresh leaf material. The ants use leaves to cultivate a fungus that serves as the colony's primary food source. This obligate ant-fungus mutualism is one of the few occurrences of farming by non-humans and likely facilitated the formation of their massive colonies. Mature leaf-cutter ant colonies contain millions of workers ranging in size from small garden tenders to large soldiers, resulting in one of the most complex polymorphic caste systems within ants. To begin uncovering the genomic underpinnings of this system, we sequenced the genome of Atta cephalotes using 454 pyrosequencing. One prediction from this ant's lifestyle is that it has undergone genetic modifications that reflect its obligate dependence on the fungus for nutrients. Analysis of this genome sequence is consistent with this hypothesis, as we find evidence for reductions in genes related to nutrient acquisition. These include extensive reductions in serine proteases (which are likely unnecessary because proteolysis is not a primary mechanism used to process nutrients obtained from the fungus), a loss of genes involved in arginine biosynthesis (suggesting that this amino acid is obtained from the fungus), and the absence of a hexamerin (which sequesters amino acids during larval development in other insects). Following recent reports of genome sequences from other insects that engage in symbioses with beneficial microbes, the A. cephalotes genome provides new insights into the symbiotic lifestyle of this ant and advances our understanding of host–microbe symbioses. PMID:21347285
Jarvis, Erich D
2016-01-01
The rapid pace of advances in genome technology, with concomitant reductions in cost, makes it feasible that one day in our lifetime we will have available extant genomes of entire classes of species, including vertebrates. I recently helped cocoordinate the large-scale Avian Phylogenomics Project, which collected and sequenced genomes of 48 bird species representing most currently classified orders to address a range of questions in phylogenomics and comparative genomics. The consortium was able to answer questions not previously possible with just a few genomes. This success spurred on the creation of a project to sequence the genomes of at least one individual of all extant ∼10,500 bird species. The initiation of this project has led us to consider what questions now impossible to answer could be answered with all genomes, and could drive new questions now unimaginable. These include the generation of a highly resolved family tree of extant species, genome-wide association studies across species to identify genetic substrates of many complex traits, redefinition of species and the species concept, reconstruction of the genomes of common ancestors, and generation of new computational tools to address these questions. Here I present visions for the future by posing and answering questions regarding what scientists could potentially do with available genomes of an entire vertebrate class.
Systems biology perspectives on minimal and simpler cells.
Xavier, Joana C; Patil, Kiran Raosaheb; Rocha, Isabel
2014-09-01
The concept of the minimal cell has fascinated scientists for a long time, from both fundamental and applied points of view. This broad concept encompasses extreme reductions of genomes, the last universal common ancestor (LUCA), the creation of semiartificial cells, and the design of protocells and chassis cells. Here we review these different areas of research and identify common and complementary aspects of each one. We focus on systems biology, a discipline that is greatly facilitating the classical top-down and bottom-up approaches toward minimal cells. In addition, we also review the so-called middle-out approach and its contributions to the field with mathematical and computational models. Owing to the advances in genomics technologies, much of the work in this area has been centered on minimal genomes, or rather minimal gene sets, required to sustain life. Nevertheless, a fundamental expansion has been taking place in the last few years wherein the minimal gene set is viewed as a backbone of a more complex system. Complementing genomics, progress is being made in understanding the system-wide properties at the levels of the transcriptome, proteome, and metabolome. Network modeling approaches are enabling the integration of these different omics data sets toward an understanding of the complex molecular pathways connecting genotype to phenotype. We review key concepts central to the mapping and modeling of this complexity, which is at the heart of research on minimal cells. Finally, we discuss the distinction between minimizing the number of cellular components and minimizing cellular complexity, toward an improved understanding and utilization of minimal and simpler cells. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Schmid, Michael; Muri, Jonathan; Melidis, Damianos; Varadarajan, Adithi R; Somerville, Vincent; Wicki, Adrian; Moser, Aline; Bourqui, Marc; Wenzel, Claudia; Eugster-Meier, Elisabeth; Frey, Juerg E; Irmler, Stefan; Ahrens, Christian H
2018-01-01
Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences' long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus, to our knowledge, identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories "cell wall/membrane biogenesis" and "defense mechanisms" were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes.
Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level.
Schmid, Michael; Muri, Jonathan; Melidis, Damianos; Varadarajan, Adithi R.; Somerville, Vincent; Wicki, Adrian; Moser, Aline; Bourqui, Marc; Wenzel, Claudia; Eugster-Meier, Elisabeth; Frey, Juerg E.; Irmler, Stefan; Ahrens, Christian H.
2018-01-01
Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences' long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus—to our knowledge—identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories “cell wall/membrane biogenesis” and “defense mechanisms” were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes. 
Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level. PMID:29441050
The connection between BRG1, CTCF and topoisomerases at TAD boundaries.
Barutcu, A Rasim; Lian, Jane B; Stein, Janet L; Stein, Gary S; Imbalzano, Anthony N
2017-03-04
The eukaryotic genome is partitioned into topologically associating domains (TADs). Despite recent advances characterizing TADs and TAD boundaries, the organization of these structures is an important dimension of genome architecture and function that is not well understood. Recently, we demonstrated that knockdown of BRG1, an ATPase driving the chromatin remodeling activity of mammalian SWI/SNF enzymes, globally alters long-range genomic interactions and results in a reduction of TAD boundary strength. We provided evidence suggesting that this effect may be due to BRG1 affecting nucleosome occupancy around CTCF sites present at TAD boundaries. In this review, we elaborate on our findings and speculate that BRG1 may contribute to the regulation of the structural and functional properties of chromatin at TAD boundaries by affecting the function or the recruitment of CTCF and DNA topoisomerase complexes.
Cost-Effective Cloud Computing: A Case Study Using the Comparative Genomics Tool, Roundup
Kudtarkar, Parul; DeLuca, Todd F.; Fusaro, Vincent A.; Tonellato, Peter J.; Wall, Dennis P.
2010-01-01
Background Comparative genomics resources, such as ortholog detection tools and repositories are rapidly increasing in scale and complexity. Cloud computing is an emerging technological paradigm that enables researchers to dynamically build a dedicated virtual cluster and may represent a valuable alternative for large computational tools in bioinformatics. In the present manuscript, we optimize the computation of a large-scale comparative genomics resource—Roundup—using cloud computing, describe the proper operating principles required to achieve computational efficiency on the cloud, and detail important procedures for improving cost-effectiveness to ensure maximal computation at minimal costs. Methods Utilizing the comparative genomics tool, Roundup, as a case study, we computed orthologs among 902 fully sequenced genomes on Amazon’s Elastic Compute Cloud. For managing the ortholog processes, we designed a strategy to deploy the web service, Elastic MapReduce, and maximize the use of the cloud while simultaneously minimizing costs. Specifically, we created a model to estimate cloud runtime based on the size and complexity of the genomes being compared that determines in advance the optimal order of the jobs to be submitted. Results We computed orthologous relationships for 245,323 genome-to-genome comparisons on Amazon’s computing cloud, a computation that required just over 200 hours and cost $8,000 USD, at least 40% less than expected under a strategy in which genome comparisons were submitted to the cloud randomly with respect to runtime. Our cost savings projections were based on a model that not only demonstrates the optimal strategy for deploying RSD to the cloud, but also finds the optimal cluster size to minimize waste and maximize usage. 
Our cost-reduction model is readily adaptable for other comparative genomics tools and potentially of significant benefit to labs seeking to take advantage of the cloud as an alternative to local computing infrastructure. PMID:21258651
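The abstract above says that jobs were ordered in advance using estimated runtimes, but it does not state the ordering rule. As a rough illustration only, the sketch below assumes a greedy "longest estimated job first" assignment to the least-loaded node; the function name, the job names and the hour values are hypothetical and are not taken from Roundup.

```python
import heapq

def schedule_longest_first(estimated_hours, n_nodes):
    """Greedy assignment: longest estimated job goes to the least-loaded node.

    estimated_hours maps a job name to its predicted runtime in hours.
    Returns (assignment, makespan), where assignment maps node index to a
    list of job names and makespan is the load of the busiest node.
    """
    # Min-heap of (accumulated hours, node index) so the least-loaded
    # node is always popped first.
    nodes = [(0.0, i) for i in range(n_nodes)]
    heapq.heapify(nodes)
    assignment = {i: [] for i in range(n_nodes)}
    longest_first = sorted(estimated_hours.items(),
                           key=lambda item: item[1], reverse=True)
    for job, hours in longest_first:
        load, node = heapq.heappop(nodes)
        assignment[node].append(job)
        heapq.heappush(nodes, (load + hours, node))
    makespan = max(load for load, _ in nodes)
    return assignment, makespan
```

Placing long jobs first tends to leave only short jobs for the end of the run, which reduces the time nodes sit idle while the last comparison finishes; on a cloud billed per node-hour, that idle time is the waste the abstract refers to.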
Coordinated Changes in Mutation and Growth Rates Induced by Genome Reduction.
Nishimura, Issei; Kurokawa, Masaomi; Liu, Liu; Ying, Bei-Wen
2017-07-05
Genome size is determined during evolution, but it can also be altered by genetic engineering in laboratories. The systematic characterization of reduced genomes provides valuable insights into the cellular properties that are quantitatively described by the global parameters related to the dynamics of growth and mutation. In the present study, we analyzed a small collection of W3110 Escherichia coli derivatives containing either the wild-type genome or reduced genomes of various lengths to examine whether the mutation rate, a global parameter representing genomic plasticity, was affected by genome reduction. We found that the mutation rates of these cells increased with genome reduction. The correlation between genome length and mutation rate, which has been reported for the evolution of bacteria, was also identified, intriguingly, for genome reduction. Gene function enrichment analysis indicated that the deletion of many of the genes encoding membrane and transport proteins play a role in the mutation rate changes mediated by genome reduction. Furthermore, the increase in the mutation rate with genome reduction was highly associated with a decrease in the growth rate in a nutrition-dependent manner; thus, poorer media showed a larger change that was of higher significance. This negative correlation was strongly supported by experimental evidence that the serial transfer of the reduced genome improved the growth rate and reduced the mutation rate to a large extent. Taken together, the global parameters corresponding to the genome, growth, and mutation showed a coordinated relationship, which might be an essential working principle for balancing the cellular dynamics appropriate to the environment. IMPORTANCE Genome reduction is a powerful approach for investigating the fundamental rules for living systems. 
Whether genetically disturbed genomes have any specific properties that are different from or similar to those of natively evolved genomes has been under investigation. In the present study, we found that Escherichia coli cells with reduced genomes showed accelerated nucleotide substitution errors (mutation rates), although these cells retained the normal DNA mismatch repair systems. Intriguingly, this finding of correlation between reduced genome size and a higher mutation rate was consistent with the reported evolution of mutation rates. Furthermore, the increased mutation rate was quantitatively associated with a decreased growth rate, indicating that the global parameters related to the genome, growth, and mutation, which represent the amount of genetic information, the efficiency of propagation, and the fidelity of replication, respectively, are dynamically coordinated. Copyright © 2017 Nishimura et al.
Mining sequence variations in representative polyploid sugarcane germplasm accessions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiping; Song, Jian; You, Qian
Sugarcane (Saccharum spp.) is one of the most important economic crops because of its high sugar production and biofuel potential. Due to the high polyploid level and complex genome of sugarcane, it has been a huge challenge to investigate genomic sequence variations, which are critical for identifying alleles contributing to important agronomic traits. In order to mine the genetic variations in sugarcane, genotyping by sequencing (GBS), was used to genotype 14 representative Saccharum complex accessions. GBS is a method to generate a large number of markers, enabled by next generation sequencing (NGS) and the genome complexity reduction using restriction enzymes. To use GBS for high throughput genotyping highly polyploid sugarcane, the GBS analysis pipelines in 14 Saccharum complex accessions were established by evaluating different alignment methods, sequence variants callers, and sequence depth for single nucleotide polymorphism (SNP) filtering. By using the established pipeline, a total of 76,251 non-redundant SNPs, 5642 InDels, 6380 presence/absence variants (PAVs), and 826 copy number variations (CNVs) were detected among the 14 accessions. In addition, non-reference based universal network enabled analysis kit and Stacks de novo called 34,353 and 109,043 SNPs, respectively. In the 14 accessions, the percentages of single dose SNPs ranged from 38.3% to 62.3% with an average of 49.6%, much more than the portions of multiple dosage SNPs. Concordantly called SNPs were used to evaluate the phylogenetic relationship among the 14 accessions. The results showed that the divergence time between the Erianthus genus and the Saccharum genus was more than 10 million years ago (MYA). The Saccharum species separated from their common ancestors ranging from 0.19 to 1.65 MYA. 
The GBS pipelines including the reference sequences, alignment methods, sequence variant callers, and sequence depth were recommended and discussed for the Saccharum complex and other related species. A large number of sequence variations were discovered in the Saccharum complex, including SNPs, InDels, PAVs, and CNVs. Genome-wide SNPs were further used to illustrate sequence features of polyploid species and demonstrated the divergence of different species in the Saccharum complex. The results of this study showed that GBS was an effective NGS-based method to discover genomic sequence variations in highly polyploid and heterozygous species.
Mining sequence variations in representative polyploid sugarcane germplasm accessions
Yang, Xiping; Song, Jian; You, Qian; ...
2017-08-09
Sugarcane (Saccharum spp.) is one of the most important economic crops because of its high sugar production and biofuel potential. Due to the high polyploid level and complex genome of sugarcane, it has been a huge challenge to investigate genomic sequence variations, which are critical for identifying alleles contributing to important agronomic traits. In order to mine the genetic variations in sugarcane, genotyping by sequencing (GBS), was used to genotype 14 representative Saccharum complex accessions. GBS is a method to generate a large number of markers, enabled by next generation sequencing (NGS) and the genome complexity reduction using restriction enzymes.more » To use GBS for high throughput genotyping highly polyploid sugarcane, the GBS analysis pipelines in 14 Saccharum complex accessions were established by evaluating different alignment methods, sequence variants callers, and sequence depth for single nucleotide polymorphism (SNP) filtering. By using the established pipeline, a total of 76,251 non-redundant SNPs, 5642 InDels, 6380 presence/absence variants (PAVs), and 826 copy number variations (CNVs) were detected among the 14 accessions. In addition, non-reference based universal network enabled analysis kit and Stacks de novo called 34,353 and 109,043 SNPs, respectively. In the 14 accessions, the percentages of single dose SNPs ranged from 38.3% to 62.3% with an average of 49.6%, much more than the portions of multiple dosage SNPs. Concordantly called SNPs were used to evaluate the phylogenetic relationship among the 14 accessions. The results showed that the divergence time between the Erianthus genus and the Saccharum genus was more than 10 million years ago (MYA). The Saccharum species separated from their common ancestors ranging from 0.19 to 1.65 MYA. 
The GBS pipelines including the reference sequences, alignment methods, sequence variant callers, and sequence depth were recommended and discussed for the Saccharum complex and other related species. A large number of sequence variations were discovered in the Saccharum complex, including SNPs, InDels, PAVs, and CNVs. Genome-wide SNPs were further used to illustrate sequence features of polyploid species and demonstrated the divergence of different species in the Saccharum complex. The results of this study showed that GBS was an effective NGS-based method to discover genomic sequence variations in highly polyploid and heterozygous species.« less
Ludwig, A; Belfiore, N M; Pitra, C; Svirsky, V; Jenneckens, I
2001-07-01
Sturgeon (order Acipenseriformes) provide an ideal taxonomic context for examination of genome duplication events. Multiple levels of ploidy exist among these fish. In a novel microsatellite approach, data from 962 fish from 20 sturgeon species were used for analysis of ploidy in sturgeon. Allele numbers in a sample of individuals were assessed at six microsatellite loci. Species with approximately 120 chromosomes are classified as functional diploid species, species with approximately 250 chromosomes as functional tetraploid species, and species with approximately 500 chromosomes as functional octaploids. A molecular phylogeny of the sturgeon was determined on the basis of sequences of the entire mitochondrial cytochrome b gene. By mapping the estimated levels of ploidy on this proposed phylogeny we demonstrate that (I) polyploidization events independently occurred in the acipenseriform radiation; (II) the process of functional genome reduction is nearly finished in species with approximately 120 chromosomes and more active in species with approximately 250 chromosomes and approximately 500 chromosomes; and (III) species with approximately 250 and approximately 500 chromosomes arose more recently than those with approximately 120 chromosomes. These results suggest that gene silencing, chromosomal rearrangements, and transposition events played an important role in the acipenseriform genome formation. Furthermore, this phylogeny is broadly consistent with previous hypotheses but reveals a highly supported oceanic (Atlantic-Pacific) subdivision within the Acipenser/Huso complex. PMID:11454768
Genome assembly from synthetic long read clouds
Kuleshov, Volodymyr; Snyder, Michael P.; Batzoglou, Serafim
2016-01-01
Motivation: Despite rapid progress in sequencing technology, assembling de novo the genomes of new species as well as reconstructing complex metagenomes remain major technological challenges. New synthetic long read (SLR) technologies promise significant advances towards these goals; however, their applicability is limited by high sequencing requirements and the inability of current assembly paradigms to cope with combinations of short and long reads. Results: Here, we introduce Architect, a new de novo scaffolder aimed at SLR technologies. Unlike previous assembly strategies, Architect does not require a costly subassembly step; instead it assembles genomes directly from the SLR’s underlying short reads, which we refer to as read clouds. This enables a 4- to 20-fold reduction in sequencing requirements and a 5-fold increase in assembly contiguity on both genomic and metagenomic datasets relative to state-of-the-art assembly strategies aimed directly at fully subassembled long reads. Availability and Implementation: Our source code is freely available at https://github.com/kuleshov/architect. Contact: kuleshov@stanford.edu PMID:27307620
Ma, Liang; Chen, Zehua; Huang, Da Wei; Kutty, Geetha; Ishihara, Mayumi; Wang, Honghui; Abouelleil, Amr; Bishop, Lisa; Davey, Emma; Deng, Rebecca; Deng, Xilong; Fan, Lin; Fantoni, Giovanna; Fitzgerald, Michael; Gogineni, Emile; Goldberg, Jonathan M.; Handley, Grace; Hu, Xiaojun; Huber, Charles; Jiao, Xiaoli; Jones, Kristine; Levin, Joshua Z.; Liu, Yueqin; Macdonald, Pendexter; Melnikov, Alexandre; Raley, Castle; Sassi, Monica; Sherman, Brad T.; Song, Xiaohong; Sykes, Sean; Tran, Bao; Walsh, Laura; Xia, Yun; Yang, Jun; Young, Sarah; Zeng, Qiandong; Zheng, Xin; Stephens, Robert; Nusbaum, Chad; Birren, Bruce W.; Azadi, Parastoo; Lempicki, Richard A.; Cuomo, Christina A.; Kovacs, Joseph A.
2016-01-01
Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses. PMID:26899007
Correlation between genome reduction and bacterial growth.
Kurokawa, Masaomi; Seno, Shigeto; Matsuda, Hideo; Ying, Bei-Wen
2016-12-01
Genome reduction by removing dispensable genomic sequences in bacteria is commonly used in both fundamental and applied studies to determine the minimal genetic requirements for a living system or to develop highly efficient bioreactors. Nevertheless, whether and how the accumulative loss of dispensable genomic sequences disturbs bacterial growth remains unclear. To investigate the relationship between genome reduction and growth, a series of Escherichia coli strains carrying genomes reduced in a stepwise manner were used. Intensive growth analyses revealed that the accumulation of multiple genomic deletions caused decreases in the exponential growth rate and the saturated cell density in a deletion-length-dependent manner as well as gradual changes in the patterns of growth dynamics, regardless of the growth media. Accordingly, a perspective growth model linking genome evolution to genome engineering was proposed. This study provides the first demonstration of a quantitative connection between genomic sequence and bacterial growth, indicating that growth rate is potentially associated with dispensable genomic sequences. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Batty, Elizabeth M; Chaemchuen, Suwittra; Blacksell, Stuart; Richards, Allen L; Paris, Daniel; Bowden, Rory; Chan, Caroline; Lachumanan, Ramkumar; Day, Nicholas; Donnelly, Peter; Chen, Swaine; Salje, Jeanne
2018-06-01
Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.
Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo
2016-01-01
Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
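The contiguity statistics named in the HHV-1 abstract, N50 and NG50 (written there as N(G)50), can be computed from a list of contig lengths. The sketch below assumes the commonly used definitions, which the abstract does not spell out: N50 is the length of the contig at which the running total of contigs, sorted longest first, reaches half of the assembly size, and NG50 uses half of the expected genome size instead. The function names and the example contig lengths are hypothetical; only the approximate 152 kbp genome size is taken from the abstract.

```python
from typing import Iterable


def nx_statistic(lengths: Iterable[int], reference_total: int, fraction: float = 0.5) -> int:
    """Length of the contig at which the cumulative sum of contigs, sorted
    longest first, reaches `fraction` of `reference_total`.

    Returns 0 when the contigs never reach that threshold.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = reference_total * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return 0


def n50(lengths: Iterable[int]) -> int:
    """N50: the threshold is half of the total assembled length."""
    lengths = list(lengths)
    return nx_statistic(lengths, sum(lengths), 0.5)


def ng50(lengths: Iterable[int], genome_size: int) -> int:
    """NG50: the threshold is half of the expected genome size."""
    return nx_statistic(lengths, genome_size, 0.5)


if __name__ == "__main__":
    # Hypothetical contig lengths (bp), not data from the study.
    contigs = [60_000, 40_000, 25_000, 15_000, 10_000]
    genome_size = 152_000  # approximate HHV-1 genome size given in the abstract
    print("N50 :", n50(contigs))
    print("NG50:", ng50(contigs, genome_size))
    print("NG75:", nx_statistic(contigs, genome_size, 0.75))
```

Because NG50 is normalised by genome size rather than assembly size, it stays comparable between assemblies of the same genome that differ in total assembled length.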
How genome complexity can explain the difficulty of aligning reads to genomes.
Phan, Vinhthuy; Gao, Shanshan; Tran, Quang; Vo, Nam S
2015-01-01
Although it is frequently observed that aligning short reads to genomes becomes harder if they contain complex repeat patterns, there has not been much effort to quantify the relationship between complexity of genomes and difficulty of short-read alignment. Existing measures of sequence complexity seem unsuitable for the understanding and quantification of this relationship. We investigated several measures of complexity and found that length-sensitive measures of complexity had the highest correlation to accuracy of alignment. In particular, the rate of distinct substrings of length k, where k is similar to the read length, correlated very highly to alignment performance in terms of precision and recall. We showed how to compute this measure efficiently in linear time, making it useful in practice to estimate quickly the difficulty of alignment for new genomes without having to align reads to them first. We showed how the length-sensitive measures could provide additional information for choosing aligners that would align consistently accurately on new genomes. We formally established a connection between genome complexity and the accuracy of short-read aligners. The relationship between genome complexity and alignment accuracy provides additional useful information for selecting suitable aligners for new genomes. Further, this work suggests that the complexity of genomes sometimes should be thought of in terms of specific computational problems, such as the alignment of short reads to genomes.
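The length-sensitive measure described in the alignment-complexity abstract, the rate of distinct substrings of length k, can be illustrated with a short sketch. It makes two assumptions that the abstract does not state: the rate is normalised by the number of length-k windows, and a plain set of substrings is used for counting. The set-based approach costs roughly O(n·k) time, so it is not the linear-time algorithm the authors mention; the function name is hypothetical.

```python
def distinct_kmer_rate(genome: str, k: int) -> float:
    """Fraction of the length-k windows of `genome` that are distinct substrings.

    Values near 1.0 mean almost every window is unique (few repeats of length k);
    values near 0.0 mean the sequence is dominated by repeats at that length.
    """
    n_windows = len(genome) - k + 1
    if k <= 0 or n_windows <= 0:
        raise ValueError("k must be positive and no longer than the sequence")
    # Collect every length-k substring once; duplicates collapse in the set.
    distinct = {genome[i:i + k] for i in range(n_windows)}
    return len(distinct) / n_windows


if __name__ == "__main__":
    # Toy sequences; in practice k would be close to the read length.
    print(distinct_kmer_rate("ACGTACGT", 4))  # one repeated 4-mer among five windows
    print(distinct_kmer_rate("AAAAAAAA", 4))  # a single 4-mer repeated throughout
```

Choosing k near the read length matters because a repeat shorter than a read can still be placed uniquely by its flanking sequence, whereas a repeat at least as long as the read cannot.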
Bacterial genome reduction using the progressive clustering of deletions via yeast sexual cycling
DOE Office of Scientific and Technical Information (OSTI.GOV)
Suzuki, Yo; Assad-Garcia, Nacyra; Kostylev, Maxim; ...
2015-02-05
The availability of genetically tractable organisms with simple genomes is critical for the rapid, systems-level understanding of basic biological processes. Mycoplasma bacteria, with the smallest known genomes among free-living cellular organisms, are ideal models for this purpose, but the natural versions of these cells have genome complexities still too great to offer a comprehensive view of a fundamental life form. In this paper, we describe an efficient method for reducing genomes from these organisms by identifying individually deletable regions using transposon mutagenesis and progressively clustering deleted genomic segments using meiotic recombination between the bacterial genomes harbored in yeast. Mycoplasmal genomes subjected to this process and transplanted into recipient cells yielded two mycoplasma strains. The first simultaneously lacked eight singly deletable regions of the genome, representing a total of 91 genes and ~10% of the original genome. The second strain lacked seven of the eight regions, representing 84 genes. Growth assay data revealed an absence of genetic interactions among the 91 genes under tested conditions. Despite predicted effects of the deletions on sugar metabolism and the proteome, growth rates were unaffected by the gene deletions in the seven-deletion strain. These results support the feasibility of using single-gene disruption data to design and construct viable genomes lacking multiple genes, paving the way toward genome minimization. The progressive clustering method is expected to be effective for the reorganization of any mega-sized DNA molecules cloned in yeast, facilitating the construction of designer genomes in microbes as well as genomic fragments for genetic engineering of higher eukaryotes.
Silva, Lindsey; Oh, Hyung Suk; Chang, Lynne; Yan, Zhipeng; Triezenberg, Steven J; Knipe, David M
2012-01-01
Little is known about the mechanisms of gene targeting within the nucleus and its effect on gene expression, but most studies have concluded that genes located near the nuclear periphery are silenced by heterochromatin. In contrast, we found that early herpes simplex virus (HSV) genome complexes localize near the nuclear lamina and that this localization is associated with reduced heterochromatin on the viral genome and increased viral immediate-early (IE) gene transcription. In this study, we examined the mechanism of this effect and found that input virion transactivator protein, virion protein 16 (VP16), targets sites adjacent to the nuclear lamina and is required for targeting of the HSV genome to the nuclear lamina, exclusion of heterochromatin from viral replication compartments, and reduction of heterochromatin on the viral genome. Because cells infected with the VP16 mutant virus in1814 showed a phenotype similar to that of lamin A/C(-/-) cells infected with wild-type virus, we hypothesized that the nuclear lamina is required for VP16 activator complex formation. In lamin A/C(-/-) mouse embryo fibroblasts, VP16 and Oct-1 showed reduced association with the viral IE gene promoters, the levels of VP16 and HCF-1 stably associated with the nucleus were lower than in wild-type cells, and the association of VP16 with HCF-1 was also greatly reduced. These results show that the nuclear lamina is required for stable nuclear localization and formation of the VP16 activator complex and provide evidence for the nuclear lamina being the site of assembly of the VP16 activator complex. The targeting of chromosomes in the cell nucleus is thought to be important in the regulation of expression of genes on the chromosomes. The major documented effect of intranuclear targeting has been silencing of chromosomes at sites near the nuclear periphery. 
In this study, we show that targeting of the herpes simplex virus DNA genome to the nuclear periphery promotes formation of transcriptional activator complexes on the viral genome, demonstrating that the nuclear periphery also has sites for activation of transcription. These results highlight the importance of the nuclear lamina, the structure that lines the inner nuclear membrane, in both transcriptional activation and repression. Future studies defining the molecular structures of these two types of nuclear sites should define new levels of gene regulation.
A survey about methods dedicated to epistasis detection.
Niel, Clément; Sinoquet, Christine; Dina, Christian; Rocheleau, Ghislain
2015-01-01
During the past decade, findings of genome-wide association studies (GWAS) improved our knowledge and understanding of disease genetics. To date, thousands of SNPs have been associated with diseases and other complex traits. Statistical analysis typically looks for association between a phenotype and a SNP taken individually via single-locus tests. However, geneticists admit this is an oversimplified approach to tackle the complexity of underlying biological mechanisms. Interaction between SNPs, namely epistasis, must be considered. Unfortunately, epistasis detection gives rise to analytic challenges since analyzing every SNP combination is at present impractical at a genome-wide scale. In this review, we will present the main strategies recently proposed to detect epistatic interactions, along with their operating principle. Some of these methods are exhaustive, such as multifactor dimensionality reduction, likelihood ratio-based tests or receiver operating characteristic curve analysis; some are non-exhaustive, such as machine learning techniques (random forests, Bayesian networks) or combinatorial optimization approaches (ant colony optimization, computational evolution system).
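The claim in the epistasis survey that analyzing every SNP combination is impractical at a genome-wide scale follows from simple counting: the number of unordered combinations of a given order grows as a binomial coefficient. The sketch below illustrates this with a hypothetical panel size of 500,000 SNPs, a figure that is an assumption and does not come from the abstract; the function name is likewise hypothetical.

```python
import math


def combinations_to_test(n_snps: int, order: int = 2) -> int:
    """Number of unordered SNP combinations of the given interaction order."""
    return math.comb(n_snps, order)


if __name__ == "__main__":
    n_snps = 500_000  # hypothetical genome-wide panel size, for illustration only
    for order in (2, 3):
        print(f"order {order}: {combinations_to_test(n_snps, order):,} tests")
```

Under this assumption the pairwise search alone needs about 1.25 × 10^11 tests, which helps explain why the non-exhaustive strategies listed in the abstract exist.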
Energetics and genetics across the prokaryote-eukaryote divide
2011-01-01
Background: All complex life on Earth is eukaryotic. All eukaryotic cells share a common ancestor that arose just once in four billion years of evolution. Prokaryotes show no tendency to evolve greater morphological complexity, despite their metabolic virtuosity. Here I argue that the eukaryotic cell originated in a unique prokaryotic endosymbiosis, a singular event that transformed the selection pressures acting on both host and endosymbiont. Results: The reductive evolution and specialisation of endosymbionts to mitochondria resulted in an extreme genomic asymmetry, in which the residual mitochondrial genomes enabled the expansion of bioenergetic membranes over several orders of magnitude, overcoming the energetic constraints on prokaryotic genome size, and permitting the host cell genome to expand (in principle) over 200,000-fold. This energetic transformation was permissive, not prescriptive; I suggest that the actual increase in early eukaryotic genome size was driven by a heavy early bombardment of genes and introns from the endosymbiont to the host cell, producing a high mutation rate. Unlike prokaryotes, with lower mutation rates and heavy selection pressure to lose genes, early eukaryotes without genome-size limitations could mask mutations by cell fusion and genome duplication, as in allopolyploidy, giving rise to a proto-sexual cell cycle. The side effect was that a large number of shared eukaryotic basal traits accumulated in the same population, a sexual eukaryotic common ancestor, radically different to any known prokaryote. Conclusions: The combination of massive bioenergetic expansion, release from genome-size constraints, and high mutation rate favoured a protosexual cell cycle and the accumulation of eukaryotic traits. These factors explain the unique origin of eukaryotes, the absence of true evolutionary intermediates, and the evolution of sex in eukaryotes but not prokaryotes.
Reviewers: This article was reviewed by Eugene Koonin, William Martin, Ford Doolittle and Mark van der Giezen. For complete reports see the Reviewers' Comments section. PMID:21714941
Jo, Jinkwan; Purushotham, Preethi M.; Han, Koeun; Lee, Heung-Ryul; Nah, Gyoungju; Kang, Byoung-Cheorl
2017-01-01
Single nucleotide polymorphisms (SNPs) play important roles as molecular markers in plant genomics and breeding studies. Although onion (Allium cepa L.) is an important crop globally, relatively few molecular marker resources have been reported due to its large genome and high heterozygosity. Genotyping-by-sequencing (GBS) offers a greater degree of complexity reduction followed by concurrent SNP discovery and genotyping for species with complex genomes. In this study, GBS was employed for SNP mining in onion, which currently lacks a reference genome. A segregating F2 population, derived from a cross between ‘NW-001’ and ‘NW-002,’ as well as multiple parental lines were used for GBS analysis. A total of 56.15 Gbp of raw sequence data were generated and 1,851,428 SNPs were identified from the de novo assembled contigs. Stringent filtering resulted in 10,091 high-fidelity SNP markers. Robust SNPs that satisfied the segregation ratio criteria and with even distribution in the mapping population were used to construct an onion genetic map. The final map contained eight linkage groups and spanned a genetic length of 1,383 centiMorgans (cM), with an average marker interval of 8.08 cM. These robust SNPs were further analyzed using the high-throughput Fluidigm platform for marker validation. This is the first study in onion to develop genome-wide SNPs using GBS. The resulting SNP markers and developed linkage map will be valuable tools for genetic mapping of important agronomic traits and marker-assisted selection in onion breeding programs. PMID:28959273
Evolution of biological complexity
Adami, Christoph; Ofria, Charles; Collier, Travis C.
2000-01-01
To make a case for or against a trend in the evolution of complexity in biological evolution, complexity needs to be both rigorously defined and measurable. A recent information-theoretic (but intuitively evident) definition identifies genomic complexity with the amount of information a sequence stores about its environment. We investigate the evolution of genomic complexity in populations of digital organisms and monitor in detail the evolutionary transitions that increase complexity. We show that, because natural selection forces genomes to behave as a natural “Maxwell Demon,” within a fixed environment, genomic complexity is forced to increase. PMID:10781045
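The information-theoretic definition in the biological-complexity abstract can be made concrete with a small sketch. The abstract does not give an estimator, so the code below assumes one simple approximation: the information stored in a population of aligned sequences is the sequence length minus the sum of per-site Shannon entropies, with logarithms taken in the base of the alphabet size. All function names and the toy alignment are hypothetical, and the approximation ignores correlations between sites.

```python
import math
from collections import Counter
from typing import Sequence


def per_site_entropy(column: Sequence[str], alphabet_size: int = 4) -> float:
    """Shannon entropy of one alignment column, in units of log base alphabet_size."""
    total = len(column)
    counts = Counter(column)
    return -sum((c / total) * math.log(c / total, alphabet_size) for c in counts.values())


def information_content(aligned: Sequence[str], alphabet_size: int = 4) -> float:
    """Sequence length minus summed per-site entropy (assumed approximation)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")
    entropy_sum = sum(
        per_site_entropy([seq[i] for seq in aligned], alphabet_size)
        for i in range(length)
    )
    return length - entropy_sum


if __name__ == "__main__":
    # Toy population: three conserved sites and one fully variable site.
    population = ["ACGT", "ACGA", "ACGC", "ACGG"]
    print(information_content(population))
```

In this toy case the conserved sites each contribute one unit of information and the fully variable site contributes none, so sites held fixed across the population raise the measure.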
Janssen, Paul J.; Van Houdt, Rob; Moors, Hugo; Monsieurs, Pieter; Morin, Nicolas; Michaux, Arlette; Benotmane, Mohammed A.; Leys, Natalie; Vallaeys, Tatiana; Lapidus, Alla; Monchy, Sébastien; Médigue, Claudine; Taghavi, Safiyh; McCorkle, Sean; Dunn, John; van der Lelie, Daniël; Mergeay, Max
2010-05-05
Many bacteria in the environment have adapted to the presence of toxic heavy metals. Over the last 30 years, this heavy metal tolerance was the subject of extensive research. The bacterium Cupriavidus metallidurans strain CH34, originally isolated by us in 1976 from a metal processing factory, is considered a major model organism in this field because it withstands milli-molar range concentrations of over 20 different heavy metal ions. This tolerance is mostly achieved by rapid ion efflux but also by metal-complexation and -reduction. We present here the full genome sequence of strain CH34 and the manual annotation of all its genes. The genome of C. metallidurans CH34 is composed of two large circular chromosomes CHR1 and CHR2 of, respectively, 3,928,089 bp and 2,580,084 bp, and two megaplasmids pMOL28 and pMOL30 of, respectively, 171,459 bp and 233,720 bp in size. At least 25 loci for heavy-metal resistance (HMR) are distributed over the four replicons. Approximately 67% of the 6,717 coding sequences (CDSs) present in the CH34 genome could be assigned a putative function, and 9.1% (611 genes) appear to be unique to this strain. One out of five proteins is associated with either transport or transcription while the relay of environmental stimuli is governed by more than 600 signal transduction systems. The CH34 genome is most similar to the genomes of other Cupriavidus strains by correspondence between the respective CHR1 replicons but also displays similarity to the genomes of more distantly related species as a result of gene transfer and through the presence of large genomic islands. The presence of at least 57 IS elements and 19 transposons and the ability to take in and express foreign genes indicates a very dynamic and complex genome shaped by evolutionary forces. The genome data show that C. metallidurans CH34 is particularly well equipped to live in extreme conditions and anthropogenic environments that are rich in metals. PMID:20463976
PWHATSHAP: efficient haplotyping for future generation sequencing.
Bracciali, Andrea; Aldinucci, Marco; Patterson, Murray; Marschall, Tobias; Pisanti, Nadia; Merelli, Ivan; Torquati, Massimo
2016-09-22
Haplotype phasing is an important problem in the analysis of genomics information. Given a set of DNA fragments of an individual, it consists of determining which one of the possible alleles (alternative forms of a gene) each fragment comes from. Haplotype information is relevant to gene regulation, epigenetics, genome-wide association studies, evolutionary and population studies, and the study of mutations. Haplotyping is currently addressed as an optimisation problem aiming at solutions that minimise, for instance, error correction costs, where costs are a measure of the confidence in the accuracy of the information acquired from DNA sequencing. Solutions have typically an exponential computational complexity. WHATSHAP is a recent optimal approach which moves computational complexity from DNA fragment length to fragment overlap, i.e., coverage, and is hence of particular interest when considering sequencing technology's current trends that are producing longer fragments. Given the potential relevance of efficient haplotyping in several analysis pipelines, we have designed and engineered PWHATSHAP, a parallel, high-performance version of WHATSHAP. PWHATSHAP is embedded in a toolkit developed in Python and supports genomics datasets in standard file formats. Building on WHATSHAP, PWHATSHAP exhibits the same complexity exploring a number of possible solutions which is exponential in the coverage of the dataset. The parallel implementation on multi-core architectures allows for a relevant reduction of the execution time for haplotyping, while the provided results enjoy the same high accuracy as that provided by WHATSHAP, which increases with coverage. Due to its structure and management of the large datasets, the parallelisation of WHATSHAP posed demanding technical challenges, which have been addressed exploiting a high-level parallel programming framework. 
The result, PWHATSHAP, is a freely available toolkit that improves the efficiency of the analysis of genomics information.
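The optimisation problem described above can be illustrated with a brute-force minimum error correction (MEC) search on a toy input. This is only a sketch of the problem statement: it enumerates every bipartition of the fragments, which is exponential in the number of fragments, and it is not the dynamic programme used by WHATSHAP or PWHATSHAP. The fragment encoding ("0"/"1" for alleles, "-" for an uncovered site) and all function names are assumptions made for the example.

```python
from itertools import product


def column_cost(fragments, col):
    """Corrections needed so all fragments in one group agree at column `col`."""
    zeros = sum(1 for f in fragments if f[col] == "0")
    ones = sum(1 for f in fragments if f[col] == "1")
    # Flip the minority allele; uncovered sites ("-") cost nothing.
    return min(zeros, ones)


def mec_brute_force(fragments):
    """Return (minimum corrections, group assignment) over all bipartitions."""
    n = len(fragments)
    width = len(fragments[0])
    best_cost, best_assign = None, None
    # Fragment 0 is pinned to group 0 to skip mirror-image partitions.
    for rest in product((0, 1), repeat=n - 1):
        assign = (0,) + rest
        cost = 0
        for group in (0, 1):
            members = [f for f, g in zip(fragments, assign) if g == group]
            cost += sum(column_cost(members, c) for c in range(width))
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign


# Toy fragments over four variant sites; the last one carries one error.
toy_fragments = ["0011", "001-", "1100", "-100", "0111"]
cost, assignment = mec_brute_force(toy_fragments)
print(cost, assignment)
```

On this toy input a single correction suffices, with the first, second and fifth fragments assigned to one haplotype and the remaining two to the other.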
Diversity Arrays Technology (DArT) for whole-genome profiling of barley
Wenzl, Peter; Carling, Jason; Kudrna, David; Jaccoud, Damian; Huttner, Eric; Kleinhofs, Andris; Kilian, Andrzej
2004-01-01
Diversity Arrays Technology (DArT) can detect and type DNA variation at several hundred genomic loci in parallel without relying on sequence information. Here we show that it can be effectively applied to genetic mapping and diversity analyses of barley, a species with a 5,000-Mbp genome. We tested several complexity reduction methods and selected two that generated the most polymorphic genomic representations. Arrays containing individual fragments from these representations generated DArT fingerprints with a genotype call rate of 98.0% and a scoring reproducibility of at least 99.8%. The fingerprints grouped barley lines according to known genetic relationships. To validate the Mendelian behavior of DArT markers, we constructed a genetic map for a cross between cultivars Steptoe and Morex. Nearly all polymorphic array features could be incorporated into one of seven linkage groups (98.8%). The resulting map comprised ≈385 unique DArT markers and spanned 1,137 centimorgans. A comparison with the restriction fragment length polymorphism-based framework map indicated that the quality of the DArT map was equivalent, if not superior, to that of the framework map. These results highlight the potential of DArT as a generic technique for genome profiling in the context of molecular breeding and genomics. PMID:15192146
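Two of the summary statistics in this abstract are simple ratios, and a short sketch may make them concrete. The helper names and the toy call vector are hypothetical; only the map length and marker count come from the abstract, and the spacing figure is a crude density (map length divided by marker count), not an interval-by-interval estimate.

```python
def call_rate(calls):
    """Fraction of array features that received a genotype call (not None)."""
    scored = sum(1 for c in calls if c is not None)
    return scored / len(calls)


def map_length_per_marker(map_length_cm, n_markers):
    """Average map length per marker in centimorgans."""
    return map_length_cm / n_markers


# Toy presence/absence scores for ten features; None marks a failed call.
toy_calls = [1, 0, None, 1, 1, 0, 1, 1, 0, 1]
toy_rate = call_rate(toy_calls)

# Values quoted in the abstract: about 385 markers over 1,137 cM.
density = map_length_per_marker(1137, 385)
print(toy_rate, round(density, 2))
```

With the quoted values the map works out to roughly one marker every 3 cM.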
Sánchez-Sevilla, José F.; Horvath, Aniko; Botella, Miguel A.; Gaston, Amèlia; Folta, Kevin; Kilian, Andrzej; Denoyes, Beatrice; Amaya, Iraida
2015-01-01
Cultivated strawberry (Fragaria × ananassa) is a genetically complex allo-octoploid crop with 28 pairs of chromosomes (2n = 8x = 56) for which a genome sequence is not yet available. The diploid Fragaria vesca is considered the donor species of one of the octoploid sub-genomes and its available genome sequence can be used as a reference for genomic studies. A large number of strawberry cultivars are stored in ex situ germplasm collections world-wide but a number of previous studies have addressed the genetic diversity present within a limited number of these collections. Here, we report the development and application of two platforms based on the implementation of Diversity Array Technology (DArT) markers for high-throughput genotyping in strawberry. The first DArT microarray was used to evaluate the genetic diversity of 62 strawberry cultivars that represent a wide range of variation based on phenotype, geographical and temporal origin and pedigrees. A total of 603 DArT markers were used to evaluate the diversity and structure of the population and their cluster analyses revealed that these markers were highly efficient in classifying the accessions in groups based on historical, geographical and pedigree-based cues. The second DArTseq platform took advantage of the complexity reduction method optimized for strawberry and the development of next generation sequencing technologies. The strawberry DArTseq was used to generate a total of 9,386 SNP markers in the previously developed ‘232’ × ‘1392’ mapping population, of which 4,242 high-quality markers were further selected to saturate this map after several filtering steps. The high-throughput platforms developed here for genotyping strawberry will facilitate genome-wide characterizations of large accession sets and complement other available options. PMID:26675207
Microbial minimalism: genome reduction in bacterial pathogens.
Moran, Nancy A
2002-03-08
When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts.
Heinz, Eva; Williams, Tom A.; Nakjang, Sirintra; Noël, Christophe J.; Swan, Daniel C.; Goldberg, Alina V.; Harris, Simon R.; Weinmaier, Thomas; Markert, Stephanie; Becher, Dörte; Bernhardt, Jörg; Dagan, Tal; Hacker, Christian; Lucocq, John M.; Schweder, Thomas; Rattei, Thomas; Hall, Neil; Hirt, Robert P.; Embley, T. Martin
2012-01-01
The dynamics of reductive genome evolution for eukaryotes living inside other eukaryotic cells are poorly understood compared to well-studied model systems involving obligate intracellular bacteria. Here we present 8.5 Mb of sequence from the genome of the microsporidian Trachipleistophora hominis, isolated from an HIV/AIDS patient, which is an outgroup to the smaller compacted-genome species that primarily inform ideas of evolutionary mode for these enormously successful obligate intracellular parasites. Our data provide detailed information on the gene content, genome architecture and intergenic regions of a larger microsporidian genome, while comparative analyses allowed us to infer genomic features and metabolism of the common ancestor of the species investigated. Gene length reduction and massive loss of metabolic capacity in the common ancestor was accompanied by the evolution of novel microsporidian-specific protein families, whose conservation among microsporidians, against a background of reductive evolution, suggests they may have important functions in their parasitic lifestyle. The ancestor had already lost many metabolic pathways but retained glycolysis and the pentose phosphate pathway to provide cytosolic ATP and reduced coenzymes, and it had a minimal mitochondrion (mitosome) making Fe-S clusters but not ATP. It possessed bacterial-like nucleotide transport proteins as a key innovation for stealing host-generated ATP, the machinery for RNAi, key elements of the early secretory pathway, canonical eukaryotic as well as microsporidian-specific regulatory elements, a diversity of repetitive and transposable elements, and relatively low average gene density. 
Microsporidian genome evolution thus appears to have proceeded in at least two major steps: an ancestral remodelling of the proteome upon transition to intracellular parasitism that involved reduction but also selective expansion, followed by a secondary compaction of genome architecture in some, but not all, lineages. PMID:23133373
Chromosome organization in simple and complex unicellular organisms.
O'Sullivan, Justin M
2011-01-01
The genomes of unicellular organisms form complex 3-dimensional structures. This spatial organization is hypothesized to have a significant role in genomic function. Spatial organization is not limited solely to the three-dimensional folding of the chromosome(s) in genomes but also includes genome positioning, and the folding and compartmentalization of any additional genetic material (e.g. episomes) present within complex genomes. In this comment, I will highlight similarities in the spatial organization of eukaryotic and prokaryotic unicellular genomes.
Jordan, Daniel M; Do, Ron
2018-04-11
While sequence-based genetic tests have long been available for specific loci, especially for Mendelian disease, the rapidly falling costs of genome-wide genotyping arrays, whole-exome sequencing, and whole-genome sequencing are moving us toward a future where full genomic information might inform the prognosis and treatment of a variety of diseases, including complex disease. Similarly, the availability of large populations with full genomic information has enabled new insights about the etiology and genetic architecture of complex disease. Insights from the latest generation of genomic studies suggest that our categorization of diseases as complex may conceal a wide spectrum of genetic architectures and causal mechanisms that ranges from Mendelian forms of complex disease to complex regulatory structures underlying Mendelian disease. Here, we review these insights, along with advances in the prediction of disease risk and outcomes from full genomic information. Expected final online publication date for the Annual Review of Genomics and Human Genetics Volume 19 is August 31, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Reciprocal genomic evolution in the ant–fungus agricultural symbiosis
Nygaard, Sanne; Hu, Haofu; Li, Cai; Schiøtt, Morten; Chen, Zhensheng; Yang, Zhikai; Xie, Qiaolin; Ma, Chunyu; Deng, Yuan; Dikow, Rebecca B.; Rabeling, Christian; Nash, David R.; Wcislo, William T.; Brady, Seán G.; Schultz, Ted R.; Zhang, Guojie; Boomsma, Jacobus J.
2016-01-01
The attine ant–fungus agricultural symbiosis evolved over tens of millions of years, producing complex societies with industrial-scale farming analogous to that of humans. Here we document reciprocal shifts in the genomes and transcriptomes of seven fungus-farming ant species and their fungal cultivars. We show that ant subsistence farming probably originated in the early Tertiary (55–60 MYA), followed by further transitions to the farming of fully domesticated cultivars and leaf-cutting, both arising earlier than previously estimated. Evolutionary modifications in the ants include unprecedented rates of genome-wide structural rearrangement, early loss of arginine biosynthesis and positive selection on chitinase pathways. Modifications of fungal cultivars include loss of a key ligninase domain, changes in chitin synthesis and a reduction in carbohydrate-degrading enzymes as the ants gradually transitioned to functional herbivory. In contrast to human farming, increasing dependence on a single cultivar lineage appears to have been essential to the origin of industrial-scale ant agriculture. PMID:27436133
GAMES identifies and annotates mutations in next-generation sequencing projects.
Sana, Maria Elena; Iascone, Maria; Marchetti, Daniela; Palatini, Jeff; Galasso, Marco; Volinia, Stefano
2011-01-01
Next-generation sequencing (NGS) methods have the potential for changing the landscape of biomedical science, but at the same time pose several problems in analysis and interpretation. Currently, there are many commercial and public software packages that analyze NGS data. However, the limitations of these applications include output which is insufficiently annotated and of difficult functional comprehension to end users. We developed GAMES (Genomic Analysis of Mutations Extracted by Sequencing), a pipeline aiming to serve as an efficient middleman between data deluge and investigators. GAMES attains multiple levels of filtering and annotation, such as aligning the reads to a reference genome, performing quality control and mutational analysis, integrating results with genome annotations and sorting each mismatch/deletion according to a range of parameters. Variations are matched to known polymorphisms. The prediction of functional mutations is achieved by using different approaches. Overall GAMES enables an effective complexity reduction in large-scale DNA-sequencing projects. GAMES is available free of charge to academic users and may be obtained from http://aqua.unife.it/GAMES.
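The filtering, matching and sorting steps described above might look roughly like the following. This is a hypothetical sketch and not the GAMES implementation: the `Variant` record, the thresholds and the sort order are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record; field names are assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int
    quality: float


def annotate_and_filter(variants, known_sites, min_depth=10, min_quality=30.0):
    """Drop low-confidence calls, flag known polymorphisms, sort novel first."""
    kept = []
    for v in variants:
        if v.depth < min_depth or v.quality < min_quality:
            continue  # quality-control filter
        is_known = (v.chrom, v.pos, v.alt) in known_sites
        kept.append((v, is_known))
    # Novel variants (is_known == False) first, then by descending quality.
    kept.sort(key=lambda item: (item[1], -item[0].quality))
    return kept


calls = [
    Variant("chr1", 100, "A", "G", 25, 50.0),
    Variant("chr1", 200, "C", "T", 5, 60.0),   # fails the depth filter
    Variant("chr2", 300, "G", "A", 40, 99.0),
    Variant("chr2", 400, "T", "C", 30, 20.0),  # fails the quality filter
]
known = {("chr1", 100, "G")}
ranked = annotate_and_filter(calls, known)
```

Sorting novel calls ahead of known polymorphisms is one possible design choice; a real pipeline would typically expose the ranking criteria as parameters.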
Otwell, Anne E.; Callister, Stephen J.; Zink, Erika M.; ...
2016-02-19
In this study, the proteomes of the metabolically versatile and poorly characterized Gram-positive bacterium Desulfotomaculum reducens MI-1 were compared across four cultivation conditions including sulfate reduction, soluble Fe(III) reduction, insoluble Fe(III) reduction, and pyruvate fermentation. Collectively across conditions, we observed at high confidence ~38% of genome-encoded proteins. Here, we focus on proteins that display significant differential abundance on conditions tested. To the best of our knowledge, this is the first full-proteome study focused on a Gram-positive organism cultivated either on sulfate or metal-reducing conditions. Several proteins with uncharacterized function encoded within heterodisulfide reductase (hdr)-containing loci were upregulated on either sulfate (Dred_0633-4, Dred_0689-90, and Dred_1325-30) or Fe(III)-citrate-reducing conditions (Dred_0432-3 and Dred_1778-84). Two of these hdr-containing loci display homology to recently described flavin-based electron bifurcation (FBEB) pathways (Dred_1325-30 and Dred_1778-84). Additionally, we propose that a cluster of proteins, which is homologous to a described FBEB lactate dehydrogenase (LDH) complex, is performing lactate oxidation in D. reducens (Dred_0367-9). Analysis of the putative sulfate reduction machinery in D. reducens revealed that most of these proteins are constitutively expressed across cultivation conditions tested. In addition, peptides from the single multiheme c-type cytochrome (MHC) in the genome were exclusively observed on the insoluble Fe(III) condition, suggesting that this MHC may play a role in reduction of insoluble metals.
An efficient approach to BAC based assembly of complex genomes.
Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David
2016-01-01
There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, to variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.
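The idea of indexed BAC pools can be sketched as a demultiplexing step that routes each read to its pool before assembly. The example below is an assumption-laden illustration, not the authors' pipeline: it treats the index as an inline prefix of the read, and the function name, pool labels and index sequences are invented for the example.

```python
from collections import defaultdict


def demultiplex(reads, index_to_pool, index_len):
    """Group reads by their leading index and strip the index from each read."""
    pools = defaultdict(list)
    for read in reads:
        index, insert = read[:index_len], read[index_len:]
        # Reads whose index is not recognised are kept aside, not discarded.
        pool = index_to_pool.get(index, "unassigned")
        pools[pool].append(insert)
    return dict(pools)


toy_reads = ["ACGTTTGACC", "TTAGGGCATA", "ACGTCCCC", "GGGGAAAA"]
toy_indexes = {"ACGT": "pool_A", "TTAG": "pool_B"}
by_pool = demultiplex(toy_reads, toy_indexes, index_len=4)
```

Each pool's reads could then be assembled independently, which keeps the assembly problem small even when the source genome is polyploid or repeat-rich.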
2014-01-01
Background Within the genus Streptococcus, only Streptococcus thermophilus is used as a starter culture in food fermentations. Streptococcus macedonicus though, which belongs to the Streptococcus bovis/Streptococcus equinus complex (SBSEC), is also frequently isolated from fermented foods mainly of dairy origin. Members of the SBSEC have been implicated in human endocarditis and colon cancer. Here we compare the genome sequence of the dairy isolate S. macedonicus ACA-DC 198 to the other SBSEC genomes in order to assess in silico its potential adaptation to milk and its pathogenicity status. Results Despite the fact that the SBSEC species were found tightly related based on whole genome phylogeny of streptococci, two distinct patterns of evolution were identified among them. Streptococcus macedonicus, Streptococcus infantarius CJ18 and Streptococcus pasteurianus ATCC 43144 seem to have undergone reductive evolution resulting in significantly diminished genome sizes and increased percentages of potential pseudogenes when compared to Streptococcus gallolyticus subsp. gallolyticus. In addition, the three species seem to have lost genes for catabolizing complex plant carbohydrates and for detoxifying toxic substances previously linked to the ability of S. gallolyticus to survive in the rumen. Analysis of the S. macedonicus genome revealed features that could support adaptation to milk, including an extra gene cluster for lactose and galactose metabolism, a proteolytic system for casein hydrolysis, auxotrophy for several vitamins, an increased ability to resist bacteriophages and horizontal gene transfer events with the dairy Lactococcus lactis and S. thermophilus as potential donors. In addition, S. macedonicus lacks several pathogenicity-related genes found in S. gallolyticus. For example, S. macedonicus has retained only one (i.e. the pil3) of the three pilus gene clusters which may mediate the binding of S. gallolyticus to the extracellular matrix. 
Unexpectedly, similar findings were obtained not only for the dairy S. infantarius CJ18, but also for the blood isolate S. pasteurianus ATCC 43144. Conclusions Our whole genome analyses suggest traits of adaptation of S. macedonicus to the nutrient-rich dairy environment. During this process the bacterium gained genes presumably important for this new ecological niche. Finally, S. macedonicus carries a reduced number of putative SBSEC virulence factors, which suggests a diminished pathogenic potential. PMID:24713045
Papadimitriou, Konstantinos; Anastasiou, Rania; Mavrogonatou, Eleni; Blom, Jochen; Papandreou, Nikos C; Hamodrakas, Stavros J; Ferreira, Stéphanie; Renault, Pierre; Supply, Philip; Pot, Bruno; Tsakalidou, Effie
2014-04-08
Within the genus Streptococcus, only Streptococcus thermophilus is used as a starter culture in food fermentations. Streptococcus macedonicus though, which belongs to the Streptococcus bovis/Streptococcus equinus complex (SBSEC), is also frequently isolated from fermented foods mainly of dairy origin. Members of the SBSEC have been implicated in human endocarditis and colon cancer. Here we compare the genome sequence of the dairy isolate S. macedonicus ACA-DC 198 to the other SBSEC genomes in order to assess in silico its potential adaptation to milk and its pathogenicity status. Despite the fact that the SBSEC species were found tightly related based on whole genome phylogeny of streptococci, two distinct patterns of evolution were identified among them. Streptococcus macedonicus, Streptococcus infantarius CJ18 and Streptococcus pasteurianus ATCC 43144 seem to have undergone reductive evolution resulting in significantly diminished genome sizes and increased percentages of potential pseudogenes when compared to Streptococcus gallolyticus subsp. gallolyticus. In addition, the three species seem to have lost genes for catabolizing complex plant carbohydrates and for detoxifying toxic substances previously linked to the ability of S. gallolyticus to survive in the rumen. Analysis of the S. macedonicus genome revealed features that could support adaptation to milk, including an extra gene cluster for lactose and galactose metabolism, a proteolytic system for casein hydrolysis, auxotrophy for several vitamins, an increased ability to resist bacteriophages and horizontal gene transfer events with the dairy Lactococcus lactis and S. thermophilus as potential donors. In addition, S. macedonicus lacks several pathogenicity-related genes found in S. gallolyticus. For example, S. macedonicus has retained only one (i.e. the pil3) of the three pilus gene clusters which may mediate the binding of S. gallolyticus to the extracellular matrix. 
Unexpectedly, similar findings were obtained not only for the dairy S. infantarius CJ18, but also for the blood isolate S. pasteurianus ATCC 43144. Our whole genome analyses suggest traits of adaptation of S. macedonicus to the nutrient-rich dairy environment. During this process the bacterium gained genes presumably important for this new ecological niche. Finally, S. macedonicus carries a reduced number of putative SBSEC virulence factors, which suggests a diminished pathogenic potential.
Arenas, Miguel
2015-04-01
NGS technologies present a fast and cheap generation of genomic data. Nevertheless, ancestral genome inference is not so straightforward due to complex evolutionary processes acting on this material such as inversions, translocations, and other genome rearrangements that, in addition to their implicit complexity, can co-occur and confound ancestral inferences. Recently, models of genome evolution that accommodate such complex genomic events are emerging. This letter explores these novel evolutionary models and proposes their incorporation into robust statistical approaches based on computer simulations, such as approximate Bayesian computation, that may produce a more realistic evolutionary analysis of genomic data. Advantages and pitfalls in using these analytical methods are discussed. Potential applications of these ancestral genomic inferences are also pointed out.
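Approximate Bayesian computation, as mentioned above, can be shown in its simplest rejection form. The sketch below uses a deliberately crude toy model that is an assumption of this example and not one of the models discussed in the letter: each of `n_adjacencies` gene adjacencies is broken independently with probability `p`, and `p` is inferred from an observed count of broken adjacencies.

```python
import random


def simulate_breaks(p, n_adjacencies, rng):
    """Simulate how many adjacencies are broken under break probability p."""
    return sum(1 for _ in range(n_adjacencies) if rng.random() < p)


def abc_rejection(observed, n_adjacencies, n_draws, epsilon, seed=0):
    """Keep prior draws whose simulated count is within epsilon of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # prior: Uniform(0, 1) on the break probability
        simulated = simulate_breaks(p, n_adjacencies, rng)
        if abs(simulated - observed) <= epsilon:
            accepted.append(p)
    return accepted


# Toy data: 12 of 100 adjacencies observed as broken.
posterior = abc_rejection(observed=12, n_adjacencies=100,
                          n_draws=5000, epsilon=2)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 3))
```

The accepted draws approximate the posterior of `p`; tightening `epsilon` improves the approximation at the cost of fewer accepted samples, which is the central trade-off of the method.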
Endo, Akihito; Tanizawa, Yasuhiro; Tanaka, Naoto; ...
2015-12-29
Fructobacillus spp. in fructose-rich niches belong to the family Leuconostocaceae. They were originally classified as Leuconostoc spp., but were later grouped into a novel genus, Fructobacillus, based on their phylogenetic position, morphology and specific biochemical characteristics. The unique characters, so-called fructophilic characteristics, had not been reported in the group of lactic acid bacteria, suggesting unique evolution at the genome level. Here we studied four draft genome sequences of Fructobacillus spp. and compared their metabolic properties against those of Leuconostoc spp. As a result, Fructobacillus species possess significantly fewer protein-coding sequences in their small genomes. The number of genes was significantly smaller in carbohydrate transport and metabolism. Several other metabolic pathways, including TCA cycle, ubiquinone and other terpenoid-quinone biosynthesis and phosphotransferase systems, were characterized as discriminative pathways between the two genera. The adhE gene for bifunctional acetaldehyde/alcohol dehydrogenase, and genes for subunits of the pyruvate dehydrogenase complex were absent in Fructobacillus spp. The two genera also show different levels of GC content, which are mainly due to the different GC content at the third codon position. In conclusion, the present genome characteristics in Fructobacillus spp. suggest reductive evolution that took place to adapt to specific niches.
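The distinction between overall GC content and GC content at the third codon position (often written GC3) can be made concrete with a short sketch. The function names and the toy coding sequence are illustrative assumptions; the code assumes an in-frame sequence and ignores any trailing partial codon.

```python
def gc_content(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def gc3_content(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    third_positions = cds[2:usable:3]
    return gc_content(third_positions)


# Toy CDS with codons ATG GCC AAA TTT TAA.
toy_cds = "ATGGCCAAATTTTAA"
overall = gc_content(toy_cds)
gc3 = gc3_content(toy_cds)
print(round(overall, 3), gc3)
```

Because third codon positions are often synonymous, GC3 can differ markedly between genera even when the encoded proteins are similar, which is consistent with the explanation given in the abstract.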
NASA Astrophysics Data System (ADS)
Beller, H. R.; Zhou, P.; Legler, T. C.; Chakicherla, A.; O'Day, P. A.
2013-12-01
Thiobacillus denitrificans is a chemolithoautotrophic bacterium capable of anaerobic, nitrate-dependent U(IV) and Fe(II) oxidation, both of which can strongly influence the long-term efficacy of in situ reductive immobilization of uranium in contaminated aquifers. We previously identified two c-type cytochromes involved in nitrate-dependent U(IV) oxidation in T. denitrificans and hypothesized that c-type cytochromes would also catalyze Fe(II) oxidation, as they have been found to play this role in anaerobic phototrophic Fe(II)-oxidizing bacteria. Here we report on efforts to identify genes associated with nitrate-dependent Fe(II) oxidation, namely (a) whole-genome transcriptional studies [using FeCO3, Fe2+, and U(IV) oxides as electron donors under denitrifying conditions], (b) Fe(II) oxidation assays performed with knockout mutants targeting primarily highly expressed or upregulated c-type cytochromes, and (c) random transposon-mutagenesis studies with screening for Fe(II) oxidation. Assays of mutants for 26 target genes, most of which were c-type cytochromes, indicated that none of the mutants tested were significantly defective in nitrate-dependent Fe(II) oxidation. The non-defective mutants included the c1-cytochrome subunit of the cytochrome bc1 complex (complex III), which has relevance to a previously proposed role for this complex in nitrate-dependent Fe(II) oxidation and to current concepts of reverse electron transfer. Of the transposon mutants defective in Fe(II) oxidation, one mutant with a disrupted gene associated with NADH:ubiquinone oxidoreductase (complex I) was ~35% defective relative to the wild-type strain; this strain was similarly defective in nitrate reduction with thiosulfate as the electron donor. Overall, our results indicate that nitrate-dependent Fe(II) oxidation in T. denitrificans is not catalyzed by the same c-type cytochromes involved in U(IV) oxidation, nor have other c-type cytochromes yet been implicated in the process.
Beller, Harry R.; Zhou, Peng; Legler, Tina C.; Chakicherla, Anu; Kane, Staci; Letain, Tracy E.; O’Day, Peggy A.
2013-01-01
Thiobacillus denitrificans is a chemolithoautotrophic bacterium capable of anaerobic, nitrate-dependent U(IV) and Fe(II) oxidation, both of which can strongly influence the long-term efficacy of in situ reductive immobilization of uranium in contaminated aquifers. We previously identified two c-type cytochromes involved in nitrate-dependent U(IV) oxidation in T. denitrificans and hypothesized that c-type cytochromes would also catalyze Fe(II) oxidation, as they have been found to play this role in anaerobic phototrophic Fe(II)-oxidizing bacteria. Here we report on efforts to identify genes associated with nitrate-dependent Fe(II) oxidation, namely (a) whole-genome transcriptional studies [using FeCO3, Fe2+, and U(IV) oxides as electron donors under denitrifying conditions], (b) Fe(II) oxidation assays performed with knockout mutants targeting primarily highly expressed or upregulated c-type cytochromes, and (c) random transposon-mutagenesis studies with screening for Fe(II) oxidation. Assays of mutants for 26 target genes, most of which were c-type cytochromes, indicated that none of the mutants tested were significantly defective in nitrate-dependent Fe(II) oxidation. The non-defective mutants included the c1-cytochrome subunit of the cytochrome bc1 complex (complex III), which has relevance to a previously proposed role for this complex in nitrate-dependent Fe(II) oxidation and to current concepts of reverse electron transfer. A transposon mutant with a disrupted gene associated with NADH:ubiquinone oxidoreductase (complex I) was ~35% defective relative to the wild-type strain; this strain was similarly defective in nitrate reduction with thiosulfate as the electron donor. Overall, our results indicate that nitrate-dependent Fe(II) oxidation in T. denitrificans is not catalyzed by the same c-type cytochromes involved in U(IV) oxidation, nor have other c-type cytochromes yet been implicated in the process. PMID:24065960
NASA Astrophysics Data System (ADS)
Mullin, S. W.; Wrighton, K. C.; Luef, B.; Wilkins, M. J.; Handley, K. M.; Williams, K. H.; Banfield, J. F.
2012-12-01
Community genomics and proteomics (proteogenomics) can be used to predict the metabolic potential of complex microbial communities and provide insight into microbial activity and nutrient cycling in situ. Inferences regarding the physiology of specific organisms then can guide isolation efforts, which, if successful, can yield strains that can be metabolically and structurally characterized to further test metagenomic predictions. Here we used proteogenomic data from an acetate-stimulated, sulfidic sediment column deployed in a groundwater well in Rifle, CO to direct laboratory amendment experiments to isolate a bacterial strain potentially involved in sulfur oxidation for physiological and microscopic characterization (Handley et al, submitted 2012). Field strains of Sulfurovum (genome r9c2) were predicted to be capable of CO2 fixation via the reverse TCA cycle and sulfur oxidation (Sox and SQR) coupled to either nitrate reduction (Nap, Nir, Nos) in anaerobic environments or oxygen reduction in microaerobic (cbb3 and bd oxidases) environments; however, key genes for sulfur oxidation (soxXAB) were not identified. Sulfidic groundwater and sediment from the Rifle site were used to inoculate cultures that contained various sulfur species, with and without nitrate and oxygen. We isolated a bacterium, Sulfurovum sp. OBA, whose 16S rRNA gene shares 99.8 % identity to the gene of the dominant genomically characterized strain (genome r9c2) in the Rifle sediment column. The 16S rRNA gene of the isolate most closely matches (95 % sequence identity) the gene of Sulfurovum sp. NBC37-1, a genome-sequenced deep-sea sulfur oxidizer. Strain OBA grew via polysulfide, colloidal sulfur, and tetrathionate oxidation coupled to nitrate reduction under autotrophic and mixotrophic conditions. Strain OBA also grew heterotrophically, oxidizing glucose, fructose, mannose, and maltose with nitrate as an electron acceptor. 
Over the range of oxygen concentrations tested, strain OBA was not capable of aerobic growth, but it could tolerate low oxygen conditions in the polysulfide/nitrate growth medium, suggesting that oxidases identified by genomics may play a role in detoxification rather than energy generation. Cryo-TEM imaging showed that strain OBA cells are rod-shaped, ~0.4 μm wide and 1.0 μm long, and confirmed metagenomics-based predictions of a Gram-negative cell envelope, pili and polyphosphate body production. Our results show the value of integrating metagenomics, culturing, and microscopic imaging to discern the physiology of bacteria involved in biogeochemical transformations in the subsurface.
MacGregor, Barbara J; Biddle, Jennifer F; Harbort, Christopher; Matthysse, Ann G; Teske, Andreas
2013-09-01
A near-complete draft genome has been obtained for a single vacuolated orange Beggiatoa (Cand. Maribeggiatoa) filament from a Guaymas Basin seafloor microbial mat, the third relatively complete sequence for the Beggiatoaceae. Possible pathways for sulfide oxidation; nitrate respiration; inorganic carbon fixation by both Type II RuBisCO and the reductive tricarboxylic acid cycle; acetate and possibly formate uptake; and energy-generating electron transport via both oxidative phosphorylation and the Rnf complex are discussed here. A role in nitrite reduction is suggested for an abundant orange cytochrome produced by the Guaymas strain; this has a possible homolog in Beggiatoa (Cand. Isobeggiatoa) sp. PS, isolated from marine harbor sediment, but not Beggiatoa alba B18LD, isolated from a freshwater rice field ditch. Inferred phylogenies for the Calvin-Benson-Bassham (CBB) cycle and the reductive (rTCA) and oxidative (TCA) tricarboxylic acid cycles suggest that genes encoding succinate dehydrogenase and enzymes for carboxylation and/or decarboxylation steps (including RuBisCO) may have been introduced to (or exported from) one or more of the three genomes by horizontal transfer, sometimes by different routes. Sequences from the two marine strains are generally more similar to each other than to sequences from the freshwater strain, except in the case of RuBisCO: only the Guaymas strain encodes a Type II enzyme, which (where studied) discriminates less against oxygen than do Type I RuBisCOs. Genes subject to horizontal transfer may represent key steps for adaptation to factors such as oxygen and carbon dioxide concentration, organic carbon availability, and environmental variability. © 2013.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chakraborty, Romy; Woo, Hannah; Dehal, Paramvir
Hexavalent chromium [Cr(VI)] is a widespread contaminant found in soil, sediment, and ground water at several DOE sites, including Hanford 100 H area. In order to stimulate microbially mediated reduction of Cr(VI) at this site, a poly-lactate hydrogen release compound was injected into the chromium-contaminated aquifer. The targeted enrichment of dominant nitrate-reducing bacteria post injection resulted in the isolation of Pseudomonas stutzeri strain RCH2. P. stutzeri strain RCH2 was isolated using acetate as the electron donor and is a complete denitrifier. Experiments with anaerobic washed cell suspensions of strain RCH2 revealed it could reduce Cr(VI) and Fe(III). We sequenced the genome of strain RCH2 using a combination of Illumina and 454 sequencing technologies; it contained a circular chromosome of 4.6 Mb and three plasmids. Furthermore, global genome comparisons of strain RCH2 with six other fully sequenced P. stutzeri strains revealed that most genomic regions are conserved; however, strain RCH2 has an additional 244 genes, some of which are involved in chemotaxis, Flp pilus biogenesis and pyruvate/2-oxoglutarate complex formation.
Chakraborty, Romy; Woo, Hannah; Dehal, Paramvir; ...
2017-02-08
Hexavalent chromium [Cr(VI)] is a widespread contaminant found in soil, sediment, and ground water at several DOE sites, including Hanford 100 H area. In order to stimulate microbially mediated reduction of Cr(VI) at this site, a poly-lactate hydrogen release compound was injected into the chromium-contaminated aquifer. The targeted enrichment of dominant nitrate-reducing bacteria post injection resulted in the isolation of Pseudomonas stutzeri strain RCH2. P. stutzeri strain RCH2 was isolated using acetate as the electron donor and is a complete denitrifier. Experiments with anaerobic washed cell suspensions of strain RCH2 revealed it could reduce Cr(VI) and Fe(III). We sequenced the genome of strain RCH2 using a combination of Illumina and 454 sequencing technologies; it contained a circular chromosome of 4.6 Mb and three plasmids. Furthermore, global genome comparisons of strain RCH2 with six other fully sequenced P. stutzeri strains revealed that most genomic regions are conserved; however, strain RCH2 has an additional 244 genes, some of which are involved in chemotaxis, Flp pilus biogenesis and pyruvate/2-oxoglutarate complex formation.
NASA Astrophysics Data System (ADS)
Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé
2006-08-01
The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families, make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
Reorganization of wheat and rye genomes in octoploid triticale (× Triticosecale).
Kalinka, Anna; Achrem, Magdalena
2018-04-01
The analysis of early generations of triticale showed numerous rearrangements of the genome. Complex transformations included loss of chromosomes, t-heterochromatin content changes and the emergence of retrotransposons in new locations. This study investigated certain aspects of genomic transformations in the early generations (F5 and F8) of the primary octoploid triticale derived from the cross of hexaploid wheat with diploid rye. Most of the plants tested were hypoploid; among the eliminated chromosomes were rye chromosomes 4R and 5R and a variable number of wheat chromosomes. Wheat chromosomes were eliminated to a higher extent. A lower content of telomeric heterochromatin was also found in rye chromosomes in comparison with parental rye. Studying the location of selected retrotransposons from Ty1-copia and Ty3-gypsy families using fluorescence in situ hybridization revealed additional locations of these retrotransposons that were not present in chromosomes of parental species. ISSR, IRAP and REMAP analyses showed significant changes at the level of specific DNA nucleotide sequences. In most cases, the disappearance of certain types of bands was observed; less frequently, new types of bands appeared that were not present in parental species. This demonstrates the scale of genome rearrangement and, above all, the elimination of wheat and rye sequences, largely due to the reduction of chromosome number. With regard to the proportion of wheat to rye genome, the rye genome was more affected by the changes, thus this study was focused more on the rye genome. Observations suggest that genome reorganization is not finished in the F5 generation but is still ongoing in the F8 generation.
Hahn, Christoph; Fromm, Bastian; Bachmann, Lutz
2014-01-01
The ectoparasitic Monogenea comprise a major part of the obligate parasitic flatworm diversity. Although genomic adaptations to parasitism have been studied in the endoparasitic tapeworms (Cestoda) and flukes (Trematoda), no representative of the Monogenea has been investigated yet. We present the high-quality draft genome of Gyrodactylus salaris, an economically important monogenean ectoparasite of wild Atlantic salmon (Salmo salar). A total of 15,488 gene models were identified, of which 7,102 were functionally annotated. The controversial phylogenetic relationships within the obligate parasitic Neodermata were resolved in a phylogenomic analysis using 1,719 gene models (alignment length of >500,000 amino acids) for a set of 16 metazoan taxa. The Monogenea were found basal to the Cestoda and Trematoda, which implies ectoparasitism being plesiomorphic within the Neodermata and strongly supports a common origin of complex life cycles. Comparative analysis of seven parasitic flatworm genomes identified shared genomic features for the ecto- and endoparasitic lineages, such as a substantial reduction of the core bilaterian gene complement, including the homeodomain-containing genes, and a loss of the piwi and vasa genes, which are considered essential for animal development. Furthermore, the shared loss of functional fatty acid biosynthesis pathways and the absence of peroxisomes, the latter organelles presumed ubiquitous in eukaryotes except for parasitic protozoans, were inferred. The draft genome of G. salaris opens the way for future in-depth analyses of pathogenicity and host specificity of poorly characterized G. salaris strains, and will enhance studies addressing the genomics of host–parasite interactions and speciation in the highly diverse monogenean flatworms. PMID:24732282
Evolution of Genome Size and Complexity in Pinus
Morse, Alison M.; Peterson, Daniel G.; Islam-Faridi, M. Nurul; Smith, Katherine E.; Magbanua, Zenaida; Garcia, Saul A.; Kubisiak, Thomas L.; Amerson, Henry V.; Carlson, John E.; Nelson, C. Dana; Davis, John M.
2009-01-01
Background: Genome evolution in the gymnosperm lineage of seed plants has given rise to many of the most complex and largest plant genomes, however the elements involved are poorly understood. Methodology/Principal Findings: Gymny is a previously undescribed retrotransposon family in Pinus that is related to Athila elements in Arabidopsis. Gymny elements are dispersed throughout the modern Pinus genome and occupy a physical space at least the size of the Arabidopsis thaliana genome. In contrast to previously described retroelements in Pinus, the Gymny family was amplified or introduced after the divergence of pine and spruce (Picea). If retrotransposon expansions are responsible for genome size differences within the Pinaceae, as they are in angiosperms, then they have yet to be identified. In contrast, molecular divergence of Gymny retrotransposons together with other families of retrotransposons can account for the large genome complexity of pines along with protein-coding genic DNA, as revealed by massively parallel DNA sequence analysis of Cot fractionated genomic DNA. Conclusions/Significance: Most of the enormous genome complexity of pines can be explained by divergence of retrotransposons, however the elements responsible for genome size variation are yet to be identified. Genomic resources for Pinus including those reported here should assist in further defining whether and how the roles of retrotransposons differ in the evolution of angiosperm and gymnosperm genomes. PMID:19194510
USDA-ARS's Scientific Manuscript database
The large and complex genome of bread wheat (Triticum aestivum L., ~17 Gb) requires high-resolution genome maps saturated with ordered markers to assist in anchoring and orienting BAC contigs/ sequence scaffolds for whole genome sequence assembly. Radiation hybrid (RH) mapping has proven to be an e...
USDA-ARS's Scientific Manuscript database
Modern biological analyses are often assisted by recent technologies making the sequencing of complex genomes both technically possible and feasible. We recently sequenced the tomato genome that, like many eukaryotic genomes, is large and complex. Current sequencing technologies allow the developmen...
Arshad, Arslan; Speth, Daan R.; de Graaf, Rob M.; Op den Camp, Huub J. M.; Jetten, Mike S. M.; Welte, Cornelia U.
2015-01-01
Methane oxidation is an important process to mitigate the emission of the greenhouse gas methane and further exacerbation of climate forcing. Both aerobic and anaerobic microorganisms have been reported to catalyze methane oxidation with only a few possible electron acceptors. Recently, new microorganisms were identified that could couple the oxidation of methane to nitrate or nitrite reduction. Here we investigated such an enrichment culture at the (meta)genomic level to establish a metabolic model of nitrate-driven anaerobic oxidation of methane (nitrate-AOM). Nitrate-AOM is catalyzed by an archaeon closely related to (reverse) methanogens that belongs to the ANME-2d clade, tentatively named Methanoperedens nitroreducens. Methane may be activated by methyl-CoM reductase and subsequently undergo full oxidation to carbon dioxide via reverse methanogenesis. All enzymes of this pathway were present and expressed in the investigated culture. The genome of the archaeal enrichment culture encoded a variety of enzymes involved in an electron transport chain similar to those found in Methanosarcina species with additional features not previously found in methane-converting archaea. Nitrate reduction to nitrite seems to be located in the pseudoperiplasm and may be catalyzed by an unusual Nar-like protein complex. A small part of the resulting nitrite is reduced to ammonium, which may be catalyzed by a Nrf-type nitrite reductase. One of the key questions is how electrons from cytoplasmically located reverse methanogenesis reach the nitrate reductase in the pseudoperiplasm. Electron transport in M. nitroreducens probably involves cofactor F420 in the cytoplasm, quinones in the cytoplasmic membrane and cytochrome c in the pseudoperiplasm. The membrane-bound electron transport chain includes F420H2 dehydrogenase and an unusual Rieske/cytochrome b complex.
Based on genome and transcriptome studies a tentative model of how central energy metabolism of nitrate-AOM could work is presented and discussed. PMID:26733968
Malmberg, M Michelle; Shi, Fan; Spangenberg, German C; Daetwyler, Hans D; Cogan, Noel O I
2018-01-01
Intensive breeding of Brassica napus has resulted in relatively low diversity, such that B. napus would benefit from germplasm improvement schemes that sustain diversity. As such, samples representative of global germplasm pools need to be assessed for existing population structure, diversity and linkage disequilibrium (LD). Complexity reduction genotyping-by-sequencing (GBS) methods, including GBS-transcriptomics (GBS-t), enable cost-effective screening of a large number of samples, while whole genome re-sequencing (WGR) delivers the ability to generate large numbers of unbiased genomic single nucleotide polymorphisms (SNPs), and identify structural variants (SVs). Furthermore, the development of genomic tools based on whole genomes representative of global oilseed diversity and orientated by the reference genome has substantial industry relevance and will be highly beneficial for canola breeding. As recent studies have focused on European and Chinese varieties, a global diversity panel as well as a substantial number of Australian spring types were included in this study. Focusing on industry relevance, 633 varieties were initially genotyped using GBS-t to examine population structure using 61,037 SNPs. Subsequently, 149 samples representative of global diversity were selected for WGR and both data sets used for a side-by-side evaluation of diversity and LD. The WGR data was further used to develop genomic resources consisting of a list of 4,029,750 high-confidence SNPs annotated using SnpEff, and SVs in the form of 10,976 deletions and 2,556 insertions. These resources form the basis of a reliable and repeatable system allowing greater integration between canola genomics studies, with a strong focus on breeding germplasm and industry applicability.
Schmidlen, Tara; Sturm, Amy C; Hovick, Shelly; Scheinfeldt, Laura; Scott Roberts, J; Morr, Lindsey; McElroy, Joseph; Toland, Amanda E; Christman, Michael; O'Daniel, Julianne M; Gordon, Erynn S; Bernhardt, Barbara A; Ormond, Kelly E; Sweet, Kevin
2018-02-19
With the advent of widespread genomic testing for diagnostic indications and disease risk assessment, there is increased need to optimize genetic counseling services to support the scalable delivery of precision medicine. Here, we describe how we operationalized the reciprocal engagement model of genetic counseling practice to develop a framework of counseling components and strategies for the delivery of genomic results. This framework was constructed based upon qualitative research with patients receiving genomic counseling following online receipt of potentially actionable complex disease and pharmacogenomics reports. Consultation with a transdisciplinary group of investigators, including practicing genetic counselors, was sought to ensure broad scope and applicability of these strategies for use with any large-scale genomic testing effort. We preserve the provision of pre-test education and informed consent as established in Mendelian/single-gene disease genetic counseling practice. Following receipt of genomic results, patients are afforded the opportunity to tailor the counseling agenda by selecting the specific test results they wish to discuss, specifying questions for discussion, and indicating their preference for counseling modality. The genetic counselor uses these patient preferences to set the genomic counseling session and to personalize result communication and risk reduction recommendations. Tailored visual aids and result summary reports divide areas of risk (genetic variant, family history, lifestyle) for each disease to facilitate discussion of multiple disease risks. Post-counseling, session summary reports are actively routed to both the patient and their physician team to encourage review and follow-up. Given the breadth of genomic information potentially resulting from genomic testing, this framework is put forth as a starting point to meet the need for scalable genetic counseling services in the delivery of precision medicine.
Fang, Lingzhao; Sahana, Goutam; Ma, Peipei; Su, Guosheng; Yu, Ying; Zhang, Shengli; Lund, Mogens Sandø; Sørensen, Peter
2017-08-10
A better understanding of the genetic architecture underlying complex traits (e.g., the distribution of causal variants and their effects) may aid in genomic prediction. Here, we hypothesized that the genomic variants of complex traits might be enriched in a subset of genomic regions defined by genes grouped on the basis of "Gene Ontology" (GO), and that incorporating this independent biological information into genomic prediction models might improve their predictive ability. Four complex traits (i.e., milk, fat and protein yields, and mastitis) together with imputed sequence variants in Holstein (HOL) and Jersey (JER) cattle were analysed. We first carried out a post-GWAS analysis in a HOL training population to assess the degree of enrichment of the association signals in the gene regions defined by each GO term. We then extended the genomic best linear unbiased prediction model (GBLUP) to a genomic feature BLUP (GFBLUP) model, including an additional genomic effect quantifying the joint effect of a group of variants located in a genomic feature. The GBLUP model using a single random effect assumes that all genomic variants contribute to the genomic relationship equally, whereas GFBLUP attributes different weights to the individual genomic relationships in the prediction equation based on the estimated genomic parameters. Our results demonstrate that the immune-relevant GO terms were more associated with mastitis than milk production, and several biologically meaningful GO terms improved the prediction accuracy with GFBLUP for the four traits, as compared with GBLUP. The improvement of the genomic prediction between breeds (the average increase across the four traits was 0.161) was more apparent than that within the HOL (the average increase across the four traits was 0.020).
Our genomic feature modelling approaches provide a framework to simultaneously explore the genetic architecture and genomic prediction of complex traits by taking advantage of independent biological knowledge.
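The contrast between GBLUP and GFBLUP described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `grm` and `gfblup_predict` are hypothetical, the relationship matrix uses a common VanRaden-style scaling that the abstract does not specify, the variance components are passed in rather than estimated (in practice they would typically come from REML), and the phenotype mean stands in for a properly estimated fixed effect.

```python
import numpy as np

def grm(genotypes):
    """Genomic relationship matrix from a 0/1/2 genotype matrix
    (rows = individuals, columns = markers), VanRaden-style scaling (assumed)."""
    p = genotypes.mean(axis=0) / 2.0           # allele frequencies per marker
    z = genotypes - 2.0 * p                    # centre each marker
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gfblup_predict(y, g_feature, g_rest, var_f, var_r, var_e):
    """Predict the two genomic effects of a GFBLUP-like model
    y = mu + g_f + g_r + e, given (assumed known) variance components."""
    n = len(y)
    # Phenotypic covariance: each relationship matrix gets its own weight
    v = var_f * g_feature + var_r * g_rest + var_e * np.eye(n)
    resid = y - y.mean()                       # simplification: mean as fixed effect
    alpha = np.linalg.solve(v, resid)
    g_f_hat = var_f * g_feature @ alpha        # effect of variants inside the feature
    g_r_hat = var_r * g_rest @ alpha           # effect of all remaining variants
    return g_f_hat, g_r_hat
```

Setting `var_f` to zero removes the separately weighted feature component, leaving a single-relationship-matrix model of the GBLUP kind built on the remaining markers; a large `var_f` relative to `var_r` gives the feature markers more weight in the prediction.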
Oishi, Wakana; Sano, Daisuke; Decrey, Loic; Kadoya, Syunsuke; Kohn, Tamar; Funamizu, Naoyuki
2017-11-15
Volume reduction (condensation) is a key for the practical usage of human urine as a fertilizer because it enables the saving of storage space and the reduction of transportation cost. However, concentrated urine may carry infectious disease risks resulting from human pathogens frequently present in excreta, though the survival of pathogens in concentrated urine is not well understood. In this study, the inactivation of MS2 coliphage, a surrogate for single-stranded RNA human enteric viruses, in concentrated synthetic urine was investigated. The infectious titer reduction of MS2 coliphage in synthetic urine samples was measured by plaque assay, and the reduction of genome copy number was monitored by reverse transcription-quantitative PCR (RT-qPCR). Among chemical-physical conditions such as pH and osmotic pressure, uncharged ammonia was shown to be the predominant factor responsible for MS2 inactivation, independently of urine concentration level. The reduction rate of the viral genome number varied among genome regions, but the comprehensive reduction rate of six genome regions was well correlated with that of the infectious titer of MS2 coliphage. This indicates that genome degradation is the main mechanism driving loss of infectivity, and that RT-qPCR targeting the six genome regions can be used as a culture-independent assay for monitoring infectivity loss of the coliphage in urine. MS2 inactivation rate constants were well predicted by a model using ion composition and speciation in synthetic urine samples, which suggests that MS2 infectivity loss can be estimated solely based on the solution composition, temperature and pH, without explicitly accounting for effects of osmotic pressure. Copyright © 2017 Elsevier B.V. All rights reserved.
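To make the notion of an inactivation rate constant concrete, the sketch below estimates one from a titer time series. It assumes simple first-order decay, ln(N/N0) = -k·t, which the abstract does not state explicitly; the function name `inactivation_rate_constant` and the example numbers are hypothetical.

```python
import numpy as np

def inactivation_rate_constant(times, titers):
    """Estimate a first-order rate constant k (per unit of time) as the
    least-squares slope of ln(N/N0) versus time, forced through the origin."""
    t = np.asarray(times, dtype=float)
    n = np.asarray(titers, dtype=float)
    ln_ratio = np.log(n / n[0])                # ln(N/N0), zero at the first point
    return -np.sum(t * ln_ratio) / np.sum(t * t)

# Hypothetical plaque-assay titers (PFU/mL) sampled daily
days = [0, 1, 2, 3, 4]
pfu = [1e8 * np.exp(-0.5 * d) for d in days]
k = inactivation_rate_constant(days, pfu)      # close to 0.5 per day for this synthetic series
```

The same function could be applied to genome copy numbers from RT-qPCR, which would allow the titer-based and genome-based rates to be compared side by side.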
Reproductive Mode and the Evolution of Genome Size and Structure in Caenorhabditis Nematodes
Fierst, Janna L.; Willis, John H.; Thomas, Cristel G.; Wang, Wei; Reynolds, Rose M.; Ahearne, Timothy E.; Cutter, Asher D.; Phillips, Patrick C.
2015-01-01
The self-fertile nematode worms Caenorhabditis elegans, C. briggsae, and C. tropicalis evolved independently from outcrossing male-female ancestors and have genomes 20-40% smaller than closely related outcrossing relatives. This pattern of smaller genomes for selfing species and larger genomes for closely related outcrossing species is also seen in plants. We use comparative genomics, including the first high quality genome assembly for an outcrossing member of the genus (C. remanei) to test several hypotheses for the evolution of genome reduction under a change in mating system. Unlike plants, it does not appear that reductions in the number of repetitive elements, such as transposable elements, are an important contributor to the change in genome size. Instead, all functional genomic categories are lost in approximately equal proportions. Theory predicts that self-fertilization should equalize the effective population size, as well as the resulting effects of genetic drift, between the X chromosome and autosomes. Contrary to this, we find that the self-fertile C. briggsae and C. elegans have larger intergenic spaces and larger protein-coding genes on the X chromosome when compared to autosomes, while C. remanei actually has smaller introns on the X chromosome than either self-reproducing species. Rather than being driven by mutational biases and/or genetic drift caused by a reduction in effective population size under self reproduction, changes in genome size in this group of nematodes appear to be caused by genome-wide patterns of gene loss, most likely generated by genomic adaptation to self reproduction per se. PMID:26114425
Lyu, Haomin; He, Ziwen; Wu, Chung-I; Shi, Suhua
2018-01-01
Several clades of mangrove trees independently invade the interface between land and sea at the margin of woody plant distribution. As phenotypic convergence among mangroves is common, the possibility of convergent adaptation in their genomes is quite intriguing. To study this molecular convergence, we sequenced multiple mangrove genomes. In this study, we focused on the evolution of transposable elements (TEs) in relation to the genome size evolution. TEs, generally considered genomic parasites, are the most common components of woody plant genomes. Analyzing the long terminal repeat-retrotransposon (LTR-RT) type of TE, we estimated their death rates by counting solo-LTRs and truncated elements. We found that all lineages of mangroves massively and convergently reduce TE loads in comparison to their nonmangrove relatives; as a consequence, genome size reduction happens independently in all six mangrove lineages; TE load reduction in mangroves can be attributed to the paucity of young elements; the rarity of young LTR-RTs is a consequence of fewer births rather than excess death. In conclusion, mangrove genomes employ a convergent strategy of TE load reduction by suppressing element origination in their independent adaptation to a new environment. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Lin, Choun-Sea; Chen, Jeremy J W; Chiu, Chi-Chou; Hsiao, Han C W; Yang, Chen-Jui; Jin, Xiao-Hua; Leebens-Mack, James; de Pamphilis, Claude W; Huang, Yao-Ting; Yang, Ling-Hung; Chang, Wan-Jung; Kui, Ling; Wong, Gane Ka-Shu; Hu, Jer-Ming; Wang, Wen; Shih, Ming-Che
2017-06-01
The chloroplast NAD(P)H dehydrogenase-like (NDH) complex consists of about 30 subunits from both the nuclear and chloroplast genomes and is ubiquitous across most land plants. In some orchids, such as Phalaenopsis equestris, Dendrobium officinale and Dendrobium catenatum, most of the 11 chloroplast genome-encoded ndh genes (cp-ndh) have been lost. Here we investigated whether functional cp-ndh genes have been completely lost in these orchids or whether they have been transferred and retained in the nuclear genome. Further, we assessed whether both cp-ndh genes and nucleus-encoded NDH-related genes can be lost, resulting in the absence of the NDH complex. Comparative analyses of the genome of Apostasia odorata, an orchid species with a complete complement of cp-ndh genes which represents the sister lineage to all other orchids, and three published orchid genome sequences for P. equestris, D. officinale and D. catenatum, which are all missing cp-ndh genes, indicated that copies of cp-ndh genes are not present in any of these four nuclear genomes. This observation suggests that the NDH complex is not necessary for some plants. Comparative genomic/transcriptomic analyses of currently available plastid genome sequences and nuclear transcriptome data showed that 47 out of 660 photoautotrophic plants and all the heterotrophic plants are missing plastid-encoded cp-ndh genes and exhibit no evidence for maintenance of a functional NDH complex. Our data indicate that the NDH complex can be lost in photoautotrophic plant species. Further, the loss of the NDH complex may increase the probability of transition from a photoautotrophic to a heterotrophic life history. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Evolution and the complexity of bacteriophages.
Serwer, Philip
2007-03-13
The genomes of both long-genome (> 200 Kb) bacteriophages and long-genome eukaryotic viruses have cellular gene homologs whose selective advantage is not explained. These homologs add genomic and possibly biochemical complexity. Understanding their significance requires a definition of complexity that is more biochemically oriented than past empirically based definitions. Initially, I propose two biochemistry-oriented definitions of complexity: either decreased randomness or increased encoded information that does not serve immediate needs. Then, I make the assumption that these two definitions are equivalent. This assumption and recent data lead to the following four-part hypothesis that explains the presence of cellular gene homologs in long bacteriophage genomes and also provides a pathway for complexity increases in prokaryotic cells: (1) Prokaryotes underwent evolutionary increases in biochemical complexity after the eukaryote/prokaryote splits. (2) Some of the complexity increases occurred via multi-step, weak selection that was both protected from strong selection and accelerated by embedding evolving cellular genes in the genomes of bacteriophages and, presumably, also archaeal viruses (first tier selection). (3) The mechanisms for retaining cellular genes in viral genomes evolved under additional, longer-term selection that was stronger (second tier selection). (4) The second tier selection was based on increased access by prokaryotic cells to improved biochemical systems. This access was achieved when DNA transfer moved to prokaryotic cells both the more evolved genes and their more competitive and complex biochemical systems. 
I propose testing this hypothesis by controlled evolution in microbial communities to (1) determine the effects of deleting individual cellular gene homologs on the growth and evolution of long genome bacteriophages and hosts, (2) find the environmental conditions that select for the presence of cellular gene homologs, (3) determine which, if any, bacteriophage genes were selected for maintaining the homologs and (4) determine the dynamics of homolog evolution. This hypothesis is an explanation of evolutionary leaps in general. If accurate, it will assist both understanding and influencing the evolution of microbes and their communities. Analysis of evolutionary complexity increase for at least prokaryotes should include analysis of genomes of long-genome bacteriophages.
Multichromosomal median and halving problems under different genomic distances
Tannier, Eric; Zheng, Chunfang; Sankoff, David
2009-01-01
Background Genome median and genome halving are combinatorial optimization problems that aim at reconstructing ancestral genomes as well as the evolutionary events leading from the ancestor to extant species. Exploring complexity issues is a first step towards devising efficient algorithms. The complexity of the median problem for unichromosomal genomes (permutations) has been settled for both the breakpoint distance and the reversal distance. Although the multichromosomal case has often been assumed to be a simple generalization of the unichromosomal case, it is also a relaxation so that complexity in this context does not follow from existing results, and is open for all distances. Results We settle here the complexity of several genome median and halving problems, including a surprising polynomial result for the breakpoint median and guided halving problems in genomes with circular and linear chromosomes, showing that the multichromosomal problem is actually easier than the unichromosomal problem. Still other variants of these problems are NP-complete, including the DCJ double distance problem, previously mentioned as an open question. We list the remaining open problems. Conclusion This theoretical study clears up a wide swathe of the algorithmical study of genome rearrangements with multiple multichromosomal genomes. PMID:19386099
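As a rough illustration of the breakpoint distance mentioned in the abstract above, the following Python sketch compares two multichromosomal genomes. The abstract does not state the formula; the sketch assumes the definition commonly used in the rearrangement literature, d = n - a - e/2, where n is the number of genes, a the number of shared adjacencies and e the number of shared telomeres. The data layout (signed integers for genes, a boolean for circular chromosomes) and all function names are hypothetical choices made for this example.

```python
# Sketch only: assumes the breakpoint distance d = n - a - e/2
# (a = shared adjacencies, e = shared telomeres); not taken from the abstract.

def _ends(gene):
    """Return (left, right) extremities of a signed gene; 't' = tail, 'h' = head."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))

def adjacencies_and_telomeres(genome):
    """genome: list of (signed gene order, is_circular) pairs, one per chromosome."""
    adjacencies, telomeres = set(), set()
    for genes, circular in genome:
        # Consecutive genes share an adjacency between facing extremities.
        for a, b in zip(genes, genes[1:]):
            adjacencies.add(frozenset((_ends(a)[1], _ends(b)[0])))
        if circular:
            # A circular chromosome closes with one extra adjacency.
            adjacencies.add(frozenset((_ends(genes[-1])[1], _ends(genes[0])[0])))
        else:
            # A linear chromosome contributes two telomeres.
            telomeres.add(_ends(genes[0])[0])
            telomeres.add(_ends(genes[-1])[1])
    return adjacencies, telomeres

def breakpoint_distance(genome_a, genome_b):
    """Breakpoint distance between two genomes on the same gene set."""
    adj_a, tel_a = adjacencies_and_telomeres(genome_a)
    adj_b, tel_b = adjacencies_and_telomeres(genome_b)
    n = sum(len(genes) for genes, _ in genome_a)
    return n - len(adj_a & adj_b) - len(tel_a & tel_b) / 2

if __name__ == "__main__":
    reference = [([1, 2, 3], False)]
    inverted = [([1, -2, 3], False)]   # gene 2 inverted in place
    print(breakpoint_distance(reference, inverted))  # 2.0
```

Reading a linear chromosome from the other end (for example `[-3, -2, -1]` instead of `[1, 2, 3]`) yields the same adjacencies and telomeres, so the distance is unaffected by orientation.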
Genomic and Genetic Diversity within the Pseudomonas fluorescens Complex
Garrido-Sanz, Daniel; Meier-Kolthoff, Jan P.; Göker, Markus; Martín, Marta; Rivilla, Rafael; Redondo-Nieto, Miguel
2016-01-01
The Pseudomonas fluorescens complex includes Pseudomonas strains that have been taxonomically assigned to more than fifty different species, many of which have been described as plant growth-promoting rhizobacteria (PGPR) with potential applications in biocontrol and biofertilization. So far the phylogeny of this complex has been analyzed according to phenotypic traits, 16S rDNA, MLSA and inferred by whole-genome analysis. However, since most of the type strains have not been fully sequenced and new species are frequently described, correlation between taxonomy and phylogenomic analysis is missing. In recent years, the genomes of a large number of strains have been sequenced, showing important genomic heterogeneity and providing information suitable for genomic studies that are important to understand the genomic and genetic diversity shown by strains of this complex. Based on MLSA and several whole-genome sequence-based analyses of 93 sequenced strains, we have divided the P. fluorescens complex into eight phylogenomic groups that agree with previous works based on type strains. Digital DDH (dDDH) identified 69 species and 75 subspecies within the 93 genomes. The eight groups corresponded to clustering with a threshold of 31.8% dDDH, in full agreement with our MLSA. The Average Nucleotide Identity (ANI) approach showed inconsistencies regarding the assignment to species and to the eight groups. The small core genome of 1,334 CDSs and the large pan-genome of 30,848 CDSs, show the large diversity and genetic heterogeneity of the P. fluorescens complex. However, a low number of strains were enough to explain most of the CDSs diversity at core and strain-specific genomic fractions. Finally, the identification and analysis of group-specific genome and the screening for distinctive characters revealed a phylogenomic distribution of traits among the groups that provided insights into biocontrol and bioremediation applications as well as their role as PGPR. PMID:26915094
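The core genome and pan-genome figures quoted above can be understood as set operations over per-strain gene content. The short Python sketch below shows the idea on invented data: the strain names and gene-family labels are hypothetical, and real analyses would first cluster CDSs into orthologous families, a step not shown here.

```python
# Sketch only: toy strains and gene families, not data from the study.

def core_and_pan(strain_genes):
    """Return (core, pan) gene-family sets for a dict of strain -> families."""
    gene_sets = [set(families) for families in strain_genes.values()]
    core = set.intersection(*gene_sets)   # families present in every strain
    pan = set.union(*gene_sets)           # families present in at least one strain
    return core, pan

def strain_specific(strain_genes):
    """Return, per strain, the families found in no other strain."""
    result = {}
    for name, families in strain_genes.items():
        others = set()
        for other_name, other_families in strain_genes.items():
            if other_name != name:
                others |= set(other_families)
        result[name] = set(families) - others
    return result

if __name__ == "__main__":
    toy = {
        "strain_A": {"gyrB", "rpoD", "phlD"},
        "strain_B": {"gyrB", "rpoD", "hcnA"},
        "strain_C": {"gyrB", "rpoD", "phlD", "pltA"},
    }
    core, pan = core_and_pan(toy)
    print(sorted(core), len(pan))
    print(strain_specific(toy))
```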
Stelzer, Claus-Peter; Riss, Simone; Stadler, Peter
2011-04-07
Background Studies on genome size variation in animals are rarely done at lower taxonomic levels, e.g., slightly above/below the species level. Yet, such variation might provide important clues on the tempo and mode of genome size evolution. In this study we used the flow-cytometry method to study the evolution of genome size in the rotifer Brachionus plicatilis, a cryptic species complex consisting of at least 14 closely related species. Results We found an unexpectedly high variation in this species complex, with genome sizes ranging approximately seven-fold (haploid '1C' genome sizes: 0.056-0.416 pg). Most of this variation (67%) could be ascribed to the major clades of the species complex, i.e. clades that are well separated according to most species definitions. However, we also found substantial variation (32%) at lower taxonomic levels - within and among genealogical species - and, interestingly, among species pairs that are not completely reproductively isolated. In one genealogical species, called B. 'Austria', we found greatly enlarged genome sizes that could roughly be approximated as multiples of the genomes of its closest relatives, which suggests that whole-genome duplications have occurred early during separation of this lineage. Overall, genome size was significantly correlated to egg size and body size, even though the latter became non-significant after controlling for phylogenetic non-independence. Conclusions Our study suggests that substantial genome size variation can build up early during speciation, potentially even among isolated populations. An alternative, but not mutually exclusive interpretation might be that reproductive isolation tends to build up unusually slow in this species complex. PMID:21473744
Diversity arrays technology: a generic genome profiling technology on open platforms.
Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz
2012-01-01
In the last 20 years, we have observed an exponential growth of DNA sequence data and a similar increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, has concentrated on the human genome and the genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests tens of thousands of genomic loci for polymorphism, with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.
Metabolic 'engines' of flight drive genome size reduction in birds.
Wright, Natalie A; Gregory, T Ryan; Witt, Christopher C
2014-03-22
The tendency for flying organisms to possess small genomes has been interpreted as evidence of natural selection acting on the physical size of the genome. Nonetheless, the flight-genome link and its mechanistic basis have yet to be well established by comparative studies within a volant clade. Is there a particular functional aspect of flight such as brisk metabolism, lift production or maneuverability that impinges on the physical genome? We measured genome sizes, wing dimensions and heart, flight muscle and body masses from a phylogenetically diverse set of bird species. In phylogenetically controlled analyses, we found that genome size was negatively correlated with relative flight muscle size and heart index (i.e. ratio of heart to body mass), but positively correlated with body mass and wing loading. The proportional masses of the flight muscles and heart were the most important parameters explaining variation in genome size in multivariate models. Hence, the metabolic intensity of powered flight appears to have driven genome size reduction in birds.
Recovery of temperate Desulfovibrio vulgaris bacteriophage on a novel host strain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walker, C.B.; Stolyar, S.S.; Pinel, N.
2007-04-02
A novel sulfate-reducing bacterium (strain DePue) closely related to Desulfovibrio vulgaris ssp. vulgaris strain Hildenborough was isolated from the sediment of a heavy-metal impacted lake using established techniques. Although few physiological differences between strains DePue and Hildenborough were observed, pulsed-field gel electrophoresis (PFGE) revealed a significant genome reduction in strain DePue. Comparative whole-genome microarray and PCR analyses demonstrated that the absence of genes annotated in the Hildenborough genome as phage or phage-related contributed to the significant genome reduction in strain DePue. Two morphotypically distinct temperate bacteriophage from strain Hildenborough were recovered using strain DePue as a host for plaque isolation.
Carlson, Hanqian L; Quinn, Jeffrey J; Yang, Yul W; Thornburg, Chelsea K; Chang, Howard Y; Stadler, H Scott
2015-12-01
Gene expression profiling in E 11 mouse embryos identified high expression of the long noncoding RNA (lncRNA), LNCRNA-HIT in the undifferentiated limb mesenchyme, gut, and developing genital tubercle. In the limb mesenchyme, LncRNA-HIT was found to be retained in the nucleus, forming a complex with p100 and CBP. Analysis of the genome-wide distribution of LncRNA-HIT-p100/CBP complexes by ChIRP-seq revealed LncRNA-HIT associated peaks at multiple loci in the murine genome. Ontological analysis of the genes contacted by LncRNA-HIT-p100/CBP complexes indicate a primary role for these loci in chondrogenic differentiation. Functional analysis using siRNA-mediated reductions in LncRNA-HIT or p100 transcripts revealed a significant decrease in expression of many of the LncRNA-HIT-associated loci. LncRNA-HIT siRNA treatments also impacted the ability of the limb mesenchyme to form cartilage, reducing mesenchymal cell condensation and the formation of cartilage nodules. Mechanistically the LncRNA-HIT siRNA treatments impacted pro-chondrogenic gene expression by reducing H3K27ac or p100 activity, confirming that LncRNA-HIT is essential for chondrogenic differentiation in the limb mesenchyme. Taken together, these findings reveal a fundamental epigenetic mechanism functioning during early limb development, using LncRNA-HIT and its associated proteins to promote the expression of multiple genes whose products are necessary for the formation of cartilage. PMID:26633036
Ataman, Meric
2017-01-01
Genome-scale metabolic reconstructions have proven to be valuable resources in enhancing our understanding of metabolic networks as they encapsulate all known metabolic capabilities of the organisms from genes to proteins to their functions. However the complexity of these large metabolic networks often hinders their utility in various practical applications. Although reduced models are commonly used for modeling and in integrating experimental data, they are often inconsistent across different studies and laboratories due to different criteria and detail, which can compromise transferability of the findings and also integration of experimental data from different groups. In this study, we have developed a systematic semi-automatic approach to reduce genome-scale models into core models in a consistent and logical manner focusing on the central metabolism or subsystems of interest. The method minimizes the loss of information using an approach that combines graph-based search and optimization methods. The resulting core models are shown to be able to capture key properties of the genome-scale models and preserve consistency in terms of biomass and by-product yields, flux and concentration variability and gene essentiality. The development of these “consistently-reduced” models will help to clarify and facilitate integration of different experimental data to draw new understanding that can be directly extendable to genome-scale models. PMID:28727725
The Blueprint of a Minimal Cell: MiniBacillus
Reuß, Daniel R.; Commichau, Fabian M.; Gundlach, Jan; Zhu, Bingyao
2016-01-01
SUMMARY Bacillus subtilis is one of the best-studied organisms. Due to the broad knowledge and annotation and the well-developed genetic system, this bacterium is an excellent starting point for genome minimization with the aim of constructing a minimal cell. We have analyzed the genome of B. subtilis and selected all genes that are required to allow life in complex medium at 37°C. This selection is based on the known information on essential genes and functions as well as on gene and protein expression data and gene conservation. The list presented here includes 523 and 119 genes coding for proteins and RNAs, respectively. These proteins and RNAs are required for the basic functions of life in information processing (replication and chromosome maintenance, transcription, translation, protein folding, and secretion), metabolism, cell division, and the integrity of the minimal cell. The completeness of the selected metabolic pathways, reactions, and enzymes was verified by the development of a model of metabolism of the minimal cell. A comparison of the MiniBacillus genome to the recently reported designed minimal genome of Mycoplasma mycoides JCVI-syn3.0 indicates excellent agreement in the information-processing pathways, whereas each species has a metabolism that reflects specific evolution and adaptation. The blueprint of MiniBacillus presented here serves as the starting point for a successive reduction of the B. subtilis genome. PMID:27681641
Fourie, Gerda; van der Merwe, Nicolaas A; Wingfield, Brenda D; Bogale, Mesfin; Tudzynski, Bettina; Wingfield, Michael J; Steenkamp, Emma T
2013-09-08
The availability of mitochondrial genomes has allowed for the resolution of numerous questions regarding the evolutionary history of fungi and other eukaryotes. In the Gibberella fujikuroi species complex, the exact relationships among the so-called "African", "Asian" and "American" Clades remain largely unresolved, irrespective of the markers employed. In this study, we considered the feasibility of using mitochondrial genes to infer the phylogenetic relationships among Fusarium species in this complex. The mitochondrial genomes of representatives of the three Clades (Fusarium circinatum, F. verticillioides and F. fujikuroi) were characterized and we determined whether or not the mitochondrial genomes of these fungi have value in resolving the higher level evolutionary relationships in the complex. Overall, the mitochondrial genomes of the three species displayed a high degree of synteny, with all the genes (protein coding genes, unique ORFs, ribosomal RNA and tRNA genes) in identical order and orientation, as well as introns that share similar positions within genes. The intergenic regions and introns generally contributed significantly to the size differences and diversity observed among these genomes. Phylogenetic analysis of the concatenated protein-coding dataset separated members of the Gibberella fujikuroi complex from other Fusarium species and suggested that F. fujikuroi ("Asian" Clade) is basal in the complex. However, individual mitochondrial gene trees were largely incongruent with one another and with the concatenated gene tree, because six distinct phylogenetic trees were recovered from the various single gene datasets. The mitochondrial genomes of Fusarium species in the Gibberella fujikuroi complex are remarkably similar to those of the previously characterized Fusarium species and Sordariomycetes. 
Despite apparently representing a single replicative unit, all of the genes encoded on the mitochondrial genomes of these fungi do not share the same evolutionary history. This incongruence could be due to biased selection on some genes or recombination among mitochondrial genomes. The results thus suggest that the use of individual mitochondrial genes for phylogenetic inference could mask the true relationships between species in this complex.
Nuclear import of viral DNA genomes.
Greber, Urs F; Fassati, Ariberto
2003-03-01
The genomes of many viruses traffic into the nucleus, where they are either integrated into host chromosomes or maintained as episomal DNA and then transcriptionally activated or silenced. Here, we discuss the existing evidence on how the lentiviruses, adenoviruses, herpesviruses, hepadnaviruses and autonomous parvoviruses enter the nucleus. Depending on the size of the capsid enclosing the genome, three principles of viral nucleic acids import are discussed. The first principle is that the capsid disassembles in the cytosol or in a docked state at the nuclear pore complex and a subviral genomic complex is trafficked through the pore. Second, the genome is injected from a capsid that is docked to the pore complex, and third, import factors are recruited to cytosolic capsids to increase capsid affinity to the pore complex, mediate translocation and allow disassembly in the nucleoplasm.
Sequence-Based Genotyping for Marker Discovery and Co-Dominant Scoring in Germplasm and Populations
Truong, Hoa T.; Ramos, A. Marcos; Yalcin, Feyruz; de Ruiter, Marjo; van der Poel, Hein J. A.; Huvenaars, Koen H. J.; Hogers, René C. J.; van Enckevort, Leonora. J. G.; Janssen, Antoine; van Orsouw, Nathalie J.; van Eijk, Michiel J. T.
2012-01-01
Conventional marker-based genotyping platforms are widely available, but not without their limitations. In this context, we developed Sequence-Based Genotyping (SBG), a technology for simultaneous marker discovery and co-dominant scoring, using next-generation sequencing. SBG offers users several advantages including a generic sample preparation method, a highly robust genome complexity reduction strategy to facilitate de novo marker discovery across entire genomes, and a uniform bioinformatics workflow strategy to achieve genotyping goals tailored to individual species, regardless of the availability of a reference sequence. The most distinguishing features of this technology are the ability to genotype any population structure, regardless of whether parental data are included, and the ability to co-dominantly score SNP markers segregating in populations. To demonstrate the capabilities of SBG, we performed marker discovery and genotyping in Arabidopsis thaliana and lettuce, two plant species of diverse genetic complexity and backgrounds. Initially we obtained 1,409 SNPs for arabidopsis, and 5,583 SNPs for lettuce. Further filtering of the SNP dataset produced over 1,000 high quality SNP markers for each species. We obtained a genotyping rate of 201.2 genotypes/SNP and 58.3 genotypes/SNP for arabidopsis (n = 222 samples) and lettuce (n = 87 samples), respectively. Linkage mapping using these SNPs resulted in stable map configurations. We have therefore shown that the SBG approach presented provides users with the utmost flexibility in garnering high quality markers that can be directly used for genotyping and downstream applications. Until advances and costs allow for routine whole-genome sequencing of populations, we expect that sequence-based genotyping technologies such as SBG will be essential for genotyping of model and non-model genomes alike. PMID:22662172
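The genotyping rates reported above are easier to compare when expressed as the fraction of samples scored per SNP. The Python sketch below does that arithmetic using only the figures quoted in the abstract; the function name `call_rate` and the interpretation of "genotypes/SNP" as the mean number of samples with a called genotype at each SNP are assumptions made for this example.

```python
# Sketch only: interprets "genotypes/SNP" as mean called samples per SNP (an assumption).

def call_rate(genotypes_per_snp, n_samples):
    """Fraction of samples with a called genotype, averaged over SNPs."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return genotypes_per_snp / n_samples

if __name__ == "__main__":
    # Values as quoted in the abstract.
    datasets = {"arabidopsis": (201.2, 222), "lettuce": (58.3, 87)}
    for species, (per_snp, n) in datasets.items():
        print(f"{species}: {call_rate(per_snp, n):.1%}")
    # arabidopsis: 90.6%
    # lettuce: 67.0%
```

Under this reading, roughly nine in ten arabidopsis samples and two in three lettuce samples were scored at a typical SNP.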
Weiss, Victor U; Bliem, Christina; Gösler, Irene; Fedosyuk, Sofiya; Kratzmeier, Martin; Blaas, Dieter; Allmaier, Günter
2016-06-01
Liquid-phase electrophoresis either in the classical capillary format or miniaturized (chip CE) is a valuable tool for quality control of virus preparations and for targeting questions related to conformational changes of viruses during infection. We present an in vitro assay to follow the release of the RNA genome from a human rhinovirus (common cold virus) by using a molecular beacon (MB) and chip CE. The MB, a probe that becomes fluorescent upon hybridization to a complementary sequence, was designed to bind close to the 3' end of the viral genome. Addition of Trolox (6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid), a well-known additive for reduction of bleaching and blinking of fluorophores in fluorescence microscopy, to the background electrolyte increased the sensitivity of our chip CE set-up. Hence, a fast, sensitive and straightforward method for the detection of viral RNA is introduced. Additionally, challenges of our assay will be discussed. In particular, we found that (i) desalting of virus preparations prior to analysis increased the recorded signal and (ii) the MB-RNA complex signal decreased with the time of virus storage at -70 °C. This suggests that 3'-proximal sequences of the viral RNA, if not the whole genome, underwent degradation during storage and/or freezing and thawing. In summary, we demonstrate, for two independent virus batches, that chip electrophoresis can be used to monitor MB hybridization to RNA released upon incubation of the native virus at 56 °C. Graphical abstract: schematic of the study strategy. RNA released from HRV-A2 is detected by chip electrophoresis through the increase in fluorescence after genome complexation to a cognate molecular beacon.
Lateral gene transfer and the origins of prokaryotic groups.
Boucher, Yan; Douady, Christophe J; Papke, R Thane; Walsh, David A; Boudreau, Mary Ellen R; Nesbø, Camilla L; Case, Rebecca J; Doolittle, W Ford
2003-01-01
Lateral gene transfer (LGT) is now known to be a major force in the evolution of prokaryotic genomes. To date, most analyses have focused on either (a) verifying phylogenies of individual genes thought to have been transferred, or (b) estimating the fraction of individual genomes likely to have been introduced by transfer. Neither approach does justice to the ability of LGT to effect massive and complex transformations in basic biology. In some cases, such transformation will be manifested as the patchy distribution of a seemingly fundamental property (such as aerobiosis or nitrogen fixation) among the members of a group classically defined by the sharing of other properties (metabolic, morphological, or molecular, such as small subunit ribosomal RNA sequence). In other cases, the lineage of recipients so transformed may be seen to comprise a new group of high taxonomic rank ("class" or even "phylum"). Here we review evidence for an important role of LGT in the evolution of photosynthesis, aerobic respiration, nitrogen fixation, sulfate reduction, methylotrophy, isoprenoid biosynthesis, quorum sensing, flotation (gas vesicles), thermophily, and halophily. Sometimes transfer of complex gene clusters may have been involved, whereas other times separate exchanges of many genes must be invoked.
Molecular Mapping of Restriction-Site Associated DNA Markers In Allotetraploid Upland Cotton.
Wang, Yangkun; Ning, Zhiyuan; Hu, Yan; Chen, Jiedan; Zhao, Rui; Chen, Hong; Ai, Nijiang; Guo, Wangzhen; Zhang, Tianzhen
2015-01-01
Upland cotton (Gossypium hirsutum L., 2n = 52, AADD) is an allotetraploid; therefore, the discovery of single nucleotide polymorphism (SNP) markers is difficult. The recent emergence of genome complexity reduction technologies based on the next-generation sequencing (NGS) platform has greatly expedited SNP discovery in crops with highly repetitive and complex genomes. Here we applied restriction-site associated DNA (RAD) sequencing technology for de novo SNP discovery in allotetraploid cotton. We identified 21,109 SNPs between the two parents and used these for genotyping of 161 recombinant inbred lines (RILs). Finally, a high-density linkage map comprising 4,153 loci over 3500 cM was developed based on the previous result. Using this map, quantitative trait loci (QTLs) conferring fiber strength and Verticillium Wilt (VW) resistance were mapped to a more accurate region in comparison to the 1576-cM interval determined using the simple sequence repeat (SSR) genetic map. This suggests that the newly constructed map has more power and resolution than the previous SSR map. It will pave the way for rapid marker-assisted selection in cotton breeding and for cloning of QTLs for traits of interest.
Yuan, Bo; Liu, Pengfei; Gupta, Aditya; Beck, Christine R.; Tejomurtula, Anusha; Campbell, Ian M.; Gambin, Tomasz; Simmons, Alexandra D.; Withers, Marjorie A.; Harris, R. Alan; Rogers, Jeffrey; Schwartz, David C.; Lupski, James R.
2015-01-01
Many loci in the human genome harbor complex genomic structures that can result in susceptibility to genomic rearrangements leading to various genomic disorders. Nephronophthisis 1 (NPHP1, MIM# 256100) is an autosomal recessive disorder that can be caused by defects of NPHP1; the gene maps within the human 2q13 region where low copy repeats (LCRs) are abundant. Loss of function of NPHP1 is responsible for approximately 85% of the NPHP1 cases—about 80% of such individuals carry a large recurrent homozygous NPHP1 deletion that occurs via nonallelic homologous recombination (NAHR) between two flanking directly oriented ~45 kb LCRs. Published data revealed a non-pathogenic inversion polymorphism involving the NPHP1 gene flanked by two inverted ~358 kb LCRs. Using optical mapping and array-comparative genomic hybridization, we identified three potential novel structural variant (SV) haplotypes at the NPHP1 locus that may protect a haploid genome from the NPHP1 deletion. Inter-species comparative genomic analyses among primate genomes revealed massive genomic changes during evolution. The aggregated data suggest that dynamic genomic rearrangements occurred historically within the NPHP1 locus and generated SV haplotypes observed in the human population today, which may confer differential susceptibility to genomic instability and the NPHP1 deletion within a personal genome. Our study documents diverse SV haplotypes at a complex LCR-laden human genomic region. Comparative analyses provide a model for how this complex region arose during primate evolution, and studies among humans suggest that intra-species polymorphism may potentially modulate an individual’s susceptibility to acquiring disease-associated alleles. PMID:26641089
Pyne, Michael E; Liu, Xuejia; Moo-Young, Murray; Chung, Duane A; Chou, C Perry
2016-09-19
Clostridium pasteurianum is emerging as a prospective host for the production of biofuels and chemicals, and has recently been shown to directly consume electric current. Despite this growing biotechnological appeal, the organism's genetics and central metabolism remain poorly understood. Here we present a concurrent genome sequence for the C. pasteurianum type strain and provide extensive genomic analysis of the organism's defence mechanisms and central fermentative metabolism. Next generation genome sequencing produced reads corresponding to spontaneous excision of a novel phage, designated φ6013, which could be induced using mitomycin C and detected using PCR and transmission electron microscopy. Methylome analysis of sequencing reads provided a near-complete glimpse into the organism's restriction-modification systems. We also unveiled the chief C. pasteurianum Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) locus, which was found to exemplify a Type I-B system. Finally, we show that C. pasteurianum possesses a highly complex fermentative metabolism whereby the metabolic pathways enlisted by the cell are governed by the degree of reductance of the substrate. Four distinct fermentation profiles, ranging from exclusively acidogenic to predominantly alcohologenic, were observed through redox consideration of the substrate. A detailed discussion of the organism's central metabolism within the context of metabolic engineering is provided.
Manzoor, Shahid; Schnürer, Anna; Müller, Bettina
2018-01-01
Syntrophic acetate oxidation operates close to the thermodynamic equilibrium and very little is known about the participating organisms and their metabolism. Clostridium ultunense is one of the most abundant syntrophic acetate-oxidising bacteria (SAOB) that are found in engineered biogas processes operating with high ammonia concentrations. It has been proven to oxidise acetate in cooperation with hydrogenotrophic methanogens. There is evidence that the Wood-Ljungdahl (WL) pathway plays an important role in acetate oxidation. In this study, we analysed the physiological and metabolic capacities of C. ultunense strain Esp and strain BST on genome scale and conducted a comparative study of all the known characterised SAOB, namely Syntrophaceticus schinkii, Thermacetogenium phaeum, Tepidanaerobacter acetatoxydans, and Pseudothermotoga lettingae. The results clearly indicated physiological robustness to be beneficial for anaerobic digestion environments and revealed unexpected metabolic diversity with respect to acetate oxidation and energy conservation systems. Unlike S. schinkii and Th. phaeum, C. ultunense clearly does not employ the oxidative WL pathway for acetate oxidation, as its genome (and that of P. lettingae) lacks important key genes. In both of those species, a proton motive force is likely formed by chemical protons involving putative electron-bifurcating [Fe-Fe] hydrogenases rather than proton pumps. No genes encoding a respiratory Ech (energy-converting hydrogenase), as involved in energy conservation in Th. phaeum and S. schinkii, were identified in C. ultunense and P. lettingae. Moreover, two respiratory complexes sharing similarities to the proton-translocating ferredoxin:NAD+ oxidoreductase (Rnf) and the Na+ pumping NADH:quinone hydrogenase (NQR) were predicted. These might form a respiratory chain that is involved in the reduction of electron acceptors rather than protons. However, involvement of these complexes in acetate oxidation in C. ultunense and P. lettingae needs further study. This genome-based comparison provides a solid platform for future meta-proteomics and meta-transcriptomics studies and for metabolic engineering, control, and monitoring of SAOB. PMID:29690652
Manzoor, Shahid; Schnürer, Anna; Bongcam-Rudloff, Erik; Müller, Bettina
2018-04-23
Syntrophic acetate oxidation operates close to the thermodynamic equilibrium and very little is known about the participating organisms and their metabolism. Clostridium ultunense is one of the most abundant syntrophic acetate-oxidising bacteria (SAOB) that are found in engineered biogas processes operating with high ammonia concentrations. It has been proven to oxidise acetate in cooperation with hydrogenotrophic methanogens. There is evidence that the Wood-Ljungdahl (WL) pathway plays an important role in acetate oxidation. In this study, we analysed the physiological and metabolic capacities of C. ultunense strain Esp and strain BST on genome scale and conducted a comparative study of all the known characterised SAOB, namely Syntrophaceticus schinkii, Thermacetogenium phaeum, Tepidanaerobacter acetatoxydans, and Pseudothermotoga lettingae. The results clearly indicated physiological robustness to be beneficial for anaerobic digestion environments and revealed unexpected metabolic diversity with respect to acetate oxidation and energy conservation systems. Unlike S. schinkii and Th. phaeum, C. ultunense clearly does not employ the oxidative WL pathway for acetate oxidation, as its genome (and that of P. lettingae) lacks important key genes. In both of those species, a proton motive force is likely formed by chemical protons involving putative electron-bifurcating [Fe-Fe] hydrogenases rather than proton pumps. No genes encoding a respiratory Ech (energy-converting hydrogenase), as involved in energy conservation in Th. phaeum and S. schinkii, were identified in C. ultunense and P. lettingae. Moreover, two respiratory complexes sharing similarities to the proton-translocating ferredoxin:NAD⁺ oxidoreductase (Rnf) and the Na⁺ pumping NADH:quinone hydrogenase (NQR) were predicted. These might form a respiratory chain that is involved in the reduction of electron acceptors rather than protons. However, involvement of these complexes in acetate oxidation in C. ultunense and P. lettingae needs further study. This genome-based comparison provides a solid platform for future meta-proteomics and meta-transcriptomics studies and for metabolic engineering, control, and monitoring of SAOB.
Landscape community genomics: understanding eco-evolutionary processes in complex environments
Hand, Brian K.; Lowe, Winsor H.; Kovach, Ryan P.; Muhlfeld, Clint C.; Luikart, Gordon
2015-01-01
Extrinsic factors influencing evolutionary processes are often categorically lumped into interactions that are environmentally (e.g., climate, landscape) or community-driven, with little consideration of the overlap or influence of one on the other. However, genomic variation is strongly influenced by complex and dynamic interactions between environmental and community effects. Failure to consider both effects on evolutionary dynamics simultaneously can lead to incomplete, spurious, or erroneous conclusions about the mechanisms driving genomic variation. We highlight the need for a landscape community genomics (LCG) framework to help to motivate and challenge scientists in diverse fields to consider a more holistic, interdisciplinary perspective on the genomic evolution of multi-species communities in complex environments.
Baichoo, Shakuntala; Ouzounis, Christos A
A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.
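To make the notion of time and space complexity in the abstract above concrete, the following is a minimal sketch of global pairwise alignment scoring by dynamic programming. It is not taken from the review; the function name and the scoring values (match, mismatch, gap) are illustrative assumptions. For sequences of lengths m and n it performs on the order of m × n cell updates, while keeping only two rows of the table, so memory grows with the shorter sequence rather than with m × n.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment of a and b (linear gap penalty).

    Hypothetical helper for illustration only; scoring values are assumptions.
    Time: O(len(a) * len(b)). Space: O(min(len(a), len(b))).
    """
    # Keep the shorter sequence along the row so each row is as small as possible.
    if len(b) > len(a):
        a, b = b, a
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Only the score is returned here; recovering the alignment itself would require either the full table or a divide-and-conquer scheme, which is the kind of time/space trade-off the review surveys.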
Zickler, D; Moreau, P J; Huynh, A D; Slezec, A M
1992-09-01
The decrease of meiotic exchanges (crossing over and conversion) in two mutants of Sordaria macrospora correlated strongly with a reduction of chiasmata and of both types of "recombination nodules." Serial section reconstruction electron microscopy was used to compare the synapsis pattern of meiotic prophase I in wild type and mutants. First, synapsis occurred but the number of synaptonemal complex initiation sites was reduced in both mutants. Second, this reduction was accompanied by, or resulted in, modifications of the pattern of synapsis. Genetic and synaptonemal complex maps were compared in three regions along one chromosome arm divided into well marked intervals. Reciprocal exchange frequencies and number of recombination nodules correlated in wild type in the three analyzed intervals, but disparity was found between the location of recombination nodules and exchanges in the mutants. Despite the twofold exchange decrease, sections of the genome such as the short arm of chromosome 2 and telomere regions were sheltered from nodule decrease and from pairing modifications. This indicated a certain amount of diversity in the control of these features and suggested that exchange frequency was dependent not only on the amount of effective pairing but also on the localization of the pairing sites, as revealed by the synaptonemal complex progression in the mutants.
Zickler, D.; Moreau, PJF.; Huynh, A. D.; Slezec, A. M.
1992-01-01
The decrease of meiotic exchanges (crossing over and conversion) in two mutants of Sordaria macrospora correlated strongly with a reduction of chiasmata and of both types of "recombination nodules." Serial section reconstruction electron microscopy was used to compare the synapsis pattern of meiotic prophase I in wild type and mutants. First, synapsis occurred but the number of synaptonemal complex initiation sites was reduced in both mutants. Second, this reduction was accompanied by, or resulted in, modifications of the pattern of synapsis. Genetic and synaptonemal complex maps were compared in three regions along one chromosome arm divided into well marked intervals. Reciprocal exchange frequencies and number of recombination nodules correlated in wild type in the three analyzed intervals, but disparity was found between the location of recombination nodules and exchanges in the mutants. Despite the twofold exchange decrease, sections of the genome such as the short arm of chromosome 2 and telomere regions were sheltered from nodule decrease and from pairing modifications. This indicated a certain amount of diversity in the control of these features and suggested that exchange frequency was dependent not only on the amount of effective pairing but also on the localization of the pairing sites, as revealed by the synaptonemal complex progression in the mutants. PMID:1398050
Enhancing genomic prediction with genome-wide association studies in multiparental maize populations
USDA-ARS's Scientific Manuscript database
Genome-wide association mapping using dense marker sets has identified some nucleotide variants affecting complex traits which have been validated with fine-mapping and functional analysis. Many sequence variants associated with complex traits in maize have small effects and low repeatability, howev...
Sunflower Hybrid Breeding: From Markers to Genomic Selection
Dimitrijevic, Aleksandra; Horn, Renate
2018-01-01
In sunflower, molecular markers for simple traits such as fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotyping-by-sequencing and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive with other oil crops, higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare. Integrative approaches combining omic technologies (genomics, transcriptomics, proteomics, metabolomics and phenomics) using bioinformatic tools will facilitate the identification of target genes and markers for complex traits and will give a better insight into the mechanisms behind the traits. PMID:29387071
A Deluge of Complex Repeats: The Solanum Genome
Mehra, Mrigaya; Gangwar, Indu; Shankar, Ravi
2015-01-01
Repetitive elements have lately emerged as key components of genome, performing varieties of roles. It has now become necessary to have an account of repeats for every genome to understand its dynamics and state. Recently, genomes of two major Solanaceae species, Solanum tuberosum and Solanum lycopersicum, were sequenced. These species are important crops having high commercial significance as well as value as model species. However, there is a reasonable gap in information about repetitive elements and their possible roles in genome regulation for these species. The present study was aimed at detailed identification and characterization of complex repetitive elements in these genomes, along with study of their possible functional associations as well as to assess possible transcriptionally active repetitive elements. In this study, it was found that ~50–60% of genomes of S. tuberosum and S. lycopersicum were composed of repetitive elements. It was also found that complex repetitive elements were associated with >95% of genes in both species. These two genomes are mostly composed of LTR retrotransposons. Two novel repeat families very similar to LTR/ERV1 and LINE/RTE-BovB have been reported for the first time. Active existence of complex repeats was estimated by measuring their transcriptional abundance using Next Generation Sequencing read data and Microarray platforms. A reasonable amount of regulatory components like transcription factor binding sites and miRNAs appear to be under the influence of these complex repetitive elements in these species, while several genes appeared to possess exonized repeats. PMID:26241045
Hosmani, Prashant S.; Villalobos-Ayala, Krystal; Miller, Sherry; Shippy, Teresa; Flores, Mirella; Rosendale, Andrew; Cordola, Chris; Bell, Tracey; Mann, Hannah; DeAvila, Gabe; DeAvila, Daniel; Moore, Zachary; Buller, Kyle; Ciolkevich, Kathryn; Nandyal, Samantha; Mahoney, Robert; Van Voorhis, Joshua; Dunlevy, Megan; Farrow, David; Hunter, David; Morgan, Taylar; Shore, Kayla; Guzman, Victoria; Izsak, Allison; Dixon, Danielle E.; Cridge, Andrew; Cano, Liliana; Cao, Xiaolong; Jiang, Haobo; Leng, Nan; Johnson, Shannon; Cantarel, Brandi L.; Richards, Stephen; English, Adam; Shatters, Robert G.; Childers, Chris; Chen, Mei-Ju; Hunter, Wayne; Cilia, Michelle; Mueller, Lukas A.; Munoz-Torres, Monica; Nelson, David; Poelchau, Monica F.; Benoit, Joshua B.; Wiersma-Koch, Helen; D’Elia, Tom; Brown, Susan J.
2017-01-01
The Asian citrus psyllid (Diaphorina citri Kuwayama) is the insect vector of the bacterium Candidatus Liberibacter asiaticus (CLas), the pathogen associated with citrus Huanglongbing (HLB, citrus greening). HLB threatens citrus production worldwide. Suppression or reduction of the insect vector using chemical insecticides has been the primary method to inhibit the spread of citrus greening disease. Accurate structural and functional annotation of the Asian citrus psyllid genome, as well as a clear understanding of the interactions between the insect and CLas, are required for development of new molecular-based HLB control methods. A draft assembly of the D. citri genome has been generated and annotated with automated pipelines. However, knowledge transfer from well-curated reference genomes such as that of Drosophila melanogaster to newly sequenced ones is challenging due to the complexity and diversity of insect genomes. To identify and improve gene models as potential targets for pest control, we manually curated several gene families with a focus on genes that have key functional roles in D. citri biology and CLas interactions. This community effort produced 530 manually curated gene models across developmental, physiological, RNAi regulatory and immunity-related pathways. As previously shown in the pea aphid, RNAi machinery genes putatively involved in the microRNA pathway have been specifically duplicated. A comprehensive transcriptome enabled us to identify a number of gene families that are either missing or misassembled in the draft genome. In order to develop biocuration as a training experience, we included undergraduate and graduate students from multiple institutions, as well as experienced annotators from the insect genomics research community. The resulting gene set (OGS v1.0) combines both automatically predicted and manually curated gene models. Database URL: https://citrusgreening.org/ PMID:29220441
Flow Sorting and Sequencing Meadow Fescue Chromosome 4F
Kopecký, David; Martis, Mihaela; Číhalíková, Jarmila; Hřibová, Eva; Vrána, Jan; Bartoš, Jan; Kopecká, Jitka; Cattonaro, Federica; Stočes, Štěpán; Novák, Petr; Neumann, Pavel; Macas, Jiří; Šimková, Hana; Studer, Bruno; Asp, Torben; Baird, James H.; Navrátil, Petr; Karafiátová, Miroslava; Kubaláková, Marie; Šafář, Jan; Mayer, Klaus; Doležel, Jaroslav
2013-01-01
The analysis of large genomes is hampered by a high proportion of repetitive DNA, which makes the assembly of short sequence reads difficult. This is also the case in meadow fescue (Festuca pratensis), which is known for good abiotic stress resistance and has been used in intergeneric hybridization with ryegrasses (Lolium spp.) to produce Festulolium cultivars. In this work, we describe a new approach to analyze the large genome of meadow fescue, which involves the reduction of sample complexity without compromising information content. This is achieved by dissecting the genome to smaller parts: individual chromosomes and groups of chromosomes. As the first step, we flow sorted chromosome 4F and sequenced it by Illumina with approximately 50× coverage. This provided, to our knowledge, the first insight into the composition of the fescue genome, enabled the construction of the virtual gene order of the chromosome, and facilitated detailed comparative analysis with the sequenced genomes of rice (Oryza sativa), Brachypodium distachyon, sorghum (Sorghum bicolor), and barley (Hordeum vulgare). Using GenomeZipper, we were able to confirm the collinearity of chromosome 4F with barley chromosome 4H and the long arm of chromosome 5H. Several new tandem repeats were identified and physically mapped using fluorescence in situ hybridization. They were found as robust cytogenetic markers for karyotyping of meadow fescue and ryegrass species and their hybrids. The ability to purify chromosome 4F opens the way for more efficient analysis of genomic loci on this chromosome underlying important traits, including freezing tolerance. Our results confirm that next-generation sequencing of flow-sorted chromosomes enables an overview of chromosome structure and evolution at a resolution never achieved before. PMID:24096412
Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce. PMID:23409088
Matvienko, Marta; Kozik, Alexander; Froenicke, Lutz; Lavelle, Dean; Martineau, Belinda; Perroud, Bertrand; Michelmore, Richard
2013-01-01
Several applications of high throughput genome and transcriptome sequencing would benefit from a reduction of the high-copy-number sequences in the libraries being sequenced and analyzed, particularly when applied to species with large genomes. We adapted and analyzed the consequences of a method that utilizes a thermostable duplex-specific nuclease for reducing the high-copy components in transcriptomic and genomic libraries prior to sequencing. This reduces the time, cost, and computational effort of obtaining informative transcriptomic and genomic sequence data for both fully sequenced and non-sequenced genomes. It also reduces contamination from organellar DNA in preparations of nuclear DNA. Hybridization in the presence of 3 M tetramethylammonium chloride (TMAC), which equalizes the rates of hybridization of GC and AT nucleotide pairs, reduced the bias against sequences with high GC content. Consequences of this method on the reduction of high-copy and enrichment of low-copy sequences are reported for Arabidopsis and lettuce.
USDA-ARS's Scientific Manuscript database
Bovine Respiratory Disease Complex is a disease that is very costly to the dairy industry. Genomic selection may be an effective tool to improve host resistance to the pathogens that cause this disease. Use of genomic predicted transmitting abilities (GPTA) for selection has had a dramatic effect on...
USDA-ARS's Scientific Manuscript database
New and emerging next generation sequencing technologies have been promising in reducing sequencing costs, but not significantly for complex polyploid plant genomes such as cotton. Large and highly repetitive genome of G. hirsutum (~2.5GB) is less amenable and cost-intensive with traditional BAC-by...
Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements
Liu, Pengfei; Erez, Ayelet; Sreenath Nagamani, Sandesh C.; Dhar, Shweta U.; Kołodziejska, Katarzyna E.; Dharmadhikari, Avinash V.; Cooper, M. Lance; Wiszniewska, Joanna; Zhang, Feng; Withers, Marjorie A.; Bacino, Carlos A.; Campos-Acevedo, Luis Daniel; Delgado, Mauricio R.; Freedenberg, Debra; Garnica, Adolfo; Grebe, Theresa A.; Hernández-Almaguer, Dolores; Immken, LaDonna; Lalani, Seema R.; McLean, Scott D.; Northrup, Hope; Scaglia, Fernando; Strathearn, Lane; Trapane, Pamela; Kang, Sung-Hae L.; Patel, Ankita; Cheung, Sau Wai; Hastings, P. J.; Stankiewicz, Paweł; Lupski, James R.; Bi, Weimin
2011-01-01
Complex genomic rearrangements (CGR) consisting of two or more breakpoint junctions have been observed in genomic disorders. Recently, a chromosome catastrophe phenomenon termed chromothripsis, in which numerous genomic rearrangements are apparently acquired in one single catastrophic event, was described in multiple cancers. Here we show that constitutionally acquired CGRs share similarities with cancer chromothripsis. In the 17 CGR cases investigated we observed localization and multiple copy number changes including deletions, duplications and/or triplications, as well as extensive translocations and inversions. Genomic rearrangements involved varied in size and complexities; in one case, array comparative genomic hybridization revealed 18 copy number changes. Breakpoint sequencing identified characteristic features, including small templated insertions at breakpoints and microhomology at breakpoint junctions, which have been attributed to replicative processes. The resemblance between CGR and chromothripsis suggests similar mechanistic underpinnings. Such chromosome catastrophic events appear to reflect basic DNA metabolism operative throughout an organism’s life cycle. PMID:21925314
MetaSort untangles metagenome assembly by reducing microbial community complexity
Ji, Peifeng; Zhang, Yanming; Wang, Jinfeng; Zhao, Fangqing
2017-01-01
Most current approaches to analyse metagenomic data rely on reference genomes. Novel microbial communities extend far beyond the coverage of reference databases and de novo metagenome assembly from complex microbial communities remains a great challenge. Here we present a novel experimental and bioinformatic framework, metaSort, for effective construction of bacterial genomes from metagenomic samples. MetaSort provides a sorted mini-metagenome approach based on flow cytometry and single-cell sequencing methodologies, and employs new computational algorithms to efficiently recover high-quality genomes from the sorted mini-metagenome by the complementary of the original metagenome. Through extensive evaluations, we demonstrated that metaSort has an excellent and unbiased performance on genome recovery and assembly. Furthermore, we applied metaSort to an unexplored microflora colonized on the surface of marine kelp and successfully recovered 75 high-quality genomes at one time. This approach will greatly improve access to microbial genomes from complex or novel communities. PMID:28112173
Crowding Induces Complex Ergodic Diffusion and Dynamic Elongation of Large DNA Molecules
Chapman, Cole D.; Gorczyca, Stephanie; Robertson-Anderson, Rae M.
2015-01-01
Despite the ubiquity of molecular crowding in living cells, the effects of crowding on the dynamics of genome-sized DNA are poorly understood. Here, we track single, fluorescent-labeled large DNA molecules (11, 115 kbp) diffusing in dextran solutions that mimic intracellular crowding conditions (0–40%), and determine the effects of crowding on both DNA mobility and conformation. Both DNAs exhibit ergodic Brownian motion and comparable mobility reduction in all conditions; however, crowder size (10 vs. 500 kDa) plays a critical role in the underlying diffusive mechanisms and dependence on crowder concentration. Surprisingly, in 10-kDa dextran, crowder influence saturates at ∼20% with an ∼5× drop in DNA diffusion, in stark contrast to exponentially retarded mobility, coupled to weak anomalous subdiffusion, with increasing concentration of 500-kDa dextran. Both DNAs elongate into lower-entropy states (compared to random coil conformations) when crowded, with elongation states that are gamma distributed and fluctuate in time. However, the broadness of the distribution of states and the time-dependence and length scale of elongation length fluctuations depend on both DNA and crowder size with concentration having surprisingly little impact. Results collectively show that mobility reduction and coil elongation of large crowded DNAs are due to a complex interplay between entropic effects and crowder mobility. Although elongation and initial mobility retardation are driven by depletion interactions, subdiffusive dynamics, and the drastic exponential slowing of DNA, up to ∼300×, arise from the reduced mobility of larger crowders. Our results elucidate the highly important and widely debated effects of cellular crowding on genome-sized DNA. PMID:25762333
Kay, Neil E.; Eckel-Passow, Jeanette E.; Braggio, Esteban; VanWier, Scott; Shanafelt, Tait D.; Van Dyke, Daniel L.; Jelinek, Diane F.; Tschumper, Renee C.; Kipps, Thomas; Byrd, John C.; Fonseca, Rafael
2010-01-01
To better understand the implications of genomic instability and outcome in B-cell CLL, we sought to address genomic complexity as a predictor of chemosensitivity and ultimately clinical outcome in this disease. We employed array-based comparative genomic hybridization (aCGH), using a one-million probe array and identified gains and losses of genetic material in 48 patients treated on a chemoimmunotherapy (CIT) clinical trial. We identified chromosomal gain or loss in ≥6% of the patients on chromosomes 3, 8, 9, 10, 11, 12, 13, 14 and 17. Higher genomic complexity, as a mechanism favoring clonal selection, was associated with shorter progression-free survival and predicted a poor response to treatment. Of interest, CLL cases with loss of p53 surveillance showed more complex genomic features and were found both in patients with a 17p13.1 deletion and in the more favorable genetic subtype characterized by the presence of 13q14.1 deletion. This aCGH study adds information on the association between inferior trial response and increasing genetic complexity as CLL progresses. PMID:21156228
Data compression and genomes: a two-dimensional life domain map.
Menconi, Giulia; Benci, Vieri; Buiatti, Marcello
2008-07-21
We define the complexity of DNA sequences as the information content per nucleotide, calculated by means of a Lempel-Ziv data compression algorithm. It is possible to use the statistics of the complexity values of the functional regions of different complete genomes to distinguish among genomes of different domains of life (Archaea, Bacteria and Eukarya). We shall focus on the distribution function of the complexity of non-coding regions. We show that the three domains may be plotted in separate regions within the two-dimensional space where the axes are the skewness coefficient and the kurtosis coefficient of the aforementioned distribution. Preliminary results on 15 genomes are introduced.
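The measure described in the abstract above can be sketched roughly as follows. The abstract does not say which Lempel-Ziv variant was used, so this sketch substitutes Python's `zlib` (DEFLATE, which is built on LZ77) as a stand-in; the function names, the toy sequences, and the use of population central moments for skewness and kurtosis are all assumptions made for illustration, not the authors' method.

```python
import random
import zlib


def complexity_bits_per_nt(seq):
    """Approximate information content per nucleotide via compression.

    Assumption: zlib (DEFLATE, an LZ77-based codec) stands in for the
    unspecified Lempel-Ziv algorithm used in the study.
    """
    data = seq.upper().encode("ascii")
    if not data:
        raise ValueError("sequence must be non-empty")
    # Compressed size in bits divided by the number of nucleotides.
    return 8 * len(zlib.compress(data, 9)) / len(data)


def skewness_and_kurtosis(values):
    """Return (skewness, kurtosis) computed from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values must not all be equal")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    # Toy stand-ins for non-coding regions: five random sequences and one repeat.
    rng = random.Random(0)
    regions = ["".join(rng.choice("ACGT") for _ in range(2000)) for _ in range(5)]
    regions.append("ACGT" * 500)
    scores = [complexity_bits_per_nt(r) for r in regions]
    # One genome would contribute one (skewness, kurtosis) point to the 2D map.
    print(skewness_and_kurtosis(scores))
```

Compressor headers add a fixed overhead, so short sequences score artificially high with this stand-in; comparisons are only meaningful between regions of similar length.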
Genome-Wide Mapping of Furfural Tolerance Genes in Escherichia coli
Glebes, Tirzah Y.; Sandoval, Nicholas R.; Reeder, Philippa J.; Schilling, Katherine D.; Zhang, Min; Gill, Ryan T.
2014-01-01
Advances in genomics have improved the ability to map complex genotype-to-phenotype relationships, like those required for engineering chemical tolerance. Here, we have applied the multiSCale Analysis of Library Enrichments (SCALEs; Lynch et al. (2007) Nat. Method.) approach to map, in parallel, the effect of increased dosage for >10^5 different fragments of the Escherichia coli genome onto furfural tolerance (furfural is a key toxin of lignocellulosic hydrolysate). Only 268 of >4,000 E. coli genes (∼6%) were enriched after growth selections in the presence of furfural. Several of the enriched genes were cloned and tested individually for their effect on furfural tolerance. Overexpression of thyA, lpcA, or groESL individually increased growth in the presence of furfural. Overexpression of lpcA, but not groESL or thyA, resulted in increased furfural reduction rate, a previously identified mechanism underlying furfural tolerance. We additionally show that plasmid-based expression of functional LpcA or GroESL is required to confer furfural tolerance. This study identifies new furfural tolerant genes, which can be applied in future strain design efforts focused on the production of fuels and chemicals from lignocellulosic hydrolysate. PMID:24489935
SNP discovery by high-throughput sequencing in soybean
2010-01-01
Background With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD) have been widely applied to identify genome-wide sequence variations. However, it still remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost-effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty-four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 kbp.
Conclusions We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population, and can be applied to other crops. PMID:20701770
An Adenovirus DNA Replication Factor, but Not Incoming Genome Complexes, Targets PML Nuclear Bodies.
Komatsu, Tetsuro; Nagata, Kyosuke; Wodrich, Harald
2016-02-01
Promyelocytic leukemia protein nuclear bodies (PML-NBs) are subnuclear domains implicated in cellular antiviral responses. Despite the antiviral activity, several nuclear replicating DNA viruses use the domains as deposition sites for the incoming viral genomes and/or as sites for viral DNA replication, suggesting that PML-NBs are functionally relevant during early viral infection to establish productive replication. Although PML-NBs and their components have also been implicated in the adenoviral life cycle, it remains unclear whether incoming adenoviral genome complexes target PML-NBs. Here we show using immunofluorescence and live-cell imaging analyses that incoming adenovirus genome complexes neither localize at nor recruit components of PML-NBs during early phases of infection. We further show that the viral DNA binding protein (DBP), an early expressed viral gene and essential DNA replication factor, independently targets PML-NBs. We show that DBP oligomerization is required to selectively recruit the PML-NB components Sp100 and USP7. Depletion experiments suggest that the absence of one PML-NB component might not affect the recruitment of other components toward DBP oligomers. Thus, our findings suggest a model in which an adenoviral DNA replication factor, but not incoming viral genome complexes, targets and modulates PML-NBs to support a conducive state for viral DNA replication and argue against a generalized concept that PML-NBs target incoming viral genomes. The immediate fate upon nuclear delivery of genomes of incoming DNA viruses is largely unclear. Early reports suggested that incoming genomes of herpesviruses are targeted and repressed by PML-NBs immediately upon nuclear import. Genome localization and/or viral DNA replication has also been observed at PML-NBs for other DNA viruses. 
Thus, it was suggested that PML-NBs may immediately sense and target nuclear viral genomes and hence serve as sites for deposition of incoming viral genomes and/or subsequent viral DNA replication. Here we performed a detailed analysis of the spatiotemporal distribution of incoming adenoviral genome complexes and found, in contrast to the expectation, that an adenoviral DNA replication factor, but not incoming genomes, targets PML-NBs. Thus, our findings may explain why adenoviral genomes could be observed at PML-NBs in earlier reports but argue against a generalized role for PML-NBs in targeting invading viral genomes. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Via, Sara
2012-01-01
In allopatric populations, geographical separation simultaneously isolates the entire genome, allowing genetic divergence to accumulate virtually anywhere in the genome. In sympatric populations, however, the strong divergent selection required to overcome migration produces a genetic mosaic of divergent and non-divergent genomic regions. In some recent genome scans, each divergent genomic region has been interpreted as an independent incidence of migration/selection balance, such that the reduction of gene exchange is restricted to a few kilobases around each divergently selected gene. I propose an alternative mechanism, ‘divergence hitchhiking’ (DH), in which divergent selection can reduce gene exchange for several megabases around a gene under strong divergent selection. Not all genes/markers within a DH region are divergently selected, yet the entire region is protected to some degree from gene exchange, permitting genetic divergence from mechanisms other than divergent selection to accumulate secondarily. After contrasting DH and multilocus migration/selection balance (MM/SB), I outline a model in which genomic isolation at a given genomic location is jointly determined by DH and genome-wide effects of the progressive reduction in realized migration, then illustrate DH using data from several pairs of incipient species in the wild. PMID:22201174
Protein complexes are assemblies of subunits that have co-evolved to execute one or many coordinated functions in the cellular environment. Functional annotation of mammalian protein complexes is critical to understanding biological processes, as well as disease mechanisms. Here, we used genetic co-essentiality derived from genome-scale RNAi- and CRISPR-Cas9-based fitness screens performed across hundreds of human cancer cell lines to assign measures of functional similarity.
Transposable elements as a molecular evolutionary force
NASA Technical Reports Server (NTRS)
Fedoroff, N. V.
1999-01-01
This essay addresses the paradoxes of the complex and highly redundant genomes. The central theses developed are that: (1) the distinctive feature of complex genomes is the existence of epigenetic mechanisms that permit extremely high levels of both tandem and dispersed redundancy; (2) the special contribution of transposable elements is to modularize the genome; and (3) the labilizing forces of recombination and transposition are just barely contained, giving a dynamic genetic system of ever increasing complexity that verges on the chaotic.
USDA-ARS's Scientific Manuscript database
Single Molecule Real-Time (SMRT) sequencing provides advantages to the sequencing of complex genomes. The long reads generated are superior for resolving complex genomic regions and provide highly contiguous de novo assemblies. Current SMRTbell libraries generate average read lengths of 10-15kb. How...
USDA-ARS's Scientific Manuscript database
The identification of specific genes underlying phenotypic variation of complex traits remains one of the greatest challenges in biology despite having genome sequences and more powerful tools. Most genome-wide screens lack sufficient resolving power as they typically depend on linkage. One altern...
A sequence-based survey of the complex structural organization of tumor genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav
2008-04-03
The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.
Genomes Behave as Social Entities: Alien Chromatin Minorities Evolve Through Specificities Reduction
USDA-ARS's Scientific Manuscript database
Hybridization and chromosome doubling entailed by allopolyploidization requires genetic and epigenetic modifications, resulting in the adjustment of different genomes to the same nuclear environment. Recently, the main role of retrotransposon/microsatellite-rich regions of the genome in DNA sequenc...
USDA-ARS's Scientific Manuscript database
Single nucleotide polymorphism was employed in the construction of a high-resolution, expressed sequence tag (EST) map of Aegilops tauschii, the diploid source of the wheat D genome. Comparison of the map with the rice and sorghum genome sequences revealed 50 inversions and translocations; 2, 8, and...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang, Shuiquan; Wang, Po Hsiang; Higgins, Steven A.; ...
2016-02-12
Here we report that the genomes of two closely related Dehalobacter strains (strain CF and strain DCA) were assembled from the metagenome of an anaerobic enrichment culture that reductively dechlorinates chloroform (CF), 1,1,1-trichloroethane (1,1,1-TCA) and 1,1-dichloroethane (1,1-DCA). The 3.1 Mbp genomes of strain CF (that dechlorinates CF and 1,1,1-TCA) and strain DCA (that dechlorinates 1,1-DCA) each contain 17 putative reductive dehalogenase homologous (rdh) genes. These two genomes were systematically compared to three other available organohalide-respiring Dehalobacter genomes (Dehalobacter restrictus strain PER-K23, Dehalobacter sp. strain E1 and Dehalobacter sp. strain UNSWDHB), and to the genomes of Dehalococcoides mccartyi strain 195 and Desulfitobacterium hafniense strain Y51. This analysis compared 42 different metabolic and physiological categories. The genomes of strains CF and DCA share 90% overall average nucleotide identity and >99.8% identity over a 2.9 Mbp alignment that excludes large insertions, indicating that these genomes differentiated from a close common ancestor. This differentiation was likely driven by selection pressures around two orthologous reductive dehalogenase genes, cfrA and dcrA, that code for the enzymes that reduce CF or 1,1,1-TCA and 1,1-DCA. The many reductive dehalogenase genes found in the five Dehalobacter genomes cluster into two small conserved regions and were often associated with Crp/Fnr transcriptional regulators. Specialization is on-going on a strain-specific basis, as some strains but not others have lost essential genes in the Wood-Ljungdahl (strain E1) and corrinoid biosynthesis pathways (strains E1 and PER-K23). The gene encoding phosphoserine phosphatase, which catalyzes the last step of serine biosynthesis, is missing from all five Dehalobacter genomes, yet D. restrictus can grow without serine, suggesting an alternative or unrecognized biosynthesis route exists. In contrast to D. mccartyi, a complete heme biosynthesis pathway is present in the five Dehalobacter genomes. This pathway corresponds to a newly described alternative heme biosynthesis route first identified in Archaea. Ultimately, this analysis of organohalide-respiring Firmicutes and Chloroflexi reveals profound evolutionary differences despite very similar niche-specific metabolism and function.
López-Manríquez, Eduardo; Vashist, Surender; Ureña, Luis; Goodfellow, Ian; Chavez, Pedro; Mora-Heredia, José Eduardo; Cancio-Lonches, Clotilde; Garrido, Efraín
2013-01-01
Sequences and structures within the terminal genomic regions of plus-strand RNA viruses are targets for the binding of host proteins that modulate functions such as translation, RNA replication, and encapsidation. Using murine norovirus 1 (MNV-1), we describe the presence of long-range RNA-RNA interactions that were stabilized by cellular proteins. The proteins potentially responsible for the stabilization were selected based on their ability to bind the MNV-1 genome and/or having been reported to be involved in the stabilization of RNA-RNA interactions. Cell extracts were preincubated with antibodies against the selected proteins and used for coprecipitation reactions. Extracts treated with antibodies to poly(C) binding protein 2 (PCBP2) and heterogeneous nuclear ribonucleoprotein (hnRNP) A1 significantly reduced the 5′-3′ interaction. Both PCBP2 and hnRNP A1 recombinant proteins stabilized the 5′-3′ interactions and formed ribonucleoprotein complexes with the 5′ and 3′ ends of the MNV-1 genomic RNA. Mutations within the 3′ complementary sequences (CS) that disrupt the 5′-3′-end interactions resulted in a significant reduction of the viral titer, suggesting that the integrity of the 3′-end sequence and/or the lack of complementarity with the 5′ end is important for efficient virus replication. Small interfering RNA-mediated knockdown of PCBP2 or hnRNP A1 resulted in a reduction in virus yield, confirming a role for the observed interactions in efficient viral replication. PCBP2 and hnRNP A1 induced the circularization of MNV-1 RNA, as revealed by electron microscopy. This study provides evidence that PCBP2 and hnRNP A1 bind to the 5′ and 3′ ends of the MNV-1 viral RNA and contribute to RNA circularization, playing a role in the virus life cycle. PMID:23946460
Baby, Vincent; Lachance, Jean-Christophe; Gagnon, Jules; Lucier, Jean-François; Matteau, Dominick; Knight, Tom; Rodrigue, Sébastien
2018-01-01
The creation and comparison of minimal genomes will help better define the most fundamental mechanisms supporting life. Mesoplasma florum is a near-minimal, fast-growing, nonpathogenic bacterium potentially amenable to genome reduction efforts. In a comparative genomic study of 13 M. florum strains, including 11 newly sequenced genomes, we have identified the core genome and open pangenome of this species. Our results show that all of the strains have approximately 80% of their gene content in common. Of the remaining 20%, 17% of the genes were found in multiple strains and 3% were unique to any given strain. On the basis of random transposon mutagenesis, we also estimated that ~290 out of 720 genes are essential for M. florum L1 in rich medium. We next evaluated different genome reduction scenarios for M. florum L1 by using gene conservation and essentiality data, as well as comparisons with the first working approximation of a minimal organism, Mycoplasma mycoides JCVI-syn3.0. Our results suggest that 409 of the 473 M. mycoides JCVI-syn3.0 genes have orthologs in M. florum L1. Conversely, 57 putatively essential M. florum L1 genes have no homolog in M. mycoides JCVI-syn3.0. This suggests differences in minimal genome compositions, even for these evolutionarily closely related bacteria. IMPORTANCE The last years have witnessed the development of whole-genome cloning and transplantation methods and the complete synthesis of entire chromosomes. Recently, the first minimal cell, Mycoplasma mycoides JCVI-syn3.0, was created. Despite these milestone achievements, several questions remain to be answered. For example, is the composition of minimal genomes virtually identical in phylogenetically related species? On the basis of comparative genomics and transposon mutagenesis, we investigated this question by using an alternative model, Mesoplasma florum, that is also amenable to genome reduction efforts. 
Our results suggest that the creation of additional minimal genomes could help reveal different gene compositions and strategies that can support life, even within closely related species.
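The partition of gene content reported above (genes shared by all strains, genes found in several strains, and genes unique to one strain) can be sketched with simple set counting. This is only an illustration under stated assumptions: the strain names and gene labels are hypothetical toy data, and real pangenome analyses first cluster genes into ortholog families, a step not shown here.

```python
from collections import Counter


def classify_pangenome(strain_genes):
    """Split a pangenome into core, shared and strain-unique genes.

    strain_genes maps a strain name to the set of gene (ortholog
    family) identifiers it carries. Assumes at least two strains.
    """
    n_strains = len(strain_genes)
    # Count in how many strains each gene occurs.
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))
    core = {g for g, c in counts.items() if c == n_strains}
    unique = {g for g, c in counts.items() if c == 1}
    shared = set(counts) - core - unique
    return core, shared, unique


# Hypothetical toy data, not taken from the study.
toy_strains = {
    "strain_1": {"dnaA", "rpoB", "geneX"},
    "strain_2": {"dnaA", "rpoB", "geneX", "geneY"},
    "strain_3": {"dnaA", "rpoB", "geneZ"},
}
core, shared, unique = classify_pangenome(toy_strains)
```

The union of the three returned sets is the pangenome; whether it counts as "open" depends on how quickly new unique genes keep appearing as more strains are added.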
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits.
Larsson, John; Nylander, Johan Aa; Bergman, Birgitta
2011-06-30
Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential; while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities. A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets. PMID:21718514
Microeconomic principles explain an optimal genome size in bacteria.
Ranea, Juan A G; Grant, Alastair; Thornton, Janet M; Orengo, Christine A
2005-01-01
Bacteria can clearly enhance their survival by expanding their genetic repertoire. However, the tight packing of the bacterial genome and the fact that the most evolved species do not necessarily have the biggest genomes suggest there are other evolutionary factors limiting their genome expansion. To clarify these restrictions on size, we studied those protein families contributing most significantly to bacterial-genome complexity. We found that all bacteria apply the same basic and ancestral 'molecular technology' to optimize their reproductive efficiency. The same microeconomics principles that define the optimum size in a factory can also explain the existence of a statistical optimum in bacterial genome size. This optimum is reached when the bacterial genome obtains the maximum metabolic complexity (revenue) for minimal regulatory genes (logistic cost).
Joosen, Ronny Viktor Louis; Arends, Danny; Li, Yang; Willems, Leo A.J.; Keurentjes, Joost J.B.; Ligterink, Wilco; Jansen, Ritsert C.; Hilhorst, Henk W.M.
2013-01-01
A complex phenotype such as seed germination is the result of several genetic and environmental cues and requires the concerted action of many genes. The use of well-structured recombinant inbred lines in combination with “omics” analysis can help to disentangle the genetic basis of such quantitative traits. This so-called genetical genomics approach can effectively capture both genetic and epistatic interactions. However, to understand how the environment interacts with genomic-encoded information, a better understanding of the perception and processing of environmental signals is needed. In a classical genetical genomics setup, this requires replication of the whole experiment in different environmental conditions. A novel generalized setup overcomes this limitation and includes environmental perturbation within a single experimental design. We developed a dedicated quantitative trait loci mapping procedure to implement this approach and used existing phenotypical data to demonstrate its power. In addition, we studied the genetic regulation of primary metabolism in dry and imbibed Arabidopsis (Arabidopsis thaliana) seeds. In the metabolome, many changes were observed that were under both environmental and genetic controls and their interaction. This concept offers unique reduction of experimental load with minimal compromise of statistical power and is of great potential in the field of systems genetics, which requires a broad understanding of both plasticity and dynamic regulation. PMID:23606598
Abad-Grau, Mara M; Medina-Medina, Nuria; Montes-Soldado, Rosana; Matesanz, Fuencisla; Bafna, Vineet
2012-01-01
Multimarker Transmission/Disequilibrium Tests (TDTs) are association tests that are very robust to population admixture and structure and may be used to identify susceptibility loci in genome-wide association studies. Multimarker TDTs using several markers may increase power by capturing high-degree associations. However, there is also a risk of spurious associations and power reduction due to the increase in degrees of freedom. In this study we show that associations found by tests built on simple null hypotheses are highly reproducible in a second independent data set regardless of the number of markers. As a test exhibiting this feature to its maximum, we introduce the multimarker 2-Groups TDT (mTDT(2G)), a test which, under the hypothesis of no linkage, asymptotically follows a χ² distribution with 1 degree of freedom regardless of the number of markers. The statistic requires the division of parental haplotypes into two groups: disease susceptibility and disease protective haplotype groups. We assessed the test behavior by performing an extensive simulation study as well as a real-data study using several data sets of two complex diseases. We show that the mTDT(2G) test is highly efficient and achieves the highest power among all the tests used, even when the null hypothesis is tested in a second independent data set. Therefore, mTDT(2G) turns out to be a very promising multimarker TDT for genome-wide searches for disease susceptibility loci, and may be used as a preprocessing step in the construction of more accurate genetic models to predict individual susceptibility to complex diseases. PMID:22363405
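To make the "χ² with 1 degree of freedom" property concrete, the sketch below computes a classic McNemar-type TDT statistic once transmissions have been reduced to two groups. It is an assumption that this form resembles the final step of a two-group test; it is not claimed to be the exact mTDT(2G) construction, and the function name and counts are hypothetical. The p-value uses the identity that a χ² variable with 1 degree of freedom is the square of a standard normal.

```python
import math


def two_group_tdt(n_risk_transmitted: int, n_protective_transmitted: int):
    """McNemar-type TDT statistic for two haplotype groups.

    Counts are transmissions from parents carrying one haplotype of
    each group (informative parents). Under no linkage the statistic
    is asymptotically chi-square with 1 degree of freedom.
    """
    b, c = n_risk_transmitted, n_protective_transmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 60 risk-group vs 40 protective-group transmissions.
stat, p_value = two_group_tdt(60, 40)
```

Because the comparison is always between two groups, the degrees of freedom stay at 1 however many markers define the haplotypes, which is the property the abstract emphasizes.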
Beauregard, France; Angers, Bernard
2018-05-31
Unisexuals of the blue-spotted salamander complex are thought to reproduce by kleptogenesis. Genome exchanges associated with this sperm-dependent mode of reproduction are expected to result in higher genetic variation and multiple ploidy levels compared to clonality. However, the existence of some populations exclusively formed of genetically identical individuals suggests that factors could prevent genome exchanges. This study aimed at assessing the prevalence of genome exchange among unisexuals of the Ambystoma laterale-jeffersonianum complex from 10 sites in the northern part of their distribution. A total of 235 individuals, including 207 unisexuals, were genotyped using microsatellite loci and AFLP. Unisexual individuals could be sorted into five genetically distinct groups, likely derived from the same paternal A. jeffersonianum haplome. One of these groups exclusively reproduced clonally, even when found in sympatry with lineages presenting signatures of genome exchange. Genome exchange was site-dependent for another group. Genome exchange was detected at all sites for the three remaining groups. Prevalence of genome exchange appears to be associated with ecological conditions such as availability of effective sperm donors. Intrinsic genomic factors may also affect this process, since different lineages in sympatry present highly variable rates of genome exchange. The coexistence of clonal and genetically diversified lineages opens the door to further research on alternatives to genetic variation.
Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes.
Puigbò, Pere; Lobkovsky, Alexander E; Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V
2014-08-21
Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of event across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species. We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically seven and twenty times less than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction, which is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about tenfold more gene families than the typical genome in the group although some groups appear to have vast, 'open' supergenomes. Reconstruction of evolution for groups of closely related bacteria and archaea reveals an extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size.
Selective recruitment of nuclear factors to productively replicating herpes simplex virus genomes.
Dembowski, Jill A; DeLuca, Neal A
2015-05-01
Much of the HSV-1 life cycle is carried out in the cell nucleus, including the expression, replication, repair, and packaging of viral genomes. Viral proteins, as well as cellular factors, play essential roles in these processes. Isolation of proteins on nascent DNA (iPOND) was developed to label and purify cellular replication forks. We adapted aspects of this method to label viral genomes to both image, and purify replicating HSV-1 genomes for the identification of associated proteins. Many viral and cellular factors were enriched on viral genomes, including factors that mediate DNA replication, repair, chromatin remodeling, transcription, and RNA processing. As infection proceeded, packaging and structural components were enriched to a greater extent. Among the more abundant proteins that copurified with genomes were the viral transcription factor ICP4 and the replication protein ICP8. Furthermore, all seven viral replication proteins were enriched on viral genomes, along with cellular PCNA and topoisomerases, while other cellular replication proteins were not detected. The chromatin-remodeling complexes present on viral genomes included the INO80, SWI/SNF, NURD, and FACT complexes, which may prevent chromatinization of the genome. Consistent with this conclusion, histones were not readily recovered with purified viral genomes, and imaging studies revealed an underrepresentation of histones on viral genomes. RNA polymerase II, the mediator complex, TFIID, TFIIH, and several other transcriptional activators and repressors were also affinity purified with viral DNA. The presence of INO80, NURD, SWI/SNF, mediator, TFIID, and TFIIH components is consistent with previous studies in which these complexes copurified with ICP4. Therefore, ICP4 is likely involved in the recruitment of these key cellular chromatin remodeling and transcription factors to viral genomes. 
Taken together, iPOND is a valuable method for the study of viral genome dynamics during infection and provides a comprehensive view of how HSV-1 selectively utilizes cellular resources.
Wang, W.; Haberer, G.; Gundlach, H.; Gläßer, C.; Nussbaumer, T.; Luo, M.C.; Lomsadze, A.; Borodovsky, M.; Kerstetter, R.A.; Shanklin, J.; Byrant, D.W.; Mockler, T.C.; Appenroth, K.J.; Grimwood, J.; Jenkins, J.; Chow, J.; Choi, C.; Adam, C.; Cao, X.-H.; Fuchs, J.; Schubert, I.; Rokhsar, D.; Schmutz, J.; Michael, T.P.; Mayer, K.F.X.; Messing, J
2014-01-01
The subfamily of the Lemnoideae belongs to a different order than other monocotyledonous species that have been sequenced and comprises aquatic plants that grow rapidly on the water surface. Here we select Spirodela polyrhiza for whole-genome sequencing. We show that Spirodela has a genome with no signs of recent retrotranspositions but signatures of two ancient whole-genome duplications, possibly 95 million years ago (mya), older than those in Arabidopsis and rice. Its genome has only 19,623 predicted protein-coding genes, which is 28% less than the dicotyledonous Arabidopsis thaliana and 50% less than monocotyledonous rice. We propose that at least in part, the neotenous reduction of these aquatic plants is based on readjusted copy numbers of promoters and repressors of the juvenile-to-adult transition. The Spirodela genome, along with its unique biology and physiology, will stimulate new insights into environmental adaptation, ecology, evolution and plant development, and will be instrumental for future bioenergy applications. PMID:24548928
Appels, R; Barrero, R; Bellgard, M
2012-03-01
The Plant and Animal Genome (PAG, held annually) meeting in January 2012 provided insights into the advances in plant, animal, and microbe genome studies particularly as they impact on our understanding of complex biological systems. The diverse areas of biology covered included the advances in technologies, variation in complex traits, genome change in evolution, and targeting phenotypic changes, across the broad spectrum of life forms. This overview aims to summarize the major advances in research areas presented in the plenary lectures and does not attempt to summarize the diverse research activities covered throughout the PAG in workshops, posters, presentations, and displays by suppliers of cutting-edge technologies.
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.
Wang, Dapeng; Xu, Jiayue; Yu, Jun
2015-09-16
The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data.
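The counting step described above (relative abundance of each string for a fixed K) and an alignment-free comparison built on it can be sketched as follows. This is a minimal illustration, not KGCAK's code: the abstract does not state which distance the database uses, so the half-L1 distance between frequency profiles and both function names are assumptions.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative abundance of each k-mer made only of A, C, G, and T."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows with N or other symbols
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def kmer_distance(seq_a, seq_b, k):
    """Half the L1 distance between two k-mer profiles (0 = identical, 1 = disjoint).

    Assumed metric chosen for illustration only.
    """
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return sum(abs(fa.get(x, 0.0) - fb.get(x, 0.0)) for x in set(fa) | set(fb)) / 2


print(kmer_frequencies("ACGTACGT", 2))
print(kmer_distance("ACGTACGT", "ACGTTTTT", 2))
```

A matrix of such pairwise distances could then be passed to any standard tree-building method, which is what makes the approach alignment-free.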
Developing molecular tools for Chlamydomonas reinhardtii
NASA Astrophysics Data System (ADS)
Noor-Mohammadi, Samaneh
Microalgae have garnered increasing interest over the years for their ability to produce compounds ranging from biofuels to nutraceuticals. A main focus of researchers has been to use microalgae as a natural bioreactor for the production of valuable and complex compounds. Recombinant protein expression in the chloroplasts of green algae has recently become more routine; however, the heterologous expression of multiple proteins or complete biosynthetic pathways remains a significant challenge. To take full advantage of these organisms' natural abilities, sophisticated molecular tools are needed to be able to introduce and functionally express multiple gene biosynthetic pathways in their genomes. To achieve the above objective, we have sought to establish a method to construct, integrate and express multigene operons in the chloroplast and nuclear genome of the model microalga Chlamydomonas reinhardtii. Here we show that a modified DNA Assembler approach can be used to rapidly assemble multiple-gene biosynthetic pathways in yeast and then integrate these assembled pathways at a site-specific location in the chloroplast, or by random integration in the nuclear genome of C. reinhardtii. As a proof of concept, this method was used to successfully integrate and functionally express up to three reporter proteins (AphA6, AadA, and GFP) in the chloroplast of C. reinhardtii and up to three reporter proteins (Ble, AphVIII, and GFP) in its nuclear genome. An analysis of the relative gene expression of the engineered strains showed significant differences in the mRNA expression levels of the reporter genes and thus highlights the importance of proper promoter/untranslated-region selection when constructing a target pathway. In addition, this work focuses on expressing the cofactor regeneration enzyme phosphite dehydrogenase (PTDH) in the chloroplast and nuclear genomes of C. reinhardtii. The PTDH enzyme converts phosphite into phosphate and NAD(P)+ into NAD(P)H. 
The reduced nicotinamide cofactor NAD(P)H plays a pivotal role in many biochemical oxidation and reduction reactions, thus this enzyme would allow regeneration of NAD(P)H in a microalgae strain over-expressing a NAD(P)H-dependent oxidoreductase. A phosphite dehydrogenase gene was introduced into the chloroplast genome (codon optimized) and nuclear genome of C. reinhardtii by biolistic transformation and electroporation in separate events, respectively. Successful expression of the heterologous protein was confirmed by transcript analysis and protein analysis. In conclusion, this new method represents a useful genetic tool in the construction and integration of complex biochemical pathways into the chloroplast or nuclear genome of microalgae, and this should aid current efforts to engineer algae for recombinant protein expression, biofuels production and production of other desirable natural products.
Entropic fluctuations in DNA sequences
NASA Astrophysics Data System (ADS)
Thanos, Dimitrios; Li, Wentian; Provata, Astero
2018-03-01
The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows meaningful information to be extracted for the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
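A block-wise entropy profile of the kind described above can be sketched in a few lines. The abstract does not give the exact definition used in the paper, so this sketch assumes the simplest variant: non-overlapping blocks and entropy computed from single-base frequencies within each block. The function name and the block size in the example are hypothetical.

```python
from collections import Counter
from math import log2


def local_shannon_entropy(seq, block_size):
    """Shannon entropy (in bits) of base frequencies in consecutive blocks.

    Assumptions: blocks do not overlap, and a trailing partial block is dropped.
    """
    entropies = []
    for start in range(0, len(seq) - block_size + 1, block_size):
        counts = Counter(seq[start:start + block_size])
        entropy = 0.0
        for n in counts.values():
            p = n / block_size
            entropy -= p * log2(p)
        entropies.append(entropy)
    return entropies


# A homopolymer block carries no information; a block with all four
# bases in equal proportion reaches the 2-bit maximum.
print(local_shannon_entropy("AAAAACGTACGT", 4))
```

Under this simplified definition, a long tandem-repeat region yields nearly the same value in every block, which matches the low fluctuation the abstract reports for centromeres.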
2013-01-01
Background Homosporous ferns are distinctive amongst the land plant lineages for their high chromosome numbers and enigmatic genomes. Genome size measurements are an under exploited tool in homosporous ferns and show great potential to provide an overview of the mechanisms that define genome evolution in these ferns. The aim of this study is to investigate the evolution of genome size and the relationship between genome size and spore size within the apomictic Asplenium monanthes fern complex and related lineages. Results Comparative analyses to test for a relationship between spore size and genome size show that they are not correlated. The data do however provide evidence for marked genome size variation between species in this group. These results indicate that Asplenium monanthes has undergone a two-fold expansion in genome size. Conclusions Our findings challenge the widely held assumption that spore size can be used to infer ploidy levels within apomictic fern complexes. We argue that the observed genome size variation is likely to have arisen via increases in both chromosome number due to polyploidy and chromosome size due to amplification of repetitive DNA (e.g. transposable elements, especially retrotransposons). However, to date the latter has not been considered to be an important process of genome evolution within homosporous ferns. We infer that genome evolution, at least in some homosporous fern lineages, is a more dynamic process than existing studies would suggest. PMID:24354467
A novel mode of lactate metabolism in strictly anaerobic bacteria.
Weghoff, Marie Charlotte; Bertsch, Johannes; Müller, Volker
2015-03-01
Lactate is a common substrate for major groups of strictly anaerobic bacteria, but the biochemistry and bioenergetics of lactate oxidation are obscure. The high redox potential of the pyruvate/lactate pair of E0' = -190 mV excludes direct NAD(+) reduction (E0' = -320 mV). To identify the hitherto unknown electron acceptor, we have purified the lactate dehydrogenase (LDH) from the strictly anaerobic, acetogenic bacterium Acetobacterium woodii. The LDH forms a stable complex with an electron-transferring flavoprotein (Etf) that exhibited NAD(+) reduction only when reduced ferredoxin (Fd(2-)) was present. Biochemical analyses revealed that the LDH/Etf complex of A. woodii uses flavin-based electron confurcation to drive endergonic lactate oxidation with NAD(+) as oxidant at the expense of simultaneous exergonic electron flow from reduced ferredoxin (E0' ≈ -500 mV) to NAD(+) according to: lactate + Fd(2-) + 2 NAD(+) → pyruvate + Fd + 2 NADH. The reduced Fd(2-) is regenerated from NADH by a sequence of events that involves conversion of chemical (ATP) to electrochemical (Δμ̃Na+) and finally redox energy (Fd(2-) from NADH) via reversed electron transport catalysed by the Rnf complex. Inspection of genomes revealed that this metabolic scenario for lactate oxidation may also apply to many other anaerobes. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.
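The energetic argument above follows from the standard relation ΔG0' = -n·F·ΔE0'. The sketch below applies it to the potentials quoted in the abstract. The function name is hypothetical, two electrons are assumed per reaction, and the ferredoxin potential is given only approximately in the abstract, so the results are rough estimates.

```python
FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant in kJ per volt per mole of electrons


def delta_g0_prime(n_electrons, e_donor_v, e_acceptor_v):
    """Standard free energy change (kJ/mol) for electron flow from donor to acceptor."""
    return -n_electrons * FARADAY_KJ_PER_V_MOL * (e_acceptor_v - e_donor_v)


# Lactate (-0.190 V) to NAD+ (-0.320 V): positive, so endergonic on its own.
lactate_to_nad = delta_g0_prime(2, -0.190, -0.320)
# Reduced ferredoxin (about -0.500 V) to NAD+ (-0.320 V): negative, so exergonic.
fd_to_nad = delta_g0_prime(2, -0.500, -0.320)

print(round(lactate_to_nad, 1), round(fd_to_nad, 1), round(lactate_to_nad + fd_to_nad, 1))
```

With these inputs the lactate step costs roughly 25 kJ/mol and the ferredoxin step releases roughly 35 kJ/mol, so the coupled reaction is exergonic overall, consistent with the confurcation mechanism the abstract describes.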
Feltus, F Alex
2014-06-01
Understanding the control of any trait optimally requires the detection of causal genes, gene interaction, and mechanism of action to discover and model the biochemical pathways underlying the expressed phenotype. Functional genomics techniques, including RNA expression profiling via microarray and high-throughput DNA sequencing, allow for the precise genome localization of biological information. Powerful genetic approaches, including quantitative trait locus (QTL) and genome-wide association study mapping, link phenotype with genome positions, yet genetics is less precise in localizing the relevant mechanistic information encoded in DNA. The coupling of salient functional genomic signals with genetically mapped positions is an appealing approach to discover meaningful gene-phenotype relationships. Techniques used to define this genetic-genomic convergence comprise the field of systems genetics. This short review will address an application of systems genetics where RNA profiles are associated with genetically mapped genome positions of individual genes (eQTL mapping) or as gene sets (co-expression network modules). Both approaches can be applied for knowledge independent selection of candidate genes (and possible control mechanisms) underlying complex traits where multiple, likely unlinked, genomic regions might control specific complex traits. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Darwinian evolution in the light of genomics
Koonin, Eugene V.
2009-01-01
Comparative genomics and systems biology offer unprecedented opportunities for testing central tenets of evolutionary biology formulated by Darwin in the Origin of Species in 1859 and expanded in the Modern Synthesis 100 years later. Evolutionary-genomic studies show that natural selection is only one of the forces that shape genome evolution and is not quantitatively dominant, whereas non-adaptive processes are much more prominent than previously suspected. Major contributions of horizontal gene transfer and diverse selfish genetic elements to genome evolution undermine the Tree of Life concept. An adequate depiction of evolution requires the more complex concept of a network or ‘forest’ of life. There is no consistent tendency of evolution towards increased genomic complexity, and when complexity increases, this appears to be a non-adaptive consequence of evolution under weak purifying selection rather than an adaptation. Several universals of genome evolution were discovered including the invariant distributions of evolutionary rates among orthologous genes from diverse genomes and of paralogous gene family sizes, and the negative correlation between gene expression level and sequence evolution rate. Simple, non-adaptive models of evolution explain some of these universals, suggesting that a new synthesis of evolutionary biology might become feasible in a not so remote future. PMID:19213802
Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J; Bajic, Vladimir B; Drazen, Jeffrey C; Bartlett, Douglas; Qian, Pei-Yuan
2018-01-01
Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth's oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the "piezolyte" function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere.
Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J.; Bajic, Vladimir B.; Drazen, Jeffrey C.; Bartlett, Douglas
2018-01-01
ABSTRACT Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth’s oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the “piezolyte” function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere. PMID:29657971
Calhoun, Eric S; Hucl, Tomas; Gallmeier, Eike; West, Kristen M; Arking, Dan E; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Chakravarti, Aravinda; Hruban, Ralph H; Kern, Scott E
2006-08-15
Recent advances in oligonucleotide arrays and whole-genome complexity reduction data analysis now permit the evaluation of tens of thousands of single-nucleotide polymorphisms simultaneously for a genome-wide analysis of allelic status. Using these arrays, we created high-resolution allelotype maps of 26 pancreatic cancer cell lines. The areas of heterozygosity implicitly served to reveal regions of allelic loss. The array-derived maps were verified by a panel of 317 microsatellite markers used in a subset of seven samples, showing a 97.1% concordance between heterozygous calls. Three matched tumor/normal pairs were used to estimate the false-negative and potential false-positive rates for identifying loss of heterozygosity: 3.6 regions (average minimal region of loss, 720,228 bp) and 2.3 regions (average heterozygous gap distance, 4,434,994 bp) per genome, respectively. Genomic fractional allelic loss calculations showed that cumulative levels of allelic loss ranged widely from 17.1% to 79.9% of the haploid genome length. Regional increases in "NoCall" frequencies combined with copy number loss estimates were used to identify 41 homozygous deletions (19 first reports), implicating an additional 13 regions disrupted in pancreatic cancer. Unexpectedly, 23 of these occurred in just two lines (BxPc3 and MiaPaCa2), suggesting the existence of at least two subclasses of chromosomal instability (CIN) patterns, distinguished here by allelic loss and copy number changes (original CIN) and those also highly enriched in the genomic "holes" of homozygous deletions (holey CIN). This study provides previously unavailable high-resolution allelotype and deletion breakpoint maps in widely shared pancreatic cancer cell lines and effectively eliminates the need for matched normal tissue to define informative loci.
Ensembl genomes 2016: more genomes, more complexity
USDA-ARS's Scientific Manuscript database
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent...
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers
2016-09-20
diagnostics for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the ... diverse B. pseudomallei/mallei strains were sequenced, assembled, and deposited in public databases (Supplemental Table 1); these genomes were ... combined with 160 B. pseudomallei/mallei genome assemblies already in public databases. Most of the genomes (n=779) in this study were
Cho, Yong-Joon; Yi, Hana; Chun, Jongsik; Cho, Sang-Nae; Daley, Charles L; Koh, Won-Jung; Shin, Sung Jae
2013-01-01
Members of the Mycobacterium abscessus complex are rapidly growing mycobacteria that are emerging as human pathogens. The M. abscessus complex was previously composed of three species, namely M. abscessus sensu stricto, 'M. massiliense', and 'M. bolletii'. In 2011, 'M. massiliense' and 'M. bolletii' were united and reclassified as a single subspecies within M. abscessus: M. abscessus subsp. bolletii. However, the placement of 'M. massiliense' within the boundary of M. abscessus subsp. bolletii remains highly controversial with regard to clinical aspects. In this study, we revisited the taxonomic status of members of the M. abscessus complex based on comparative analysis of the whole-genome sequences of 53 strains. The genome sequence of the previous type strain of 'Mycobacterium massiliense' (CIP 108297) was determined using next-generation sequencing. The genome tree based on average nucleotide identity (ANI) values supported the differentiation of 'M. bolletii' and 'M. massiliense' at the subspecies level. The genome tree also clearly illustrated that 'M. bolletii' and 'M. massiliense' form a distinct phylogenetic clade within the radiation of the M. abscessus complex. The genomic distances observed in this study suggest that the current M. abscessus subsp. bolletii taxon should be divided into two subspecies, M. abscessus subsp. massiliense subsp. nov. and M. abscessus subsp. bolletii, to correspondingly accommodate the previously known 'M. massiliense' and 'M. bolletii' strains.
Petrovska, Liljana; Tang, Yue; Jansen van Rensburg, Melissa J; Cawthraw, Shaun; Nunez, Javier; Sheppard, Samuel K; Ellis, Richard J; Whatmore, Adrian M; Crawshaw, Tim R; Irvine, Richard M
2017-01-01
The term "spotty liver disease" (SLD) has been used since the late 1990s for a condition seen in the UK and Australia that primarily affects free range laying hens around peak lay, causing acute mortality and a fall in egg production. A novel thermophilic SLD-associated Campylobacter was reported in the United Kingdom (UK) in 2015. Subsequently, similar isolates occurring in Australia were formally described as a new species, Campylobacter hepaticus. We describe the comparative genomics of 10 C. hepaticus isolates recovered from 5 geographically distinct poultry holdings in the UK between 2010 and 2012. Hierarchical gene-by-gene analyses of the study isolates and representatives of 24 known Campylobacter species indicated that C. hepaticus is most closely related to the major pathogens Campylobacter jejuni and Campylobacter coli. We observed low levels of within-farm variation, even between isolates collected over almost 3 years. With respect to C. hepaticus genome features, we noted that the study isolates had a ~140 Kb reduction in genome size, ~144 fewer genes, and a lower GC content compared to C. jejuni. The most notable reduction was in the subsystem containing genes for iron acquisition and metabolism, supported by reduced growth of C. hepaticus in an iron depletion assay. Genome reduction is common among many pathogens and in C. hepaticus has likely been driven at least in part by specialization following the occupation of a new niche, the chicken liver.
Phenotypic and Genomic Analysis of Hypervirulent Human-associated Bordetella bronchiseptica
2012-01-01
Background B. bronchiseptica infections are usually associated with wild or domesticated animals, but infrequently with humans. A recent phylogenetic analysis distinguished two distinct B. bronchiseptica subpopulations, designated complexes I and IV. Complex IV isolates appear to have a bias for infecting humans; however, little is known regarding their epidemiology, virulence properties, or comparative genomics. Results Here we report a characterization of the virulence of human-associated complex IV B. bronchiseptica strains. In in vitro cytotoxicity assays, complex IV strains showed increased cytotoxicity in comparison to a panel of complex I strains. Some complex IV isolates were remarkably cytotoxic, resulting in LDH release levels in A549 cells that were 10- to 20-fold greater than complex I strains. In vivo, a subset of complex IV strains was found to be hypervirulent, with an increased ability to cause lethal pulmonary infections in mice. Hypercytotoxicity in vitro and hypervirulence in vivo were both dependent on the activity of the bsc T3SS and the BteA effector. To clarify differences between lineages, representative complex IV isolates were sequenced and their genomes were compared to complex I isolates. Although our analysis showed there were no genomic sequences that can be considered unique to complex IV strains, there were several loci that were predominantly found in complex IV isolates. Conclusion Our observations reveal a T3SS-dependent hypervirulence phenotype in human-associated complex IV isolates, highlighting the need for further studies on the epidemiology and evolutionary dynamics of this B. bronchiseptica lineage. PMID:22863321
The translational landscape of Arabidopsis mitochondria.
Planchard, Noelya; Bertin, Pierre; Quadrado, Martine; Dargel-Graffin, Céline; Hatin, Isabelle; Namy, Olivier; Mireau, Hakim
2018-06-05
Messenger RNA translation is a complex process that is still poorly understood in eukaryotic organelles like mitochondria. Growing evidence indicates, though, that mitochondrial translation differs from its bacterial counterpart in many key aspects. In this analysis, we have used ribosome profiling technology to generate a genome-wide snapshot view of mitochondrial translation in Arabidopsis. We show that, unlike in humans, most Arabidopsis mitochondrial ribosome footprints measure 27 and 28 bases. We also reveal that mRNAs encoding respiratory subunits show much higher ribosome association than other mitochondrial mRNAs, implying that they are translated at higher levels. Homogeneous ribosome densities were generally detected within each respiratory complex except for complex V, where higher ribosome coverage was consistent with higher requirements for specific subunits. In complex I respiratory mutants, a reorganization of mitochondrial mRNA ribosome association was detected, involving increased ribosome densities for certain ribosomal protein encoding transcripts and a reduction in translation of a few complex V mRNAs. Taken together, our observations reveal that plant mitochondrial translation is a dynamic process and that translational control is important for gene expression in plant mitochondria. This study paves the way for future advances in understanding translation in higher plant mitochondria.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.
Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X
2016-01-01
PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Functional assessment of human enhancer activities using whole-genome STARR-sequencing.
Liu, Yuwen; Yu, Shan; Dhiman, Vineet K; Brunetti, Tonya; Eckart, Heather; White, Kevin P
2017-11-20
Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome. In the current study, we use a human prostate cancer cell line, LNCaP, as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When cells are treated with the histone deacetylase inhibitor Trichostatin A, genes near this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K.; Fryszczyn, Bartlomiej G.; Fox, George E.; Tirumalai, Madhan R.; Liu, Yamei; Kim, Sun
2015-01-01
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. PMID:25953173
2011-01-01
Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequence. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome.
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
Gouret, Philippe; Vitiello, Vérane; Balandraud, Nathalie; Gilles, André; Pontarotti, Pierre; Danchin, Etienne GJ
2005-01-01
Background Two of the main objectives of the genomic and post-genomic era are to structurally and functionally annotate genomes, which consists of detecting the position and structure of genes and inferring their function (as well as other features of genomes). Structural and functional annotation both require the complex chaining of numerous software tools, algorithms and methods under the supervision of a biologist. The automation of these pipelines is necessary to manage the huge amounts of data released by sequencing projects. Several pipelines already automate some of this complex chaining but still require substantial input from biologists to supervise and control the results at various steps. Results Here we propose an innovative automated platform, FIGENIX, which includes an expert system capable of substituting for human expertise at several key steps. FIGENIX currently automates complex pipelines of structural and functional annotation under the supervision of the expert system (which can, for example, make key decisions, check intermediate results or refine the dataset). The quality of the results produced by FIGENIX is comparable to that obtained by expert biologists, with a drastic gain in terms of time costs and avoidance of errors due to the human manipulation of data. Conclusion The core engine and expert system of the FIGENIX platform currently handle complex annotation processes of broad interest for the genomic community. They could be easily adapted to new or more specialized pipelines, such as the annotation of miRNAs, the classification of complex multigenic families, and the annotation of regulatory elements and other genomic features of interest. PMID:16083500
Padmanabhan, Sandosh; Aman, Alisha; Dominiczak, Anna F
2018-06-07
Hypertension is recognised as the biggest contributor to the global burden of disease, but it is controlled in less than a fifth of patients worldwide, despite being relatively easy to detect and the availability of inexpensive safe generic drugs. Blood pressure is regulated by a complex network of physiologic pathways with currently available drugs targeting key receptors or enzymes in the top pathways. Major advances in the dissection of both monogenic and polygenic determinants of blood pressure regulation and variation have not resulted in rapid translation of these discoveries into clinical applications or precision medicine. Uromodulin is an example of a novel gene for hypertension identified from genome-wide association studies, currently the basis of a clinical trial to reposition loop diuretics in hypertension management. Gene-editing studies have established a genome-wide association studies (GWAS) SNP in chromosome 6p24, implicated in six conditions including hypertension, as a distal regulator of the endothelin-1 gene around 3000 base pairs away. Genomics of aldosterone-producing adenomas bring to focus the paradox in genomic medicine where availability of cheap generic drugs may render precision medicine uneconomical. The speed of technology-driven genomic discoveries and the sluggish traditional pathways of drug development and translation need harmonisation to make a timely and early impact on global public health. This requires a directed collaborative effort for which we propose a hypertension moonshot to make a quantum leap in hypertension management and cardiovascular risk reduction by bringing together traditional bioscience, omics, engineering, digital technology and data science.
Comparison of phasing strategies for whole human genomes
Kirkness, Ewen; Schork, Nicholas J.
2018-01-01
Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not ‘phase’ the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available ‘Genome-In-A-Bottle’ (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. 
We also considered a majority voting scheme for the construction of a consensus haplotype combining multiple predictions for enhanced performance and site coverage. Finally, we also identified DNA sequence signatures associated with the genomic regions harboring phasing switch errors, which included regions of low polymorphism or SNV density. PMID:29621242
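The switch error rate named among the metrics above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the evaluation code used in the study: the function name `switch_error_rate` and the encoding of a haplotype as a string of '0'/'1' alleles at heterozygous sites are hypothetical choices made here for clarity.

```python
def switch_error_rate(truth, predicted):
    """Fraction of adjacent heterozygous-site pairs whose relative phase differs.

    truth, predicted: strings of '0'/'1' giving the allele carried by
    haplotype 1 at each heterozygous site, in genomic order (assumed encoding).
    """
    if len(truth) != len(predicted):
        raise ValueError("haplotypes must cover the same sites")
    if len(truth) < 2:
        return 0.0
    # For each site, note whether the prediction has the same orientation
    # as the truth (True) or the flipped orientation (False).
    same = [t == p for t, p in zip(truth, predicted)]
    # A switch error is a change of orientation between consecutive sites.
    switches = sum(1 for a, b in zip(same, same[1:]) if a != b)
    return switches / (len(truth) - 1)


if __name__ == "__main__":
    print(switch_error_rate("010101", "010010"))  # one switch over five intervals
```

Because only changes of orientation are counted, a prediction that is the exact complement of the truth scores zero, which reflects that the labels "haplotype 1" and "haplotype 2" are arbitrary.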
Optimizing complex phenotypes through model-guided multiplex genome engineering
Kuznetsov, Gleb; Goodman, Daniel B.; Filsinger, Gabriel T.; ...
2017-05-25
Here, we present a method for identifying genomic modifications that optimize a complex phenotype through multiplex genome engineering and predictive modeling. We apply our method to identify six single nucleotide mutations that recover 59% of the fitness defect exhibited by the 63-codon E. coli strain C321.ΔA. By introducing targeted combinations of changes in multiplex we generate rich genotypic and phenotypic diversity and characterize clones using whole-genome sequencing and doubling time measurements. Regularized multivariate linear regression accurately quantifies individual allelic effects and overcomes bias from hitchhiking mutations and context-dependence of genome editing efficiency that would confound other strategies.
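The idea of estimating individual allelic effects by regularized regression can be sketched as follows. The abstract does not state which penalty was used, so this sketch assumes an L2 (ridge) penalty as a stand-in; the function name `ridge_allelic_effects`, the 0/1 genotype encoding, and the `alpha` parameter are all hypothetical and chosen for illustration only.

```python
import numpy as np


def ridge_allelic_effects(genotypes, doubling_times, alpha=1.0):
    """Estimate per-allele effects on doubling time with ridge regression.

    genotypes: (n_clones, n_alleles) array of 0/1 calls (assumed encoding).
    doubling_times: (n_clones,) measured phenotype.
    alpha: L2 penalty strength (an assumption; the source does not give one).
    Returns (effects, intercept).
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(doubling_times, dtype=float)
    # Centre the data so the intercept is not penalised.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Closed-form ridge solution: (Xc^T Xc + alpha * I)^-1 Xc^T yc
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    effects = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ effects
    return effects, intercept
```

Fitting all alleles jointly is what lets the model separate a causal allele from a hitchhiking one that merely co-occurs with it in some clones, provided the clones carry sufficiently varied combinations.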
Multiple origins of interdependent endosymbiotic complexes in a genus of cicadas.
Łukasik, Piotr; Nazario, Katherine; Van Leuven, James T; Campbell, Matthew A; Meyer, Mariah; Michalik, Anna; Pessacq, Pablo; Simon, Chris; Veloso, Claudio; McCutcheon, John P
2018-01-09
Bacterial endosymbionts that provide nutrients to hosts often have genomes that are extremely stable in structure and gene content. In contrast, the genome of the endosymbiont Hodgkinia cicadicola has fractured into multiple distinct lineages in some species of the cicada genus Tettigades. To better understand the frequency, timing, and outcomes of Hodgkinia lineage splitting throughout this cicada genus, we sampled cicadas over three field seasons in Chile and performed genomics and microscopy on representative samples. We found that a single ancestral Hodgkinia lineage has split at least six independent times in Tettigades over the last 4 million years, resulting in complexes of between two and six distinct Hodgkinia lineages per host. Individual genomes in these symbiotic complexes differ dramatically in relative abundance, genome size, organization, and gene content. Each Hodgkinia lineage retains a small set of core genes involved in genetic information processing, but the high level of gene loss experienced by all genomes suggests that extensive sharing of gene products among symbiont cells must occur. In total, Hodgkinia complexes that consist of multiple lineages encode nearly complete sets of genes present on the ancestral single lineage and presumably perform the same functions as symbionts that have not undergone splitting. However, differences in the timing of the splits, along with dissimilar gene loss patterns on the resulting genomes, have led to very different outcomes of lineage splitting in extant cicadas.
The complex hybrid origins of the root knot nematodes revealed through comparative genomics
Kumar, Sujai; Koutsovoulos, Georgios; Blaxter, Mark L.
2014-01-01
Root knot nematodes (RKN) can infect most of the world’s agricultural crop species and are among the most important of all plant pathogens. As yet however we have little understanding of their origins or the genomic basis of their extreme polyphagy. The most damaging pathogens reproduce by obligatory mitotic parthenogenesis and it has been suggested that these species originated from interspecific hybridizations between unknown parental taxa. We have sequenced the genome of the diploid meiotic parthenogen Meloidogyne floridensis, and use a comparative genomic approach to test the hypothesis that this species was involved in the hybrid origin of the tropical mitotic parthenogen Meloidogyne incognita. Phylogenomic analysis of gene families from M. floridensis, M. incognita and an outgroup species Meloidogyne hapla was carried out to trace the evolutionary history of these species’ genomes, and we demonstrate that M. floridensis was one of the parental species in the hybrid origins of M. incognita. Analysis of the M. floridensis genome itself revealed many gene loci present in divergent copies, as they are in M. incognita, indicating that it too had a hybrid origin. The triploid M. incognita is shown to be a complex double-hybrid between M. floridensis and a third, unidentified, parent. The agriculturally important RKN have very complex origins involving the mixing of several parental genomes by hybridization and their extreme polyphagy and success in agricultural environments may be related to this hybridization, producing transgressive variation on which natural selection can act. It is now clear that studying RKN variation via individual marker loci may fail due to the species’ convoluted origins, and multi-species population genomics is essential to understand the hybrid diversity and adaptive variation of this important species complex. 
This comparative genomic analysis provides a compelling example of the importance and complexity of hybridization in generating animal species diversity more generally. PMID:24860695
Complex multi-enhancer contacts captured by Genome Architecture Mapping (GAM)
Beagrie, Robert A.; Scialdone, Antonio; Schueler, Markus; Kraemer, Dorothee C.A.; Chotalia, Mita; Xie, Sheila Q.; Barbieri, Mariano; de Santiago, Inês; Lavitas, Liron-Mark; Branco, Miguel R.; Fraser, James; Dostie, Josée; Game, Laurence; Dillon, Niall; Edwards, Paul A.W.; Nicodemi, Mario; Pombo, Ana
2017-01-01
Summary The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. We developed a novel genome-wide method, Genome Architecture Mapping (GAM), for measuring chromatin contacts, and other features of three-dimensional chromatin topology, based on sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify an enrichment for specific interactions between active genes and enhancers across very large genomic distances, using a mathematical model ‘SLICE’ (Statistical Inference of Co-segregation). GAM also reveals an abundance of three-way contacts genome-wide, especially between regions that are highly transcribed or contain super-enhancers, highlighting a previously inaccessible complexity in genome architecture and a major role for gene-expression specific contacts in organizing the genome in mammalian nuclei. PMID:28273065
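The notion of inferring contacts from how often loci appear together in thin nuclear sections can be illustrated with a small sketch. This is not the SLICE model described in the abstract; it only computes a raw co-segregation frequency and a simple linkage-style deviation from independence. The function name `cosegregation` and the representation of each section as a set of detected locus labels are assumptions made for illustration.

```python
def cosegregation(sections, locus_a, locus_b):
    """Co-segregation frequency of two loci across nuclear sections.

    sections: list of sets, each holding the loci detected in one section
              (hypothetical representation).
    Returns (f_ab, d), where f_ab is the fraction of sections containing
    both loci and d = f_ab - f_a * f_b measures the excess over what
    independent segregation would give.
    """
    n = len(sections)
    if n == 0:
        raise ValueError("at least one section is required")
    f_a = sum(locus_a in s for s in sections) / n
    f_b = sum(locus_b in s for s in sections) / n
    f_ab = sum(locus_a in s and locus_b in s for s in sections) / n
    return f_ab, f_ab - f_a * f_b


if __name__ == "__main__":
    demo = [{"A", "B"}, {"A", "B"}, {"C"}, {"A"}]
    print(cosegregation(demo, "A", "B"))
```

A positive deviation suggests the two loci are found in the same section more often than chance, which is the kind of signal a contact-calling model would then evaluate against genomic distance.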
DOE Office of Scientific and Technical Information (OSTI.GOV)
Knight, Thomas
2013-04-10
Today it is commonplace to design and construct single silicon chips with billions of transistors. These are complex systems, difficult (but possible) to design, test, and fabricate. Remarkably, simple living systems can be assembled from a similar number of atoms, most of them in water molecules. In this talk I will present the current status of our attempts at full understanding and complexity reduction of one of the simplest living systems, the free-living bacterial species Mesoplasma florum. This 400 nm diameter cell thrives and replicates every 40 minutes with a genome of only 800 kilobases. Our recent experiments using transposon gene knockouts identified 354 of 683 annotated genes as inessential in laboratory culture when inactivated individually. While a functional redesigned genome will certainly not remove all of those genes, this suggests that roughly half the genome can be removed in an intentional redesign. I will discuss our recent knockout results and methodology, and our future plans for Genome re-engineering using targeted knock-in/knock-out double recombination; whole cell metabolic models; comprehensive whole cell metabolite measurement techniques; creation of plug-and-play metabolic modules for the simplified organism; inherent and engineered biosafety control mechanisms. This redesign is part of a comprehensive plan to lay the foundations for a new discipline of engineering biology. Engineering biological systems requires a fundamentally different viewpoint from that taken by the science of biology. Key engineering principles of modularity, simplicity, separation of concerns, abstraction, flexibility, hierarchical design, isolation, and standardization are of critical importance. The essence of engineering is the ability to imagine, design, model, build, and characterize novel systems to achieve specific goals. Current tools and components for these tasks are primitive.
Our approach is to create and distribute standard biological parts, organisms, assembly techniques, and measurement techniques as a way of enabling this new field.
Evolution of Genome Size and Complexity in the Rhabdoviridae
Walker, Peter J.; Firth, Cadhla; Widen, Steven G.; Blasdell, Kim R.; Guzman, Hilda; Wood, Thomas G.; Paradkar, Prasad N.; Holmes, Edward C.; Tesh, Robert B.; Vasilakis, Nikos
2015-01-01
RNA viruses exhibit substantial structural, ecological and genomic diversity. However, genome size in RNA viruses is likely limited by a high mutation rate, resulting in the evolution of various mechanisms to increase complexity while minimising genome expansion. Here we conduct a large-scale analysis of the genome sequences of 99 animal rhabdoviruses, including 45 genomes which we determined de novo, to identify patterns of genome expansion and the evolution of genome complexity. All but seven of the rhabdoviruses clustered into 17 well-supported monophyletic groups, of which eight corresponded to established genera, seven were assigned as new genera, and two were taxonomically ambiguous. We show that the acquisition and loss of new genes appears to have been a central theme of rhabdovirus evolution, and has been associated with the appearance of alternative, overlapping and consecutive ORFs within the major structural protein genes, and the insertion and loss of additional ORFs in each gene junction in a clade-specific manner. Changes in the lengths of gene junctions accounted for as much as 48.5% of the variation in genome size from the smallest to the largest genome, and the frequency with which new ORFs were observed increased in the 3’ to 5’ direction along the genome. We also identify several new families of accessory genes encoded in these regions, and show that non-canonical expression strategies involving TURBS-like termination-reinitiation, ribosomal frame-shifts and leaky ribosomal scanning appear to be common. We conclude that rhabdoviruses have an unusual capacity for genomic plasticity that may be linked to their discontinuous transcription strategy from the negative-sense single-stranded RNA genome, and propose a model that accounts for the regular occurrence of genome expansion and contraction throughout the evolution of the Rhabdoviridae. PMID:25679389
Schelkunov, Mikhail I.; Shtratnikova, Viktoria Yu; Nuraliev, Maxim S.; Selosse, Marc-Andre; Penin, Aleksey A.; Logacheva, Maria D.
2015-01-01
The question of the patterns and limits of plastid genome reduction in nonphotosynthetic plants, and the reasons for their conservation, is one of the intriguing topics in plant genome evolution. Here, we report sequencing and analysis of the plastid genomes of the nonphotosynthetic orchids Epipogium aphyllum and Epipogium roseum, which, with sizes of 31 and 19 kbp, respectively, represent the smallest plastid genomes characterized to date. Besides drastic reduction, which is expected, we found several unusual features of these “minimal” plastomes: multiple rearrangements, highly biased nucleotide composition, and an unprecedentedly high substitution rate. Only 27 and 29 genes remained intact in the plastomes of E. aphyllum and E. roseum, respectively: those encoding ribosomal components, transfer RNAs, and three additional housekeeping genes (infA, clpP, and accD). We found no signs of relaxed selection acting on these genes. We hypothesize that the main reason for retention of plastid genomes in Epipogium is the necessity to translate messenger RNAs (mRNAs) of the accD and/or clpP proteins, which are essential for cell metabolism. However, these genes are absent in plastomes of several plant species; their absence is compensated by the presence of a functional copy that arose by gene transfer from the plastid to the nuclear genome. This suggests that there is no single set of plastid-encoded essential genes, but rather different sets for different species and that the retention of a gene in the plastome depends on the interaction between the nucleus and plastids. PMID:25635040
Genomes as geography: using GIS technology to build interactive genome feature maps
Dolan, Mary E; Holden, Constance C; Beard, M Kate; Bult, Carol J
2006-01-01
Background Many commonly used genome browsers display sequence annotations and related attributes as horizontal data tracks that can be toggled on and off according to user preferences. Most genome browsers use only simple keyword searches and limit the display of detailed annotations to one chromosomal region of the genome at a time. We have employed concepts, methodologies, and tools that were developed for the display of geographic data to develop a Genome Spatial Information System (GenoSIS) for displaying genomes spatially, and interacting with genome annotations and related attribute data. In contrast to the paradigm of horizontally stacked data tracks used by most genome browsers, GenoSIS uses the concept of registered spatial layers composed of spatial objects for integrated display of diverse data. In addition to basic keyword searches, GenoSIS supports complex queries, including spatial queries, and dynamically generates genome maps. Our adaptation of the geographic information system (GIS) model in a genome context supports spatial representation of genome features at multiple scales with a versatile and expressive query capability beyond that supported by existing genome browsers. Results We implemented an interactive genome sequence feature map for the mouse genome in GenoSIS, an application that uses ArcGIS, a commercially available GIS software system. The genome features and their attributes are represented as spatial objects and data layers that can be toggled on and off according to user preferences or displayed selectively in response to user queries. GenoSIS supports the generation of custom genome maps in response to complex queries about genome features based on both their attributes and locations. Our example application of GenoSIS to the mouse genome demonstrates the powerful visualization and query capability of mature GIS technology applied in a novel domain. 
Conclusion Mapping tools developed specifically for geographic data can be exploited to display, explore and interact with genome data. The approach we describe here is organism independent and is equally useful for linear and circular chromosomes. One of the unique capabilities of GenoSIS compared to existing genome browsers is the capacity to generate genome feature maps dynamically in response to complex attribute and spatial queries. PMID:16984652
The Effects of Signal Erosion and Core Genome Reduction on the Identification of Diagnostic Markers
Sahl, Jason W.; Vazquez, Adam J.; Hall, Carina M.; Busch, Joseph D.; Tuanyok, Apichai; Mayo, Mark; Schupp, James M.; Lummis, Madeline; Pearson, Talima; Shippy, Kenzie; Allender, Christopher J.; Theobald, Vanessa; Hutcheson, Alex; Korlach, Jonas; LiPuma, John J.; Ladner, Jason; Lovett, Sean; Koroleva, Galina; Palacios, Gustavo; Limmathurotsakul, Direk; Wuthiekanun, Vanaporn; Wongsuwan, Gumphol; Currie, Bart J.
2016-01-01
ABSTRACT Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. PMID:27651357
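The shrinking of the core genome and the erosion of target-specific regions as more genomes are analyzed can be sketched with plain set operations. This is a minimal illustration, assuming each genome is reduced to a set of gene or region identifiers; the function names `core_genome_sizes` and `target_specific_core` are hypothetical and do not refer to any tool used in the study.

```python
def core_genome_sizes(genomes):
    """Core genome size after each genome is added, in the order given.

    genomes: iterable of sets of gene/region identifiers (assumed encoding).
    """
    sizes = []
    core = None
    for gene_set in genomes:
        # The core is whatever is shared by every genome seen so far.
        core = set(gene_set) if core is None else core & set(gene_set)
        sizes.append(len(core))
    return sizes


def target_specific_core(target_genomes, near_neighbor_genomes):
    """Core regions of the target that appear in no near-neighbor genome."""
    target_genomes = [set(g) for g in target_genomes]
    if not target_genomes:
        raise ValueError("at least one target genome is required")
    core = set.intersection(*target_genomes)
    # Signal erosion: every related genome can remove candidate markers.
    for neighbor in near_neighbor_genomes:
        core -= set(neighbor)
    return core


if __name__ == "__main__":
    targets = [{"a", "b", "c", "d"}, {"a", "b", "c"}, {"a", "b", "e"}]
    print(core_genome_sizes(targets))
    print(target_specific_core(targets, [{"b", "x"}]))
```

The core size can only stay the same or decrease as genomes are added, which is why a marker set designed from a small database may produce false negatives or false positives once more isolates and related species are sequenced.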
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K; Fryszczyn, Bartlomiej G; Fox, George E; Tirumalai, Madhan R; Liu, Yamei; Kim, Sun; Kehoe, David M; Weinstock, George M
2015-05-07
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. Copyright © 2015 Yerrapragada et al.
Francisco, Joel Celio; Dai, Qian; Luo, Zhuojuan; Wang, Yan; Chong, Roxanne Hui-Heng; Tan, Yee Joo; Xie, Wei; Lee, Guan-Huei; Lin, Chengqi
2017-10-01
Chronic hepatitis B virus (HBV) infection can lead to liver cirrhosis and hepatocellular carcinoma. HBV reactivation during or after chemotherapy is a potentially fatal complication for cancer patients with chronic HBV infection. Transcription of HBV is a critical intermediate step of the HBV life cycle. However, factors controlling HBV transcription remain largely unknown. Here, we found that different P-TEFb complexes are involved in the transcription of the HBV genome. Both BRD4 and the super elongation complex (SEC) bind to the HBV genome. Treatment with the bromodomain inhibitor JQ1 stimulates HBV transcription and increases the occupancy of BRD4 on the HBV genome, suggesting bromodomain-independent recruitment of BRD4 to the HBV genome. JQ1 also leads to increased binding of SEC to the HBV genome, and SEC is required for JQ1-induced HBV transcription. These findings reveal a novel mechanism by which the HBV genome hijacks the host P-TEFb-containing complexes to promote its own transcription. Our findings also point to an important clinical implication, namely the potential risk of HBV reactivation during therapy with a BRD4 inhibitor, such as JQ1 or its analogues, which are a potential treatment for acute myeloid leukemia. Copyright © 2017 American Society for Microbiology.
Vanwonterghem, Inka; Jensen, Paul D; Rabaey, Korneel; Tyson, Gene W
2016-09-01
Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.
Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin
2017-04-28
Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
Kujur, Alice; Saxena, Maneesha S; Bajaj, Deepak; Laxmi; Parida, Swarup K
2013-12-01
Rapid population growth, climate change and global warming are now considered major threats to agriculture and the world's food security. To improve the productivity and sustainability of agriculture, the development of high-yielding, durable, abiotic and biotic stress-tolerant cultivars and climate-resilient crops is essential. Hence, understanding the molecular mechanisms of complex quantitative yield and stress tolerance traits, and dissecting those traits, is the prime objective of current agricultural biotechnology research. In recent years, tremendous progress has been made in plant genomics and molecular breeding research, including conventional and next-generation whole genome, transcriptome and epigenome sequencing efforts, the generation of huge genomic, transcriptomic and epigenomic resources, and the development of modern genomics-assisted breeding approaches in diverse crop genotypes with contrasting yield and abiotic stress tolerance traits. Unfortunately, the detailed molecular mechanisms and gene regulatory networks controlling such complex quantitative traits are not yet well understood in crop plants. Therefore, we propose an integrated strategy involving the large and diverse set of available traditional and modern -omics (structural, functional, comparative and epigenomics) approaches and resources, together with genomics-assisted breeding methods, which agricultural biotechnologists can adopt to dissect and decode the molecular and gene regulatory networks involved in complex quantitative yield and stress tolerance traits in crop plants. This would provide clues and much-needed inputs for rapid selection of novel, functionally relevant molecular tags regulating such complex traits, to expedite traditional and modern marker-assisted genetic enhancement studies in target crop species for developing high-yielding stress-tolerant varieties.
Ethical, legal, and social issues in the translation of genomics into health care.
Badzek, Laurie; Henaghan, Mark; Turner, Martha; Monsen, Rita
2013-03-01
The rapid continuous feed of new information from scientific discoveries related to the human genome makes translation and incorporation of information into the clinical setting difficult and creates ethical, legal, and social challenges for providers. This article overviews some of the legal and ethical foundations that guide our response to current complex issues in health care associated with the impact of scientific discoveries related to the human genome. Overlapping ethical, legal, and social implications impact nurses and other healthcare professionals as they seek to identify and translate into practice important information related to new genomic scientific knowledge. Ethical and legal foundations such as professional codes, human dignity, and human rights provide the framework for understanding highly complex genomic issues. Ethical, legal, and social concerns of the health provider in the translation of genomic knowledge into practice including minimizing harms, maximizing benefits, transparency, confidentiality, and informed consent are described. Additionally, nursing professional competencies related to ethical, legal, and social issues in the translation of genomics into health care are discussed. Ethical, legal, and social considerations in new genomic discovery necessitate that healthcare professionals have knowledge and competence to respond to complex genomic issues and provide appropriate information and care to patients, families, and communities. Understanding the ethical, legal, and social issues in the translation of genomic information into practice is essential to provide patients, families, and communities with competent, safe, effective health care. © 2013 Sigma Theta Tau International.
Genetics and Genomics of Acute Neurologic Disorders.
Maserati, Megan; Alexander, Sheila A
2018-01-01
Neurologic diseases and injuries are complex and multifactorial, making risk prediction, targeted treatment modalities, and outcome prognostication difficult and elusive. Genetics and genomics have affected clinical practice in many aspects in medicine, particularly cancer treatment. Advancements in knowledge of genetic and genomic variability in neurologic disease and injury are growing rapidly. Although these data are not yet ready for use in clinical practice, research continues to progress and elucidate information that eventually will provide answers to complex neurologic questions and serve as a platform to provide individualized care plans aimed at improving outcomes. This article provides a focused review of relevant literature on genetics, genomics, and common complex neurologic disease and injury likely to be seen in the acute care setting. ©2018 American Association of Critical-Care Nurses.
Hypothesis: Gene-rich plastid genomes in red algae may be an outcome of nuclear genome reduction.
Qiu, Huan; Lee, Jun Mo; Yoon, Hwan Su; Bhattacharya, Debashish
2017-06-01
Red algae (Rhodophyta) putatively diverged from the eukaryote tree of life >1.2 billion years ago and are the source of plastids in the ecologically important diatoms, haptophytes, and dinoflagellates. In general, red algae contain the largest plastid gene inventory among all such organelles derived from primary, secondary, or additional rounds of endosymbiosis. In contrast, their nuclear gene inventory is reduced when compared to their putative sister lineage, the Viridiplantae, and other photosynthetic lineages. The latter is thought to have resulted from a phase of genome reduction that occurred in the stem lineage of Rhodophyta. A recent comparative analysis of a taxonomically broad collection of red algal and Viridiplantae plastid genomes demonstrates that the red algal ancestor encoded ~1.5× more plastid genes than Viridiplantae. This difference is primarily explained by more extensive endosymbiotic gene transfer (EGT) in the stem lineage of Viridiplantae, when compared to red algae. We postulate that limited EGT in Rhodophytes resulted from the countervailing force of ancient, and likely recurrent, nuclear genome reduction. In other words, the propensity for nuclear gene loss led to the retention of red algal plastid genes that would otherwise have undergone intracellular gene transfer to the nucleus. This hypothesis recognizes the primacy of nuclear genome evolution over that of plastids, which have no inherent control of their gene inventory and can change dramatically (e.g., secondarily non-photosynthetic eukaryotes, dinoflagellates) in response to selection acting on the host lineage. © 2017 Phycological Society of America.
Dietzgen, Ralf G.; Kondo, Hideki; Goodin, Michael M.; Kurath, Gael; Vasilakis, Nikos
2017-01-01
The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes.
Lessons learned from the dog genome.
Wayne, Robert K; Ostrander, Elaine A
2007-11-01
Extensive genetic resources and a high-quality genome sequence position the dog as an important model species for understanding genome evolution, population genetics and genes underlying complex phenotypic traits. Newly developed genomic resources have expanded our understanding of canine evolutionary history and dog origins. Domestication involved genetic contributions from multiple populations of gray wolves probably through backcrossing. More recently, the advent of controlled breeding practices has segregated genetic variability into distinct dog breeds that possess specific phenotypic traits. Consequently, genome-wide association and selective sweep scans now allow the discovery of genes underlying breed-specific characteristics. The dog is finally emerging as a novel resource for studying the genetic basis of complex traits, including behavior.
Delta-proteobacterial SAR324 group in hydrothermal plumes on the South Mid-Atlantic Ridge.
Cao, Huiluo; Dong, Chunming; Bougouffa, Salim; Li, Jiangtao; Zhang, Weipeng; Shao, Zongze; Bajic, Vladimir B; Qian, Pei-Yuan
2016-03-08
In the dark ocean, the SAR324 group of Delta-proteobacteria has been associated with a chemolithotrophic lifestyle. However, their electron transport chain for energy generation and information system has not yet been well characterized. In the present study, four SAR324 draft genomes were extracted from metagenomes sampled from hydrothermal plumes in the South Mid-Atlantic Ridge. We describe novel electron transport chain components in the SAR324 group, particularly the alternative complex III, which is involved in energy generation. Moreover, we propose that the C-type cytochrome, for example the C553, may play a novel role in electron transfer, adding to our knowledge regarding the energy generation process in the SAR324 cluster. The central carbon metabolism in the described SAR324 genomes exhibits several new features other than methanotrophy, e.g., aromatic compound degradation. This suggests that methane oxidation may not be the main central carbon metabolism component in SAR324 cluster bacteria. The reductive acetyl-CoA pathway may potentially be essential in carbon fixation due to the absence of components from the Calvin-Benson cycle. Our study provides insight into the role of recombination events in shaping the genome of the SAR324 group based on a larger number of repeat regions observed, which has been overlooked thus far.
Delta-proteobacterial SAR324 group in hydrothermal plumes on the South Mid-Atlantic Ridge
Cao, Huiluo; Dong, Chunming; Bougouffa, Salim; Li, Jiangtao; Zhang, Weipeng; Shao, Zongze; Bajic, Vladimir B.; Qian, Pei-Yuan
2016-01-01
In the dark ocean, the SAR324 group of Delta-proteobacteria has been associated with a chemolithotrophic lifestyle. However, their electron transport chain for energy generation and information system has not yet been well characterized. In the present study, four SAR324 draft genomes were extracted from metagenomes sampled from hydrothermal plumes in the South Mid-Atlantic Ridge. We describe novel electron transport chain components in the SAR324 group, particularly the alternative complex III, which is involved in energy generation. Moreover, we propose that the C-type cytochrome, for example the C553, may play a novel role in electron transfer, adding to our knowledge regarding the energy generation process in the SAR324 cluster. The central carbon metabolism in the described SAR324 genomes exhibits several new features other than methanotrophy, e.g., aromatic compound degradation. This suggests that methane oxidation may not be the main central carbon metabolism component in SAR324 cluster bacteria. The reductive acetyl-CoA pathway may potentially be essential in carbon fixation due to the absence of components from the Calvin-Benson cycle. Our study provides insight into the role of recombination events in shaping the genome of the SAR324 group based on a larger number of repeat regions observed, which has been overlooked thus far. PMID:26953077
Ikuta, Tetsuro; Igawa, Kanae; Tame, Akihiro; Kuroiwa, Tsuneyoshi; Kuroiwa, Haruko; Aoki, Yui; Takaki, Yoshihiro; Nagai, Yukiko; Ozawa, Genki; Yamamoto, Masahiro; Deguchi, Ryusaku; Fujikura, Katsunori; Maruyama, Tadashi; Yoshida, Takao
2016-05-01
Symbiont transmission is a key event for understanding the processes underlying symbiotic associations and their evolution. However, our understanding of the mechanisms of symbiont transmission remains fragmentary. The deep-sea clam Calyptogena okutanii harbours obligate sulfur-oxidizing intracellular symbiotic bacteria in the gill epithelial cells. In this study, we determined the localization of the symbiont associated with the spawned eggs, and the population size of the symbiont transmitted via the eggs. We show that the symbionts are located on the outer surface of the egg plasma membrane at the vegetal pole, and that each egg carries approximately 400 symbiont cells, each of which contains close to 10 genomic copies. The very small population size of the symbiont transmitted via the eggs might narrow the bottleneck and increase genetic drift, while polyploidy and its transient extracellular lifestyle might slow the rate of genome reduction. Additionally, the extracellular localization of the symbiont on the egg surface may increase the chance of symbiont exchange. This new type of extracellular transovarial transmission provides insights into complex interactions between the host and symbiont, development of both host and symbiont, as well as the population dynamics underlying genetic drift and genome evolution in microorganisms.
A Complex 6p25 Rearrangement in a Child With Multiple Epiphyseal Dysplasia
Bedoyan, Jirair K.; Lesperance, Marci M.; Ackley, Todd; Iyer, Ramaswamy K.; Innis, Jeffrey W.; Misra, Vinod K.
2015-01-01
Genomic rearrangements are increasingly recognized as important contributors to human disease. Here we report on an 11½-year-old child with myopia, Duane retraction syndrome, bilateral mixed hearing loss, skeletal anomalies including multiple epiphyseal dysplasia, and global developmental delay, and a complex 6p25 genomic rearrangement. We have employed oligonucleotide-based comparative genomic hybridization arrays (aCGH) of different resolutions (44 and 244K) as well as a 1 M single nucleotide polymorphism (SNP) array to analyze this complex rearrangement. Our analyses reveal a complex rearrangement involving a ~2.21 Mb interstitial deletion, a ~240 kb terminal deletion, and a 70–80 kb region in between these two deletions that shows maintenance of genomic copy number. The interstitial deletion contains eight known genes, including three Forkhead box containing (FOX) transcription factors (FOXQ1, FOXF2, and FOXC1). The region maintaining genomic copy number partly overlaps the dual specificity protein phosphatase 22 (DUSP22) gene. Array analyses suggest a homozygous loss of genomic material at the 5′ end of DUSP22, which was corroborated using TaqMan® copy number analysis. It is possible that this homozygous genomic loss may render both copies of DUSP22 or its products non-functional. Our analysis suggests a rearrangement mechanism distinct from a previously reported replication-based error-prone mechanism without template switching for a specific 6p25 rearrangement with a 1.22 Mb interstitial deletion. Our study demonstrates the utility and limitations of using oligonucleotide-based aCGH and SNP array technologies of increasing resolutions in order to identify complex DNA rearrangements and gene disruptions. PMID:21204225
On the need for widespread horizontal gene transfers under genome size constraint.
Isambert, Hervé; Stein, Richard R
2009-08-25
While eukaryotes primarily evolve by duplication-divergence expansion (and reduction) of their own gene repertoire with only rare horizontal gene transfers, prokaryotes appear to evolve under both gene duplications and widespread horizontal gene transfers over long evolutionary time scales. But, the evolutionary origin of this striking difference in the importance of horizontal gene transfers remains by and large a mystery. We propose that the abundance of horizontal gene transfers in free-living prokaryotes is a simple but necessary consequence of two opposite effects: i) their apparent genome size constraint compared to typical eukaryote genomes and ii) their underlying genome expansion dynamics through gene duplication-divergence evolution, as demonstrated by the presence of many tandem and block repeated genes. In principle, this combination of genome size constraint and underlying duplication expansion should lead to a coalescent-like process with extensive turnover of functional genes. This would, however, imply the unlikely, systematic reinvention of functions from discarded genes within independent phylogenetic lineages. Instead, we propose that the long-term evolutionary adaptation of free-living prokaryotes must have resulted in the emergence of efficient non-phylogenetic pathways to circumvent gene loss. This need for widespread horizontal gene transfers due to genome size constraint implies, in particular, that prokaryotes must remain under strong selection pressure in order to maintain the long-term evolutionary adaptation of their "mutualized" gene pool, beyond the inevitable turnover of individual prokaryote species. By contrast, the absence of genome size constraint for typical eukaryotes has presumably relaxed their need for widespread horizontal gene transfers and strong selection pressure. 
Yet, the resulting loss of genetic functions, due to weak selection pressure and inefficient gene recovery mechanisms, must have ultimately favored the emergence of more complex life styles and ecological integration of many eukaryotes. This article was reviewed by Pierre Pontarotti, Eugene V Koonin and Sergei Maslov.
Will Big Data Close the Missing Heritability Gap?
Kim, Hwasoon; Grueneberg, Alexander; Vazquez, Ana I; Hsu, Stephen; de Los Campos, Gustavo
2017-11-01
Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a surface response relating prediction R-sq. with sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23-0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values that ranged from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Soon much larger data sets will become available. Using the estimated surface response, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed. Copyright © 2017 by the Genetics Society of America.
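As a minimal illustration of the quantity discussed above, prediction R-sq. in a testing set can be computed as the squared Pearson correlation between observed phenotypes and genetic-score predictions. This definition is an assumption made for illustration and is not necessarily the exact estimator used by the authors; the function name `prediction_r_squared` and the numbers are made up.

```python
def prediction_r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    # Sums of cross-products and squares around the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        raise ValueError("R-squared is undefined for constant input")
    return cov * cov / (var_obs * var_pred)


# Toy testing-set values (hypothetical, not UK Biobank data).
observed = [1.0, 2.0, 3.0, 4.0]
weak_scores = [1.0, -1.0, 1.0, -1.0]
r_sq = prediction_r_squared(observed, weak_scores)
```

Comparing this value with an estimate of trait heritability gives the gap that the abstract refers to.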
Will Big Data Close the Missing Heritability Gap?
Kim, Hwasoon; Grueneberg, Alexander; Vazquez, Ana I.; Hsu, Stephen; de los Campos, Gustavo
2017-01-01
Despite the important discoveries reported by genome-wide association (GWA) studies, for most traits and diseases the prediction R-squared (R-sq.) achieved with genetic scores remains considerably lower than the trait heritability. Modern biobanks will soon deliver unprecedentedly large biomedical data sets: Will the advent of big data close the gap between the trait heritability and the proportion of variance that can be explained by a genomic predictor? We addressed this question using Bayesian methods and a data analysis approach that produces a surface response relating prediction R-sq. with sample size and model complexity (e.g., number of SNPs). We applied the methodology to data from the interim release of the UK Biobank. Focusing on human height as a model trait and using 80,000 records for model training, we achieved a prediction R-sq. in testing (n = 22,221) of 0.24 (95% C.I.: 0.23–0.25). Our estimates show that prediction R-sq. increases with sample size, reaching an estimated plateau at values that ranged from 0.1 to 0.37 for models using 500 and 50,000 (GWA-selected) SNPs, respectively. Soon much larger data sets will become available. Using the estimated surface response, we forecast that larger sample sizes will lead to further improvements in prediction R-sq. We conclude that big data will lead to a substantial reduction of the gap between trait heritability and the proportion of interindividual differences that can be explained with a genomic predictor. However, even with the power of big data, for complex traits we anticipate that the gap between prediction R-sq. and trait heritability will not be fully closed. PMID:28893854
van de Guchte, M; Penaud, S; Grimaldi, C; Barbe, V; Bryson, K; Nicolas, P; Robert, C; Oztas, S; Mangenot, S; Couloux, A; Loux, V; Dervyn, R; Bossy, R; Bolotin, A; Batto, J-M; Walunas, T; Gibrat, J-F; Bessières, P; Weissenbach, J; Ehrlich, S D; Maguin, E
2006-06-13
Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is a representative of the group of lactic acid-producing bacteria, mainly known for its worldwide application in yogurt production. The genome sequence of this bacterium has been determined and shows the signs of ongoing specialization, with a substantial number of pseudogenes and incomplete metabolic pathways and relatively few regulatory functions. Several unique features of the L. bulgaricus genome support the hypothesis that the genome is in a phase of rapid evolution. (i) Exceptionally high numbers of rRNA and tRNA genes with regard to genome size may indicate that the L. bulgaricus genome has known a recent phase of important size reduction, in agreement with the observed high frequency of gene inactivation and elimination; (ii) a much higher GC content at codon position 3 than expected on the basis of the overall GC content suggests that the composition of the genome is evolving toward a higher GC content; and (iii) the presence of a 47.5-kbp inverted repeat in the replication termination region, an extremely rare feature in bacterial genomes, may be interpreted as a transient stage in genome evolution. The results indicate the adaptation of L. bulgaricus from a plant-associated habitat to the stable protein and lactose-rich milk environment through the loss of superfluous functions and protocooperation with Streptococcus thermophilus.
USDA-ARS's Scientific Manuscript database
Genetic and genomic analyses of Upland cotton (Gossypium hirsutum) are difficult because it has a complex allotetraploid (AADD; 2n = 4x = 52) genome. Here we sequenced, assembled and analyzed the world's most important cultivated cotton genome with 246.2 gigabase (Gb) clean data obtained using whol...
Integrative modeling of gene and genome evolution roots the archaeal tree of life
Szöllősi, Gergely J.; Spang, Anja; Foster, Peter G.; Heaps, Sarah E.; Boussau, Bastien; Ettema, Thijs J. G.; Embley, T. Martin
2017-01-01
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood–Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer. PMID:28533395
Integrative modeling of gene and genome evolution roots the archaeal tree of life.
Williams, Tom A; Szöllősi, Gergely J; Spang, Anja; Foster, Peter G; Heaps, Sarah E; Boussau, Bastien; Ettema, Thijs J G; Embley, T Martin
2017-06-06
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood-Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
Reductive evolution of architectural repertoires in proteomes and the birth of the tripartite world
Wang, Minglei; Yafremava, Liudmila S.; Caetano-Anollés, Derek; Mittenthal, Jay E.; Caetano-Anollés, Gustavo
2007-01-01
The repertoire of protein architectures in proteomes is evolutionarily conserved and capable of preserving an accurate record of genomic history. Here we use a census of protein architecture in 185 genomes that have been fully sequenced to generate genome-based phylogenies that describe the evolution of the protein world at fold (F) and fold superfamily (FSF) levels. The patterns of representation of F and FSF architectures over evolutionary history suggest three epochs in the evolution of the protein world: (1) architectural diversification, where members of an architecturally rich ancestral community diversified their protein repertoire; (2) superkingdom specification, where superkingdoms Archaea, Bacteria, and Eukarya were specified; and (3) organismal diversification, where F and FSF specific to relatively small sets of organisms appeared as the result of diversification of organismal lineages. Functional annotation of FSF along these architectural chronologies revealed patterns of discovery of biological function. Most importantly, the analysis identified an early and extensive differential loss of architectures occurring primarily in Archaea that segregates the archaeal lineage from the ancient community of organisms and establishes the first organismal divide. Reconstruction of phylogenomic trees of proteomes reflects the timeline of architectural diversification in the emerging lineages. Thus, Archaea undertook a minimalist strategy using only a small subset of the full architectural repertoire and then crystallized into a diversified superkingdom late in evolution. Our analysis also suggests a communal ancestor to all life that was molecularly complex and adopted genomic strategies currently present in Eukarya. PMID:17908824
Arar, Nedal; Knight, Sara J; Modell, Stephen M; Issa, Amalia M
2011-03-01
The main mission of the Genomic Applications in Practice and Prevention Network™ is to advance collaborative efforts involving partners from across the public health sector to realize the promise of genomics in healthcare and disease prevention. We introduce a new framework that supports the Genomic Applications in Practice and Prevention Network mission and leverages the characteristics of the complex adaptive systems approach. We call this framework the Genome-based Knowledge Management in Cycles model (G-KNOMIC). G-KNOMIC proposes that the collaborative work of multidisciplinary teams utilizing genome-based applications will enhance translating evidence-based genomic findings by creating ongoing knowledge management cycles. Each cycle consists of knowledge synthesis, knowledge evaluation, knowledge implementation and knowledge utilization. Our framework acknowledges that all the elements in the knowledge translation process are interconnected and continuously changing. It also recognizes the importance of feedback loops, and the ability of teams to self-organize within a dynamic system. We demonstrate how this framework can be used to improve the adoption of genomic technologies into practice using two case studies of genomic uptake.
Poon, Betty P.K
2011-01-01
Interactions between genetic regions located across the genome maintain its three-dimensional organization and function. Recent studies point to key roles for a set of coiled-coil domain-containing complexes (cohibin, cohesin, condensin and monopolin) and related factors in the regulation of DNA-DNA connections across the genome. These connections are critical to replication, recombination, gene expression as well as chromosome segregation. PMID:21822055
NASA Astrophysics Data System (ADS)
Baker, B.; Lazar, C.; Seitz, K.; Teske, A.; Hinrichs, K. U.; Dick, G.
2015-12-01
Estuaries are among the most productive habitats on the planet. Microbes in estuary sediments control the turnover of organic carbon, and the anaerobic cycling of nitrogen and sulfur. These communities are complex and primarily made up of uncultured lineages, thus little is known about how ecological and metabolic processes are partitioned in sediments. We reconstructed 82 bacterial and 24 archaeal high-quality genomes from different redox regimes (sulfate-rich, sulfate-methane transition zone, and methane-rich zones) of estuary sediments. These bacteria belong to 23 distinct groups, including uncultured candidate phyla (e.g., KSB1, TA06, and KD3-62), and three newly described phyla (WOR-1, -2, and -3). The archaea encompass 8 widespread sediment lineages including MGB-D, RC-III and IV, Z7ME43, Parvarchaeota, Lokiarchaeota (MBG-B), SAGMEG, Bathyarchaeota (groups MCG-1, -6, -7, and -15) and a previously unrecognized deeply branched phylum, "Thorarchaeota". The uncultured phyla mediate essential biogeochemical processes of the estuarine environment. Z7ME43 archaea have genes for S disproportionation (S0 reduction and thiosulfate reduction and oxidation). SAGMEG appear to be strict anaerobes capable of coupling CO/H2 oxidation to either S0 or nitrite reduction and have novel RubisCO genes for carbon fixation. Thorarchaeota contain pathways for acetate production from the degradation of detrital proteins and intermediate S cycling. Furthermore, the gene content of this group revealed links in the evolutionary histories of archaea and eukaryotes. This dataset extends our knowledge of the metabolic potential of several uncultured phyla. We were able to chart the flow of carbon and nutrients through the multiple layers of bacterial processing and reveal potential ecological interactions within the communities.
Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho
2011-06-01
Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not been fully characterized yet. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as a functional genome unit. Copyright © 2011 Elsevier Inc. All rights reserved.
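The fragmentation step described above (splitting a reference genome into 0.5M-nucleotide units to stay within an alignment tool's query-size limit) can be sketched as follows. This is only an illustration: the function name, the per-chromosome windowing, and the toy chromosome lengths are assumptions, and the abstract does not say how unit boundaries were actually handled.

```python
def fragment_coordinates(chrom_lengths, unit_size=500_000):
    """Split each chromosome into consecutive half-open windows [start, end).

    chrom_lengths: dict mapping a chromosome name to its length in nucleotides.
    unit_size: window length; 500,000 mirrors the 0.5M units in the abstract.
    Windowing chromosome by chromosome is an assumption made for this sketch.
    """
    units = []
    for name, length in chrom_lengths.items():
        for start in range(0, length, unit_size):
            # The final window of a chromosome may be shorter than unit_size.
            units.append((name, start, min(start + unit_size, length)))
    return units


# Toy lengths (hypothetical, not real chromosome sizes).
toy_units = fragment_coordinates({"chrA": 1_200_000, "chrB": 500_000})
```

As a rough consistency check, 6206 units of 0.5M nucleotides correspond to at most about 3.1 billion nucleotides, which is in line with the size of a human reference genome.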
Comparison of the cattle leukocyte receptor complex with related livestock species
USDA-ARS?s Scientific Manuscript database
The natural killer (NK) cell receptor gene complexes are highly variable between species, and their repetitive nature makes genomic assembly and characterization problematic. As a result, most reference genome assemblies are heavily fragmented and/or misassembled over these regions. However, new lon...
Decoding the Heart through Next Generation Sequencing Approaches.
Pawlak, Michal; Niescierowicz, Katarzyna; Winata, Cecilia Lanny
2018-06-07
Vertebrate organs develop through a complex process which involves interaction between multiple signaling pathways at the molecular, cell, and tissue levels. Heart development is an example of such a complex process which, when disrupted, results in congenital heart disease (CHD). This complexity necessitates a holistic approach which allows the visualization of genome-wide interaction networks, as opposed to assessment of limited subsets of factors. Genomics offers a powerful solution to address the problem of biological complexity by enabling the observation of molecular processes at a genome-wide scale. The emergence of next generation sequencing (NGS) technology has facilitated the expansion of genomics, increasing its output capacity and applicability in various biological disciplines. The application of NGS in various aspects of heart biology has resulted in new discoveries, generating novel insights into this field of study. Here we review the contributions of NGS technology to the understanding of heart development and its disruption reflected in CHD and discuss how emerging NGS based methodologies can contribute to the further understanding of heart repair.
Archaeal Genome Guardians Give Insights into Eukaryotic DNA Replication and Damage Response Proteins
Shin, David S.; Pratt, Ashley J.; Tainer, John A.
2014-01-01
As the third domain of life, archaea, like the eukarya and bacteria, must have robust DNA replication and repair complexes to ensure genome fidelity. Archaea moreover display a breadth of unique habitats and characteristics, and structural biologists increasingly appreciate these features. As archaea include extremophiles that can withstand diverse environmental stresses, they provide fundamental systems for understanding enzymes and pathways critical to genome integrity and stress responses. Such archaeal extremophiles provide critical data on the periodic table for life as well as on the biochemical, geochemical, and physical limitations to adaptive strategies allowing organisms to thrive under environmental stress relevant to determining the boundaries for life as we know it. Specifically, archaeal enzyme structures have informed the architecture and mechanisms of key DNA repair proteins and complexes. With added abilities to temperature-trap flexible complexes and reveal core domains of transient and dynamic complexes, these structures provide insights into mechanisms of maintaining genome integrity despite extreme environmental stress. The DNA damage response protein structures noted in this review therefore inform the basis for genome integrity in the face of environmental stress, with implications for all domains of life as well as for biomanufacturing, astrobiology, and medicine. PMID:24701133
Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha Dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko
2014-08-01
Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
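The idea of binning co-abundant genes across samples can be illustrated with a small sketch. It is not the authors' algorithm: the greedy seed-based grouping, the Pearson correlation cutoff `min_corr`, and the function name are all assumptions chosen to keep the example short.

```python
import numpy as np


def coabundance_groups(abundance, min_corr=0.9):
    """Greedily group genes whose abundance profiles correlate across samples.

    abundance: 2-D array, rows are genes and columns are metagenomic samples.
    min_corr: Pearson correlation cutoff (an assumed, illustrative value).
    Returns a list of groups, each a list of row indices.
    """
    abundance = np.asarray(abundance, dtype=float)
    corr = np.corrcoef(abundance)  # gene-by-gene correlation matrix
    unassigned = list(range(abundance.shape[0]))
    groups = []
    while unassigned:
        seed = unassigned[0]
        # The seed always joins its own group, so the loop terminates even if
        # a constant abundance profile yields NaN correlations.
        members = [seed] + [g for g in unassigned[1:] if corr[seed, g] >= min_corr]
        groups.append(members)
        unassigned = [g for g in unassigned if g not in members]
    return groups


# Toy matrix: genes 0 and 1 rise together; genes 2 and 3 share another pattern.
toy_abundance = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 1, 4, 2, 3],
    [10, 2, 8, 4, 6],
]
toy_groups = coabundance_groups(toy_abundance)
```

Genes that co-vary across many samples are likely to sit on the same genome or genetic element, which is why no reference sequence is needed for this kind of grouping.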
Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies
Schatz, Michael C.; Phillippy, Adam M.; Sommer, Daniel D.; Delcher, Arthur L.; Puiu, Daniela; Narzisi, Giuseppe; Salzberg, Steven L.; Pop, Mihai
2013-01-01
Since its launch in 2004, the open-source AMOS project has released several innovative DNA sequence analysis applications including: Hawkeye, a visual analytics tool for inspecting the structure of genome assemblies; the Assembly Forensics and FRCurve pipelines for systematically evaluating the quality of a genome assembly; and AMOScmp, the first comparative genome assembler. These applications have been used to assemble and analyze dozens of genomes ranging in complexity from simple microbial species through mammalian genomes. Recent efforts have been focused on enhancing support for new data characteristics brought on by second- and now third-generation sequencing. This review describes the major components of AMOS in light of these challenges, with an emphasis on methods for assessing assembly quality and the visual analytics capabilities of Hawkeye. These interactive graphical aspects are essential for navigating and understanding the complexities of a genome assembly, from the overall genome structure down to individual bases. Hawkeye and AMOS are available open source at http://amos.sourceforge.net. PMID:22199379
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-07-05
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.
Mahata, Barun; Banerjee, Avisek; Kundu, Manjari; Bandyopadhyay, Uday; Biswas, Kaushik
2015-01-01
Complex ganglioside expression is highly deregulated in several tumors which is further dependent on specific ganglioside synthase genes. Here, we designed and constructed a pair of highly specific transcription-activator like effector endonuclease (TALENs) to disrupt a particular genomic locus of mouse GM2-synthase, a region conserved in coding sequence of all four transcript variants of mouse GM2-synthase. Our designed TALENs effectively work in different mouse cell lines and TALEN induced mutation rate is over 45%. Clonal selection strategy is undertaken to generate stable GM2-synthase knockout cell line. We have also demonstrated non-homologous end joining (NHEJ) mediated integration of neomycin cassette into the TALEN targeted GM2-synthase locus. Functionally, clonally selected GM2-synthase knockout clones show reduced anchorage-independent growth (AIG), reduction in tumor growth and higher cellular adhesion as compared to wild type Renca-v cells. Insight into the mechanism shows that, reduced AIG is due to loss in anoikis resistance, as both knockout clones show increased sensitivity to detachment induced apoptosis. Therefore, TALEN mediated precise genome editing at GM2-synthase locus not only helps us in understanding the function of GM2-synthase gene and complex gangliosides in tumorigenicity but also holds tremendous potential to use TALENs in translational cancer research and therapeutics. PMID:25762467
Mahata, Barun; Banerjee, Avisek; Kundu, Manjari; Bandyopadhyay, Uday; Biswas, Kaushik
2015-03-12
Complex ganglioside expression is highly deregulated in several tumors which is further dependent on specific ganglioside synthase genes. Here, we designed and constructed a pair of highly specific transcription-activator like effector endonuclease (TALENs) to disrupt a particular genomic locus of mouse GM2-synthase, a region conserved in coding sequence of all four transcript variants of mouse GM2-synthase. Our designed TALENs effectively work in different mouse cell lines and TALEN induced mutation rate is over 45%. Clonal selection strategy is undertaken to generate stable GM2-synthase knockout cell line. We have also demonstrated non-homologous end joining (NHEJ) mediated integration of neomycin cassette into the TALEN targeted GM2-synthase locus. Functionally, clonally selected GM2-synthase knockout clones show reduced anchorage-independent growth (AIG), reduction in tumor growth and higher cellular adhesion as compared to wild type Renca-v cells. Insight into the mechanism shows that, reduced AIG is due to loss in anoikis resistance, as both knockout clones show increased sensitivity to detachment induced apoptosis. Therefore, TALEN mediated precise genome editing at GM2-synthase locus not only helps us in understanding the function of GM2-synthase gene and complex gangliosides in tumorigenicity but also holds tremendous potential to use TALENs in translational cancer research and therapeutics.
Host-Associated Metagenomics: A Guide to Generating Infectious RNA Viromes
Robert, Catherine; Pascalis, Hervé; Michelle, Caroline; Jardot, Priscilla; Charrel, Rémi; Raoult, Didier; Desnues, Christelle
2015-01-01
Background: Metagenomic analyses have been widely used in the last decade to describe viral communities in various environments or to identify the etiology of human, animal, and plant pathologies. Here, we present a simple and standardized protocol that allows for the purification and sequencing of RNA viromes from complex biological samples with an important reduction of host DNA and RNA contaminants, while preserving the infectivity of viral particles. Principal Findings: We evaluated different viral purification steps, random reverse transcriptions and sequence-independent amplifications of a pool of representative RNA viruses. Viruses remained infectious after the purification process. We then validated the protocol by sequencing the RNA virome of human body lice engorged in vitro with artificially contaminated human blood. The full genomes of the most abundant viruses absorbed by the lice during the blood meal were successfully sequenced. Interestingly, random amplifications differed in the genome coverage of segmented RNA viruses. Moreover, the majority of reads were taxonomically identified, and only 7–15% of all reads were classified as “unknown”, depending on the random amplification method. Conclusion: The protocol reported here could easily be applied to generate RNA viral metagenomes from complex biological samples of different origins. Our protocol allows further virological characterizations of the described viral communities because it preserves the infectivity of viral particles and allows for the isolation of viruses. PMID:26431175
Schelkunov, Mikhail I; Shtratnikova, Viktoria Yu; Nuraliev, Maxim S; Selosse, Marc-Andre; Penin, Aleksey A; Logacheva, Maria D
2015-01-28
The patterns and limits of plastid genome reduction in nonphotosynthetic plants, and the reasons for the conservation of these genomes, are among the intriguing topics in plant genome evolution. Here, we report sequencing and analysis of the plastid genomes of the nonphotosynthetic orchids Epipogium aphyllum and Epipogium roseum, which, with sizes of 31 and 19 kbp, respectively, represent the smallest plastid genomes characterized to date. Besides drastic reduction, which is expected, we found several unusual features of these "minimal" plastomes: multiple rearrangements, highly biased nucleotide composition, and unprecedentedly high substitution rate. Only 27 and 29 genes remained intact in the plastomes of E. aphyllum and E. roseum: those encoding ribosomal components, transfer RNAs, and three additional housekeeping genes (infA, clpP, and accD). We found no signs of relaxed selection acting on these genes. We hypothesize that the main reason for retention of plastid genomes in Epipogium is the necessity to translate messenger RNAs (mRNAs) of accD and/or clpP proteins which are essential for cell metabolism. However, these genes are absent in plastomes of several plant species; their absence is compensated by the presence of a functional copy arisen by gene transfer from plastid to the nuclear genome. This suggests that there is no single set of plastid-encoded essential genes, but rather different sets for different species and that the retention of a gene in the plastome depends on the interaction between the nucleus and plastids. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Efficient computation of the joint sample frequency spectra for multiple populations.
Kamm, John A; Terhorst, Jonathan; Song, Yun S
2017-01-01
A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
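To make the summary statistic concrete, the sketch below tabulates an observed SFS and a two-population joint SFS from a 0/1 haplotype matrix. It only counts alleles in data. It does not compute the expected SFS under a demographic model, which is the problem the abstract addresses, and it does not use the momi package. Function names and the toy data are assumptions.

```python
import numpy as np


def sample_frequency_spectrum(haplotypes):
    """Observed SFS: entry k counts the sites carrying k mutant alleles.

    haplotypes: 2-D 0/1 array, rows are sampled sequences and columns are
    sites; 1 marks the mutant (derived) allele.
    """
    haplotypes = np.asarray(haplotypes, dtype=int)
    n_samples = haplotypes.shape[0]
    mutant_counts = haplotypes.sum(axis=0)
    return np.bincount(mutant_counts, minlength=n_samples + 1)


def joint_sfs(haplotypes, in_pop_a):
    """Observed joint SFS for two populations.

    in_pop_a: boolean mask over rows; True rows belong to population A and
    the remaining rows to population B. Entry [i, j] counts the sites with
    i mutant alleles in A and j mutant alleles in B.
    """
    haplotypes = np.asarray(haplotypes, dtype=int)
    in_pop_a = np.asarray(in_pop_a, dtype=bool)
    counts_a = haplotypes[in_pop_a].sum(axis=0)
    counts_b = haplotypes[~in_pop_a].sum(axis=0)
    spectrum = np.zeros((in_pop_a.sum() + 1, (~in_pop_a).sum() + 1), dtype=int)
    np.add.at(spectrum, (counts_a, counts_b), 1)  # one increment per site
    return spectrum


# Toy data: 4 sequences, 5 sites; the first two sequences form population A.
toy_haplotypes = [
    [1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
]
toy_sfs = sample_frequency_spectrum(toy_haplotypes)
toy_joint = joint_sfs(toy_haplotypes, [True, True, False, False])
```

The joint SFS is an array with one axis per population, so its size grows multiplicatively with the number of populations; this is one reason the computational cost mentioned in the abstract becomes a concern.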
Efficient computation of the joint sample frequency spectra for multiple populations
Kamm, John A.; Terhorst, Jonathan; Song, Yun S.
2016-01-01
A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248
Molecular Innovation in Ciliates with Complex Genome Rearrangements
NASA Astrophysics Data System (ADS)
Neme, R.; Landweber, L. F.
2017-07-01
We study molecular innovation in several ciliate species with unique massive genome rearrangements to understand how a radically distinct genome architecture can shape the process of acquiring new functions, genes and structures.
Balodite, Elina; Strazdina, Inese; Galinina, Nina; McLean, Samantha; Rutkis, Reinis; Poole, Robert K; Kalnenieks, Uldis
2014-09-01
The genome of the ethanol-producing bacterium Zymomonas mobilis encodes a bd-type terminal oxidase, cytochrome bc1 complex and several c-type cytochromes, yet lacks sequences homologous to any of the known bacterial cytochrome c oxidase genes. Recently, it was suggested that a putative respiratory cytochrome c peroxidase, receiving electrons from the cytochrome bc1 complex via cytochrome c552, might function as a peroxidase and/or an alternative oxidase. The present study was designed to test this hypothesis, by construction of a cytochrome c peroxidase mutant (Zm6-perC), and comparison of its properties with those of a mutant defective in the cytochrome b subunit of the bc1 complex (Zm6-cytB). Disruption of the cytochrome c peroxidase gene (ZZ60192) caused a decrease of the membrane NADH peroxidase activity, impaired the resistance of growing culture to exogenous hydrogen peroxide and hampered aerobic growth. However, this mutation did not affect the activity or oxygen affinity of the respiratory chain, or the kinetics of cytochrome d reduction. Furthermore, the peroxide resistance and membrane NADH peroxidase activity of strain Zm6-cytB had not decreased, but both the oxygen affinity of electron transport and the kinetics of cytochrome d reduction were affected. It is therefore concluded that the cytochrome c peroxidase does not terminate the cytochrome bc1 branch of Z. mobilis, and that it is functioning as a quinol peroxidase. © 2014 The Authors.
The Trichoplax Genome and the Nature of Placozoans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Srivastava, Mansi; Begovic, Emina; Chapman, Jarrod
2008-08-01
Placozoans are arguably the simplest free-living animals, possibly evoking an early stage in metazoan evolution, yet their biology is poorly understood. Here we report the sequencing and analysis of the ~98 million base pair nuclear genome of the placozoan Trichoplax adhaerens. Whole genome phylogenetic analysis suggests that placozoans belong to a 'eumetazoan' clade that includes cnidarians and bilaterians, with sponges as the earliest diverging animals. The compact genome exhibits conserved gene content, gene structure, and synteny relative to the human and other complex eumetazoan genomes. Despite the apparent cellular and organismal simplicity of Trichoplax, its genome encodes a rich array of transcription factor and signaling pathway genes that are typically associated with diverse cell types and developmental processes in eumetazoans, motivating further searches for cryptic cellular complexity and/or as yet unobserved life history stages.
Sharp, Richard R
2011-03-01
As we look to a time when whole-genome sequencing is integrated into patient care, it is possible to anticipate a number of ethical challenges that will need to be addressed. The most intractable of these concern informed consent and the responsible management of very large amounts of genetic information. Given the range of possible findings, it remains unclear to what extent it will be possible to obtain meaningful patient consent to genomic testing. Equally unclear is how clinicians will disseminate the enormous volume of genetic information produced by whole-genome sequencing. Toward developing practical strategies for managing these ethical challenges, we propose a research agenda that approaches multiplexed forms of clinical genetic testing as natural laboratories in which to develop best practices for managing the ethical complexities of genomic medicine.
Holmes, Andrew; Szafranski, Karol; Faulkes, Chris G.; Coen, Clive W.; Buffenstein, Rochelle; Platzer, Matthias; de Magalhães, João Pedro; Church, George M.
2011-01-01
The naked mole-rat (Heterocephalus glaber) is a long-lived, cancer resistant rodent and there is a great interest in identifying the adaptations responsible for these and other of its unique traits. We employed RNA sequencing to compare liver gene expression profiles between naked mole-rats and wild-derived mice. Our results indicate that genes associated with oxidoreduction and mitochondria were expressed at higher relative levels in naked mole-rats. The largest effect is nearly 300-fold higher expression of epithelial cell adhesion molecule (Epcam), a tumour-associated protein. Also of interest are the protease inhibitor, alpha2-macroglobulin (A2m), and the mitochondrial complex II subunit Sdhc, both ageing-related genes found strongly over-expressed in the naked mole-rat. These results hint at possible candidates for specifying species differences in ageing and cancer, and in particular suggest complex alterations in mitochondrial and oxidation reduction pathways in the naked mole-rat. Our differential gene expression analysis obviated the need for a reference naked mole-rat genome by employing a combination of Illumina/Solexa and 454 platforms for transcriptome sequencing and assembling transcriptome contigs of the non-sequenced species. Overall, our work provides new research foci and methods for studying the naked mole-rat's fascinating characteristics. PMID:22073188
Dietzgen, Ralf G; Kondo, Hideki; Goodin, Michael M; Kurath, Gael; Vasilakis, Nikos
2017-01-02
The family Rhabdoviridae consists of mostly enveloped, bullet-shaped or bacilliform viruses with a negative-sense, single-stranded RNA genome that infect vertebrates, invertebrates or plants. This ecological diversity is reflected by the diversity and complexity of their genomes. Five canonical structural protein genes are conserved in all rhabdoviruses, but may be overprinted, overlapped or interspersed with several novel and diverse accessory genes. This review gives an overview of the characteristics and diversity of rhabdoviruses, their taxonomic classification, replication mechanism, properties of classical rhabdoviruses such as rabies virus and rhabdoviruses with complex genomes, rhabdoviruses infecting aquatic species, and plant rhabdoviruses with both mono- and bipartite genomes. Copyright © 2016 Elsevier B.V. All rights reserved.
Advances in cereal genomics and applications in crop breeding.
Varshney, Rajeev K; Hoisington, David A; Tyagi, Akhilesh K
2006-11-01
Recent advances in cereal genomics have made it possible to analyse the architecture of cereal genomes and their expressed components, leading to an increase in our knowledge of the genes that are linked to key agronomically important traits. These studies have used molecular genetic mapping of quantitative trait loci (QTL) of several complex traits that are important in breeding. The identification and molecular cloning of genes underlying QTLs offers the possibility to examine the naturally occurring allelic variation for respective complex traits. Novel alleles, identified by functional genomics or haplotype analysis, can enrich the genetic basis of cultivated crops to improve productivity. Advances made in cereal genomics research in recent years thus offer the opportunities to enhance the prediction of phenotypes from genotypes for cereal breeding.
Chadeau-Hyam, Marc; Campanella, Gianluca; Jombart, Thibaut; Bottolo, Leonardo; Portengen, Lutzen; Vineis, Paolo; Liquet, Benoit; Vermeulen, Roel C H
2013-08-01
Recent technological advances in molecular biology have given rise to numerous large-scale datasets whose analysis imposes serious methodological challenges mainly relating to the size and complex structure of the data. Considerable experience in analyzing such data has been gained over the past decade, mainly in genetics, from the Genome-Wide Association Study era, and more recently in transcriptomics and metabolomics. Building upon the corresponding literature, we provide here a nontechnical overview of well-established methods used to analyze OMICS data within three main types of regression-based approaches: univariate models including multiple testing correction strategies, dimension reduction techniques, and variable selection models. Our methodological description focuses on methods for which ready-to-use implementations are available. We describe the main underlying assumptions, the main features, and advantages and limitations of each of the models. This descriptive summary constitutes a useful tool for driving methodological choices while analyzing OMICS data, especially in environmental epidemiology, where the emergence of the exposome concept clearly calls for unified methods to analyze marginally and jointly complex exposure and OMICS datasets. Copyright © 2013 Wiley Periodicals, Inc.
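As one concrete instance of the multiple testing correction strategies mentioned above, here is a minimal sketch of the Benjamini-Hochberg step-up procedure for controlling the false discovery rate. The abstract does not say which corrections the review covers, so the choice of this procedure, the function name, and the default `alpha` are assumptions.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= alpha * k / m, and reject the k smallest p-values.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()  # largest rank meeting its threshold
        rejected[order[: k + 1]] = True
    return rejected


# Toy p-values, e.g. one per molecular feature tested in a univariate model.
toy_rejected = benjamini_hochberg([0.039, 0.001, 0.216, 0.008])
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value meets its threshold further down the sorted list.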
Azevedo, C F; Nascimento, M; Silva, F F; Resende, M D V; Lopes, P S; Guimarães, S E F; Glória, L S
2015-10-09
A significant contribution of molecular genetics is the direct use of DNA information to identify genetically superior individuals. Genome-wide selection (GWS) can be used for this purpose. GWS consists of analyzing a large number of single nucleotide polymorphism markers widely distributed in the genome; however, because the number of markers is much larger than the number of genotyped individuals, and such markers are highly correlated, special statistical methods are required. Among these methods, independent component regression, principal component regression, partial least squares, and partial principal components stand out. Thus, the aim of this study was to propose an application of dimensionality reduction methods to GWS of carcass traits in an F2 (Piau x commercial line) pig population. The results show that the principal and independent component methods performed similarly and provided the most accurate genomic breeding estimates for most carcass traits in pigs.
Nuclear and chloroplast DNA phylogeny reveals complex evolutionary history of Elymus pendulinus.
Yan, Chi; Hu, Qianni; Sun, Genlou
2014-02-01
Evidence accumulated over the last decade has shown that allopolyploid genomes may undergo complex reticulate evolution. In this study, 13 accessions of tetraploid Elymus pendulinus were analyzed using two low-copy nuclear genes (RPB2 and PepC) and two regions of chloroplast genome (Rps16 and trnD-trnT). Previous studies suggested that Pseudoroegneria (St) and an unknown diploid (Y) were genome donors to E. pendulinus, and that Pseudoroegneria was the maternal donor. Our results revealed an extreme reticulate pattern, with at least four distinct gene lineages coexisting within this species that might be acquired through a possible combination of allotetraploidization and introgression from both within and outside the tribe Hordeeae. Chloroplast DNA data identified two potential maternal genome donors (Pseudoroegneria and an unknown species outside Hordeeae) to E. pendulinus. Nuclear gene data indicated that both Pseudoroegneria and an unknown Y diploid have contributed to the nuclear genome of E. pendulinus, in agreement with cytogenetic data. However, unexpected contributions from Hordeum and unknown aliens from within or outside Hordeeae to E. pendulinus without genome duplication were observed. Elymus pendulinus provides a remarkable instance of the previously unsuspected chimerical nature of some plant genomes and the resulting phylogenetic complexity produced by multiple historical reticulation events.
Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François
2015-01-01
Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486
Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H.; Hansen, Mark S. T.; Lawley, Cindy T.; Karlsson, Elinor K.; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Åke; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T.
2011-01-01
The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease. PMID:22022279
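The idea of scanning for long blocks of homozygosity can be sketched in a simplified, per-individual form: walk along ordered SNP positions and report stretches of consecutive homozygous genotype calls that span at least one megabase. This is not the statistic used in the study, which works at the breed level with multiple test statistics; the function name `homozygous_runs`, the two-character genotype encoding, and the toy data are all assumptions for illustration.

```python
def homozygous_runs(positions, genotypes, min_length_bp=1_000_000):
    """Return (start, end) positions of runs of homozygous calls.

    positions: SNP coordinates on one chromosome, sorted ascending.
    genotypes: two-character calls such as "AA" or "AG" (hypothetical encoding).
    Only runs spanning at least `min_length_bp` are reported.
    """
    runs = []
    run_start = None  # index of the first SNP in the current homozygous run

    for i, call in enumerate(genotypes):
        is_homozygous = call[0] == call[1]
        if is_homozygous:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # A heterozygous call closes the current run at the previous SNP.
            if positions[i - 1] - positions[run_start] >= min_length_bp:
                runs.append((positions[run_start], positions[i - 1]))
            run_start = None

    # Close a run that extends to the last SNP.
    if run_start is not None and positions[-1] - positions[run_start] >= min_length_bp:
        runs.append((positions[run_start], positions[-1]))
    return runs


# Toy data, not taken from the study.
toy_positions = [0, 400_000, 900_000, 1_300_000, 1_500_000, 1_600_000, 2_000_000]
toy_genotypes = ["AA", "GG", "TT", "CC", "AG", "CC", "TT"]
toy_runs = homozygous_runs(toy_positions, toy_genotypes)
```

In the toy data the first four SNPs form a 1.3 Mb homozygous run, while the final two span only 0.4 Mb and are filtered out.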
Vaysse, Amaury; Ratnakumar, Abhirami; Derrien, Thomas; Axelsson, Erik; Rosengren Pielberg, Gerli; Sigurdsson, Snaevar; Fall, Tove; Seppälä, Eija H; Hansen, Mark S T; Lawley, Cindy T; Karlsson, Elinor K; Bannasch, Danika; Vilà, Carles; Lohi, Hannes; Galibert, Francis; Fredholm, Merete; Häggström, Jens; Hedhammar, Ake; André, Catherine; Lindblad-Toh, Kerstin; Hitte, Christophe; Webster, Matthew T
2011-10-01
The extraordinary phenotypic diversity of dog breeds has been sculpted by a unique population history accompanied by selection for novel and desirable traits. Here we perform a comprehensive analysis using multiple test statistics to identify regions under selection in 509 dogs from 46 diverse breeds using a newly developed high-density genotyping array consisting of >170,000 evenly spaced SNPs. We first identify 44 genomic regions exhibiting extreme differentiation across multiple breeds. Genetic variation in these regions correlates with variation in several phenotypic traits that vary between breeds, and we identify novel associations with both morphological and behavioral traits. We next scan the genome for signatures of selective sweeps in single breeds, characterized by long regions of reduced heterozygosity and fixation of extended haplotypes. These scans identify hundreds of regions, including 22 blocks of homozygosity longer than one megabase in certain breeds. Candidate selection loci are strongly enriched for developmental genes. We chose one highly differentiated region, associated with body size and ear morphology, and characterized it using high-throughput sequencing to provide a list of variants that may directly affect these traits. This study provides a catalogue of genomic regions showing extreme reduction in genetic variation or population differentiation in dogs, including many linked to phenotypic variation. The many blocks of reduced haplotype diversity observed across the genome in dog breeds are the result of both selection and genetic drift, but extended blocks of homozygosity on a megabase scale appear to be best explained by selection. Further elucidation of the variants under selection will help to uncover the genetic basis of complex traits and disease.
Siebers, Bettina; Zaparty, Melanie; Raddatz, Guenter; Tjaden, Britta; Albers, Sonja-Verena; Bell, Steve D.; Blombach, Fabian; Kletzin, Arnulf; Kyrpides, Nikos; Lanz, Christa; Plagens, André; Rampp, Markus; Rosinus, Andrea; von Jan, Mathias; Makarova, Kira S.; Klenk, Hans-Peter; Schuster, Stephan C.; Hensel, Reinhard
2011-01-01
Here, we report on the complete genome sequence of the hyperthermophilic Crenarchaeum Thermoproteus tenax (strain Kra1, DSM 2078T), a type strain of the crenarchaeotal order Thermoproteales. Its circular 1.84-megabase genome harbors no extrachromosomal elements, and 2,051 open reading frames are identified, covering 90.6% of the complete sequence, which represents a high coding density. Derived from the gene content, T. tenax is a representative member of the Crenarchaeota. The organism is strictly anaerobic and sulfur-dependent with optimal growth at 86°C and pH 5.6. One particular feature is the great metabolic versatility, which is not accompanied by a distinct increase of genome size or information density as compared to other Crenarchaeota. T. tenax is able to grow chemolithoautotrophically (CO2/H2) as well as chemoorganoheterotrophically in the presence of various organic substrates. All pathways for synthesizing the 20 proteinogenic amino acids are present. In addition, two presumably complete gene sets for NADH:quinone oxidoreductase (complex I) were identified in the genome, and there is evidence that either NADH or reduced ferredoxin might serve as electron donor. Besides the typical archaeal A0A1-ATP synthase, a membrane-bound pyrophosphatase is found, which might contribute to energy conservation. Surprisingly, all genes required for dissimilatory sulfate reduction are present, which is confirmed by growth experiments. Also noteworthy is the presence of two proteins (ParA family ATPase, actin-like protein) that might be involved in cell division in Thermoproteales, where the ESCRT system is absent, and of genes involved in genetic competence (DprA, ComF), which is so far unique within Archaea. PMID:22003381
Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.
Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M
2017-08-16
High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated due to the existence of large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified in the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15% representing variants divergent from the HSV1 reference possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome where de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan
2015-01-01
Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592
Thorup, Casper; Schramm, Andreas
2017-01-01
This study demonstrates that the deltaproteobacterium Desulfurivibrio alkaliphilus can grow chemolithotrophically by coupling sulfide oxidation to the dissimilatory reduction of nitrate and nitrite to ammonium. Key genes of known sulfide oxidation pathways are absent from the genome of D. alkaliphilus. Instead, the genome contains all of the genes necessary for sulfate reduction, including a gene for a reductive-type dissimilatory bisulfite reductase (DSR). Despite this, growth by sulfate reduction was not observed. Transcriptomic analysis revealed a very high expression level of sulfate-reduction genes during growth by sulfide oxidation, while inhibition experiments with molybdate pointed to elemental sulfur/polysulfides as intermediates. Consequently, we propose that D. alkaliphilus initially oxidizes sulfide to elemental sulfur, which is then either disproportionated, or oxidized by a reversal of the sulfate reduction pathway. This is the first study providing evidence that a reductive-type DSR is involved in a sulfide oxidation pathway. Transcriptome sequencing further suggests that nitrate reduction to ammonium is performed by a novel type of periplasmic nitrate reductase and an unusual membrane-anchored nitrite reductase. PMID:28720728
Walkowiak, Sean; Rowland, Owen; Rodrigue, Nicolas; Subramaniam, Rajagopal
2016-12-09
The Fusarium graminearum species complex is composed of many distinct fungal species that cause several diseases in economically important crops, including Fusarium Head Blight of wheat. Despite being closely related, these species and individuals within species have distinct phenotypic differences in toxin production and pathogenicity, with some isolates reported as non-pathogenic on certain hosts. In this report, we compare genomes and gene content of six new isolates from the species complex, including the first available genomes of F. asiaticum and F. meridionale, with four other genomes reported in previous studies. A comparison of genome structure and gene content revealed a 93-99% overlap across all ten genomes. We identified more than 700 kilobase pairs (kb) of single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) within common regions of the genome, which validated the species and genetic populations reported within species. We constructed a non-redundant pan gene list containing 15,297 genes from the ten genomes, and among them 1,827 genes or 12% were absent in at least one genome. These genes were co-localized in telomeric regions and select regions within chromosomes with a corresponding increase in SNPs and indels. Many are also predicted to encode proteins involved in secondary metabolism and other functions associated with disease. Genes that were common between isolates contained high levels of nucleotide variation and may be pseudogenes, allelic, or under diversifying selection. The genomic resources we have contributed will be useful for the identification of genes that contribute to the phenotypic variation and niche specialization that have been reported among members of the F. graminearum species complex.
Haack, Tobias B; Madignier, Florence; Herzer, Martina; Lamantea, Eleonora; Danhauser, Katharina; Invernizzi, Federica; Koch, Johannes; Freitag, Martin; Drost, Rene; Hillier, Ingo; Haberberger, Birgit; Mayr, Johannes A; Ahting, Uwe; Tiranti, Valeria; Rötig, Agnes; Iuso, Arcangela; Horvath, Rita; Tesarova, Marketa; Baric, Ivo; Uziel, Graziella; Rolinski, Boris; Sperl, Wolfgang; Meitinger, Thomas; Zeviani, Massimo; Freisinger, Peter; Prokisch, Holger
2012-02-01
Mitochondrial complex I deficiency is the most common cause of mitochondrial disease in childhood. Identification of the molecular basis is difficult given the clinical and genetic heterogeneity. Most patients lack a molecular definition in routine diagnostics. A large-scale mutation screen of 75 candidate genes in 152 patients with complex I deficiency was performed by high-resolution melting curve analysis and Sanger sequencing. The causal role of a new disease allele was confirmed by functional complementation assays. The clinical phenotype of patients carrying mutations was documented using a standardised questionnaire. Causative mutations were detected in 16 genes, 15 of which had previously been associated with complex I deficiency: three mitochondrial DNA genes encoding complex I subunits, two mitochondrial tRNA genes and nuclear DNA genes encoding six complex I subunits and four assembly factors. For the first time, a causal mutation is described in NDUFB9, coding for a complex I subunit, resulting in reduction in NDUFB9 protein and both amount and activity of complex I. These features were rescued by expression of wild-type NDUFB9 in patient-derived fibroblasts. Mutant NDUFB9 is a new cause of complex I deficiency. A molecular diagnosis related to complex I deficiency was established in 18% of patients. However, most patients are likely to carry mutations in genes so far not associated with complex I function. The authors conclude that the high degree of genetic heterogeneity in complex I disorders warrants the implementation of unbiased genome-wide strategies for the complete molecular dissection of mitochondrial complex I deficiency.
Austin, Jehannine C; Honer, William G
2007-02-01
Genetic counseling is an important clinical service that is routinely offered to families affected by genetic disorders or by complex disorders for which genetic testing is available. It is not yet routinely offered to individuals with serious mental illnesses and their families, but recent findings that beliefs about the cause of mental illness can affect an individual's adaptation to the illness suggest that genetic counseling may be a useful intervention for this population. In a genetic counseling session the counselor discusses genetic and environmental contributors to disease pathogenesis; helps individuals explore conceptions, fears, and adaptive strategies; and provides nondirective support for decision making. Expected outcomes may include reductions in fear, stigma, and guilt associated with a psychiatric diagnosis; improvements in adherence to prescribed medications; declines in risk behaviors; and reductions in misconceptions about the illness. The authors endorse a multidisciplinary approach in which a psychiatrist and genetic counselor collaborate to provide comprehensive psychiatric genetic counseling.
Lee, Seungyeoun; Kim, Yongkang; Kwon, Min-Seok; Park, Taesung
2015-01-01
Genome-wide association studies (GWAS) have extensively analyzed single SNP effects on a wide variety of common and complex diseases and found many genetic variants associated with diseases. However, a large portion of the genetic variance remains unexplained. This missing heritability problem might be due to the analytical strategy that limits analyses to only single SNPs. One possible approach to the missing heritability problem is to identify multi-SNP effects or gene-gene interactions. The multifactor dimensionality reduction (MDR) method has been widely used to detect gene-gene interactions; it relies on constructive induction, classifying high-dimensional genotype combinations into a one-dimensional variable with two attributes, high risk and low risk, in case-control studies. Many modifications of MDR have been proposed, and the method has also been extended to the survival phenotype. In this study, we propose several extensions of MDR for the survival phenotype and compare the proposed extensions with earlier MDR through comprehensive simulation studies. PMID:26339630
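The constructive-induction step described above can be sketched for the original case-control setting: each multi-locus genotype combination is labelled high risk or low risk by comparing its case-to-control ratio with the ratio in the whole sample. This is a minimal sketch of that one step, not the survival extensions proposed in the study; the function name `mdr_risk_labels`, the tie rule (ties count as high risk), and the toy data are assumptions.

```python
from collections import Counter


def mdr_risk_labels(genotype_combos, is_case):
    """Label each genotype combination "high" or "low" risk.

    genotype_combos: one tuple of genotype codes per individual, e.g. (0, 1).
    is_case: one boolean per individual (True = case, False = control).
    A combination is "high" when its case:control ratio is at least the
    overall case:control ratio (ties counted as high; an assumption).
    """
    cases = Counter()
    controls = Counter()
    for combo, case in zip(genotype_combos, is_case):
        if case:
            cases[combo] += 1
        else:
            controls[combo] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())

    labels = {}
    for combo in set(cases) | set(controls):
        # Cross-multiplied form of cases/controls >= n_cases/n_controls,
        # which avoids dividing by zero for cells without controls.
        if cases[combo] * n_controls >= controls[combo] * n_cases:
            labels[combo] = "high"
        else:
            labels[combo] = "low"
    return labels


# Toy two-SNP data set, not taken from the paper.
toy_combos = [(0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (1, 1), (0, 1), (0, 1)]
toy_status = [True, True, False, False, False, True, True, False]
toy_labels = mdr_risk_labels(toy_combos, toy_status)
```

The resulting one-dimensional high/low variable is what MDR then evaluates as a classifier, typically with cross-validation, which this sketch omits.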
Yang, Jhung-Ahn; Yang, Sung-Hyun; Kim, Junghee; Kwon, Kae Kyoung; Oh, Hyun-Myung
2017-07-01
Here we report the comparative genomic analysis of strain UJ101 with 15 strains from the family Flavobacteriaceae, using the CGExplorer program. Flavobacteriales bacterium strain UJ101 was isolated from a xanthid crab, Atergatis reticulatus, from the East Sea near Korea. The complete genome of strain UJ101 is a 3,074,209 bp, single, circular chromosome with 30.74% GC content. While the UJ101 genome contains a number of annotated genes for many metabolic pathways, such as the Embden-Meyerhof pathway, the pentose phosphate pathway, the tricarboxylic acid (TCA) cycle, and the glyoxylate cycle, genes for the Entner-Doudoroff pathway are not found in the UJ101 genome. Overall, carbon fixation processes were absent but nitrate reduction and denitrification pathways were conserved. The UJ101 genome was compared to genomes from other marine animals (three invertebrate strains and five fish strains) and other marine animal-derived genera. Notable results from the genome comparisons showed that UJ101 is capable of denitrification and nitrate reduction, and that biotin-thiamine pathway participation varies among marine bacteria: fish-dwelling bacteria, free-living bacteria, invertebrate-dwelling bacteria, and strain UJ101. Pan-genome analysis of the 16 strains in this study included 7,220 non-redundant genes that covered 62% of the pan-genome. A core-genome of 994 genes was present and consisted of 8% of the genes from the pan-genome. Strain UJ101 is a symbiotic hetero-organotroph isolated from a xanthid crab, and is a metabolic generalist with nitrate-reducing abilities but without the ability to synthesize biotin. There is a general tendency of UJ101 and some fish pathogens to prefer thiamine-dependent glycolysis to gluconeogenesis. Biotin and thiamine auxotrophy or prototrophy may be used as important markers in microbial community studies.
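The pan-genome and core-genome quantities reported above can be illustrated with plain set operations: the pan-genome is the union of gene families across strains, and the core genome is their intersection. This is a minimal sketch that assumes genes have already been clustered into shared family identifiers; it does not reproduce the CGExplorer workflow, and the function name `pan_and_core` and the toy strains are hypothetical.

```python
def pan_and_core(gene_families_by_strain):
    """Return (pan_genome, core_genome) as sets of gene-family identifiers.

    gene_families_by_strain: dict mapping strain name -> iterable of
    gene-family identifiers (assumed to come from a prior clustering step).
    """
    gene_sets = [set(genes) for genes in gene_families_by_strain.values()]
    if not gene_sets:
        return set(), set()
    pan_genome = set().union(*gene_sets)          # present in at least one strain
    core_genome = set.intersection(*gene_sets)    # present in every strain
    return pan_genome, core_genome


# Toy input with made-up strain and family names.
toy_strains = {
    "strain_A": ["fam1", "fam2", "fam3"],
    "strain_B": ["fam1", "fam3", "fam4"],
    "strain_C": ["fam1", "fam3", "fam5"],
}
toy_pan, toy_core = pan_and_core(toy_strains)
core_fraction = len(toy_core) / len(toy_pan)
```

In the toy example the core fraction is 2 of 5 families; the analogous ratio in the abstract is the 994-gene core expressed as a share of the pan-genome.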
Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric
2017-02-01
Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
USDA-ARS's Scientific Manuscript database
Histophilus somni is a fastidious, gram-negative, opportunistic pathogen of the family Pasteurellaceae that affects multiple organ systems and is one of the principal bacterial species contributing to bovine respiratory disease complex (BRDC) in feed yard cattle. Here we present seven closed genomes isolated from...
Genome-wide characterization of Mediator recruitment, function, and regulation.
Grünberg, Sebastian; Zentner, Gabriel E
2017-05-27
Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes
A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt
2000-01-01
The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
The genome editing toolbox: a spectrum of approaches for targeted modification.
Cheng, Joseph K; Alper, Hal S
2014-12-01
The increase in quality, quantity, and complexity of recombinant products heavily drives the need to predictably engineer model and complex (mammalian) cell systems. However, until recently, limited tools offered the ability to precisely manipulate their genomes, thus impeding the full potential of rational cell line development processes. Targeted genome editing can combine the advances in synthetic and systems biology with current cellular hosts to further push productivity and expand the product repertoire. This review highlights recent advances in targeted genome editing techniques, discussing some of their capabilities and limitations and their potential to aid advances in pharmaceutical biotechnology. Copyright © 2014 Elsevier Ltd. All rights reserved.
Ross, Daniel E.; Marshall, Christopher W.; May, Harold D.; ...
2015-01-15
A draft genome of Sulfurospirillum sp. strain MES was isolated through taxonomic binning of a metagenome sequenced from a microbial electrosynthesis system (MES) actively producing acetate and hydrogen. The genome contains the nosZDFLY genes, which are involved in nitrous oxide reduction, suggesting the potential role of this strain in denitrification.
Human papillomavirus type 16 E7 oncoprotein mediates CCNA1 promoter methylation.
Chalertpet, Kanwalat; Pakdeechaidan, Watcharapong; Patel, Vyomesh; Mutirangura, Apiwat; Yanatatsaneejit, Pattamawadee
2015-10-01
Human papillomavirus (HPV) oncoproteins drive distinctive promoter methylation patterns in cancer. However, the underlying mechanism remains to be elucidated. Cyclin A1 (CCNA1) promoter methylation is strongly associated with HPV-associated cancer. CCNA1 methylation is found in HPV-associated cervical cancers, as well as in head and neck squamous cell cancer. Numerous pieces of evidence suggest that E7 may drive CCNA1 methylation. First, the CCNA1 promoter is methylated in HPV-positive epithelial lesions after transformation. Second, the CCNA1 promoter is methylated at a high level when HPV is integrated into the human genome. Finally, E7 has been shown to interact with DNA methyltransferase 1 (Dnmt1). Here, we sought to determine the mechanism by which E7 increases methylation in cervical cancer by using CCNA1 as a gene model. We investigated whether E7 induces CCNA1 promoter methylation, resulting in the loss of expression. Using both E7 knockdown and overexpression approaches in SiHa and C33a cells, our data showed that CCNA1 promoter methylation decreases with a corresponding increase in expression in E7 siRNA-transfected cells. By contrast, CCNA1 promoter methylation was augmented with a corresponding reduction in expression in E7-overexpressing cells. To confirm whether the binding of the E7-Dnmt1 complex to the CCNA1 promoter induced methylation and loss of expression, ChIP assays were carried out in E7-, del CR3-E7 and vector control-overexpressing C33a cells. The data showed that E7 induced CCNA1 methylation by forming a complex with Dnmt1 at the CCNA1 promoter, resulting in the subsequent reduction of expression in cancers. It is interesting to further explore the genome-wide mechanism of E7 oncoprotein-mediated DNA methylation. © 2015 The Authors. Cancer Science published by Wiley Publishing Asia Pty Ltd on behalf of Japanese Cancer Association.
Limited mitogenomic degradation in response to a parasitic lifestyle in Orobanchaceae
Fan, Weishu; Zhu, Andan; Kozaczek, Melisa; Shah, Neethu; Pabón-Mora, Natalia; González, Favio; Mower, Jeffrey P.
2016-01-01
In parasitic plants, the reduction in plastid genome (plastome) size and content is driven predominantly by the loss of photosynthetic genes. The first completed mitochondrial genomes (mitogenomes) from parasitic mistletoes also exhibit significant degradation, but the generality of this observation for other parasitic plants is unclear. We sequenced the complete mitogenome and plastome of the hemiparasite Castilleja paramensis (Orobanchaceae) and compared them with additional holoparasitic, hemiparasitic and nonparasitic species from Orobanchaceae. Comparative mitogenomic analysis revealed minimal gene loss among the seven Orobanchaceae species, indicating the retention of typical mitochondrial function among Orobanchaceae species. Phylogenetic analysis demonstrated that the mobile cox1 intron was acquired vertically from a nonparasitic ancestor, arguing against a role for Orobanchaceae parasites in the horizontal acquisition or distribution of this intron. The C. paramensis plastome has retained nearly all genes except for the recent pseudogenization of four subunits of the NAD(P)H dehydrogenase complex, indicating a very early stage of plastome degradation. These results lend support to the notion that loss of ndh gene function is the first step of plastome degradation in the transition to a parasitic lifestyle. PMID:27808159
The Capsaspora genome reveals a complex unicellular prehistory of animals.
Suga, Hiroshi; Chen, Zehua; de Mendoza, Alex; Sebé-Pedrós, Arnau; Brown, Matthew W; Kramer, Eric; Carr, Martin; Kerner, Pierre; Vervoort, Michel; Sánchez-Pons, Núria; Torruella, Guifré; Derelle, Romain; Manning, Gerard; Lang, B Franz; Russ, Carsten; Haas, Brian J; Roger, Andrew J; Nusbaum, Chad; Ruiz-Trillo, Iñaki
2013-01-01
To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.
Targeted Genome Editing Using DNA-Free RNA-Guided Cas9 Ribonucleoprotein for CHO Cell Engineering.
Shin, Jongoh; Lee, Namil; Cho, Suhyung; Cho, Byung-Kwan
2018-01-01
Recent advances in the CRISPR/Cas9 system have dramatically facilitated genome engineering in various cell systems. Among the protocols, the direct delivery of the Cas9-sgRNA ribonucleoprotein (RNP) complex into cells is an efficient approach to increase genome editing efficiency. This method uses purified Cas9 protein and in vitro transcribed sgRNA to edit the target gene without vector DNA. We have applied the RNP complex to CHO cell engineering to obtain desirable phenotypes and to reduce unintended insertional mutagenesis and off-target effects. Here, we describe our routine methods for RNP complex-mediated gene deletion including the protocols to prepare the purified Cas9 protein and the in vitro transcribed sgRNA. Subsequently, we also describe a protocol to confirm the edited genomic positions using the T7E1 enzymatic assay and next-generation sequencing.
Kelemen, Arpad; Vasilakos, Athanasios V; Liang, Yulan
2009-09-01
Comprehensive evaluation of common genetic variations through association of single-nucleotide polymorphism (SNP) structure with common complex disease on a genome-wide scale is currently a hot area in human genome research due to the recent development of the Human Genome Project and HapMap Project. Computational science, which includes computational intelligence (CI), has recently become the third method of scientific enquiry besides theory and experimentation. There has been fast-growing interest in developing and applying CI in disease mapping using SNP and haplotype data. Some recent studies have demonstrated the promise and importance of CI for common complex diseases in genomic association studies using SNP/haplotype data, especially for tackling challenges such as gene-gene and gene-environment interactions and the notorious "curse of dimensionality" problem. This review covers recent developments of CI approaches for complex diseases in genetic association studies with SNP/haplotype data.
Kang, Dongwan D.; Froula, Jeff; Egan, Rob; ...
2015-01-01
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
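As a rough illustration of the tetranucleotide frequency signal mentioned in the abstract above, the following Python sketch counts overlapping 4-mers in a contig and compares two contigs. The function names are hypothetical, and the Euclidean distance is only a simple stand-in: it is not MetaBAT's empirical probabilistic distance, and genome abundance is ignored entirely.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return the normalized frequency of each of the 256 tetranucleotides.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    This is an illustrative helper, not part of any published tool.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + 4] for i in range(len(sequence) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        return {k: 0.0 for k in TETRAMERS}
    return {k: counts[k] / total for k in TETRAMERS}


def euclidean_distance(freq_a, freq_b):
    """Plain Euclidean distance between two frequency profiles (an assumption,
    used here in place of a calibrated probabilistic distance)."""
    return sum((freq_a[k] - freq_b[k]) ** 2 for k in TETRAMERS) ** 0.5


if __name__ == "__main__":
    contig_a = tetranucleotide_frequencies("ACGTACGTACGTTTGACA")
    contig_b = tetranucleotide_frequencies("GGGGCCCCGGGGCCCCGG")
    print(euclidean_distance(contig_a, contig_b))
```

Contigs from the same genome tend to have similar profiles, so a small distance is weak evidence that two contigs belong in the same bin; a real binner would combine this with coverage information.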
Bioenergetic Constraints on the Evolution of Complex Life
Lane, Nick
2014-01-01
All morphologically complex life on Earth, beyond the level of cyanobacteria, is eukaryotic. All eukaryotes share a common ancestor that was already a complex cell. Despite their biochemical virtuosity, prokaryotes show little tendency to evolve eukaryotic traits or large genomes. Here I argue that prokaryotes are constrained by their membrane bioenergetics, for fundamental reasons relating to the origin of life. Eukaryotes arose in a rare endosymbiosis between two prokaryotes, which broke the energetic constraints on prokaryotes and gave rise to mitochondria. Loss of almost all mitochondrial genes produced an extreme genomic asymmetry, in which tiny mitochondrial genomes support, energetically, a massive nuclear genome, giving eukaryotes three to five orders of magnitude more energy per gene than prokaryotes. The requirement for endosymbiosis radically altered selection on eukaryotes, potentially explaining the evolution of unique traits, including the nucleus, sex, two sexes, speciation, and aging. PMID:24789818
Regulatory variation: an emerging vantage point for cancer biology.
Li, Luolan; Lorzadeh, Alireza; Hirst, Martin
2014-01-01
Transcriptional regulation involves complex and interdependent interactions of noncoding and coding regions of the genome with proteins that interact with and modify them. Genetic variation/mutation in coding and noncoding regions of the genome can drive aberrant transcription and disease. Although noncoding DNA accounts for nearly 98% of the genome, comparatively little is known about the contribution of noncoding DNA elements to disease. Genome-wide association studies of complex human diseases including cancer have revealed enrichment for variants in the noncoding genome. A striking finding of recent cancer genome re-sequencing efforts has been the previously underappreciated frequency of mutations in epigenetic modifiers across a wide range of cancer types. Taken together, these results point to the importance of dysregulation in transcriptional regulatory control in the genesis of cancer. Powered by recent technological advancements in functional genomic profiling, exploration of normal and transformed regulatory networks will provide novel insight into the initiation and progression of cancer and open new windows to future prognostic and diagnostic tools. © 2013 Wiley Periodicals, Inc.
UCSC genome browser: deep support for molecular biomedical research.
Mangan, Mary E; Williams, Jennifer M; Lathe, Scott M; Karolchik, Donna; Lathe, Warren C
2008-01-01
The volume and complexity of genomic sequence data, and the additional experimental data required for annotation of the genomic context, pose a major challenge for display and access for biomedical researchers. Genome browsers organize this data and make it available in various ways to extract useful information to advance research projects. The UCSC Genome Browser is one of these resources. The official sequence data for a given species forms the framework to display many other types of data such as expression, variation, cross-species comparisons, and more. Visual representations of the data are available for exploration. Data can be queried with sequences. Complex database queries are also easily achieved with the Table Browser interface. Associated tools permit additional query types or access to additional data sources such as images of in situ localizations. Support for solving researcher's issues is provided with active discussion mailing lists and by providing updated training materials. The UCSC Genome Browser provides a source of deep support for a wide range of biomedical molecular research (http://genome.ucsc.edu).
Yoshizumi, Takeshi; Oikawa, Kazusato; Chuah, Jo-Ann; Kodama, Yutaka; Numata, Keiji
2018-05-14
Selective gene delivery into organellar genomes (mitochondrial and plastid genomes) has been limited because of a lack of appropriate platform technology, even though these organelles are essential for metabolite and energy production. Techniques for selective organellar modification are needed to functionally improve organelles and produce transplastomic/transmitochondrial plants. However, no method for mitochondrial genome modification has yet been established for multicellular organisms including plants. Likewise, modification of plastid genomes has been limited to a few plant species and algae. In the present study, we developed ionic complexes of fusion peptides containing organellar targeting signal and plasmid DNA for selective delivery of exogenous DNA into the plastid and mitochondrial genomes of intact plants. This is the first report of exogenous DNA being integrated into the mitochondrial genomes of not only plants, but also multicellular organisms in general. This fusion peptide-mediated gene delivery system is a breakthrough platform for both plant organellar biotechnology and gene therapy for mitochondrial diseases in animals.
Insights From Genomics Into Spatial and Temporal Variation in Batrachochytrium dendrobatidis.
Byrne, A Q; Voyles, J; Rios-Sotelo, G; Rosenblum, E B
2016-01-01
Advances in genetics and genomics have provided new tools for the study of emerging infectious diseases. Researchers can now move quickly from simple hypotheses to complex explanations for pathogen origin, spread, and mechanisms of virulence. Here we focus on the application of genomics to understanding the biology of the fungal pathogen Batrachochytrium dendrobatidis (Bd), a novel and deadly pathogen of amphibians. We provide a brief history of the system, then focus on key insights into Bd variation garnered from genomics approaches, and finally, highlight new frontiers for future discoveries. Genomic tools have revealed unexpected complexity and variation in the Bd system suggesting that the history and biology of emerging pathogens may not be as simple as they initially seem. Copyright © 2016 Elsevier Inc. All rights reserved.
Family genome browser: visualizing genomes with pedigree information.
Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong
2015-07-15
Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes at both the individual level and the variant level effectively, by integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibrium, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search for de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
2011-01-01
Background Evolution of the Brassica species has been recursively affected by polyploidy events, and comparison to their relative, Arabidopsis thaliana, provides means to explore their genomic complexity. Results A genome-wide physical map of a rapid-cycling strain of B. oleracea was constructed by integrating high-information-content fingerprinting (HICF) of Bacterial Artificial Chromosome (BAC) clones with hybridization to sequence-tagged probes. Using 2907 contigs of two or more BACs, we performed several lines of comparative genomic analysis. Interspecific DNA synteny is much better preserved in euchromatin than heterochromatin, showing the qualitative difference in evolution of these respective genomic domains. About 67% of contigs can be aligned to the Arabidopsis genome, with 96.5% corresponding to euchromatic regions, and 3.5% (shown to contain repetitive sequences) to pericentromeric regions. Overgo probe hybridization data showed that contigs aligned to Arabidopsis euchromatin contain ~80% of low-copy-number genes, while genes with high copy number are much more frequently associated with pericentromeric regions. We identified 39 interchromosomal breakpoints during the diversification of B. oleracea and Arabidopsis thaliana, a relatively high level of genomic change since their divergence. Comparison of the B. oleracea physical map with Arabidopsis and other available eudicot genomes showed appreciable 'shadowing' produced by more ancient polyploidies, resulting in a web of relatedness among contigs which increased genomic complexity. Conclusions A high-resolution genetically-anchored physical map sheds light on Brassica genome organization and advances positional cloning of specific genes, and may help to validate genome sequence assembly and alignment to chromosomes. All the physical mapping data is freely shared at a WebFPC site (http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/; Temporarily password-protected: account: pgml; password: 123qwe123).
PMID:21955929
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-01-01
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007
Were protein internal repeats formed by "bricolage"?
Lavorgna, G; Patthy, L; Boncinelli, E
2001-03-01
Is evolution an engineer, or is it a tinkerer--a "bricoleur"--building up complex molecules in organisms by increasing and adapting the materials at hand? An analysis of completely sequenced genomes suggests the latter, showing that increasing repetition of modules within the proteins encoded by these genomes is correlated with increasing complexity of the organism.
Genome-wide characterization of Mediator recruitment, function, and regulation
2017-01-01
ABSTRACT Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression. PMID:28301289
Genetic addiction: selfish gene's strategy for symbiosis in the genome.
Mochizuki, Atsushi; Yahara, Koji; Kobayashi, Ichizo; Iwasa, Yoh
2006-02-01
The evolution and maintenance of the phenomenon of postsegregational host killing or genetic addiction are paradoxical. In this phenomenon, a gene complex, once established in a genome, programs death of a host cell that has eliminated it. The intact form of the gene complex would survive in other members of the host population. It is controversial as to why these genetic elements are maintained, due to the lethal effects of host killing, or perhaps some other properties are beneficial to the host. We analyzed their population dynamics by analytical methods and computer simulations. Genetic addiction turned out to be advantageous to the gene complex in the presence of a competitor genetic element. The advantage is, however, limited in a population without spatial structure, such as that in a well-mixed liquid culture. In contrast, in a structured habitat, such as the surface of a solid medium, the addiction gene complex can increase in frequency, irrespective of its initial density. Our demonstration that genomes can evolve through acquisition of addiction genes has implications for the general question of how a genome can evolve as a community of potentially selfish genes.
Weiner, Ronald M.; Taylor, Larry E.; Henrissat, Bernard; Hauser, Loren; Land, Miriam; Coutinho, Pedro M.; Rancurel, Corinne; Saunders, Elizabeth H.; Longmire, Atkinson G.; Zhang, Haitao; Bayer, Edward A.; Gilbert, Harry J.; Larimer, Frank; Zhulin, Igor B.; Ekborg, Nathan A.; Lamed, Raphael; Richardson, Paul M.; Borovok, Ilya; Hutcheson, Steven
2008-01-01
The marine bacterium Saccharophagus degradans strain 2-40 (Sde 2-40) is emerging as a vanguard of a recently discovered group of marine and estuarine bacteria that recycles complex polysaccharides. We report its complete genome sequence, analysis of which identifies an unusually large number of enzymes that degrade >10 complex polysaccharides. Not only is this an extraordinary range of catabolic capability, many of the enzymes exhibit unusual architecture including novel combinations of catalytic and substrate-binding modules. We hypothesize that many of these features are adaptations that facilitate depolymerization of complex polysaccharides in the marine environment. This is the first sequenced genome of a marine bacterium that can degrade plant cell walls, an important component of the carbon cycle that is not well-characterized in the marine environment. PMID:18516288
Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.
Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E
2016-11-18
Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms, including HLA-DQA1 and C6orf10, were clustered within the Major Histocompatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10^-8). Technical validation of top ranking non-Major Histocompatibility Complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10^-4. Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism, rs9644119, downstream of DPYSL2 showed some evidence of association, supported by findings in the replication cohort, that warrants further study. The pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histocompatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.
Kulski, Jerzy K; Shiina, Takashi; Anzai, Tatsuya; Kohara, Sakae; Inoko, Hidetoshi
2002-12-01
The major histocompatibility complex (MHC) genomic region is composed of a group of linked genes involved functionally with the adaptive and innate immune systems. The class I and class II genes are intrinsic features of the MHC and have been found in all the jawed vertebrates studied so far. The MHC genomic regions of the human and the chicken (B locus) have been fully sequenced and mapped, and the mouse MHC sequence is almost finished. Information on the MHC genomic structures (size, complexity, genic and intergenic composition and organization, gene order and number) of other vertebrates is largely limited or nonexistent. Therefore, we are mapping, sequencing and analyzing the MHC genomic regions of different human haplotypes and at least eight nonhuman species. Here, we review our progress with these sequences and compare the human MHC structure with that of the nonhuman primates (chimpanzee and rhesus macaque), other mammals (pigs, mice and rats) and nonmammalian vertebrates such as birds (chicken and quail), bony fish (medaka, pufferfish and zebrafish) and cartilaginous fish (nurse shark). This comparison reveals a complex MHC structure for mammals and a relatively simpler design for nonmammalian animals with a hypothetical prototypic structure for the shark. In the mammalian MHC, there are two to five different class I duplication blocks embedded within a framework of conserved nonclass I and/or nonclass II genes. With a few exceptions, the class I framework genes are absent from the MHC of birds, bony fish and sharks. Comparative genomics of the MHC reveal a highly plastic region with major structural differences between the mammalian and nonmammalian vertebrates. Additional genomic data are needed on animals of the reptilia, crocodilia and marsupial classes to find the origins of the class I framework genes and examples of structures that may be intermediate between the simple and complex MHC organizations of birds and mammals, respectively.
Origins and Domestication of Cultivated Banana Inferred from Chloroplast and Nuclear Genes
Zhang, Cui; Wang, Xin-Feng; Shi, Feng-Xue; Chen, Wen-Na; Ge, Xue-Jun
2013-01-01
Background Cultivated bananas are large, vegetatively-propagated members of the genus Musa. More than 1,000 cultivars are grown worldwide and they are major economic and food resources in numerous developing countries. It has been suggested that cultivated bananas originated from the islands of Southeast Asia (ISEA) and have been developed through complex geodomestication pathways. However, the maternal and parental donors of most cultivars are unknown, and the pattern of nucleotide diversity in domesticated banana has not been fully resolved. Methodology/Principal Findings We studied the genetics of 16 cultivated and 18 wild Musa accessions using two single-copy nuclear (granule-bound starch synthase I, GBSS I, also known as Waxy, and alcohol dehydrogenase 1, Adh1) and two chloroplast (maturase K, matK, and the trnL-F gene cluster) genes. The results of phylogenetic analyses showed that all A-genome haplotypes of cultivated bananas were grouped together with those of ISEA subspecies of M. acuminata (A-genome). Similarly, the B- and S-genome haplotypes of cultivated bananas clustered with the wild species M. balbisiana (B-genome) and M. schizocarpa (S-genome), respectively. Notably, it has been shown that distinct haplotypes of each cultivar (A-genome group) were nested together to different ISEA subspecies M. acuminata. Analyses of nucleotide polymorphism in the Waxy and Adh1 genes revealed that, in comparison to the wild relatives, cultivated banana exhibited slightly lower nucleotide diversity both across all sites and specifically at silent sites. However, dramatically reduced nucleotide diversity was found at nonsynonymous sites for cultivated bananas. Conclusions/Significance Our study not only confirmed the origin of cultivated banana as arising from multiple intra- and inter-specific hybridization events, but also showed that cultivated banana may have not suffered a severe genetic bottleneck during the domestication process. 
Importantly, our findings suggested that multiple maternal origins and a reduction in nucleotide diversity at nonsynonymous sites are general attributes of cultivated bananas. PMID:24260405
Genome-reconstruction for eukaryotes from complex natural microbial communities.
West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F
2018-04-01
Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities. © 2018 West et al.; Published by Cold Spring Harbor Laboratory Press.
Reducing assembly complexity of microbial genomes with single-molecule sequencing.
Koren, Sergey; Harhay, Gregory P; Smith, Timothy P L; Bono, James L; Harhay, Dayna M; Mcvey, Scott D; Radune, Diana; Bergman, Nicholas H; Phillippy, Adam M
2013-01-01
The short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem. To measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads. Automated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.
Complexity: an internet resource for analysis of DNA sequence complexity
Orlov, Y. L.; Potapov, V. N.
2004-01-01
The search for DNA regions with low complexity is one of the pivotal tasks of modern structural analysis of complete genomes. The low complexity may be preconditioned by strong inequality in nucleotide content (biased composition), by tandem or dispersed repeats or by palindrome-hairpin structures, as well as by a combination of all these factors. Several numerical measures of textual complexity, including combinatorial and linguistic ones, together with complexity estimation using a modified Lempel–Ziv algorithm, have been implemented in a software tool called ‘Complexity’ (http://wwwmgs.bionet.nsc.ru/mgs/programs/low_complexity/). The software enables a user to search for low-complexity regions in long sequences, e.g. complete bacterial genomes or eukaryotic chromosomes. In addition, it estimates the complexity of groups of aligned sequences. PMID:15215465
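To make the idea of a Lempel–Ziv-style complexity measure concrete, here is a minimal Python sketch that counts phrases in a plain LZ78 parse and scans a sequence in fixed windows. This is an assumption-laden illustration: the abstract refers to a modified Lempel–Ziv algorithm whose details are not given here, and the function names, window size, and step are hypothetical.

```python
def lz78_phrase_count(sequence):
    """Count phrases in a simple LZ78 parse of the sequence.

    Fewer phrases for a given length suggests lower complexity
    (more repetition). A trailing incomplete phrase counts as one.
    """
    phrases = set()
    current = ""
    for symbol in sequence:
        current += symbol
        if current not in phrases:
            # First time this phrase is seen: record it and start a new one.
            phrases.add(current)
            current = ""
    return len(phrases) + (1 if current else 0)


def windowed_complexity(sequence, window, step):
    """Return (start, phrase_count) for each full window along the sequence."""
    return [
        (start, lz78_phrase_count(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


if __name__ == "__main__":
    seq = "A" * 8 + "ACGTACGT"
    # The homopolymer window parses into fewer phrases than the mixed window.
    print(windowed_complexity(seq, window=8, step=8))
```

Windows with unusually low phrase counts relative to their length would be candidates for low-complexity regions; choosing a threshold is left open here because it depends on window size and base composition.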
Kerner, Berit; North, Kari E; Fallin, M Daniele
2010-01-01
Participants analyzed actual and simulated longitudinal data from the Framingham Heart Study for various metabolic and cardiovascular traits. The genetic information incorporated into these investigations ranged from selected single-nucleotide polymorphisms to genome-wide association arrays. Genotypes were incorporated using a broad range of methodological approaches including conditional logistic regression, linear mixed models, generalized estimating equations, linear growth curve estimation, growth modeling, growth mixture modeling, population attributable risk fraction based on survival functions under the proportional hazards models, and multivariate adaptive splines for the analysis of longitudinal data. The specific scientific questions addressed by these different approaches also varied, ranging from a more precise definition of the phenotype, bias reduction in control selection, estimation of effect sizes and genotype associated risk, to direct incorporation of genetic data into longitudinal modeling approaches and the exploration of population heterogeneity with regard to longitudinal trajectories. The group reached several overall conclusions: 1) The additional information provided by longitudinal data may be useful in genetic analyses. 2) The precision of the phenotype definition as well as control selection in nested designs may be improved, especially if traits demonstrate a trend over time or have strong age-of-onset effects. 3) Analyzing genetic data stratified for high-risk subgroups defined by a unique development over time could be useful for the detection of rare mutations in common multi-factorial diseases. 4) Estimation of the population impact of genomic risk variants could be more precise. The challenges and computational complexity demanded by genome-wide single-nucleotide polymorphism data were also discussed. PMID:19924713
Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E
2008-01-15
To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
Quantitative Effects of P Elements on Hybrid Dysgenesis in Drosophila Melanogaster
Rasmusson, K. E.; Simmons, M. J.; Raymond, J. D.; McLarnon, C. F.
1990-01-01
Genetic analyses involving chromosomes from seven inbred lines derived from a single M' strain were used to study the quantitative relationships between the incidence and severity of P-M hybrid dysgenesis and the number of genomic P elements. In four separate analyses, the mutability of sn(w), a P element-insertion mutation of the X-linked singed locus, was found to be inversely related to the number of autosomal P elements. Since sn(w) mutability is caused by the action of the P transposase, this finding supports the hypothesis that genomic P elements titrate the transposase present within a cell. Other analyses demonstrated that autosomal transmission ratios were distorted by P element action. In these analyses, the amount of distortion against an autosome increased more or less linearly with the number of P elements carried by the autosome. Additional analyses showed that the magnitude of this distortion was reduced when a second P element-containing autosome was present in the genome. This reduction could adequately be explained by transposase titration; there was no evidence that it was due to repressor molecules binding to P elements and inhibiting their movement. The influence of genomic P elements on the incidence of gonadal dysgenesis was also investigated. Although no simple relationship between the number of P elements and the incidence of the trait could be discerned, it was clear that even a small number of elements could increase the incidence markedly. The failure to find a quantitative relationship between P element number and the incidence of gonadal dysgenesis probably reflects the complex etiology of this trait. PMID:2155853
Anti-inflammatory genes associated with multiple sclerosis: a gene expression study.
Perga, S; Montarolo, F; Martire, S; Berchialla, P; Malucchi, S; Bertolotto, A
2015-02-15
Multiple sclerosis (MS) is an autoimmune inflammatory disease of the central nervous system caused by a complex interaction between multiple genes and environmental factors. The HLA region is the strongest susceptibility locus, but recent large genome-wide association studies identified new susceptibility genes. Among these, BACH2, PTGER4, RGS1 and ZFP36L1 were highlighted. Here, a gene expression analysis revealed that three of them, namely BACH2, PTGER4 and ZFP36L1, are down-regulated in MS patients' blood cells compared to healthy subjects. Interestingly, all these genes are involved in immune system regulation with a predominantly anti-inflammatory role, and their reduction could predispose to MS development. Copyright © 2015 Elsevier B.V. All rights reserved.
Fleige, Tobias; Fischer, Karsten; Ferguson, David J. P.; Gross, Uwe; Bohne, Wolfgang
2007-01-01
Many apicomplexan parasites, such as Toxoplasma gondii and Plasmodium species, possess a nonphotosynthetic plastid, referred to as the apicoplast, which is essential for the parasites’ viability and displays characteristics similar to those of nongreen plastids in plants. In this study, we localized several key enzymes of the carbohydrate metabolism of T. gondii to either the apicoplast or the cytosol by engineering parasites which express epitope-tagged fusion proteins. The cytosol contains a complete set of enzymes for glycolysis, which should enable the parasite to metabolize imported glucose into pyruvate. All the glycolytic enzymes, from phosphofructokinase up to pyruvate kinase, are present in the T. gondii genome as duplicates, and isoforms of triose phosphate isomerase, phosphoglycerate kinase, and pyruvate kinase were found to localize to the apicoplast. The mRNA expression levels of all genes with glycolytic products were compared between tachyzoites and bradyzoites; however, a strict bradyzoite-specific expression pattern was observed only for enolase I. The T. gondii genome encodes a single pyruvate dehydrogenase complex, which was located in the apicoplast and absent in the mitochondrion, as shown by targeting of epitope-tagged fusion proteins and by immunolocalization of the native pyruvate dehydrogenase complex. The exchange of metabolites between the cytosol and the apicoplast is likely to be mediated by a phosphate translocator which was localized to the apicoplast. Based on these localization studies, a model is proposed that explains the supply of the apicoplast with ATP and reducing power, as well as the exchange of metabolites between the cytosol and the apicoplast. PMID:17449654
NASA Astrophysics Data System (ADS)
Momper, L. M.; Magnabosco, C.; Amend, J.; Osburn, M. R.; Fournier, G. P.
2017-12-01
The marine and terrestrial subsurface biospheres represent quite likely the largest reservoirs for life on Earth, directly impacting surface processes and global cycles throughout Earth's history. In the deep subsurface biosphere (DSB) organic carbon and energy are often extremely scarce. However, archaea and bacteria are able to persist in the DSB to at least 3.5 km below surface [1]. Understanding how they persist, and by what metabolisms they subsist, are key questions in this biosphere. To address these questions we investigated 5 global DSB environments: one legacy mine in South Dakota, USA, 3 mines in South Africa and marine fluids circulating beneath the Juan de Fuca Ridge. Boreholes within these mines provided access to fluids buried beneath the earth's surface and sampled depths down to 3.1 km. Geochemical data were collected concomitantly with DNA for metagenomic sequencing. We examined genomes of the ancient and deeply branching Chloroflexi for metabolic capabilities and interrogated the geochemical drivers behind those metabolisms with in situ thermodynamic modeling of reaction energetics. In total, 23 Chloroflexi genomes were identified and analyzed from the 5 subsurface sites. Genes for nitrate reduction (nar) and sulfite reduction (dsr) were found in many of the South Africa Chloroflexi but were absent from genomes collected in South Dakota. Indeed, nitrate reduction was among the most energetically favorable reactions in South African fluids (10⁻¹⁴ kJ cell⁻¹ sec⁻¹ per mol of reactant) and sulfur reduction with Fe²⁺ or H₂ was also exergonic [2]. Conversely, genes for nitrite and nitrous oxide reduction (nrf, nir and nos) were found in genomes collected in South Dakota and Juan de Fuca, but not South Africa. We examined the origin of genes conferring these metabolisms in the Chloroflexi genomes. We discovered evidence for horizontal gene transfer (HGT) for all of these putative metabolisms.
Retention of these genes in Chloroflexi lineages indicates HGT may have conferred an advantageous metabolism in DSB environments. We are using molecular dating techniques to constrain the timing of these HGT events on geologic timescales. [1] Baker J. B. et al. (2003) Environ Microbiol., 5, 267-277. [2] Magnabosco C. et al. (2016) ISME J, 10(3), 730-741.
DDB2 promotes chromatin decondensation at UV-induced DNA damage
Lindh, Michael; Acs, Klara; Vrouwe, Mischa G.; Pines, Alex; van Attikum, Haico; Mullenders, Leon H.
2012-01-01
Nucleotide excision repair (NER) is the principal pathway that removes helix-distorting deoxyribonucleic acid (DNA) damage from the mammalian genome. Recognition of DNA lesions by xeroderma pigmentosum group C (XPC) protein in chromatin is stimulated by the damaged DNA-binding protein 2 (DDB2), which is part of a CUL4A–RING ubiquitin ligase (CRL4) complex. In this paper, we report a new function of DDB2 in modulating chromatin structure at DNA lesions. We show that DDB2 elicits unfolding of large-scale chromatin structure independently of the CRL4 ubiquitin ligase complex. Our data reveal a marked adenosine triphosphate (ATP)–dependent reduction in the density of core histones in chromatin containing UV-induced DNA lesions, which strictly required functional DDB2 and involved the activity of poly(adenosine diphosphate [ADP]–ribose) polymerase 1. Finally, we show that lesion recognition by XPC, but not DDB2, was strongly reduced in ATP-depleted cells and was regulated by the steady-state levels of poly(ADP-ribose) chains. PMID:22492724
cuRRBS: simple and robust evaluation of enzyme combinations for reduced representation approaches.
Martin-Herranz, Daniel E; Ribeiro, António J M; Krueger, Felix; Thornton, Janet M; Reik, Wolf; Stubbs, Thomas M
2017-11-16
DNA methylation is an important epigenetic modification in many species that is critical for development, and implicated in ageing and many complex diseases, such as cancer. Many cost-effective genome-wide analyses of DNA modifications rely on restriction enzymes capable of digesting genomic DNA at defined sequence motifs. There are hundreds of restriction enzyme families but few are used to date, because no tool is available for the systematic evaluation of restriction enzyme combinations that can enrich for certain sites of interest in a genome. Herein, we present customised Reduced Representation Bisulfite Sequencing (cuRRBS), a novel and easy-to-use computational method that solves this problem. By computing the optimal enzymatic digestions and size selection steps required, cuRRBS generalises the traditional MspI-based Reduced Representation Bisulfite Sequencing (RRBS) protocol to all restriction enzyme combinations. In addition, cuRRBS estimates the fold-reduction in sequencing costs and provides a robustness value for the personalised RRBS protocol, allowing users to tailor the protocol to their experimental needs. Moreover, we show in silico that cuRRBS-defined restriction enzymes consistently out-perform MspI digestion in many biological systems, considering both CpG and CHG contexts. Finally, we have validated the accuracy of cuRRBS predictions for single and double enzyme digestions using two independent experimental datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
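The in silico digestion and size-selection idea described above can be sketched as follows. This is a simplified toy under stated assumptions, not the cuRRBS implementation: it works on a single strand, uses only MspI (which cuts C^CGG), and the function names and size window are hypothetical.

```python
def digest(seq: str, site: str = "CCGG", cut_offset: int = 1) -> list[str]:
    """Cut seq at every occurrence of site, cut_offset bases into the site.

    Defaults model MspI, which recognises CCGG and cuts after the first C.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments: list[str], lo: int, hi: int) -> list[str]:
    """Keep fragments whose length lies in the inclusive range [lo, hi]."""
    return [f for f in fragments if lo <= len(f) <= hi]


def count_cpg(fragments: list[str]) -> int:
    """Count CpG dinucleotides captured by the selected fragments."""
    return sum(f.count("CG") for f in fragments)


if __name__ == "__main__":
    # Hypothetical toy sequence with two MspI sites.
    genome = "AACCGGTTTTCCGGAA"
    fragments = digest(genome)
    kept = size_select(fragments, lo=4, hi=8)
    print(fragments, kept, count_cpg(kept))
```

Comparing the number of sites of interest captured against the total length of the selected fragments, across different enzymes and size windows, gives a crude version of the enrichment trade-off that the abstract describes.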
Genetic Architecture of Parallel Pelvic Reduction in Ninespine Sticklebacks
Shikano, Takahito; Laine, Veronika N.; Herczeg, Gábor; Vilkki, Johanna; Merilä, Juha
2013-01-01
Teleost fish genomes are known to be evolving faster than those of other vertebrate taxa. Thus, fish are suited to address the extent to which the same vs. different genes are responsible for similar phenotypic changes in rapidly evolving genomes of evolutionarily independent lineages. To gain insights into the genetic basis and evolutionary processes behind parallel phenotypic changes within and between species, we identified the genomic regions involved in pelvic reduction in Northern European ninespine sticklebacks (Pungitius pungitius) and compared them to those of North American ninespine and threespine sticklebacks (Gasterosteus aculeatus). To this end, we conducted quantitative trait locus (QTL) mapping using 283 F2 progeny from an interpopulation cross. Phenotypic analyses indicated that pelvic reduction is a recessive trait and is inherited in a simple Mendelian fashion. Significant QTL for pelvic spine and girdle lengths were identified in the region of the Pituitary homeobox transcription factor 1 (Pitx1) gene, also responsible for pelvic reduction in threespine sticklebacks. The fact that no QTL was observed in the region identified in the mapping study of North American ninespine sticklebacks suggests that an alternative QTL for pelvic reduction has emerged in this species within the past 1.6 million years after the split between Northern European and North American populations. In general, our study provides empirical support for the view that alternative genetic mechanisms that lead to similar phenotypes can evolve over short evolutionary time scales. PMID:23979937
GARNATJE, TERESA; GARCIA, SÒNIA; VILATERSANA, ROSER; VALLÈS, JOAN
2006-01-01
• Background and Aims Plant genome size is an important biological characteristic, with relationships to systematics, ecology and distribution. Currently, there is no information regarding nuclear DNA content for any Carthamus species. In addition to improving the knowledge base, this research focuses on interspecific variation and its implications for the infrageneric classification of this genus. Genome size variation in the process of allopolyploid formation is also addressed. • Methods Nuclear DNA samples from 34 populations of 16 species of the genus Carthamus were assessed by flow cytometry using propidium iodide. • Key Results The 2C values ranged from 2·26 pg for C. leucocaulos to 7·46 pg for C. turkestanicus, and monoploid genome size (1Cx-value) ranged from 1·13 pg in C. leucocaulos to 1·53 pg in C. alexandrinus. Mean genome sizes differed significantly, based on sectional classification. Both allopolyploid species (C. creticus and C. turkestanicus) exhibited nuclear DNA contents in accordance with the sum of the putative parental C-values (in one case with a slight reduction, frequent in polyploids), supporting their hybrid origin. • Conclusions Genome size represents a useful tool in elucidating systematic relationships between closely related species. A considerable reduction in monoploid genome size, possibly due to the hybrid formation, is also reported within these taxa. PMID:16390843
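The relationship between the 2C value and the monoploid genome size (1Cx) used above can be checked with a short calculation. This is a minimal sketch assuming the standard definition 1Cx = 2C / ploidy level; only the C. leucocaulos figures come from the abstract, and the parental values in the additivity example are hypothetical placeholders.

```python
def monoploid_genome_size(two_c_pg: float, ploidy: int) -> float:
    """Return the 1Cx value (pg) from a 2C value (pg) and a ploidy level."""
    if ploidy <= 0:
        raise ValueError("ploidy must be a positive integer")
    return two_c_pg / ploidy


def additivity_deviation(observed_2c: float, parental_2c: list[float]) -> float:
    """Relative deviation of an allopolyploid 2C value from the parental sum.

    Negative values indicate a genome smaller than the sum of its
    putative parents (genome downsizing).
    """
    expected = sum(parental_2c)
    return (observed_2c - expected) / expected


if __name__ == "__main__":
    # From the abstract: C. leucocaulos, 2C = 2.26 pg, 1Cx = 1.13 pg,
    # which is consistent with a diploid (ploidy level 2).
    print(monoploid_genome_size(2.26, 2))
    # Hypothetical parental 2C values, for illustration only.
    print(additivity_deviation(4.8, [2.5, 2.5]))
```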
Overcoming Barriers to Progress in Exercise Genomics
Bouchard, Claude
2011-01-01
This commentary focuses on the issues of statistical power, the usefulness of hypothesis-free approaches such as in genome-wide association explorations, the necessity of expanding the research beyond common DNA variants, the advantage of combining transcriptomics with genomics, and the complexities inherent to the search for links between genotype and phenotype in exercise genomics research. PMID:21697717
Single-cell genomics for the masses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tringe, Susannah G.
2017-07-12
In this issue of Nature Biotechnology, Lan et al. describe a new tool in the toolkit for studying uncultivated microbial communities, enabling orders of magnitude higher single cell genome throughput than previous methods. This is achieved by a complex droplet microfluidics workflow encompassing steps from physical cell isolation through genome sequencing, producing tens of thousands of low-coverage genomes from individual cells.
USDA-ARS's Scientific Manuscript database
Like many agricultural crops, the cultivated cotton genome is large and polyploid (~2.5Gb), consisting of two very similar repeat-rich subgenomes, whose size and complexity pose significant challenges for accurate genome reconstruction using whole-genome shotgun approaches. A strategy for accurately...
Race and Ethnicity in the Genome Era: The Complexity of the Constructs
ERIC Educational Resources Information Center
Bonham, Vence L.; Warshauer-Baker, Esther; Collins, Francis S.
2005-01-01
The vast amount of biological information that is now available through the completion of the Human Genome Project presents opportunities and challenges. The genomic era has the potential to advance an understanding of human genetic variation and its role in human health and disease. A challenge for genomics research is to understand the…
USDA-ARS's Scientific Manuscript database
A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...
PGSB PlantsDB: updates to the database framework for comparative plant genome research.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X
2016-01-04
PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The draft genome of MD-2 pineapple using hybrid error correction of long reads
Redwan, Raimi M.; Saidin, Akzam; Kumar, S. Vijay
2016-01-01
The introduction of the elite pineapple variety, MD-2, has caused a significant market shift in the pineapple industry. Better productivity, an overall increase in fruit quality and taste, resilience to chilled storage and resistance to internal browning are among the key advantages of the MD-2 as compared with its predecessor, the Smooth Cayenne. Here, we present the genome sequence of the MD-2 pineapple (Ananas comosus (L.) Merr.) by using the hybrid sequencing technology from two highly reputable platforms, i.e. the PacBio long sequencing reads and the accurate Illumina short reads. Our draft genome achieved 99.6% genome coverage with 27,017 predicted protein-coding genes while 45.21% of the genome was identified as repetitive elements. Furthermore, differential expression of ripening RNASeq library of pineapple fruits revealed ethylene-related transcripts, believed to be involved in regulating the process of non-climacteric pineapple fruit ripening. The MD-2 pineapple draft genome serves as an example of how a complex heterozygous genome is amenable to whole genome sequencing by using a hybrid technology that is both economical and accurate. The genome will make genomic applications more feasible as a medium to understand complex biological processes specific to pineapple. PMID:27374615
Bork, Peer
2018-02-14
The U.S. Department of Energy Joint Genome Institute (JGI) invited scientists interested in the application of genomics to bioenergy and environmental issues, as well as all current and prospective users and collaborators, to attend the annual DOE JGI Genomics of Energy & Environment Meeting held March 22-24, 2011 in Walnut Creek, Calif. The emphasis of this meeting was on the genomics of renewable energy strategies, carbon cycling, environmental gene discovery, and engineering of fuel-producing organisms. The meeting featured presentations by leading scientists advancing these topics. Peer Bork of the European Molecular Biology Laboratory spoke on "Comparative Metagenomics of Gut and Ocean: Identification of Microbial Marker Genes for Complex Environmental Properties" at the 6th annual Genomics of Energy & Environment Meeting on March 23, 2011.
Barrett, Craig F; Wicke, Susann; Sass, Chodon
2018-05-01
Heterotrophic plants provide excellent opportunities to study the effects of altered selective regimes on genome evolution. Plastid genome (plastome) studies in heterotrophic plants are often based on one or a few highly divergent species or sequences as representatives of an entire lineage, thus missing important evolutionary-transitory events. Here, we present the first infraspecific analysis of plastome evolution in any heterotrophic plant. By combining genome skimming and targeted sequence capture, we address hypotheses on the degree and rate of plastome degradation in a complex of leafless orchids (Corallorhiza striata) across its geographic range. Plastomes provide strong support for relationships and evidence of reciprocal monophyly between C. involuta and the endangered C. bentleyi. Plastome degradation is extensive, occurring rapidly over a few million years, with evidence of differing rates of genomic change among the two principal clades of the complex. Genome skimming and targeted sequence capture differ widely in coverage depth overall, with depth in targeted sequence capture datasets varying immensely across the plastome as a function of GC content. These findings will help to fill a knowledge gap in models of heterotrophic plastid genome evolution, and have implications for future studies in heterotrophs. © 2018 The Authors. New Phytologist © 2018 New Phytologist Trust.
Living Organisms Author Their Read-Write Genomes in Evolution
Shapiro, James A
2017-12-06
Evolutionary variations generating phenotypic adaptations and novel taxa resulted from complex cellular activities altering genome content and expression: (i) Symbiogenetic cell mergers producing the mitochondrion-bearing ancestor of eukaryotes and chloroplast-bearing ancestors of photosynthetic eukaryotes; (ii) interspecific hybridizations and genome doublings generating new species and adaptive radiations of higher plants and animals; and, (iii) interspecific horizontal DNA transfer encoding virtually all of the cellular functions between organisms and their viruses in all domains of life. Consequently, assuming that evolutionary processes occur in isolated genomes of individual species has become an unrealistic abstraction. Adaptive variations also involved natural genetic engineering of mobile DNA elements to rewire regulatory networks. In the most highly evolved organisms, biological complexity scales with “non-coding” DNA content more closely than with protein-coding capacity. Coincidentally, we have learned how so-called “non-coding” RNAs that are rich in repetitive mobile DNA sequences are key regulators of complex phenotypes. Both biotic and abiotic ecological challenges serve as triggers for episodes of elevated genome change. The intersections of cell activities, biosphere interactions, horizontal DNA transfers, and non-random Read-Write genome modifications by natural genetic engineering provide a rich molecular and biological foundation for understanding how ecological disruptions can stimulate productive, often abrupt, evolutionary transformations. PMID:29211049
Harhay, Gregory P; Harhay, Dayna M; Bono, James L; Smith, Timothy P L; Capik, Sarah F; DeDonder, Keith D; Apley, Michael D; Lubbers, Brian V; White, Bradley J; Larson, Robert L
2017-10-05
Histophilus somni is a fastidious Gram-negative opportunistic pathogenic Pasteurellaceae that affects multiple organ systems and is one of the principal bacterial species contributing to bovine respiratory disease complex (BRDC) in feed yard cattle. Here, we present seven closed genome sequences isolated from three beef calves showing signs of BRDC.
USDA-ARS's Scientific Manuscript database
The comprehensive identification of genes underlying phenotypic variation of complex traits remains a major challenge. Most genome-wide screens lack sufficient resolving power as they typically depend on linkage. An alternate method is to screen for allele-specific expression (ASE), a simple yet pow...
Stielow, Bastian; Finkernagel, Florian; Stiewe, Thorsten
2018-01-01
Diverse Polycomb repressive complexes 1 (PRC1) play essential roles in gene regulation, differentiation and development. Six major groups of PRC1 complexes that differ in their subunit composition have been identified in mammals. How the different PRC1 complexes are recruited to specific genomic sites is poorly understood. The Polycomb Ring finger protein PCGF6, the transcription factors MGA and E2F6, and the histone-binding protein L3MBTL2 are specific components of the non-canonical PRC1.6 complex. In this study, we have investigated their role in genomic targeting of PRC1.6. ChIP-seq analysis revealed colocalization of MGA, L3MBTL2, E2F6 and PCGF6 genome-wide. Ablation of MGA in a human cell line by CRISPR/Cas resulted in complete loss of PRC1.6 binding. Rescue experiments revealed that MGA recruits PRC1.6 to specific loci both by DNA binding-dependent and by DNA binding-independent mechanisms. Depletion of L3MBTL2 and E2F6 but not of PCGF6 resulted in differential, locus-specific loss of PRC1.6 binding illustrating that different subunits mediate PRC1.6 loading to distinct sets of promoters. Mga, L3mbtl2 and Pcgf6 colocalize also in mouse embryonic stem cells, where PRC1.6 has been linked to repression of germ cell-related genes. Our findings unveil strikingly different genomic recruitment mechanisms of the non-canonical PRC1.6 complex, which specify its cell type- and context-specific regulatory functions. PMID:29381691
Goswami, Sathi; Sanyal, Sulagna; Chakraborty, Payal; Das, Chandrima; Sarkar, Munna
2017-08-01
NSAIDs are the most common class of painkillers and anti-inflammatory agents. They also show other functions like chemoprevention and chemosuppression, for which they act at the protein but not at the genome level since they are mostly anions at physiological pH, which prohibits their approach to the poly-anionic DNA. Complexing the drugs with a bioactive metal obliterates their negative charge and allows them to bind to the DNA, thereby opening the possibility of genome level interaction. To test this hypothesis, we present the interaction of a traditional NSAID, Piroxicam, and its copper complex with core histone and chromatin. Spectroscopy, DLS, and SEM studies were applied to see the effect of the interaction on the structure of histone/chromatin. This was coupled with MTT assay, immunoblot analysis, confocal microscopy, microarray analysis and qRT-PCR. The interaction of Piroxicam and its copper complex with histone/chromatin results in structural alterations. Such structural alterations can have different biological manifestations, but to test our hypothesis, we have focused only on the accompanying modulations at the epigenomic/genomic level. The complex showed alteration of key epigenetic signatures implicated in transcription in the global context, although Piroxicam caused no significant changes. We have correlated such alterations caused by the complex with the changes in global gene expression and validated the candidate gene expression alterations. Our results provide the proof of concept that the DNA binding ability of the copper complexes of a traditional NSAID opens up the possibility of modulations at the epigenomic/genomic level. Copyright © 2017 Elsevier B.V. All rights reserved.
solGS: a web-based tool for genomic selection
USDA-ARS's Scientific Manuscript database
Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and statistical complexity, however, is a serious challenge in data management, ana...
Svarovskaia, Evguenia S; Xu, Hongzhan; Mbisa, Jean L; Barr, Rebekah; Gorelick, Robert J; Ono, Akira; Freed, Eric O; Hu, Wei-Shau; Pathak, Vinay K
2004-08-20
Apolipoprotein B mRNA-editing enzyme-catalytic polypeptide-like 3G (APOBEC3G) is a host cytidine deaminase that is packaged into virions and confers resistance to retroviral infection. APOBEC3G deaminates deoxycytidines in minus strand DNA to deoxyuridines, resulting in G to A hypermutation and viral inactivation. Human immunodeficiency virus type 1 (HIV-1) virion infectivity factor counteracts the antiviral activity of APOBEC3G by inducing its proteasomal degradation and preventing virion incorporation. To elucidate the mechanism of viral suppression by APOBEC3G, we developed a sensitive cytidine deamination assay and analyzed APOBEC3G virion incorporation in a series of HIV-1 deletion mutants. Virus-like particles derived from constructs in which pol, env, and most of gag were deleted still contained high levels of cytidine deaminase activity; in addition, coimmunoprecipitation of APOBEC3G and HIV-1 Gag in the presence and absence of RNase A indicated that the two proteins do not interact directly but form an RNase-sensitive complex. Viral particles lacking HIV-1 genomic RNA which were generated from the gag-pol expression constructs pC-Help and pSYNGP packaged APOBEC3G at 30-40% of the wild-type level, indicating that interactions with viral RNA are not necessary for incorporation. In addition, viral particles produced from a nucleocapsid zinc finger mutant contained approximately 1% of the viral genomic RNA but approximately 30% of the cytidine deaminase activity. The reduction in APOBEC3G incorporation was equivalent to the reduction in the total RNA present in the nucleocapsid mutant virions. These results indicate that interactions with viral proteins or viral genomic RNA are not essential for APOBEC3G incorporation and suggest that APOBEC3G interactions with viral and nonviral RNAs that are packaged into viral particles are sufficient for APOBEC3G virion incorporation.
A genomic comparison of two termites with different social complexity
Korb, Judith; Poulsen, Michael; Hu, Haofu; Li, Cai; Boomsma, Jacobus J.; Zhang, Guojie; Liebig, Jürgen
2015-01-01
The termites evolved eusociality and complex societies before the ants, but have been studied much less. The recent publication of the first two termite genomes provides a unique comparative opportunity, particularly because the sequenced termites represent opposite ends of the social complexity spectrum. Zootermopsis nevadensis has simple colonies with totipotent workers that can develop into all castes (dispersing reproductives, nest-inheriting replacement reproductives, and soldiers). In contrast, the fungus-growing termite Macrotermes natalensis belongs to the higher termites and has very large and complex societies with morphologically distinct castes that are life-time sterile. Here we compare key characteristics of genomic architecture, focusing on genes involved in communication, immune defenses, mating biology and symbiosis that were likely important in termite social evolution. We discuss these in relation to what is known about these genes in the ants and outline hypotheses for further testing. PMID:25788900
Continuing Evolution of Burkholderia mallei Through Genome Reduction and Large-Scale Rearrangements
2010-01-22
in Materials and Methods. b NRPS, nonribosomal peptide synthase; PKS, polyketide synthase; RND, resistance nodulation-division like pump. Losada et al...genomics, genome erosion, bacterial virulence. © The Author(s) 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology...creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original
Peacock, D Matthew; Jiang, Quan; Hanley, Patrick S; Cundari, Thomas R; Hartwig, John F
2018-04-11
We report the formation of phosphine-ligated alkylpalladium(II) amido complexes that undergo reductive elimination to form alkyl-nitrogen bonds and a combined experimental and computational investigation of the factors controlling the rates of these reactions. The free-energy barriers to reductive elimination from t-Bu3P-ligated complexes were significantly lower (ca. 3 kcal/mol) than those previously reported from NHC-ligated complexes. The rates of reactions from complexes containing a series of electronically and sterically varied anilido ligands showed that the reductive elimination is slower from complexes of less electron-rich or more sterically hindered anilido ligands than from those containing more electron-rich and less hindered anilido ligands. Reductive elimination of alkylamines also occurred from complexes bearing bidentate P,O ligands. The rates of reactions of these four-coordinate complexes were slower than those for reactions of the three-coordinate, t-Bu3P-ligated complexes. The calculated pathway for reductive elimination from rigid, 2-methoxyarylphosphine-ligated complexes does not involve initial dissociation of the oxygen. Instead, reductive elimination is calculated to occur directly from the four-coordinate complex in concert with a lengthening of the Pd-O bond. To investigate this effect experimentally, a four-coordinate Pd(II) anilido complex containing a flexible, aliphatic linker between the P and O atoms was synthesized. Reductive elimination from this complex was faster than that from the analogous complex containing the more rigid, aryl linker. The flexible linker enables full dissociation of the ether ligand during reductive elimination, leading to the faster reaction of this complex.
Suzuki, Shigekatsu; Endoh, Rikiya; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Hirakawa, Yoshihisa
2018-01-17
Autotrophic eukaryotes have evolved by the endosymbiotic uptake of photosynthetic organisms. Interestingly, many algae and plants have secondarily lost the photosynthetic activity despite its great advantages. Prototheca and Helicosporidium are non-photosynthetic green algae possessing colourless plastids. The plastid genomes of Prototheca wickerhamii and Helicosporidium sp. are highly reduced owing to the elimination of genes related to photosynthesis. To gain further insight into the reductive genome evolution during the shift from a photosynthetic to a heterotrophic lifestyle, we sequenced the plastid and nuclear genomes of two Prototheca species, P. cutis JCM 15793 and P. stagnora JCM 9641, and performed comparative genome analyses among trebouxiophytes. Our phylogenetic analyses using plastid- and nucleus-encoded proteins strongly suggest that independent losses of photosynthesis have occurred at least three times in the clade of Prototheca and Helicosporidium. Conserved gene content among these non-photosynthetic lineages suggests that the plastid and nuclear genomes have convergently eliminated a similar set of photosynthesis-related genes. Other than the photosynthetic genes, significant gene loss and gain were not observed in Prototheca compared to its closest photosynthetic relative Auxenochlorella. Although it remains unclear why loss of photosynthesis occurred in Prototheca, the mixotrophic capability of trebouxiophytes likely made it possible to eliminate photosynthesis.
Vandamme, Peter; Peeters, Charlotte; De Smet, Birgit; Price, Erin P.; Sarovich, Derek S.; Henry, Deborah A.; Hird, Trevor J.; Zlosnik, James E. A.; Mayo, Mark; Warner, Jeffrey; Baker, Anthony; Currie, Bart J.; Carlier, Aurélien
2017-01-01
Four Burkholderia pseudomallei-like isolates of human clinical origin were examined by a polyphasic taxonomic approach that included comparative whole genome analyses. The results demonstrated that these isolates represent a rare and unusual, novel Burkholderia species for which we propose the name B. singularis. The type strain is LMG 28154T (=CCUG 65685T). Its genome sequence has an average mol% G+C content of 64.34%, which is considerably lower than that of other Burkholderia species. The reduced G+C content of strain LMG 28154T was characterized by a genome wide AT bias that was not due to reduced GC-biased gene conversion or reductive genome evolution, but might have been caused by an altered DNA base excision repair pathway. B. singularis can be differentiated from other Burkholderia species by multilocus sequence analysis, MALDI-TOF mass spectrometry and a distinctive biochemical profile that includes the absence of nitrate reduction, a mucoid appearance on Columbia sheep blood agar, and a slowly positive oxidase reaction. Comparisons with publicly available whole genome sequences demonstrated that strain TSV85, an Australian water isolate, also represents the same species and therefore, to date, B. singularis has been recovered from human or environmental samples on three continents. PMID:28932212
Rupakula, Aamani; Kruse, Thomas; Boeren, Sjef; Holliger, Christof; Smidt, Hauke; Maillard, Julien
2013-01-01
Dehalobacter restrictus strain PER-K23 is an obligate organohalide respiring bacterium, which displays extremely narrow metabolic capabilities. It grows only via coupling energy conservation to anaerobic respiration of tetra- and trichloroethene with hydrogen as sole electron donor. Dehalobacter restrictus represents the paradigmatic member of the genus Dehalobacter, which in recent years has turned out to be a major player in the bioremediation of an increasing number of organohalides, both in situ and in laboratory studies. The recent elucidation of the D. restrictus genome revealed a rather elaborate genome with predicted pathways that were not suspected from its restricted metabolism, such as a complete corrinoid biosynthetic pathway, the Wood–Ljungdahl (WL) pathway for CO2 fixation, abundant transcriptional regulators and several types of hydrogenases. However, one important feature of the genome is the presence of 25 reductive dehalogenase genes, from which so far only one, pceA, has been characterized on genetic and biochemical levels. This study describes a multi-level functional genomics approach on D. restrictus across three different growth phases. A global proteomic analysis allowed consideration of general metabolic pathways relevant to organohalide respiration, whereas the dedicated genomic and transcriptomic analysis focused on the diversity, composition and expression of genes associated with reductive dehalogenases. PMID:23479754
Rapid and accurate pyrosequencing of angiosperm plastid genomes
Moore, Michael J; Dhingra, Amit; Soltis, Pamela S; Shaw, Regina; Farmerie, William G; Folta, Kevin M; Soltis, Douglas E
2006-01-01
Background Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. 
More importantly, the high accuracy observed in the GS 20 plastid genome sequence was generated for a significant reduction in time and cost over traditional shotgun-based genome sequencing techniques, although with approximately half the coverage of previously reported GS 20 de novo genome sequence. The GS 20 should be broadly applicable to angiosperm plastid genome sequencing, and therefore promises to expand the scale of plant genetic and phylogenetic research dramatically. PMID:16934154
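The abstract above attributes most GS 20 errors to homopolymer runs and reports error rates as a percentage of compared bases. The following is a minimal Python sketch of how such runs could be located in a sequence and how a percentage error rate could be computed. The function names, the `min_len` threshold default and the toy sequence are illustrative assumptions, not part of the original study's pipeline.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=5):
    """Return (start, base, length) for each run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def error_rate_percent(n_errors, n_bases_compared):
    """Error rate as a percentage of the bases compared."""
    return 100.0 * n_errors / n_bases_compared


# Hypothetical toy sequence: a run of six T and a run of five G.
example = "ACGTTTTTTAGGGGGCAAAT"
print(homopolymer_runs(example))
```

With a validation alignment in hand, each discrepancy could be checked against the returned intervals to estimate the share of errors that fall inside homopolymer runs.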
Karen, Kasey A.; Hearing, Patrick
2011-01-01
Adenovirus has a linear, double-stranded DNA genome that is perceived by the cellular Mre11-Rad50-Nbs1 (MRN) DNA repair complex as a double-strand break. If unabated, MRN elicits a double-strand break repair response that blocks viral DNA replication and ligates the viral genomes into concatemers. There are two sets of early viral proteins that inhibit the MRN complex. The E1B-55K/E4-ORF6 complex recruits an E3 ubiquitin ligase and targets MRN proteins for proteasome-dependent degradation. The E4-ORF3 protein inhibits MRN through sequestration. The mechanism that prevents MRN recognition of the viral genome prior to the expression of these early proteins was previously unknown. Here we show a temporal correlation between the loss of viral core protein VII from the adenovirus genome and a gain of checkpoint signaling due to the double-strand break repair response. While checkpoint signaling corresponds to the recognition of the viral genome, core protein VII binding to and checkpoint signaling at viral genomes are largely mutually exclusive. Transcription is known to release protein VII from the genome, and the inhibition of transcription shows a decrease in checkpoint signaling. Finally, we show that the nuclease activity of Mre11 is dispensable for the inhibition of viral DNA replication during a DNA damage response. These results support a model involving the protection of the incoming viral genome from checkpoint signaling by core protein VII and suggest that the induction of an MRN-dependent DNA damage response may inhibit adenovirus replication by physically masking the origins of DNA replication rather than altering their integrity. PMID:21345950
Hulse-Kemp, Amanda M; Maheshwari, Shamoni; Stoffel, Kevin; Hill, Theresa A; Jaffe, David; Williams, Stephen R; Weisenfeld, Neil; Ramakrishnan, Srividya; Kumar, Vijay; Shah, Preyas; Schatz, Michael C; Church, Deanna M; Van Deynze, Allen
2018-01-01
Linked-Read sequencing technology has recently been employed successfully for de novo assembly of human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5-gigabase (Gb) diploid pepper (Capsicum annuum) genome with a single Linked-Read library. Plant genomes, including pepper, are characterized by long, highly similar repetitive sequences. Accordingly, significant effort is used to ensure that the sequenced plant is highly homozygous and the resulting assembly is a haploid consensus. With a phased assembly approach, we targeted a heterozygous F1 derived from a wide cross to assess the ability to derive both haplotypes and characterize a pungency gene with a large insertion/deletion. The Supernova software generated a highly ordered, more contiguous sequence assembly than all currently available C. annuum reference genomes. Over 83% of the final assembly was anchored and oriented using four publicly available de novo linkage maps. A comparison of the annotation of conserved eukaryotic genes indicated the completeness of assembly. The validity of the phased assembly is further demonstrated with the complete recovery of both 2.5-Kb insertion/deletion haplotypes of the PUN1 locus in the F1 sample that represents pungent and nonpungent peppers, as well as nearly full recovery of the BUSCO2 gene set within each of the two haplotypes. The most contiguous pepper genome assembly to date has been generated, which demonstrates that Linked-Read library technology provides a tool to de novo assemble complex, highly repetitive, heterozygous plant genomes. This technology can provide an opportunity to cost-effectively develop high-quality genome assemblies for other complex plants and compare structural and gene differences through accurate haplotype reconstruction.
Renzong, Q
2001-12-01
A human being or person cannot be reduced to a set of human genes, or to the human genome. Genetic essentialism is wrong because a person, as an entity, must have self-consciousness and a capacity for social interaction, which develop within interpersonal relationships. Genetic determinism is wrong too: the relationship between a gene and a trait is not a linear model of causation but a non-linear one. The human genome is a complex system that functions within the complex system of the human body and the complex systems of the natural and social environment. Genetic determinism also raises the issues of how much responsibility an agent should take for her or his actions and how many degrees of freedom a human being has. Human genome research has raised several conceptual issues. Can we call a gene 'good' or 'bad', 'superior' or 'inferior'? Is a boy who is found to carry the gene for Huntington's chorea or Alzheimer disease a patient? What should the term 'eugenics' mean? What do terms such as 'gene therapy', 'treatment', 'enhancement' and 'human cloning' mean? Human genome research and its application have raised, and will continue to raise, ethical issues. Can human genome research and its application be used for eugenics, or only for the treatment and prevention of diseases? Must the principle of informed consent/choice be upheld in human genome research and its application? How can gene privacy be protected and discrimination on the basis of genes be combated? How can equality between persons, harmony between ethnic groups and peace between countries be promoted? How can a fair, just, equal and equitable relationship between developing and developed countries be established with regard to human genome research and its application?
2010-01-01
Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788
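The abstract above describes an EM algorithm in which only a proportion of SNP carry an effect and a posterior probability is computed for each SNP. The sketch below is not emBayesB itself. It is a deliberately simplified illustration, under stated assumptions, of the same idea: EM on a two-component mixture in which each estimated SNP effect is either pure noise or noise plus a real effect. The normal distribution for real effects, the known common `noise_var`, and every name in the code are assumptions made for this example only.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance var at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def em_mixture(effects, noise_var, n_iter=200):
    """EM for effects ~ pi * N(0, noise_var + slab_var) + (1 - pi) * N(0, noise_var).

    Returns (pi, slab_var, post), where post[j] is the posterior probability
    that effect j comes from the non-null component.
    """
    pi, slab_var = 0.5, 1.0  # arbitrary starting values (assumption)
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of the non-null component per SNP.
        post = []
        for b in effects:
            alt = pi * normal_pdf(b, noise_var + slab_var)
            null = (1.0 - pi) * normal_pdf(b, noise_var)
            post.append(alt / (alt + null))
        # M-step: update the mixing proportion and the effect variance.
        total = sum(post)
        pi = min(max(total / len(effects), 1e-6), 1.0 - 1e-6)
        weighted = sum(p * b * b for p, b in zip(post, effects)) / total
        slab_var = max(weighted - noise_var, 1e-8)
    return pi, slab_var, post


# Hypothetical toy data: 90 null SNP and 10 SNP with a large effect.
toy_effects = [0.0] * 90 + [2.0, -2.0] * 5
pi_hat, slab_var_hat, post = em_mixture(toy_effects, noise_var=0.01)
print(round(pi_hat, 3), round(slab_var_hat, 3))
```

In this toy setting the posterior probabilities separate the ten large effects from the null SNP. A full treatment would fit all SNP jointly with the mixture prior and hyperparameter estimation that the abstract describes.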
Tolerance of Sir1p/Origin Recognition Complex-Dependent Silencing for Enhanced Origin Firing at HMRa
McConnell, Kristopher H.; Müller, Philipp; Fox, Catherine A.
2006-01-01
The HMR-E silencer is a DNA element that directs the formation of silent chromatin at the HMRa locus in Saccharomyces cerevisiae. Sir1p is one of four Sir proteins required for silent chromatin formation at HMRa. Sir1p functions by binding the origin recognition complex (ORC), which binds to HMR-E, and recruiting the other Sir proteins (Sir2p to -4p). ORCs also bind to hundreds of nonsilencer positions distributed throughout the genome, marking them as replication origins, the sites for replication initiation. HMR-E also acts as a replication origin, but compared to many origins in the genome, it fires extremely inefficiently and late during S phase. One postulate to explain this observation is that ORC's role in origin firing is incompatible with its role in binding Sir1p and/or the formation of silent chromatin. Here we examined a mutant HMR-E silencer and fusions between robust replication origins and HMR-E for HMRa silencing, origin firing, and replication timing. Origin firing within HMRa and from the HMR-E silencer itself could be significantly enhanced, and the timing of HMRa replication during an otherwise normal S phase advanced, without a substantial reduction in SIR1-dependent silencing. However, although the robust origin/silencer fusions silenced HMRa quite well, they were measurably less effective than a comparable silencer containing HMR-E's native ORC binding site. PMID:16479013
Collins, Ryan L; Brand, Harrison; Redin, Claire E; Hanscom, Carrie; Antolik, Caroline; Stone, Matthew R; Glessner, Joseph T; Mason, Tamara; Pregno, Giulia; Dorrani, Naghmeh; Mandrile, Giorgia; Giachino, Daniela; Perrin, Danielle; Walsh, Cole; Cipicchio, Michelle; Costello, Maura; Stortchevoi, Alexei; An, Joon-Yong; Currall, Benjamin B; Seabra, Catarina M; Ragavendran, Ashok; Margolin, Lauren; Martinez-Agosto, Julian A; Lucente, Diane; Levy, Brynn; Sanders, Stephan J; Wapner, Ronald J; Quintero-Rivera, Fabiola; Kloosterman, Wigard; Talkowski, Michael E
2017-03-06
Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies. We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV. These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.
Luan, Guodong; Bao, Guanhui; Lin, Zhao; Li, Yang; Chen, Zugen; Li, Yin; Cai, Zhen
2015-12-25
Heat tolerance of microbes is of great importance for efficient biorefinery and bioconversion. However, engineering and understanding microbial heat tolerance remain difficult and incomplete because it is a complex physiological trait that probably correlates with all gene functions, genetic regulations, and cellular metabolisms and activities. In this work, a novel strain engineering approach named Genome Replication Engineering Assisted Continuous Evolution (GREACE) was employed to improve the heat tolerance of Escherichia coli. When the E. coli strain carrying a mutator was cultivated under gradually increasing temperature, genome-wide mutations were continuously generated during genome replication and the mutated strains with improved thermotolerance were autonomously selected. A thermotolerant strain, HR50, capable of growing at 50°C on LB agar plates was obtained within two months, demonstrating the efficiency of GREACE in improving such a complex physiological trait. To understand the improved heat tolerance, the genomes of HR50 and its wildtype strain DH5α were sequenced. A total of 361 evenly distributed mutations covering all mutation types were found in HR50. Closed material transportations, loose genome conformation, and possibly altered cell wall structure and transcription pattern were the main differences of HR50 compared with DH5α, which were speculated to be responsible for the improved heat tolerance. This work not only expands our understanding of microbial heat tolerance but also emphasizes that the in vivo continuous genome mutagenesis method, GREACE, is efficient in improving complex microbial physiological traits. Copyright © 2015 Elsevier B.V. All rights reserved.
Nazarian, Alireza; Gezan, Salvador A
2016-03-01
The study of genetic architecture of complex traits has been dramatically influenced by implementing genome-wide analytical approaches during recent years. Of particular interest are genomic prediction strategies which make use of genomic information for predicting phenotypic responses instead of detecting trait-associated loci. In this work, we present the results of a simulation study to improve our understanding of the statistical properties of estimation of genetic variance components of complex traits, and of additive, dominance, and genetic effects through best linear unbiased prediction methodology. Simulated dense marker information was used to construct genomic additive and dominance matrices, and multiple alternative pedigree- and marker-based models were compared to determine if including a dominance term into the analysis may improve the genetic analysis of complex traits. Our results showed that a model containing a pedigree- or marker-based additive relationship matrix along with a pedigree-based dominance matrix provided the best partitioning of genetic variance into its components, especially when some degree of true dominance effects was expected to exist. Also, we noted that the use of a marker-based additive relationship matrix along with a pedigree-based dominance matrix had the best performance in terms of accuracy of correlations between true and estimated additive, dominance, and genetic effects. © The American Genetic Association 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
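The abstract above refers to genomic additive and dominance matrices built from dense marker data but does not say how they were constructed. The sketch below shows one widely used construction as an assumption: genotypes coded 0/1/2 as counts of a reference allele, additive codes centered by twice the allele frequency, and dominance codes centered so that their expectation is zero under Hardy-Weinberg proportions. All function names and the toy genotype table are hypothetical.

```python
def allele_freqs(genotypes):
    """Reference-allele frequency per marker from 0/1/2 genotype codes."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]


def cross_product(mat, scale):
    """Return mat * mat^T / scale as a list of lists."""
    n = len(mat)
    return [[sum(a * b for a, b in zip(mat[i], mat[k])) / scale
             for k in range(n)] for i in range(n)]


def additive_matrix(genotypes):
    """G = Z Z^T / (2 * sum p q), with Z = genotype code - 2p."""
    p = allele_freqs(genotypes)
    z = [[x - 2.0 * pj for x, pj in zip(row, p)] for row in genotypes]
    scale = sum(2.0 * pj * (1.0 - pj) for pj in p)
    return cross_product(z, scale)


def dominance_code(x, pj):
    """Dominance code for genotype x (count of reference allele)."""
    q = 1.0 - pj
    if x == 2:
        return -2.0 * q * q
    if x == 1:
        return 2.0 * pj * q
    return -2.0 * pj * pj


def dominance_matrix(genotypes):
    """D = W W^T / sum (2 p q)^2, with W from dominance_code."""
    p = allele_freqs(genotypes)
    w = [[dominance_code(x, pj) for x, pj in zip(row, p)] for row in genotypes]
    scale = sum((2.0 * pj * (1.0 - pj)) ** 2 for pj in p)
    return cross_product(w, scale)


# Hypothetical toy data: 4 individuals by 3 markers, no monomorphic marker.
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
G = additive_matrix(toy)
D = dominance_matrix(toy)
print(round(G[0][0], 3), round(D[0][0], 3))
```

Both matrices could then serve as covariance structures for additive and dominance effects in a mixed model. The sketch assumes at least one polymorphic marker, since both scaling terms would otherwise be zero.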
Comparative Single-Cell Genomics of Chloroflexi from the Okinawa Trough Deep-Subsurface Biosphere.
Fullerton, Heather; Moyer, Craig L
2016-05-15
Chloroflexi small-subunit (SSU) rRNA gene sequences are frequently recovered from subseafloor environments, but the metabolic potential of the phylum is poorly understood. The phylum Chloroflexi is represented by isolates with diverse metabolic strategies, including anoxic phototrophy, fermentation, and reductive dehalogenation; therefore, function cannot be attributed to these organisms based solely on phylogeny. Single-cell genomics can provide metabolic insights into uncultured organisms, like the deep-subsurface Chloroflexi. Nine SSU rRNA gene sequences were identified from single-cell sorts of whole-round core material collected from the Okinawa Trough at Iheya North hydrothermal field as part of Integrated Ocean Drilling Program (IODP) expedition 331 (Deep Hot Biosphere). Previous studies of subsurface Chloroflexi single amplified genomes (SAGs) suggested heterotrophic or lithotrophic metabolisms and provided no evidence for growth by reductive dehalogenation. Our nine Chloroflexi SAGs (seven of which are from the order Anaerolineales) indicate that, in addition to genes for the Wood-Ljungdahl pathway, exogenous carbon sources can be actively transported into cells. At least one subunit for pyruvate ferredoxin oxidoreductase was found in four of the Chloroflexi SAGs. This protein can provide a link between the Wood-Ljungdahl pathway and other carbon anabolic pathways. Finally, one of the seven Anaerolineales SAGs contains a distinct reductive dehalogenase homologous (rdhA) gene. Through the use of single amplified genomes (SAGs), we have extended the metabolic potential of an understudied group of subsurface microbes, the Chloroflexi. These microbes are frequently detected in the subsurface biosphere, though their metabolic capabilities have remained elusive. 
In contrast to previously examined Chloroflexi SAGs, our genomes (several are from the order Anaerolineales) were recovered from a hydrothermally driven system and therefore provide a unique window into the metabolic potential of this type of habitat. In addition, a reductive dehalogenase gene (rdhA) has been directly linked to marine subsurface Chloroflexi, suggesting that reductive dehalogenation is not limited to the class Dehalococcoidia. This discovery expands the nutrient-cycling and metabolic potential present within the deep subsurface and provides functional gene information relating to this enigmatic group. Copyright © 2016 Fullerton and Moyer.
Decoding the complex genetic causes of heart diseases using systems biology.
Djordjevic, Djordje; Deshpande, Vinita; Szczesnik, Tomasz; Yang, Andrian; Humphreys, David T; Giannoulatou, Eleni; Ho, Joshua W K
2015-03-01
The pace of disease gene discovery is still much slower than expected, even with the use of cost-effective DNA sequencing and genotyping technologies. It is increasingly clear that many inherited heart diseases have a more complex polygenic aetiology than previously thought. Understanding the role of gene-gene interactions, epigenetics, and non-coding regulatory regions is becoming increasingly critical in predicting the functional consequences of genetic mutations identified by genome-wide association studies and whole-genome or exome sequencing. A systems biology approach is now being widely employed to systematically discover genes that are involved in heart diseases in humans or relevant animal models through bioinformatics. The overarching premise is that the integration of high-quality causal gene regulatory networks (GRNs), genomics, epigenomics, transcriptomics and other genome-wide data will greatly accelerate the discovery of the complex genetic causes of congenital and complex heart diseases. This review summarises state-of-the-art genomic and bioinformatics techniques that are used in accelerating the pace of disease gene discovery in heart diseases. Accompanying this review, we provide an interactive web-resource for systems biology analysis of mammalian heart development and diseases, CardiacCode ( http://CardiacCode.victorchang.edu.au/ ). CardiacCode features a dataset of over 700 pieces of manually curated genetic or molecular perturbation data, which enables the inference of a cardiac-specific GRN of 280 regulatory relationships between 33 regulator genes and 129 target genes. We believe this growing resource will fill an urgent unmet need to fully realise the true potential of predictive and personalised genomic medicine in tackling human heart disease.
Local Genetic Correlation Gives Insights into the Shared Genetic Architecture of Complex Traits.
Shi, Huwenbo; Mancuso, Nicholas; Spendlove, Sarah; Pasaniuc, Bogdan
2017-11-02
Although genetic correlations between complex traits provide valuable insights into epidemiological and etiological studies, a precise quantification of which genomic regions disproportionately contribute to the genome-wide correlation is currently lacking. Here, we introduce ρ-HESS, a technique to quantify the correlation between pairs of traits due to genetic variation at a small region in the genome. Our approach requires GWAS summary data only and makes no distributional assumption on the causal variant effect sizes while accounting for linkage disequilibrium (LD) and overlapping GWAS samples. We analyzed large-scale GWAS summary data across 36 quantitative traits, and identified 25 genomic regions that contribute significantly to the genetic correlation among these traits. Notably, we find 6 genomic regions that contribute to the genetic correlation of 10 pairs of traits that show negligible genome-wide correlation, further showcasing the power of local genetic correlation analyses. Finally, we report the distribution of local genetic correlations across the genome for 55 pairs of traits that show putative causal relationships. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Reducing assembly complexity of microbial genomes with single-molecule sequencing
USDA-ARS's Scientific Manuscript database
Genome assembly algorithms cannot fully reconstruct microbial chromosomes from the DNA reads output by first or second-generation sequencing instruments. Therefore, most genomes are left unfinished due to the significant resources required to manually close gaps left in the draft assemblies. Single-...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Significance of genome-wide association studies in molecular anthropology.
Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal
2009-12-01
The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.
Using genomics to characterize evolutionary potential for conservation of wild populations
Harrisson, Katherine A; Pavlova, Alexandra; Telonis-Scott, Marina; Sunnucks, Paul
2014-01-01
Genomics promises exciting advances towards the important conservation goal of maximizing evolutionary potential, notwithstanding associated challenges. Here, we explore some of the complexity of adaptation genetics and discuss the strengths and limitations of genomics as a tool for characterizing evolutionary potential in the context of conservation management. Many traits are polygenic and can be strongly influenced by minor differences in regulatory networks and by epigenetic variation not visible in DNA sequence. Much of this critical complexity is difficult to detect using methods commonly used to identify adaptive variation, and this needs appropriate consideration when planning genomic screens, and when basing management decisions on genomic data. When the genomic basis of adaptation and future threats are well understood, it may be appropriate to focus management on particular adaptive traits. For more typical conservation scenarios, we argue that screening genome-wide variation should be a sensible approach that may provide a generalized measure of evolutionary potential that accounts for the contributions of small-effect loci and cryptic variation and is robust to uncertainty about future change and required adaptive response(s). The best conservation outcomes should be achieved when genomic estimates of evolutionary potential are used within an adaptive management framework. PMID:25553064
Breast cancer: The translation of big genomic data to cancer precision medicine.
Low, Siew-Kee; Zembutsu, Hitoshi; Nakamura, Yusuke
2018-03-01
Cancer is a complex genetic disease that develops from the accumulation of genomic alterations in which germline variations predispose individuals to cancer and somatic alterations initiate and trigger the progression of cancer. For the past 2 decades, genomic research has advanced remarkably, evolving from single-gene to whole-genome screening by using genome-wide association study and next-generation sequencing that contributes to big genomic data. International collaborative efforts have contributed to curating these data to identify clinically significant alterations that could be used in clinical settings. Focusing on breast cancer, the present review summarizes the identification of genomic alterations with high-throughput screening as well as the use of genomic information in clinical trials that match cancer patients to therapies, which further leads to cancer precision medicine. Furthermore, cancer screening and monitoring were enhanced greatly by the use of liquid biopsies. With the growing data complexity and size, there is much anticipation in exploiting deep machine learning and artificial intelligence to curate integrative "-omics" data to refine the current medical practice to be applied in the near future. © 2017 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.
Alenghat, Theresa; Yu, Jiujiu; Lazar, Mitchell A
2006-01-01
Unliganded thyroid hormone receptor (TR) actively represses transcription via the nuclear receptor corepressor (N-CoR)/histone deacetylase 3 (HDAC3) complex. Although transcriptional activation by liganded receptors involves chromatin remodeling, the role of ATP-dependent remodeling in receptor-mediated repression is unknown. Here we report that SNF2H, the mammalian ISWI chromatin remodeling ATPase, is critical for repression of a genomically integrated, TR-regulated reporter gene. N-CoR and HDAC3 are both required for recruitment of SNF2H to the repressed gene. SNF2H does not interact directly with the N-CoR/HDAC3 complex, but binds to unacetylated histone H4 tails, suggesting that deacetylase activity of the corepressor complex is critical to SNF2H function. Indeed, HDAC3 as well as SNF2H are required for nucleosomal organization on the TR target gene. Consistent with these findings, reduction of SNF2H induces expression of an endogenous TR-regulated gene, dio1, in liver cells. Thus, although not apparent from studies of transiently transfected reporter genes, gene repression by TR involves the targeting of chromatin remodeling factors to repressed genes by the HDAC activity of nuclear receptor corepressors. PMID:16917504
Leichty, Aaron R; Brisson, Dustin
2014-10-01
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which are ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
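The motif-selection idea behind SWGA (priming motifs that are common in the target genome but rare in the background) can be illustrated with a toy k-mer ranking. This is a minimal sketch, not the authors' procedure: the function names, the pseudocount, and the two short sequences are assumptions made purely for illustration, and real primer design would also have to consider factors the abstract does not describe.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA string."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_motifs(target, background, k=4, top=3):
    """Rank k-mers by their per-base frequency in the target relative to the background."""
    t_counts = kmer_counts(target, k)
    b_counts = kmer_counts(background, k)
    scored = []
    for motif, t_n in t_counts.items():
        t_freq = t_n / len(target)
        # Pseudocount of 1 (an assumption) avoids division by zero
        # for motifs that never occur in the background.
        b_freq = (b_counts.get(motif, 0) + 1) / len(background)
        scored.append((t_freq / b_freq, motif))
    scored.sort(reverse=True)
    return [motif for _, motif in scored[:top]]


# Hypothetical toy sequences: an AT-rich "target" and a GC-rich "background".
target = "ATTATTATTATTGCATTATT"
background = "GCGCGGCCGCGCGGCCATGCGCGC"
print(rank_motifs(target, background, k=3, top=2))
```

Motifs that score highest here are the ones a selective primer set would favour in this simplified picture; motifs shared with the background score lowest.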
Grigera, Fernando; Bellacosa, Alfonso; Kenter, Amy L
2013-01-01
Mismatch repair (MMR) safeguards against genomic instability and is required for efficient Ig class switch recombination (CSR). Methyl CpG binding domain protein 4 (MBD4) binds to MutL homologue 1 (MLH1) and controls the post-transcriptional level of several MMR proteins, including MutS homologue 2 (MSH2). We show that in WT B cells activated for CSR, MBD4 is induced and interacts with MMR proteins, thereby implying a role for MBD4 in CSR. However, CSR is in the normal range in Mbd4 deficient mice deleted for exons 2-5 despite concomitant reduction of MSH2. We show by comparison in Msh2(+/-) B cells that a two-fold reduction of MSH2 and MBD4 proteins is correlated with impaired CSR. It is therefore surprising that CSR occurs at normal frequencies in the Mbd4 deficient B cells where MSH2 is reduced. We find that a variant Mbd4 transcript spanning exons 1,6-8 is expressed in Mbd4 deficient B cells. This transcript can be ectopically expressed and produces a truncated MBD4 peptide. Thus, the 3' end of the Mbd4 locus is not silent in Mbd4 deficient B cells and may contribute to CSR. Our findings highlight a complex relationship between MBD4 and MMR proteins in B cells and a potential reconsideration of their role in CSR.
Harhay, Dayna M.; Bono, James L.; Smith, Timothy P. L.; Capik, Sarah F.; DeDonder, Keith D.; Apley, Michael D.; Lubbers, Brian V.; White, Bradley J.; Larson, Robert L.
2017-01-01
Histophilus somni is a fastidious Gram-negative opportunistic pathogenic Pasteurellaceae that affects multiple organ systems and is one of the principal bacterial species contributing to bovine respiratory disease complex (BRDC) in feed yard cattle. Here, we present seven closed genome sequences isolated from three beef calves showing signs of BRDC. PMID:28983006
ERIC Educational Resources Information Center
Connolly, John J.; Glessner, Joseph T.; Hakonarson, Hakon
2013-01-01
Efforts to understand the causes of autism spectrum disorders (ASDs) have been hampered by genetic complexity and heterogeneity among individuals. One strategy for reducing complexity is to target endophenotypes, simpler biologically based measures that may involve fewer genes and constitute a more homogenous sample. A genome-wide association…
Speed congenics: accelerated genome recovery using genetic markers.
Visscher, P M
1999-08-01
Genetic markers throughout the genome can be used to speed up 'recovery' of the recipient genome in the backcrossing phase of the construction of a congenic strain. The prediction of the genomic proportion during backcrossing depends on the assumptions regarding the distribution of chromosome segments, the population structure, the marker spacing and the selection strategy. In this study simulation was used to investigate the rate of recovery of the recipient genome for a mouse, Drosophila and Arabidopsis genome. It was shown that an incorrect assumption of a binomial distribution of chromosome segments, and failing to take account of a reduction in variance in genomic proportion due to selection, can lead to a downward bias of up to two generations in the estimation of the number of generations required for the formation of a congenic strain.
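As a point of reference for the recovery rates discussed above, the textbook expectation without any marker-assisted selection is that each backcross halves the remaining donor share of the genome. The sketch below encodes only that baseline; it is not the simulation used in the study, it ignores the variance and selection effects the abstract emphasises, and the function names and the 99% threshold are illustrative assumptions.

```python
def expected_recipient_fraction(n_backcrosses):
    """Expected recipient genome fraction after n backcrosses of an F1
    to the recipient strain, assuming no marker-based selection."""
    # The F1 (n = 0) carries half of each parental genome.
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def backcrosses_needed(threshold):
    """Smallest number of backcrosses whose expected recipient fraction reaches the threshold."""
    n = 0
    while expected_recipient_fraction(n) < threshold:
        n += 1
    return n


for n in range(4):
    print(n, expected_recipient_fraction(n))
print(backcrosses_needed(0.99))
```

Marker-assisted selection aims to beat this baseline by choosing, in each generation, the progeny that carry the most recipient genome.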
Wang, Xihong; Zheng, Zhuqing; Cai, Yudong; Chen, Ting; Li, Chao; Fu, Weiwei; Jiang, Yu
2017-12-01
The increasing amount of sequencing data available for a wide variety of species can be theoretically used for detecting copy number variations (CNVs) at the population level. However, the growing sample sizes and the divergent complexity of nonhuman genomes challenge the efficiency and robustness of current human-oriented CNV detection methods. Here, we present CNVcaller, a read-depth method for discovering CNVs in population sequencing data. The computational speed of CNVcaller was 1-2 orders of magnitude faster than CNVnator and Genome STRiP for complex genomes with thousands of unmapped scaffolds. CNV detection of 232 goats required only 1.4 days on a single compute node. Additionally, the Mendelian consistency of sheep trios indicated that CNVcaller mitigated the influence of high proportions of gaps and misassembled duplications in the nonhuman reference genome assembly. Furthermore, multiple evaluations using real sheep and human data indicated that CNVcaller achieved the best accuracy and sensitivity for detecting duplications. The fast generalized detection algorithms included in CNVcaller overcome prior computational barriers for detecting CNVs in large-scale sequencing data with complex genomic structures. Therefore, CNVcaller promotes population genetic analyses of functional CNVs in more species. © The Authors 2017. Published by Oxford University Press.
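The general principle of a read-depth method can be sketched in a few lines: read counts per genomic window are scaled against a baseline that is assumed to represent the normal copy number. This is a generic illustration only and not CNVcaller's algorithm; the median baseline, the diploid default, and the depth values are assumptions chosen for the example.

```python
from statistics import median


def copy_number_from_depth(window_depths, ploidy=2):
    """Estimate integer copy number per window by scaling read depth
    against the median depth, which is assumed to reflect normal ploidy."""
    baseline = median(window_depths)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalise")
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Hypothetical per-window read depths: a gain in windows 4-5, a loss in windows 7-8.
depths = [30, 31, 29, 45, 46, 30, 15, 14, 30]
print(copy_number_from_depth(depths))
```

Production tools additionally correct for biases such as GC content and mappability and merge adjacent windows into CNV regions, none of which is attempted here.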
Wang, Xihong; Zheng, Zhuqing; Cai, Yudong; Chen, Ting; Li, Chao; Fu, Weiwei
2017-01-01
Background: The increasing amount of sequencing data available for a wide variety of species can be theoretically used for detecting copy number variations (CNVs) at the population level. However, the growing sample sizes and the divergent complexity of nonhuman genomes challenge the efficiency and robustness of current human-oriented CNV detection methods. Results: Here, we present CNVcaller, a read-depth method for discovering CNVs in population sequencing data. The computational speed of CNVcaller was 1–2 orders of magnitude faster than CNVnator and Genome STRiP for complex genomes with thousands of unmapped scaffolds. CNV detection of 232 goats required only 1.4 days on a single compute node. Additionally, the Mendelian consistency of sheep trios indicated that CNVcaller mitigated the influence of high proportions of gaps and misassembled duplications in the nonhuman reference genome assembly. Furthermore, multiple evaluations using real sheep and human data indicated that CNVcaller achieved the best accuracy and sensitivity for detecting duplications. Conclusions: The fast generalized detection algorithms included in CNVcaller overcome prior computational barriers for detecting CNVs in large-scale sequencing data with complex genomic structures. Therefore, CNVcaller promotes population genetic analyses of functional CNVs in more species. PMID:29220491
Single cell genomic study of dehalogenating Chloroflexi from deep sea sediments of Peruvian Margin
NASA Astrophysics Data System (ADS)
Spormann, A.; Kaster, A.; Meyer-Blackwell, K.; Biddle, J.
2012-12-01
Dehalogenating Chloroflexi, such as Dehalococcoidetes (Dhc), are members of the rare biosphere of deep sea sediments but were originally discovered as the key microbes mediating reductive dehalogenation of the prevalent groundwater contaminants tetrachloroethene and trichloroethene to ethene. Dhc are slow growing, highly niche adapted microbes that are specialized to organohalide respiration as the sole mode of energy conservation. These strictly anaerobic microbes depend on a supporting microbial community to mitigate electron donor and cofactor requirements among other factors. Molecular and genomic studies on the key enzymes for energy conservation, reductive dehalogenases, have provided evidence for rapid adaptive evolution in terrestrial environments. However, the metabolic life style of Dhc in the absence of anthropogenic contaminants, such as in pristine deep sea sediments, is still unknown. In order to provide fundamental insights into life style, genomic population structure and evolution of Dhc, we analyzed a non-contaminated deep sea sediment sample of the Peru Margin 1230 site collected 6 mbsf by a metagenomic and single cell genomic approach. We present for the first time single cell genomic data on dehalogenating Chloroflexi, a significant microbial population in the poorly understood oligotrophic marine sub-surface environments.
Single cell genomic study of dehalogenating Chloroflexi in deep sea sediments of Peru Margin 1230
NASA Astrophysics Data System (ADS)
Kaster, A.; Meyer-Blackwell, K.; Biddle, J.; Spormann, A.
2012-12-01
Dehalogenating Chloroflexi, such as Dehalococcoidetes (Dhc), are members of the rare biosphere of deep sea sediments but were originally discovered as the key microbes mediating reductive dehalogenation of the prevalent groundwater contaminants tetrachloroethene and trichloroethene to ethene. Dhc are slow growing, highly niche adapted microbes that are specialized to organohalide respiration as the sole mode of energy conservation. They are strictly anaerobic microbes that depend on a supporting microbial community for electron donor and cofactor requirements among other factors. Molecular and genomic studies on the key enzymes for energy conservation, reductive dehalogenases, have provided evidence for rapid adaptive evolution in terrestrial environments. However, the metabolic life style of Dhc in the absence of anthropogenic contaminants, such as in pristine deep sea sediments, is still unknown. In order to provide fundamental insights into life style, genomic population structure and evolution of Dhc, we analyzed a non-contaminated deep sea sediment sample of the Peru Margin 1230 site collected 6 mbsf by a metagenomic and single cell genomic approach. We present for the first time single cell genomic data on dehalogenating Chloroflexi, a significant microbial population in the poorly understood oligotrophic marine sub-surface environment.
Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ)
Mascher, Martin; Muehlbauer, Gary J; Rokhsar, Daniel S; Chapman, Jarrod; Schmutz, Jeremy; Barry, Kerrie; Muñoz-Amatriaín, María; Close, Timothy J; Wise, Roger P; Schulman, Alan H; Himmelbach, Axel; Mayer, Klaus FX; Scholz, Uwe; Poland, Jesse A; Stein, Nils; Waugh, Robbie
2013-01-01
Next-generation whole-genome shotgun assemblies of complex genomes are highly useful, but fail to link nearby sequence contigs with each other or provide a linear order of contigs along individual chromosomes. Here, we introduce a strategy based on sequencing progeny of a segregating population that allows de novo production of a genetically anchored linear assembly of the gene space of an organism. We demonstrate the power of the approach by reconstructing the chromosomal organization of the gene space of barley, a large, complex and highly repetitive 5.1 Gb genome. We evaluate the robustness of the new assembly by comparison to a recently released physical and genetic framework of the barley genome, and to various genetically ordered sequence-based genotypic datasets. The method is independent of the need for any prior sequence resources, and will enable rapid and cost-efficient establishment of powerful genomic information for many species. PMID:23998490
Spider genomes provide insight into composition and evolution of venom and silk
Sanggaard, Kristian W.; Bechsgaard, Jesper S.; Fang, Xiaodong; Duan, Jinjie; Dyrlund, Thomas F.; Gupta, Vikas; Jiang, Xuanting; Cheng, Ling; Fan, Dingding; Feng, Yue; Han, Lijuan; Huang, Zhiyong; Wu, Zongze; Liao, Li; Settepani, Virginia; Thøgersen, Ida B.; Vanthournout, Bram; Wang, Tobias; Zhu, Yabing; Funch, Peter; Enghild, Jan J.; Schauser, Leif; Andersen, Stig U.; Villesen, Palle; Schierup, Mikkel H; Bilde, Trine; Wang, Jun
2014-01-01
Spiders are ecologically important predators with complex venom and extraordinarily tough silk that enables capture of large prey. Here we present the assembled genome of the social velvet spider and a draft assembly of the tarantula genome that represent two major taxonomic groups of spiders. The spider genomes are large with short exons and long introns, reminiscent of mammalian genomes. Phylogenetic analyses place spiders and ticks as sister groups supporting polyphyly of the Acari. Complex sets of venom and silk genes/proteins are identified. We find that venom genes evolved by sequential duplication, and that the toxic effect of venom is most likely activated by proteases present in the venom. The set of silk genes reveals a highly dynamic gene evolution, new types of silk genes and proteins, and a novel use of aciniform silk. These insights create new opportunities for pharmacological applications of venom and biomaterial applications of silk. PMID:24801114
Integrative Analysis of Complex Cancer Genomics and Clinical Profiles Using the cBioPortal
Gao, Jianjiong; Aksoy, Bülent Arman; Dogrusoz, Ugur; Dresdner, Gideon; Gross, Benjamin; Sumer, S. Onur; Sun, Yichao; Jacobsen, Anders; Sinha, Rileen; Larsson, Erik; Cerami, Ethan; Sander, Chris; Schultz, Nikolaus
2014-01-01
The cBioPortal for Cancer Genomics (http://cbioportal.org) provides a Web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. The portal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface combined with customized data storage enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes. The portal provides graphical summaries of gene-level data from multiple platforms, network visualization and analysis, survival analysis, patient-centric queries, and software programmatic access. The intuitive Web interface of the portal makes complex cancer genomics profiles accessible to researchers and clinicians without requiring bioinformatics expertise, thus facilitating biological discoveries. Here, we provide a practical guide to the analysis and visualization features of the cBioPortal for Cancer Genomics. PMID:23550210
Kawakami, Shin-ichi; Ebana, Kaworu; Nishikawa, Tomotaro; Sato, Yo-ichiro; Vaughan, Duncan A; Kadowaki, Koh-ichi
2007-02-01
Two hundred and seventy-five accessions of cultivated Asian rice and 44 accessions of AA genome Oryza species were classified into 8 chloroplast (cp) genome types (A-H) based on insertion-deletion events at 3 regions (8K, 57K, and 76K) of the cp genome. The ancestral cp genome type was determined according to the frequency of occurrence in Oryza species and the likely evolution of the variable 57K region of the cp genome. When 2 nucleotide substitutions (AA or TT) were taken into account, these 8 cp types were subdivided into 11 cp types. Most indica cultivars had 1 of 3 cp genome types that were also identified in the wild relatives of rice, O. nivara and O. rufipogon, suggesting that the 3 indica cp types had evolved from distinct gene pools of the O. rufipogon - O. nivara complex. The majority of japonica cultivars had 1 of 3 different cp genome types. One of these 3 was identified in O. rufipogon, suggesting that at least 1 japonica type is derived from O. rufipogon with the same cp genome type. These results provide evidence to support a polyphyletic origin of cultivated Asian rice from at least 4 principal lineages in the O. rufipogon - O. nivara complex.
Piras, Bryan A; O'Connor, Daniel M; French, Brent A
2013-01-01
AAV9 is a powerful gene delivery vehicle capable of providing long-term gene expression in a variety of cell types, particularly cardiomyocytes. The use of AAV-delivery for RNA interference is an intense area of research, but a comprehensive analysis of knockdown in cardiac and liver tissues after systemic delivery of AAV9 has yet to be reported. We sought to address this question by using AAV9 to deliver a short-hairpin RNA targeting the enhanced green fluorescent protein (GFP) in transgenic mice that constitutively overexpress GFP in all tissues. The expression cassette was initially tested in vitro and we demonstrated a 61% reduction in mRNA and a 90% reduction in GFP protein in dual-transfected 293 cells. Next, the expression cassette was packaged as single-stranded genomes in AAV9 capsids to test cardiac GFP knockdown with several doses ranging from 1.8×10^10 to 1.8×10^11 viral genomes per mouse and a dose-dependent response was obtained. We then analyzed GFP expression in both heart and liver after delivery of 4.4×10^11 viral genomes per mouse. We found that while cardiac knockdown was highly efficient, with a 77% reduction in GFP mRNA and a 71% reduction in protein versus control-treated mice, there was no change in liver expression. This was despite a 4.5-fold greater number of viral genomes in the liver than in the heart. This study demonstrates that single-stranded AAV9 vectors expressing shRNA can be used to achieve highly efficient cardiac-selective knockdown of GFP expression that is sustained for at least 7 weeks after the systemic injection of 8-day-old mice, with no change in liver expression and no evidence of liver damage despite high viral genome presence in the liver.
Kernel methods for large-scale genomic data analysis
Xing, Eric P.; Schaid, Daniel J.
2015-01-01
Machine learning, particularly kernel methods, has been demonstrated as a promising new tool to tackle the challenges imposed by today’s explosive data growth in genomics. Kernel methods provide a practical and principled approach to learning how a large number of genetic variants are associated with complex phenotypes, to help reveal the complexity in the relationship between the genetic markers and the outcome of interest. In this review, we highlight the potential key role they will have in modern genomic data processing, especially with regard to integration with classical methods for gene prioritizing, prediction and data fusion. PMID:25053743
Surviving an Identity Crisis: A Revised View of Chromatin Insulators in the Genomics Era
Matzat, Leah H.; Lei, Elissa P.
2013-01-01
The control of complex, developmentally regulated loci and partitioning of the genome into active and silent domains is in part accomplished through the activity of DNA-protein complexes termed chromatin insulators. Together, the multiple, well-studied classes of insulators in Drosophila melanogaster appear to be generally functionally conserved. In this review, we discuss recent genomic-scale experiments and attempt to reconcile these newer findings in the context of previously defined insulator characteristics based on classical genetic analyses and transgenic approaches. Finally, we discuss the emerging understanding of mechanisms of chromatin insulator regulation. PMID:24189492
Soler-Bistué, Alfonso; Mondotte, Juan A.; Bland, Michael Jason; Val, Marie-Eve; Saleh, María-Carla; Mazel, Didier
2015-01-01
The effects on cell physiology of gene order within the bacterial chromosome are poorly understood. In silico approaches have shown that genes involved in transcription and translation processes, in particular ribosomal protein (RP) genes, localize near the replication origin (oriC) in fast-growing bacteria suggesting that such a positional bias is an evolutionarily conserved growth-optimization strategy. Such genomic localization could either provide a higher dosage of these genes during fast growth or facilitate the assembly of ribosomes and transcription foci by keeping physically close the many components of these macromolecular machines. To explore this, we used novel recombineering tools to create a set of Vibrio cholerae strains in which S10-spec-α (S10), a locus bearing half of the ribosomal protein genes, was systematically relocated to alternative genomic positions. We show that the relative distance of S10 to the origin of replication tightly correlated with a reduction of S10 dosage, mRNA abundance and growth rate within these otherwise isogenic strains. Furthermore, this was accompanied by a significant reduction in the host-invasion capacity in Drosophila melanogaster. Both phenotypes were rescued in strains bearing two S10 copies highly distal to oriC, demonstrating that replication-dependent gene dosage reduction is the main mechanism behind these alterations. Hence, S10 positioning connects genome structure to cell physiology in Vibrio cholerae. Our results show experimentally for the first time that genomic positioning of genes involved in the flux of genetic information conditions global growth control and hence bacterial physiology and potentially its evolution. PMID:25875621
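The replication-dependent dosage effect described above can be related to the classical Cooper-Helmstetter picture of bacterial replication, in which a locus farther from oriC is, on average, present in fewer copies during fast growth. The sketch below uses that textbook relation and is not taken from the study; the C period and doubling time are illustrative placeholder values, not measurements for Vibrio cholerae, and the function name is hypothetical.

```python
def relative_dosage(m, c_period_min, doubling_time_min):
    """Average copy number of a locus relative to an origin-proximal locus.

    m is the fractional distance from the origin (0.0) to the terminus (1.0)
    along a replichore; the relation 2 ** (-C * m / tau) follows from the
    Cooper-Helmstetter model of chromosome replication.
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie between 0 (origin) and 1 (terminus)")
    return 2.0 ** (-c_period_min * m / doubling_time_min)


# Illustrative parameters (assumptions): 40 min C period, 20 min doubling time.
for position in (0.0, 0.5, 1.0):
    print(position, relative_dosage(position, 40, 20))
```

Under these placeholder values a locus at the terminus has a quarter of the dosage of one at the origin, and the gap shrinks as growth slows, which is consistent with the dosage effect being most relevant during fast growth.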
Researchers from British Columbia Cancer Agency used whole genome sequencing to analyze 40 DLBCL cases and 13 cell lines in order to fill in the gaps of the complex landscape of DLBCL genomes. Their analysis, “Mutational and structural analysis of diffuse large B-cell lymphoma using whole genome sequencing,” was published online in Blood on May 22. The authors are Ryan Morin, Marco Marra, and colleagues.
Azolla--a model organism for plant genomic studies.
Qiu, Yin-Long; Yu, Jun
2003-02-01
The aquatic ferns of the genus Azolla are nitrogen-fixing plants that have great potential in agricultural production and environmental conservation. Azolla is in many respects qualified to serve as a model organism for genomic studies because of its importance in agriculture, its unique position in plant evolution, its symbiotic relationship with the N2-fixing cyanobacterium Anabaena azollae, and its moderate-sized genome. The goals of this genome project are not only to understand the biology of the Azolla genome, and so promote its applications in biological research and agricultural practice, but also to gain critical insights into the evolution of plant genomes. Given strategic and technical improvements and the falling cost of DNA sequencing, the deciphering of its genetic code is imminent.
Karanyicz, Edina; Antunovics, Zsuzsa; Kallai, Z; Sipiczki, M
2017-06-01
Saccharomyces strains with chimerical genomes consisting of mosaics of the genomes of different species ("natural hybrids") occur quite frequently among industrial and wine strains. The most widely endorsed hypothesis is that the mosaics are introgressions acquired via hybridisation and repeated backcrosses of the hybrids with one of the parental species. However, the interspecies hybrids are sterile, unable to mate with their parents. Here, we show by analysing synthetic Saccharomyces kudriavzevii x Saccharomyces uvarum hybrids that mosaic (chimeric) genomes can arise without introgressive backcrosses. These species are biologically separated by a double sterility barrier (sterility of allodiploids and F1 sterility of allotetraploids). F1 sterility is due to the diploidisation of the tetraploid meiosis resulting in MATa/MATα heterozygosity which suppresses mating in the spores. This barrier can occasionally be broken down by malsegregation of autosyndetically paired chromosomes carrying the MAT loci (loss of MAT heterozygosity). Subsequent malsegregation of additional autosyndetically paired chromosomes and occasional allosyndetic interactions chimerise the hybrid genome. Chromosomes are preferentially lost from the S. kudriavzevii subgenome. The uniparental transmission of the mitochondrial DNA to the hybrids indicates that nucleo-mitochondrial interactions might affect the direction of the genomic changes. We propose the name GARMe (Genome AutoReduction in Meiosis) for this process of genome reduction and chimerisation which involves no introgressive backcrossings. It opens a way to transfer genetic information between species and thus to get one step ahead after hybridisation in the production of yeast strains with beneficial combinations of properties of different species.
Ranade, Sonali Sachin; García-Gil, María Rosario; Rosselló, Josep A
2016-04-01
Many genes have been lost from the prokaryote plastidial genome during the early events of endosymbiosis in eukaryotes. Some of them were definitively lost, but others were relocated and functionally integrated to the host nuclear genomes through serial events of gene transfer during plant evolution. In gymnosperms, plastid genome sequencing has revealed the loss of ndh genes from several species of Gnetales and Pinaceae, including Norway spruce (Picea abies). This study aims to trace the ndh genes in the nuclear and organellar Norway spruce genomes. The plastid genomes of higher plants contain 11 ndh genes which are homologues of mitochondrial genes encoding subunits of the proton-pumping NADH-dehydrogenase (nicotinamide adenine dinucleotide dehydrogenase) or complex I (electron transport chain). Ndh genes encode 11 NDH polypeptides forming the Ndh complex (analogous to complex I) which seems to be primarily involved in chloro-respiration processes. We considered ndh genes from the plastidial genome of four gymnosperms (Cryptomeria japonica, Cycas revoluta, Ginkgo biloba, Podocarpus totara) and a single angiosperm species (Arabidopsis thaliana) to trace putative homologs in the nuclear and organellar Norway spruce genomes using tBLASTn to assess the evolutionary fate of ndh genes in Norway spruce and to address their genomic location(s), structure, integrity and functionality. The results obtained from tBLASTn were subsequently analyzed by performing homology search for finding ndh specific conserved domains using conserved domain search. We report the presence of non-functional plastid ndh gene fragments, excepting ndhE and ndhG genes, in the nuclear genome of Norway spruce. Regulatory transcriptional elements like promoters, TATA boxes and enhancers were detected in the upstream regions of some ndh fragments. We also found transposable elements in the flanking regions of few ndh fragments suggesting nuclear rearrangements in those regions. 
This evidence supports the hypothesis that, at least in Picea, ndh translocations from the plastid to the nuclear genome have occurred, and that functional machinery to accommodate them within a nuclear-encoded environment may have existed at some time during evolution, or that there were attempts to form it.
Maize HapMap2 identifies extant variation from a genome in flux
USDA-ARS's Scientific Manuscript database
The maize genome is the largest, most diverse and complex plant genome sequenced to date. Using high-throughput sequencing to access genetic variation and a population genetics model to score the polymorphisms, we characterize and unite the diversity of the world’s key breeding germplasm, wild rela...
USDA-ARS's Scientific Manuscript database
The Rhipicephalus microplus genome is large and complex in structure, making a genome sequence difficult to assemble and costly to resource the required bioinformatics. In light of this, a consortium of international collaborators was formed to pool resources to begin sequencing this genome. We have...
Bioconductor | Informatics Technology for Cancer Research (ITCR)
Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. R/Bioconductor will be enhanced to meet the increasing complexity of multiassay cancer genomics experiments.
NASA Astrophysics Data System (ADS)
Karakatsanis, L. P.; Pavlos, G. P.; Iliopoulos, A. C.; Pavlos, E. G.; Clark, P. M.; Duke, J. L.; Monos, D. S.
2018-09-01
This study combines two independent domains of science, the high throughput DNA sequencing capabilities of Genomics and complexity theory from Physics, to assess the information encoded by the different genomic segments of exonic, intronic and intergenic regions of the Major Histocompatibility Complex (MHC) and identify possible interactive relationships. The dynamic and non-extensive statistical characteristics of two well characterized MHC sequences from the homozygous cell lines, PGF and COX, in addition to two other genomic regions of comparable size, used as controls, have been studied using the reconstructed phase space theorem and the non-extensive statistical theory of Tsallis. The results reveal similar non-linear dynamical behavior as far as complexity and self-organization features. In particular, the low-dimensional deterministic nonlinear chaotic and non-extensive statistical character of the DNA sequences was verified with strong multifractal characteristics and long-range correlations. The nonlinear indices repeatedly verified that MHC sequences, whether exonic, intronic or intergenic include varying levels of information and reveal an interaction of the genes with intergenic regions, whereby the lower the number of genes in a region, the less the complexity and information content of the intergenic region. Finally we showed the significance of the intergenic region in the production of the DNA dynamics. The findings reveal interesting content information in all three genomic elements and interactive relationships of the genes with the intergenic regions. The results most likely are relevant to the whole genome and not only to the MHC. These findings are consistent with the ENCODE project, which has now established that the non-coding regions of the genome remain to be of relevance, as they are functionally important and play a significant role in the regulation of expression of genes and coordination of the many biological processes of the cell.
Freua, Mateus Castelani; Santana, Miguel Henrique de Almeida; Ventura, Ricardo Vieira; Tedeschi, Luis Orlindo; Ferraz, José Bento Sterman
2017-08-01
The interplay between dynamic models of biological systems and genomics is based on the assumption that genetic variation of the complex trait (i.e., outcome of model behavior) arises from component traits (i.e., model parameters) in lower hierarchical levels. In order to provide a proof of concept of this statement for a cattle growth model, we ask whether model parameters map genomic regions that harbor quantitative trait loci (QTLs) already described for the complex trait. We conducted a genome-wide association study (GWAS) with a Bayesian hierarchical LASSO method in two parameters of the Davis Growth Model, a system of three ordinary differential equations describing DNA accretion, protein synthesis and degradation, and fat synthesis. Phenotypic and genotypic data were available for 893 Nellore (Bos indicus) cattle. Computed values were 0.005 ± 0.003 for parameter k1 (DNA accretion rate) and 0.134 ± 0.024 for α (constant for energy for maintenance requirement). The expected biological interpretation of the parameters is confirmed by QTLs mapped for k1 and α. QTLs within genomic regions mapped for k1 are expected to be correlated with the DNA pool: body size and weight. Single nucleotide polymorphisms (SNPs) which were significant for α mapped QTLs that had already been associated with residual feed intake, feed conversion ratio, average daily gain (ADG), body weight, and also dry matter intake. SNPs identified for k1 were able to additionally explain 2.2% of the phenotypic variability of the complex ADG, even when SNPs for k1 did not match the genomic regions associated with ADG. Although improvements are needed, our findings suggest that genomic analysis on component traits may help to uncover the genetic basis of more complex traits, particularly when lower biological hierarchies are mechanistically described by mathematical simulation models.
Zhang, Jin; Ruhlman, Tracey A.; Sabir, Jamal S. M.; Blazier, John Chris; Weng, Mao-Lun; Park, Seongjun; Jansen, Robert K.
2016-01-01
Disruption of DNA replication, recombination, and repair (DNA-RRR) systems has been hypothesized to cause highly elevated nucleotide substitution rates and genome rearrangements in the plastids of angiosperms, but this theory remains untested. To investigate nuclear–plastid genome (plastome) coevolution in Geraniaceae, four different measures of plastome complexity (rearrangements, repeats, nucleotide insertions/deletions, and substitution rates) were evaluated along with substitution rates of 12 nuclear-encoded, plastid-targeted DNA-RRR genes from 27 Geraniales species. Significant correlations were detected for nonsynonymous (dN) but not synonymous (dS) substitution rates for three DNA-RRR genes (uvrB/C, why1, and gyrA) supporting a role for these genes in accelerated plastid genome evolution in Geraniaceae. Furthermore, correlation between dN of uvrB/C and plastome complexity suggests the presence of nucleotide excision repair system in plastids. Significant correlations were also detected between plastome complexity and 13 of the 90 nuclear-encoded organelle-targeted genes investigated. Comparisons revealed significant acceleration of dN in plastid-targeted genes of Geraniales relative to Brassicales suggesting this correlation may be an artifact of elevated rates in this gene set in Geraniaceae. Correlation between dN of plastid-targeted DNA-RRR genes and plastome complexity supports the hypothesis that the aberrant patterns in angiosperm plastome evolution could be caused by dysfunction in DNA-RRR systems. PMID:26893456
Raguideau, Sébastien; Plancade, Sandra; Pons, Nicolas; Leclerc, Marion; Laroche, Béatrice
2016-12-01
Whole Genome Shotgun (WGS) metagenomics is increasingly used to study the structure and functions of complex microbial ecosystems, both from the taxonomic and functional point of view. Gene inventories of otherwise uncultured microbial communities make the direct functional profiling of microbial communities possible. The concept of community aggregated trait has been adapted from environmental and plant functional ecology to the framework of microbial ecology. Community aggregated traits are quantified from WGS data by computing the abundance of relevant marker genes. They can be used to study key processes at the ecosystem level and correlate environmental factors and ecosystem functions. In this paper we propose a novel model based approach to infer combinations of aggregated traits characterizing specific ecosystemic metabolic processes. We formulate a model of these Combined Aggregated Functional Traits (CAFTs) accounting for a hierarchical structure of genes, which are associated on microbial genomes, further linked at the ecosystem level by complex co-occurrences or interactions. The model is completed with constraints specifically designed to exploit available genomic information, in order to favor biologically relevant CAFTs. The CAFTs structure, as well as their intensity in the ecosystem, is obtained by solving a constrained Non-negative Matrix Factorization (NMF) problem. We developed a multicriteria selection procedure for the number of CAFTs. We illustrated our method on the modelling of ecosystemic functional traits of fiber degradation by the human gut microbiota. We used 1408 samples of gene abundances from several high-throughput sequencing projects and found that four CAFTs only were needed to represent the fiber degradation potential. This data reduction highlighted biologically consistent functional patterns while providing a high quality preservation of the original data. 
Our method is generic and can be applied to other metabolic processes in the gut or in other ecosystems.
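To make the factorization step in this abstract concrete, the following is a minimal sketch of plain, unconstrained Non-negative Matrix Factorization with the classic multiplicative updates, written in pure Python. It does not implement the authors' genomic constraints or their multicriteria choice of the number of CAFTs; the toy abundance matrix, the function names and the rank are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(v, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative V (samples x genes) as W (samples x rank) times H (rank x genes)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # Multiplicative update of H (trait composition over genes).
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        # Multiplicative update of W (trait intensity per sample).
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h


def reconstruction_error(v, w, h):
    """Squared Frobenius distance between V and W times H."""
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))


# Toy data: four samples, four marker genes, generated from two hidden traits.
TOY_ABUNDANCES = [
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [1.0, 1.0, 1.0, 1.0],
    [4.0, 4.0, 2.0, 2.0],
]

if __name__ == "__main__":
    w, h = nmf(TOY_ABUNDANCES, rank=2)
    print(reconstruction_error(TOY_ABUNDANCES, w, h))
```

In this reading, rows of H play the role of combined traits and rows of W give their intensity in each sample; the constrained model in the abstract adds structure that this sketch leaves out.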
Russi, Luigi; Marconi, Gianpiero; Sharbel, Timothy F.; Veronesi, Fabio; Albertini, Emidio
2015-01-01
Poa pratensis L. is a forage and turf grass species well adapted to a wide range of mesic to moist habitats. Due to its genome complexity, little is known regarding the evolution, genome composition and intraspecific phylogenetic relationships of this species. In the present study we investigated the morphological and genetic diversity of 33 P. pratensis accessions from 23 different countries using both nuclear and chloroplast molecular markers as well as flow cytometry of somatic tissues, with the aim of shedding light on the genetic diversity and phylogenetic relationships of a collection that includes both cultivated and wild materials. Morphological characterization showed that the most relevant traits able to distinguish cultivated from wild forms were spring growth habit and leaf colour. The genome size analysis revealed high variability both within and between accessions in both wild and cultivated materials. The sequence analysis of the trnL-F chloroplast region revealed a low polymorphism level that could be the result of the complex mode of reproduction of this species. In addition, a strong reduction of chloroplast SSR variability was detected in cultivated materials, where only two alleles were conserved out of the four present in wild accessions. In contrast, high variability exists at the nuclear level in the collection, where the analysis of 11 SSR loci allowed the detection of a total of 91 different alleles. A Bayesian analysis performed on nuclear SSR data revealed that the studied materials belong to two main clusters. While wild materials are equally represented in both clusters, the domesticated forms mostly belong to cluster P2, which is characterized by lower genetic diversity compared to cluster P1. In the Neighbour Joining tree no clear distinction was found between accessions, with the exception of those from China and Mongolia, which were clearly separated from all the others. PMID:25893249
Genomic analyses of bacterial porin-cytochrome gene clusters
Shi, Liang; Fredrickson, James K.; Zachara, John M.
2014-11-26
In this study, the porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of the genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with substrates other than Fe(III) and Mn(IV) oxides.
Tamminen, Manu V; Virta, Marko P J
2015-01-01
Recent progress in environmental microbiology has revealed vast populations of microbes in any given habitat that cannot be detected by conventional culturing strategies. The use of sensitive genetic detection methods such as CARD-FISH and in situ PCR have been limited by the cell wall permeabilization requirement that cannot be performed similarly on all cell types without lysing some and leaving some nonpermeabilized. Furthermore, the detection of low copy targets such as genes present in single copies in the microbial genomes, has remained problematic. We describe an emulsion-based procedure to trap individual microbial cells into picoliter-volume polyacrylamide droplets that provide a rigid support for genetic material and therefore allow complete degradation of cellular material to expose the individual genomes. The polyacrylamide droplets are subsequently converted into picoliter-scale reactors for genome amplification. The amplified genomes are labeled based on the presence of a target gene and differentiated from those that do not contain the gene by flow cytometry. Using the Escherichia coli strains XL1 and MC1061, which differ with respect to the presence (XL1), or absence (MC1061) of a single copy of a tetracycline resistance gene per genome, we demonstrate that XL1 genomes present at 0.1% of MC1061 genomes can be differentiated using this method. Using a spiked sediment microbial sample, we demonstrate that the method is applicable to highly complex environmental microbial communities as a target gene-based screen for individual microbes. The method provides a novel tool for enumerating functional cell populations in complex microbial communities. We envision that the method could be optimized for fluorescence-activated cell sorting to enrich genetic material of interest from complex environmental samples.
Aristidou, Constantia; Theodosiou, Athina; Ketoni, Andria; Bak, Mads; Mehrjouy, Mana M; Tommerup, Niels; Sismani, Carolina
2018-01-01
Precise characterization of apparently balanced complex chromosomal rearrangements in non-affected individuals is crucial as they may result in reproductive failure, recurrent miscarriages or affected offspring. We present a family, where the non-affected father and daughter were found, using FISH and karyotyping, to be carriers of a three-way complex chromosomal rearrangement [t(6;7;10)(q16.2;q34;q26.1), de novo in the father]. The family suffered from two stillbirths, one miscarriage, and has a son with severe intellectual disability. In the present study, the family was revisited using whole-genome mate-pair sequencing. Interestingly, whole-genome mate-pair sequencing revealed a cryptic breakpoint on derivative (der) chromosome 6 rendering the rearrangement even more complex. FISH using a chromosome (chr) 6 custom-designed probe and a chr10 control probe confirmed that the interstitial chr6 segment, created by the two chr6 breakpoints, was translocated onto der(10). Breakpoints were successfully validated with Sanger sequencing, and small imbalances as well as microhomology were identified. Finally, the complex chromosomal rearrangement breakpoints disrupted the SIM1, GRIK2, CNTNAP2, and PTPRE genes without causing any phenotype development. In contrast to the majority of maternally transmitted complex chromosomal rearrangement cases, our study investigated a rare case where a complex chromosomal rearrangement, which most probably resulted from a Type IV hexavalent during the pachytene stage of meiosis I, was stably transmitted from a fertile father to his non-affected daughter. Whole-genome mate-pair sequencing proved highly successful in identifying cryptic complexity, which consequently provided further insight into the meiotic segregation of chromosomes and the increased reproductive risk in individuals carrying the specific complex chromosomal rearrangement. 
We propose that such complex rearrangements should be characterized in detail using a combination of conventional cytogenetic and NGS-based approaches to aid in better prenatal preimplantation genetic diagnosis and counseling in couples with reproductive problems.
The future of microarray technology: networking the genome search.
D'Ambrosio, C; Gatta, L; Bonini, S
2005-10-01
In recent years microarray technology has been increasingly used in both basic and clinical research, providing substantial information for a better understanding of genome-environment interactions responsible for diseases, as well as for their diagnosis and treatment. However, in genomic research using microarray technology there are several unresolved issues, including scientific, ethical and legal issues. Networks of excellence like GA(2)LEN may represent the best approach for teaching, cost reduction, data repositories, and functional studies implementation.
The genome revolution and its role in understanding complex diseases.
Hofker, Marten H; Fu, Jingyuan; Wijmenga, Cisca
2014-10-01
The completion of the human genome sequence in 2003 clearly marked the beginning of a new era for biomedical research. It spurred technological progress that was unprecedented in the life sciences, including the development of high-throughput technologies to detect genetic variation and gene expression. The study of genetics has become "big data science". One of the current goals of genetic research is to use genomic information to further our understanding of common complex diseases. An essential first step made towards this goal was by the identification of thousands of single nucleotide polymorphisms showing robust association with hundreds of different traits and diseases. As insight into common genetic variation has expanded enormously and the technology to identify more rare variation has become available, we can utilize these advances to gain a better understanding of disease etiology. This will lead to developments in personalized medicine and P4 healthcare. Here, we review some of the historical events and perspectives before and after the completion of the human genome sequence. We also describe the success of large-scale genetic association studies and how these are expected to yield more insight into complex disorders. We show how we can now combine gene-oriented research and systems-based approaches to develop more complex models to help explain the etiology of common diseases. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.
A survey of genes encoding H2O2-producing GMC oxidoreductases in 10 Polyporales genomes.
Ferreira, Patricia; Carro, Juan; Serrano, Ana; Martínez, Angel T
2015-01-01
The genomes of three representative Polyporales (Bjerkandera adusta, Phlebia brevispora and a member of the Ganoderma lucidum complex) recently were sequenced to expand our knowledge on the diversity and distribution of genes involved in degradation of plant polymers in this Basidiomycota order, which includes most wood-rotting fungi. Oxidases, including members of the glucose-methanol-choline (GMC) oxidoreductase superfamily, play a central role in the above degradative process because they generate extracellular H2O2 acting as the ultimate oxidizer in both white-rot and brown-rot decay. The survey was completed by analyzing the GMC genes in the available genomes of seven more species to cover the four Polyporales clades. First, an in silico search for sequences encoding members of the aryl-alcohol oxidase, glucose oxidase, methanol oxidase, pyranose oxidase, cellobiose dehydrogenase and pyranose dehydrogenase families was performed. The curated sequences were subjected to an analysis of their evolutionary relationships, followed by estimation of gene duplication/reduction history during fungal evolution. Second, the molecular structures of the near one hundred GMC oxidoreductases identified were modeled to gain insight into their structural variation and expected catalytic properties. In contrast to ligninolytic peroxidases, whose genes are present in all white-rot Polyporales genomes and absent from those of brown-rot species, the H2O2-generating oxidases are widely distributed in both fungal types. This indicates that the GMC oxidases provide H2O2 for both ligninolytic peroxidase activity (in white-rot decay) and Fenton attack on cellulose (in brown-rot decay), after the transition between both decay patterns in Polyporales occurred. © 2015 by The Mycological Society of America.
Qanbari, Saber; Strom, Tim M.; Haberer, Georg; Weigend, Steffen; Gheyas, Almas A.; Turner, Frances; Burt, David W.; Preisinger, Rudolf; Gianola, Daniel; Simianer, Henner
2012-01-01
In most studies aimed at localizing footprints of past selection, outliers at tails of the empirical distribution of a given test statistic are assumed to reflect locus-specific selective forces. Significance cutoffs are subjectively determined, rather than being related to a clear set of hypotheses. Here, we define an empirical p-value for the summary statistic by means of a permutation method that uses the observed SNP structure in the real data. To illustrate the methodology, we applied our approach to a panel of 2.9 million autosomal SNPs identified from re-sequencing a pool of 15 individuals from a brown egg layer line. We scanned the genome for local reductions in heterozygosity, suggestive of selective sweeps. We also employed a modified sliding window approach that accounts for gaps in the sequence and increases scanning resolution by moving the overlapping windows by steps of one SNP only, and suggest to call this a “creeping window” strategy. The approach confirmed selective sweeps in the region of previously described candidate genes, i.e. TSHR, PRL, PRLHR, INSR, LEPR, IGF1, and NRAMP1 when used as positive controls. The genome scan revealed 82 distinct regions with strong evidence of selection (genome-wide p-value<0.001), including genes known to be associated with eggshell structure and immune system such as CALB1 and GAL cluster, respectively. A substantial proportion of signals was found in poor gene content regions including the most extreme signal on chromosome 1. The observation of multiple signals in a highly selected layer line of chicken is consistent with the hypothesis that egg production is a complex trait controlled by many genes. PMID:23209582
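The "creeping window" scan and the permutation-based empirical p-value described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' procedure: the per-SNP heterozygosity values are assumed to be given, the permutation simply shuffles those values across SNP positions, and all function names and parameters are hypothetical.

```python
import random


def creeping_window_scan(het, size):
    """Mean heterozygosity in windows of `size` consecutive SNPs, advancing one SNP per step."""
    return [sum(het[i:i + size]) / size for i in range(len(het) - size + 1)]


def empirical_p_values(het, size, n_perm=200, seed=0):
    """Genome-wide empirical p-value for each window.

    The null distribution is the lowest window mean observed in each of `n_perm`
    random shuffles of the per-SNP values (an assumed, simplified permutation scheme).
    """
    rng = random.Random(seed)
    observed = creeping_window_scan(het, size)
    shuffled = list(het)
    null_minima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_minima.append(min(creeping_window_scan(shuffled, size)))
    # Add-one correction so that a p-value is never exactly zero.
    return [(1 + sum(m <= obs for m in null_minima)) / (n_perm + 1) for obs in observed]


# Toy chromosome: 70 SNPs with a block of reduced heterozygosity at SNPs 30 to 39.
TOY_HET = [0.4] * 30 + [0.02] * 10 + [0.4] * 30

if __name__ == "__main__":
    p_values = empirical_p_values(TOY_HET, size=5)
    candidates = [i for i, p in enumerate(p_values) if p < 0.05]
    print(candidates)
```

Windows lying entirely inside the low-heterozygosity block receive the smallest p-values, which mirrors how a local reduction in heterozygosity is flagged as a candidate sweep.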
Postberg, Jan; Jönsson, Franziska; Weil, Patrick Philipp; Bulic, Aneta; Juranek, Stefan Andreas; Lipps, Hans-Joachim
2018-06-12
During sexual reproduction in the unicellular ciliate Stylonychia somatic macronuclei differentiate from germline micronuclei. Thereby, programmed sequence reduction takes place, leading to the elimination of > 95% of germline sequences, which priorly adopt heterochromatin structure via H3K27me3. Simultaneously, 27nt-ncRNAs become synthesized from parental transcripts and are bound by the Argonaute protein PIWI1. These 27nt-ncRNAs cover sequences destined to the developing macronucleus and are thought to protect them from degradation. We provide evidence and propose that RNA/DNA base-pairing guides PIWI1/27nt-RNA complexes to complementary macronucleus-destined DNA target sequences, hence transiently causing locally stalled replication during polytene chromosome formation. This spatiotemporal delay enables the selective deposition of temporarily available histone H3.4K27me3 nucleosomes at all other sequences being continuously replicated, thus dictating their prospective heterochromatin structure before becoming developmentally eliminated. Concomitantly, 27nt-RNA-covered sites remain protected. We introduce the concept of 'RNA-induced DNA replication interference' and explain how the parental functional genome partition could become transmitted to the progeny.
Wernegreen, Jennifer J
2017-09-15
Ancient associations between insects and bacteria provide models to study intimate host-microbe interactions. Currently, a wealth of genome sequence data for long-term, obligately intracellular (primary) endosymbionts of insects reveals profound genomic consequences of this specialized bacterial lifestyle. Those consequences include severe genome reduction and extreme base compositions. This minireview highlights the utility of genome sequence data to understand how, and why, endosymbionts have been pushed to such extremes, and to illuminate the functional consequences of such extensive genome change. While the static snapshots provided by individual endosymbiont genomes are valuable, comparative analyses of multiple genomes have shed light on evolutionary mechanisms. Namely, genome comparisons have told us that selection is important in fine-tuning gene content, but at the same time, mutational pressure and genetic drift contribute to genome degradation. Examples from Blochmannia, the primary endosymbiont of the ant tribe Camponotini, illustrate the value and constraints of genome sequence data, and exemplify how genomes can serve as a springboard for further comparative and experimental inquiry. Copyright © 2017. Published by Elsevier Inc.
Natural Allelic Variations in Highly Polyploidy Saccharum Complex
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.
Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the genus Saccharum and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference, respectively, for sugarcane target enrichment sequence analysis. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
Natural Allelic Variations in Highly Polyploidy Saccharum Complex
Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.; ...
2016-06-08
Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with highly polyploid and complex genomes. The Saccharum complex, comprising the genus Saccharum and a few related genera, is an important genetic resource for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in the Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set, targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-MEM and the sorghum genome proved to be an acceptable aligner and reference, respectively, for sugarcane target enrichment sequence analysis. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygous. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
Metabolic Roles of Uncultivated Bacterioplankton Lineages in the Northern Gulf of Mexico “Dead Zone”
Seitz, Kiley W.; Temperton, Ben; Gillies, Lauren E.; Rabalais, Nancy N.; Henrissat, Bernard; Mason, Olivia U.
2017-01-01
ABSTRACT Marine regions that have seasonal to long-term low dissolved oxygen (DO) concentrations, sometimes called “dead zones,” are increasing in number and severity around the globe with deleterious effects on ecology and economics. One of the largest of these coastal dead zones occurs on the continental shelf of the northern Gulf of Mexico (nGOM), which results from eutrophication-enhanced bacterioplankton respiration and strong seasonal stratification. Previous research in this dead zone revealed the presence of multiple cosmopolitan bacterioplankton lineages that have eluded cultivation, and thus their metabolic roles in this ecosystem remain unknown. We used a coupled shotgun metagenomic and metatranscriptomic approach to determine the metabolic potential of Marine Group II Euryarchaeota, SAR406, and SAR202. We recovered multiple high-quality, nearly complete genomes from all three groups as well as candidate phyla usually associated with anoxic environments—Parcubacteria (OD1) and Peregrinibacteria. Two additional groups with putative assignments to ACD39 and PAUC34f supplement the metabolic contributions by uncultivated taxa. Our results indicate active metabolism in all groups, including prevalent aerobic respiration, with concurrent expression of genes for nitrate reduction in SAR406 and SAR202, and dissimilatory nitrite reduction to ammonia and sulfur reduction by SAR406. We also report a variety of active heterotrophic carbon processing mechanisms, including degradation of complex carbohydrate compounds by SAR406, SAR202, ACD39, and PAUC34f. Together, these data help constrain the metabolic contributions from uncultivated groups in the nGOM during periods of low DO and suggest roles for these organisms in the breakdown of complex organic matter. PMID:28900024
Verwaaijen, Bart; Wibberg, Daniel; Nelkner, Johanna; Gordin, Miriam; Rupp, Oliver; Winkler, Anika; Bremges, Andreas; Blom, Jochen; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas
2018-02-10
Lettuce (Lactuca sativa, L.) is an important annual plant of the family Asteraceae (Compositae). The commercial lettuce cultivar Tizian has been used in various scientific studies investigating the interaction of the plant with phytopathogens or biological control agents. Here, we present the de novo draft genome sequencing and gene prediction for this specific cultivar derived from transcriptome sequence data. The assembled scaffolds amount to a size of 2.22 Gb. Based on RNAseq data, 31,112 transcript isoforms were identified. Functional predictions for these transcripts were determined within the GenDBE annotation platform. Comparison with the cv. Salinas reference genome revealed a high degree of sequence similarity on genome and transcriptome levels, with an average amino acid identity of 99%. Furthermore, it was observed that two large regions are either missing or are highly divergent within the cv. Tizian genome compared to cv. Salinas. One of these regions covers the major resistance complex 1 region of cv. Salinas. The cv. Tizian draft genome sequence provides a valuable resource for future functional and transcriptome analyses focused on this lettuce cultivar. Copyright © 2017 Elsevier B.V. All rights reserved.
Discovering Hematopoietic Mechanisms Through Genome-Wide Analysis of GATA Factor Chromatin Occupancy
Fujiwara, Tohru; O'Geen, Henriette; Keles, Sunduz; Blahnik, Kimberly; Linnemann, Amelia K.; Kang, Yoon-A; Choi, Kyunghee; Farnham, Peggy J.; Bresnick, Emery H.
2009-01-01
SUMMARY GATA factors interact with simple DNA motifs (WGATAR) to regulate critical processes, including hematopoiesis, but very few WGATAR motifs are occupied in genomes. Given the rudimentary knowledge of mechanisms underlying this restriction, and how GATA factors establish genetic networks, we used ChIP-seq to define GATA-1 and GATA-2 occupancy genome-wide in erythroid cells. Coupled with genetic complementation analysis and transcriptional profiling, these studies revealed a rich collection of targets containing a characteristic binding motif of greater complexity than WGATAR. GATA factors occupied loci encoding multiple components of the Scl/TAL1 complex, a master regulator of hematopoiesis and leukemogenic target. Mechanistic analyses provided evidence for cross-regulatory and autoregulatory interactions among components of this complex, including GATA-2 induction of the hematopoietic corepressor ETO-2 and an ETO-2 negative autoregulatory loop. These results establish fundamental principles underlying GATA factor mechanisms in chromatin and illustrate a complex network of considerable importance for the control of hematopoiesis. PMID:19941826
Genomic analysis of organismal complexity in the multicellular green alga Volvox carteri
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prochnik, Simon E.; Umen, James; Nedelcu, Aurora
2010-07-01
Analysis of the Volvox carteri genome reveals that this green alga's increased organismal complexity and multicellularity are associated with modifications in protein families shared with its unicellular ancestor, and not with large-scale innovations in protein coding capacity. The multicellular green alga Volvox carteri and its morphologically diverse close relatives (the volvocine algae) are uniquely suited for investigating the evolution of multicellularity and development. We sequenced the 138 Mb genome of V. carteri and compared its ~14,500 predicted proteins to those of its unicellular relative, Chlamydomonas reinhardtii. Despite fundamental differences in organismal complexity and life history, the two species have similar protein-coding potentials, and few species-specific protein-coding gene predictions. Interestingly, volvocine algal-specific proteins are enriched in Volvox, including those associated with an expanded and highly compartmentalized extracellular matrix. Our analysis shows that increases in organismal complexity can be associated with modifications of lineage-specific proteins rather than large-scale invention of protein-coding capacity.
Target Capture during Mos1 Transposition
Pflieger, Aude; Jaillet, Jerôme; Petit, Agnès; Augé-Gouillou, Corinne; Renault, Sylvaine
2014-01-01
DNA transposition contributes to genomic plasticity. Target capture is a key step in the transposition process, because it contributes to the selection of new insertion sites. Little or nothing is known about how eukaryotic mariner DNA transposons trigger this step. In the case of Mos1, biochemistry and crystallography have deciphered several inverted terminal repeat-transposase complexes that are intermediates during transposition. However, the target capture complex is still unknown. Here, we show that the preintegration complex (i.e., the excised transposon) is the only complex able to capture a target DNA. Mos1 transposase does not support target commitment, which has been proposed to explain Mos1 random genomic integrations within host genomes. We demonstrate that the TA dinucleotide used as the target is crucial both to target recognition and in the chemistry of the strand transfer reaction. Bent DNA molecules are better targets for the capture when the target DNA is nicked two nucleotides apart from the TA. They improve strand transfer when the target DNA contains a mismatch near the TA dinucleotide. PMID:24269942
Target capture during Mos1 transposition.
Pflieger, Aude; Jaillet, Jerôme; Petit, Agnès; Augé-Gouillou, Corinne; Renault, Sylvaine
2014-01-03
DNA transposition contributes to genomic plasticity. Target capture is a key step in the transposition process, because it contributes to the selection of new insertion sites. Little or nothing is known about how eukaryotic mariner DNA transposons trigger this step. In the case of Mos1, biochemistry and crystallography have deciphered several inverted terminal repeat-transposase complexes that are intermediates during transposition. However, the target capture complex is still unknown. Here, we show that the preintegration complex (i.e., the excised transposon) is the only complex able to capture a target DNA. Mos1 transposase does not support target commitment, which has been proposed to explain Mos1 random genomic integrations within host genomes. We demonstrate that the TA dinucleotide used as the target is crucial both to target recognition and in the chemistry of the strand transfer reaction. Bent DNA molecules are better targets for the capture when the target DNA is nicked two nucleotides apart from the TA. They improve strand transfer when the target DNA contains a mismatch near the TA dinucleotide.
Short template switch events explain mutation clusters in the human genome.
Löytynoja, Ari; Goldman, Nick
2017-06-01
Resequencing efforts are uncovering the extent of genetic variation in humans and provide data to study the evolutionary processes shaping our genome. One recurring puzzle in both intra- and inter-species studies is the high frequency of complex mutations comprising multiple nearby base substitutions or insertion-deletions. We devised a generalized mutation model of template switching during replication that extends existing models of genome rearrangement and used this to study the role of template switch events in the origin of short mutation clusters. Applied to the human genome, our model detects thousands of template switch events during the evolution of human and chimp from their common ancestor and hundreds of events between two independently sequenced human genomes. Although many of these are consistent with a template switch mechanism previously proposed for bacteria, our model also identifies new types of mutations that create short inversions, some flanked by paired inverted repeats. The local template switch process can create numerous complex mutation patterns, including hairpin loop structures, and explains multinucleotide mutations and compensatory substitutions without invoking positive selection, speculative mechanisms, or implausible coincidence. Clustered sequence differences are challenging for current mapping and variant calling methods, and we show that many erroneous variant annotations exist in human reference data. Local template switch events may have been neglected as an explanation for complex mutations because of biases in commonly used analyses. Incorporation of our model into reference-based analysis pipelines and comparisons of de novo assembled genomes will lead to improved understanding of genome variation and evolution. © 2017 Löytynoja and Goldman; Published by Cold Spring Harbor Laboratory Press.
Discovering novel subsystems using comparative genomics
Ferrer, Luciana; Shearer, Alexander G.; Karp, Peter D.
2011-01-01
Motivation: Key problems for computational genomics include discovering novel pathways in genome data, and discovering functional interaction partners for genes to define new members of partially elucidated pathways. Results: We propose a novel method for the discovery of subsystems from annotated genomes. For each gene pair, a score measuring the likelihood that the two genes belong to the same subsystem is computed using genome context methods. Genes are then grouped based on these scores, and the resulting groups are filtered to keep only high-confidence groups. Since the method is based on genome context analysis, it relies solely on structural annotation of the genomes. The method can be used to discover new pathways, find missing genes from a known pathway, find new protein complexes or other kinds of functional groups and assign function to genes. We tested the accuracy of our method in Escherichia coli K-12. In one configuration of the system, we find that 31.6% of the candidate groups generated by our method match a known pathway or protein complex closely, and that we rediscover 31.2% of all known pathways and protein complexes of at least 4 genes. We believe that a significant proportion of the candidates that do not match any known group in E. coli K-12 correspond to novel subsystems that may represent promising leads for future laboratory research. We discuss in-depth examples of these findings. Availability: Predicted subsystems are available at http://brg.ai.sri.com/pwy-discovery/journal.html. Contact: lferrer@ai.sri.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21775308
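The score-then-group step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which grouping algorithm or cutoffs the authors used, so everything here is an assumption made for illustration: the function name `group_genes`, the `threshold` and `min_size` parameters, and the choice of single-linkage grouping (connected components over gene pairs whose score meets the threshold) are hypothetical stand-ins, not the published method.

```python
from collections import defaultdict


def group_genes(pair_scores, threshold=0.8, min_size=2):
    """Group genes linked by pairwise scores >= threshold.

    Hypothetical illustration only: single-linkage grouping via union-find,
    followed by a size filter standing in for a confidence filter.
    """
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk to the root
        # with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for (a, b), score in pair_scores.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold:
            parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)

    # Keep only groups that reach the minimum size.
    return sorted(sorted(m) for m in groups.values() if len(m) >= min_size)


# Toy scores (made up): a-b-c are strongly linked, c-d is not, d-e is.
scores = {("a", "b"): 0.9, ("b", "c"): 0.85, ("c", "d"): 0.2, ("d", "e"): 0.95}
print(group_genes(scores))
```

Single linkage is the simplest possible choice and tends to chain loosely related genes together; a real pipeline would likely need a stricter grouping criterion and a score-based confidence filter.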
DOE Office of Scientific and Technical Information (OSTI.GOV)
Labbe, Jessy L; Uehling, Jessie K; Payen, Thibaut
The last 10 years have seen the cost of sequencing complete genomes decrease at an incredible speed. This has led to an increase in the number of genomes sequenced across the fungal tree of life as well as a wide variety of plant genomes. The increase in sequencing has permitted us to study the evolution of organisms on a genomic scale. A number of talks during the conference discussed the importance of transposable elements (TEs) that are present in almost all species of fungi. These TEs represent an especially large percentage of genomic space in fungi that interact with plants. Thierry Rouxel (INRA, Nancy, France) showed the link between speciation in the Leptosphaeria complex and the expansion of TE families. For example, in the Leptosphaeria complex, one species associated with oilseed rape has experienced a recent and massive burst of movement by a few TE families. The alterations caused by these TEs took place in discrete regions of the genome, leading to shuffling of the genomic landscape and the appearance of genes specific to the species, such as effectors useful for the interactions with a particular plant (Rouxel et al., 2011). Other presentations showed the importance of TEs in affecting genome organization. For example, in Amanita, different species appear to have been invaded by different TE families (Veneault-Fourrey & Martin, 2011).
Effectiveness of liquid soap and hand sanitizer against Norwalk virus on contaminated hands.
Liu, Pengbo; Yuen, Yvonne; Hsiao, Hui-Mien; Jaykus, Lee-Ann; Moe, Christine
2010-01-01
Disinfection is an essential measure for interrupting human norovirus (HuNoV) transmission, but it is difficult to evaluate the efficacy of disinfectants due to the absence of a practicable cell culture system for these viruses. The purpose of this study was to screen sodium hypochlorite and ethanol for efficacy against Norwalk virus (NV) and expand the studies to evaluate the efficacy of antibacterial liquid soap and alcohol-based hand sanitizer for the inactivation of NV on human finger pads. Samples were tested by real-time reverse transcription-quantitative PCR (RT-qPCR) both with and without a prior RNase treatment. In suspension assay, sodium hypochlorite concentrations of ≥160 ppm effectively eliminated RT-qPCR detection signal, while ethanol, regardless of concentration, was relatively ineffective, giving at most a 0.5 log10 reduction in genomic copies of NV cDNA. Using the American Society for Testing and Materials (ASTM) standard finger pad method and a modification thereof (with rubbing), we observed the greatest reduction in genomic copies of NV cDNA with the antibacterial liquid soap treatment (0.67 to 1.20 log10 reduction) and water rinse only (0.58 to 1.58 log10 reduction). The alcohol-based hand sanitizer was relatively ineffective, reducing the genomic copies of NV cDNA by only 0.14 to 0.34 log10 compared to baseline. Although the concentrations of genomic copies of NV cDNA were consistently lower on finger pad eluates pretreated with RNase compared to those without prior RNase treatment, these differences were not statistically significant. Despite the promise of alcohol-based sanitizers for the control of pathogen transmission, they may be relatively ineffective against the HuNoV, reinforcing the need to develop and evaluate new products against this important group of viruses.
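To put the log10 reductions quoted in this abstract in context, the short sketch below converts between genomic copy counts, log10 reduction, and percent of copies removed. The function names and the example copy counts are hypothetical and chosen only for illustration; the log10 values passed to `percent_removed` are the ones reported in the abstract.

```python
import math


def log10_reduction(copies_before, copies_after):
    """Return log10(before / after) for two positive genomic copy counts."""
    if copies_before <= 0 or copies_after <= 0:
        raise ValueError("copy counts must be positive")
    return math.log10(copies_before / copies_after)


def percent_removed(log_reduction):
    """Convert a log10 reduction into the percentage of copies removed."""
    return 100.0 * (1.0 - 10.0 ** (-log_reduction))


# Made-up counts: a tenfold drop is a 1.0 log10 reduction.
print(log10_reduction(1_000_000, 100_000))

# Reductions reported in the abstract, expressed as percent removed.
for label, value in [("ethanol, at most", 0.5),
                     ("hand sanitizer, low", 0.14),
                     ("liquid soap, high", 1.20)]:
    print(f"{label}: {percent_removed(value):.1f}%")
```

Read this way, a 0.14 log10 reduction removes roughly a quarter of detectable genomic copies, while a 1.20 log10 reduction removes a little under 94%.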
Zhang, Yan; Yang, Jing; Zhang, Jing; Sun, Liangdan; Hirankarn, Nattiya; Pan, Hai-Feng; Lau, Chak Sing; Chan, Tak Mao; Lee, Tsz Leung; Leung, Alexander Moon Ho; Mok, Chi Chiu; Zhang, Lu; Wang, Yongfei; Shen, Jiangshan Jane; Wong, Sik Nin; Lee, Ka Wing; Ho, Marco Hok Kung; Lee, Pamela Pui Wah; Chung, Brian Hon-Yin; Chong, Chun Yin; Wong, Raymond Woon Sing; Mok, Mo Yin; Wong, Wilfred Hing Sang; Tong, Kwok Lung; Tse, Niko Kei Chiu; Li, Xiang-Pei; Avihingsanon, Yingyos; Rianthavorn, Pornpimol; Deekajorndej, Thavatchai; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk; Ying, Shirley King Yee; Fung, Samuel Ka Shun; Lai, Wai Ming; Wong, Chun-Ming; Ng, Irene Oi Lin; Garcia-Barcelo, Maria-Merce; Cherny, Stacey S; Cui, Yong; Sham, Pak Chung; Yang, Sen; Ye, Dong-Qing; Zhang, Xue-Jun; Lau, Yu Lung; Yang, Wanling
2016-05-01
Genetic interaction has been considered as a hallmark of the genetic architecture of systemic lupus erythematosus (SLE). Based on two independent genome-wide association studies (GWAS) on Chinese populations, we performed a genome-wide search for genetic interactions contributing to SLE susceptibility. The study involved a total of 1,659 cases and 3,398 controls in the discovery stage and 2,612 cases and 3,441 controls in three cohorts for replication. Logistic regression and multifactor dimensionality reduction were used to search for genetic interaction. Interaction of CD80 (rs2222631) and ALOX5AP (rs12876893) was found to be significantly associated with SLE (OR_int=1.16, P_int_all=7.7E-04 at false discovery rate<0.05). Single nucleotide polymorphism rs2222631 was found associated with SLE with genome-wide significance (P_all=4.5E-08, OR=0.86) and is independent of rs6804441 in CD80, whose association was reported previously. Significant correlation was observed between expression of these two genes in healthy controls and SLE cases, together with differential expression of these genes between cases and controls, observed from individuals from the Hong Kong cohort. Genetic interactions between BLK (rs13277113) and DDX6 (rs4639966), and between TNFSF4 (rs844648) and PXK (rs6445975) were also observed in both GWAS data sets. Our study represents the first genome-wide evaluation of epistasis interactions on SLE and the findings suggest interactions and independent variants may help partially explain missing heritability for complex diseases. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Winkel, Matthias; Salman-Carvalho, Verena; Woyke, Tanja; ...
2016-06-21
Large, colorless sulfur-oxidizing bacteria (LSB) of the family Beggiatoaceae form thick mats at sulfidic sediment surfaces, where they efficiently detoxify sulfide before it enters the water column. The genus Thiomargarita harbors the largest known free-living bacteria with cell sizes of up to 750 μm in diameter. In addition to their ability to oxidize reduced sulfur compounds, some Thiomargarita spp. are known to store large amounts of nitrate, phosphate and elemental sulfur internally. To date little is known about their energy yielding metabolic pathways, and how these pathways compare to other Beggiatoaceae. Here, we present a draft single-cell genome of a chain-forming "Candidatus Thiomargarita nelsonii Thio36", and conduct a comparative analysis to five draft and one full genome of other members of the Beggiatoaceae. "Ca. T. nelsonii Thio36" is able to respire nitrate to both ammonium and dinitrogen, which allows it to flexibly respond to environmental changes. Genes for sulfur oxidation and inorganic carbon fixation confirmed that "Ca. T. nelsonii Thio36" can function as a chemolithoautotroph. Carbon can be fixed via the Calvin-Benson-Bassham cycle, which is common among the Beggiatoaceae. In addition we found key genes of the reductive tricarboxylic acid cycle that point toward an alternative CO2 fixation pathway. Surprisingly, "Ca. T. nelsonii Thio36" also encodes key genes of the C2-cycle that convert 2-phosphoglycolate to 3-phosphoglycerate during photorespiration in higher plants and cyanobacteria. Moreover, we identified a novel trait of a flavin-based energy bifurcation pathway coupled to a Na+-translocating membrane complex (Rnf). The coupling of these pathways may be key to surviving long periods of anoxia. Like other Beggiatoaceae, "Ca. T. nelsonii Thio36" encodes many genes similar to those of (filamentous) cyanobacteria. In conclusion, the genome of "Ca. T. nelsonii Thio36" provides additional insight into the ecology of giant sulfur-oxidizing bacteria, and reveals unique genomic features for the Thiomargarita lineage within the Beggiatoaceae.
Spermatogenesis in mammals: proteomic insights.
Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles
2012-08-01
Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheath in the mid piece, elongation of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first, the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second, the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third, the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology has been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.
Campana, Michael G; Parker, Lillian D; Hawkins, Melissa T R; Young, Hillary S; Helgen, Kristofer M; Szykman Gunther, Micaela; Woodroffe, Rosie; Maldonado, Jesús E; Fleischer, Robert C
2016-12-09
The African wild dog (Lycaon pictus) is an endangered African canid threatened by severe habitat fragmentation, human-wildlife conflict, and infectious disease. A highly specialized carnivore, it is distinguished by its social structure, dental morphology, absence of dewclaws, and colorful pelage. We sequenced the genomes of two individuals from populations representing two distinct ecological histories (Laikipia County, Kenya and KwaZulu-Natal Province, South Africa). We reconstructed population demographic histories for the two individuals and scanned the genomes for evidence of selection. We show that the African wild dog has undergone at least two effective population size reductions in the last 1,000,000 years. We found evidence of Lycaon individual-specific regions of low diversity, suggestive of inbreeding or population-specific selection. Further research is needed to clarify whether these population reductions and low diversity regions are characteristic of the species as a whole. We documented positive selection on the Lycaon mitochondrial genome. Finally, we identified several candidate genes (ASIP, MITF, MLPH, PMEL) that may play a role in the characteristic Lycaon pelage.
Genome sequence of the cultivated cotton Gossypium arboreum
USDA-ARS's Scientific Manuscript database
Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26...
Genome Island: A Virtual Science Environment in Second Life
ERIC Educational Resources Information Center
Clark, Mary Anne
2009-01-01
Mary Anne Clark describes the organization and uses of Genome Island, a virtual laboratory complex constructed in Second Life. Genome Island was created for teaching genetics to university undergraduates but also provides a public space where anyone interested in genetics can spend a few minutes, or a few hours, interacting with genetic…
USDA-ARS's Scientific Manuscript database
Plant organellar genomes contain large repetitive elements that may undergo pairing or recombination to form complex structures and/or sub-genomic fragments. Organellar genomes also exist in admixtures within a given cell or tissue type (heteroplasmy) and abundance of sub-types may change through de...
USDA-ARS's Scientific Manuscript database
The comprehensive identification of genes underlying phenotypic variation of complex traits such as disease resistance remains one of the greatest challenges in biology despite having genome sequences and more powerful tools. Most genome-wide screens lack sufficient resolving power as they typically...
Analysis of pig genomes provide insight into porcine demography and evolution
USDA-ARS's Scientific Manuscript database
For nearly 8,000 years pigs and humans have shared a close and complex relationship, and through domestication and breeding, humans have shaped the genomes of current diverse pig breeds. Here we present the assembly and analysis of the genome sequence of a female domestic pig from the European Duroc...
Characterization of genetic variability of Venezuelan equine encephalitis viruses
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...
2016-04-07
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Scalable and cost-effective NGS genotyping in the cloud.
Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P
2015-10-15
While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the 10's of dollars. We take a step towards addressing this challenge, by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations to produce clinical turn-around of whole genome analysis optimization and workflow management including strategic batching of individual genomes and efficient cluster resource configuration.
Weitzel, Jeffrey N.; Blazer, Kathleen R.; MacDonald, Deborah J.; Culver, Julie O.; Offit, Kenneth
2012-01-01
Scientific and technologic advances are revolutionizing our approach to genetic cancer risk assessment, cancer screening and prevention, and targeted therapy, fulfilling the promise of personalized medicine. In this monograph we review the evolution of scientific discovery in cancer genetics and genomics, and describe current approaches, benefits and barriers to the translation of this information to the practice of preventive medicine. Summaries of known hereditary cancer syndromes and highly penetrant genes are provided and contrasted with recently-discovered genomic variants associated with modest increases in cancer risk. We describe the scope of knowledge, tools, and expertise required for the translation of complex genetic and genomic test information into clinical practice. The challenges of genomic counseling include the need for genetics and genomics professional education and multidisciplinary team training, the need for evidence-based information regarding the clinical utility of testing for genomic variants, the potential dangers posed by premature marketing of first-generation genomic profiles, and the need for new clinical models to improve access to and responsible communication of complex disease-risk information. We conclude that given the experiences and lessons learned in the genetics era, the multidisciplinary model of genetic cancer risk assessment and management will serve as a solid foundation to support the integration of personalized genomic information into the practice of cancer medicine. PMID:21858794
The emerging genomics and systems biology research lead to systems genomics studies.
Yang, Mary Qu; Yoshigoe, Kenji; Yang, William; Tong, Weida; Qin, Xiang; Dunker, A; Chen, Zhongxue; Arbania, Hamid R; Liu, Jun S; Niemierko, Andrzej; Yang, Jack Y
2014-01-01
Synergistically integrating multi-layer genomic data at systems level not only can lead to deeper insights into the molecular mechanisms related to disease initiation and progression, but also can guide pathway-based biomarker and drug target identification. With the advent of high-throughput next-generation sequencing technologies, sequencing both DNA and RNA has generated multi-layer genomic data that can provide DNA polymorphism, non-coding RNA, messenger RNA, gene expression, isoform and alternative splicing information. Systems biology on the other hand studies complex biological systems, particularly systematic study of complex molecular interactions within specific cells or organisms. Genomics and molecular systems biology can be merged into the study of genomic profiles and implicated biological functions at cellular or organism level. The prospectively emerging field can be referred to as systems genomics or genomic systems biology. The Mid-South Bioinformatics Centre (MBC) and Joint Bioinformatics Ph.D. Program of University of Arkansas at Little Rock and University of Arkansas for Medical Sciences are particularly interested in promoting education and research advancement in this prospectively emerging field. Based on past investigations and research outcomes, MBC is further utilizing differential gene and isoform/exon expression from RNA-seq and co-regulation from the ChIP-seq specific for different phenotypes in combination with protein-protein interactions, and protein-DNA interactions to construct high-level gene networks for an integrative genome-phenome investigation at systems biology level.
Different Evolutionary Paths to Complexity for Small and Large Populations of Digital Organisms
2016-01-01
A major aim of evolutionary biology is to explain the respective roles of adaptive versus non-adaptive changes in the evolution of complexity. While selection is certainly responsible for the spread and maintenance of complex phenotypes, this does not automatically imply that strong selection enhances the chance for the emergence of novel traits, that is, the origination of complexity. Population size is one parameter that alters the relative importance of adaptive and non-adaptive processes: as population size decreases, selection weakens and genetic drift grows in importance. Because of this relationship, many theories invoke a role for population size in the evolution of complexity. Such theories are difficult to test empirically because of the time required for the evolution of complexity in biological populations. Here, we used digital experimental evolution to test whether large or small asexual populations tend to evolve greater complexity. We find that both small and large—but not intermediate-sized—populations are favored to evolve larger genomes, which provides the opportunity for subsequent increases in phenotypic complexity. However, small and large populations followed different evolutionary paths towards these novel traits. Small populations evolved larger genomes by fixing slightly deleterious insertions, while large populations fixed rare beneficial insertions that increased genome size. These results demonstrate that genetic drift can lead to the evolution of complexity in small populations and that purifying selection is not powerful enough to prevent the evolution of complexity in large populations. PMID:27923053
Visualization for genomics: the Microbial Genome Viewer.
Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J
2004-07-22
A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV
Almeida, Daniela; Maldonado, Emanuel; Vasconcelos, Vitor; Antunes, Agostinho
2015-01-01
Mitochondrial protein-coding genes (mt genes) encode subunits forming complexes of crucial cellular pathways, including those involved in the vital process of oxidative phosphorylation (OXPHOS). Despite the vital role of the mitochondrial genome (mt genome) in the survival of organisms, little is known with respect to its adaptive implications within marine invertebrates. The molluscan Class Cephalopoda is represented by a marine group of species known to occupy contrasting environments ranging from the intertidal to the deep sea, having distinct metabolic requirements, varied body shapes and highly advanced visual and nervous systems that make them highly competitive and successful worldwide predators. Thus, cephalopods are valuable models for testing natural selection acting on their mitochondrial subunits (mt subunits). Here, we used concatenated mt genes from 17 fully sequenced mt genomes of diverse cephalopod species to generate a robust mitochondrial phylogeny for the Class Cephalopoda. We followed an integrative approach considering several branches of interest–covering cephalopods with distinct morphologies, metabolic rates and habitats–to identify sites under positive selection and localize them in the respective protein alignment and/or tridimensional structure of the mt subunits. Our results revealed significant adaptive variation in several mt subunits involved in the energy production pathway of cephalopods: ND5 and ND6 from Complex I, CYTB from Complex III, COX2 and COX3 from Complex IV, and in ATP8 from Complex V. Furthermore, we identified relevant sites involved in protein-interactions, lining proton translocation channels, as well as disease/deficiencies related sites in the aforementioned complexes. 
A particular case, revealed by this study, is the involvement of some positively selected sites, found in Octopoda lineage in lining proton translocation channels (site 74 from ND5) and in interactions between subunits (site 507 from ND5) of Complex I. PMID:26285039
A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks.
Röhl, Annika; Bockmayr, Alexander
2017-01-03
Constraint-based analysis has become a widely used method to study metabolic networks. While some of the associated algorithms can be applied to genome-scale network reconstructions with several thousands of reactions, others are limited to small or medium-sized models. In 2015, Erdrich et al. introduced a method called NetworkReducer, which reduces large metabolic networks to smaller subnetworks, while preserving a set of biological requirements that can be specified by the user. Already in 2001, Burgard et al. developed a mixed-integer linear programming (MILP) approach for computing minimal reaction sets under a given growth requirement. Here we present an MILP approach for computing minimum subnetworks with the given properties. The minimality (with respect to the number of active reactions) is not guaranteed by NetworkReducer, while the method by Burgard et al. does not allow specifying the different biological requirements. Our procedure is about 5-10 times faster than NetworkReducer and can enumerate all minimum subnetworks in case there exist several ones. This allows identifying common reactions that are present in all subnetworks, and reactions appearing in alternative pathways. Applying complex analysis methods to genome-scale metabolic networks is often not possible in practice. Thus it may become necessary to reduce the size of the network while keeping important functionalities. We propose a MILP solution to this problem. Compared to previous work, our approach is more efficient and allows computing not only one, but even all minimum subnetworks satisfying the required properties.
Loux, Valentin; Coeuret, Gwendoline; Zagorec, Monique; Champomier Vergès, Marie-Christine; Chaillou, Stéphane
2018-04-19
We present here the complete and draft genome sequences of nine Lactobacillus sakei strains, selected from the entire range of clonal complexes from the three known lineages of the species. The strains were chosen to provide a wide view of pangenomic and plasmidic diversity for this important foodborne species. Copyright © 2018 Loux et al.
Confrontation, Consolidation, and Recognition: The Oocyte’s Perspective on the Incoming Sperm
Miller, David
2015-01-01
From the oocyte’s perspective, the incoming sperm poses a significant challenge. Despite (usually) arising from a male of the same species, the sperm is a “foreign” body that may carry with it additional, undesirable factors such as transposable elements (mainly retroposons) into the egg. These factors can arise either during spermatogenesis or while the sperm is moving through the epididymis or the female genital tract. Furthermore, in addition to the paternal genome, the sperm also carries its own complex repertoire of RNAs into the egg that includes mRNAs, lncRNAs, and sncRNAs. Last, the paternal genome itself is efficiently packaged into a protamine (nucleo-toroid) and histone (nucleosome)-based chromatin scaffold within which much of the RNA is embedded. Taken together, the sperm delivers a far more complex package to the egg than was originally thought. Understanding this complexity, at both the compositional and structural level, depends largely on investigating sperm chromatin from both the genomic (DNA packaging) and epigenomic (RNA carriage and extant histone modifications) perspectives. Why this complexity has arisen and its likely purpose requires us to look more closely at what happens in the oocyte when the sperm gains entry and the processes that then take place preparing the paternal (and maternal) genomes for syngamy. PMID:25957313
Staňková, Helena; Hastie, Alex R; Chan, Saki; Vrána, Jan; Tulpová, Zuzana; Kubaláková, Marie; Visendi, Paul; Hayashi, Satomi; Luo, Mingcheng; Batley, Jacqueline; Edwards, David; Doležel, Jaroslav; Šimková, Hana
2016-07-01
The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
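The N50 statistic quoted in the abstract above (371 contigs with an N50 of 1.3 Mb) can be illustrated with a minimal sketch. It assumes the common definition of N50: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The function name `n50` and the toy lengths are illustrative only and do not come from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: sort contigs from longest to shortest and return the
    length of the contig at which the running sum first reaches half of the
    total length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (arbitrary units), not data from the abstract.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because the statistic depends only on the length distribution, a high N50 indicates that much of the map sits in long contigs, but it says nothing about whether those contigs are correct.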
Vertebrate Genome Evolution in the Light of Fish Cytogenomics and rDNAomics
Howell, W. Mike
2018-01-01
To understand the cytogenomic evolution of vertebrates, we must first unravel the complex genomes of fishes, which were the first vertebrates to evolve and were ancestors to all other vertebrates. We must not forget the immense time span during which the fish genomes had to evolve. Fish cytogenomics is endowed with unique features which offer irreplaceable insights into the evolution of the vertebrate genome. Due to the general DNA base compositional homogeneity of fish genomes, fish cytogenomics is largely based on mapping DNA repeats that still represent serious obstacles in genome sequencing and assembling, even in model species. Localization of repeats on chromosomes of hundreds of fish species and populations originating from diversified environments has revealed the biological importance of this genomic fraction. Ribosomal genes (rDNA) belong to the most informative repeats and in fish, they are subject to a more relaxed regulation than in higher vertebrates. This can result in formation of a literal ‘rDNAome’ consisting of more than 20,000 copies with their high proportion employed in extra-coding functions. Because rDNA has high rates of transcription and recombination, it contributes to genome diversification and can form a reproductive barrier. Our overall knowledge of fish cytogenomics grows rapidly by a continuously increasing number of fish genomes sequenced and by use of novel sequencing methods improving genome assembly. The recently revealed exceptional compositional heterogeneity in an ancient fish lineage (gars) sheds new light on the compositional genome evolution in vertebrates generally. We highlight the power of synergy of cytogenetics and genomics in fish cytogenomics, its potential to understand the complexity of genome evolution in vertebrates, which is also linked to clinical applications and the chromosomal backgrounds of speciation. We also summarize the current knowledge on fish cytogenomics and outline its main future avenues.
PMID:29443947
TUMOR HAPLOTYPE ASSEMBLY ALGORITHMS FOR CANCER GENOMICS
AGUIAR, DEREK; WONG, WENDY S.W.; ISTRAIL, SORIN
2014-01-01
The growing availability of inexpensive high-throughput sequence data is enabling researchers to sequence tumor populations within a single individual at high coverage. But, cancer genome sequence evolution and mutational phenomena like driver mutations and gene fusions are difficult to investigate without first reconstructing tumor haplotype sequences. Haplotype assembly of single individual tumor populations is an exceedingly difficult task complicated by tumor haplotype heterogeneity, tumor or normal cell sequence contamination, polyploidy, and complex patterns of variation. While computational and experimental haplotype phasing of diploid genomes has seen much progress in recent years, haplotype assembly in cancer genomes remains uncharted territory. In this work, we describe HapCompass-Tumor, a computational modeling and algorithmic framework for haplotype assembly of copy number variable cancer genomes containing haplotypes at different frequencies and complex variation. We extend our polyploid haplotype assembly model and present novel algorithms for (1) complex variations, including copy number changes, as varying numbers of disjoint paths in an associated graph, (2) variable haplotype frequencies and contamination, and (3) computation of tumor haplotypes using simple cycles of the compass graph which constrain the space of haplotype assembly solutions. The model and algorithm are implemented in the software package HapCompass-Tumor, which is available for download from http://www.brown.edu/Research/Istrail_Lab/. PMID:24297529
The genome projects: implications for dental practice and education.
Wright, J T; Hart, T C
2002-05-01
Information from the Human Genome Project (HGP) and the integration of information from related areas of study and technology will dramatically change health care for the craniofacial complex. Approaches to risk assessment and diagnosis, prevention, early intervention, and management of craniofacial conditions are evolving and will continue to evolve through the application of this new knowledge. While this information will advance our health care abilities, it is clear that the dental profession will face challenges regarding the acquisition, application, transfer, and effective and efficient use of this knowledge with regard to dental research, dental education, and clinical practice. Unraveling the human genomic sequence now allows accurate diagnosis of numerous craniofacial conditions. However, the greatest oral disease burden results from dental caries and periodontal disease that are complex disorders having both hereditary and environmental factors determining disease risk, progression, and course. Disease risk assessment, prevention, and therapy, based on knowledge from the HGP, will likely vary markedly for the different complex conditions affecting the head and neck. Integration of information from the human genome, comparative and microbial genomics, proteomics, bioinformatics, and related technologies will provide the basis for proactive prevention and intervention and novel and more efficient treatment approaches. Oral health care practitioners will increasingly require knowledge of human genetics and the application of new molecular-based diagnostic and therapeutic technologies.
Decoding transcriptional enhancers: Evolving from annotation to functional interpretation
Engel, Krysta L.; Mackiewicz, Mark; Hardigan, Andrew A.; Myers, Richard M.; Savic, Daniel
2016-01-01
Deciphering the intricate molecular processes that orchestrate the spatial and temporal regulation of genes has become an increasingly major focus of biological research. The differential expression of genes by diverse cell types with a common genome is a hallmark of complex cellular functions, as well as the basis for multicellular life. Importantly, a more coherent understanding of gene regulation is critical for defining developmental processes, evolutionary principles and disease etiologies. Here we present our current understanding of gene regulation by focusing on the role of enhancer elements in these complex processes. Although functional genomic methods have provided considerable advances to our understanding of gene regulation, these assays, which are usually performed on a genome-wide scale, typically provide correlative observations that lack functional interpretation. Recent innovations in genome editing technologies have placed gene regulatory studies at an exciting crossroads, as systematic, functional evaluation of enhancers and other transcriptional regulatory elements can now be performed in a coordinated, high-throughput manner across the entire genome. This review provides insights on transcriptional enhancer function, their role in development and disease, and catalogues experimental tools commonly used to study these elements. Additionally, we discuss the crucial role of novel techniques in deciphering the complex gene regulatory landscape and how these studies will shape future research. PMID:27224938
Decoding transcriptional enhancers: Evolving from annotation to functional interpretation.
Engel, Krysta L; Mackiewicz, Mark; Hardigan, Andrew A; Myers, Richard M; Savic, Daniel
2016-09-01
Deciphering the intricate molecular processes that orchestrate the spatial and temporal regulation of genes has become an increasingly major focus of biological research. The differential expression of genes by diverse cell types with a common genome is a hallmark of complex cellular functions, as well as the basis for multicellular life. Importantly, a more coherent understanding of gene regulation is critical for defining developmental processes, evolutionary principles and disease etiologies. Here we present our current understanding of gene regulation by focusing on the role of enhancer elements in these complex processes. Although functional genomic methods have provided considerable advances to our understanding of gene regulation, these assays, which are usually performed on a genome-wide scale, typically provide correlative observations that lack functional interpretation. Recent innovations in genome editing technologies have placed gene regulatory studies at an exciting crossroads, as systematic, functional evaluation of enhancers and other transcriptional regulatory elements can now be performed in a coordinated, high-throughput manner across the entire genome. This review provides insights on transcriptional enhancer function, their role in development and disease, and catalogues experimental tools commonly used to study these elements. Additionally, we discuss the crucial role of novel techniques in deciphering the complex gene regulatory landscape and how these studies will shape future research. Copyright © 2016 Elsevier Ltd. All rights reserved.
Drosophila Sld5 is essential for normal cell cycle progression and maintenance of genomic integrity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gouge, Catherine A.; Christensen, Tim W., E-mail: christensent@ecu.edu
2010-09-10
Research highlights: Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10. Haploinsufficiency of Sld5 leads to M-phase delay and genomic instability. Sld5 is also required for normal S phase progression. Abstract: Essential for the normal functioning of a cell is the maintenance of genomic integrity. Failure in this process is often catastrophic for the organism, leading to cell death or mis-proliferation. Central to genomic integrity is the faithful replication of DNA during S phase. The GINS complex has recently come to light as a critical player in DNA replication through stabilization of MCM2-7 and Cdc45 as a member of the CMG complex, which is likely responsible for the processivity of helicase activity during S phase. The GINS complex is made up of 4 members in a 1:1:1:1 ratio: Psf1, Psf2, Psf3, and Sld5. Here we present the first analysis of the function of the Sld5 subunit in a multicellular organism. We show that Drosophila Sld5 interacts with Psf1, Psf2, and Mcm10 and that mutations in Sld5 lead to M and S phase delays with chromosomes exhibiting hallmarks of genomic instability.
Halbedel, Sven; Prager, Rita; Fuchs, Stephan; Trost, Eva; Werner, Guido; Flieger, Antje
2018-06-01
Listeria monocytogenes causes foodborne outbreaks with high mortality. For improvement of outbreak cluster detection, the German consiliary laboratory for listeriosis implemented whole-genome sequencing (WGS) in 2015. A total of 424 human L. monocytogenes isolates collected in 2007 to 2017 were subjected to WGS and core-genome multilocus sequence typing (cgMLST). cgMLST grouped the isolates into 38 complexes, reflecting 4 known and 34 unknown disease clusters. Most of these complexes were confirmed by single nucleotide polymorphism (SNP) calling, but some were further differentiated. Interestingly, several cgMLST cluster types were further subtyped by pulsed-field gel electrophoresis, partly due to phage insertions in the accessory genome. Our results highlight the usefulness of cgMLST for routine cluster detection but also show that cgMLST complexes require validation by methods providing higher typing resolution. Twelve cgMLST clusters included recent cases, suggesting activity of the source. Therefore, the cgMLST nomenclature data presented here may support future public health actions. Copyright © 2018 American Society for Microbiology.
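The grouping of isolates into cgMLST complexes described above can be sketched as follows. This is a minimal illustration, not the laboratory's pipeline: it assumes each isolate is represented by a list of allele numbers over the same ordered core-genome loci, that distance is the count of differing loci (loci with a missing call, shown as `None`, are skipped), and that complexes are formed by single-linkage clustering under a threshold. The names `allele_distance` and `single_linkage_clusters`, the threshold value, and the toy profiles are all hypothetical.

```python
from itertools import combinations


def allele_distance(a, b):
    """Count loci with different allele numbers, skipping loci missing in either profile."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def single_linkage_clusters(profiles, threshold):
    """Group isolates so that any chain of pairs within `threshold` shares a cluster.

    `profiles` maps isolate name -> list of allele numbers (same locus order).
    Returns a sorted list of sorted name lists.
    """
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(profiles, 2):
        if allele_distance(profiles[p], profiles[q]) <= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Toy four-locus profiles; real cgMLST schemes use far more loci.
    toy = {
        "A": [1, 1, 1, 1],
        "B": [1, 1, 1, 2],
        "C": [1, 1, 2, 2],
        "D": [5, 6, 7, 8],
    }
    print(single_linkage_clusters(toy, threshold=1))
```

Single linkage joins A and C through B even though A and C differ at two loci, which is one reason clusters defined this way may still need confirmation by a higher-resolution method such as SNP calling.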
Combining clinical and genomics queries using i2b2 – Three methods
Murphy, Shawn N.; Avillach, Paul; Bellazzi, Riccardo; Phillips, Lori; Gabetta, Matteo; Eran, Alal; McDuffie, Michael T.; Kohane, Isaac S.
2017-01-01
We are fortunate to be living in an era of twin biomedical data surges: a burgeoning representation of human phenotypes in the medical records of our healthcare systems, and high-throughput sequencing making rapid technological advances. The difficulty representing genomic data and its annotations has almost by itself led to the recognition of a biomedical “Big Data” challenge, and the complexity of healthcare data only compounds the problem to the point that coherent representation of both systems on the same platform seems insuperably difficult. We investigated the capability for complex, integrative genomic and clinical queries to be supported in the Informatics for Integrating Biology and the Bedside (i2b2) translational software package. Three different data integration approaches were developed: The first is based on Sequence Ontology, the second is based on the tranSMART engine, and the third on CouchDB. These novel methods for representing and querying complex genomic and clinical data on the i2b2 platform are available today for advancing precision medicine. PMID:28388645
He, Awen; Wang, Wenyu; Prakash, N Tejo; Tinkov, Alexey A; Skalny, Anatoly V; Wen, Yan; Hao, Jingcan; Guo, Xiong; Zhang, Feng
2018-03-01
Chemical elements are closely related to human health. Extensive genomic profile data of complex diseases offer us a good opportunity to systematically investigate the relationships between elements and complex diseases/traits. In this study, we applied a gene set enrichment analysis (GSEA) approach to detect the associations between elements and complex diseases/traits through integrating element-gene interaction datasets and genome-wide association study (GWAS) data of complex diseases/traits. To illustrate the performance of GSEA, the element-gene interaction datasets of 24 elements were extracted from the comparative toxicogenomics database (CTD). GWAS summary datasets of 24 complex diseases or traits were downloaded from the dbGaP or GEFOS websites. We observed significant associations between 7 elements and 13 complex diseases or traits (all false discovery rate (FDR) < 0.05), including reported relationships such as aluminum vs. Alzheimer's disease (FDR = 0.042), calcium vs. bone mineral density (FDR = 0.031), magnesium vs. systemic lupus erythematosus (FDR = 0.012) as well as novel associations, such as nickel vs. hypertriglyceridemia (FDR = 0.002) and bipolar disorder (FDR = 0.027). Our study results are consistent with previous biological studies, supporting the good performance of GSEA. Our analysis results based on the GSEA framework provide novel clues for discovering causal relationships between elements and complex diseases. © 2017 WILEY PERIODICALS, INC.
Gene editing by CRISPR/Cas9 in the obligatory outcrossing Medicago sativa.
Gao, Ruimin; Feyissa, Biruk A; Croft, Mana; Hannoufa, Abdelali
2018-04-01
The CRISPR/Cas9 technique was successfully used to edit the genome of the obligatory outcrossing plant species Medicago sativa L. (alfalfa). RNA-guided genome engineering using Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9 technology enables a variety of applications in plants. Successful application and validation of the CRISPR technique in a multiplex genome, such as that of M. sativa (alfalfa), will ultimately lead to major advances in the improvement of this crop. We used the CRISPR/Cas9 technique to mutate the squamosa promoter binding protein like 9 (SPL9) gene in alfalfa. Because of the complex features of the alfalfa genome, we first used droplet digital PCR (ddPCR) for high-throughput screening of large populations of CRISPR-modified plants. Based on the results of genome editing rates obtained from the ddPCR screening, plants with relatively high rates were subjected to further analysis by restriction enzyme digestion/PCR amplification analyses. PCR products encompassing the respective small guide RNA target locus were then sub-cloned and sequenced to verify genome editing. In summary, we successfully applied the CRISPR/Cas9 technique to edit the SPL9 gene in a multiplex genome, providing some insights into opportunities to apply this technology in future alfalfa breeding. The overall efficiency in the polyploid alfalfa genome was lower compared to other less-complex plant genomes. Further refinement of the CRISPR technology system will thus be required for more efficient genome editing in this plant.
Merhej, Vicky; Raoult, Didier
2012-01-01
Darwin's theory about the evolution of species has been the object of considerable dispute. In this review, we have described seven key principles in Darwin's book The Origin of Species and tried to present how genomics challenge each of these concepts and improve our knowledge about evolution. Darwin believed that species evolution consists of a positive directional selection ensuring the “survival of the fittest.” The most developed state of the species is characterized by increasing complexity. Darwin proposed the theory of “descent with modification” according to which all species evolve from a single common ancestor through a gradual process of small modification of their vertical inheritance. Finally, the process of evolution can be depicted in the form of a tree. However, microbial genomics showed that evolution is better described as the “biological changes over time.” The mode of change is not unidirectional and does not necessarily favor advantageous mutations to increase fitness; it is rather subject to random selection as a result of catastrophic stochastic processes. Complexity is not necessarily the completion of development: several complex organisms have gone extinct and many microbes including bacteria with intracellular lifestyle have streamlined highly effective genomes. Genomes evolve through large events of gene deletions, duplications, insertions, and genome rearrangements rather than a gradual adaptive process. Genomes are dynamic and chimeric entities with gene repertoires that result from vertical and horizontal acquisitions as well as de novo gene creation. The chimeric character of microbial genomes excludes the possibility of finding a single common ancestor for all the genes recorded currently. Genomes are collections of genes with different evolutionary histories that cannot be represented by a single tree of life (TOL).
A forest, a network or a rhizome of life may be more accurate to represent evolutionary relationships among species. PMID:22973559
Kedinger, C; Brison, O; Perrin, F; Wilhelm, J
1978-01-01
Deoxyribonucleoprotein complexes released 17 h postinfection from adenovirus type 2 (Ad2)-infected HeLa cell nuclei were shown by electron microscopy to contain filaments much thicker (about 200 A [20 nm]) than double-stranded DNA (about 20 A [2 nm]). The complexes were partially purified through a linear sucrose gradient, concentrated, and further purified in a metrizamide gradient. The major protein present in the complexes was identified as the 72,000-dalton (72K), adenovirus-coded single-stranded DNA-binding protein (72K DBP). Three types of complexes have been visualized by electron microscopy. Some linear complexes were uniformly thick, and their length corresponded roughly to that of the adenovirus genome. Other linear genome-length complexes appeared to consist of a thick filament connected to a thinner filament with the diameter of double-stranded DNA. Forked complexes consisting of one thick filament connected to a genome-length, thinner double-stranded DNA filament were also visualized. Both thick and thin filaments were sensitive to DNase and not to RNase, but only the thick filaments were digested by the single-strand-specific Neurospora crassa nuclease, indicating that they correspond to a complex of 72K DBP and Ad2 single-stranded DNA. Experiments with anti-72K DBP immunoglobulins indicated that these nucleoprotein complexes, containing the 72K DBP, correspond to replicative intermediates. Both strands of the Ad2 genome were found associated with the 72K DBP. Altogether, our results establish the in vivo association of the 72K DBP with adenovirus single-stranded DNA, as previously suggested from in vitro studies, and support a strand displacement mechanism for Ad2 DNA replication, in which both strands can be displaced. In addition, our results indicate that, late in infection, histones are not bound to adenovirus DNA in the form of a nucleosomal chromatin-like structure. PMID:207893
Han, Ruyang; Karaoz, Ulas; Lim, HsiaoChien; Brodie, Eoin L.
2013-01-01
Pelosinus spp. are fermentative firmicutes that were recently reported to be prominent members of microbial communities at contaminated subsurface sites in multiple locations. Here we report metabolic characteristics and their putative genetic basis in Pelosinus sp. strain HCF1, an isolate that predominated in anaerobic, Cr(VI)-reducing columns constructed with aquifer sediment. Strain HCF1 ferments lactate to propionate and acetate (the methylmalonyl-coenzyme A [CoA] pathway was identified in the genome), and its genome encodes two [NiFe]- and four [FeFe]-hydrogenases for H2 cycling. The reduction of Cr(VI) and Fe(III) may be catalyzed by a flavoprotein with 42 to 51% sequence identity to both ChrR and FerB. This bacterium has unexpected capabilities and gene content associated with reduction of nitrogen oxides, including dissimilatory reduction of nitrate to ammonium (two copies of NrfH and NrfA were identified along with NarGHI) and a nitric oxide reductase (NorCB). In this strain, either H2 or lactate can act as a sole electron donor for nitrate, Cr(VI), and Fe(III) reduction. Transcriptional studies demonstrated differential expression of hydrogenases and nitrate and nitrite reductases. Overall, the unexpected metabolic capabilities and gene content reported here broaden our perspective on what biogeochemical and ecological roles this species might play as a prominent member of microbial communities in subsurface environments. PMID:23064329
Simplifier: a web tool to eliminate redundant NGS contigs.
Ramos, Rommel Thiago Jucá; Carneiro, Adriana Ribeiro; Azevedo, Vasco; Schneider, Maria Paula; Barh, Debmalya; Silva, Artur
2012-01-01
Modern genomic sequencing technologies produce a large amount of data with reduced cost per base; however, this data consists of short reads. This reduction in the size of the reads, compared to those obtained with previous methodologies, presents new challenges, including a need for efficient algorithms for the assembly of genomes from short reads and for resolving repetitions. Additionally, after ab initio assembly, curation of the hundreds or thousands of contigs generated by assemblers demands considerable time and computational resources. We developed Simplifier, a stand-alone software that selectively eliminates redundant sequences from the collection of contigs generated by ab initio assembly of genomes. Application of Simplifier to data generated by assembly of the genome of Corynebacterium pseudotuberculosis strain 258 reduced the number of contigs generated by ab initio methods from 8,004 to 5,272, a reduction of 34.14%; in addition, N50 increased from 1 kb to 1.5 kb. Processing the contigs of Escherichia coli DH10B with Simplifier reduced the mate-paired library by 17.47% and the fragment library by 23.91%. In tests with data from prokaryotic organisms, Simplifier removed redundant sequences from datasets produced by assemblers, thereby reducing the effort required for finalization of genome assembly. Simplifier is available at http://www.genoma.ufpa.br/rramos/softwares/simplifier.xhtml. It requires Sun JDK 6 or higher.
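The idea of discarding redundant contigs and then checking N50 can be illustrated with a minimal Python sketch. It is not Simplifier's algorithm, which the abstract does not describe. It assumes the simplest possible definition of redundancy: a contig that occurs, on either strand, as an exact substring of a longer contig. The function names and the toy contigs are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def remove_redundant(contigs):
    """Drop contigs fully contained in an already kept contig, on either strand.

    Assumption for illustration only: redundancy means exact containment.
    """
    kept = []
    # Visit the longest contigs first, so a contig can only be contained
    # in one that has already been kept.
    for contig in sorted(contigs, key=len, reverse=True):
        rc = reverse_complement(contig)
        if not any(contig in k or rc in k for k in kept):
            kept.append(contig)
    return kept


def n50(contigs):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


# Toy input: "GTACG" is contained in the first contig on the forward strand,
# "TTACGT" on the reverse strand; "CCCC" is unique.
contigs = ["ACGTACGTAA", "GTACG", "TTACGT", "CCCC"]
filtered = remove_redundant(contigs)
print(len(contigs), "->", len(filtered), "contigs")
print("N50 before:", n50(contigs), "after:", n50(filtered))
```

Removing contained contigs leaves the longest sequences untouched while shrinking the total, which is why N50 can rise after filtering, as reported in the abstract.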
SAGE: String-overlap Assembly of GEnomes.
Ilie, Lucian; Haider, Bahlul; Molnar, Michael; Solis-Oba, Roberto
2014-09-15
De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of a significant amount of work in this area, better solutions are still very much needed. We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on the string-overlap graph and maximum likelihood assembly, contributing a number of new ideas, such as the efficient computation of the transitive reduction of the string-overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers. SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy count estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state of the art in an essential area of research in genomics.
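To make the notion of transitive reduction concrete, the following Python sketch builds a tiny overlap graph and removes transitive edges. It is a toy illustration, not SAGE's method: it assumes error-free reads on a single strand, exact suffix-prefix overlaps, no contained reads, and an acyclic graph, and it uses a brute-force reachability check rather than an efficient algorithm. All names and reads are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest proper suffix of a equal to a prefix of b.

    Returns 0 when no overlap of at least min_len exists.
    """
    for size in range(min(len(a), len(b)) - 1, min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def build_overlap_graph(reads, min_len):
    """Map each read name to the set of reads it overlaps on its right side."""
    graph = {name: set() for name in reads}
    for a_name, a in reads.items():
        for b_name, b in reads.items():
            if a_name != b_name and overlap(a, b, min_len) > 0:
                graph[a_name].add(b_name)
    return graph


def transitive_reduction(graph):
    """Return a copy of an acyclic directed graph without transitive edges.

    An edge u -> w is dropped when w is also reachable from u through
    another direct successor of u.
    """
    def reachable_from(start):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reduced = {}
    for u, successors in graph.items():
        indirect = set()
        for v in successors:
            indirect |= reachable_from(v)
        reduced[u] = set(successors) - indirect
    return reduced


# Three reads tiling the sequence ACGTACGGTT; r1 -> r3 is implied by
# r1 -> r2 -> r3 and is therefore transitive.
reads = {"r1": "ACGTAC", "r2": "GTACGG", "r3": "ACGGTT"}
graph = build_overlap_graph(reads, min_len=2)
print(transitive_reduction(graph))
```

After reduction each read keeps only the edge to its nearest neighbour, so the layout r1, r2, r3 can be read off as a simple path.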
Tiwari, Jitendra N.; Nath, Krishna; Kumar, Susheel; Tiwari, Rajanish N.; Kemp, K. Christian; Le, Nhien H.; Youn, Duck Hyun; Lee, Jae Sung; Kim, Kwang S.
2013-01-01
Nanosize platinum clusters with small diameters of 2–4 nm are known to be excellent catalysts for the oxygen reduction reaction. The inherent catalytic activity of smaller platinum clusters has not yet been reported due to a lack of preparation methods to control their size (<2 nm). Here we report the synthesis of platinum clusters (diameter ≤1.4 nm) deposited on genomic double-stranded DNA–graphene oxide composites, and their high-performance electrocatalysis of the oxygen reduction reaction. The electrochemical behaviour, characterized by oxygen reduction reaction onset potential, half-wave potential, specific activity, mass activity, accelerated durability test (10,000 cycles) and cyclic voltammetry stability (10,000 cycles) is attributed to the strong interaction between the nanosize platinum clusters and the DNA–graphene oxide composite, which induces modulation in the electronic structure of the platinum clusters. Furthermore, we show that the platinum cluster/DNA–graphene oxide composite possesses notable environmental durability and stability, vital for high-performance fuel cells and batteries. PMID:23900456
The Nature and Evolution of Genomic Diversity in the Mycobacterium tuberculosis Complex.
Brites, Daniela; Gagneux, Sebastien
2017-01-01
The Mycobacterium tuberculosis Complex (MTBC) consists of a clonal group of several mycobacterial lineages pathogenic to a range of different mammalian hosts. In this chapter, we discuss the origins and the evolutionary forces shaping the genomic diversity of the human-adapted MTBC. Advances in whole-genome sequencing have brought invaluable insights into the macro-evolution of the MTBC, the biogeographical distribution of the different MTBC lineages, and the phylogenetic relationships between these lineages. Moreover, micro-evolutionary processes are starting to be better understood, including those influencing bacterial mutation rates and those governing the fate of new mutations emerging within patients during treatment. Current genomic and epidemiological evidence reflects the fact that, through ecological specialization, the MTBC affecting humans became an obligate and extremely well-adapted human pathogen. Identifying the adaptive traits of human-adapted MTBC and unraveling the bacterial loci that interact with human genomic variation might help identify new targets for developing better vaccines and designing more effective treatments.
Redefining the genetics of Murine Gammaherpesvirus 68 via transcriptome-based annotation
Johnson, L. Steven; Willert, Erin K.; Virgin, Herbert W.
2010-01-01
Summary Viral genetic studies often focus on large open reading frames (ORFs) identified during genome annotation (ORF-based annotation). Here we provide a tool and software set for defining gene expression by murine gammaherpesvirus 68 (γHV68) nucleotide-by-nucleotide across the 119,450 basepair (bp) genome. These tools allowed us to determine that viral RNA expression was significantly more complex than predicted from ORF-based annotation, including over 73,000 nucleotides of unexpected transcription within 30 expressed genomic regions (EGRs). Approximately 90% of this RNA expression was antisense to genomic regions containing known large ORFs. We verified the existence of novel transcripts in three EGRs using standard methods to validate the approach and determined which parts of the transcriptome depend on protein or viral DNA synthesis. This redefines the genetic map of γHV68, indicates that herpesviruses contain significantly more genetic complexity than predicted from ORF-based genome annotations, and provides new tools and approaches for viral genetic studies. PMID:20542255
Transforming the practice of medicine using genomics
Ginsburg, Geoffrey S.; McCarthy, Jeanette J.
2009-01-01
Recent studies have demonstrated the use of genomic data, particularly gene expression signatures, as clinical prognostic factors in complex diseases. Such studies herald the future for genomic medicine and the opportunity for personalized prognosis in a variety of clinical contexts that utilize genome-scale molecular information. Several key areas represent logical and critical next steps in the use of complex genomic profiling data towards the goal of personalized medicine. First, analyses should be geared toward the development of molecular profiles that predict future events, such as major clinical events or the response, resistance, or adverse reaction to therapy. Second, these must move into actual clinical practice by forming the basis for the next generation of clinical trials that will employ these methodologies to stratify patients. Lastly, there remain formidable challenges in the translation of genomic technologies into clinical medicine that will need to be addressed: professional and public education, health outcomes research, reimbursement, regulatory oversight and privacy protection. PMID:22461094
Mapping and phasing of structural variation in patient genomes using nanopore sequencing.
Cretu Stancu, Mircea; van Roosmalen, Markus J; Renkens, Ivo; Nieboer, Marleen M; Middelkamp, Sjors; de Ligt, Joep; Pregno, Giulia; Giachino, Daniela; Mandrile, Giorgia; Espejo Valle-Inclan, Jose; Korzelius, Jerome; de Bruijn, Ewart; Cuppen, Edwin; Talkowski, Michael E; Marschall, Tobias; de Ridder, Jeroen; Kloosterman, Wigard P
2017-11-06
Despite improvements in genomics technology, the detection of structural variants (SVs) from short-read sequencing still poses challenges, particularly for complex variation. Here we analyse the genomes of two patients with congenital abnormalities using the MinION nanopore sequencer and a novel computational pipeline, NanoSV. We demonstrate that nanopore long reads are superior to short reads with regard to detection of de novo chromothripsis rearrangements. The long reads also enable efficient phasing of genetic variations, which we leveraged to determine the parental origin of all de novo chromothripsis breakpoints and to resolve the structure of these complex rearrangements. Additionally, genome-wide surveillance of inherited SVs reveals novel variants missed in short-read data sets, a large proportion of which are retrotransposon insertions. We provide a first exploration of patient genome sequencing with a nanopore sequencer and demonstrate the value of long-read sequencing in mapping and phasing of SVs for both clinical and research applications.
On the Structural Plasticity of the Human Genome: Chromosomal Inversions Revisited
Alves, Joao M; Lopes, Alexandra M; Chikhi, Lounès; Amorim, António
2012-01-01
With the aid of novel and powerful molecular biology techniques, recent years have witnessed a dramatic increase in the number of studies reporting the involvement of complex structural variants in several genomic disorders. In fact, with the discovery of Copy Number Variants (CNVs) and other forms of unbalanced structural variation, much attention has been directed to the detection and characterization of such rearrangements, as well as the identification of the mechanisms involved in their formation. However, it has long been appreciated that chromosomes can undergo other forms of structural changes, balanced rearrangements, that do not involve quantitative variation of genetic material. Indeed, a particular subtype of balanced rearrangement, inversions, was recently found to be far more common than had been predicted from traditional cytogenetics. Chromosomal inversions alter the orientation of a specific genomic sequence and, unless involving breaks in coding or regulatory regions (and, disregarding complex trans effects, in their close vicinity), appear to be phenotypically silent. Such a surprising finding, which is difficult to reconcile with the classical interpretation of inversions as a mechanism causing subfertility (and ultimately reproductive isolation), motivated a new series of theoretical and empirical studies dedicated to understanding their role in human genome evolution and to exploring their possible association with complex genetic disorders. With this review, we attempt to describe the latest methodological improvements in inversion detection at a genome-wide level, while exploring some of the possible implications of inversion rearrangements for the evolution of the human genome. PMID:23730202
Genetic Complexity and Quantitative Trait Loci Mapping of Yeast Morphological Traits
Nogami, Satoru; Ohya, Yoshikazu; Yvert, Gaël
2007-01-01
Functional genomics relies on two essential parameters: the sensitivity of phenotypic measures and the power to detect genomic perturbations that cause phenotypic variations. In model organisms, two types of perturbations are widely used. Artificial mutations can be introduced in virtually any gene and allow the systematic analysis of gene function via mutant fitness. Alternatively, natural genetic variations can be associated with particular phenotypes via genetic mapping. However, the access to genome manipulation and breeding provided by model organisms is sometimes counterbalanced by phenotyping limitations. Here we investigated the natural genetic diversity of Saccharomyces cerevisiae cellular morphology using a very sensitive high-throughput imaging platform. We quantified 501 morphological parameters in over 50,000 yeast cells from a cross between two wild-type divergent backgrounds. Extensive morphological differences were found between these backgrounds. The genetic architecture of the traits was complex, with evidence of both epistasis and transgressive segregation. We mapped quantitative trait loci (QTL) for 67 traits and discovered 364 correlations between trait segregation and inheritance of gene expression levels. We validated one QTL by the replacement of a single base in the genome. This study illustrates the natural diversity and complexity of cellular traits among natural yeast strains and provides an ideal framework for a genetical genomics dissection of multiple traits. Our results did not overlap with results previously obtained from systematic deletion strains, showing that both approaches are necessary for the functional exploration of genomes. PMID:17319748
2012-01-01
Background The genome of Mycobacterium avium subspecies paratuberculosis (MAP) is remarkably homogeneous among the genomes of bovine, human and wildlife isolates. However, previous work in our laboratories with the bovine K-10 strain has revealed substantial differences compared to sheep isolates. To systematically characterize all genomic differences that may be associated with the specific hosts, we sequenced the genomes of three U.S. sheep isolates and also obtained an optical map. Results Our analysis of one of the isolates, MAP S397, revealed a genome 4.8 Mb in size with 4,700 open reading frames (ORFs). Comparative analysis of the MAP S397 isolate showed it acquired approximately 10 large sequence regions that are shared with the human M. avium subsp. hominissuis strain 104 and lost 2 large regions that are present in the bovine strain. In addition, optical mapping defined the presence of 7 large inversions between the bovine and ovine genomes (~ 2.36 Mb). Whole-genome sequencing of 2 additional sheep strains of MAP (JTC1074 and JTC7565) further confirmed genomic homogeneity of the sheep isolates despite the presence of polymorphisms on the nucleotide level. Conclusions Comparative sequence analysis employed here provided a better understanding of the host association, evolution of members of the M. avium complex and could help in deciphering the phenotypic differences observed among sheep and cattle strains of MAP. A similar approach based on whole-genome sequencing combined with optical mapping could be employed to examine closely related pathogens. We propose an evolutionary scenario for M. avium complex strains based on these genome sequences. PMID:22409516
Edwards, Stefan M.; Sørensen, Izel F.; Sarup, Pernille; Mackay, Trudy F. C.; Sørensen, Peter
2016-01-01
Predicting individual quantitative trait phenotypes from high-resolution genomic polymorphism data is important for personalized medicine in humans, plant and animal breeding, and adaptive evolution. However, this is difficult for populations of unrelated individuals when the number of causal variants is low relative to the total number of polymorphisms and causal variants individually have small effects on the traits. We hypothesized that mapping molecular polymorphisms to genomic features such as genes and their gene ontology categories could increase the accuracy of genomic prediction models. We developed a genomic feature best linear unbiased prediction (GFBLUP) model that implements this strategy and applied it to three quantitative traits (startle response, starvation resistance, and chill coma recovery) in the unrelated, sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel. Our results indicate that subsetting markers based on genomic features increases the predictive ability relative to the standard genomic best linear unbiased prediction (GBLUP) model. Both models use all markers, but GFBLUP allows differential weighting of the individual genetic marker relationships, whereas GBLUP weighs the genetic marker relationships equally. Simulation studies show that it is possible to further increase the accuracy of genomic prediction for complex traits using this model, provided the genomic features are enriched for causal variants. Our GFBLUP model using prior information on genomic features enriched for causal variants can increase the accuracy of genomic predictions in populations of unrelated individuals and provides a formal statistical framework for leveraging and evaluating information across multiple experimental studies to provide novel insights into the genetic architecture of complex traits. PMID:27235308
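The contrast between equal and differential weighting of marker relationships can be sketched numerically. The Python example below is an interpretation of the abstract, not the authors' implementation. It assumes a simple relationship matrix G = ZZ'/m built from column-centered genotypes, a hypothetical split of markers into a feature set and a remainder, and made-up genotypes for inbred lines coded 0 or 2. It only shows that the all-marker matrix equals the marker-count-weighted sum of the two partial matrices, whereas a feature-based model would be free to choose other weights through separate variance components. Variance component estimation and prediction are deliberately left out.

```python
def centered(genotypes):
    """Center each marker column (allele counts) on its mean."""
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def relationship(z, columns):
    """G = Z Z^T / m using only the chosen marker columns."""
    columns = list(columns)
    n, m = len(z), len(columns)
    return [[sum(z[i][j] * z[k][j] for j in columns) / m for k in range(n)]
            for i in range(n)]


def weighted_sum(g_a, w_a, g_b, w_b):
    """Element-wise w_a * G_a + w_b * G_b."""
    return [[w_a * a + w_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(g_a, g_b)]


# Hypothetical genotypes: 4 inbred lines x 5 markers, coded 0 or 2.
genotypes = [
    [0, 2, 2, 0, 2],
    [2, 2, 0, 0, 2],
    [0, 0, 2, 2, 0],
    [2, 0, 0, 2, 0],
]
feature = [0, 1]   # markers assumed to lie in the genomic feature
rest = [2, 3, 4]   # all remaining markers

z = centered(genotypes)
g_all = relationship(z, range(5))
g_f = relationship(z, feature)
g_r = relationship(z, rest)

# Equal weighting of markers: each partial matrix contributes in
# proportion to its marker count, which reproduces g_all.
equal = weighted_sum(g_f, len(feature) / 5, g_r, len(rest) / 5)

# Differential weighting (arbitrary illustrative weights): the feature
# markers are given more influence than their count alone would imply.
differential = weighted_sum(g_f, 0.8, g_r, 0.2)

print(max(abs(a - b) for ra, rb in zip(g_all, equal) for a, b in zip(ra, rb)))
```

The printed maximum difference is zero up to floating-point error, which is the sense in which the standard model treats all marker relationships alike.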
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies
2014-01-01
Background The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly was used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
Enantioselective Reduction of Ketones and Imines Catalyzed by (CN-Box)Re(V)-Oxo Complexes
Nolin, Kristine A.; Ahn, Richard W.; Kobayashi, Yusuke; Kennedy-Smith, Joshua J.
2012-01-01
The development and application of chiral, non-racemic Re(V)-oxo complexes to the enantioselective reduction of prochiral ketones is described. In addition to the enantioselective reduction of prochiral ketones, we report the application of these complexes to (1) a tandem Meyer-Schuster rearrangement/reduction to access enantioenriched allylic alcohols and (2) the enantioselective reduction of imines. PMID:20623567
Tamarit, Daniel; Ellegaard, Kirsten M.; Wikander, Johan; Olofsson, Tobias; Vásquez, Alejandra; Andersson, Siv G.E.
2015-01-01
Lactobacillus kunkeei is the most abundant bacterial species in the honey crop and food products of honeybees. The 16S rRNA genes of strains isolated from different bee species are nearly identical in sequence and therefore inadequate as markers for studies of coevolutionary patterns. Here, we have compared the 1.5 Mb genomes of ten L. kunkeei strains isolated from all recognized Apis species and another two strains from Meliponini species. A gene flux analysis, including previously sequenced Lactobacillus species as outgroups, indicated the influence of reductive evolution. The genome architecture is unique in that vertically inherited core genes are located near the terminus of replication, whereas genes for secreted proteins and putative host-adaptive traits are located near the origin of replication. We suggest that these features have resulted from a genome-wide loss of genes, with integrations of novel genes mostly occurring in regions flanking the origin of replication. The phylogenetic analyses showed that the bacterial topology was incongruent with the host topology, and that strains of the same microcluster have recombined frequently across the host species barriers, arguing against codiversification. Multiple genotypes were recovered in the individual hosts and transfers of mobile elements could be demonstrated for strains isolated from the same host species. Unlike other bacteria with small genomes, short generation times and multiple rRNA operons suggest that L. kunkeei evolves under selection for rapid growth in its natural growth habitat. The results provide an extended framework for reductive genome evolution and functional genome organization in bacteria. PMID:25953738
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kublanov, Ilya V.; Sigalova, Olga M.; Gavrilov, Sergey N.; ...
2017-02-20
The genome of Caldithrix abyssi, the first cultivated representative of a phylum-level bacterial lineage, was sequenced within the framework of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project. The genomic analysis revealed mechanisms allowing this anaerobic bacterium to ferment peptides or to implement nitrate reduction with acetate or molecular hydrogen as electron donors. The genome encoded five different [NiFe]- and [FeFe]-hydrogenases, one of which, a group 1 [NiFe]-hydrogenase, is presumably involved in lithoheterotrophic growth; three others produce H2 during fermentation, and one is apparently bidirectional. The ability to reduce nitrate is determined by a nitrate reductase of the Nap family, while nitrite reduction to ammonia is presumably catalyzed by an octaheme cytochrome c nitrite reductase εHao. The genome contained genes of respiratory polysulfide/thiosulfate reductase; however, elemental sulfur and thiosulfate were not used as the electron acceptors for anaerobic respiration with acetate or H2, probably due to the lack of the gene of the maturation protein. Nevertheless, elemental sulfur and thiosulfate stimulated growth on fermentable substrates (peptides), being reduced to sulfide, most probably through the action of the cytoplasmic sulfide dehydrogenase and/or NAD(P)-dependent [NiFe]-hydrogenase (sulfhydrogenase) encoded by the genome. Surprisingly, the genome of this anaerobic microorganism encoded all genes for cytochrome c oxidase; however, its maturation machinery seems to be non-operational due to genomic rearrangements of supplementary genes. Despite the fact that sugars were not among the substrates reported when C. abyssi was first described, our genomic analysis revealed multiple genes of glycoside hydrolases, and some of them were predicted to be secreted. This finding aided in bringing out four carbohydrates that supported the growth of C. abyssi: starch, cellobiose, glucomannan and xyloglucan. 
The genomic analysis demonstrated the ability of C. abyssi to synthesize nucleotides and most amino acids and vitamins. Finally, the genomic sequence allowed us to perform a phylogenomic analysis, based on 38 protein sequences, which confirmed the deep branching of this lineage and justified the proposal of a novel phylum Calditrichaeota.
acdc – Automated Contamination Detection and Confidence estimation for single-cell genome data
Lux, Markus; Kruger, Jan; Rinke, Christian; ...
2016-12-20
A major obstacle in single-cell sequencing is sample contamination with foreign DNA. To guarantee clean genome assemblies and to prevent the introduction of contamination into public databases, considerable quality control efforts are put into post-sequencing analysis. Contamination screening generally relies on reference-based methods such as database alignment or marker gene search, which limits the set of detectable contaminants to organisms with closely related reference species. As genomic coverage in the tree of life is highly fragmented, there is an urgent need for a reference-free methodology for contaminant identification in sequence data. We present acdc, a tool specifically developed to aid the quality control process of genomic sequence data. By combining supervised and unsupervised methods, it reliably detects both known and de novo contaminants. First, 16S rRNA gene prediction and the inclusion of ultrafast exact alignment techniques allow sequence classification using existing knowledge from databases. Second, reference-free inspection is enabled by the use of state-of-the-art machine learning techniques that include fast, non-linear dimensionality reduction of oligonucleotide signatures and subsequent clustering algorithms that automatically estimate the number of clusters. The latter also enables the removal of any contaminant, yielding a clean sample. Furthermore, given the data complexity and the ill-posedness of clustering, acdc employs bootstrapping techniques to provide statistically profound confidence values. Tested on a large number of samples from diverse sequencing projects, our software is able to quickly and accurately identify contamination. Results are displayed in an interactive user interface. Acdc can be run from the web as well as a dedicated command line application, which allows easy integration into large sequencing project analysis workflows. Acdc can reliably detect contamination in single-cell genome data.
In addition to database-driven detection, it complements existing tools by its unsupervised techniques, which allow for the detection of de novo contaminants. Our contribution has the potential to drastically reduce the amount of resources put into these processes, particularly in the context of limited availability of reference species. As single-cell genome data continues to grow rapidly, acdc adds to the toolkit of crucial quality assurance tools.
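One ingredient named in this abstract, the oligonucleotide signature, can be illustrated as a normalized k-mer frequency vector computed per contig. The following is only a minimal sketch: the choice of k = 4 and the function name `kmer_signature` are assumptions made for illustration, and this is not acdc's own implementation.

```python
from collections import Counter
from itertools import product


def kmer_signature(seq: str, k: int = 4) -> list[float]:
    """Return normalized k-mer frequencies over ACGT, in lexicographic k-mer order.

    Hypothetical helper for illustration only; k = 4 is an assumed default.
    """
    seq = seq.upper()
    # All 4**k possible k-mers, e.g. AAAA, AAAC, ..., TTTT for k = 4.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing N or other symbols are not in `kmers` and are ignored.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]
```

Vectors of this kind, one per contig, are the sort of input that a dimensionality reduction and clustering step would then operate on.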
acdc – Automated Contamination Detection and Confidence estimation for single-cell genome data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lux, Markus; Kruger, Jan; Rinke, Christian
A major obstacle in single-cell sequencing is sample contamination with foreign DNA. To guarantee clean genome assemblies and to prevent the introduction of contamination into public databases, considerable quality control efforts are put into post-sequencing analysis. Contamination screening generally relies on reference-based methods such as database alignment or marker gene search, which limits the set of detectable contaminants to organisms with closely related reference species. As genomic coverage in the tree of life is highly fragmented, there is an urgent need for a reference-free methodology for contaminant identification in sequence data. We present acdc, a tool specifically developed to aid the quality control process of genomic sequence data. By combining supervised and unsupervised methods, it reliably detects both known and de novo contaminants. First, 16S rRNA gene prediction and the inclusion of ultrafast exact alignment techniques allow sequence classification using existing knowledge from databases. Second, reference-free inspection is enabled by the use of state-of-the-art machine learning techniques that include fast, non-linear dimensionality reduction of oligonucleotide signatures and subsequent clustering algorithms that automatically estimate the number of clusters. The latter also enables the removal of any contaminant, yielding a clean sample. Furthermore, given the data complexity and the ill-posedness of clustering, acdc employs bootstrapping techniques to provide statistically profound confidence values. Tested on a large number of samples from diverse sequencing projects, our software is able to quickly and accurately identify contamination. Results are displayed in an interactive user interface. Acdc can be run from the web as well as a dedicated command line application, which allows easy integration into large sequencing project analysis workflows. Acdc can reliably detect contamination in single-cell genome data.
In addition to database-driven detection, it complements existing tools by its unsupervised techniques, which allow for the detection of de novo contaminants. Our contribution has the potential to drastically reduce the amount of resources put into these processes, particularly in the context of limited availability of reference species. As single-cell genome data continues to grow rapidly, acdc adds to the toolkit of crucial quality assurance tools.
Zhang, Jin; Ruhlman, Tracey A; Sabir, Jamal S M; Blazier, John Chris; Weng, Mao-Lun; Park, Seongjun; Jansen, Robert K
2016-02-17
Disruption of DNA replication, recombination, and repair (DNA-RRR) systems has been hypothesized to cause highly elevated nucleotide substitution rates and genome rearrangements in the plastids of angiosperms, but this theory remains untested. To investigate nuclear-plastid genome (plastome) coevolution in Geraniaceae, four different measures of plastome complexity (rearrangements, repeats, nucleotide insertions/deletions, and substitution rates) were evaluated along with substitution rates of 12 nuclear-encoded, plastid-targeted DNA-RRR genes from 27 Geraniales species. Significant correlations were detected for nonsynonymous (dN) but not synonymous (dS) substitution rates for three DNA-RRR genes (uvrB/C, why1, and gyrA) supporting a role for these genes in accelerated plastid genome evolution in Geraniaceae. Furthermore, correlation between dN of uvrB/C and plastome complexity suggests the presence of nucleotide excision repair system in plastids. Significant correlations were also detected between plastome complexity and 13 of the 90 nuclear-encoded organelle-targeted genes investigated. Comparisons revealed significant acceleration of dN in plastid-targeted genes of Geraniales relative to Brassicales suggesting this correlation may be an artifact of elevated rates in this gene set in Geraniaceae. Correlation between dN of plastid-targeted DNA-RRR genes and plastome complexity supports the hypothesis that the aberrant patterns in angiosperm plastome evolution could be caused by dysfunction in DNA-RRR systems. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Enhancer-Derived lncRNAs Regulate Genome Architecture: Fact or Fiction?
Fanucchi, Stephanie; Mhlanga, Musa M
2017-06-01
How does the non-coding portion of the genome contribute to the regulation of genome architecture? A recent paper by Tan et al. focuses on the relationship between cis-acting complex-trait-associated lincRNAs and the formation of chromosomal contacts in topologically associating domains (TADs). Copyright © 2017 Elsevier Ltd. All rights reserved.
Reference quality assembly of the 3.5 Gb genome of Capsicum annuum from a single linked-read library
USDA-ARS's Scientific Manuscript database
Linked-Read sequencing technology has recently been employed successfully for de novo assembly of multiple human genomes, however the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5 gigabase (Gb) diploid pepper (Cap...
Draft Genome Sequence of Fish Pathogen Aeromonas bestiarum GA97-22.
Kumru, Salih; Tekedar, Hasan C; Griffin, Matt J; Waldbieser, Geoffrey C; Liles, Mark R; Sonstegard, Tad; Schroeder, Steven G; Lawrence, Mark L; Karsi, Attila
2018-06-14
Aeromonas bestiarum is a Gram-negative mesophilic motile bacterium causing acute hemorrhagic septicemia or chronic skin ulcers in fish. Here, we report the draft genome sequence of A. bestiarum strain GA97-22, which was isolated from rainbow trout in 1997. This genome sequence will improve our understanding of the complex taxonomy of motile aeromonads.
Genomic signatures of evolutionary transitions from solitary to group living
Kapheim, Karen M.; Pan, Hailin; Li, Cai; Salzberg, Steven L.; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M.; Hudson, Matthew E.; Venkat, Aarti; Fischman, Brielle J.; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D.; Kemp, William P.; Bosch, Jordi; Waterhouse, Robert M.; Zdobnov, Evgeny M.; Stolle, Eckart; Kraus, F. Bernhard; Helbing, Sophie; Moritz, Robin F. A.; Glastad, Karl M.; Hunt, Brendan G.; Goodisman, Michael A. D.; Hauser, Frank; Grimmelikhuijzen, Cornelis J. P.; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D.; Barribeau, Seth M.; Johnson, Reed M.; Massey, Jonathan H.; Southey, Bruce R.; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F.; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J. Spencer; Hanrahan, Shawn J.; Kocher, Sarah D.; Wang, Jun; Robinson, Gene E.; Zhang, Guojie
2017-01-01
The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. PMID:25977371
Social evolution. Genomic signatures of evolutionary transitions from solitary to group living.
Kapheim, Karen M; Pan, Hailin; Li, Cai; Salzberg, Steven L; Puiu, Daniela; Magoc, Tanja; Robertson, Hugh M; Hudson, Matthew E; Venkat, Aarti; Fischman, Brielle J; Hernandez, Alvaro; Yandell, Mark; Ence, Daniel; Holt, Carson; Yocum, George D; Kemp, William P; Bosch, Jordi; Waterhouse, Robert M; Zdobnov, Evgeny M; Stolle, Eckart; Kraus, F Bernhard; Helbing, Sophie; Moritz, Robin F A; Glastad, Karl M; Hunt, Brendan G; Goodisman, Michael A D; Hauser, Frank; Grimmelikhuijzen, Cornelis J P; Pinheiro, Daniel Guariz; Nunes, Francis Morais Franco; Soares, Michelle Prioli Miranda; Tanaka, Érica Donato; Simões, Zilá Luz Paulino; Hartfelder, Klaus; Evans, Jay D; Barribeau, Seth M; Johnson, Reed M; Massey, Jonathan H; Southey, Bruce R; Hasselmann, Martin; Hamacher, Daniel; Biewer, Matthias; Kent, Clement F; Zayed, Amro; Blatti, Charles; Sinha, Saurabh; Johnston, J Spencer; Hanrahan, Shawn J; Kocher, Sarah D; Wang, Jun; Robinson, Gene E; Zhang, Guojie
2015-06-05
The evolution of eusociality is one of the major transitions in evolution, but the underlying genomic changes are unknown. We compared the genomes of 10 bee species that vary in social complexity, representing multiple independent transitions in social evolution, and report three major findings. First, many important genes show evidence of neutral evolution as a consequence of relaxed selection with increasing social complexity. Second, there is no single road map to eusociality; independent evolutionary transitions in sociality have independent genetic underpinnings. Third, though clearly independent in detail, these transitions do have similar general features, including an increase in constrained protein evolution accompanied by increases in the potential for gene regulation and decreases in diversity and abundance of transposable elements. Eusociality may arise through different mechanisms each time, but would likely always involve an increase in the complexity of gene networks. Copyright © 2015, American Association for the Advancement of Science.
Rahman, Syed Asad; Singh, Yadvir; Kohli, Sakshi; Ahmad, Javeed; Ehtesham, Nasreen Z; Tyagi, Anil K; Hasnain, Seyed E
2014-11-04
Mycobacterial evolution involves various processes, such as genome reduction, gene cooption, and critical gene acquisition. Our comparative genome size analysis of 44 mycobacterial genomes revealed that the nonpathogenic (NP) genomes were bigger than those of opportunistic (OP) or totally pathogenic (TP) mycobacteria, with the TP genomes being smaller yet variable in size--their genomic plasticity reflected their ability to evolve and survive under various environmental conditions. From the 44 mycobacterial species, 13 species, representing TP, OP, and NP, were selected for genomic-relatedness analyses. Analysis of homologous protein-coding genes shared between Mycobacterium indicus pranii (NP), Mycobacterium intracellulare ATCC 13950 (OP), and Mycobacterium tuberculosis H37Rv (TP) revealed that 4,995 (i.e., ~95%) M. indicus pranii proteins have homology with M. intracellulare, whereas the homologies among M. indicus pranii, M. intracellulare ATCC 13950, and M. tuberculosis H37Rv were significantly lower. A total of 4,153 (~79%) M. indicus pranii proteins and 4,093 (~79%) M. intracellulare ATCC 13950 proteins exhibited homology with the M. tuberculosis H37Rv proteome, while 3,301 (~82%) and 3,295 (~82%) M. tuberculosis H37Rv proteins showed homology with M. indicus pranii and M. intracellulare ATCC 13950 proteomes, respectively. Comparative metabolic pathway analyses of TP/OP/NP mycobacteria showed enzymatic plasticity between M. indicus pranii (NP) and M. intracellulare ATCC 13950 (OP), Mycobacterium avium 104 (OP), and M. tuberculosis H37Rv (TP). Mycobacterium tuberculosis seems to have acquired novel alternate pathways with possible roles in metabolism, host-pathogen interactions, virulence, and intracellular survival, and by implication some of these could be potential drug targets.
The complete sequence analysis of Mycobacterium indicus pranii, a novel species of Mycobacterium shown earlier to have strong immunomodulatory properties and currently in use for the treatment of leprosy, places it evolutionarily at the point of transition to pathogenicity. With the purpose of establishing the importance of M. indicus pranii in providing insight into the virulence mechanism of tuberculous and nontuberculous mycobacteria, we carried out comparative genomic and proteomic analyses of 44 mycobacterial species representing nonpathogenic (NP), opportunistic (OP), and totally pathogenic (TP) mycobacteria. Our results clearly placed M. indicus pranii as an ancestor of the M. avium complex. Analyses of comparative metabolic pathways between M. indicus pranii (NP), M. tuberculosis (TP), and M. intracellulare (OP) pointed to the presence of novel alternative pathways in M. tuberculosis with implications for pathogenesis and survival in the human host and identification of new drug targets. Copyright © 2014 Rahman et al.
Telenti, Amalio; Ayday, Erman; Hubaux, Jean Pierre
2014-01-01
The storage of greater numbers of exomes or genomes raises the question of loss of privacy for the individual and for families if genomic data are not properly protected. Access to genome data may result from a personal decision to disclose, or from gaps in protection. In either case, revealing genome data has consequences beyond the individual, as it compromises the privacy of family members. Increasing availability of genome data linked or linkable to metadata through online social networks and services adds one additional layer of complexity to the protection of genome privacy. The field of computer science and information technology offers solutions to secure genomic data so that individuals, medical personnel or researchers can access only the subset of genomic information required for healthcare or dedicated studies. PMID:25254097
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.
Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.
Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A; Awosika, Joy; Briska, Adam; Ptashkin, Ryan N; Wagner, Trevor; Rajanna, Chythanya; Tsang, Hsinyi; Johnson, Shannon L; Mokashi, Vishwesh P; Chain, Patrick S G; Sozhamannan, Shanmuga
2015-01-01
Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.
Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.; ...
2015-03-20
Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.
From integrative genomics to systems genetics in the rat to link genotypes to phenotypes
Moreno-Moral, Aida
2016-01-01
ABSTRACT Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. PMID:27736746
From integrative genomics to systems genetics in the rat to link genotypes to phenotypes.
Moreno-Moral, Aida; Petretto, Enrico
2016-10-01
Complementary to traditional gene mapping approaches used to identify the hereditary components of complex diseases, integrative genomics and systems genetics have emerged as powerful strategies to decipher the key genetic drivers of molecular pathways that underlie disease. Broadly speaking, integrative genomics aims to link cellular-level traits (such as mRNA expression) to the genome to identify their genetic determinants. With the characterization of several cellular-level traits within the same system, the integrative genomics approach evolved into a more comprehensive study design, called systems genetics, which aims to unravel the complex biological networks and pathways involved in disease, and in turn map their genetic control points. The first fully integrated systems genetics study was carried out in rats, and the results, which revealed conserved trans-acting genetic regulation of a pro-inflammatory network relevant to type 1 diabetes, were translated to humans. Many studies using different organisms subsequently stemmed from this example. The aim of this Review is to describe the most recent advances in the fields of integrative genomics and systems genetics applied in the rat, with a focus on studies of complex diseases ranging from inflammatory to cardiometabolic disorders. We aim to provide the genetics community with a comprehensive insight into how the systems genetics approach came to life, starting from the first integrative genomics strategies [such as expression quantitative trait loci (eQTLs) mapping] and concluding with the most sophisticated gene network-based analyses in multiple systems and disease states. Although not limited to studies that have been directly translated to humans, we will focus particularly on the successful investigations in the rat that have led to primary discoveries of genes and pathways relevant to human disease. © 2016. Published by The Company of Biologists Ltd.
Lesion complexity drives age related cancer susceptibility in human mammary epithelial cells
Sridharan, Deepa M.; Enerio, Shiena; Stampfer, Martha M.; ...
2017-02-28
Exposures to various DNA damaging agents can deregulate a wide array of critical mechanisms that maintain genome integrity. It is unclear how these processes are impacted by one's age at the time of exposure and the complexity of the DNA lesion. To clarify this, we employed radiation as a tool to generate simple and complex lesions in normal primary human mammary epithelial cells derived from women of various ages. We hypothesized that genomic instability in the progeny of older cells exposed to complex damages will be exacerbated by age-associated deterioration in function and accentuate age-related cancer predisposition. Centrosome aberrations and changes in stem cell numbers were examined to assess cancer susceptibility. Our data show that the frequency of centrosome aberrations proportionately increases with age following complex damage causing exposures. However, a dose-dependent increase in stem cell numbers was independent of both age and the nature of the insult. Phospho-protein signatures provide mechanistic clues to signaling networks implicated in these effects. Together these studies suggest that complex damage can threaten the genome stability of the stem cell population in older people. Propagation of this instability is subject to influence by the microenvironment and will ultimately define cancer risk in the older population.
Lesion complexity drives age related cancer susceptibility in human mammary epithelial cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Deepa M.; Enerio, Shiena; Stampfer, Martha M.
Exposures to various DNA damaging agents can deregulate a wide array of critical mechanisms that maintain genome integrity. It is unclear how these processes are impacted by one's age at the time of exposure and the complexity of the DNA lesion. To clarify this, we employed radiation as a tool to generate simple and complex lesions in normal primary human mammary epithelial cells derived from women of various ages. We hypothesized that genomic instability in the progeny of older cells exposed to complex damages will be exacerbated by age-associated deterioration in function and accentuate age-related cancer predisposition. Centrosome aberrations and changes in stem cell numbers were examined to assess cancer susceptibility. Our data show that the frequency of centrosome aberrations proportionately increases with age following complex damage causing exposures. However, a dose-dependent increase in stem cell numbers was independent of both age and the nature of the insult. Phospho-protein signatures provide mechanistic clues to signaling networks implicated in these effects. Together these studies suggest that complex damage can threaten the genome stability of the stem cell population in older people. Propagation of this instability is subject to influence by the microenvironment and will ultimately define cancer risk in the older population.
Formation of stable and functional HIV-1 nucleoprotein complexes in vitro.
Tanchou, V; Gabus, C; Rogemond, V; Darlix, J L
1995-10-06
HIV genomic RNA resides within the nucleocapsid, in the interior of the virus, which serves to protect the RNA against nuclease degradation and to promote its reverse transcription. To investigate the role of nucleocapsid protein (NCp7) in the stability and replication of genomic RNA within the nucleocapsid, we used NCp7, reverse transcriptase (RT) and RNAs representing the 5' and 3' regions of the genome to reconstitute functional HIV-1 nucleocapsids. The nucleoprotein complexes generated in vitro were found to be stable, which, according to biochemical and genetic data, probably results from the tight binding of NCp7 molecules to the RNA and strong NCp7/NCp7 interactions. The nucleoprotein complexes efficiently protected viral RNA against RNase degradation and, at the same time, promoted viral DNA synthesis by RT. DNA strand transfer from the 5' to the 3' RNA template was very efficient in nucleoprotein complexes formed in the presence of both RNAs, but not when the RNAs were in separate complexes. These results indicate that the in vitro reconstituted HIV-1 nucleoprotein complexes function like virion nucleocapsids and thus provide a way to study at the molecular level this viral substructure and the synthesis of proviral DNA, and to search for new anti-HIV agents.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grigoriev, Igor V.
2011-03-14
Genomes of energy and environment fungi are the focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics leads to developing a parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here.
Chironomid midges (Diptera, Chironomidae) show extremely small genome sizes.
Cornette, Richard; Gusev, Oleg; Nakahara, Yuichi; Shimura, Sachiko; Kikawada, Takahiro; Okuda, Takashi
2015-06-01
Chironomid midges (Diptera; Chironomidae) are found in various environments from the high Arctic to the Antarctic, including temperate and tropical regions. In many freshwater habitats, members of this family are among the most abundant invertebrates. In the present study, the genome sizes of 25 chironomid species were determined by flow cytometry and the resulting C-values ranged from 0.07 to 0.20 pg DNA (i.e. from about 68 to 195 Mbp). These genome sizes were uniformly very small and included, to our knowledge, the smallest genome sizes recorded to date among insects. A small proportion of transposable elements and short intron sizes were suggested to contribute to the reduction of genome sizes in chironomids. We discuss the possible developmental and physiological advantages of having a small genome size and the putative implications for the ecological success of the family Chironomidae.
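The conversion from C-value to genome size quoted in this abstract can be checked with a short calculation. The sketch below assumes the commonly used factor of roughly 978 Mbp per picogram of DNA, which the abstract does not state; the constant and function names are illustrative only.

```python
# Assumed conversion factor (not given in the abstract): 1 pg of DNA is about 978 Mbp.
MBP_PER_PG = 978.0


def c_value_to_mbp(c_value_pg: float) -> float:
    """Convert a C-value in picograms to an approximate genome size in Mbp."""
    return c_value_pg * MBP_PER_PG


if __name__ == "__main__":
    # The two endpoints of the C-value range reported for the 25 chironomid species.
    for pg in (0.07, 0.20):
        print(f"{pg:.2f} pg is roughly {c_value_to_mbp(pg):.1f} Mbp")
```

Under this assumed factor the two endpoints land close to the range of about 68 to 195 Mbp given in the abstract; small differences would be expected from rounding of the reported C-values.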
Use of RecA protein to enrich for homologous genes in a genomic library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taidi-Laskowski, B.; Grumet, F.C.; Tyan, D.
1988-08-25
RecA protein-coated probe has been utilized to enrich genomic digests for desired genes in order to facilitate cloning from genomic libraries. Using a previously cloned HLA-B27 gene as the recA-coated enrichment probe, the authors obtained a mean 108x increase in the ratio of specific to nonspecific plaques in lambda libraries screened for B27 variant alleles of estimated 99% homology to the probe. Class I genes of lesser homology were less enriched. Loss of genomic DNA during the enrichment procedure can, however, restrict application of this technique whenever starting genomic DNA is very limited. Nevertheless, the impressive reduction in cloning effort and material makes recA enrichment a useful new tool for cloning homologous genes from genomic DNA.
Machitani, Mitsuhiro; Sakurai, Fuminori; Wakabayashi, Keisaku; Nakatani, Kosuke; Takayama, Kazuo; Tachibana, Masashi; Mizuguchi, Hiroyuki
2017-01-01
Clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-mediated genome engineering technology is a powerful tool for generation of cells and animals with engineered mutations in their genomes. In order to introduce the CRISPR/Cas9 system into target cells, nonviral and viral vectors are often used; however, such vectors trigger innate immune responses associated with production of type I interferons (IFNs). We have recently demonstrated that type I IFNs inhibit short-hairpin RNA-mediated gene silencing, which led us to hypothesize that type I IFNs may also inhibit CRISPR/Cas9-mediated genome mutagenesis. Here we investigated this hypothesis. A single-strand annealing assay using a reporter plasmid demonstrated that CRISPR/Cas9-mediated cleavage efficiencies of the target double-stranded DNA were significantly reduced by IFNα. A mismatch recognition nuclease-dependent genotyping assay also demonstrated that IFNα reduced insertion or deletion (indel) mutation levels by approximately half. Treatment with IFNα did not alter Cas9 protein expression levels, whereas the copy numbers of guide RNA (gRNA) were significantly reduced by IFNα stimulation. These results indicate that type I IFNs significantly reduce gRNA expression levels following introduction of the CRISPR/Cas9 system in the cells, leading to a reduction in the efficiencies of CRISPR/Cas9-mediated genome mutagenesis. Our findings provide important clues for the achievement of efficient genome engineering using the CRISPR/Cas9 system.
Mimivirus shows dramatic genome reduction after intraamoebal culture
Boyer, Mickaël; Azza, Saïd; Barrassi, Lina; Klose, Thomas; Campocasso, Angélique; Pagnier, Isabelle; Fournous, Ghislain; Borg, Audrey; Robert, Catherine; Zhang, Xinzheng; Desnues, Christelle; Henrissat, Bernard; Rossmann, Michael G.; La Scola, Bernard; Raoult, Didier
2011-01-01
Most phagocytic protist viruses have large particles and genomes as well as many laterally acquired genes that may be associated with a sympatric intracellular life (a community-associated lifestyle with viruses, bacteria, and eukaryotes) and the presence of virophages. By subculturing Mimivirus 150 times in a germ-free amoebal host, we observed the emergence of a bald form of the virus that lacked surface fibers and replicated in a morphologically different type of viral factory. When studying a 0.40-μm filtered cloned particle, we found that its genome size shifted from 1.2 (M1) to 0.993 Mb (M4), mainly due to large deletions occurring at both ends of the genome. Some of the lost genes encode enzymes required for posttranslational modification of the structural viral proteins, such as glycosyltransferases and ankyrin repeat proteins. Proteomic analysis allowed identification of three proteins probably required for the assembly of virus fibers. The genes for two of these were found to be deleted from the M4 virus genome. The proteins associated with fibers are highly antigenic and can be recognized by mouse and human antimimivirus antibodies. In addition, the bald strain (M4) was not able to propagate the Sputnik virophage. Overall, the Mimivirus transition from a sympatric to an allopatric lifestyle was associated with a stepwise genome reduction and the production of a predominantly bald, virophage-resistant strain. The new axenic ecosystem allowed the allopatric Mimivirus to lose unnecessary genes that might be involved in the control of competitors. PMID:21646533
Daaboul, George G; Lopez, Carlos A; Chinnala, Jyothsna; Goldberg, Bennett B; Connor, John H; Ünlü, M Selim
2014-06-24
Rapid, sensitive, and direct label-free capture and characterization of nanoparticles from complex media such as blood or serum will broadly impact medicine and the life sciences. We demonstrate identification of virus particles in complex samples for replication-competent wild-type vesicular stomatitis virus (VSV), defective VSV, and Ebola- and Marburg-pseudotyped VSV with high sensitivity and specificity. Size discrimination of the imaged nanoparticles (virions) allows differentiation between modified viruses having different genome lengths and facilitates a reduction in the counting of nonspecifically bound particles to achieve a limit-of-detection (LOD) of 5 × 10^3 pfu/mL for the Ebola and Marburg VSV pseudotypes. We demonstrate the simultaneous detection of multiple viruses in a single sample (composed of serum or whole blood) for screening applications and uncompromised detection capabilities in samples contaminated with high levels of bacteria. By employing affinity-based capture, size discrimination, and a "digital" detection scheme to count single virus particles, we show that a robust and sensitive virus/nanoparticle sensing assay can be established for targets in complex samples. The nanoparticle microscopy system is termed the Single Particle Interferometric Reflectance Imaging Sensor (SP-IRIS) and is capable of high-throughput and rapid sizing of large numbers of biological nanoparticles on an antibody microarray for research and diagnostic applications.
This proposal develops scalable R / Bioconductor software infrastructure and data resources to integrate complex, heterogeneous, and large cancer genomic experiments. The falling cost of genomic assays facilitates collection of multiple data types (e.g., gene and transcript expression, structural variation, copy number, methylation, and microRNA data) from a set of clinical specimens. Furthermore, substantial resources are now available from large consortium activities like The Cancer Genome Atlas (TCGA).
The Genome and Methylome of a Subsocial Small Carpenter Bee, Ceratina calcarata
Rehan, Sandra M.; Glastad, Karl M.; Lawson, Sarah P.; Hunt, Brendan G.
2016-01-01
Understanding the evolution of animal societies, considered to be a major transition in evolution, is a key topic in evolutionary biology. Recently, new gateways for understanding social evolution have opened up due to advances in genomics, allowing for unprecedented opportunities in studying social behavior on a molecular level. In particular, highly eusocial insect species (caste-containing societies with nonreproductives that care for siblings) have taken center stage in studies of the molecular evolution of sociality. Despite advances in genomic studies of both solitary and eusocial insects, we still lack genomic resources for early insect societies. To study the genetic basis of social traits requires comparison of genomes from a diversity of organisms ranging from solitary to complex social forms. Here we present the genome of a subsocial bee, Ceratina calcarata. This study begins to address the types of genomic changes associated with the earliest origins of simple sociality using the small carpenter bee. Genes associated with lipid transport and DNA recombination have undergone positive selection in C. calcarata relative to other bee lineages. Furthermore, we provide the first methylome of a noneusocial bee. Ceratina calcarata contains the complete enzymatic toolkit for DNA methylation. As in the honey bee and many other holometabolous insects, DNA methylation is targeted to exons. The addition of this genome allows for new lines of research into the genetic and epigenetic precursors to complex social behaviors. PMID:27048475
Vasconcelos, Ana Tereza R.; Ferreira, Henrique B.; Bizarro, Cristiano V.; Bonatto, Sandro L.; Carvalho, Marcos O.; Pinto, Paulo M.; Almeida, Darcy F.; Almeida, Luiz G. P.; Almeida, Rosana; Alves-Filho, Leonardo; Assunção, Enedina N.; Azevedo, Vasco A. C.; Bogo, Maurício R.; Brigido, Marcelo M.; Brocchi, Marcelo; Burity, Helio A.; Camargo, Anamaria A.; Camargo, Sandro S.; Carepo, Marta S.; Carraro, Dirce M.; de Mattos Cascardo, Júlio C.; Castro, Luiza A.; Cavalcanti, Gisele; Chemale, Gustavo; Collevatti, Rosane G.; Cunha, Cristina W.; Dallagiovanna, Bruno; Dambrós, Bibiana P.; Dellagostin, Odir A.; Falcão, Clarissa; Fantinatti-Garboggini, Fabiana; Felipe, Maria S. S.; Fiorentin, Laurimar; Franco, Gloria R.; Freitas, Nara S. A.; Frías, Diego; Grangeiro, Thalles B.; Grisard, Edmundo C.; Guimarães, Claudia T.; Hungria, Mariangela; Jardim, Sílvia N.; Krieger, Marco A.; Laurino, Jomar P.; Lima, Lucymara F. A.; Lopes, Maryellen I.; Loreto, Élgion L. S.; Madeira, Humberto M. F.; Manfio, Gilson P.; Maranhão, Andrea Q.; Martinkovics, Christyanne T.; Medeiros, Sílvia R. B.; Moreira, Miguel A. M.; Neiva, Márcia; Ramalho-Neto, Cicero E.; Nicolás, Marisa F.; Oliveira, Sergio C.; Paixão, Roger F. C.; Pedrosa, Fábio O.; Pena, Sérgio D. J.; Pereira, Maristela; Pereira-Ferrari, Lilian; Piffer, Itamar; Pinto, Luciano S.; Potrich, Deise P.; Salim, Anna C. M.; Santos, Fabrício R.; Schmitt, Renata; Schneider, Maria P. C.; Schrank, Augusto; Schrank, Irene S.; Schuck, Adriana F.; Seuanez, Hector N.; Silva, Denise W.; Silva, Rosane; Silva, Sérgio C.; Soares, Célia M. A.; Souza, Kelly R. L.; Souza, Rangel C.; Staats, Charley C.; Steffens, Maria B. R.; Teixeira, Santuza M. R.; Urmenyi, Turan P.; Vainstein, Marilene H.; Zuccherato, Luciana W.; Simpson, Andrew J. G.; Zaha, Arnaldo
2005-01-01
This work reports the results of analyses of three complete mycoplasma genomes, a pathogenic (7448) and a nonpathogenic (J) strain of the swine pathogen Mycoplasma hyopneumoniae and a strain of the avian pathogen Mycoplasma synoviae; the genome sizes of the three strains were 920,079 bp, 897,405 bp, and 799,476 bp, respectively. These genomes were compared with other sequenced mycoplasma genomes reported in the literature to examine several aspects of mycoplasma evolution. Strain-specific regions, including integrative and conjugal elements, and genome rearrangements and alterations in adhesin sequences were observed in the M. hyopneumoniae strains, and all of these were potentially related to pathogenicity. Genomic comparisons revealed that reduction in genome size implied loss of redundant metabolic pathways, with maintenance of alternative routes in different species. Horizontal gene transfer was consistently observed between M. synoviae and Mycoplasma gallisepticum. Our analyses indicated a likely transfer event of hemagglutinin-coding DNA sequences from M. gallisepticum to M. synoviae. PMID:16077101
Farré, Marta; Robinson, Terence J; Ruiz-Herrera, Aurora
2015-05-01
Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders. © 2015 WILEY Periodicals, Inc.
Grigera, Fernando; Bellacosa, Alfonso; Kenter, Amy L.
2013-01-01
Mismatch repair (MMR) safeguards against genomic instability and is required for efficient Ig class switch recombination (CSR). Methyl CpG binding domain protein 4 (MBD4) binds to MutL homologue 1 (MLH1) and controls the post-transcriptional level of several MMR proteins, including MutS homologue 2 (MSH2). We show that in WT B cells activated for CSR, MBD4 is induced and interacts with MMR proteins, thereby implying a role for MBD4 in CSR. However, CSR is in the normal range in Mbd4 deficient mice deleted for exons 2–5 despite concomitant reduction of MSH2. We show by comparison in Msh2+/− B cells that a two-fold reduction of MSH2 and MBD4 proteins is correlated with impaired CSR. It is therefore surprising that CSR occurs at normal frequencies in the Mbd4 deficient B cells where MSH2 is reduced. We find that a variant Mbd4 transcript spanning exons 1,6–8 is expressed in Mbd4 deficient B cells. This transcript can be ectopically expressed and produces a truncated MBD4 peptide. Thus, the 3′ end of the Mbd4 locus is not silent in Mbd4 deficient B cells and may contribute to CSR. Our findings highlight a complex relationship between MBD4 and MMR proteins in B cells and a potential reconsideration of their role in CSR. PMID:24205214
Sanitá Lima, Matheus; Woods, Laura C; Cartwright, Matthew W; Smith, David Roy
2016-11-01
Not long ago, scientists paid dearly in time, money and skill for every nucleotide that they sequenced. Today, DNA sequencing technologies epitomize the slogan 'faster, easier, cheaper and more', and in many ways, sequencing an entire genome has become routine, even for the smallest laboratory groups. This is especially true for mitochondrial and plastid genomes. Given their relatively small sizes and high copy numbers per cell, organelle DNAs are currently among the most highly sequenced kind of chromosome. But accurately characterizing an organelle genome and the information it encodes can require much more than DNA sequencing and bioinformatics analyses. Organelle genomes can be surprisingly complex and can exhibit convoluted and unconventional modes of gene expression. Unravelling this complexity can demand a wide assortment of experiments, from pulsed-field gel electrophoresis to Southern and Northern blots to RNA analyses. Here, we show that it is exactly these types of 'complementary' analyses that are often lacking from contemporary organelle genome papers, particularly short 'genome announcement' articles. Consequently, crucial and interesting features of organelle chromosomes are going undescribed, which could ultimately lead to a poor understanding and even a misrepresentation of these genomes and the genes they express. High-throughput sequencing and bioinformatics have made it easy to sequence and assemble entire chromosomes, but they should not be used as a substitute for or at the expense of other types of genomic characterization methods. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
Sweet, Kevin; Gordon, Erynn S.; Sturm, Amy C.; Schmidlen, Tara J.; Manickam, Kandamurugu; Toland, Amanda Ewart; Keller, Margaret A.; Stack, Catharine B.; García-España, J. Felipe; Bellafante, Mark; Tayal, Neeraj; Embi, Peter; Binkley, Philip; Hershberger, Ray E.; Sadee, Wolfgang; Christman, Michael; Marsh, Clay
2014-01-01
We describe the development and implementation of a randomized controlled trial to investigate the impact of genomic counseling on a cohort of patients with heart failure (HF) or hypertension (HTN), managed at a large academic medical center, the Ohio State University Wexner Medical Center (OSUWMC). Our study is built upon the existing Coriell Personalized Medicine Collaborative (CPMC®). OSUWMC patient participants with chronic disease (CD) receive eight actionable complex disease and one pharmacogenomic test report through the CPMC® web portal. Participants are randomized to either the in-person post-test genomic counseling—active arm, versus web-based only return of results—control arm. Study-specific surveys measure: (1) change in risk perception; (2) knowledge retention; (3) perceived personal control; (4) health behavior change; and, for the active arm (5), overall satisfaction with genomic counseling. This ongoing partnership has spurred creation of both infrastructure and procedures necessary for the implementation of genomics and genomic counseling in clinical care and clinical research. This included creation of a comprehensive informed consent document and processes for prospective return of actionable results for multiple complex diseases and pharmacogenomics (PGx) through a web portal, and integration of genomic data files and clinical decision support into an EPIC-based electronic medical record. We present this partnership, the infrastructure, genomic counseling approach, and the challenges that arose in the design and conduct of this ongoing trial to inform subsequent collaborative efforts and best genomic counseling practices. PMID:24926413
Kradolfer, David; Hennig, Lars; Köhler, Claudia
2013-01-01
Seed development in flowering plants is initiated after a double fertilization event with two sperm cells fertilizing two female gametes, the egg cell and the central cell, leading to the formation of embryo and endosperm, respectively. In most species the endosperm is a polyploid tissue inheriting two maternal genomes and one paternal genome. As a consequence of this particular genomic configuration the endosperm is a dosage sensitive tissue, and changes in the ratio of maternal to paternal contributions strongly impact on endosperm development. The FERTILIZATION INDEPENDENT SEED (FIS) Polycomb Repressive Complex 2 (PRC2) is essential for endosperm development; however, the underlying forces that led to the evolution of the FIS-PRC2 remained unknown. Here, we show that the functional requirement of the FIS-PRC2 can be bypassed by increasing the ratio of maternal to paternal genomes in the endosperm, suggesting that the main functional requirement of the FIS-PRC2 is to balance parental genome contributions and to reduce genetic conflict. We furthermore reveal that the AGAMOUS LIKE (AGL) gene AGL62 acts as a dosage-sensitive seed size regulator and that reduced expression of AGL62 might be responsible for reduced size of seeds with increased maternal genome dosage. PMID:23326241
GWAMA: software for genome-wide association meta-analysis.
Mägi, Reedik; Morris, Andrew P
2010-05-28
Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill-equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error-trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. The software, with source files, documentation and example data files, is freely available online at http://www.well.ox.ac.uk/GWAMA.
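To illustrate what combining per-study summary statistics can look like, the sketch below applies a standard inverse-variance weighted fixed-effects meta-analysis to one variant. This is a generic textbook calculation, not GWAMA's code or interface: the abstract does not say which statistics the software computes, and the function name, argument names, and example numbers are all hypothetical.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Combine per-study effect estimates for one variant.

    Uses inverse-variance weights (a standard fixed-effects approach,
    assumed here for illustration) and returns the pooled effect, its
    standard error, the z-score, and a two-sided normal p-value.
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Hypothetical effect sizes and standard errors from two studies.
    beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

With equal standard errors the pooled effect is the simple mean of the study effects, and the pooled standard error shrinks by a factor of the square root of the number of studies, which is the gain in power the abstract refers to.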
Applications and challenges of next-generation sequencing in Brassica species.
Wei, Lijuan; Xiao, Meili; Hayward, Alice; Fu, Donghui
2013-12-01
Next-generation sequencing (NGS) produces numerous (often millions) short DNA sequence reads, typically varying between 25 and 400 bp in length, at a relatively low cost and in a short time. This revolutionary technology is being increasingly applied in whole-genome, transcriptome, epigenome and small RNA sequencing, molecular marker and gene discovery, comparative and evolutionary genomics, and association studies. The Brassica genus comprises some of the most agro-economically important crops, providing abundant vegetables, condiments, fodder, oil and medicinal products. Many Brassica species have undergone the process of polyploidization, which makes their genomes exceptionally complex and can create difficulties in genomics research. NGS injects new vigor into Brassica research, yet also faces specific challenges in the analysis of complex crop genomes and traits. In this article, we review the advantages and limitations of different NGS technologies and their applications and challenges, using Brassica as an advanced model system for agronomically important, polyploid crops. Specifically, we focus on the use of NGS for genome resequencing, transcriptome sequencing, development of single-nucleotide polymorphism markers, and identification of novel microRNAs and their targets. We present trends and advances in NGS technology in relation to Brassica crop improvement, with wide application for sophisticated genomics research into agronomically important polyploid crops.
SvABA: genome-wide detection of structural variants and indels by local assembly.
Wala, Jeremiah A; Bandopadhayay, Pratiti; Greenwald, Noah F; O'Rourke, Ryan; Sharpe, Ted; Stewart, Chip; Schumacher, Steve; Li, Yilong; Weischenfeldt, Joachim; Yao, Xiaotong; Nusbaum, Chad; Campbell, Peter; Getz, Gad; Meyerson, Matthew; Zhang, Cheng-Zhong; Imielinski, Marcin; Beroukhim, Rameen
2018-04-01
Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA's performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ∼4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs. © 2018 Wala et al.; Published by Cold Spring Harbor Laboratory Press.
Conflict Resolution in the Genome: How Transcription and Replication Make It Work.
Hamperl, Stephan; Cimprich, Karlene A
2016-12-01
The complex machineries involved in replication and transcription translocate along the same DNA template, often in opposing directions and at different rates. These processes routinely interfere with each other in prokaryotes, and mounting evidence now suggests that RNA polymerase complexes also encounter replication forks in higher eukaryotes. Indeed, cells rely on numerous mechanisms to avoid, tolerate, and resolve such transcription-replication conflicts, and the absence of these mechanisms can lead to catastrophic effects on genome stability and cell viability. In this article, we review the cellular responses to transcription-replication conflicts and highlight how these inevitable encounters shape the genome and impact diverse cellular processes. Copyright © 2016 Elsevier Inc. All rights reserved.
Ilkilic, Ilhan; Paul, Norbert W
2009-03-01
The goal of the Human Genome Diversity Project (HGDP) was to reconstruct the history of human evolution and the historical and geographical distribution of populations with the help of scientific research. Through this kind of research, the entire spectrum of genetic diversity to be found in the human species was to be explored with the hope of generating a better understanding of the history of humankind. An important part of this genome diversity research consists in taking blood and tissue samples from indigenous populations. For various reasons, it has not been possible to execute this project in the planned scope and form to date. Nevertheless, genomic diversity research addresses complex issues which prove to be highly relevant from the perspective of research ethics, transcultural medical ethics, and cultural philosophy. In the article at hand, we discuss these ethical issues as illustrated by the HGDP. This investigation focuses on the confrontation of culturally diverse images of humans and their cosmologies within the framework of genome diversity research and the ethical questions it raises. We argue that in addition to complex questions pertaining to research ethics such as informed consent and autonomy of probands, genome diversity research also has a cultural-philosophical, meta-ethical, and phenomenological dimension which must be taken into account in ethical discourses. Acknowledging this fact, we attempt to show the limits of current guidelines used in international genome diversity studies, following this up by a formulation of theses designed to facilitate an appropriate inquiry and ethical evaluation of intercultural dimensions of genome research.