Analysis of a MULE-cyanide hydratase gene fusion in Verticillium dahliae
USDA-ARS's Scientific Manuscript database
The genome of the phytopathogenic fungus Verticillium dahliae encodes numerous Class II “cut-and-paste” transposable elements, including those of a small group of MULE transposons. We have previously identified a fusion event between a MULE transposon sequence and sequence encoding a cyanide hydrata...
Delwart, Eric; Li, Linlin
2011-01-01
The genomes of numerous circoviruses and distantly related circular DNA viruses encoding a rolling circle replication initiator protein (Rep) have been characterized from the tissues of mammals, fish, insects, and plants (geminivirus and nanovirus), human and animal feces, in an algae cell, and in diverse environmental samples. We review the genome organization, phylogenetic relationships and initial prevalence studies of cycloviruses, a proposed new genus in the Circoviridae family. Viral fossil rep sequences were also identified integrated on the chromosomes of mammals, frogs, lancelets, crustaceans, mites, gastropods, roundworms, placozoans, hydrozoans, protozoans, land plants, fungi, algae, and phytoplasma bacteria and their plasmids, reflecting their past host range. An ancient origin for viruses with rep-encoding single stranded small circular genomes, predating the diversification of eukaryotes, is discussed. The cellular hosts and pathogenicity of many recently described rep-containing circular genomes remain to be determined. Future studies of the virome of single cell and multi-cellular eukaryotes are likely to further extend the known diversity and host-range of small rep-containing circular viral genomes. PMID:22155583
Farrugia, Daniel N.; Elbourne, Liam D. H.; Mabbutt, Bridget C.; Paulsen, Ian T.
2015-01-01
Genomic islands play a key role in prokaryotic genome plasticity. Genomic islands integrate into chromosomal loci such as transfer RNA genes and protein coding genes, whilst retaining various cargo genes that potentially bestow novel functions on the host organism. A gene encoding a putative integrase was identified at a single site within the 5′ end of the dusA gene in the genomes of over 200 bacteria. This integrase was discovered to be a component of numerous genomic islands, which appear to share a target site within the dusA gene. dusA encodes the tRNA-dihydrouridine synthase A enzyme, which catalyses the post-transcriptional reduction of uridine to dihydrouridine in tRNA. Genomic islands encoding homologous dusA-associated integrases were found at a much lower frequency within the related dusB and dusC genes, and non-dus genes. Excision of these dusA-associated islands from the chromosome as circularized intermediates was confirmed by polymerase chain reaction. Analysis of the dusA-associated islands indicated that they were highly diverse, with the integrase gene representing the only universal common feature. PMID:25883135
Delwart, Eric; Li, Linlin
2012-03-01
The genomes of numerous circoviruses and distantly related circular ssDNA viruses encoding a rolling circle replication initiator protein (Rep) have been characterized from the tissues of mammals, fish, insects, plants (geminivirus and nanovirus), in human and animal feces, in an algae cell, and in diverse environmental samples. We review the genome organization, phylogenetic relationships and initial prevalence studies of cycloviruses, a proposed new genus in the Circoviridae family. Viral fossil rep sequences were also recently identified integrated on the chromosomes of mammals, frogs, lancelets, crustaceans, mites, gastropods, roundworms, placozoans, hydrozoans, protozoans, land plants, fungi, algae, and phytoplasma bacteria and their plasmids, reflecting the very wide past host range of rep bearing viruses. An ancient origin for viruses with Rep-encoding small circular ssDNA genomes, predating the diversification of eukaryotes, is discussed. The cellular hosts and pathogenicity of many recently described rep-containing circular ssDNA genomes remain to be determined. Future studies of the virome of single cell and multi-cellular eukaryotes are likely to further extend the known diversity and host-range of small rep-containing circular ssDNA viral genomes. Copyright © 2011 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Norton, Jeanette M.; Klotz, Martin G; Stein, Lisa Y
2008-01-01
The complete genome of the ammonia-oxidizing bacterium, Nitrosospira multiformis (ATCC 25196T), consists of a circular chromosome and three small plasmids totaling 3,234,309 bp and encoding 2827 putative proteins. Of these, 2026 proteins have predicted functions and 801 are without conserved functional domains, yet 747 of these have similarity to other predicted proteins in databases. Gene homologs from Nitrosomonas europaea and N. eutropha were the best match for 42% of the predicted genes in N. multiformis. The genome contains three nearly identical copies of amo and hao gene clusters as large repeats. Distinguishing features compared to N. europaea include: the presence of gene clusters encoding urease and hydrogenase, a RuBisCO-encoding operon of distinctive structure and phylogeny, and a relatively small complement of genes related to Fe acquisition. Systems for synthesis of a pyoverdine-like siderophore and for acyl-homoserine lactone were unique to N. multiformis among the sequenced AOB genomes. Gene clusters encoding proteins associated with outer membrane and cell envelope functions including transporters, porins, exopolysaccharide synthesis, capsule formation and protein sorting/export were abundant. Numerous sensory transduction and response regulator gene systems directed towards sensing of the extracellular environment are described. Gene clusters for glycogen, polyphosphate and cyanophycin storage and utilization were identified providing mechanisms for meeting energy requirements under substrate-limited conditions. The genome of N. multiformis encodes the core pathways for chemolithoautotrophy along with adaptations for surface growth and survival in soil environments.
Kozak, Natalia A; Buss, Meghan; Lucas, Claressa E; Frace, Michael; Govil, Dhwani; Travis, Tatiana; Olsen-Rasmussen, Melissa; Benson, Robert F; Fields, Barry S
2010-02-01
Legionella longbeachae causes most cases of legionellosis in Australia and may be underreported worldwide due to the lack of L. longbeachae-specific diagnostic tests. L. longbeachae displays distinctive differences in intracellular trafficking, caspase 1 activation, and infection in mouse models compared to Legionella pneumophila, yet these two species have indistinguishable clinical presentations in humans. Unlike other legionellae, which inhabit freshwater systems, L. longbeachae is found predominantly in moist soil. In this study, we sequenced and annotated the genome of an L. longbeachae clinical isolate from Oregon, isolate D-4968, and compared it to the previously published genomes of L. pneumophila. The results revealed that the D-4968 genome is larger than the L. pneumophila genome and has a gene order that is different from that of the L. pneumophila genome. Genes encoding structural components of type II, type IV Lvh, and type IV Icm/Dot secretion systems are conserved. In contrast, only 42/140 homologs of genes encoding L. pneumophila Icm/Dot substrates have been found in the D-4968 genome. L. longbeachae encodes numerous proteins with eukaryotic motifs and eukaryote-like proteins unique to this species, including 16 ankyrin repeat-containing proteins and a novel U-box protein. We predict that these proteins are secreted by the L. longbeachae Icm/Dot secretion system. In contrast to the L. pneumophila genome, the L. longbeachae D-4968 genome does not contain flagellar biosynthesis genes, yet it contains a chemotaxis operon. The lack of a flagellum explains the failure of L. longbeachae to activate caspase 1 and trigger pyroptosis in murine macrophages. These unique features of L. longbeachae may reflect adaptation of this species to life in soil.
Aylward, Frank O.; McDonald, Bradon R.; Adams, Sandra M.; Valenzuela, Alejandra; Schmidt, Rebeccah A.; Goodwin, Lynne A.; Woyke, Tanja; Currie, Cameron R.; Suen, Garret
2013-01-01
Sphingomonads comprise a physiologically versatile group within the Alphaproteobacteria that includes strains of interest for biotechnology, human health, and environmental nutrient cycling. In this study, we compared 26 sphingomonad genome sequences to gain insight into their ecology, metabolic versatility, and environmental adaptations. Our multilocus phylogenetic and average amino acid identity (AAI) analyses confirm that Sphingomonas, Sphingobium, Sphingopyxis, and Novosphingobium are well-resolved monophyletic groups with the exception of Sphingomonas sp. strain SKA58, which we propose belongs to the genus Sphingobium. Our pan-genomic analysis of sphingomonads reveals numerous species-specific open reading frames (ORFs) but few signatures of genus-specific cores. The organization and coding potential of the sphingomonad genomes appear to be highly variable, and plasmid-mediated gene transfer and chromosome-plasmid recombination, together with prophage- and transposon-mediated rearrangements, appear to play prominent roles in the genome evolution of this group. We find that many of the sphingomonad genomes encode numerous oxygenases and glycoside hydrolases, which are likely responsible for their ability to degrade various recalcitrant aromatic compounds and polysaccharides, respectively. Many of these enzymes are encoded on megaplasmids, suggesting that they may be readily transferred between species. We also identified enzymes putatively used for the catabolism of sulfonate and nitroaromatic compounds in many of the genomes, suggesting that plant-based compounds or chemical contaminants may be sources of nitrogen and sulfur. Many of these sphingomonads appear to be adapted to oligotrophic environments, but several contain genomic features indicative of host associations. Our work provides a basis for understanding the ecological strategies employed by sphingomonads and their role in environmental nutrient cycling. PMID:23563954
Carlson, Jonathan; Yan, Jiyu; Akinsiku, Olusimidele T.; Schaefer, Malinda; Sabbaj, Steffanie; Bet, Anne; Levy, David N.; Heath, Sonya; Tang, Jianming; Kaslow, Richard A.; Walker, Bruce D.; Ndung’u, Thumbi; Goulder, Philip J.; Heckerman, David; Hunter, Eric; Goepfert, Paul A.
2010-01-01
Retroviruses pack multiple genes into relatively small genomes by encoding several genes in the same genomic region with overlapping reading frames. Both sense and antisense HIV-1 transcripts contain open reading frames for known functional proteins as well as numerous alternative reading frames (ARFs). At least some ARFs have the potential to encode proteins of unknown function, and their antigenic properties can be considered as cryptic epitopes (CEs). To examine the extent of active immune response to virally encoded CEs, we analyzed human leukocyte antigen class I–associated polymorphisms in HIV-1 gag, pol, and nef genes from a large cohort of South Africans with chronic infection. In all, 391 CEs and 168 conventional epitopes were predicted, with the majority (307; 79%) of CEs derived from antisense transcripts. In further evaluation of CD8 T cell responses to a subset of the predicted CEs in patients with primary or chronic infection, both sense- and antisense-encoded CEs were immunogenic at both stages of infection. In addition, CEs often mutated during the first year of infection, which was consistent with immune selection for escape variants. These findings indicate that the HIV-1 genome might encode and deploy a large potential repertoire of unconventional epitopes to enhance vaccine-induced antiviral immunity. PMID:20065064
DOE Office of Scientific and Technical Information (OSTI.GOV)
John C. Meeks
2001-12-31
Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.
Comparative genomics of defense systems in archaea and bacteria
Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.
2013-01-01
Our knowledge of prokaryotic defense systems has vastly expanded as the result of comparative genomic analysis, followed by experimental validation. This expansion is both quantitative, including the discovery of diverse new examples of known types of defense systems, such as restriction-modification or toxin-antitoxin systems, and qualitative, including the discovery of fundamentally new defense mechanisms, such as the CRISPR-Cas immunity system. Large-scale statistical analysis reveals that the distribution of different defense systems in bacterial and archaeal taxa is non-uniform, with four groups of organisms distinguishable with respect to the overall abundance and the balance between specific types of defense systems. The genes encoding defense system components in bacteria and archaea typically cluster in defense islands. In addition to genes encoding known defense systems, these islands contain numerous uncharacterized genes, which are candidates for new types of defense systems. The tight association of the genes encoding immunity systems and dormancy- or cell death-inducing defense systems in prokaryotic genomes suggests that these two major types of defense are functionally coupled, providing for effective protection at the population level. PMID:23470997
A roadmap for natural product discovery based on large-scale genomics and metabolomics
USDA-ARS's Scientific Manuscript database
Actinobacteria encode a wealth of natural product biosynthetic gene clusters, whose systematic study is complicated by numerous repetitive motifs. By combining several metrics we developed a method for global classification of these gene clusters into families (GCFs) and analyzed the biosynthetic ca...
The Landscape of Somatic Chromosomal Copy Number Aberrations in GEM Models of Prostate Carcinoma
Bianchi-Frias, Daniella; Hernandez, Susana A.; Coleman, Roger; Wu, Hong; Nelson, Peter S.
2015-01-01
Human prostate cancer (PCa) is known to harbor recurrent genomic aberrations consisting of chromosomal losses, gains, rearrangements and mutations that involve oncogenes and tumor suppressors. Genetically engineered mouse (GEM) models have been constructed to assess the causal role of these putative oncogenic events and provide molecular insight into disease pathogenesis. While GEM models generally initiate neoplasia by manipulating a single gene, expression profiles of GEM tumors typically comprise hundreds of transcript alterations. It is unclear whether these transcriptional changes represent the pleiotropic effects of single oncogenes, and/or cooperating genomic or epigenomic events. Therefore, it was determined if structural chromosomal alterations occur in GEM models of PCa and whether the changes are concordant with human carcinomas. Whole genome array-based comparative genomic hybridization (CGH) was used to identify somatic chromosomal copy number aberrations (SCNAs) in the widely used TRAMP, Hi-Myc, Pten-null and LADY GEM models. Interestingly, very few SCNAs were identified and the genomic architecture of Hi-Myc, Pten-null and LADY tumors were essentially identical to the germline. TRAMP neuroendocrine carcinomas contained SCNAs, which comprised three recurrent aberrations including a single copy loss of chromosome 19 (encoding Pten). In contrast, cell lines derived from the TRAMP, Hi-Myc, and Pten-null tumors were notable for numerous SCNAs that included copy gains of chromosome 15 (encoding Myc) and losses of chromosome 11 (encoding p53). PMID:25298407
The Saccharomyces Genome Database Variant Viewer
Sheppard, Travis K.; Hitz, Benjamin C.; Engel, Stacia R.; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C.; Dalusag, Kyla S.; Demeter, Janos; Hellerstedt, Sage T.; Karra, Kalpana; Nash, Robert S.; Paskov, Kelley M.; Skrzypek, Marek S.; Weng, Shuai; Wong, Edith D.; Cherry, J. Michael
2016-01-01
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. PMID:26578556
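The abstract above says only that alignments and summaries are encoded in JSON; it does not give the schema. As a rough illustration of the idea, the following Python sketch serializes one aligned reference/strain pair together with a list of differing alignment columns. The function names and every JSON field name (`locus`, `aligned`, `variants`, `position`, `ref`, `alt`, `type`) are hypothetical and are not taken from SGD.

```python
import json


def variant_summary(reference, strain):
    """List alignment columns where two equal-length aligned sequences differ.

    Positions are 1-based alignment columns, not genomic coordinates.
    A '-' character marks a gap. The record layout is a hypothetical
    illustration, not the SGD schema.
    """
    if len(reference) != len(strain):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    for pos, (r, s) in enumerate(zip(reference, strain), start=1):
        if r == s:
            continue
        if r == "-":
            kind = "insertion"      # strain has a residue the reference lacks
        elif s == "-":
            kind = "deletion"       # strain lacks a residue the reference has
        else:
            kind = "substitution"
        variants.append({"position": pos, "ref": r, "alt": s, "type": kind})
    return variants


def encode_alignment(name, reference, strain):
    """Serialize an aligned pair plus its variant summary as JSON text."""
    payload = {
        "locus": name,  # hypothetical field name
        "aligned": {"reference": reference, "strain": strain},
        "variants": variant_summary(reference, strain),
    }
    return json.dumps(payload, sort_keys=True)
```

A viewer could load such a document with `json.loads` and draw only the `variants` list for an overview, fetching the full `aligned` strings on demand; that split is one plausible reading of a "two-tiered" view, not a description of the actual application.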
Evolutionary Genomics of an Ancient Prophage of the Order Sphingomonadales
Viswanathan, Vandana; Narjala, Anushree; Ravichandran, Aravind; Jayaprasad, Suvratha
2017-01-01
The order Sphingomonadales, containing the families Erythrobacteraceae and Sphingomonadaceae, is a relatively less well-studied phylogenetic branch within the class Alphaproteobacteria. Prophage elements are present in most bacterial genomes and are important determinants of adaptive evolution. An “intact” prophage was predicted within the genome of Sphingomonas hengshuiensis strain WHSC-8 and was designated Prophage IWHSC-8. Loci homologous to the region containing the first 22 open reading frames (ORFs) of Prophage IWHSC-8 were discovered among the genomes of numerous Sphingomonadales. In 17 genomes, the homologous loci were co-located with an ORF encoding a putative superoxide dismutase. Several other lines of molecular evidence implied that these homologous loci represent an ancient temperate bacteriophage integration, and this horizontal transfer event pre-dated niche-based speciation within the order Sphingomonadales. The “stabilization” of prophages in the genomes of their hosts is an indicator of “fitness” conferred by these elements and natural selection. Among the various ORFs predicted within the conserved prophages, an ORF encoding a putative proline-rich outer membrane protein A was consistently present among the genomes of many Sphingomonadales. Furthermore, the conserved prophages in six Sphingomonas sp. contained an ORF encoding a putative spermidine synthase. It is possible that one or more of these ORFs bestow selective fitness, and thus the prophages continue to be vertically transferred within the host strains. Although conserved prophages have been identified previously among closely related genera and species, this is the first systematic and detailed description of orthologous prophages at the level of an order that contains two diverse families and many pigmented species. PMID:28201618
Meher, J K; Meher, P K; Dash, G N; Raval, M K
2012-01-01
The first step in gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce high peak at exon locations and effectively suppress false exons (intron regions having greater peak than exon regions) resulting in high discriminating factor, sensitivity and specificity.
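The numerical-mapping and period-3 steps described above can be sketched in a few lines of Python. This sketch does not use the paper's hydration-energy or dipole-moment encodings, whose values are not given here. It substitutes the simpler textbook approach of four binary indicator sequences (one per base) and measures spectral power at the DFT index N/3. All function names are illustrative.

```python
import cmath


def indicator(seq, base):
    """Binary indicator sequence: 1.0 where seq has `base`, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]


def dft_power(x, k):
    """Squared magnitude of the k-th DFT coefficient of x."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(seq):
    """Spectral power at frequency index N/3, summed over the A, C, G, T indicators.

    Assumes the window length is a multiple of 3 so that N/3 is an
    exact DFT index; this is a simplification for the sketch.
    """
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    k = n // 3
    return sum(dft_power(indicator(seq, b), k) for b in "ACGT")
```

On a toy input, a strictly 3-periodic string such as `"ATG" * 30` gives a large value from `period3_power`, while a homopolymer such as `"A" * 90` gives a value near zero. In practice the measure would be evaluated over a sliding window along the sequence, and a single-indicator scheme like those proposed in the paper would replace the four binary sequences with one numeric sequence.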
Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin
2016-01-01
ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. 
The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery. PMID:27451447
USDA-ARS's Scientific Manuscript database
Numerous factors have been reported to affect rainbow trout egg quality, among which, post-ovulatory aging is one of the most significant causes as reared rainbow trout do not usually volitionally oviposit the ovulated eggs. Frequent examination of the stock is therefore required in order to reduce...
Ankyrin-repeat containing proteins of microbes: a conserved structure with functional diversity
Al-Khodor, Souhaila; Price, Christopher T.; Kalia, Awdhesh; Kwaik, Yousef Abu
2009-01-01
Summary The ankyrin repeat (ANK) is the most common protein-protein interaction motif in nature and predominantly found in eukaryotic proteins. The genome sequencing of various pathogenic or symbiotic bacteria and eukaryotic viruses identified numerous genes encoding ANK-containing proteins that were proposed to have been acquired from eukaryotes by horizontal gene transfer. However, the recent discovery of additional ANK-containing proteins encoded in the genomes of archaea and free-living bacteria suggests either a more ancient origin of the ANK motif or multiple convergent evolution events. Many bacterial pathogens employ various types of secretion systems to deliver ANK-containing proteins into eukaryotic cells where they mimic or manipulate various host functions. Understanding the molecular and biochemical functions of this family of proteins will enhance our understanding of important host-microbe interactions. PMID:19962898
The Saccharomyces Genome Database Variant Viewer.
Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael
2016-01-04
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gornik, S. G.; Waller, R. F.
2012-01-01
The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. but is absent in some later-branching taxa indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual dinoflagellate mtDNA. PMID:22113794
Jackson, C J; Gornik, S G; Waller, R F
2012-01-01
The sister phyla dinoflagellates and apicomplexans inherited a drastically reduced mitochondrial genome (mitochondrial DNA, mtDNA) containing only three protein-coding (cob, cox1, and cox3) genes and two ribosomal RNA (rRNA) genes. In apicomplexans, single copies of these genes are encoded on the smallest known mtDNA chromosome (6 kb). In dinoflagellates, however, the genome has undergone further substantial modifications, including massive genome amplification and recombination resulting in multiple copies of each gene and gene fragments linked in numerous combinations. Furthermore, protein-encoding genes have lost standard stop codons, trans-splicing of messenger RNAs (mRNAs) is required to generate complete cox3 transcripts, and extensive RNA editing recodes most genes. From taxa investigated to date, it is unclear when many of these unusual dinoflagellate mtDNA characters evolved. To address this question, we investigated the mitochondrial genome and transcriptome character states of the deep branching dinoflagellate Hematodinium sp. Genomic data show that like later-branching dinoflagellates Hematodinium sp. also contains an inflated, heavily recombined genome of multicopy genes and gene fragments. Although stop codons are also lacking for cox1 and cob, cox3 still encodes a conventional stop codon. Extensive editing of mRNAs also occurs in Hematodinium sp. The mtDNA of basal dinoflagellate Hematodinium sp. indicates that much of the mtDNA modification in dinoflagellates occurred early in this lineage, including genome amplification and recombination, and decreased use of standard stop codons. Trans-splicing, on the other hand, occurred after Hematodinium sp. diverged. Only RNA editing presents a nonlinear pattern of evolution in dinoflagellates as this process occurs in Hematodinium sp. but is absent in some later-branching taxa indicating that this process was either lost in some lineages or developed more than once during the evolution of the highly unusual dinoflagellate mtDNA.
A Molecular Basis for Bifidobacterial Enrichment in the Infant Gastrointestinal Tract
Garrido, Daniel; Barile, Daniela; Mills, David A.
2012-01-01
Bifidobacteria are commonly used as probiotics in dairy foods. Select bifidobacterial species are also early colonizers of the breast-fed infant colon; however, the mechanism for this enrichment is unclear. We previously showed that Bifidobacterium longum subsp. infantis is a prototypical bifidobacterial species that can readily utilize human milk oligosaccharides as the sole carbon source. MS-based glycoprofiling has revealed that numerous B. infantis strains preferentially consume small mass oligosaccharides, abundant in human milks. Genome sequencing revealed that B. infantis possesses a bias toward genes required to use mammalian-derived carbohydrates. Many of these genomic features encode enzymes that are active on milk oligosaccharides including a novel 40-kb region dedicated to oligosaccharide utilization. Biochemical and molecular characterization of the encoded glycosidases and transport proteins has further resolved the mechanism by which B. infantis selectively imports and catabolizes milk oligosaccharides. Expression studies indicate that many of these key functions are only induced during growth on milk oligosaccharides and not expressed during growth on other prebiotics. Analysis of numerous B. infantis isolates has confirmed that these genomic features are common among the B. infantis subspecies and likely constitute a competitive colonization strategy used by these unique bifidobacteria. By detailed characterization of the molecular mechanisms responsible, these studies provide a conceptual framework for bifidobacterial persistence and host interaction in the infant gastrointestinal tract mediated in part through consumption of human milk oligosaccharides. PMID:22585920
Yassin, Atteyet F; Langenberg, Stefan; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Mukherjee, Supratim; Reddy, T B K; Daum, Chris; Shapiro, Nicole; Ivanova, Natalia; Woyke, Tanja; Kyrpides, Nikos C
2017-01-01
The permanent draft genome sequence of Actinotignum schaalii DSM 15541T is presented. The annotated genome includes 2,130,987 bp, with 1777 protein-coding and 58 rRNA-coding genes. Genome sequence analysis revealed the absence of genes encoding components of the PTS systems and enzymes of the TCA cycle, glyoxylate shunt and gluconeogenesis. Genomic data revealed that A. schaalii is able to oxidize carbohydrates via glycolysis, the nonoxidative pentose phosphate and the Entner-Doudoroff pathways. In addition, the genome harbors genes encoding enzymes involved in the conversion of pyruvate to lactate, acetate and ethanol, which are found to be the end products of carbohydrate fermentation. The genome contained the gene encoding Type I fatty acid synthase required for de novo FAS biosynthesis. The plsY and plsX genes encoding the acyltransferases necessary for phosphatidic acid biosynthesis were absent from the genome. The genome harbors genes encoding enzymes responsible for isoprene biosynthesis via the mevalonate (MVA) pathway. Genes encoding enzymes that confer resistance to reactive oxygen species (ROS) were identified. In addition, A. schaalii harbors genes that protect the genome against viral infections. These include restriction-modification (RM) systems, type II toxin-antitoxin (TA), CRISPR-Cas and abortive infection systems. The A. schaalii genome also encodes several virulence factors that contribute to adhesion and internalization of this pathogen, such as the tad genes encoding proteins required for pili assembly, the nanI gene encoding exo-alpha-sialidase, genes encoding heat shock proteins and genes encoding a type VII secretion system. These features are consistent with anaerobic and pathogenic lifestyles. Finally, resistance to ciprofloxacin occurs by mutation in chromosomal genes that encode the subunits of DNA gyrase (GyrA) and topoisomerase IV (ParC) enzymes, while resistance to metronidazole was due to the frxA gene, which encodes NADPH-flavin oxidoreductase.
Shen, Ping; Fan, Jianzhong; Guo, Lihua; Li, Jiahua; Li, Ang; Zhang, Jing; Ying, Chaoqun; Ji, Jinru; Xu, Hao; Zheng, Beiwen; Xiao, Yonghong
2017-05-12
Shigellosis is the most common cause of gastrointestinal infections in developing countries. In China, the species most frequently responsible for shigellosis is Shigella flexneri. S. flexneri remains largely unexplored from a genomic standpoint and is still described using a vocabulary based on biochemical and serological properties. Moreover, increasing numbers of ESBL-producing Shigella strains have been isolated from clinical samples. Despite this, only a few cases of ESBL-producing Shigella have been described in China. Therefore, a better understanding of ESBL-producing Shigella from a genomic standpoint is required. In this study, an S. flexneri type 1a isolate, SP1, harboring bla CTX-M-14, which was recovered from a patient with diarrhea, was subjected to whole genome sequencing. The draft genome assembly of S. flexneri strain SP1 consisted of 4,592,345 bp with a G+C content of 50.46%. RAST analysis revealed that the genome contained 4798 coding sequences (CDSs) and 100 RNA-encoding genes. We detected one incomplete prophage and six candidate CRISPR loci in the genome. In vitro antimicrobial susceptibility testing demonstrated that strain SP1 is resistant to ampicillin, amoxicillin/clavulanic acid, cefazolin, ceftriaxone and trimethoprim. In silico analysis detected genes mediating resistance to aminoglycosides, β-lactams, phenicol, tetracycline, sulphonamides, and trimethoprim. The bla CTX-M-14 gene was located on an IncFII2 plasmid. A series of virulence factors were identified in the genome. In this study, we report the whole genome sequence of a bla CTX-M-14-encoding S. flexneri strain, SP1. Dozens of resistance determinants were detected in the genome and may be responsible for the multidrug resistance of this strain, although further confirmation studies are warranted. Numerous virulence factors identified in the strain suggest that isolate SP1 is potentially pathogenic. The availability of the genome sequence and comparative analysis with other S. flexneri strains provides the basis to further address the evolution of drug resistance mechanisms and pathogenicity in S. flexneri.
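The G+C content quoted for a draft assembly such as the one above is a simple base-composition statistic. A minimal sketch of how such a figure might be computed is given below, assuming the contigs are available as plain strings; the function names `gc_content` and `assembly_gc` are hypothetical and are not taken from the study, whose actual tooling is not described in the abstract.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Ambiguous symbols (e.g. N) are ignored, so the denominator counts
    only unambiguous A, C, G and T bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


def assembly_gc(contigs) -> float:
    """Length-weighted G+C percentage over all contigs of an assembly."""
    # Joining the contigs weights each one by its length automatically.
    return gc_content("".join(contigs))


if __name__ == "__main__":
    # Toy contigs, purely illustrative (not real S. flexneri sequence).
    print(assembly_gc(["GGCC", "AT", "AT"]))
```

Pooling the contigs before counting, rather than averaging per-contig percentages, keeps short contigs from distorting the genome-wide value.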
Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F
2007-01-01
Background Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. 
Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA editing, loss of stop codons, and use of trans-splicing. PMID:17897476
Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.
Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L
2005-01-01
The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.
Evaluating the protein coding potential of exonized transposable element sequences
Piriyapongsa, Jittima; Rutledge, Mark T; Patel, Sanil; Borodovsky, Mark; Jordan, I King
2007-01-01
Background Transposable element (TE) sequences, once thought to be merely selfish or parasitic members of the genomic community, have been shown to contribute a wide variety of functional sequences to their host genomes. Analysis of complete genome sequences has turned up numerous cases where TE sequences have been incorporated as exons into mRNAs, and it is widely assumed that such 'exonized' TEs encode protein sequences. However, the extent to which TE-derived sequences actually encode proteins is unknown and a matter of some controversy. We have tried to address this outstanding issue from two perspectives: (i) by evaluating ascertainment biases related to the search methods used to uncover TE-derived protein coding sequences (CDS) and (ii) through a probabilistic codon-frequency based analysis of the protein coding potential of TE-derived exons. Results We compared the ability of three classes of sequence similarity search methods to detect TE-derived sequences among data sets of experimentally characterized proteins: (1) a profile-based hidden Markov model (HMM) approach, (2) BLAST methods and (3) RepeatMasker. Profile-based methods are more sensitive and more selective than the other methods evaluated. However, the application of profile-based search methods to the detection of TE-derived sequences among well-curated, experimentally characterized protein data sets did not turn up many more cases than had been previously detected, and nowhere near as many cases as recent genome-wide searches have. We observed that the different search methods used were complementary in the sense that they yielded largely non-overlapping sets of hits and differed in their ability to recover known cases of TE-derived CDS. The probabilistic analysis of TE-derived exon sequences indicates that these sequences have low protein coding potential on average. 
In particular, non-autonomous TEs that do not encode protein sequences, such as Alu elements, are frequently exonized but unlikely to encode protein sequences. Conclusion The exaptation of the numerous TE sequences found in exons as bona fide protein coding sequences may prove to be far less common than has been suggested by the analysis of complete genomes. We hypothesize that many exonized TE sequences actually function as post-transcriptional regulators of gene expression, rather than coding sequences, which may act through a variety of double stranded RNA related regulatory pathways. Indeed, their relatively high copy numbers and similarity to sequences dispersed throughout the genome suggests that exonized TE sequences could serve as master regulators with a wide scope of regulatory influence. Reviewers: This article was reviewed by Itai Yanai, Kateryna D. Makova, Melissa Wilson (nominated by Kateryna D. Makova) and Cedric Feschotte (nominated by John M. Logsdon Jr.). PMID:18036258
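The idea of scoring protein coding potential from codon frequencies can be illustrated with a toy log-odds calculation. The sketch below is not the authors' method, which the abstract does not specify in detail; it simply compares the probability of each codon under an assumed coding model with a uniform background of 1/64. The function name `codon_log_odds`, the `coding_freqs` table and the pseudo-frequency are all hypothetical.

```python
import math


def codon_log_odds(seq, coding_freqs, background=1.0 / 64, pseudo=1e-6):
    """Mean per-codon log-odds of a sequence under a coding model.

    seq          -- nucleotide string, read in frame from position 0
    coding_freqs -- dict mapping codon -> frequency in known coding DNA
                    (hypothetical values supplied by the caller)
    background   -- probability of any codon under a uniform null model
    pseudo       -- frequency assumed for codons missing from the table
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    score = 0.0
    n = 0
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        p = coding_freqs.get(codon, pseudo)
        score += math.log(p / background)
        n += 1
    # Positive means look more "coding-like" than the uniform background.
    return score / n if n else 0.0


if __name__ == "__main__":
    toy_freqs = {"ATG": 1.0 / 32}  # illustrative value only
    print(codon_log_odds("ATGATG", toy_freqs))
```

A real analysis would use frequencies estimated from the genome under study and a background model matched to its base composition; the uniform background here is a simplifying assumption.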
Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce
Reyes-Chin-Wo, Sebastian; Wang, Zhiwen; Yang, Xinhua; Kozik, Alexander; Arikit, Siwaret; Song, Chi; Xia, Liangfeng; Froenicke, Lutz; Lavelle, Dean O.; Truco, María-José; Xia, Rui; Zhu, Shilin; Xu, Chunyan; Xu, Huaqin; Xu, Xun; Cox, Kyle; Korf, Ian; Meyers, Blake C.; Michelmore, Richard W.
2017-01-01
Lettuce (Lactuca sativa) is a major crop and a member of the large, highly successful Compositae family of flowering plants. Here we present a reference assembly for the species and family. This was generated using whole-genome shotgun Illumina reads plus in vitro proximity ligation data to create large superscaffolds; it was validated genetically and superscaffolds were oriented in genetic bins ordered along nine chromosomal pseudomolecules. We identify several genomic features that may have contributed to the success of the family, including genes encoding Cycloidea-like transcription factors, kinases, enzymes involved in rubber biosynthesis and disease resistance proteins that are expanded in the genome. We characterize 21 novel microRNAs, one of which may trigger phasiRNAs from numerous kinase transcripts. We provide evidence for a whole-genome triplication event specific but basal to the Compositae. We detect 26% of the genome in triplicated regions containing 30% of all genes that are enriched for regulatory sequences and depleted for genes involved in defence. PMID:28401891
Gartemann, Karl-Heinz; Abt, Birte; Bekel, Thomas; Burger, Annette; Engemann, Jutta; Flügel, Monika; Gaigalat, Lars; Goesmann, Alexander; Gräfen, Ines; Kalinowski, Jörn; Kaup, Olaf; Kirchner, Oliver; Krause, Lutz; Linke, Burkhard; McHardy, Alice; Meyer, Folker; Pohle, Sandra; Rückert, Christian; Schneiker, Susanne; Zellermann, Eva-Maria; Pühler, Alfred; Eichenlaub, Rudolf; Kaiser, Olaf; Bartels, Daniela
2008-01-01
Clavibacter michiganensis subsp. michiganensis is a plant-pathogenic actinomycete that causes bacterial wilt and canker of tomato. The nucleotide sequence of the genome of strain NCPPB382 was determined. The chromosome is circular, consists of 3.298 Mb, and has a high G+C content (72.6%). Annotation revealed 3,080 putative protein-encoding sequences; only 26 pseudogenes were detected. Two rrn operons, 45 tRNAs, and three small stable RNA genes were found. The two circular plasmids, pCM1 (27.4 kbp) and pCM2 (70.0 kbp), which carry pathogenicity genes and thus are essential for virulence, have lower G+C contents (66.5 and 67.6%, respectively). In contrast to the genome of the closely related organism Clavibacter michiganensis subsp. sepedonicus, the genome of C. michiganensis subsp. michiganensis lacks complete insertion elements and transposons. The 129-kb chp/tomA region with a low G+C content near the chromosomal origin of replication was shown to be necessary for pathogenicity. This region contains numerous genes encoding proteins involved in uptake and metabolism of sugars and several serine proteases. There is evidence that single genes located in this region, especially genes encoding serine proteases, are required for efficient colonization of the host. Although C. michiganensis subsp. michiganensis grows mainly in the xylem of tomato plants, no evidence for pronounced genome reduction was found. C. michiganensis subsp. michiganensis seems to have as many transporters and regulators as typical soil-inhabiting bacteria. However, the apparent lack of a sulfate reduction pathway, which makes C. michiganensis subsp. michiganensis dependent on reduced sulfur compounds for growth, is probably the reason for the poor survival of C. michiganensis subsp. michiganensis in soil. PMID:18192381
Wei, Chaoling; Yang, Hua; Wang, Songbo; Zhao, Jian; Liu, Chun; Gao, Liping; Xia, Enhua; Lu, Ying; Tai, Yuling; She, Guangbiao; Sun, Jun; Cao, Haisheng; Tong, Wei; Gao, Qiang; Li, Yeyun; Deng, Weiwei; Jiang, Xiaolan; Wang, Wenzhao; Chen, Qi; Zhang, Shihua; Li, Haijing; Wu, Junlan; Wang, Ping; Li, Penghui; Shi, Chengying; Zheng, Fengya; Jian, Jianbo; Huang, Bei; Shan, Dai; Shi, Mingming; Fang, Congbing; Yue, Yi; Li, Fangdong; Li, Daxiang; Wei, Shu; Han, Bin; Jiang, Changjun; Yin, Ye; Xia, Tao; Zhang, Zhengzhu; Bennetzen, Jeffrey L; Zhao, Shancen; Wan, Xiaochun
2018-05-01
Tea, one of the world's most important beverage crops, provides numerous secondary metabolites that account for its rich taste and health benefits. Here we present a high-quality sequence of the genome of tea, Camellia sinensis var. sinensis (CSS), using both Illumina and PacBio sequencing technologies. At least 64% of the 3.1-Gb genome assembly consists of repetitive sequences, and the rest yields 33,932 high-confidence predictions of encoded proteins. Divergence between two major lineages, CSS and Camellia sinensis var. assamica (CSA), is calculated to ∼0.38 to 1.54 million years ago (Mya). Analysis of genic collinearity reveals that the tea genome is the product of two rounds of whole-genome duplications (WGDs) that occurred ∼30 to 40 and ∼90 to 100 Mya. We provide evidence that these WGD events, and subsequent paralogous duplications, had major impacts on the copy numbers of secondary metabolite genes, particularly genes critical to producing three key quality compounds: catechins, theanine, and caffeine. Analyses of transcriptome and phytochemistry data show that amplification and transcriptional divergence of genes encoding a large acyltransferase family and leucoanthocyanidin reductases are associated with the characteristic young leaf accumulation of monomeric galloylated catechins in tea, while functional divergence of a single member of the glutamine synthetase gene family yielded theanine synthetase. This genome sequence will facilitate understanding of tea genome evolution and tea metabolite pathways, and will promote germplasm utilization for breeding improved tea varieties. Copyright © 2018 the Author(s). Published by PNAS. PMID:29678829
Positive Selection in Rapidly Evolving Plastid–Nuclear Enzyme Complexes
Rockenbach, Kate; Havird, Justin C.; Monroe, J. Grey; Triant, Deborah A.; Taylor, Douglas R.; Sloan, Daniel B.
2016-01-01
Rates of sequence evolution in plastid genomes are generally low, but numerous angiosperm lineages exhibit accelerated evolutionary rates in similar subsets of plastid genes. These genes include clpP1 and accD, which encode components of the caseinolytic protease (CLP) and acetyl-coA carboxylase (ACCase) complexes, respectively. Whether these extreme and repeated accelerations in rates of plastid genome evolution result from adaptive change in proteins (i.e., positive selection) or simply a loss of functional constraint (i.e., relaxed purifying selection) is a source of ongoing controversy. To address this, we have taken advantage of the multiple independent accelerations that have occurred within the genus Silene (Caryophyllaceae) by examining phylogenetic and population genetic variation in the nuclear genes that encode subunits of the CLP and ACCase complexes. We found that, in species with accelerated plastid genome evolution, the nuclear-encoded subunits in the CLP and ACCase complexes are also evolving rapidly, especially those involved in direct physical interactions with plastid-encoded proteins. A massive excess of nonsynonymous substitutions between species relative to levels of intraspecific polymorphism indicated a history of strong positive selection (particularly in CLP genes). Interestingly, however, some species are likely undergoing loss of the native (heteromeric) plastid ACCase and putative functional replacement by a duplicated cytosolic (homomeric) ACCase. Overall, the patterns of molecular evolution in these plastid–nuclear complexes are unusual for anciently conserved enzymes. They instead resemble cases of antagonistic coevolution between pathogens and host immune genes. We discuss a possible role of plastid–nuclear conflict as a novel cause of accelerated evolution. PMID:27707788
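Comparing nonsynonymous and synonymous substitutions between species with the corresponding polymorphism counts within species is commonly framed as a McDonald-Kreitman style 2x2 table. The sketch below assumes that framing (the abstract does not name the test) and uses hypothetical counts; `mk_summary` is an illustrative helper, not code from the study. It reports the neutrality index NI = (Pn/Ps)/(Dn/Ds) and the derived estimate alpha = 1 - NI of the fraction of nonsynonymous substitutions attributable to positive selection.

```python
def mk_summary(dn, ds, pn, ps):
    """Summarise a McDonald-Kreitman style table.

    dn, ds -- nonsynonymous / synonymous fixed differences between species
    pn, ps -- nonsynonymous / synonymous polymorphisms within species
    Returns (neutrality_index, alpha).
    """
    if min(dn, ds, pn, ps) <= 0:
        # Zero cells make the ratios undefined; real analyses often add
        # a continuity correction, which is omitted here for clarity.
        raise ValueError("all four counts must be positive")
    neutrality_index = (pn / ps) / (dn / ds)
    alpha = 1.0 - neutrality_index
    return neutrality_index, alpha


if __name__ == "__main__":
    # Hypothetical counts showing an excess of nonsynonymous divergence.
    ni, alpha = mk_summary(dn=40, ds=20, pn=5, ps=10)
    print(f"NI = {ni:.2f}, alpha = {alpha:.2f}")
```

An NI well below 1 (alpha well above 0) corresponds to the pattern described in the abstract: more nonsynonymous divergence than polymorphism levels would predict. Significance would normally be assessed with a contingency-table test such as Fisher's exact test, which is left out of this sketch.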
Yu, Jingyin; Tehrim, Sadia; Zhang, Fengqi; Tong, Chaobo; Huang, Junyan; Cheng, Xiaohui; Dong, Caihua; Zhou, Yanqiu; Qin, Rui; Hua, Wei; Liu, Shengyi
2014-01-03
Plant disease resistance (R) genes with the nucleotide binding site (NBS) play an important role in offering resistance to pathogens. The availability of complete genome sequences of Brassica oleracea and Brassica rapa provides an important opportunity for researchers to identify and characterize NBS-encoding R genes in Brassica species and to compare them with analogues in Arabidopsis thaliana using a comparative genomics approach. However, little is known about the evolutionary fate of NBS-encoding genes in the Brassica lineage after the split from A. thaliana. Here we present a genome-wide analysis of NBS-encoding genes in B. oleracea, B. rapa and A. thaliana. Using HMM searches and manual curation, we identified 157, 206 and 167 NBS-encoding genes in the B. oleracea, B. rapa and A. thaliana genomes, respectively. Phylogenetic analysis among the 3 species classified NBS-encoding genes into 6 subgroups. Tandem duplication and whole genome triplication (WGT) analyses revealed that after WGT of the Brassica ancestor, NBS-encoding homologous gene pairs on triplicated regions in the Brassica ancestor were deleted or lost quickly, but NBS-encoding genes in Brassica species experienced species-specific gene amplification by tandem duplication after divergence of B. rapa and B. oleracea. Expression profiling of NBS-encoding orthologous gene pairs indicated differential expression patterns of retained orthologous gene copies in B. oleracea and B. rapa. Furthermore, evolutionary analysis of CNL type NBS-encoding orthologous gene pairs among the 3 species suggested that orthologous genes in B. rapa have undergone stronger negative selection than those in B. oleracea. For the TNL type, however, there are no significant differences in the orthologous gene pairs between the two species. This study is the first identification and characterization of NBS-encoding genes in B. rapa and B. oleracea based on whole genome sequences. 
Through tandem duplication and whole genome triplication analysis in the B. oleracea, B. rapa and A. thaliana genomes, our study provides insight into the evolutionary history of NBS-encoding genes after divergence of A. thaliana and the Brassica lineage. These results, together with expression pattern analysis of NBS-encoding orthologous genes, provide a useful resource for functional characterization of these genes and genetic improvement of relevant crops.
γ-PGA Hydrolases of Phage Origin in Bacillus subtilis and Other Microbial Genomes.
Mamberti, Stefania; Prati, Paola; Cremaschi, Paolo; Seppi, Claudio; Morelli, Carlo F; Galizzi, Alessandro; Fabbi, Massimo; Calvio, Cinzia
2015-01-01
Poly-γ-glutamate (γ-PGA) is an industrially interesting polymer secreted mainly by members of the class Bacilli which forms a shield able to protect bacteria from phagocytosis and phages. Few enzymes are known to degrade γ-PGA; among them is a phage-encoded γ-PGA hydrolase, PghP. The supposed role of PghP in phages is to ensure access to the surface of bacterial cells by dismantling the γ-PGA barrier. We identified four unannotated B. subtilis genes through similarity of their encoded products to PghP; in fact these genes reside in prophage elements of B. subtilis genome. The recombinant products of two of them demonstrate efficient polymer degradation, confirming that sequence similarity reflects functional homology. Genes encoding similar γ-PGA hydrolases were identified in phages specific for the order Bacillales and in numerous microbial genomes, not only belonging to that order. The distribution of the γ-PGA biosynthesis operon was also investigated with a bioinformatics approach; it was found that the list of organisms endowed with γ-PGA biosynthetic functions is larger than expected and includes several pathogenic species. Moreover in non-Bacillales bacteria the predicted γ-PGA hydrolase genes are preferentially found in species that do not have the genetic asset for polymer production. Our findings suggest that γ-PGA hydrolase genes might have spread across microbial genomes via horizontal exchanges rather than via phage infection. We hypothesize that, in natural habitats rich in γ-PGA supplied by producer organisms, the availability of hydrolases that release glutamate oligomers from γ-PGA might be a beneficial trait under positive selection.
NASA Technical Reports Server (NTRS)
Everroad, R. Craig; Stuart, Rhona K.; Bebout, Brad M.; Detweiler, Angela M.; Lee, Jackson Zan; Woebken, Dagmar; Bebout, Leslie E.; Pett-Ridge, Jennifer
2016-01-01
The nonheterocystous filamentous cyanobacterium, strain ESFC-1, is a recently described member of the order Oscillatoriales within the Cyanobacteria. ESFC-1 has been shown to be a major diazotroph in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S RNA gene, ESFC-1 appears to belong to a unique, genus-level divergence; the draft genome sequence of this strain has now been determined. Here we report features of this genome as they relate to the ecological functions and capabilities of strain ESFC-1. The 5,632,035 bp genome sequence encodes 4914 protein-coding genes and 92 RNA genes. One striking feature of this cyanobacterium is the apparent lack of either uptake or bi-directional hydrogenases typically expected within a diazotroph. Additionally, a large genomic island is found that contains numerous low GC-content genes and genes related to extracellular polysaccharide production and cell wall synthesis and maintenance.
Kapranov, Philipp; St Laurent, Georges; Raz, Tal; Ozsolak, Fatih; Reynolds, C Patrick; Sorensen, Poul H B; Reaman, Gregory; Milos, Patrice; Arceci, Robert J; Thompson, John F; Triche, Timothy J
2010-12-21
Discovery that the transcriptional output of the human genome is far more complex than predicted by the current set of protein-coding annotations and that most RNAs produced do not appear to encode proteins has transformed our understanding of genome complexity and suggests new paradigms of genome regulation. However, the fraction of all cellular RNA whose function we do not understand and the fraction of the genome that is utilized to produce that RNA remain controversial. This is not simply a bookkeeping issue because the degree to which this un-annotated transcription is present has important implications with respect to its biologic function and to the general architecture of genome regulation. For example, efforts to elucidate how non-coding RNAs (ncRNAs) regulate genome function will be compromised if that class of RNAs is dismissed as simply 'transcriptional noise'. We show that the relative mass of RNA whose function and/or structure we do not understand (the so called 'dark matter' RNAs), as a proportion of all non-ribosomal, non-mitochondrial human RNA (mt-RNA), can be greater than that of protein-encoding transcripts. This observation is obscured in studies that focus only on polyA-selected RNA, a method that enriches for protein coding RNAs and at the same time discards the vast majority of RNA prior to analysis. We further show the presence of a large number of very long, abundantly-transcribed regions (100's of kb) in intergenic space and further show that expression of these regions is associated with neoplastic transformation. These overlap some regions found previously in normal human embryonic tissues and raises an interesting hypothesis as to the function of these ncRNAs in both early development and neoplastic transformation. 
We conclude that 'dark matter' RNA can constitute the majority of non-ribosomal, non-mitochondrial-RNA and a significant fraction arises from numerous very long, intergenic transcribed regions that could be involved in neoplastic transformation.
2013-01-01
Background Comparatively little information is available on members of the Myoviridae infecting low G+C content, Gram-positive host bacteria of the phylum Firmicutes. While numerous Bacillus phages have been isolated to date, only very few Bacillus cereus phages have been characterized in detail. Results Here we present data on the large, virulent, broad-host-range B. cereus phage vB_BceM_Bc431v3 (Bc431v3). Bc431v3 features a 158,618 bp dsDNA genome, encompassing 239 putative open reading frames (ORFs) and 20 tRNA genes encoding 17 different amino acids. Since pulsed-field gel electrophoresis indicated that the genome of this phage has a size of 155-158 kb, Bc431v3 DNA appears not to contain the long terminal repeats that are found in the genome of Bacillus phage SPO1. Conclusions Bc431v3 displays significant sequence similarity, at the protein level, to B. cereus phage BCP78, Listeria phage A511 and Enterococcus phage ØEF24C, and to other morphologically related phages infecting Firmicutes such as Staphylococcus phage K and Lactobacillus phage LP65. Based on these data we suggest that Bc431v3 should be included as a member of the Spounavirinae; however, because the diverse taxonomic information has been addressed only recently, it is difficult to determine the genus. The Bc431v3 phage contains some highly unusual genes, such as gp143 encoding a putative tRNAHis guanylyltransferase. In addition, it carries some genes that appear to be related to the host sporulation regulators. These are: gp098, which encodes a putative segregation protein related to FtsK/SpoIIIE DNA transporters; gp105, a putative segregation protein; gp108, RNA polymerase sigma factor F/B; and gp109, encoding RNA polymerase sigma factor G. PMID:23388049
De Paepe, Marianne; Hutinet, Geoffrey; Son, Olivier; Amarir-Bouhram, Jihane; Schbath, Sophie; Petit, Marie-Agnès
2014-01-01
Bacteriophages (or phages) dominate the biosphere both numerically and in terms of genetic diversity. In particular, genomic comparisons suggest a remarkable level of horizontal gene transfer among temperate phages, favoring a high evolution rate. Molecular mechanisms of this pervasive mosaicism are mostly unknown. One hypothesis is that phage-encoded recombinases are key players in these horizontal transfers, thanks to their high efficiency and low fidelity. Here, we combine two complementary in vivo assays and a bioinformatics analysis to address the role of phage-encoded recombinases in genomic mosaicism. The first assay identified the genetic determinants of mosaic formation between lambdoid phages and Escherichia coli prophage remnants. In the second assay, recombination was monitored between sequences on phage λ, which allowed us to compare the performance of three different Rad52-like recombinases on the same substrate. We also addressed the importance of homologous recombination in phage evolution by a genomic comparison of 84 E. coli virulent and temperate phages or prophages. We demonstrate that mosaics are mainly generated by homology-driven mechanisms that tolerate high substrate divergence. We show that phage-encoded Rad52-like recombinases act independently of RecA, and that they are relatively more efficient when the exchanged fragments are divergent. We also show that accessory phage genes orf and rap contribute to mosaicism. A bioinformatics analysis strengthens our experimental results by showing that homologous recombination left traces in temperate phage genomes at the borders of recently exchanged fragments. We found no evidence of exchanges between virulent and temperate phages of E. coli. Altogether, our results demonstrate that Rad52-like recombinases promote gene shuffling among temperate phages, accelerating their evolution. This mechanism may prove to be more general, as other mobile genetic elements such as ICE encode Rad52-like functions, and play an important role in bacterial evolution itself. PMID:24603854
Structure and genomic organization of the human B1 receptor gene for kinins (BDKRB1).
Bachvarov, D R; Hess, J F; Menke, J G; Larrivée, J F; Marceau, F
1996-05-01
Two subtypes of mammalian bradykinin receptors, B1 and B2 (BDKRB1 and BDKRB2), have been defined based on their pharmacological properties. The B1 type kinin receptors have weak affinity for intact BK or Lys-BK but strong affinity for kinin metabolites without the C-terminal arginine (e.g., des-Arg9-BK and Lys-des-Arg9-BK, also called des-Arg10-kallidin), which are generated by kininase I. The B1 receptor expression is up-regulated following tissue injury and inflammation (hyperemia, exudation, hyperalgesia, etc.). In the present study, we have cloned and sequenced the gene encoding human B1 receptor from a human genomic library. The human B1 receptor gene contains three exons separated by two introns. The first and the second exon are noncoding, while the coding region and the 3'-flanking region are located entirely on the third exon. The exon-intron arrangement of the human B1 receptor gene shows significant similarity with the genes encoding the B2 receptor subtype in human, mouse, and rat. Sequence analysis of the 5'-flanking region revealed the presence of a consensus TATA box and of numerous candidate transcription factor binding sequences. Primer extension experiments have shown the existence of multiple transcription initiation sites situated downstream and upstream from the consensus TATA box. Genomic Southern blot analysis indicated that the human B1 receptor is encoded by a single-copy gene.
Building a genome analysis pipeline to predict disease risk and prevent disease.
Bromberg, Y
2013-11-01
Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, the pipelines taking advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice. © 2013. Published by Elsevier Ltd. All rights reserved.
Versluis, Dennis; Nijsse, Bart; Naim, Mohd Azrul; Koehorst, Jasper J; Wiese, Jutta; Imhoff, Johannes F; Schaap, Peter J; van Passel, Mark W J; Smidt, Hauke; Sipkema, Detmer
2018-01-01
Abstract Pseudovibrio is a marine bacterial genus members of which are predominantly isolated from sessile marine animals, and particularly sponges. It has been hypothesized that Pseudovibrio spp. form mutualistic relationships with their hosts. Here, we studied Pseudovibrio phylogeny and genetic adaptations that may play a role in host colonization by comparative genomics of 31 Pseudovibrio strains, including 25 sponge isolates. All genomes were highly similar in terms of encoded core metabolic pathways, albeit with substantial differences in overall gene content. Based on gene composition, Pseudovibrio spp. clustered by geographic region, indicating geographic speciation. Furthermore, the fact that isolates from the Mediterranean Sea clustered by sponge species suggested host-specific adaptation or colonization. Genome analyses suggest that Pseudovibrio hongkongensis UST20140214-015BT is only distantly related to other Pseudovibrio spp., thereby challenging its status as typical Pseudovibrio member. All Pseudovibrio genomes were found to encode numerous proteins with SEL1 and tetratricopeptide repeats, which have been suggested to play a role in host colonization. For evasion of the host immune system, Pseudovibrio spp. may depend on type III, IV, and VI secretion systems that can inject effector molecules into eukaryotic cells. Furthermore, Pseudovibrio genomes carry on average seven secondary metabolite biosynthesis clusters, reinforcing the role of Pseudovibrio spp. as potential producers of novel bioactive compounds. Tropodithietic acid, bacteriocin, and terpene biosynthesis clusters were highly conserved within the genus, suggesting an essential role in survival, for example through growth inhibition of bacterial competitors. Taken together, these results support the hypothesis that Pseudovibrio spp. have mutualistic relations with sponges. PMID:29319806
A High-Resolution Gene Map of the Chloroplast Genome of the Red Alga Porphyra purpurea.
Reith, M; Munholland, J
1993-01-01
Extensive DNA sequencing of the chloroplast genome of the red alga Porphyra purpurea has resulted in the detection of more than 125 genes. Fifty-eight (approximately 46%) of these genes are not found on the chloroplast genomes of land plants. These include genes encoding 17 photosynthetic proteins, three tRNAs, and nine ribosomal proteins. In addition, nine genes encoding proteins related to biosynthetic functions, six genes encoding proteins involved in gene expression, and at least five genes encoding miscellaneous proteins are among those not known to be located on land plant chloroplast genomes. The increased coding capacity of the P. purpurea chloroplast genome, along with other characteristics such as the absence of introns and the conservation of ancestral operons, demonstrate the primitive nature of the P. purpurea chloroplast genome. In addition, evidence for a monophyletic origin of chloroplasts is suggested by the identification of two groups of genes that are clustered in chloroplast genomes but not in cyanobacteria. PMID:12271072
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.
Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael
2017-01-01
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package. PMID:28403240
Microbial genomic island discovery, visualization and analysis.
Bertelli, Claire; Tilley, Keith E; Brinkman, Fiona S L
2018-06-03
Horizontal gene transfer (also called lateral gene transfer) is a major mechanism for microbial genome evolution, enabling rapid adaptation and survival in specific niches. Genomic islands (GIs), commonly defined as clusters of bacterial or archaeal genes of probable horizontal origin, are of particular medical, environmental and/or industrial interest, as they disproportionately encode virulence factors and some antimicrobial resistance genes and may harbor entire metabolic pathways that confer a specific adaptation (solvent resistance, symbiosis properties, etc.). As large-scale analysis of microbial genomes increases, for example in genomic epidemiology investigations of infectious disease outbreaks in public health, there is increased appreciation of the need to accurately predict and track GIs. Over the past decade, numerous computational tools have been developed to tackle the challenges inherent in accurate GI prediction. We review here the main types of GI prediction methods and discuss their advantages and limitations for a routine analysis of microbial genomes in this era of rapid whole-genome sequencing. An assessment is provided of 20 GI prediction software methods that use sequence-composition bias to identify the GIs, using a reference GI data set from 104 genomes obtained using an independent comparative genomics approach. Finally, we present guidelines to assist researchers in effectively identifying these key genomic regions.
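As a rough illustration of what "sequence-composition bias" can mean in practice, the sketch below flags sliding windows whose G+C fraction deviates strongly from the genome-wide window average. It is a toy example only and does not reproduce any of the tools assessed in the review; the function names, window size, step, and z-score cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_windows(genome, window=1000, step=500, z_cutoff=2.0):
    """Return (start, end, gc) for windows whose GC content is an outlier.

    Hypothetical helper: a window is reported when its GC fraction lies at
    least z_cutoff population standard deviations from the mean over all
    windows. Coordinates are 0-based and half-open.
    """
    starts = range(0, len(genome) - window + 1, step)
    values = [gc_fraction(genome[s:s + window]) for s in starts]
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition, nothing stands out
    return [
        (s, s + window, round(v, 3))
        for s, v in zip(starts, values)
        if abs(v - mu) / sigma >= z_cutoff
    ]


if __name__ == "__main__":
    # Synthetic genome: AT-only backbone with a 1 kb GC-only insert at 4000.
    toy = "AT" * 2000 + "GC" * 500 + "AT" * 2500
    print(atypical_windows(toy))
```

Real predictors typically use richer composition signals (for example codon usage or k-mer frequencies) and statistical models rather than a single GC cutoff, so this should be read only as a picture of the underlying idea.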
Veenstra, Jan A; Khammassi, Hela
2017-04-01
RYamides are arthropod neuropeptides with unknown function. In 2011 two RYamides were isolated from D. melanogaster as the ligands for the G-protein coupled receptor CG5811. The D. melanogaster gene encoding these neuropeptides is highly unusual, as there are four RYamide encoding exons in the current genome assembly, but an exon encoding a signal peptide is absent. Comparing the D. melanogaster gene structure with those from other species, including D. virilis, suggests that the gene is degenerating. RNAseq data from 1634 short sequence read archives at NCBI containing more than 34 billion spots yielded numerous individual spots that correspond to the RYamide encoding exons, of which a large number include the intron-exon boundary at the start of this exon. Although 72 different sequences have been spliced onto this RYamide encoding exon, none codes for the signal peptide of this gene. Thus, the RNAseq data for this gene reveal only noise and no signal. The very small quantities of peptide recovered during isolation and the absence of credible RNAseq data indicate that the gene is expressed at very low levels, while the RYamide gene structure in D. melanogaster suggests that it might be evolving into a pseudogene. Yet, the identification of the peptides it encodes clearly shows it is still functional. Using region specific antisera, we could localize numerous neurons and enteroendocrine cells in D. willistoni, D. virilis and D. pseudoobscura, but only two adult abdominal neurons in D. melanogaster. Those two neurons project to and innervate the rectal papillae, suggesting that RYamides may be involved in the regulation of water homeostasis. Copyright © 2017 Elsevier Ltd. All rights reserved.
Plastid: nucleotide-resolution analysis of next-generation sequencing and genomics data.
Dunn, Joshua G; Weissman, Jonathan S
2016-11-22
Next-generation sequencing (NGS) informs many biological questions with unprecedented depth and nucleotide resolution. These assays have created a need for analytical tools that enable users to manipulate data nucleotide-by-nucleotide robustly and easily. Furthermore, because many NGS assays encode information jointly within multiple properties of read alignments - for example, in ribosome profiling, the locations of ribosomes are jointly encoded in alignment coordinates and length - analytical tools are often required to extract the biological meaning from the alignments before analysis. Many assay-specific pipelines exist for this purpose, but there remains a need for user-friendly, generalized, nucleotide-resolution tools that are not limited to specific experimental regimes or analytical workflows. Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data. As such, Plastid is designed to extract assay-specific information from read alignments while retaining generality and extensibility to novel NGS assays. Plastid represents NGS and other biological data as arrays of values associated with genomic or transcriptomic positions, and contains configurable tools to convert data from a variety of sources to such arrays. Plastid also includes numerous tools to manipulate even discontinuous genomic features, such as spliced transcripts, with nucleotide precision. Plastid automatically handles conversion between genomic and feature-centric coordinates, accounting for splicing and strand, freeing users of burdensome accounting. Finally, Plastid's data models use consistent and familiar biological idioms, enabling even beginners to develop sophisticated analytical workflows with minimal effort. Plastid is a versatile toolkit that has been used to analyze data from multiple NGS assays, including RNA-seq, ribosome profiling, and DMS-seq. 
It forms the genomic engine of our ORF annotation tool, ORF-RATER, and is readily adapted to novel NGS assays. Examples, tutorials, and extensive documentation can be found at https://plastid.readthedocs.io .
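The conversion between genomic and feature-centric coordinates described above can be pictured with a small standalone sketch. The code below is not Plastid's API; the function names and the exon representation (sorted, 0-based, half-open genomic intervals) are assumptions made for illustration. It shows how splicing and strand both enter the arithmetic.

```python
def transcript_length(exons):
    """Total spliced length of a transcript given its exon intervals."""
    return sum(end - start for start, end in exons)


def transcript_to_genome(exons, strand, tpos):
    """Map a 0-based transcript position to a 0-based genomic position.

    exons: list of (start, end) half-open genomic intervals, sorted ascending.
    strand: "+" or "-". On the minus strand, transcript position 0 is the
    last genomic base of the last exon.
    """
    length = transcript_length(exons)
    if not 0 <= tpos < length:
        raise ValueError("transcript position out of range")
    offset = tpos if strand == "+" else length - 1 - tpos
    for start, end in exons:
        size = end - start
        if offset < size:
            return start + offset
        offset -= size  # skip this exon and the intron that follows it


def genome_to_transcript(exons, strand, gpos):
    """Map a genomic position to a transcript position, or None if intronic."""
    covered = 0
    for start, end in exons:
        if start <= gpos < end:
            offset = covered + (gpos - start)
            if strand == "+":
                return offset
            return transcript_length(exons) - 1 - offset
        covered += end - start
    return None  # position falls in an intron or outside the transcript


if __name__ == "__main__":
    exons = [(100, 110), (200, 205)]  # two exons, spliced length 15
    print(transcript_to_genome(exons, "+", 10))
    print(transcript_to_genome(exons, "-", 0))
    print(genome_to_transcript(exons, "+", 150))
```

A library such as Plastid wraps this kind of bookkeeping in its own data models; consult its documentation for the actual classes and methods.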
Inhibition of CRISPR-Cas9 with Bacteriophage Proteins.
Rauch, Benjamin J; Silvis, Melanie R; Hultquist, Judd F; Waters, Christopher S; McGregor, Michael J; Krogan, Nevan J; Bondy-Denomy, Joseph
2017-01-12
Bacterial CRISPR-Cas systems utilize sequence-specific RNA-guided nucleases to defend against bacteriophage infection. As a countermeasure, numerous phages are known that produce proteins to block the function of class 1 CRISPR-Cas systems. However, currently no proteins are known to inhibit the widely used class 2 CRISPR-Cas9 system. To find these inhibitors, we searched cas9-containing bacterial genomes for the co-existence of a CRISPR spacer and its target, a potential indicator for CRISPR inhibition. This analysis led to the discovery of four unique type II-A CRISPR-Cas9 inhibitor proteins encoded by Listeria monocytogenes prophages. More than half of L. monocytogenes strains with cas9 contain at least one prophage-encoded inhibitor, suggesting widespread CRISPR-Cas9 inactivation. Two of these inhibitors also blocked the widely used Streptococcus pyogenes Cas9 when assayed in Escherichia coli and human cells. These natural Cas9-specific "anti-CRISPRs" present tools that can be used to regulate the genome engineering activities of CRISPR-Cas9. Copyright © 2017 Elsevier Inc. All rights reserved.
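The search for genomes in which a CRISPR spacer co-exists with its target can be sketched as follows. This is a minimal illustration, assuming exact string matching on both strands and a single known CRISPR array interval; the authors' actual pipeline, matching criteria, and any PAM requirements are not described in the abstract, and every name below is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return every start index (including overlapping hits) of pattern in text."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def self_targets(genome, spacers, array_start, array_end):
    """List (spacer, position, strand) for spacer matches outside the array.

    Assumptions: exact matches only, and the CRISPR array occupies the
    half-open interval [array_start, array_end). A hit outside that
    interval means the genome carries both a spacer and its target.
    """
    genome = genome.upper()
    found = []
    for spacer in spacers:
        spacer = spacer.upper()
        queries = (("+", spacer), ("-", reverse_complement(spacer)))
        for strand, query in queries:
            for pos in find_all(genome, query):
                outside = pos + len(query) <= array_start or pos >= array_end
                if outside:
                    found.append((spacer, pos, strand))
    return found


if __name__ == "__main__":
    spacer = "ACGTTGCAAGGCTTAACCGG"
    # Toy genome: spacer inside the "array" (40-80), target on the minus strand.
    toy = "T" * 50 + spacer + "T" * 30 + reverse_complement(spacer) + "T" * 20
    print(self_targets(toy, [spacer], array_start=40, array_end=80))
```

A genome that returns a non-empty list here would be a candidate for carrying an inhibitor, in the sense used above, since a functional system would be expected to cleave its own target.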
A novel totivirus-like virus isolated from bat guano.
Yang, Xinglou; Zhang, Yunzhi; Ge, Xingyi; Yuan, Junfa; Shi, Zhengli
2012-06-01
Previous metagenomic analysis indicated that numerous insect viruses exist in bat guano. In this study, we isolated a novel double-stranded RNA virus, a tentative member of the family Totiviridae, designated Tianjin totivirus (ToV-TJ), from bat feces. The virus is an icosahedral particle with a diameter of 40-43 nm, and it causes cytopathic effect in Sf9, Hz, and C6/36 cell lines. Full-length genomic sequence analysis showed that ToV-TJ shares high similarity with the totivirus OMRV-AK4, which was recently isolated from mosquitoes in Japan. The full-length genome of the ToV-TJ was 7611 bp and contained two predicted non-overlapping open reading frames (ORFs): ORF1, encoding the capsid protein (CP), and ORF2, encoding an RNA-dependent RNA polymerase. Bioassay of ToV-TJ by feeding on the larvae of Spodoptera exigua and Helicoverpa armigera (Hubner) suggests that this virus is not infectious for these two larvae in vivo. Sequences similar to that of ToV-TJ have been detected in bat feces sampled in Yunnan and Hainan Provinces, suggesting that this virus is widely distributed.
ISC, a Novel Group of Bacterial and Archaeal DNA Transposons That Encode Cas9 Homologs
Kapitonov, Vladimir V.; Makarova, Kira S.
2015-01-01
ABSTRACT Bacterial genomes encode numerous homologs of Cas9, the effector protein of the type II CRISPR-Cas systems. The homology region includes the arginine-rich helix and the HNH nuclease domain that is inserted into the RuvC-like nuclease domain. These genes, however, are not linked to cas genes or CRISPR. Here, we show that Cas9 homologs represent a distinct group of nonautonomous transposons, which we denote ISC (insertion sequences Cas9-like). We identify many diverse families of full-length ISC transposons and demonstrate that their terminal sequences (particularly 3′ termini) are similar to those of IS605 superfamily transposons that are mobilized by the Y1 tyrosine transposase encoded by the TnpA gene and often also encode the TnpB protein containing the RuvC-like endonuclease domain. The terminal regions of the ISC and IS605 transposons contain palindromic structures that are likely recognized by the Y1 transposase. The transposons from these two groups are inserted either exactly in the middle or upstream of specific 4-bp target sites, without target site duplication. We also identify autonomous ISC transposons that encode TnpA-like Y1 transposases. Thus, the nonautonomous ISC transposons could be mobilized in trans either by Y1 transposases of other, autonomous ISC transposons or by Y1 transposases of the more abundant IS605 transposons. These findings imply an evolutionary scenario in which the ISC transposons evolved from IS605 family transposons, possibly via insertion of a mobile group II intron encoding the HNH domain, and Cas9 subsequently evolved via immobilization of an ISC transposon. IMPORTANCE Cas9 endonucleases, the effectors of type II CRISPR-Cas systems, represent the new generation of genome-engineering tools. Here, we describe in detail a novel family of transposable elements that encode the likely ancestors of Cas9 and outline the evolutionary scenario connecting different varieties of these transposons and Cas9. PMID:26712934
Wee, Bryan A; Woolfit, Megan; Beatson, Scott A; Petty, Nicola K
2013-01-01
Legionella encodes multiple classes of Type IV Secretion Systems (T4SSs), including the Dot/Icm protein secretion system that is essential for intracellular multiplication in amoebal and human hosts. Other T4SSs not essential for virulence are thought to facilitate the acquisition of niche-specific adaptation genes including the numerous effector genes that are a hallmark of this genus. Previously, we identified two novel gene clusters in the draft genome of Legionella pneumophila strain 130b that encode homologues of a subtype of T4SS, the genomic island-associated T4SS (GI-T4SS), usually associated with integrative and conjugative elements (ICE). In this study, we performed genomic analyses of 14 homologous GI-T4SS clusters found in eight publicly available Legionella genomes and show that this cluster is unusually well conserved in a region of high plasticity. Phylogenetic analyses show that Legionella GI-T4SSs are substantially divergent from other members of this subtype of T4SS and represent a novel clade of GI-T4SSs only found in this genus. The GI-T4SS was found to be under purifying selection, suggesting it is functional and may play an important role in the evolution and adaptation of Legionella. Like other GI-T4SSs, the Legionella clusters are also associated with ICEs, but lack the typical integration and replication modules of related ICEs. The absence of complete replication and DNA pre-processing modules, together with the presence of Legionella-specific regulatory elements, suggest the Legionella GI-T4SS-associated ICE is unique and may employ novel mechanisms of regulation, maintenance and excision. The Legionella GI-T4SS cluster was found to be associated with several cargo genes, including numerous antibiotic resistance and virulence factors, which may confer a fitness benefit to the organism. 
The in-silico characterisation of this new T4SS furthers our understanding of the diversity of secretion systems involved in the frequent horizontal gene transfers that allow Legionella to adapt to and exploit diverse environmental niches. PMID:24358157
[The ENCODE project and functional genomics studies].
Ding, Nan; Qu, Hongzhu; Fang, Xiangdong
2014-03-01
Since the completion of the Human Genome Project, scientists have been trying to interpret the underlying genomic code for human biology. Since 2003, the National Human Genome Research Institute (NHGRI) has invested nearly $0.3 billion and gathered over 440 scientists from more than 32 institutions in the United States, China, United Kingdom, Japan, Spain and Singapore to initiate the Encyclopedia of DNA Elements (ENCODE) project, aiming to identify and analyze all regulatory elements in the human genome. Taking advantage of the development of next-generation sequencing technologies and continuous improvement of experimental methods, ENCODE has made remarkable achievements: it identified methylation and histone modification of DNA sequences and their regulatory effects on gene expression through altering chromatin structures, categorized binding sites of various transcription factors and constructed their regulatory networks, further revised and updated the database for pseudogenes and non-coding RNA, and identified SNPs in regulatory sequences associated with diseases. These findings help to comprehensively understand information embedded in gene and genome sequences, the function of regulatory elements as well as the molecular mechanism underlying the transcriptional regulation by noncoding regions, and provide an extensive data resource for life sciences, particularly for translational medicine. We reviewed the contributions of high-throughput sequencing platform development and bioinformatics technology improvement to the ENCODE project, the association between epigenetics studies and the ENCODE project, and the major achievements of the ENCODE project. We also provided our perspective on the role of the ENCODE project in promoting the development of basic and clinical medicine.
How to kill the honey bee larva: genomic potential and virulence mechanisms of Paenibacillus larvae.
Djukic, Marvin; Brzuszkiewicz, Elzbieta; Fünfhaus, Anne; Voss, Jörn; Gollnow, Kathleen; Poppinga, Lena; Liesegang, Heiko; Garcia-Gonzalez, Eva; Genersch, Elke; Daniel, Rolf
2014-01-01
Paenibacillus larvae, a Gram-positive bacterial pathogen, causes American Foulbrood (AFB), which is the most serious infectious disease of honey bees. In order to investigate the genomic potential of P. larvae, two strains belonging to two different genotypes were sequenced and used for comparative genome analysis. The complete genome sequence of P. larvae strain DSM 25430 (genotype ERIC II) consisted of 4,056,006 bp and harbored 3,928 predicted protein-encoding genes. The draft genome sequence of P. larvae strain DSM 25719 (genotype ERIC I) comprised 4,579,589 bp and contained 4,868 protein-encoding genes. Both strains harbored a 9.7 kb plasmid and encoded a large number of virulence-associated proteins such as toxins and collagenases. In addition, genes encoding large multimodular enzymes producing nonribosomal peptides or polyketides were identified. In the genome of strain DSM 25719, seven toxin-associated loci were identified and analyzed. Five of them encoded putatively functional toxins. The genome of strain DSM 25430 harbored several toxin loci that showed similarity to corresponding loci in the genome of strain DSM 25719, but were non-functional due to point mutations or disruption by transposases. Although both strains cause AFB, significant differences between the genomes were observed, including genome size, number and composition of transposases, insertion elements, predicted phage regions, and strain-specific island-like regions. Transposases, integrases and recombinases are important drivers of genome plasticity. A total of 390 and 273 mobile elements were found in strain DSM 25430 and strain DSM 25719, respectively. Comparative genomics of both strains revealed acquisition of virulence factors by horizontal gene transfer and provided insights into evolution and pathogenicity.
Horizontal gene transfer of chromosomal Type II toxin-antitoxin systems of Escherichia coli.
Ramisetty, Bhaskar Chandra Mohan; Santhosh, Ramachandran Sarojini
2016-02-01
Type II toxin-antitoxin systems (TAs) are small autoregulated bicistronic operons that encode a toxin protein with the potential to inhibit metabolic processes and an antitoxin protein to neutralize the toxin. Most bacterial genomes encode multiple TAs. However, the diversity and accumulation of TAs on bacterial genomes and their physiological implications are highly debated. Here we provide evidence that Escherichia coli chromosomal TAs (encoding RNase toxins) are 'acquired' DNA that likely originated from heterologous DNA and are the smallest known autoregulated operons with the potential for horizontal propagation. Sequence analyses revealed that integration of TAs into the bacterial genome is unique and contributes to variations in the coding and/or regulatory regions of flanking host genome sequences. Plasmids and genomes encoding identical TAs of natural isolates are mutually exclusive. Chromosomal TAs might play significant roles in the evolution and ecology of bacteria by contributing to host genome variation and by moderation of plasmid maintenance.
Lei, Wanjun; Ni, Dapeng; Wang, Yujun; Shao, Junjie; Wang, Xincun; Yang, Dan; Wang, Jinsheng; Chen, Haimei; Liu, Chang
2016-02-22
Astragalus membranaceus is an important medicinal plant in Asia. Several of its varieties have been used interchangeably as raw materials for commercial production. High-resolution genetic markers are urgently needed to distinguish these varieties. Here, we sequenced and analyzed the chloroplast genome of A. membranaceus (Fisch.) Bunge var. mongholicus (Bunge) P.K. Hsiao using next-generation DNA sequencing technology. The genome was assembled using Abyss and then subjected to gene prediction using CPGAVAS and repeat analysis using MISA, Tandem Repeats Finder, and REPuter. Finally, the genome was subjected to phylogenetic and comparative genomic analyses. The complete genome is 123,582 bp long, containing only one copy of the inverted repeat. Gene prediction revealed 110 genes encoding 76 proteins, 30 tRNAs, and four rRNAs. Five intra-specific hypermutation loci were identified, three of which are heteroplasmic. Furthermore, three gene losses and two large inversions were identified. Comparative genomic analyses demonstrated the dynamic nature of the Papilionoideae chloroplast genomes, which showed occurrence of numerous hypermutation loci, frequent gene losses, and fragment inversions. Results obtained herein elucidate the complex evolutionary history of chloroplast genomes and have laid the foundation for the identification of genetic markers to distinguish A. membranaceus varieties.
Gross, Roy; Guzman, Carlos A; Sebaihia, Mohammed; dos Santos, Vítor A P Martins; Pieper, Dietmar H; Koebnik, Ralf; Lechner, Melanie; Bartels, Daniela; Buhrmester, Jens; Choudhuri, Jomuna V; Ebensen, Thomas; Gaigalat, Lars; Herrmann, Stefanie; Khachane, Amit N; Larisch, Christof; Link, Stefanie; Linke, Burkhard; Meyer, Folker; Mormann, Sascha; Nakunst, Diana; Rückert, Christian; Schneiker-Bekel, Susanne; Schulze, Kai; Vorhölter, Frank-Jörg; Yevsa, Tetyana; Engle, Jacquelyn T; Goldman, William E; Pühler, Alfred; Göbel, Ulf B; Goesmann, Alexander; Blöcker, Helmut; Kaiser, Olaf; Martinez-Arias, Rosa
2008-09-30
Bordetella petrii is the only environmental species hitherto found among the otherwise host-restricted and pathogenic members of the genus Bordetella. Phylogenetically, it connects the pathogenic Bordetellae and environmental bacteria of the genera Achromobacter and Alcaligenes, which are opportunistic pathogens. B. petrii strains have been isolated from very different environmental niches, including river sediment, polluted soil, marine sponges and a grass root. Recently, clinical isolates associated with bone degenerative disease or cystic fibrosis have also been described. In this manuscript we present the results of the analysis of the completely annotated genome sequence of the B. petrii strain DSMZ12804. B. petrii has a mosaic genome of 5,287,950 bp harboring numerous mobile genetic elements, including seven large genomic islands. Four of them are highly related to the clc element of Pseudomonas knackmussii B13, which encodes genes involved in the degradation of aromatics. Though being an environmental isolate, the sequenced B. petrii strain also encodes proteins related to virulence factors of the pathogenic Bordetellae, including the filamentous hemagglutinin, which is a major colonization factor of B. pertussis, and the master virulence regulator BvgAS. However, it lacks all known toxins of the pathogenic Bordetellae. The genomic analysis suggests that B. petrii represents an evolutionary link between free-living environmental bacteria and the host-restricted obligate pathogenic Bordetellae. Its remarkable metabolic versatility may enable B. petrii to thrive in very different ecological niches. PMID:18826580
Epigenetics, chromatin and genome organization: recent advances from the ENCODE project.
Siggens, L; Ekwall, K
2014-09-01
The organization of the genome into functional units, such as enhancers and active or repressed promoters, is associated with distinct patterns of DNA and histone modifications. The Encyclopedia of DNA Elements (ENCODE) project has advanced our understanding of the principles of genome, epigenome and chromatin organization, identifying hundreds of thousands of potential regulatory regions and transcription factor binding sites. Part of the ENCODE consortium, GENCODE, has annotated the human genome with novel transcripts including new noncoding RNAs and pseudogenes, highlighting transcriptional complexity. Many disease variants identified in genome-wide association studies are located within putative enhancer regions defined by the ENCODE project. Understanding the principles of chromatin and epigenome organization will help to identify new disease mechanisms, biomarkers and drug targets, particularly as ongoing epigenome mapping projects generate data for primary human cell types that play important roles in disease.
Kirsch, Petra; Jores, Jörg; Wieler, Lothar H
2004-01-01
Many bacterial virulence attributes, such as toxins, adhesins, invasins and iron uptake systems, are encoded within specific regions of the bacterial genome. These regions, which vary in size, are termed pathogenicity islands (PAIs) since they confer pathogenic properties on the respective micro-organism. By definition PAIs are found exclusively in pathogenic strains and are often inserted near transfer-RNA genes. Nevertheless, non-pathogenic bacteria also possess foreign DNA elements that confer advantageous features, leading to improved fitness. These additional DNA elements as well as PAIs are termed genomic islands and were acquired during bacterial evolution. Significant G+C content deviation in pathogenicity islands with respect to the rest of the genome, the presence of direct repeat sequences at the flanking regions, the presence of integrase gene determinants and other mobility features, the particular insertion site (tRNA gene) as well as the observed genetic instability suggest that pathogenicity islands were acquired by horizontal gene transfer. PAIs are fascinating proof of the plasticity of bacterial genomes. PAIs were originally described in human pathogenic Escherichia (E.) coli strains. In the meantime PAIs have been found in various pathogenic bacteria of humans, animals and even plants. The Locus of Enterocyte Effacement (LEE) is one particular widely distributed PAI of E. coli. In addition, it also confers pathogenicity on the related species Citrobacter (C.) rodentium and Escherichia (E.) alvei. The LEE is an important virulence feature of several animal pathogens. It is an obligate PAI of all animal and human enteropathogenic E. coli (EPEC), and most enterohaemorrhagic E. coli (EHEC) also harbor the LEE. The LEE encodes a type III secretion system, an adhesin (intimin) that mediates the intimate contact between the bacterium and the epithelial cell, as well as various proteins which are secreted via the type III secretion system. The LEE-encoded virulence features are responsible for the formation of so-called attaching and effacing (AE) lesions in the intestinal epithelium. Due to its wide distribution in animal pathogens, LEE-encoded antigens are suitable vaccine antigens. Acquisition and structure of the LEE pathogenicity island is the crucial point of numerous investigations. However, the evolution of the LEE, its origin and further spread in E. coli, are far from being resolved.
Yamada, Takuji; Waller, Alison S; Raes, Jeroen; Zelezniak, Aleksej; Perchat, Nadia; Perret, Alain; Salanoubat, Marcel; Patil, Kiran R; Weissenbach, Jean; Bork, Peer
2012-01-01
Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction. PMID:22569339
Sharma, Sandeep; Zaccaron, Alex Z; Ridenour, John B; Allen, Tom W; Conner, Kassie; Doyle, Vinson P; Price, Trey; Sikora, Edward; Singh, Raghuwinder; Spurlock, Terry; Tomaso-Peterson, Maria; Wilkerson, Tessie; Bluhm, Burton H
2018-04-01
The draft genome of Xylaria sp. isolate MSU_SB201401, causal agent of taproot decline of soybean in the southern U.S., is presented here. The genome assembly was 56.7 Mb in size with an L50 of 246. A total of 10,880 putative protein-encoding genes were predicted, including 647 genes encoding carbohydrate-active enzymes and 1053 genes encoding secreted proteins. This is the first draft genome of a plant-pathogenic Xylaria sp. associated with soybean. The draft genome of Xylaria sp. isolate MSU_SB201401 will provide an important resource for future experiments to determine the molecular basis of pathogenesis.
The complete mitochondrial genome sequence of Eimeria innocua (Eimeriidae, Coccidia, Apicomplexa).
Hafeez, Mian Abdul; Vrba, Vladimir; Barta, John Robert
2016-07-01
The complete mitochondrial genome of Eimeria innocua KR strain (Eimeriidae, Coccidia, Apicomplexa) was sequenced. This coccidium infects turkeys (Meleagris gallopavo), Bobwhite quails (Colinus virginianus), and Grey partridges (Perdix perdix). Genome organization and gene contents were comparable with other Eimeria spp. infecting galliform birds. The circular-mapping mt genome of E. innocua is 6247 bp in length with three protein-coding genes (cox1, cox3, and cytb), 19 gene fragments encoding large subunit (LSU) rRNA and 14 gene fragments encoding small subunit (SSU) rRNA. Like other Apicomplexa, no tRNA was encoded. The mitochondrial genome of E. innocua confirms its close phylogenetic affinities to Eimeria dispersa.
A decade of human genome project conclusion: Scientific diffusion about our genome knowledge.
Moraes, Fernanda; Góes, Andréa
2016-05-06
The Human Genome Project (HGP) was initiated in 1990 and completed in 2003. It aimed to sequence the whole human genome. Although it represented an advance in understanding the human genome and its complexity, many questions remained unanswered. Other projects were launched in order to unravel the mysteries of our genome, including the ENCyclopedia of DNA Elements (ENCODE). This review aims to analyze the evolution of scientific knowledge related to both the HGP and ENCODE projects. Data were retrieved from scientific articles published in 1990-2014, a period comprising the development and the 10 years following the HGP completion. The fact that only 20,000 genes are protein and RNA-coding is one of the most striking HGP results. A new concept about the organization of the genome arose. The ENCODE project was initiated in 2003 and aimed to map the functional elements of the human genome. This project revealed that the human genome is pervasively transcribed. Therefore, it was determined that a large part of the non-protein coding regions are functional. Finally, a more sophisticated view of chromatin structure emerged. The mechanistic functioning of the genome has been redrafted, revealing a much more complex picture. Besides, a gene-centric conception of the organism has to be reviewed. A number of criticisms have emerged against the ENCODE project approaches, raising the question of whether non-conserved but biochemically active regions are truly functional. Thus, the HGP and ENCODE projects accomplished a great map of the human genome, but the data generated still require further in-depth analysis. © 2016 by The International Union of Biochemistry and Molecular Biology, 44:215-223, 2016.
Grohmann, L; Brennicke, A; Schuster, W
1992-01-01
The Oenothera mitochondrial genome contains only a gene fragment for ribosomal protein S12 (rps12), while other plants encode a functional gene in the mitochondrion. The complete Oenothera rps12 gene is located in the nucleus. The transit sequence necessary to target this protein to the mitochondrion is encoded by a 5'-extension of the open reading frame. Comparison of the amino acid sequence encoded by the nuclear gene with the polypeptides encoded by edited mitochondrial cDNA and genomic sequences of other plants suggests that gene transfer between mitochondrion and nucleus started from edited mitochondrial RNA molecules. Mechanisms and requirements of gene transfer and activation are discussed. PMID:1454526
Prostate cancer epigenetics and its clinical implications
Yegnasubramanian, Srinivasan
2016-01-01
Normal cells have a level of epigenetic programming that is superimposed on the genetic code to establish and maintain their cell identity and phenotypes. This epigenetic programming can be thought of as the architecture, a sort of cityscape, that is built upon the underlying genetic landscape. The epigenetic programming is encoded by a complex set of chemical marks on DNA, on histone proteins in nucleosomes, and by numerous context-specific DNA, RNA, and protein interactions that all regulate the structure, organization, and function of the genome in a given cell. It is becoming increasingly evident that abnormalities in both the genetic landscape and epigenetic cityscape can cooperate to drive carcinogenesis and disease progression. Large-scale cancer genome sequencing studies have revealed that mutations in genes encoding the enzymatic machinery for shaping the epigenetic cityscape are among the most common mutations observed in human cancers, including prostate cancer. Interestingly, although the constellation of genetic mutations in a given cancer can be quite heterogeneous from person to person, there are numerous epigenetic alterations that appear to be highly recurrent, and nearly universal in a given cancer type, including in prostate cancer. The highly recurrent nature of these alterations can be exploited for development of biomarkers for cancer detection and risk stratification and as targets for therapeutic intervention. Here, we explore the basic principles of epigenetic processes in normal cells and prostate cancer cells and discuss the potential clinical implications with regard to prostate cancer biomarker development and therapy. PMID:27212125
USDA-ARS's Scientific Manuscript database
The cattle tick, Rhipicephalus (Boophilus) microplus, has a genome over 2.4 times the size of the human genome, and with over 70% of repetitive DNA, this genome would prove very costly to sequence at today's prices and difficult to assemble and analyze. BAC clones give insight into the genome struct...
Is junk DNA bunk? A critique of ENCODE.
Doolittle, W Ford
2013-04-02
Do data from the Encyclopedia Of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE's ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value then, (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE's definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed.
Characterization of "cis"-regulatory elements ("c"RE) associated with mammary gland function
USDA-ARS's Scientific Manuscript database
The Bos taurus genome assembly has propelled dairy science into a new era; still, most of the information encoded in the genome has not yet been decoded. The human Encyclopedia of DNA Elements (ENCODE) project has spearheaded the identification and annotation of functional genomic elements in the hu...
Runcharoen, Chakkaphan; Raven, Kathy E; Reuter, Sandra; Kallonen, Teemu; Paksanont, Suporn; Thammachote, Jeeranan; Anun, Suthatip; Blane, Beth; Parkhill, Julian; Peacock, Sharon J; Chantratita, Narisara
2017-09-06
Tackling multidrug-resistant Escherichia coli requires evidence from One Health studies that capture numerous potential reservoirs in circumscribed geographic areas. We conducted a survey of extended-spectrum β-lactamase (ESBL)-producing E. coli isolated from patients, canals and livestock wastewater in eastern Thailand between 2014 and 2015, and analyzed isolates using whole genome sequencing. The bacterial collection of 149 isolates consisted of 84 isolates from a single hospital and 65 from the hospital sewer, canals and farm wastewater within a 20 km radius. E. coli ST131 predominated in the clinical collection (28.6%), but was uncommon in the environment. Genome-based comparison of E. coli from infected patients and their immediate environment indicated low genetic similarity overall between the two, although three clinical-environmental isolate pairs differed by ≤ 5 single nucleotide polymorphisms. Thai E. coli isolates were dispersed throughout a phylogenetic tree containing a global E. coli collection. All Thai ESBL-positive E. coli isolates were multidrug resistant, including high rates of resistance to tobramycin (77.2%), gentamicin (77.2%), ciprofloxacin (67.8%) and trimethoprim (68.5%). ESBL was encoded by six different CTX-M elements and SHV-12. Three isolates from clinical samples (n = 2) or a hospital sewer (n = 1) were resistant to the carbapenem drugs (encoded by NDM-1, NDM-5 or GES-5), and three isolates (clinical (n = 1) and canal water (n = 2)) were resistant to colistin (encoded by mcr-1); no isolates were resistant to both carbapenems and colistin. Tackling ESBL-producing E. coli in this setting will be challenging based on widespread distribution, but the low prevalence of resistance to carbapenems and colistin suggests that efforts are now required to prevent these from becoming ubiquitous.
Rüping, Boris; Ernst, Antonia M; Jekat, Stephan B; Nordzieke, Steffen; Reineke, Anna R; Müller, Boje; Bornberg-Bauer, Erich; Prüfer, Dirk; Noll, Gundula A
2010-10-08
The phloem of dicotyledonous plants contains specialized P-proteins (phloem proteins) that accumulate during sieve element differentiation and remain parietally associated with the cisternae of the endoplasmic reticulum in mature sieve elements. Wounding causes P-protein filaments to accumulate at the sieve plates and block the translocation of photosynthate. Specialized, spindle-shaped P-proteins known as forisomes that undergo reversible calcium-dependent conformational changes have evolved exclusively in the Fabaceae. Recently, the molecular characterization of three genes encoding forisome components in the model legume Medicago truncatula (MtSEO1, MtSEO2 and MtSEO3; SEO = sieve element occlusion) was reported, but little is known about the molecular characteristics of P-proteins in non-Fabaceae. We performed a comprehensive genome-wide comparative analysis by screening the M. truncatula, Glycine max, Arabidopsis thaliana, Vitis vinifera and Solanum phureja genomes, and a Malus domestica EST library for homologs of MtSEO1, MtSEO2 and MtSEO3 and identified numerous novel SEO genes in Fabaceae and even non-Fabaceae plants, which do not possess forisomes. Even in Fabaceae some SEO genes appear to not encode forisome components. All SEO genes have a similar exon-intron structure and are expressed predominantly in the phloem. Phylogenetic analysis revealed the presence of several subgroups with Fabaceae-specific subgroups containing all of the known as well as newly identified forisome component proteins. We constructed Hidden Markov Models that identified three conserved protein domains, which characterize SEO proteins when present in combination. In addition, one common and three subgroup specific protein motifs were found in the amino acid sequences of SEO proteins. SEO genes are organized in genomic clusters and the conserved synteny allowed us to identify several M. truncatula vs G. max orthologs as well as paralogs within the G. max genome. 
The unexpected occurrence of forisome-like genes in non-Fabaceae plants may indicate that these genes encode species-specific P-proteins, which is backed up by the phloem-specific expression profiles. The conservation of gene structure, the presence of specific motifs and domains and the genomic synteny argue for a common phylogenetic origin of forisomes and other P-proteins.
Glioblastoma (GBM) is the most common primary brain tumor and has a dismal prognosis. Amplification of chromosome 12q13-q15 (Cyclin-dependent kinase 4 (CDK4) amplicon) is frequently observed in numerous human cancers including GBM. Phosphoinositide 3-kinase enhancer (PIKE) is a group of GTP-binding proteins that belong to the subgroup of centaurin GTPase family, encoded by CENTG1 located in CDK4 amplicon. However, the pathological significance of CDK4 amplicon in GBM formation remains incompletely understood.
Genomic understanding of dinoflagellates.
Lin, Senjie
2011-01-01
The phylum of dinoflagellates is characterized by many unusual and interesting genomic and physiological features, the imprint of which, in its immense genome, remains elusive. Much novel understanding has been achieved in the last decade on various aspects of dinoflagellate biology, but most remarkably about the structure, expression pattern and epigenetic modification of protein-coding genes in the nuclear and organellar genomes. Major findings include: 1) the great diversity of dinoflagellates, especially at the base of the dinoflagellate tree of life; 2) mini-circularization of the genomes of typical dinoflagellate plastids (with three membranes, chlorophylls a, c1 and c2, and carotenoid peridinin), the scrambled mitochondrial genome and the extensive mRNA editing occurring in both systems; 3) ubiquitous spliced leader trans-splicing of nuclear-encoded mRNA and demonstrated potential as a novel tool for studying dinoflagellate transcriptomes in mixed cultures and natural assemblages; 4) existence and expression of histones and other nucleosomal proteins; 5) a ribosomal protein set expected of typical eukaryotes; 6) genetic potential of non-photosynthetic solar energy utilization via proton-pump rhodopsin; 7) gene candidates in the toxin synthesis pathways; and 8) evidence of a highly redundant, high gene number and highly recombined genome. Despite this progress, much more work awaits genome-wide transcriptome and whole genome sequencing in order to unfold the molecular mechanisms underlying the numerous mysterious attributes of dinoflagellates.
A User's Guide to the Encyclopedia of DNA Elements (ENCODE)
2011-01-01
The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome. PMID:21526222
A New Family of Secreted Toxins in Pathogenic Neisseria Species
Jamet, Anne; Jousset, Agnès B.; Euphrasie, Daniel; Mukorako, Paulette; Boucharlat, Alix; Ducousso, Alexia; Charbit, Alain; Nassif, Xavier
2015-01-01
The genus Neisseria includes both commensal and pathogenic species which are genetically closely related. However, only meningococcus and gonococcus are important human pathogens. Very few toxins are known to be secreted by pathogenic Neisseria species. Recently, toxins secreted via type V secretion system and belonging to the widespread family of contact-dependent inhibition (CDI) toxins have been described in numerous species including meningococcus. In this study, we analyzed loci containing the maf genes in N. meningitidis and N. gonorrhoeae and proposed a novel uniform nomenclature for maf genomic islands (MGIs). We demonstrated that mafB genes encode secreted polymorphic toxins and that genes immediately downstream of mafB encode a specific immunity protein (MafI). We focused on a MafB toxin found in meningococcal strain NEM8013 and characterized its EndoU ribonuclease activity. maf genes represent 2% of the genome of pathogenic Neisseria, and are virtually absent from non-pathogenic species, thus arguing for an important biological role. Indeed, we showed that overexpression of one of the four MafB toxins of strain NEM8013 provides an advantage in competition assays, suggesting a role of maf loci in niche adaptation. PMID:25569427
2010-01-01
Background: Rhodospirillum centenum is a photosynthetic non-sulfur purple bacterium that favors growth in an anoxygenic, photosynthetic N2-fixing environment. It is emerging as a genetically amenable model organism for molecular genetic analysis of cyst formation, photosynthesis, phototaxis, and cellular development. Here, we present an analysis of the genome of this bacterium. Results: R. centenum contains a single circular chromosome of 4,355,548 base pairs harboring 4,105 genes. It has an intact Calvin cycle with two forms of Rubisco, as well as a gene encoding phosphoenolpyruvate carboxylase (PEPC) for mixotrophic CO2 fixation. This dual carbon-fixation system may be required for regulating internal carbon flux to facilitate bacterial nitrogen assimilation. Enzymatic reactions associated with arsenate and mercuric detoxification are rare or unique compared to other purple bacteria. Among numerous newly identified signal transduction proteins, of particular interest is a putative bacteriophytochrome that is phylogenetically distinct from a previously characterized R. centenum phytochrome, Ppr. Genes encoding proteins involved in chemotaxis as well as a sophisticated dual flagellar system have also been mapped. Conclusions: Remarkable metabolic versatility and a superior capability for photoautotrophic carbon assimilation are evident in R. centenum. PMID:20500872
The not so universal tree of life or the place of viruses in the living world
Brüssow, Harald
2009-01-01
Darwin provided a great unifying theory for biology; its visual expression is the universal tree of life. The tree concept is challenged by the occurrence of horizontal gene transfer and—as summarized in this review—by the omission of viruses. Microbial ecologists have demonstrated that viruses are the most numerous biological entities on earth, outnumbering cells by a factor of 10. Viral genomics have revealed an unexpected size and distinctness of the viral DNA sequence space. Comparative genomics has shown elements of vertical evolution in some groups of viruses. Furthermore, structural biology has demonstrated links between viruses infecting the three domains of life pointing to a very ancient origin of viruses. However, presently viruses do not find a place on the universal tree of life, which is thus only a tree of cellular life. In view of the polythetic nature of current life definitions, viruses cannot be dismissed as non-living material. On earth we have therefore at least two large DNA sequence spaces, one represented by capsid-encoding viruses and another by ribosome-encoding cells. Despite their probable distinct evolutionary origin, both spheres were and are connected by intensive two-way gene transfers. PMID:19571246
Swart, Estienne C.; Bracht, John R.; Magrini, Vincent; Minx, Patrick; Chen, Xiao; Zhou, Yi; Khurana, Jaspreet S.; Goldman, Aaron D.; Nowacki, Mariusz; Schotanus, Klaas; Jung, Seolkyoung; Fulton, Robert S.; Ly, Amy; McGrath, Sean; Haub, Kevin; Wiggins, Jessica L.; Storton, Donna; Matese, John C.; Parsons, Lance; Chang, Wei-Jen; Bowen, Michael S.; Stover, Nicholas A.; Jones, Thomas A.; Eddy, Sean R.; Herrick, Glenn A.; Doak, Thomas G.; Wilson, Richard K.; Mardis, Elaine R.; Landweber, Laura F.
2013-01-01
The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor “silent” germline micronuclear genome by a process of “unscrambling” and fragmentation. The tiny macronuclear “nanochromosomes” typically encode single, protein-coding genes (a small portion, 10%, encode 2–8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing studies of rearrangements arising during evolution and disease. PMID:23382650
Nougairede, Antoine; De Fabritus, Lauriane; Aubry, Fabien; Gould, Ernest A; Holmes, Edward C; de Lamballerie, Xavier
2013-02-01
Large-scale codon re-encoding represents a powerful method of attenuating viruses to generate safe and cost-effective vaccines. In contrast to specific approaches of codon re-encoding which modify genome-scale properties, we evaluated the effects of random codon re-encoding on the re-emerging human pathogen Chikungunya virus (CHIKV), and assessed the stability of the resultant viruses during serial in cellulo passage. Using different combinations of three 1.4 kb randomly re-encoded regions located throughout the CHIKV genome six codon re-encoded viruses were obtained. Introducing a large number of slightly deleterious synonymous mutations reduced the replicative fitness of CHIKV in both primate and arthropod cells, demonstrating the impact of synonymous mutations on fitness. Decrease of replicative fitness correlated with the extent of re-encoding, an observation that may assist in the modulation of viral attenuation. The wild-type and two re-encoded viruses were passaged 50 times either in primate or insect cells, or in each cell line alternately. These viruses were analyzed using detailed fitness assays, complete genome sequences and the analysis of intra-population genetic diversity. The response to codon re-encoding and adaptation to culture conditions occurred simultaneously, resulting in significant replicative fitness increases for both re-encoded and wild type viruses. Importantly, however, the most re-encoded virus failed to recover its replicative fitness. Evolution of these viruses in response to codon re-encoding was largely characterized by the emergence of both synonymous and non-synonymous mutations, sometimes located in genomic regions other than those involving re-encoding, and multiple convergent and compensatory mutations. However, there was a striking absence of codon reversion (<0.4%). Finally, multiple mutations were rapidly fixed in primate cells, whereas mosquito cells acted as a brake on evolution. 
In conclusion, random codon re-encoding provides important information on the evolution and genetic stability of CHIKV viruses and could be exploited to develop a safe, live attenuated CHIKV vaccine.
Significant expansion of exon-bordering protein domains during animal proteome evolution
Liu, Mingyi; Walch, Heiko; Wu, Shaoping; Grigoriev, Andrei
2005-01-01
We present evidence of remarkable genome-wide mobility and evolutionary expansion for a class of protein domains whose borders locate close to the borders of their encoding exons. These exon-bordering domains are more numerous and widely distributed in the human genome than other domains. They also co-occur with more diverse domains to form a larger variety of domain architectures in human proteins. A systematic comparison of nine animal genomes from nematodes to mammals revealed that exon-bordering domains expanded faster than other protein domains in both abundance and distribution, as well as the diversity of co-occurring domains and the domain architectures of harboring proteins. Furthermore, exon-bordering domains exhibited a particularly strong preference for class 1-1 intron phase. Our findings suggest that exon-bordering domains were amplified and interchanged within a genome more often and/or more successfully than other domains during evolution, probably the result of extensive exon shuffling and gene duplication events. The diverse biological functions of these domains underscore the important role they play in the expansion and diversification of animal proteomes. PMID:15640447
Solving the Problem: Genome Annotation Standards before the Data Deluge
Klimke, William; O'Donovan, Claire; White, Owen; Brister, J. Rodney; Clark, Karen; Fedorov, Boris; Mizrachi, Ilene; Pruitt, Kim D.; Tatusova, Tatiana
2011-01-01
The promise of genome sequencing was that the vast undiscovered country would be mapped out by comparison of the multitude of sequences available and would aid researchers in deciphering the role of each gene in every organism. Researchers recognize that there is a need for high quality data. However, different annotation procedures, numerous databases, and a diminishing percentage of experimentally determined gene functions have resulted in a spectrum of annotation quality. NCBI in collaboration with sequencing centers, archival databases, and researchers, has developed the first international annotation standards, a fundamental step in ensuring that high quality complete prokaryotic genomes are available as gold standard references. Highlights include the development of annotation assessment tools, community acceptance of protein naming standards, comparison of annotation resources to provide consistent annotation, and improved tracking of the evidence used to generate a particular annotation. The development of a set of minimal standards, including the requirement for annotated complete prokaryotic genomes to contain a full set of ribosomal RNAs, transfer RNAs, and proteins encoding core conserved functions, is an historic milestone. The use of these standards in existing genomes and future submissions will increase the quality of databases, enabling researchers to make accurate biological discoveries. PMID:22180819
Toxin-antitoxin systems and regulatory mechanisms in Mycobacterium tuberculosis.
Slayden, Richard A; Dawson, Clinton C; Cummings, Jason E
2018-06-01
There has been a significant reduction in annual tuberculosis incidence since the World Health Organization declared tuberculosis a global health threat. However, treatment of M. tuberculosis infections requires lengthy multidrug therapeutic regimens to achieve a durable cure. The development of new drugs that are active against resistant strains and phenotypically diverse organisms continues to present the greatest challenge for the future. Numerous phylogenomic analyses have revealed that the Mtb genome encodes a significantly expanded repertoire of toxin-antitoxin (TA) loci that makes up the Mtb TA system. A TA locus is a two-gene operon encoding a 'toxin' protein that inhibits bacterial growth and an interacting 'antitoxin' partner that neutralizes the inhibitory activity of the toxin. The presence of multiple chromosomally encoded TA loci in Mtb raises important questions in regard to expansion, regulation and function. Thus, the functional roles of TA loci in Mtb pathogenesis have received considerable attention over the last decade. The cumulative results indicate that they are involved in regulating adaptive responses to stresses associated with the host environment and drug treatment. Here we review the TA families encoded in Mtb and discuss the duplication of TA loci in Mtb, the regulatory mechanisms of TA loci, and phenotypic heterogeneity and pathogenesis.
Bao, Xuerui; Yang, Ling; Chen, Lequn; Li, Bing; Li, Lin; Li, Yanyan; Xu, Zhenbo
2017-08-01
Cronobacter sakazakii is an opportunistic pathogen responsible for necrotizing enterocolitis, meningitis and septicaemia, especially in infants and neonates, with lethality ranging from 40% to 80%. This strain is able to survive in infant milk formula and is capable of pathogenicity and virulence, biofilm formation, and high resistance to elevated osmotic stress, low pH, heat, oxidation, and desiccation. This study aims to investigate the molecular characteristics of Cronobacter sakazakii BAA 894, including the mechanisms of its invasion and adherence, biofilm formation, and unusual resistance to environmental stress, employing whole genome sequencing and comparative genomics. Results in this study suggest that numerous genes and pathways, such as LysM, Cyx system, luxS, vancomycin resistance pathway, insulin resistance pathway, and sod encoding superoxide dismutase for the survival of C. sakazakii in macrophages, contribute to the pathogenicity of C. sakazakii BAA 894 and its resistance to stressful environments. Copyright © 2017. Published by Elsevier Ltd.
Audit, Benjamin; Zaghloul, Lamia; Baker, Antoine; Arneodo, Alain; Chen, Chun-Long; d'Aubenton-Carafa, Yves; Thermes, Claude
2013-01-01
In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.
Diop, Awa; Diop, Khoudia; Tomei, Enora; Raoult, Didier; Fenollar, Florence; Fournier, Pierre-Edouard
2018-03-01
We report here the draft genome sequence of Ezakiella peruensis strain M6.X2T. The draft genome is 1,672,788 bp long and harbors 1,589 predicted protein-encoding genes, including 26 antibiotic resistance genes with 1 gene encoding vancomycin resistance. The genome also exhibits 1 clustered regularly interspaced short palindromic repeat region and 333 genes acquired by horizontal gene transfer. Copyright © 2018 Diop et al.
Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay
2017-10-17
Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148
Chakrabarti, Kausik; Pearson, Michael; Grate, Leslie; Sterne-Weiler, Timothy; Deans, Jonathan; Donohue, John Paul; Ares, Manuel
2007-01-01
As the genomes of more eukaryotic pathogens are sequenced, understanding how molecular differences between parasite and host might be exploited to provide new therapies has become a major focus. Central to cell function are RNA-containing complexes involved in gene expression, such as the ribosome, the spliceosome, snoRNAs, RNase P, and telomerase, among others. In this article we identify by comparative genomics and validate by RNA analysis numerous previously unknown structural RNAs encoded by the Plasmodium falciparum genome, including the telomerase RNA, U3, 31 snoRNAs, as well as previously predicted spliceosomal snRNAs, SRP RNA, MRP RNA, and RNAse P RNA. Furthermore, we identify six new RNA coding genes of unknown function. To investigate the relationships of the RNA coding genes to other genomic features in related parasites, we developed a genome browser for P. falciparum (http://areslab.ucsc.edu/cgi-bin/hgGateway). Additional experiments provide evidence supporting the prediction that snoRNAs guide methylation of a specific position on U4 snRNA, as well as predicting an snRNA promoter element particular to Plasmodium sp. These findings should allow detailed structural comparisons between the RNA components of the gene expression machinery of the parasite and its vertebrate hosts. PMID:17901154
2012-01-01
Background: Natrialba magadii is an aerobic chemoorganotrophic member of the Euryarchaeota and is a dual extremophile requiring alkaline conditions and hypersalinity for optimal growth. The genome sequence of Nab. magadii type strain ATCC 43099 was deciphered to obtain a comprehensive insight into the genetic content of this haloarchaeon and to understand the basis of some of the cellular functions necessary for its survival. Results: The genome of Nab. magadii consists of four replicons with a total sequence of 4,443,643 bp and encodes 4,212 putative proteins, some of which contain peptide repeats of various lengths. Comparative genome analyses facilitated the identification of genes encoding putative proteins involved in adaptation to hypersalinity, stress response, glycosylation, and polysaccharide biosynthesis. A proton-driven ATP synthase and a variety of putative cytochromes and other proteins supporting aerobic respiration and electron transfer were encoded by one or more of Nab. magadii replicons. The genome encodes a number of putative proteases/peptidases as well as protein secretion functions. Genes encoding putative transcriptional regulators, basal transcription factors, signal perception/transduction proteins, and chemotaxis/phototaxis proteins were abundant in the genome. Pathways for the biosynthesis of thiamine, riboflavin, heme, cobalamin, coenzyme F420 and other essential co-factors were deduced by in depth sequence analyses. However, approximately 36% of Nab. magadii protein coding genes could not be assigned a function based on Blast analysis and have been annotated as encoding hypothetical or conserved hypothetical proteins. Furthermore, despite extensive comparative genomic analyses, genes necessary for survival in alkaline conditions could not be identified in Nab. magadii. Conclusions: Based on genomic analyses, Nab. magadii is predicted to be metabolically versatile and it could use different carbon and energy sources to sustain growth. Nab. magadii has the genetic potential to adapt to its milieu by intracellular accumulation of inorganic cations and/or neutral organic compounds. The identification of Nab. magadii genes involved in coenzyme biosynthesis is a necessary step toward further reconstruction of the metabolic pathways in halophilic archaea and other extremophiles. The knowledge gained from the genome sequence of this haloalkaliphilic archaeon is highly valuable in advancing the applications of extremophiles and their enzymes. PMID:22559199
Murphy, James; Klumpp, Jochen; Mahony, Jennifer; O'Connell-Motherway, Mary; Nauta, Arjen; van Sinderen, Douwe
2014-10-01
So-called 936-type phages are among the most frequently isolated phages in dairy facilities utilising Lactococcus lactis starter cultures. Despite extensive efforts to control phage proliferation and decades of research, these phages continue to negatively impact cheese production in terms of the final product quality and consequently, monetary return. Whole genome sequencing and in silico analysis of three 936-type phage genomes identified several putative (orphan) methyltransferase (MTase)-encoding genes located within the packaging and replication regions of the genome. Utilising SMRT sequencing, methylome analysis was performed on all three phages, allowing the identification of adenine modifications consistent with N-6 methyladenine sequence methylation, which in some cases could be attributed to these phage-encoded MTases. Heterologous gene expression revealed that M.Phi145I/M.Phi93I and M.Phi93DAM, encoded by genes located within the packaging module, provide protection against the restriction enzymes HphI and DpnII, respectively, representing the first functional MTases identified in members of 936-type phages. SMRT sequencing technology enabled the identification of the target motifs of MTases encoded by the genomes of three lytic 936-type phages and these MTases represent the first functional MTases identified in this species of phage. The presence of these MTase-encoding genes on 936-type phage genomes is assumed to represent an adaptive response to circumvent host encoded restriction-modification systems thereby increasing the fitness of the phages in a dynamic dairy environment.
Hou, Shaobin; Makarova, Kira S; Saw, Jimmy HW; Senin, Pavel; Ly, Benjamin V; Zhou, Zhemin; Ren, Yan; Wang, Jianmei; Galperin, Michael Y; Omelchenko, Marina V; Wolf, Yuri I; Yutin, Natalya; Koonin, Eugene V; Stott, Matthew B; Mountain, Bruce W; Crowe, Michelle A; Smirnova, Angela V; Dunfield, Peter F; Feng, Lu; Wang, Lei; Alam, Maqsudul
2008-01-01
Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift in the isoelectric points of proteins. Conclusion: The results of genome analysis of M. infernorum support the monophyly of the PVC superphylum. M. infernorum possesses a streamlined genome but seems to have acquired numerous genes including those for enzymes of methylotrophic pathways via horizontal gene transfer, in particular, from Proteobacteria. Reviewers: This article was reviewed by John A. Fuerst, Ludmila Chistoserdova, and Radhey S. Gupta. PMID:18593465
Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M
2008-05-12
Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of an approximately 20 kbp long intergenic region comprising numerous tandem and dispersed repeat units of between 22 and 336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein-coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages. 
Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol.
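The palindromic sequences mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible definition, a window that equals its own reverse complement, and does not reproduce the authors' stem-loop prediction, which is not described here. The function names, length limits, and the toy sequence are all hypothetical.

```python
# Minimal sketch (not the authors' method): scan a DNA string for
# reverse-complement palindromes, i.e. windows identical to their own
# reverse complement. Real stem-loops usually have an unpaired loop
# between the two arms; this simplification ignores the loop.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_palindromes(seq: str, min_len: int = 6, max_len: int = 12):
    """Return (start, length, window) for each palindromic window.

    Only even lengths are scanned: an odd-length window cannot equal its
    reverse complement, because the middle base would have to pair with
    itself.
    """
    seq = seq.upper()
    first_len = min_len + (min_len % 2)  # round up to the next even length
    hits = []
    for length in range(first_len, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, length, window))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the H. andersenii mtDNA.
    for hit in find_palindromes("GGGAATTCCC"):
        print(hit)
```

Hits are listed by increasing window length, so nested palindromes (a short one inside a longer one) are reported separately.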
Blackburn, Michael B; Sparks, Michael E; Gundersen-Rindal, Dawn E
2016-12-01
The genome of Chromobacterium subtsugae strain PRAA4-1, a betaproteobacterium producing insecticidal compounds, was sequenced and compared with the genome of C. violaceum ATCC 12472. The genome of C. subtsugae displayed a reduction in genes devoted to capsular and extracellular polysaccharide, possessed no genes encoding nitrate reductases, and exhibited many more phage-related sequences than were observed for C. violaceum. The genomes of both species possess a number of gene clusters predicted to encode biosynthetic complexes for secondary metabolites; these clusters suggest they produce overlapping, but distinct assortments of metabolites.
Hornung, Claudia; Poehlein, Anja; Haack, Frederike S.; Schmidt, Martina; Dierking, Katja; Pohlen, Andrea; Schulenburg, Hinrich; Blokesch, Melanie; Plener, Laure; Jung, Kirsten; Bonge, Andreas; Krohn-Molt, Ines; Utpatel, Christian; Timmermann, Gabriele; Spieck, Eva; Pommerening-Röser, Andreas; Bode, Edna; Bode, Helge B.; Daniel, Rolf; Schmeisser, Christel; Streit, Wolfgang R.
2013-01-01
Janthinobacteria commonly form biofilms on eukaryotic hosts and are known to synthesize antibacterial and antifungal compounds. Janthinobacterium sp. HH01 was recently isolated from an aquatic environment and its genome sequence was established. The genome consists of a single chromosome and reveals a size of 7.10 Mb, being the largest janthinobacterial genome so far known. Approximately 80% of the 5,980 coding sequences (CDSs) present in the HH01 genome could be assigned putative functions. The genome encodes a wealth of secretory functions and several large clusters for polyketide biosynthesis. HH01 also encodes a remarkable number of proteins involved in resistance to drugs or heavy metals. Interestingly, the genome of HH01 apparently lacks the N-acylhomoserine lactone (AHL)-dependent signaling system and the AI-2-dependent quorum sensing regulatory circuit. Instead it encodes a homologue of the Legionella- and Vibrio-like autoinducer (lqsA/cqsA) synthase gene which we designated jqsA. The jqsA gene is linked to a cognate sensor kinase (jqsS) which is flanked by the response regulator jqsR. Here we show that a jqsA deletion has strong impact on the violacein biosynthesis in Janthinobacterium sp. HH01 and that a jqsA deletion mutant can be functionally complemented with the V. cholerae cqsA and the L. pneumophila lqsA genes. PMID:23405110
Wang, Xiuna; Zhang, Xiaoling; Liu, Ling; Xiang, Meichun; Wang, Wenzhao; Sun, Xiang; Che, Yongsheng; Guo, Liangdong; Liu, Gang; Guo, Liyun; Wang, Chengshu; Yin, Wen-Bing; Stadler, Marc; Zhang, Xinyu; Liu, Xingzhong
2015-01-27
In recent years, the genus Pestalotiopsis has received increasing attention, not only because of its economic impact as a plant pathogen but also because it is a commonly isolated endophyte and an important source of bioactive natural products. Pestalotiopsis fici Steyaert W106-1/CGMCC3.15140, an endophyte of tea, produces numerous novel secondary metabolites, including chloropupukeananin, a derivative of chlorinated pupukeanane that was first discovered in fungi. Some of them might be important as drug leads for future pharmaceuticals. Here, we report the genome sequence of the endophytic fungus of tea Pestalotiopsis fici W106-1/CGMCC3.15140. The abundant carbohydrate-active enzymes, especially the significantly expanded pectinases, allow the fungus to utilize the limited intercellular nutrients within the host plants, suggesting adaptation of the fungus to an endophytic lifestyle. The P. fici genome encodes a rich set of secondary metabolite synthesis genes, including 27 polyketide synthases (PKSs), 12 non-ribosomal peptide synthases (NRPSs), five dimethylallyl tryptophan synthases, four putative PKS-like enzymes, 15 putative NRPS-like enzymes, 15 terpenoid synthases, seven terpenoid cyclases, seven fatty-acid synthases, and five hybrids of PKS-NRPS. The majority of these core enzymes are distributed into 74 secondary metabolite clusters. The putative Diels-Alderase genes have undergone expansion. The significant expansion of pectinase-encoding genes provides essential insight into the life strategy of endophytes, and the richness of gene clusters for secondary metabolites reveals the high natural-product potential of endophytic fungi.
[ENCODE apophenia or a panglossian analysis of the human genome].
Casane, Didier; Fumey, Julien; Laurenti, Patrick
2015-01-01
In September 2012, a batch of more than 30 articles presenting the results of the ENCODE (Encyclopaedia of DNA Elements) project was released. Many of these articles appeared in Nature and Science, the two most prestigious interdisciplinary scientific journals. Since that time, hundreds of other articles dedicated to further analyses of the ENCODE data have been published. The time of hundreds of scientists and hundreds of millions of dollars were not invested in vain, since this project has led to an apparent paradigm shift: contrary to the classical view, 80% of the human genome is not junk DNA, but is functional. This hypothesis has been criticized by evolutionary biologists, sometimes eagerly, and detailed refutations have been published in specialized journals with impact factors far below those of the journals that published the main contribution of the ENCODE project to our understanding of genome architecture. In 2014, the ENCODE consortium released a new batch of articles that neither suggested that 80% of the genome is functional nor commented on the disappearance of their 2012 scientific breakthrough. Unfortunately, by that time many biologists had accepted the idea that 80% of the genome is functional, or at least that this idea is a valid alternative to the long-held evolutionary genetic view that it is not. In order to understand the dynamics of the genome, it is necessary to re-examine the basics of evolutionary genetics because not only are they well established, they will also allow us to avoid the pitfall of a panglossian interpretation of ENCODE. In fact, the architecture of the genome and its dynamics are the product of trade-offs between various evolutionary forces, and many structural features are not related to functional properties. In other words, evolution does not produce the best of all worlds, not even the best of all possible worlds, but only one possible world. © 2015 médecine/sciences – Inserm.
Hayashi, J; Nishikawa, K; Hirano, R; Noguchi, T; Yoshimura, F
2000-01-01
Porphyromonas gingivalis, a periodontopathogen, is an oral anaerobic gram-negative bacterium with numerous fimbriae on the cell surface. Fimbriae have been considered to be an important virulence factor in this organism. We analyzed the genomic DNA of transposon-induced, fimbria-deficient mutants derived from ATCC 33277 and found that seven independent mutants had transposon insertions within the same restriction fragment. Cloning and sequencing of the disrupted region from one of the mutants revealed two adjacent open reading frames (ORFs) which seemed to encode a two-component signal transduction system. We also found that six of the mutants had insertions in a gene, fimS, a homologue of the genes encoding sensor kinase, and that the insertion in the remaining one disrupted the gene immediately downstream, fimR, a homologue of the response regulator genes in other bacteria. These findings suggest that this two-component regulatory system is involved in fimbriation of P. gingivalis.
New genes and new biological roles for expansins
NASA Technical Reports Server (NTRS)
Cosgrove, D. J.
2000-01-01
Expansins are extracellular proteins that loosen plant cell walls in novel ways. They are thought to function in cell enlargement, pollen tube invasion of the stigma (in grasses), wall disassembly during fruit ripening, abscission and other cell separation events. Expansins are encoded by two multigene families and each gene is often expressed in highly specific locations and cell types. Structural analysis indicates that one expansin region resembles the catalytic domain of family-45 endoglucanases but glucanase activity has not been detected. The genome projects have revealed numerous expansin-related sequences but their putative wall-loosening functions remain to be assessed.
Genes contributing to the development of alcoholism: an overview.
Edenberg, Howard J
2012-01-01
Genetic factors (i.e., variations in specific genes) account for a substantial portion of the risk for alcoholism. However, identifying those genes and the specific variations involved is challenging. Researchers have used both case-control and family studies to identify genes related to alcoholism risk. In addition, different strategies such as candidate gene analyses and genome-wide association studies have been used. The strongest effects have been found for specific variants of genes that encode two enzymes involved in alcohol metabolism: alcohol dehydrogenase and aldehyde dehydrogenase. Accumulating evidence indicates that variations in numerous other genes have smaller but measurable effects.
Thuan, Nguyen Huy; Dhakal, Dipesh; Pokhrel, Anaya Raj; Chu, Luan Luong; Van Pham, Thi Thuy; Shrestha, Anil; Sohng, Jae Kyung
2018-05-01
Streptomyces peucetius ATCC 27952 produces two major anthracyclines, doxorubicin (DXR) and daunorubicin (DNR), which are potent chemotherapeutic agents for the treatment of several cancers. In order to gain detailed insight into the genetics and biochemistry of the strain, the complete genome was determined and analyzed. The results showed that its complete sequence contains 7187 protein-coding genes in a total of 8,023,114 bp, with 87% of the genome contributing to protein-coding regions. The genomic sequence included 18 rRNAs, 66 tRNAs, and 3 non-coding RNAs. In silico studies predicted ~68 biosynthetic gene clusters (BGCs) encoding diverse classes of secondary metabolites, including non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS I, II, and III), terpenes, and others. Detailed analysis of the genome sequence revealed versatile biocatalytic enzymes such as cytochrome P450 (CYP), electron transfer system (ETS) genes, methyltransferase (MT), and glycosyltransferase (GT). In addition, numerous functional genes (transporter gene, SOD, etc.) and regulatory genes (afsR-sp, metK-sp, etc.) involved in the regulation of secondary metabolites were found. This minireview summarizes the genome-based genome mining (GM) of diverse BGCs and genome exploration (GE) of versatile biocatalytic enzymes, and other enzymes involved in maintenance and regulation of metabolism of S. peucetius. The detailed analysis of the genome sequence provides critically important knowledge useful in the bioengineering of the strain or harboring catalytically efficient enzymes for biotechnological applications.
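The genome figures quoted in the abstract above (8,023,114 bp, 7187 protein-coding genes, 87% coding) imply a mean coding length per gene and a gene density. The short sketch below works that arithmetic through. The function name and the returned keys are illustrative, and it assumes the 87% figure refers to base pairs of the whole genome.

```python
# Back-of-the-envelope sketch using only the numbers quoted in the abstract.
# Assumption: "87% of the genome" means 87% of all base pairs are coding.

def coding_stats(genome_bp: int, n_genes: int, coding_fraction: float) -> dict:
    """Derive simple density figures from genome size, gene count and coding fraction."""
    coding_bp = genome_bp * coding_fraction
    return {
        "coding_bp": coding_bp,
        "mean_coding_bp_per_gene": coding_bp / n_genes,
        "genes_per_kb": n_genes / (genome_bp / 1000),
    }


if __name__ == "__main__":
    stats = coding_stats(genome_bp=8_023_114, n_genes=7187, coding_fraction=0.87)
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the average gene spans a little under 1 kb of coding sequence, at roughly 0.9 genes per kb.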
Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M
2008-01-01
Background Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages. 
Conclusion Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol. PMID:18474103
Genome-Wide Analysis of bZIP-Encoding Genes in Maize
Wei, Kaifa; Chen, Juan; Wang, Yanmei; Chen, Yanhui; Chen, Shaoxiang; Lin, Yina; Pan, Si; Zhong, Xiaojun; Xie, Daoxin
2012-01-01
In plants, basic leucine zipper (bZIP) proteins regulate numerous biological processes such as seed maturation, flower and vascular development, stress signalling and pathogen defence. We have carried out a genome-wide identification and analysis of 125 bZIP genes that exist in the maize genome, encoding 170 distinct bZIP proteins. This family can be divided into 11 groups according to the phylogenetic relationship among the maize bZIP proteins and those in Arabidopsis and rice. Six kinds of intron patterns (a–f) within the basic and hinge regions are defined. The additional conserved motifs have been identified and present the group specificity. Detailed three-dimensional structure analysis has been done to display the sequence conservation and potential distribution of the bZIP domain. Further, we predict the DNA-binding pattern and the dimerization property on the basis of the characteristic features in the basic and hinge regions and the leucine zipper, respectively, which supports our classification greatly and helps to classify 26 distinct subfamilies. The chromosome distribution and the genetic analysis reveal that 58 ZmbZIP genes are located in the segmental duplicate regions in the maize genome, suggesting that the segment chromosomal duplications contribute greatly to the expansion of the maize bZIP family. Across the 60 different developmental stages of 11 organs, three apparent clusters formed represent three kinds of different expression patterns among the ZmbZIP gene family in maize development. A similar but slightly different expression pattern of bZIPs in two inbred lines displays that 22 detected ZmbZIP genes might be involved in drought stress. Thirteen pairs and 143 pairs of ZmbZIP genes show strongly negative and positive correlations in the four distinct fungal infections, respectively, based on the expression profile and Pearson's correlation coefficient analysis. PMID:23103471
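The abstract above groups ZmbZIP gene pairs into strongly positive and strongly negative correlations using Pearson's correlation coefficient on expression profiles. The following minimal sketch shows that kind of calculation. The expression values, the gene labels, and the 0.8 cutoff are hypothetical, since the abstract states none of them.

```python
# Minimal sketch of Pearson's r between two expression profiles.
# All data and the threshold below are hypothetical illustrations.
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


def classify_pair(r: float, threshold: float = 0.8) -> str:
    """Label a gene pair by the sign and strength of r (threshold is an assumption)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "weak"


if __name__ == "__main__":
    # Hypothetical expression levels of three genes across four conditions.
    gene_a = [1.0, 2.0, 3.0, 4.0]
    gene_b = [2.0, 4.0, 6.0, 8.0]
    gene_c = [8.0, 6.0, 4.0, 2.0]
    print(classify_pair(pearson(gene_a, gene_b)))
    print(classify_pair(pearson(gene_a, gene_c)))
```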
Norman, Paul J.; Norberg, Steven J.; Guethlein, Lisbeth A.; Nemat-Gorgani, Neda; Royce, Thomas; Wroblewski, Emily E.; Dunn, Tamsen; Mann, Tobias; Alicata, Claudia; Hollenbach, Jill A.; Chang, Weihua; Shults Won, Melissa; Gunderson, Kevin L.; Abi-Rached, Laurent; Ronaghi, Mostafa; Parham, Peter
2017-01-01
The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. PMID:28360230
Pauchet, Y; Saski, C A; Feltus, F A; Luyten, I; Quesneville, H; Heckel, D G
2014-06-01
The ability of herbivorous beetles from the superfamilies Chrysomeloidea and Curculionoidea to degrade plant cell wall polysaccharides has only recently begun to be appreciated. The presence of plant cell wall degrading enzymes (PCWDEs) in the beetle's digestive tract makes this degradation possible. Sequences encoding these beetle-derived PCWDEs were originally identified from transcriptomes and strikingly resemble those of saprophytic and phytopathogenic microorganisms, raising questions about their origin; e.g. are they insect- or microorganism-derived? To demonstrate unambiguously that the genes encoding PCWDEs found in beetle transcriptomes are indeed of insect origin, we generated a bacterial artificial chromosome library from the genome of the leaf beetle Chrysomela tremula, containing 18 432 clones with an average size of 143 kb. After hybridizing this library with probes derived from 12 C. tremula PCWDE-encoding genes and sequencing the positive clones, we demonstrated that the latter genes are encoded by the insect's genome and are surrounded by genes possessing orthologues in the genome of Tribolium castaneum as well as in three other beetle genomes. Our analyses showed that although the level of overall synteny between C. tremula and T. castaneum seems high, the degree of microsynteny between both species is relatively low, in contrast to the more closely related Colorado potato beetle. © 2014 The Royal Entomological Society.
Genome-wide comparative analysis of NBS-encoding genes in four Gossypium species
USDA-ARS's Scientific Manuscript database
Nucleotide binding site (NBS) genes encode a large family of disease resistance (R) proteins in plants. The availability of genomic data of the two diploid cotton species, Gossypium arboreum and Gossypium raimondii, and the two allotetraploid cotton species, Gossypium hirsutum (TM-1) and Gossypium ...
Ogilvie, Lesley A; Nzakizwanayo, Jonathan; Guppy, Fergus M; Dedi, Cinzia; Diston, David; Taylor, Huw; Ebdon, James; Jones, Brian V
2018-04-01
Just as the expansion in genome sequencing has revealed and permitted the exploitation of phylogenetic signals embedded in bacterial genomes, the application of metagenomics has begun to provide similar insights at the ecosystem level for microbial communities. However, little is known regarding this aspect of bacteriophage associated with microbial ecosystems, and if phage encode discernible habitat-associated signals diagnostic of underlying microbiomes. Here we demonstrate that individual phage can encode clear habitat-related 'ecogenomic signatures', based on relative representation of phage-encoded gene homologues in metagenomic data sets. Furthermore, we show the ecogenomic signature encoded by the gut-associated ɸB124-14 can be used to segregate metagenomes according to environmental origin, and distinguish 'contaminated' environmental metagenomes (subject to simulated in silico human faecal pollution) from uncontaminated data sets. This indicates phage-encoded ecological signals likely possess sufficient discriminatory power for use in biotechnological applications, such as development of microbial source tracking tools for monitoring water quality.
Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng
2014-06-04
Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases. 
Phylogenetic analysis using a Bayesian approach provided support for inferring functional divergence among regulatory cysteine and serine proteases. Numerous putative proteases were identified for the first time in T. solium, and important regulatory proteases have been predicted. This comprehensive analysis not only complements the growing knowledge base of proteolytic enzymes, but also provides a platform from which to expand knowledge of cestode proteases and to explore their biochemistry and potential as intervention targets.
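The proportion reported in the abstract above (200 protease-encoding open reading frames out of 11,902 predicted protein-encoding genes, stated as 1.68%) can be checked with a one-line calculation. This is only a consistency check on the quoted figures, and the function name is illustrative.

```python
# Consistency check of the percentage quoted in the abstract.

def percent_of_total(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


if __name__ == "__main__":
    share = percent_of_total(200, 11_902)
    print(f"Protease-encoding ORFs: {share:.2f}% of predicted genes")
```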
Shukla, Avi; Chatterjee, Anirvan
2018-01-01
Curiously, in viruses, the virion volume appears to be predominantly driven by genome length rather than the number of proteins it encodes or geometric constraints. With their large genome and giant particle size, amoebal viruses (AVs) are ideally suited to study the relationship between genome and virion size and explore the role of genome plasticity in their evolutionary success. Different genomic regions of AVs exhibit distinct genealogies. Although the vertically transferred core genes and their functions are universally conserved across the nucleocytoplasmic large DNA virus (NCLDV) families and are essential for their replication, the horizontally acquired genes are variable across families and are lineage-specific. When compared with other giant virus families, we observed a near-linear increase in the number of genes encoding repeat domain-containing proteins (RDCPs) with the increase in the genome size of AVs. From what is known about the functions of RDCPs in bacteria and eukaryotes and their prevalence in the AV genomes, we envisage important roles for RDCPs in the life cycle of AVs, their genome expansion, and plasticity. This observation also supports the evolution of AVs from a smaller viral ancestor by the acquisition of diverse gene families from the environment, including RDCPs that might have helped in host adaptation. PMID:29308275
Identification of functional elements and regulatory circuits by Drosophila modENCODE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roy, Sushmita; Ernst, Jason; Kharchenko, Peter V.
2010-12-22
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology. 
The functions of ~40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1). Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3′ untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.
2010-01-01
Background The phloem of dicotyledonous plants contains specialized P-proteins (phloem proteins) that accumulate during sieve element differentiation and remain parietally associated with the cisternae of the endoplasmic reticulum in mature sieve elements. Wounding causes P-protein filaments to accumulate at the sieve plates and block the translocation of photosynthate. Specialized, spindle-shaped P-proteins known as forisomes that undergo reversible calcium-dependent conformational changes have evolved exclusively in the Fabaceae. Recently, the molecular characterization of three genes encoding forisome components in the model legume Medicago truncatula (MtSEO1, MtSEO2 and MtSEO3; SEO = sieve element occlusion) was reported, but little is known about the molecular characteristics of P-proteins in non-Fabaceae. Results We performed a comprehensive genome-wide comparative analysis by screening the M. truncatula, Glycine max, Arabidopsis thaliana, Vitis vinifera and Solanum phureja genomes, and a Malus domestica EST library for homologs of MtSEO1, MtSEO2 and MtSEO3 and identified numerous novel SEO genes in Fabaceae and even non-Fabaceae plants, which do not possess forisomes. Even in Fabaceae some SEO genes appear to not encode forisome components. All SEO genes have a similar exon-intron structure and are expressed predominantly in the phloem. Phylogenetic analysis revealed the presence of several subgroups with Fabaceae-specific subgroups containing all of the known as well as newly identified forisome component proteins. We constructed Hidden Markov Models that identified three conserved protein domains, which characterize SEO proteins when present in combination. In addition, one common and three subgroup specific protein motifs were found in the amino acid sequences of SEO proteins. SEO genes are organized in genomic clusters and the conserved synteny allowed us to identify several M. truncatula vs G. max orthologs as well as paralogs within the G. max genome. 
Conclusions The unexpected occurrence of forisome-like genes in non-Fabaceae plants may indicate that these proteins encode species-specific P-proteins, which is backed up by the phloem-specific expression profiles. The conservation of gene structure, the presence of specific motifs and domains and the genomic synteny argue for a common phylogenetic origin of forisomes and other P-proteins. PMID:20932300
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chan, Chai Ling; Yew, Su Mei; Ngeow, Yun Fong; ...
2015-11-18
Background: Daldinia eschscholtzii is a wood-inhabiting fungus that causes wood decay under certain conditions. It has a broad host range and produces a large repertoire of potentially bioactive compounds. However, there is no extensive genome analysis on this fungal species. Results: Two fungal isolates (UM 1400 and UM 1020) from human specimens were identified as Daldinia eschscholtzii by morphological features and ITS-based phylogenetic analysis. Both genomes were similar in size with 10,822 predicted genes in UM 1400 (35.8 Mb) and 11,120 predicted genes in UM 1020 (35.5 Mb). A total of 751 gene families were shared between the two UM isolates, including gene families associated with fungus-host interactions. In the CAZyme comparative analysis, both genomes were found to contain arrays of CAZymes related to plant cell wall degradation. Genes encoding secreted peptidases involved in the degradation of structural proteins in the plant cell wall were found in both genomes. In addition, arrays of secondary metabolite backbone genes were identified in both genomes, indicating their potential to produce bioactive secondary metabolites. Both genomes also contained an abundance of genes encoding signaling components, with three proposed MAPK cascades involved in cell wall integrity, osmoregulation, and mating/filamentation. Besides genomic evidence for degrading capability, both isolates also harbored an array of genes encoding stress response proteins that are potentially significant for adaptation to living in hostile environments. In conclusion: Our genomic studies provide further information for the biological understanding of D. eschscholtzii and suggest that these wood-decaying fungi are also equipped for adaptation to adverse environments in the human host.
Wang, Peipei; Li, Jing; Gao, Xiaoyang; Zhang, Di; Li, Anlin; Liu, Changning
2018-05-29
Physic nut (Jatropha curcas L.) is a species of flowering plant with great potential for biofuel production and as an emerging model organism for functional genomic analysis, particularly in the Euphorbiaceae family. DNA binding with one finger (Dof) transcription factors play critical roles in numerous biological processes in plants. Nevertheless, knowledge of the members and of the evolutionary and functional characteristics of the Dof gene family in physic nut remains insufficient. Therefore, we performed a genome-wide screening and characterization of the Dof gene family within the physic nut draft genome. In total, 24 JcDof genes (encoding 33 JcDof proteins) were identified. All the JcDof genes were divided into three major groups based on phylogenetic inference, which was further validated by the subsequent gene structure and motif analysis. Genome comparison revealed that segmental duplication may have played a crucial role in the expansion of the JcDof gene family, and that gene expansion was mainly subjected to positive selection. The expression profile demonstrated the broad involvement of JcDof genes in response to various abiotic stresses, hormonal treatments and functional divergence. This study provides valuable information for better understanding the evolution of JcDof genes, and lays a foundation for future functional exploration of JcDof genes.
A fully decompressed synthetic bacteriophage øX174 genome assembled and archived in yeast.
Jaschke, Paul R; Lieberman, Erica K; Rodriguez, Jon; Sierra, Adrian; Endy, Drew
2012-12-20
The 5386 nucleotide bacteriophage øX174 genome has a complicated architecture that encodes 11 gene products via overlapping protein coding sequences spanning multiple reading frames. We designed a 6302 nucleotide synthetic surrogate, øX174.1, that fully separates all primary phage protein coding sequences along with cognate translation control elements. To specify øX174.1f, a decompressed genome the same length as wild type, we truncated the gene F coding sequence. We synthesized DNA encoding fragments of øX174.1f and used a combination of in vitro- and yeast-based assembly to produce yeast vectors encoding natural or designer bacteriophage genomes. We isolated clonal preparations of yeast plasmid DNA and transfected E. coli C strains. We recovered viable øX174 particles containing the øX174.1f genome from E. coli C strains that independently express full-length gene F. We expect that yeast can serve as a genomic 'drydock' within which to maintain and manipulate clonal lineages of other obligate lytic phage. Copyright © 2012 Elsevier Inc. All rights reserved.
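The idea of protein coding sequences that overlap in different reading frames can be illustrated with a minimal sketch. The toy sequence, the function names `find_orfs` and `overlapping_pairs`, and the forward-strand-only ATG-to-stop definition of an open reading frame are all assumptions made for illustration; none of this is taken from the øX174 genome or from the authors' design pipeline.

```python
# Hypothetical illustration: two ORFs in different frames sharing bases.
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq):
    """Return (frame, start, end) for each ATG...stop ORF on the forward
    strand. `end` is exclusive and includes the stop codon."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":
                # Scan codon by codon for the first in-frame stop.
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in STOPS:
                        orfs.append((frame, i, j + 3))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return orfs

def overlapping_pairs(orfs):
    """Return (orf_a, orf_b, shared_bases) for ORFs in different frames
    whose coordinates overlap."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            fa, sa, ea = orfs[a]
            fb, sb, eb = orfs[b]
            shared = min(ea, eb) - max(sa, sb)
            if fa != fb and shared > 0:
                pairs.append((orfs[a], orfs[b], shared))
    return pairs

# Invented 14-nt toy sequence, not phage sequence.
TOY = "ATGAAATGACCTAA"
toy_orfs = find_orfs(TOY)
print(toy_orfs)
print(overlapping_pairs(toy_orfs))
```

In this toy case the two reading frames share four bases. Separating every such overlap necessarily lengthens a genome, which is consistent with the abstract's figures of 5386 nucleotides for the wild type and 6302 nucleotides for the fully separated design.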
Ansari, M Azim; Pedergnana, Vincent; L C Ip, Camilla; Magri, Andrea; Von Delft, Annette; Bonsall, David; Chaturvedi, Nimisha; Bartha, Istvan; Smith, David; Nicholson, George; McVean, Gilean; Trebes, Amy; Piazza, Paolo; Fellay, Jacques; Cooke, Graham; Foster, Graham R; Hudson, Emma; McLauchlan, John; Simmonds, Peter; Bowden, Rory; Klenerman, Paul; Barnes, Eleanor; Spencer, Chris C A
2017-05-01
Outcomes of hepatitis C virus (HCV) infection and treatment depend on viral and host genetic factors. Here we use human genome-wide genotyping arrays and new whole-genome HCV viral sequencing technologies to perform a systematic genome-to-genome study of 542 individuals who were chronically infected with HCV, predominantly genotype 3. We show that both alleles of genes encoding human leukocyte antigen molecules and genes encoding components of the interferon lambda innate immune system drive viral polymorphism. Additionally, we show that IFNL4 genotypes determine HCV viral load through a mechanism dependent on a specific amino acid residue in the HCV NS5A protein. These findings highlight the interplay between the innate immune system and the viral genome in HCV control.
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.
Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R
1999-12-16
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
GeNemo: a search engine for web-based functional genomic data.
Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng
2016-07-08
A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
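To make concrete what comparing genomic regions (rather than text or DNA sequence) can mean, the following minimal sketch scores two peak sets by the fraction of bases they share. It is not GeNemo's method, which the abstract describes as a Markov Chain Monte Carlo-based maximization; it swaps in a much simpler Jaccard overlap score, and the function names and example coordinates are hypothetical. Intervals are assumed to lie on one chromosome and to use half-open BED-style coordinates.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED files

def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def covered(intervals: List[Interval]) -> int:
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))

def jaccard(a: List[Interval], b: List[Interval]) -> float:
    """Bases covered by both sets divided by bases covered by either."""
    union = covered(a + b)
    if union == 0:
        return 0.0
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection = covered(a) + covered(b) - union
    return intersection / union

# Hypothetical peak coordinates for a user query and a stored track.
query = [(100, 200), (500, 650)]
track = [(150, 250), (600, 700)]
print(round(jaccard(query, track), 3))
```

Here the two sets share 100 of the 350 bases covered by either one, so the score is about 0.286. A region-based search could rank stored datasets by a score of this kind, although the actual tool also takes binding intensities into account according to the abstract.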
Tasmin, Rizwana; Hasan, Nur A; Grim, Christopher J; Grant, Ar'Quette; Choi, Seon Young; Alam, M Samiul; Bell, Rebecca; Cavanaugh, Christopher; Balan, Kannan V; Babu, Uma S; Parveen, Salina
2017-01-01
Salmonella Typhimurium is the leading cause of human non-typhoidal gastroenteritis in the US. S. Kentucky is one of the most commonly recovered serovars from commercially processed poultry carcasses. This study compared the genotypic and phenotypic properties of two Salmonella enterica strains, Typhimurium (ST221_31B) and Kentucky (SK222_32B), recovered from commercially processed chicken carcasses, using whole genome sequencing, phenotype characterizations and an intracellular killing assay. The Illumina MiSeq platform was used to sequence the two Salmonella genomes. Phylogenetic analysis employing homologous alignment of 1,185 non-duplicated protein-coding genes in the Salmonella core genome demonstrated fully resolved bifurcating patterns with varying levels of diversity that separated the ST221_31B and SK222_32B genomes into distinct monophyletic serovar clades. Single nucleotide polymorphism (SNP) analysis identified 2,432 SNPs within 13 Typhimurium genomes, including ST221_31B, representing Sequence Type ST19, and 650 SNPs within 13 Kentucky genomes, including SK222_32B, representing Sequence Type ST152. In addition to serovar-specific conserved coding sequences, the genomes of ST221_31B and SK222_32B harbor several genomic regions with significant genetic differences. These included phage and phage-like elements, carbon utilization or transport operons, fimbriae operons, putative membrane associated protein-encoding genes, antibiotic resistance genes, siderophore operons, and numerous hypothetical protein-encoding genes. Phenotype microarray results demonstrated that ST221_31B is capable of utilizing certain carbon compounds more efficiently than SK222_32B; namely, 1,2-propanediol, M-inositol, L-threonine, α-D-lactose, D-tagatose, adonitol, formic acid, acetoacetic acid, and L-tartaric acid. ST221_31B survived for 48 h in macrophages, while SK222_32B was mostly eliminated. Further, a 3-fold growth of ST221_31B was observed at 24 hours post-infection in chicken granulosa cells while SK222_32B was unable to replicate in these cells. These results suggest that Salmonella Typhimurium can survive host defenses better and could be more invasive than Salmonella Kentucky and provide some insights into the genomic determinants responsible for these differences. PMID:28481935
Banks, David J; Porcella, Stephen F; Barbian, Kent D; Beres, Stephen B; Philips, Lauren E; Voyich, Jovanka M; DeLeo, Frank R; Martin, Judith M; Somerville, Greg A; Musser, James M
2004-08-15
We describe the genome sequence of a macrolide-resistant strain (MGAS10394) of serotype M6 group A Streptococcus (GAS). The genome is 1,900,156 bp in length, and 8 prophage-like elements or remnants compose 12.4% of the chromosome. An 8.3-kb prophage remnant encodes the SpeA4 variant of streptococcal pyrogenic exotoxin A. The genome of strain MGAS10394 contains a chimeric genetic element composed of prophage genes and a transposon encoding the mefA gene conferring macrolide resistance. This chimeric element also has a gene encoding a novel surface-exposed protein (designated "R6 protein"), with an LPKTG cell-anchor motif located at the carboxyterminus. Surface expression of this protein was confirmed by flow cytometry. Humans with GAS pharyngitis caused by serotype M6 strains had antibody against the R6 protein present in convalescent, but not acute, serum samples. Our studies add to the theme that GAS prophage-encoded extracellular proteins contribute to host-pathogen interactions in a strain-specific fashion.
Mann, Nicholas H.; Clokie, Martha R. J.; Millard, Andrew; Cook, Annabel; Wilson, William H.; Wheatley, Peter J.; Letarov, Andrey; Krisch, H. M.
2005-01-01
Bacteriophage S-PM2 infects several strains of the abundant and ecologically important marine cyanobacterium Synechococcus. A large lytic phage with an isometric icosahedral head, S-PM2 has a contractile tail and by this criterion is classified as a myovirus (1). The linear, circularly permuted, 196,280-bp double-stranded DNA genome of S-PM2 contains 37.8% G+C residues. It encodes 239 open reading frames (ORFs) and 25 tRNAs. Of these ORFs, 19 appear to encode proteins associated with the cell envelope, including a putative S-layer-associated protein. Twenty additional S-PM2 ORFs have homologues in the genomes of their cyanobacterial hosts. There is a group I self-splicing intron within the gene encoding the D1 protein. A total of 40 ORFs, organized into discrete clusters, encode homologues of T4 proteins involved in virion morphogenesis, nucleotide metabolism, gene regulation, and DNA replication and repair. The S-PM2 genome encodes a few surprisingly large (e.g., 3,779 amino acids) ORFs of unknown function. Our analysis of the S-PM2 genome suggests that many of the unknown S-PM2 functions may be involved in the adaptation of the metabolism of the host cell to the requirements of phage infection. This hypothesis originates from the identification of multiple phage-mediated modifications of the host's photosynthetic apparatus that appear to be essential for maintaining energy production during the lytic cycle. PMID:15838046
Mobile genetic element-encoded cytolysin connects virulence to methicillin resistance in MRSA.
Queck, Shu Y; Khan, Burhan A; Wang, Rong; Bach, Thanh-Huy L; Kretschmer, Dorothee; Chen, Liang; Kreiswirth, Barry N; Peschel, Andreas; Deleo, Frank R; Otto, Michael
2009-07-01
Bacterial virulence and antibiotic resistance have a significant influence on disease severity and treatment options during bacterial infections. Frequently, the underlying genetic determinants are encoded on mobile genetic elements (MGEs). In the leading human pathogen Staphylococcus aureus, MGEs that contain antibiotic resistance genes commonly do not contain genes for virulence determinants. The phenol-soluble modulins (PSMs) are staphylococcal cytolytic toxins with a crucial role in immune evasion. While all known PSMs are core genome-encoded, we here describe a previously unidentified psm gene, psm-mec, within the staphylococcal methicillin resistance-encoding MGE SCCmec. PSM-mec was strongly expressed in many strains and showed the physico-chemical, pro-inflammatory, and cytolytic characteristics typical of PSMs. Notably, in an S. aureus strain with low production of core genome-encoded PSMs, expression of PSM-mec had a significant impact on immune evasion and disease. In addition to providing high-level resistance to methicillin, acquisition of SCCmec elements encoding PSM-mec by horizontal gene transfer may therefore contribute to staphylococcal virulence by substituting for the lack of expression of core genome-encoded PSMs. Thus, our study reveals a previously unknown role of methicillin resistance clusters in staphylococcal pathogenesis and shows that important virulence and antibiotic resistance determinants may be combined in staphylococcal MGEs.
The ENCODE project: implications for psychiatric genetics.
Kavanagh, D H; Dwyer, S; O'Donovan, M C; Owen, M J
2013-05-01
The ENCyclopedia Of DNA Elements (ENCODE) project is a public research consortium that aims to identify all functional elements of the human genome sequence. The project comprised 1640 data sets from 147 different cell types, and the findings were released in a coordinated set of 34 publications across several journals. The ENCODE publications report that 80.4% of the human genome displays some functionality. These data have important implications for interpreting results from large-scale genetics studies. We review some of the key findings from the ENCODE publications and discuss how they can influence or inform further investigations into the genetic factors contributing to neuropsychiatric disorders.
The ENCODE Project at UC Santa Cruz.
Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James
2007-01-01
The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.
Genome sequence of the model medicinal mushroom Ganoderma lucidum
Chen, Shilin; Xu, Jiang; Liu, Chang; Zhu, Yingjie; Nelson, David R.; Zhou, Shiguo; Li, Chunfang; Wang, Lizhi; Guo, Xu; Sun, Yongzhen; Luo, Hongmei; Li, Ying; Song, Jingyuan; Henrissat, Bernard; Levasseur, Anthony; Qian, Jun; Li, Jianqin; Luo, Xiang; Shi, Linchun; He, Liu; Xiang, Li; Xu, Xiaolan; Niu, Yunyun; Li, Qiushi; Han, Mira V.; Yan, Haixia; Zhang, Jin; Chen, Haimei; Lv, Aiping; Wang, Zhen; Liu, Mingzhu; Schwartz, David C.; Sun, Chao
2012-01-01
Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine that creates a diverse set of bioactive compounds. Here we report its 43.3-Mb genome, encoding 16,113 predicted genes, obtained using next-generation sequencing and optical mapping approaches. The sequence analysis reveals an impressive array of genes encoding cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary metabolism. The genome also encodes one of the richest sets of wood degradation enzymes among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism a potential model system for the study of secondary metabolic pathways and their regulation in medicinal fungi. PMID:22735441
Phylogenetics of Lophotrochozoan bHLH Genes and the Evolution of Lineage-Specific Gene Duplicates.
Bao, Yongbo; Xu, Fei; Shimeld, Sebastian M
2017-04-01
The gain and loss of genes encoding transcription factors is of importance to understanding the evolution of gene regulatory complexity. The basic helix-loop-helix (bHLH) genes encode a large superfamily of transcription factors. We systematically classify the bHLH genes from five mollusc, two annelid and one brachiopod genomes, tracing the pattern of bHLH gene evolution across these poorly studied phyla. In total, 56-88 bHLH genes were identified in each genome, with most identifiable as members of previously described bilaterian families, or of new families we define. Of such families only one, Mesp, appears to have been lost by all these species. Additional duplications have also played a role in the evolution of the bHLH gene repertoire, with many new lophotrochozoan-, mollusc-, bivalve-, or gastropod-specific genes defined. Using a combination of transcriptome mining, RT-PCR, and in situ hybridization we compared the expression of several of these novel genes in tissues and embryos of the molluscs Crassostrea gigas and Patella vulgata, finding both conserved expression and evidence for neofunctionalization. We also map the positions of the genes across these genomes, identifying numerous gene linkages. Some reflect recent paralog divergence by tandem duplication; others are remnants of ancient tandem duplications dating to the lophotrochozoan or bilaterian common ancestors. These data are built into a model of the evolution of bHLH genes in molluscs, showing formidable evolutionary stasis at the family level but considerable within-family diversification by tandem gene duplication. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herlemann, D. P. R.; Geissinger, O.; Ikeda-Ohtsubo, W.
2009-02-01
The candidate phylum Termite group 1 (TG1) is regularly encountered in termite hindguts but is present also in many other habitats. Here we report the complete genome sequence (1.64 Mbp) of Elusimicrobium minutum strain Pei191ᵀ, the first cultured representative of the TG1 phylum. We reconstructed the metabolism of this strictly anaerobic bacterium isolated from a beetle larva gut and discuss the findings in light of physiological data. E. minutum has all genes required for uptake and fermentation of sugars via the Embden-Meyerhof pathway, including several hydrogenases, and an unusual peptide degradation pathway comprising transamination reactions and leading to the formation of alanine, which is excreted in substantial amounts. The presence of genes encoding lipopolysaccharide biosynthesis and the presence of a pathway for peptidoglycan formation are consistent with ultrastructural evidence of a Gram-negative cell envelope. Even though electron micrographs showed no cell appendages, the genome contains many genes putatively involved in pilus assembly. We assigned some to a type II secretion system, but the function of 60 pilE-like genes remains unknown. Numerous genes with hypothetical functions, e.g., polyketide synthesis, non-ribosomal peptide synthesis, antibiotic transport, and oxygen stress protection, indicate the presence of hitherto undiscovered physiological traits. Comparative analysis of 22 concatenated single-copy marker genes corroborated the status of Elusimicrobia (formerly TG1) as a separate phylum in the bacterial domain, which was so far based only on 16S rRNA sequence analysis.
Maier, Uwe-G; Zauner, Stefan; Woehle, Christian; Bolte, Kathrin; Hempel, Franziska; Allen, John F.; Martin, William F.
2013-01-01
Plastid and mitochondrial genomes have undergone parallel evolution to encode the same functional set of genes. These encode conserved protein components of the electron transport chain in their respective bioenergetic membranes and genes for the ribosomes that express them. This highly convergent aspect of organelle genome evolution is partly explained by the redox regulation hypothesis, which predicts a separate plastid or mitochondrial location for genes encoding bioenergetic membrane proteins of either photosynthesis or respiration. Here we show that convergence in organelle genome evolution is far stronger than previously recognized, because the same set of genes for ribosomal proteins is independently retained by both plastid and mitochondrial genomes. A hitherto unrecognized selective pressure retains genes for the same ribosomal proteins in both organelles. On the Escherichia coli ribosome assembly map, the retained proteins are implicated in 30S and 50S ribosomal subunit assembly and initial rRNA binding. We suggest that ribosomal assembly imposes functional constraints that govern the retention of ribosomal protein coding genes in organelles. These constraints are subordinate to redox regulation for electron transport chain components, which anchor the ribosome to the organelle genome in the first place. As organelle genomes undergo reduction, the rRNAs also become smaller. Below size thresholds of approximately 1,300 nucleotides (16S rRNA) and 2,100 nucleotides (26S rRNA), all ribosomal protein coding genes are lost from organelles, while electron transport chain components remain organelle encoded as long as the organelles use redox chemistry to generate a proton motive force. PMID:24259312
Long Noncoding RNAs: a New Regulatory Code in Metabolic Control
Zhao, Xu-Yun; Lin, Jiandie D.
2015-01-01
Long noncoding RNAs (lncRNAs) are emerging as an integral part of the regulatory information encoded in the genome. LncRNAs possess the unique capability to interact with nucleic acids and proteins and exert discrete effects on numerous biological processes. Recent studies have delineated multiple lncRNA pathways that control metabolic tissue development and function. The expansion of the regulatory code that links nutrient and hormonal signals to tissue metabolism gives new insights into the genetic and pathogenic mechanisms underlying metabolic disease. This review discusses lncRNA biology with a focus on its role in the development, signaling, and function of key metabolic tissues. PMID:26410599
Genomic evolution in domestic cattle: ancestral haplotypes and healthy beef.
Williamson, Joseph F; Steele, Edward J; Lester, Susan; Kalai, Oscar; Millman, John A; Wolrige, Lindsay; Bayard, Dominic; McLure, Craig; Dawkins, Roger L
2011-05-01
We have identified numerous Ancestral Haplotypes encoding a 14-Mb region of Bota C19. Three are frequent in Simmental, Angus and Wagyu and have been conserved since common progenitor populations. Others are more relevant to the differences between these 3 breeds including fat content and distribution in muscle. SREBF1 and Growth Hormone, which have been implicated in the production of healthy beef, are included within these haplotypes. However, we conclude that alleles at these 2 loci are less important than other sequences within the haplotypes. Identification of breeds and hybrids is improved by using haplotypes rather than individual alleles. Copyright © 2010 Elsevier Inc. All rights reserved.
Methyl jasmonate as a vital substance in plants.
Cheong, Jong-Joo; Choi, Yang Do
2003-07-01
The plant floral scent methyl jasmonate (MeJA) has been identified as a vital cellular regulator that mediates diverse developmental processes and defense responses against biotic and abiotic stresses. The pleiotropic effects of MeJA have raised numerous questions about its regulation for biogenesis and mode of action. Characterization of the gene encoding jasmonic acid carboxyl methyltransferase has provided basic information on the role(s) of this phytohormone in gene-activation control and systemic long-distance signaling. Recent approaches using functional genomics and bioinformatics have identified a whole set of MeJA-responsive genes, and provide insights into how plants use volatile signals to withstand diverse and variable environments.
Valles, Steven M; Bell, Susanne; Firth, Andrew E
2014-01-01
Solenopsis invicta virus 3 (SINV-3) is a positive-sense single-stranded RNA virus that infects the red imported fire ant, Solenopsis invicta. We show that the second open reading frame (ORF) of the dicistronic genome is expressed via a frameshifting mechanism and that the sequences encoding the structural proteins map to both ORF2 and the 3' end of ORF1, downstream of the sequence that encodes the RNA-dependent RNA polymerase. The genome organization and structural protein expression strategy resemble those of Acyrthosiphon pisum virus (APV), an aphid virus. The capsid protein that is encoded by the 3' end of ORF1 in SINV-3 and APV is predicted to have a jelly-roll fold similar to the capsid proteins of picornaviruses and caliciviruses. The capsid-extension protein that is produced by frameshifting includes the jelly-roll fold domain encoded by ORF1 as its N-terminus, while the C-terminus encoded by the 5' half of ORF2 has no clear homology with other viral structural proteins. A third protein, encoded by the 3' half of ORF2, is associated with purified virions at sub-stoichiometric ratios. Although the structural proteins can be translated from the genomic RNA, we show that SINV-3 also produces a subgenomic RNA encoding the structural proteins. Circumstantial evidence suggests that APV may also produce such a subgenomic RNA. Both SINV-3 and APV are unclassified picorna-like viruses distantly related to members of the order Picornavirales and the family Caliciviridae. Within this grouping, features of the genome organization and capsid domain structure of SINV-3 and APV appear more similar to caliciviruses, perhaps suggesting the basis for a "Calicivirales" order.
Metabolism and Genetics of Helicobacter pylori: the Genome Era
Marais, Armelle; Mendz, George L.; Hazell, Stuart L.; Mégraud, Francis
1999-01-01
The publication of the complete sequence of Helicobacter pylori 26695 in 1997 and more recently that of strain J99 has provided new insight into the biology of this organism. In this review, we attempt to analyze and interpret the information provided by sequence annotations and to compare these data with those provided by experimental analyses. After a brief description of the general features of the genomes of the two sequenced strains, the principal metabolic pathways are analyzed. In particular, the enzymes encoded by H. pylori involved in fermentative and oxidative metabolism, lipopolysaccharide biosynthesis, nucleotide biosynthesis, aerobic and anaerobic respiration, and iron and nitrogen assimilation are described, and the areas of controversy between the experimental data and those provided by the sequence annotation are discussed. The role of urease, particularly in pH homeostasis, and other specialized mechanisms developed by the bacterium to maintain its internal pH are also considered. The replicational, transcriptional, and translational apparatuses are reviewed, as is the regulatory network. The numerous findings on the metabolism of the bacteria and the paucity of gene expression regulation systems are indicative of the high level of adaptation to the human gastric environment. Arguments in favor of the diversity of H. pylori and molecular data reflecting possible mechanisms involved in this diversity are presented. Finally, we compare the numerous experimental data on the colonization factors and those provided from the genome sequence annotation, in particular for genes involved in motility and adherence of the bacterium to the gastric tissue. PMID:10477311
Targeting of cytosolic mRNA to mitochondria: naked RNA can bind to the mitochondrial surface.
Michaud, Morgane; Maréchal-Drouard, Laurence; Duchêne, Anne-Marie
2014-05-01
Mitochondria contain hundreds of proteins but only a few are encoded by the mitochondrial genome. The other proteins are nuclear-encoded and imported into mitochondria. These proteins can be translated on free cytosolic polysomes, then targeted and imported into mitochondria. Nonetheless, numerous cytosolic mRNAs encoding mitochondrial proteins are detected at the surface of mitochondria in yeast, plants and animals. The localization of mRNAs to the vicinity of mitochondria would be a way for mitochondrial protein sorting. The mechanisms responsible for mRNA targeting to mitochondria are not clearly identified. Sequences within the mRNA molecules (cis-elements), as well as a few trans-acting factors, have been shown to be essential for targeting of some mRNAs. In order to identify receptors involved in mRNA docking to the mitochondrial surface, we have developed an in vitro mRNA binding assay with isolated plant mitochondria. We show that naked mRNAs are able to bind to isolated mitochondria, and our results strongly suggest that mRNA docking to the plant mitochondrial outer membrane requires at least one component of TOM complex. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
DNA Asymmetric Strand Bias Affects the Amino Acid Composition of Mitochondrial Proteins
Min, Xiang Jia; Hickey, Donal A.
2007-01-01
Variations in GC content between genomes have been extensively documented. Genomes with comparable GC contents can, however, still differ in the apportionment of the G and C nucleotides between the two DNA strands. This asymmetric strand bias is known as GC skew. Here, we have investigated the impact of differences in nucleotide skew on the amino acid composition of the encoded proteins. We compared orthologous genes between animal mitochondrial genomes that show large differences in GC and AT skews. Specifically, we compared the mitochondrial genomes of mammals, which are characterized by a negative GC skew and a positive AT skew, to those of flatworms, which show the opposite skews for both GC and AT base pairs. We found that the mammalian proteins are highly enriched in amino acids encoded by CA-rich codons (as predicted by their negative GC and positive AT skews), whereas their flatworm orthologs were enriched in amino acids encoded by GT-rich codons (also as predicted from their skews). We found that these differences in mitochondrial strand asymmetry (measured as GC and AT skews) can have very large, predictable effects on the composition of the encoded proteins. PMID:17974594
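The skew measures named in this abstract can be illustrated with a short calculation. The sketch below assumes the commonly used convention GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T), computed on a single strand; the abstract does not state which formula the authors used, and the example sequence is hypothetical.

```python
from collections import Counter


def skew(seq: str, a: str, b: str) -> float:
    """Return (count(a) - count(b)) / (count(a) + count(b)) for one strand."""
    counts = Counter(seq.upper())
    total = counts[a] + counts[b]
    if total == 0:
        return 0.0  # neither base present: define the skew as zero
    return (counts[a] - counts[b]) / total


def gc_skew(seq: str) -> float:
    """GC skew under the assumed (G - C) / (G + C) convention."""
    return skew(seq, "G", "C")


def at_skew(seq: str) -> float:
    """AT skew under the assumed (A - T) / (A + T) convention."""
    return skew(seq, "A", "T")


# Hypothetical CA-rich fragment: negative GC skew, positive AT skew,
# the same sign pattern the abstract reports for mammalian mitochondria.
fragment = "ACCCAACCAT"
print(gc_skew(fragment), at_skew(fragment))
```

A strand with the opposite sign pattern (positive GC skew, negative AT skew) would be GT-rich, which matches the flatworm case described above.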
Gupta, Adarsh K; Hein, Gary L; Graybosch, Robert A; Tatineni, Satyanarayana
2018-05-01
High Plains wheat mosaic virus (HPWMoV, genus Emaravirus; family Fimoviridae), transmitted by the wheat curl mite (Aceria tosichella Keifer), harbors a monocistronic octapartite single-stranded negative-sense RNA genome. In this study, putative proteins encoded by HPWMoV genomic RNAs 2-8 were screened for potential RNA silencing suppression activity by using a green fluorescent protein-based reporter agroinfiltration assay. We found that proteins encoded by RNAs 7 (P7) and 8 (P8) suppressed silencing induced by single- or double-stranded RNAs and efficiently suppressed the transitive pathway of RNA silencing. Additionally, a Wheat streak mosaic virus (WSMV, genus Tritimovirus; family Potyviridae) mutant lacking the suppressor of RNA silencing (ΔP1) but having either P7 or P8 from HPWMoV restored cell-to-cell and long-distance movement in wheat, thus indicating that P7 or P8 rescued silencing suppressor-deficient WSMV. Furthermore, HPWMoV P7 and P8 substantially enhanced the pathogenicity of Potato virus X in Nicotiana benthamiana. Collectively, these data demonstrate that the octapartite genome of HPWMoV encodes two suppressors of RNA silencing. Published by Elsevier Inc.
The COG database: a tool for genome-scale analysis of protein functions and evolution
Tatusov, Roman L.; Galperin, Michael Y.; Natale, Darren A.; Koonin, Eugene V.
2000-01-01
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56–83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program that is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes. PMID:10592175
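The "consistency of genome-specific best hits" criterion can be pictured with a toy example. The sketch below is a deliberately simplified illustration, not the COG construction procedure itself: it assumes that a candidate group is a triangle of proteins from three different genomes in which every member is the best hit of every other member in the relevant genome. The function name, the data layout, and the protein identifiers are all hypothetical.

```python
from itertools import combinations


def consistent_triangles(best_hit, genome_of):
    """Find triples of proteins from three genomes with mutually consistent best hits.

    best_hit[(protein, genome)] is the best-scoring hit of `protein` in `genome`.
    genome_of[protein] is the genome that `protein` belongs to.
    """
    triangles = []
    for a, b, c in combinations(sorted(genome_of), 3):
        # All three proteins must come from different genomes.
        if len({genome_of[a], genome_of[b], genome_of[c]}) < 3:
            continue
        ordered_pairs = [(a, b), (b, a), (a, c), (c, a), (b, c), (c, b)]
        # Strict (symmetric) assumption: x's best hit in y's genome must be y.
        if all(best_hit.get((x, genome_of[y])) == y for x, y in ordered_pairs):
            triangles.append((a, b, c))
    return triangles


# Hypothetical proteins from three genomes E, H and S; sceB is a paralog
# whose best hits point into the group but are not reciprocated.
genome_of = {"ecoA": "E", "hinA": "H", "sceA": "S", "sceB": "S"}
best_hit = {
    ("ecoA", "H"): "hinA", ("ecoA", "S"): "sceA",
    ("hinA", "E"): "ecoA", ("hinA", "S"): "sceA",
    ("sceA", "E"): "ecoA", ("sceA", "H"): "hinA",
    ("sceB", "E"): "ecoA", ("sceB", "H"): "hinA",
}
print(consistent_triangles(best_hit, genome_of))
```

Requiring best hits to be genome-specific, rather than applying one global similarity cutoff, lets slowly and rapidly evolving families be grouped by the same rule.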
HnRNP A3 genes and pseudogenes in the vertebrate genomes.
Makeyev, Aleksandr V; Kim, Chang Bae; Ruddle, Frank H; Enkhmandakh, Badam; Erdenechimeg, Lkhamsuren; Bayarsaihan, Dashzeveg
2005-04-01
The hnRNP A/B type proteins are abundant nuclear factors that bind to Pol II transcripts and are involved in numerous RNA-related activities. To date most data on the hnRNP A/B family have been obtained with recombinant proteins and cell cultures. Further characterization can result from an examination of the impact of various modifications in intact functional loci; however, such characterization is hampered by the presence of numerous and widely dispersed hnRNP A/B-related sequences in the mammalian genome. We have found hnRNP A3, a poorly recognized member of the hnRNP A/B family, among candidate transcription factors that interact with the regulatory region of the Hoxc8 gene and screened the human and mouse genomes for genes that encode hnRNP A3. We demonstrate that the sequence reported previously as the human hnRNP A3 gene (Accession number S63912) and located on 10p11.1 belongs to a processed pseudogene of the functional intron-containing locus HNRPA3, which we have identified on 2q31.2. We have also identified its murine orthologs on mouse chromosome 2D and rat chromosome 3q23. Alternative splices were revealed at the N-terminus and in the middle of hnRNP A3. 14 and 28 additional loci in the human and mouse genome, respectively, were mapped and identified as hnRNP A3 processed pseudogenes. In addition, we have found and compared hnRNP A3 orthologous genes in Gallus gallus, Xenopus tropicalis, and Danio rerio. The present in silico analysis serves as a necessary step toward a further functional characterization of hnRNP A3. (c) 2005 Wiley-Liss, Inc.
Bogdanova, Vera S.; Zaytseva, Olga O.; Mglinets, Anatoliy V.; Shatskaya, Natalia V.; Kosterin, Oleg E.; Vasiliev, Gennadiy V.
2015-01-01
In crosses of wild and cultivated peas (Pisum sativum L.), nuclear-cytoplasmic incompatibility frequently occurs, manifested as decreased pollen fertility, male gametophyte lethality, or sporophyte lethality. High-throughput sequencing of plastid genomes of one cultivated and four wild pea accessions differing in cross-compatibility was performed. Candidate genes for involvement in the nuclear-plastid conflict were searched in the reconstructed plastid genomes. In the annotated Medicago truncatula genome, nuclear candidate genes were searched in the portion syntenic to the pea chromosome region known to harbor a locus involved in the conflict. In the plastid genomes, a substantial variability of the accD locus represented by nucleotide substitutions and indels was found to correspond to the pattern of cross-compatibility among the accessions analyzed. Amino acid substitutions in the polypeptides encoded by the alleles of a nuclear locus, designated as Bccp3, with a complementary function to accD, fitted the compatibility pattern. The accD locus in the plastid genome, encoding the beta subunit of the carboxyltransferase of acetyl-CoA carboxylase, and the nuclear locus Bccp3, encoding the biotin carboxyl carrier protein of the same multi-subunit enzyme, were nominated as candidate genes for the main contribution to nuclear-cytoplasmic incompatibility in peas. Existence of another nuclear locus involved in the accD-mediated conflict is hypothesized. PMID:25789472
Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.
Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M
1991-02-15
The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. The lengths of each putative exon of the prxEa gene were the same as those of the HRP-basic-isozyme-encoding gene, prxC3, and coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
Genome complexity in the coelacanth is reflected in its adaptive immune system
Saha, Nil Ratan; Ota, Tatsuya; Litman, Gary W.; Hansen, John; Parra, Zuly; Hsu, Ellen; Buonocore, Francesco; Canapa, Adriana; Cheng, Jan-Fang; Amemiya, Chris T.
2014-01-01
We have analyzed the available genome and transcriptome resources from the coelacanth in order to characterize genes involved in adaptive immunity. Two highly distinctive IgW-encoding loci have been identified that exhibit a unique genomic organization, including a multiplicity of tandemly repeated constant region exons. The overall organization of the IgW loci precludes typical heavy chain class switching. A locus encoding IgM could not be identified either computationally or by using several different experimental strategies. Four distinct sets of genes encoding Ig light chains were identified. This includes a variant sigma-type Ig light chain previously identified only in cartilaginous fishes and which is now provisionally denoted sigma-2. Genes encoding α/β and γ/δ T-cell receptors, and CD3, CD4, and CD8 co-receptors also were characterized. Ig heavy chain variable region genes and TCR components are interspersed within the TCR α/δ locus; this organization previously was reported only in tetrapods and raises questions regarding evolution and functional cooption of genes encoding variable regions. The composition, organization and syntenic conservation of the major histocompatibility complex locus have been characterized. We also identified large numbers of genes encoding cytokines and their receptors, and other genes associated with adaptive immunity. In terms of sequence identity and organization, the adaptive immune genes of the coelacanth more closely resemble orthologous genes in tetrapods than those in teleost fishes, consistent with current phylogenomic interpretations. Overall, the work described herein highlights the complexity inherent in the coelacanth genome and provides a rich catalog of immune genes for future investigations.
Nolan, John P.; Mandy, Francis
2008-01-01
While the term flow cytometry refers to the measurement of cells, the approach of making sensitive multiparameter optical measurements in a flowing sample stream is a very general analytical approach. The past few years have seen an explosion in the application of flow cytometry technology for molecular analysis and measurements using micro-particles as solid supports. While microsphere-based molecular analyses using flow cytometry date back three decades, the need for highly parallel quantitative molecular measurements that has arisen from various genomic and proteomic advances has driven the development in particle encoding technology to enable highly multiplexed assays. Multiplexed particle-based immunoassays are now commonplace, as are new assays to study genes, protein function, and molecular assembly. Numerous efforts are underway to extend the multiplexing capabilities of microparticle-based assays through new approaches to particle encoding and analyte reporting. The impact of these developments will be seen in the basic research and clinical laboratories, as well as in drug development. PMID:16604537
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics have facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of introns. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium.
Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species.
Joardar, Vinita; Williams, Kelly P.; Driscoll, Timothy; Hostetler, Jessica B.; Nordberg, Eric; Shukla, Maulik; Walenz, Brian; Hill, Catherine A.; Nene, Vishvanath M.; Azad, Abdu F.; Sobral, Bruno W.; Caler, Elisabet
2012-01-01
We present the draft genome for the Rickettsia endosymbiont of Ixodes scapularis (REIS), a symbiont of the deer tick vector of Lyme disease in North America. Among Rickettsia species (Alphaproteobacteria: Rickettsiales), REIS has the largest genome sequenced to date (>2 Mb) and contains 2,309 genes across the chromosome and four plasmids (pREIS1 to pREIS4). The most remarkable finding within the REIS genome is the extraordinary proliferation of mobile genetic elements (MGEs), which contributes to a limited synteny with other Rickettsia genomes. In particular, an integrative conjugative element named RAGE (for Rickettsiales amplified genetic element), previously identified in scrub typhus rickettsiae (Orientia tsutsugamushi) genomes, is present on both the REIS chromosome and plasmids. Unlike the pseudogene-laden RAGEs of O. tsutsugamushi, REIS encodes nine conserved RAGEs that include F-like type IV secretion systems similar to that of the tra genes encoded in the Rickettsia bellii and R. massiliae genomes. An unparalleled abundance of encoded transposases (>650) relative to genome size, together with the RAGEs and other MGEs, comprise ∼35% of the total genome, making REIS one of the most plastic and repetitive bacterial genomes sequenced to date. We present evidence that conserved rickettsial genes associated with an intracellular lifestyle were acquired via MGEs, especially the RAGE, through a continuum of genomic invasions. Robust phylogeny estimation suggests REIS is ancestral to the virulent spotted fever group of rickettsiae. As REIS is not known to invade vertebrate cells and has no known pathogenic effects on I. scapularis, its genome sequence provides insight on the origin of mechanisms of rickettsial pathogenicity. PMID:22056929
Pichon, Christophe; du Merle, Laurence; Caliot, Marie Elise; Trieu-Cuot, Patrick; Le Bouguénec, Chantal
2012-01-01
Characterization of small non-coding ribonucleic acids (sRNA) among the large volume of data generated by high-throughput RNA-seq or tiling microarray analyses remains a challenge. Thus, there is still a need for accurate in silico prediction methods to identify sRNAs within a given bacterial species. After years of effort, dedicated software was developed based on comparative genomic analyses or mathematical/statistical models. Although these genomic analyses enabled sRNAs in intergenic regions to be efficiently identified, they all failed to predict antisense sRNA genes (asRNA), i.e. RNA genes located on the DNA strand complementary to that which encodes the protein. The statistical models enabled any genomic region to be analyzed theoretically but not efficiently. We present a new model for in silico identification of sRNA and asRNA candidates within an entire bacterial genome. This model was successfully used to analyze the Gram-negative Escherichia coli and Gram-positive Streptococcus agalactiae. In both bacteria, numerous asRNAs are transcribed from the complementary strand of genes located in pathogenicity islands, strongly suggesting that these asRNAs are regulators of the virulence expression. In particular, we characterized an asRNA that acted as an enhancer-like regulator of the type 1 fimbriae production involved in the virulence of extra-intestinal pathogenic E. coli. PMID:22139924
Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carte, Jason; Wang, Ruiying; Li, Hong
An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here we identified Pyrococcus furiosus Cas6 as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5′ region of the CRISPR repeat element and cleaves at a defined site within the 3′ region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.
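The processing step described in this abstract, a long repeat-spacer transcript cut at a defined site inside each repeat, can be mimicked on a toy string. The sketch below is only an illustration of the repeat-spacer layout: the repeat sequence, the spacers, and the cut offset are hypothetical and are not the P. furiosus values.

```python
def process_crispr_transcript(transcript: str, repeat: str, cut_offset: int) -> list[str]:
    """Cut the transcript `cut_offset` bases into every occurrence of `repeat`.

    Returns the fragments in 5'-to-3' order; joining them restores the transcript.
    """
    fragments = []
    start = 0
    pos = transcript.find(repeat)
    while pos != -1:
        cut = pos + cut_offset           # defined cleavage site within this repeat
        fragments.append(transcript[start:cut])
        start = cut
        pos = transcript.find(repeat, pos + len(repeat))
    fragments.append(transcript[start:])  # whatever follows the last cut
    return fragments


# Toy array: three copies of a hypothetical 6-nt repeat around two spacers.
REPEAT = "GTTCCA"
transcript = REPEAT + "AAAA" + REPEAT + "CCCC" + REPEAT
# Cutting 4 nt into each repeat leaves every spacer flanked by repeat-derived
# sequence: the repeat's last 2 nt on its 5' side, its first 4 nt on its 3' side.
print(process_crispr_transcript(transcript, REPEAT, cut_offset=4))
```

Each internal fragment carries exactly one spacer, which is the sense in which a single long transcript yields individual invader-targeting RNAs.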
Poehlein, Anja; Heym, Daniel; Quitzke, Vivien; Fersch, Julia; Daniel, Rolf; Rother, Michael
2018-04-05
Methanococcus maripaludis type strain JJ (DSM 2067) is an important organism because it serves as a model for primary energy metabolism and hydrogenotrophic methanogenesis and is amenable to genetic manipulation. The complete genome (1.7 Mb) harbors 1,815 predicted protein-encoding genes, including 9 encoding selenoproteins. Copyright © 2018 Poehlein et al.
The Caulobacter crescentus phage phiCbK: genomics of a canonical phage
2012-01-01
Background The bacterium Caulobacter crescentus is a popular model for the study of cell cycle regulation and senescence. The large prolate siphophage phiCbK has been an important tool in C. crescentus biology, and has been studied in its own right as a model for viral morphogenesis. Although a system of some interest, to date little genomic information is available on phiCbK or its relatives. Results Five novel phiCbK-like C. crescentus bacteriophages, CcrMagneto, CcrSwift, CcrKarma, CcrRogue and CcrColossus, were isolated from the environment. The genomes of phage phiCbK and these five environmental phage isolates were obtained by 454 pyrosequencing. The phiCbK-like phage genomes range in size from 205 kb encoding 318 proteins (phiCbK) to 280 kb encoding 448 proteins (CcrColossus), and were found to contain nonpermuted terminal redundancies of 10 to 17 kb. A novel method of terminal ligation was developed to map genomic termini, which confirmed termini predicted by coverage analysis. This suggests that sequence coverage discontinuities may be useable as predictors of genomic termini in phage genomes. Genomic modules encoding virion morphogenesis, lysis and DNA replication proteins were identified. The phiCbK-like phages were also found to encode a number of intriguing proteins; all contain a clearly T7-like DNA polymerase, and five of the six encode a possible homolog of the C. crescentus cell cycle regulator GcrA, which may allow the phage to alter the host cell’s replicative state. The structural proteome of phage phiCbK was determined, identifying the portal, major and minor capsid proteins, the tail tape measure and possible tail fiber proteins. All six phage genomes are clearly related; phiCbK, CcrMagneto, CcrSwift, CcrKarma and CcrRogue form a group related at the DNA level, while CcrColossus is more diverged but retains significant similarity at the protein level. 
Conclusions Due to their lack of any apparent relationship to other described phages, this group is proposed as the founding cohort of a new phage type, the phiCbK-like phages. This work will serve as a foundation for future studies on morphogenesis, infection and phage-host interactions in C. crescentus. PMID:23050599
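The suggestion that sequence coverage discontinuities may predict genomic termini can be illustrated with a simple scan for abrupt changes in read depth. The sketch below is an assumption-laden illustration rather than the authors' analysis: it compares the mean coverage of adjacent windows and reports positions where the two differ by at least a chosen ratio, as would be expected at the edges of a terminally redundant region sequenced at roughly double depth. The window size, the ratio threshold, and the coverage values are hypothetical.

```python
def coverage_steps(coverage, window=3, ratio=1.8):
    """Return indices where mean coverage jumps or drops by at least `ratio`.

    At each index i, the mean of the `window` positions before i is compared
    with the mean of the `window` positions starting at i.
    """
    steps = []
    for i in range(window, len(coverage) - window + 1):
        left = sum(coverage[i - window:i]) / window
        right = sum(coverage[i:i + window]) / window
        if left > 0 and right > 0 and max(left / right, right / left) >= ratio:
            steps.append(i)
    return steps


# Hypothetical per-base depth: a doubled-coverage block in the middle,
# standing in for a terminally redundant region.
depth = [100] * 6 + [200] * 6 + [100] * 6
print(coverage_steps(depth))
```

On real data the depth profile is noisy, so larger windows and an experimental check, such as the terminal ligation method mentioned in the abstract, would be needed before calling a terminus.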
Complete genome sequence of keunjorong mosaic virus, a potyvirus from Cynanchum wilfordii.
Nam, Moon; Lee, Joo-Hee; Choi, Hong Soo; Lim, Hyoun-Sub; Moon, Jae Sun; Lee, Su-Heon
2013-08-01
We have determined the complete genome sequence of keunjorong mosaic virus (KjMV). The KjMV genome is composed of 9,611 nucleotides, excluding the 3'-terminal poly(A) tail. It contains two open reading frames (ORFs), with the large one encoding a polyprotein of 3,070 amino acids and the small overlapping ORF encoding a PIPO protein of 81 amino acids. The KjMV genome shared the highest nucleotide sequence identity (57.5 %) with pepper mottle virus and freesia mosaic virus, two members of the genus Potyvirus. Based on the phylogenetic relatedness to known potyviruses, KjMV appears to be a member of a new species in the genus Potyvirus.
Genomic instability--an evolving hallmark of cancer.
Negrini, Simona; Gorgoulis, Vassilis G; Halazonetis, Thanos D
2010-03-01
Genomic instability is a characteristic of most cancers. In hereditary cancers, genomic instability results from mutations in DNA repair genes and drives cancer development, as predicted by the mutator hypothesis. In sporadic (non-hereditary) cancers the molecular basis of genomic instability remains unclear, but recent high-throughput sequencing studies suggest that mutations in DNA repair genes are infrequent before therapy, arguing against the mutator hypothesis for these cancers. Instead, the mutation patterns of the tumour suppressor TP53 (which encodes p53), ataxia telangiectasia mutated (ATM) and cyclin-dependent kinase inhibitor 2A (CDKN2A; which encodes p16INK4A and p14ARF) support the oncogene-induced DNA replication stress model, which attributes genomic instability and TP53 and ATM mutations to oncogene-induced DNA damage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Suzuki, Hitoshi; MacDonald, Jacqueline; Syed, Khajamohiddin
2012-01-01
Background Softwood is the predominant form of land plant biomass in the Northern hemisphere, and is among the most recalcitrant biomass resources to bioprocess technologies. The white rot fungus, Phanerochaete carnosa, has been isolated almost exclusively from softwoods, while most other known white-rot species, including Phanerochaete chrysosporium, were mainly isolated from hardwoods. Accordingly, it is anticipated that P. carnosa encodes a distinct set of enzymes and proteins that promote softwood decomposition. To elucidate the genetic basis of softwood bioconversion by a white-rot fungus, the present study reports the P. carnosa genome sequence and its comparative analysis with the previously reported P. chrysosporium genome. Results P. carnosa encodes a complete set of lignocellulose-active enzymes. Comparative genomic analysis revealed that P. carnosa is enriched with genes encoding manganese peroxidase, and that the most divergent glycoside hydrolase families were predicted to encode hemicellulases and glycoprotein degrading enzymes. Most remarkably, P. carnosa possesses one of the largest P450 contingents (266 P450s) among the sequenced and annotated wood-rotting basidiomycetes, nearly double that of P. chrysosporium. Along with metabolic pathway modeling, comparative growth studies on model compounds and chemical analyses of decomposed wood components showed greater tolerance of P. carnosa to various substrates including coniferous heartwood. Conclusions The P. carnosa genome is enriched with genes that encode P450 monooxygenases that can participate in extractives degradation, and manganese peroxidases involved in lignin degradation. The significant expansion of P450s in P. carnosa, along with differences in carbohydrate- and lignin-degrading enzymes, could be correlated to the utilization of heartwood and sapwood preparations from both coniferous and hardwood species. PMID:22937793
Tsuchiaka, Shinobu; Rahpaya, Sayed Samim; Otomaru, Konosuke; Aoki, Hiroshi; Kishimoto, Mai; Naoi, Yuki; Omatsu, Tsutomu; Sano, Kaori; Okazaki-Terashima, Sachiko; Katayama, Yukie; Oba, Mami; Nagai, Makoto; Mizutani, Tetsuya
2017-01-17
Bovine enterovirus (BEV) belongs to the species Enterovirus E or F, genus Enterovirus and family Picornaviridae. Although numerous studies have identified BEVs in the feces of cattle with diarrhea, the pathogenicity of BEVs remains unclear. Previously, we reported the detection of novel kobu-like virus in calf feces, by metagenomics analysis. In the present study, we identified a novel BEV in diarrheal feces collected for that survey. Complete genome sequences were determined by deep sequencing in feces. Secondary RNA structure analysis of the 5' untranslated region (UTR), phylogenetic tree construction and pairwise identity analysis were conducted. The complete genome sequences of BEV were genetically distant from other EVs and the VP1 coding region contained novel and unique amino acid sequences. We named this strain as BEV AN12/Bos taurus/JPN/2014 (referred to as BEV-AN12). According to genome analysis, the genome length of this virus is 7414 nucleotides excluding the poly (A) tail and its genome consists of a 5'UTR, open reading frame encoding a single polyprotein, and 3'UTR. The results of secondary RNA structure analysis showed that in the 5'UTR, BEV-AN12 had an additional clover leaf structure and small stem loop structure, similarly to other BEVs. In pairwise identity analysis, BEV-AN12 showed high amino acid (aa) identities to Enterovirus F in the polyprotein, P2 and P3 regions (aa identity ≥82.4%). Therefore, BEV-AN12 is closely related to Enterovirus F. However, aa sequences in the capsid protein regions, particularly the VP1 encoding region, showed significantly low aa identity to other viruses in genus Enterovirus (VP1 aa identity ≤58.6%). In addition, BEV-AN12 branched separately from Enterovirus E and F in phylogenetic trees based on the aa sequences of P1 and VP1, although it clustered with Enterovirus F in trees based on sequences in the P2 and P3 genome region. 
We identified a novel BEV in Japan that possesses highly divergent aa sequences in the VP1 coding region. Based on the species definition, we propose naming this strain "Enterovirus K", a novel species within the genus Enterovirus. Further genomic studies are needed to understand the pathogenicity of BEVs.
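The pairwise identity figures quoted in this abstract (for example, aa identity ≥82.4% in the polyprotein and ≤58.6% in VP1) can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped; the function name `pairwise_identity` and this gap convention are illustrative assumptions, not the method used by the authors.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption for illustration: columns with a gap in either
    sequence are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Other conventions (for example, counting gapped columns as mismatches) give different percentages, so the convention should be stated alongside any reported value.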
2013-01-01
Background Lyme disease is caused by spirochete bacteria from the Borrelia burgdorferi sensu lato (B. burgdorferi s.l.) species complex. To reconstruct the evolution of B. burgdorferi s.l. and identify the genomic basis of its human virulence, we compared the genomes of 23 B. burgdorferi s.l. isolates from Europe and the United States, including B. burgdorferi sensu stricto (B. burgdorferi s.s., 14 isolates), B. afzelii (2), B. garinii (2), B. “bavariensis” (1), B. spielmanii (1), B. valaisiana (1), B. bissettii (1), and B. “finlandensis” (1). Results Robust B. burgdorferi s.s. and B. burgdorferi s.l. phylogenies were obtained using genome-wide single-nucleotide polymorphisms, despite recombination. Phylogeny-based pan-genome analysis showed that the rate of gene acquisition was higher between species than within species, suggesting adaptive speciation. Strong positive natural selection drives the sequence evolution of lipoproteins, including chromosomally-encoded genes 0102 and 0404, cp26-encoded ospC and b08, and lp54-encoded dbpA, a07, a22, a33, a53, a65. Computer simulations predicted rapid adaptive radiation of genomic groups as population size increases. Conclusions Intra- and inter-specific pan-genome sizes of B. burgdorferi s.l. expand linearly with phylogenetic diversity. Yet gene-acquisition rates in B. burgdorferi s.l. are among the lowest in bacterial pathogens, resulting in high genome stability and few lineage-specific genes. Genome adaptation of B. burgdorferi s.l. is driven predominantly by copy-number and sequence variations of lipoprotein genes. New genomic groups are likely to emerge if the current trend of B. burgdorferi s.l. population expansion continues. PMID:24112474
Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F
2016-01-01
The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.
Is junk DNA bunk? A critique of ENCODE
Doolittle, W. Ford
2013-01-01
Do data from the Encyclopedia Of DNA Elements (ENCODE) project render the notion of junk DNA obsolete? Here, I review older arguments for junk grounded in the C-value paradox and propose a thought experiment to challenge ENCODE’s ontology. Specifically, what would we expect for the number of functional elements (as ENCODE defines them) in genomes much larger than our own genome? If the number were to stay more or less constant, it would seem sensible to consider the rest of the DNA of larger genomes to be junk or, at least, assign it a different sort of role (structural rather than informational). If, however, the number of functional elements were to rise significantly with C-value then, (i) organisms with genomes larger than our genome are more complex phenotypically than we are, (ii) ENCODE’s definition of functional element identifies many sites that would not be considered functional or phenotype-determining by standard uses in biology, or (iii) the same phenotypic functions are often determined in a more diffuse fashion in larger-genomed organisms. Good cases can be made for propositions ii and iii. A larger theoretical framework, embracing informational and structural roles for DNA, neutral as well as adaptive causes of complexity, and selection as a multilevel phenomenon, is needed. PMID:23479647
A deep auto-encoder model for gene expression prediction.
Xie, Rui; Wen, Jia; Quitadamo, Andrew; Cheng, Jianlin; Shi, Xinghua
2017-11-17
Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including the genotypes of genetic variants. To delineate the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
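Two ingredients named in this abstract, input corruption for a denoising auto-encoder and dropout, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MLP-SAE implementation: the function names `corrupt` and `dropout`, the use of masking noise, and the inverted-dropout rescaling are all choices made here for the example.

```python
import random

def corrupt(genotypes, noise_rate, rng):
    """Masking noise (assumed here): zero each input with probability noise_rate.

    A denoising auto-encoder is trained to reconstruct the clean
    input from a corrupted copy such as this one.
    """
    return [0 if rng.random() < noise_rate else g for g in genotypes]

def dropout(activations, drop_rate, rng, training=True):
    """Inverted dropout: zero units with probability drop_rate and
    rescale the survivors so the expected activation is unchanged."""
    if not training or drop_rate == 0.0:
        return list(activations)
    keep = 1.0 - drop_rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# Hypothetical usage with toy SNP genotypes coded 0/1/2.
rng = random.Random(0)
noisy = corrupt([0, 1, 2, 1], noise_rate=0.3, rng=rng)
hidden = dropout([0.5, 1.0, 1.5], drop_rate=0.5, rng=rng)
```

At prediction time dropout is switched off (`training=False`), which is why the surviving units are rescaled during training rather than afterwards.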
The Fragmented Mitochondrial Ribosomal RNAs of Plasmodium falciparum
Feagin, Jean E.; Harrell, Maria Isabel; Lee, Jung C.; Coe, Kevin J.; Sands, Bryan H.; Cannone, Jamie J.; Tami, Germaine; Schnare, Murray N.; Gutell, Robin R.
2012-01-01
Background The mitochondrial genome in the human malaria parasite Plasmodium falciparum is most unusual. Over half the genome is composed of the genes for three classic mitochondrial proteins: cytochrome oxidase subunits I and III and apocytochrome b. The remainder encodes numerous small RNAs, ranging in size from 23 to 190 nt. Previous analysis revealed that some of these transcripts have significant sequence identity with highly conserved regions of large and small subunit rRNAs, and can form the expected secondary structures. However, these rRNA fragments are not encoded in linear order; instead, they are intermixed with one another and the protein coding genes, and are coded on both strands of the genome. This unorthodox arrangement hindered the identification of transcripts corresponding to other regions of rRNA that are highly conserved and/or are known to participate directly in protein synthesis. Principal Findings The identification of 14 additional small mitochondrial transcripts from P. falciparum and the assignment of 27 small RNAs (12 SSU RNAs totaling 804 nt, 15 LSU RNAs totaling 1233 nt) to specific regions of rRNA are supported by multiple lines of evidence. The regions now represented are highly similar to those of the small but contiguous mitochondrial rRNAs of Caenorhabditis elegans. The P. falciparum rRNA fragments cluster on the interfaces of the two ribosomal subunits in the three-dimensional structure of the ribosome. Significance All of the rRNA fragments are now presumed to have been identified with experimental methods, and nearly all of these have been mapped onto the SSU and LSU rRNAs. Conversely, all regions of the rRNAs that are known to be directly associated with protein synthesis have been identified in the P. falciparum mitochondrial genome and RNA transcripts. The fragmentation of the rRNA in the P. falciparum mitochondrion is the most extreme example of any rRNA fragmentation discovered. PMID:22761677
Nunoura, Takuro; Hirayama, Hisako; Takami, Hideto; Oida, Hanako; Nishi, Shinro; Shimamura, Shigeru; Suzuki, Yohey; Inagaki, Fumio; Takai, Ken; Nealson, Kenneth H; Horikoshi, Koki
2005-12-01
Within the phylum Crenarchaeota, only some members of the hyperthermophilic class Thermoprotei have been cultivated and characterized. In this study, we have constructed a metagenomic library from a microbial mat formation in a subsurface hot water stream of the Hishikari gold mine, Japan, and sequenced genome fragments of two different phylogroups of uncultivated thermophilic Crenarchaeota: (i) hot water crenarchaeotic group (HWCG) I (41.2 kb), and (ii) HWCG III (49.3 kb). The genome fragment of HWCG I contained a 16S rRNA gene, two tRNA genes and 35 genes encoding proteins but no 23S rRNA gene. Among the genes encoding proteins, several genes for putative aerobic-type carbon monoxide dehydrogenase represented a potential clue with regard to the yet unknown metabolism of HWCG I Archaea. The genome fragment of HWCG III contained a 16S/23S rRNA operon and 44 genes encoding proteins. In the 23S rRNA gene, we detected a homing-endonuclease-encoding group I intron similar to those detected in hyperthermophilic Crenarchaeota and Bacteria, as well as eukaryotic organelles. The reconstructed phylogenetic tree based on the 23S rRNA gene sequence reinforced the intermediate phylogenetic affiliation of HWCG III bridging the hyperthermophilic and non-thermophilic uncultivated Crenarchaeota.
Rademaker, Jan L. W.; Herbet, Hélène; Starrenburg, Marjo J. C.; Naser, Sabri M.; Gevers, Dirk; Kelly, William J.; Hugenholtz, Jeroen; Swings, Jean; van Hylckama Vlieg, Johan E. T.
2007-01-01
The diversity of a collection of 102 lactococcus isolates including 91 Lactococcus lactis isolates of dairy and nondairy origin was explored using partial small subunit rRNA gene sequence analysis and limited phenotypic analyses. A subset of 89 strains of L. lactis subsp. cremoris and L. lactis subsp. lactis isolates was further analyzed by (GTG)5-PCR fingerprinting and a novel multilocus sequence analysis (MLSA) scheme. Two major genomic lineages within L. lactis were found. The L. lactis subsp. cremoris type-strain-like genotype lineage included both L. lactis subsp. cremoris and L. lactis subsp. lactis isolates. The other major lineage, with a L. lactis subsp. lactis type-strain-like genotype, comprised L. lactis subsp. lactis isolates only. A novel third genomic lineage represented two L. lactis subsp. lactis isolates of nondairy origin. The genomic lineages deviate from the subspecific classification of L. lactis that is based on a few phenotypic traits only. MLSA of six partial genes (atpA, encoding ATP synthase alpha subunit; pheS, encoding phenylalanine tRNA synthetase; rpoA, encoding RNA polymerase alpha chain; bcaT, encoding branched chain amino acid aminotransferase; pepN, encoding aminopeptidase N; and pepX, encoding X-prolyl dipeptidyl peptidase) revealed 363 polymorphic sites (total length, 1,970 bases) among 89 L. lactis subsp. cremoris and L. lactis subsp. lactis isolates with unique sequence types for most isolates. This allowed high-resolution cluster analysis in which dairy isolates form subclusters of limited diversity within the genomic lineages. The pheS DNA sequence analysis yielded two genetic groups dissimilar to the other genotyping analysis-based lineages, indicating a disparate acquisition route for this gene. PMID:17890345
The complete genomic sequence of egg drop syndrome virus strain AAV-2.
Jin, Q; Zeng, L; Yang, F; Li, M; Hou, Y
1999-12-01
To characterize the genome of egg drop syndrome virus (EDSV-76) Chinese strain AAV-2, part of the restriction endonuclease physical map was analyzed and a complete genomic library was constructed. On this basis, the complete genome nucleotide sequence (32,838 bp in length, including terminal structures) was determined. Data analysis shows that, compared with other adenoviruses, strain AAV-2 differs markedly in genomic structure and in the distribution of open reading frames (ORFs). There are no clear E1, E3 and E4 regions in the AAV-2 genome. Two segments located at the two ends of the genome (1.1 kb and 8.3 kb in length, respectively) have no homology with other adenovirus genomes. In addition, the strain AAV-2 genome lacks ORFs encoding E1A, pV and pIX, which are common ORFs encoding early and late proteins in adenoviruses. This reveals, for the first time, differences between EDSV-76, the sole standard strain of group III avian adenoviruses, and the other avian adenoviruses. It will aid research on avian adenoviruses and on adenoviruses in general.
The location and translocation of ndh genes of chloroplast origin in the Orchidaceae family
Lin, Choun-Sea; Chen, Jeremy J. W.; Huang, Yao-Ting; Chan, Ming-Tsair; Daniell, Henry; Chang, Wan-Jung; Hsu, Chen-Tran; Liao, De-Chih; Wu, Fu-Huei; Lin, Sheng-Yi; Liao, Chen-Fu; Deyholos, Michael K.; Wong, Gane Ka-Shu; Albert, Victor A.; Chou, Ming-Lun; Chen, Chun-Yi; Shih, Ming-Che
2015-01-01
The NAD(P)H dehydrogenase complex is encoded by 11 ndh genes in plant chloroplast (cp) genomes. However, ndh genes are truncated or deleted in some autotrophic Epidendroideae orchid cp genomes. To determine the evolutionary timing of the gene deletions and the genomic locations of the various ndh genes in orchids, the cp genomes of Vanilla planifolia, Paphiopedilum armeniacum, Paphiopedilum niveum, Cypripedium formosanum, Habenaria longidenticulata, Goodyera fumata and Masdevallia picturata were sequenced; these genomes represent Vanilloideae, Cypripedioideae, Orchidoideae and Epidendroideae subfamilies. Four orchid cp genome sequences were found to contain a complete set of ndh genes. In other genomes, ndh deletions did not correlate to known taxonomic or evolutionary relationships and deletions occurred independently after the orchid family split into different subfamilies. In orchids lacking cp encoded ndh genes, non cp localized ndh sequences were identified. In Erycina pusilla, at least 10 truncated ndh gene fragments were found transferred to the mitochondrial (mt) genome. The phenomenon of orchid ndh transfer to the mt genome existed in ndh-deleted orchids and also in ndh containing species. PMID:25761566
Ebolavirus comparative genomics
Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; ...
2015-07-14
The 2014 Ebola outbreak in West Africa is the largest documented for this virus. We examine the dynamics of this genome, comparing more than one hundred currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus, and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP), and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. In conclusion, this information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies.
A genomic approach to the understanding of Xylella fastidiosa pathogenicity.
Lambais, M R; Goldman, M H; Camargo, L E; Goldman, G H
2000-10-01
Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes several economically important plant diseases, including citrus variegated chlorosis (CVC). X. fastidiosa is the first plant pathogen to have its genome completely sequenced. In addition, it is probably the least previously studied of any organism for which the complete genome sequence is available. Several pathogenicity-related genes have been identified in the X. fastidiosa genome by similarity with other bacterial genes involved in pathogenesis in plants, as well as in animals. The X. fastidiosa genome encodes different classes of proteins directly or indirectly involved in cell-cell interactions, degradation of plant cell walls, iron homeostasis, anti-oxidant responses, synthesis of toxins, and regulation of pathogenicity. Neither genes encoding members of the type III protein secretion system nor avirulence-like genes have been identified in X. fastidiosa.
Mugford, Sam T.; Louveau, Thomas; Melton, Rachel; Qi, Xiaoquan; Bakht, Saleha; Hill, Lionel; Tsurushima, Tetsu; Honkanen, Suvi; Rosser, Susan J.; Lomonossoff, George P.; Osbourn, Anne
2013-01-01
Operon-like gene clusters are an emerging phenomenon in the field of plant natural products. The genes encoding some of the best-characterized plant secondary metabolite biosynthetic pathways are scattered across plant genomes. However, an increasing number of gene clusters encoding the synthesis of diverse natural products have recently been reported in plant genomes. These clusters have arisen through the neo-functionalization and relocation of existing genes within the genome, and not by horizontal gene transfer from microbes. The reasons for clustering are not yet clear, although this form of gene organization is likely to facilitate co-inheritance and co-regulation. Oats (Avena spp) synthesize antimicrobial triterpenoids (avenacins) that provide protection against disease. The synthesis of these compounds is encoded by a gene cluster. Here we show that a module of three adjacent genes within the wider biosynthetic gene cluster is required for avenacin acylation. Through the characterization of these genes and their encoded proteins we present a model of the subcellular organization of triterpenoid biosynthesis. PMID:23532069
van der Ley, P
1988-11-01
Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
Complete Mitochondrial Genome of the Medicinal Mushroom Ganoderma lucidum
Chen, Haimei; Chen, Xiangdong; Lan, Jin; Liu, Chang
2013-01-01
Ganoderma lucidum is one of the well-known medicinal basidiomycetes worldwide. The mitochondrion, referred to as the second genome, is an organelle found in most eukaryotic cells and participates in critical cellular functions. Elucidating the structure and function of this genome is important to understand completely the genetic contents of G. lucidum. In this study, we assembled the mitochondrial genome of G. lucidum and analyzed the differential expressions of its encoded genes across three developmental stages. The mitochondrial genome is a typical circular DNA molecule of 60,630 bp with a GC content of 26.67%. Genome annotation identified genes that encode 15 conserved proteins, 27 tRNAs, small and large rRNAs, four homing endonucleases, and two hypothetical proteins. Except for genes encoding trnW and two hypothetical proteins, all genes were located on the positive strand. For the repeat structure analysis, eight forward, two inverted, and three tandem repeats were detected. A pair of fragments with a total length around 5.5 kb was found in both the nuclear and mitochondrial genomes, which suggests the possible transfer of DNA sequences between two genomes. RNA-Seq data for samples derived from three stages, namely, mycelia, primordia, and fruiting bodies, were mapped to the mitochondrial genome and qualified. The protein-coding genes were expressed higher in mycelia or primordial stages compared with those in the fruiting bodies. The rRNA abundances were significantly higher in all three stages. Two regions were transcribed but did not contain any identified protein or tRNA genes. Furthermore, three RNA-editing sites were detected. Genome synteny analysis showed that significant genome rearrangements occurred in the mitochondrial genomes. This study provides valuable information on the gene contents of the mitochondrial genome and their differential expressions at various developmental stages of G. lucidum. 
The results contribute to the understanding of the functions and evolution of fungal mitochondrial DNA. PMID:23991034
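The GC content reported for the G. lucidum mitochondrial genome (26.67%) is a simple base-composition statistic. A minimal sketch of the calculation follows, assuming that only unambiguous A, C, G and T bases are counted; the function name `gc_content` and the handling of ambiguous bases are illustrative assumptions rather than the authors' procedure.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a DNA sequence.

    Assumption for illustration: characters other than A, C, G, T
    (for example N) are ignored rather than counted.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

For a 60,630 bp genome, a GC content of 26.67% corresponds to roughly 16,170 G or C bases.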
funRNA: a fungi-centered genomics platform for genes encoding key components of RNAi.
Choi, Jaeyoung; Kim, Ki-Tae; Jeon, Jongbum; Wu, Jiayao; Song, Hyeunjeong; Asiegbu, Fred O; Lee, Yong-Hwan
2014-01-01
RNA interference (RNAi) is involved in genome defense as well as diverse cellular, developmental, and physiological processes. Key components of RNAi are Argonaute, Dicer, and RNA-dependent RNA polymerase (RdRP), which have been functionally characterized mainly in model organisms. The key components are believed to exist throughout eukaryotes; however, there is no systematic platform for archiving and dissecting these important gene families. In addition, few fungi have been studied to date, limiting our understanding of RNAi in fungi. Here we present funRNA http://funrna.riceblast.snu.ac.kr/, a fungal kingdom-wide comparative genomics platform for putative genes encoding Argonaute, Dicer, and RdRP. To identify and archive genes encoding the abovementioned key components, protein domain profiles were determined from reference sequences obtained from UniProtKB/SwissProt. The domain profiles were searched using fungal, metazoan, and plant genomes, as well as bacterial and archaeal genomes. 1,163, 442, and 678 genes encoding Argonaute, Dicer, and RdRP, respectively, were predicted. Based on the identification results, active site variation of Argonaute, diversification of Dicer, and sequence analysis of RdRP were discussed in a fungus-oriented manner. funRNA provides results from diverse bioinformatics programs and job submission forms for BLAST, BLASTMatrix, and ClustalW. Furthermore, sequence collections created in funRNA are synced with several gene family analysis portals and databases, offering further analysis opportunities. funRNA provides identification results from a broad taxonomic range and diverse analysis functions, and could be used in diverse comparative and evolutionary studies. It could serve as a versatile genomics workbench for key components of RNAi.
Marcotte, Harold; Krogh Andersen, Kasper; Lin, Yin; Zuo, Fanglei; Zeng, Zhu; Larsson, Per Göran; Brandsborg, Erik; Brønstad, Gunnar; Hammarström, Lennart
2017-12-01
Lactobacillus rhamnosus DSM 14870 and Lactobacillus gasseri DSM 14869 were previously isolated from the vaginal epithelial cells (VEC) of healthy women and selected for the development of the vaginal EcoVag® probiotic capsules. EcoVag® was subsequently shown to provide long-term cure and reduce relapse of bacterial vaginosis (BV) as an adjunct to antibiotic therapy. To identify genes potentially involved in probiotic activity, we performed genome sequencing and characterization of the two strains. The complete genome analysis of both strains revealed the presence of genes encoding functions related to adhesion, exopolysaccharide (EPS) biosynthesis, antimicrobial activity, and CRISPR adaptive immunity but absence of antibiotic resistance genes. Interesting features of L. rhamnosus DSM 14870 genome include the presence of the spaCBA-srtC gene encoding spaCBA pili and interruption of the gene cluster encoding long galactose-rich EPS by integrases. Unique to L. gasseri DSM 14869 genome was the presence of a gene encoding a putative (1456 amino acid) new adhesin containing two rib/alpha-like repeats. L. rhamnosus DSM 14870 and L. gasseri DSM 14869 showed acidification of the culture medium (to pH 3.8) and a strong adhesion capability to the Caco-2 cell line and VEC. L. gasseri DSM 14869 could produce a thick (40 nm) EPS layer and hydrogen peroxide. L. rhamnosus DSM 14870 was shown to produce SpaCBA pili and a 20 nm EPS layer, and could inhibit the growth of Gardnerella vaginalis, a bacterium commonly associated with BV. The genome sequences provide a basis for further elucidation of the molecular basis for their probiotic functions. Copyright © 2017 Elsevier GmbH. All rights reserved.
Wang, Jian; Wang, Chang; Zhen, Shoumin; Li, Xiaohui; Yan, Yueming
2018-04-01
Wheat-related genomes may carry new glutenin genes with the potential for quality improvement of breadmaking. In this study, we estimated the gluten quality properties of the wheat line CNU609 derived from crossing between Chinese Spring (CS, Triticum aestivum L., 2n = 6x = 42, AABBDD) and the wheat Aegilops umbellulata (2n = 2x = 14, UU) 1U(1B) substitution line, and investigated the function of 1U-encoded low-molecular-weight glutenin subunits (LMW-GS). The main quality parameters of CNU609 were significantly improved due to introgression of the 1U genome, including dough development time, stability time, farinograph quality number, gluten index, loaf size and inner structure. Glutenin analysis showed that CNU609 and CS had the same high-molecular-weight glutenin subunit (HMW-GS) composition, but CNU609 carried eight specific 1U genome-encoded LMW-GS. The introgression of the 1U-encoded LMW-GS led to more and larger protein body formation in the CNU609 endosperm. Two new LMW-m type genes from the 1U genome, designated Glu-U3a and Glu-U3b, were cloned and characterized. Secondary structure prediction implied that both Glu-U3a and Glu-U3b encode subunits with high α-helix and β-strand content that could benefit the formation of superior gluten structure. Our results indicate that the 1U genome has superior LMW-GS that can be used as new gene resources for wheat gluten quality improvement. © 2017 Society of Chemical Industry.
Auffret, Pauline; Segura, Audrey; Klopp, Christophe; Bouchez, Olivier; Kérourédan, Monique; Bibbal, Delphine; Brugère, Hubert; Forano, Evelyne
2017-01-01
Enterohemorrhagic Escherichia coli (EHEC) with serotype O157:H7 is a major foodborne pathogen. Here, we report the draft genome sequence of EHEC O157:H7 strain MC2 isolated from cattle in France. The assembly contains 5,400,376 bp that encoded 5,914 predicted genes (5,805 protein-encoding genes and 109 RNA genes). PMID:28983004
Identification of a novel circular DNA virus in pig feces
USDA-ARS's Scientific Manuscript database
Metagenomic analysis of fecal samples collected from a swine with diarrhea detected sequences encoding a replicase (Rep) protein typically found in small circular Rep-encoding ssDNA (CRESS-DNA) viruses. The complete 3,062 nucleotide genome was generated and found to encode two bi-directionally trans...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K; Doyle, C Kuyler; Lykidis, A
2006-01-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
Dehoux, Pierre; Marvaud, Jean Christophe; Abouelleil, Amr; Earl, Ashlee M; Lambert, Thierry; Dauga, Catherine
2016-10-21
Clostridium bolteae and Clostridium clostridioforme, previously included in the complex C. clostridioforme in the group Clostridium XIVa, remain difficult to distinguish by phenotypic methods. These bacteria, prevailing in the human intestinal microbiota, are opportunistic pathogens with various drug susceptibility patterns. In order to better characterize the two species and to obtain information on their antibiotic resistance genes, we analyzed the genomes of six strains of C. bolteae and six strains of C. clostridioforme, isolated from human infection. The genome length of C. bolteae varied from 6159 to 6398 kb, and 5719 to 6059 CDSs were detected. The genomes of C. clostridioforme were smaller, between 5467 and 5927 kb, and contained 5231 to 5916 CDSs. The two species display different metabolic pathways. The genomes of C. bolteae contained lactose operons involving a PTS system and complex regulation, which contribute to phenotypic differentiation from C. clostridioforme. The Acetyl-CoA pathway, similar to that of Faecalibacterium prausnitzii, a major butyrate producer in the human gut, was only found in C. clostridioforme. The two species have also developed diverse flagella mobility systems contributing to gut colonization. Their genomes harboured many CDSs involved in resistance to beta-lactams, glycopeptides, macrolides, chloramphenicol, lincosamides, rifampin, linezolid, bacitracin, aminoglycosides and tetracyclines. Overall antimicrobial resistance genes were similar within a species, but strain-specific resistance genes were found. We discovered a new group of genes coding for rifampin resistance in C. bolteae. C. bolteae 90B3 was resistant to phenicols and linezolid by producing a 23S rRNA methyltransferase. C. clostridioforme 90A8 contained the VanB-type Tn1549 operon conferring vancomycin resistance. We also detected numerous genes encoding proteins related to efflux pump systems. Genomic comparison of C. bolteae and C. clostridioforme revealed functional differences in butyrate pathways and in flagellar systems, which play a critical role within human microbiota. Most of the resistance genes detected in both species were previously characterized in other bacterial species. A few of them were related to antibiotics inactive against Clostridium spp. Some were part of mobile genetic elements, suggesting that these commensals of the human microbiota act as a reservoir of antimicrobial resistance.
Comparative genomic analysis of Acinetobacter strains isolated from murine colonic crypts.
Saffarian, Azadeh; Touchon, Marie; Mulet, Céline; Tournebize, Régis; Passet, Virginie; Brisse, Sylvain; Rocha, Eduardo P C; Sansonetti, Philippe J; Pédron, Thierry
2017-07-11
A restricted set of aerobic bacteria dominated by the Acinetobacter genus was identified in murine intestinal colonic crypts. The proximity of such bacteria to intestinal stem cells could indicate that they protect the crypt against cytotoxic and genotoxic signals. Genome analyses of these bacteria were performed to better appreciate their biodegradative capacities. Two taxonomically different clusters of Acinetobacter were isolated from murine proximal colonic crypts: one was identified as A. modestus and the other as A. radioresistens. Their identification was performed through biochemical parameters and housekeeping gene sequencing. After selection of one strain of each cluster (A. modestus CM11G and A. radioresistens CM38.2), comparative genomic analysis was performed on whole-genome sequencing data. The antibiotic resistance pattern of these two strains is different, in line with the many genes involved in resistance to heavy metals identified in both genomes. Moreover, whereas the operon benABCDE involved in benzoate metabolism is encoded by the two genomes, the operon antABC encoding the anthranilate dioxygenase and the phenol hydroxylase gene cluster are absent in the A. modestus genomic sequence, indicating that the two strains have different capacities to metabolize xenobiotics. A common feature of the two strains is the presence of a type IV pili system, and the presence of genes encoding proteins pertaining to secretion systems such as Type I and Type II secretion systems. Our comparative genomic analysis revealed that different Acinetobacter isolated from the same biological niche, even if they share a large majority of genes, possess unique features that could play a specific role in the protection of the intestinal crypt.
Extensive Microbial and Functional Diversity within the Chicken Cecal Microbiome
Sergeant, Martin J.; Constantinidou, Chrystala; Cogan, Tristan A.; Bedford, Michael R.; Penn, Charles W.; Pallen, Mark J.
2014-01-01
Chickens are a major source of food and protein worldwide. Feed conversion and the health of chickens rely on the largely unexplored complex microbial community that inhabits the chicken gut, including the ceca. We have carried out deep microbial community profiling of the microbiota in twenty cecal samples via 16S rRNA gene sequences and an in-depth metagenomics analysis of a single cecal microbiota. We recovered 699 phylotypes, over half of which appear to represent previously unknown species. We obtained 648,251 environmental gene tags (EGTs), the majority of which represent new species. These were binned into over two-dozen draft genomes, which included Campylobacter jejuni and Helicobacter pullorum. We found numerous polysaccharide- and oligosaccharide-degrading enzymes encoded within the metagenome, some of which appeared to be part of polysaccharide utilization systems with genetic evidence for the co-ordination of polysaccharide degradation with sugar transport and utilization. The cecal metagenome encodes several fermentation pathways leading to the production of short-chain fatty acids, including some with novel features. We found a dozen uptake hydrogenases encoded in the metagenome and speculate that these provide major hydrogen sinks within this microbial community and might explain the high abundance of several genera within this microbiome, including Campylobacter, Helicobacter and Megamonas. PMID:24657972
Lin, Choun-Sea; Chen, Jeremy J W; Chiu, Chi-Chou; Hsiao, Han C W; Yang, Chen-Jui; Jin, Xiao-Hua; Leebens-Mack, James; de Pamphilis, Claude W; Huang, Yao-Ting; Yang, Ling-Hung; Chang, Wan-Jung; Kui, Ling; Wong, Gane Ka-Shu; Hu, Jer-Ming; Wang, Wen; Shih, Ming-Che
2017-06-01
The chloroplast NAD(P)H dehydrogenase-like (NDH) complex consists of about 30 subunits from both the nuclear and chloroplast genomes and is ubiquitous across most land plants. In some orchids, such as Phalaenopsis equestris, Dendrobium officinale and Dendrobium catenatum, most of the 11 chloroplast genome-encoded ndh genes (cp-ndh) have been lost. Here we investigated whether functional cp-ndh genes have been completely lost in these orchids or whether they have been transferred and retained in the nuclear genome. Further, we assessed whether both cp-ndh genes and nucleus-encoded NDH-related genes can be lost, resulting in the absence of the NDH complex. Comparative analyses of the genome of Apostasia odorata, an orchid species with a complete complement of cp-ndh genes which represents the sister lineage to all other orchids, and three published orchid genome sequences for P. equestris, D. officinale and D. catenatum, which are all missing cp-ndh genes, indicated that copies of cp-ndh genes are not present in any of these four nuclear genomes. This observation suggests that the NDH complex is not necessary for some plants. Comparative genomic/transcriptomic analyses of currently available plastid genome sequences and nuclear transcriptome data showed that 47 out of 660 photoautotrophic plants and all the heterotrophic plants are missing plastid-encoded cp-ndh genes and exhibit no evidence for maintenance of a functional NDH complex. Our data indicate that the NDH complex can be lost in photoautotrophic plant species. Further, the loss of the NDH complex may increase the probability of transition from a photoautotrophic to a heterotrophic life history. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
Sharma, Akanksha; Sharma, Niharika; Bhalla, Prem; Singh, Mohan
2017-01-01
Comparative genomics has facilitated the mining of biological information from a genome sequence, through the detection of similarities and differences with genomes of closely or more distantly related species. By using such comparative approaches, knowledge can be transferred from the model to non-model organisms and insights can be gained in the structural and evolutionary patterns of specific genes. In the absence of sequenced genomes for allergenic grasses, this study was aimed at understanding the structure, organisation and expression profiles of grass pollen allergens using the genomic data from Brachypodium distachyon as it is phylogenetically related to the allergenic grasses. Combining genomic data with the anther RNA-Seq dataset revealed 24 pollen allergen genes belonging to eight allergen groups mapping on the five chromosomes in B. distachyon. High levels of anther-specific expression profiles were observed for the 24 identified putative allergen-encoding genes in Brachypodium. The genomic evidence suggests that the gene encoding the group 5 allergen, the most potent trigger of hay fever and allergic asthma, originated as a pollen-specific orphan gene in a common grass ancestor of the Brachypodium and Triticeae clades. Gene structure analysis showed that the putative allergen-encoding genes in Brachypodium either lack introns or contain a reduced number of them. Promoter analysis of the identified Brachypodium genes revealed the presence of specific cis-regulatory sequences likely responsible for high anther/pollen-specific expression. With the identification of putative allergen-encoding genes in Brachypodium, this study has also described some important plant gene families (e.g. expansin superfamily, EF-Hand family, profilins etc) for the first time in the model plant Brachypodium. Altogether, the present study provides new insights into structural characterization and evolution of pollen allergens and will further serve as a base for their functional characterization in related grass species. PMID:28103252
Emerling, Christopher A.
2018-01-01
The end-Cretaceous extinction led to a massive faunal turnover, with placental mammals radiating in the wake of nonavian dinosaurs. Fossils indicate that Cretaceous stem placentals were generally insectivorous, whereas their earliest Cenozoic descendants occupied a variety of dietary niches. It is hypothesized that this dietary radiation resulted from the opening of niche space, following the extinction of dinosaurian carnivores and herbivores. We provide the first genomic evidence for the occurrence and timing of this dietary radiation in placental mammals. By comparing the genomes of 107 placental mammals, we robustly infer that chitinase genes (CHIAs), encoding enzymes capable of digesting insect exoskeletal chitin, were present as five functional copies in the ancestor of all placental mammals, and the number of functional CHIAs in the genomes of extant species positively correlates with the percentage of invertebrates in their diets. The diverse repertoire of CHIAs in early placental mammals corroborates fossil evidence of insectivory in Cretaceous eutherians, with descendant lineages repeatedly losing CHIAs beginning at the Cretaceous/Paleogene (K/Pg) boundary as they radiated into noninsectivorous niches. Furthermore, the timing of gene loss suggests that interordinal diversification of placental mammals in the Cretaceous predates the dietary radiation in the early Cenozoic, helping to reconcile a long-standing debate between molecular timetrees and the fossil record. Our results demonstrate that placental mammal genomes, including humans, retain a molecular record of the post-K/Pg placental adaptive radiation in the form of numerous chitinase pseudogenes. PMID:29774238
Salinero, Alicia C; Knoll, Elisabeth R; Zhu, Z Iris; Landsman, David; Curcio, M Joan; Morse, Randall H
2018-02-01
The Ty1 retrotransposons present in the genome of Saccharomyces cerevisiae belong to the large class of mobile genetic elements that replicate via an RNA intermediary and constitute a significant portion of most eukaryotic genomes. The retromobility of Ty1 is regulated by numerous host factors, including several subunits of the Mediator transcriptional co-activator complex. In spite of its known function in the nucleus, previous studies have implicated Mediator in the regulation of post-translational steps in Ty1 retromobility. To resolve this paradox, we systematically examined the effects of deleting non-essential Mediator subunits on the frequency of Ty1 retromobility and levels of retromobility intermediates. Our findings reveal that loss of distinct Mediator subunits alters Ty1 retromobility positively or negatively over a >10,000-fold range by regulating the ratio of an internal transcript, Ty1i, to the genomic Ty1 transcript. Ty1i RNA encodes a dominant negative inhibitor of Ty1 retromobility that blocks virus-like particle maturation and cDNA synthesis. These results resolve the conundrum of Mediator exerting sweeping control of Ty1 retromobility with only minor effects on the levels of Ty1 genomic RNA and the capsid protein, Gag. Since the majority of characterized intrinsic and extrinsic regulators of Ty1 retromobility do not appear to affect genomic Ty1 RNA levels, Mediator could play a central role in integrating signals that influence Ty1i expression to modulate retromobility.
Bentolila, Stéphane; Stefanov, Stefan
2012-01-01
Plant mitochondrial genomes have features that distinguish them radically from their animal counterparts: a high rate of rearrangement, of uptake and loss of DNA sequences, and an extremely low point mutation rate. Perhaps the most unique structural feature of plant mitochondrial DNAs is the presence of large repeated sequences involved in intramolecular and intermolecular recombination. In addition, rare recombination events can occur across shorter repeats, creating rearrangements that result in aberrant phenotypes, including pollen abortion, which is known as cytoplasmic male sterility (CMS). Using next-generation sequencing, we pyrosequenced two rice (Oryza sativa) mitochondrial genomes that belong to the indica subspecies. One genome is normal, while the other carries the wild abortive-CMS. We find that numerous rearrangements in the rice mitochondrial genome occur even between close cytotypes during rice evolution. Unlike maize (Zea mays), a closely related species also belonging to the grass family, integration of plastid sequences did not play a role in the sequence divergence between rice cytotypes. This study also uncovered an excellent candidate for the wild abortive-CMS-encoding gene; like most of the CMS-associated open reading frames that are known in other species, this candidate was created via a rearrangement, is chimeric in structure, possesses predicted transmembrane domains, and coopted the promoter of a genuine mitochondrial gene. Our data give new insights into rice mitochondrial evolution, correcting previous reports. PMID:22128137
The COG database: new developments in phylogenetic classification of proteins from complete genomes
Tatusov, Roman L.; Natale, Darren A.; Garkavtsev, Igor V.; Tatusova, Tatiana A.; Shankavaram, Uma T.; Rao, Bachoti S.; Kiryutin, Boris; Galperin, Michael Y.; Fedorova, Natalie D.; Koonin, Eugene V.
2001-01-01
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt at a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis. PMID:11125040
Complete genome sequence of a Watermelon silver mottle virus isolate from China.
Rao, Xueqin; Wu, Zhuyan; Li, Yuan
2013-06-01
The complete genome of a Watermelon silver mottle virus (WSMoV) (genus Tospovirus, family Bunyaviridae) isolate (WSMoV-GZ) from Guangdong province, China was sequenced. The genome of WSMoV-GZ contained 3,603, 4,909, and 8,914 nt of small (S), medium (M), and large (L) RNA segments, respectively, and had a genomic organization characteristic of members of the genus Tospovirus. The amino acid sequences of the nucleocapsid (N) protein, S RNA-encoded nonstructural (NSs) protein, M RNA-encoded nonstructural (NSm) protein, Gn/Gc glycoprotein precursor, and RNA-dependent RNA polymerase (RdRp) protein showed 94.3-97.5% identity with those of other WSMoV isolates. Phylogenetic analysis showed that the N protein of WSMoV-GZ clustered together with those of the WSMoV isolates. The full sequence of WSMoV-GZ provides a reference genome for comparison with other tospoviruses.
Genome of the opportunistic pathogen Streptococcus sanguinis.
Xu, Ping; Alves, Joao M; Kitten, Todd; Brown, Arunsri; Chen, Zhenming; Ozaki, Luiz S; Manque, Patricio; Ge, Xiuchun; Serrano, Myrna G; Puiu, Daniela; Hendricks, Stephanie; Wang, Yingping; Chaplin, Michael D; Akan, Doruk; Paik, Sehmi; Peterson, Darrell L; Macrina, Francis L; Buck, Gregory A
2007-04-01
The genome of Streptococcus sanguinis is a circular DNA molecule consisting of 2,388,435 bp and is 177 to 590 kb larger than the other 21 streptococcal genomes that have been sequenced. The G+C content of the S. sanguinis genome is 43.4%, which is considerably higher than the G+C contents of other streptococci. The genome encodes 2,274 predicted proteins, 61 tRNAs, and four rRNA operons. A 70-kb region encoding pathways for vitamin B(12) biosynthesis and degradation of ethanolamine and propanediol was apparently acquired by horizontal gene transfer. The gene complement suggests new hypotheses for the pathogenesis and virulence of S. sanguinis and differs from the gene complements of other pathogenic and nonpathogenic streptococci. In particular, S. sanguinis possesses a remarkable abundance of putative surface proteins, which may permit it to be a primary colonizer of the oral cavity and agent of streptococcal endocarditis and infection in neutropenic patients.
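The G+C content quoted above is simply the percentage of guanine and cytosine bases among all bases in a sequence. A minimal Python sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions and do not come from the S. sanguinis study.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N or whitespace, is ignored.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Hypothetical toy sequence, for illustration only (not real genome data).
toy_sequence = "ATGCGCTTAA"
print(f"G+C content: {gc_content(toy_sequence):.1f}%")
```

Applied to a full genome, the same function would be called on the concatenated chromosome sequence read from a FASTA file.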
Defense Islands in Bacterial and Archaeal Genomes and Prediction of Novel Defense Systems
Makarova, Kira S.; Wolf, Yuri I.; Snir, Sagi; Koonin, Eugene V.
2011-01-01
The arms race between cellular life forms and viruses is a major driving force of evolution. A substantial fraction of bacterial and archaeal genomes is dedicated to antivirus defense. We analyzed the distribution of defense genes and typical mobilome components (such as viral and transposon genes) in bacterial and archaeal genomes and demonstrated statistically significant clustering of antivirus defense systems and mobile genes and elements in genomic islands. The defense islands are enriched in putative operons and contain numerous overrepresented gene families. A detailed sequence analysis of the proteins encoded by genes in these families shows that many of them are diverged variants of known defense system components, whereas others show features, such as characteristic operonic organization, that are suggestive of novel defense systems. Thus, genomic islands provide abundant material for the experimental study of bacterial and archaeal antivirus defense. Except for the CRISPR-Cas systems, different classes of defense systems, in particular toxin-antitoxin and restriction-modification systems, show nonrandom clustering in defense islands. It remains unclear to what extent these associations reflect functional cooperation between different defense systems and to what extent the islands are genomic “sinks” that accumulate diverse nonessential genes, particularly those acquired via horizontal gene transfer. The characteristics of defense islands resemble those of mobilome islands. Defense and mobilome genes are nonrandomly associated in islands, suggesting nonadaptive evolution of the islands via a preferential attachment-like mechanism underpinned by the addictive properties of defense systems such as toxins-antitoxins and an important role of horizontal mobility in the evolution of these islands. PMID:21908672
Defense islands in bacterial and archaeal genomes and prediction of novel defense systems.
Makarova, Kira S; Wolf, Yuri I; Snir, Sagi; Koonin, Eugene V
2011-11-01
The arms race between cellular life forms and viruses is a major driving force of evolution. A substantial fraction of bacterial and archaeal genomes is dedicated to antivirus defense. We analyzed the distribution of defense genes and typical mobilome components (such as viral and transposon genes) in bacterial and archaeal genomes and demonstrated statistically significant clustering of antivirus defense systems and mobile genes and elements in genomic islands. The defense islands are enriched in putative operons and contain numerous overrepresented gene families. A detailed sequence analysis of the proteins encoded by genes in these families shows that many of them are diverged variants of known defense system components, whereas others show features, such as characteristic operonic organization, that are suggestive of novel defense systems. Thus, genomic islands provide abundant material for the experimental study of bacterial and archaeal antivirus defense. Except for the CRISPR-Cas systems, different classes of defense systems, in particular toxin-antitoxin and restriction-modification systems, show nonrandom clustering in defense islands. It remains unclear to what extent these associations reflect functional cooperation between different defense systems and to what extent the islands are genomic "sinks" that accumulate diverse nonessential genes, particularly those acquired via horizontal gene transfer. The characteristics of defense islands resemble those of mobilome islands. Defense and mobilome genes are nonrandomly associated in islands, suggesting nonadaptive evolution of the islands via a preferential attachment-like mechanism underpinned by the addictive properties of defense systems such as toxins-antitoxins and an important role of horizontal mobility in the evolution of these islands.
Constructs and methods for genome editing and genetic engineering of fungi and protists
Hittinger, Christopher Todd; Alexander, William Gerald
2018-01-30
Provided herein are constructs for genome editing or genetic engineering in fungi or protists, methods of using the constructs and media for use in selecting cells. The constructs include a polynucleotide encoding a thymidine kinase operably connected to a promoter, suitably a constitutive promoter; a polynucleotide encoding an endonuclease operably connected to an inducible promoter; and a recognition site for the endonuclease. The constructs may also include selectable markers for use in selecting recombinations.
Graur, Dan; Zheng, Yichen; Price, Nicholas; Azevedo, Ricardo B R; Zufall, Rebecca A; Elhaik, Eran
2013-01-01
A recent slew of ENCyclopedia Of DNA Elements (ENCODE) Consortium publications, specifically the article signed by all Consortium members, put forward the idea that more than 80% of the human genome is functional. This claim flies in the face of current estimates according to which the fraction of the genome that is evolutionarily conserved through purifying selection is less than 10%. Thus, according to the ENCODE Consortium, a biological function can be maintained indefinitely without selection, which implies that at least 80 - 10 = 70% of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these "functional" regions or because no mutation in these regions can ever be deleterious. This absurd conclusion was reached through various means, chiefly by employing the seldom used "causal role" definition of biological function and then applying it inconsistently to different biochemical properties, by committing a logical fallacy known as "affirming the consequent," by failing to appreciate the crucial difference between "junk DNA" and "garbage DNA," by using analytical methods that yield biased errors and inflate estimates of functionality, by favoring statistical sensitivity over specificity, and by emphasizing statistical significance rather than the magnitude of the effect. Here, we detail the many logical and methodological transgressions involved in assigning functionality to almost every nucleotide in the human genome. The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten.
Hammarlöf, Disa L; Canals, Rocío; Hinton, Jay C D
2013-10-01
The availability of thousands of genome sequences of bacterial pathogens poses a particular challenge because each genome contains hundreds of genes of unknown function (FUN). How can we easily discover which FUN genes encode important virulence factors? One solution is to combine two different functional genomic approaches. First, transcriptomics identifies bacterial FUN genes that show differential expression during the process of mammalian infection. Second, global mutagenesis identifies individual FUN genes that the pathogen requires to cause disease. The intersection of these datasets can reveal a small set of candidate genes most likely to encode novel virulence attributes. We demonstrate this approach with the Salmonella infection model, and propose that a similar strategy could be used for other bacterial pathogens. Copyright © 2013 Elsevier Ltd. All rights reserved.
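The intersection step described above can be sketched as a set operation. This is a minimal illustration in Python, not the authors' pipeline; the function name and the gene identifiers are hypothetical placeholders.

```python
def candidate_virulence_genes(differentially_expressed, required_for_infection):
    """Return genes found in both datasets, sorted by name.

    differentially_expressed: FUN genes flagged by transcriptomics.
    required_for_infection:   FUN genes flagged by global mutagenesis.
    """
    return sorted(set(differentially_expressed) & set(required_for_infection))


# Hypothetical FUN gene identifiers, for illustration only.
transcriptomics_hits = {"funA", "funB", "funC", "funD"}
mutagenesis_hits = {"funB", "funD", "funE"}

candidates = candidate_virulence_genes(transcriptomics_hits, mutagenesis_hits)
print(candidates)
```

Only genes supported by both lines of evidence survive the intersection, which is why the resulting candidate list is expected to be small.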
Investigating the Genome Diversity of B. cereus and Evolutionary Aspects of B. anthracis Emergence
Papazisi, Leka; Rasko, David A.; Ratnayake, Shashikala; Bock, Geoff R.; Remortel, Brian G.; Appalla, Lakshmi; Liu, Jia; Dracheva, Tatiana; Braisted, John C.; Shallom, Shamira; Jarrahi, Benham; Snesrud, Erik; Ahn, Susie; Sun, Qiang; Rilstone, Jenifer; Økstad, Ole Andreas; Kolstø, Anne-Brit; Fleischmann, Robert D.; Peterson, Scott N.
2011-01-01
Here we report the use of a multi-genome DNA microarray to investigate the genome diversity of Bacillus cereus group members and elucidate the events associated with the emergence of B. anthracis, the causative agent of anthrax, a lethal zoonotic disease. We initially performed directed genome sequencing of seven diverse B. cereus strains to identify novel sequences encoded in those genomes. The novel genes identified, combined with those publicly available, allowed the design of a “species” DNA microarray. Comparative genomic hybridization analyses of 41 strains indicate that substantial heterogeneity exists with respect to the genes comprising functional role categories. While the acquisition of the plasmid-encoded pathogenicity island (pXO1) and capsule genes (pXO2) represents a crucial landmark dictating the emergence of B. anthracis, the evolution of this species and its close relatives was associated with an overall shift in the fraction of genes devoted to energy metabolism, cellular processes, transport, as well as virulence. PMID:21447378
Genome dynamics and its impact on evolution of Escherichia coli.
Dobrindt, Ulrich; Chowdary, M Geddam; Krumbholz, G; Hacker, J
2010-08-01
The Escherichia coli genome consists of a conserved part, the so-called core genome, which encodes essential cellular functions, and of a flexible, strain-specific part. Genes that belong to the flexible genome code for factors involved in bacterial fitness and adaptation to different environments. Adaptation includes increase in fitness and colonization capacity. Pathogenic as well as non-pathogenic bacteria carry mobile and accessory genetic elements such as plasmids, bacteriophages, genomic islands and others, which code for functions required for proper adaptation. Escherichia coli is a very good example to study the interdependency of genome architecture and lifestyle of bacteria. This species includes pathogenic variants as well as commensal bacteria adapted to different host organisms. In Escherichia coli, various genetic elements encode pathogenicity factors as well as factors that increase the fitness of non-pathogenic bacteria. The processes of genome dynamics, such as gene transfer, genome reduction, rearrangements, and point mutations, contribute to the adaptation of the bacteria to particular environments. Using Escherichia coli model organisms, such as uropathogenic strain 536 or commensal strain Nissle 1917, we studied mechanisms of genome dynamics and discuss these processes in the light of the evolution of microbes.
A robust TALENs system for highly efficient mammalian genome editing.
Feng, Yuanxi; Zhang, Siliang; Huang, Xin
2014-01-10
Recently, transcription activator-like effector nucleases (TALENs) have emerged as a highly effective tool for genomic editing. A pair of TALENs binds to two DNA recognition sites separated by a spacer sequence, and the dimerized FokI nucleases at the C terminus then cleave DNA in the spacer. Because of its modular design and capacity to precisely target almost any desired genomic locus, TALEN is a technology that can revolutionize the entire biomedical research field. Currently, for genomic editing in cultured cells, two plasmids encoding a pair of TALENs are co-transfected, followed by limiting dilution to isolate cell colonies with the intended genomic manipulation. However, uncertain transfection efficiency becomes a bottleneck, especially in hard-to-transfect cells, reducing the overall efficiency of genome editing. We have developed a robust TALENs system in which each TALEN plasmid also encodes a fluorescent protein. Thus, cells transfected with both TALEN plasmids, a prerequisite for genomic editing, can be isolated by fluorescence-activated cell sorting. Our improved TALENs system can be applied to all cultured cells to achieve highly efficient genomic editing. Furthermore, an optimized procedure for genomic editing using TALENs is also presented. We expect our system to be widely adopted by the scientific community.
Phages and the Evolution of Bacterial Pathogens: from Genomic Rearrangements to Lysogenic Conversion
Brüssow, Harald; Canchaya, Carlos; Hardt, Wolf-Dietrich
2004-01-01
Comparative genomics demonstrated that the chromosomes from bacteria and their viruses (bacteriophages) are coevolving. This process is most evident for bacterial pathogens where the majority contain prophages or phage remnants integrated into the bacterial DNA. Many prophages from bacterial pathogens encode virulence factors. Two situations can be distinguished: Vibrio cholerae, Shiga toxin-producing Escherichia coli, Corynebacterium diphtheriae, and Clostridium botulinum depend on a specific prophage-encoded toxin for causing a specific disease, whereas Staphylococcus aureus, Streptococcus pyogenes, and Salmonella enterica serovar Typhimurium harbor a multitude of prophages and each phage-encoded virulence or fitness factor makes an incremental contribution to the fitness of the lysogen. These prophages behave like “swarms” of related prophages. Prophage diversification seems to be fueled by the frequent transfer of phage material by recombination with superinfecting phages, resident prophages, or occasional acquisition of other mobile DNA elements or bacterial chromosomal genes. Prophages also contribute to the diversification of the bacterial genome architecture. In many cases, they actually represent a large fraction of the strain-specific DNA sequences. In addition, they can serve as anchoring points for genome inversions. The current review presents the available genomics and biological data on prophages from bacterial pathogens in an evolutionary framework. PMID:15353570
Blumer-Schuette, Sara E.; Giannone, Richard J.; Zurawski, Jeffrey V.; Ozdemir, Inci; Ma, Qin; Yin, Yanbin; Xu, Ying; Kataeva, Irina; Poole, Farris L.; Adams, Michael W. W.; Hamilton-Brehm, Scott D.; Elkins, James G.; Larimer, Frank W.; Land, Miriam L.; Hauser, Loren J.; Cottingham, Robert W.; Hettich, Robert L.
2012-01-01
Extremely thermophilic bacteria of the genus Caldicellulosiruptor utilize carbohydrate components of plant cell walls, including cellulose and hemicellulose, facilitated by a diverse set of glycoside hydrolases (GHs). From a biofuel perspective, this capability is crucial for deconstruction of plant biomass into fermentable sugars. While all species from the genus grow on xylan and acid-pretreated switchgrass, growth on crystalline cellulose is variable. The basis for this variability was examined using microbiological, genomic, and proteomic analyses of eight globally diverse Caldicellulosiruptor species. The open Caldicellulosiruptor pangenome (4,009 open reading frames [ORFs]) encodes 106 GHs, representing 43 GH families, but only 26 GHs from 17 families are included in the core (noncellulosic) genome (1,543 ORFs). Differentiating the strongly cellulolytic Caldicellulosiruptor species from the others is a specific genomic locus that encodes multidomain cellulases from GH families 9 and 48, which are associated with cellulose-binding modules. This locus also encodes a novel adhesin associated with type IV pili, which was identified in the exoproteome bound to crystalline cellulose. Taking into account the core genomes, pangenomes, and individual genomes, the ancestral Caldicellulosiruptor was likely cellulolytic and evolved, in some cases, into species that lost the ability to degrade crystalline cellulose while maintaining the capacity to hydrolyze amorphous cellulose and hemicellulose. PMID:22636774
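The pangenome and core-genome counts quoted above follow from simple set operations once each genome has been reduced to a set of gene families. The sketch below is a simplification under stated assumptions: the species names and family labels are hypothetical stand-ins, and the orthology clustering that a real analysis needs before this step is taken as already done.

```python
def pan_and_core(genomes):
    """Return (pangenome, core genome) as sets of gene-family identifiers.

    `genomes` maps a genome name to an iterable of gene-family identifiers.
    The pangenome is the union of all sets; the core genome is their intersection.
    """
    gene_sets = [set(families) for families in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Hypothetical toy input: three genomes described by glycoside hydrolase families.
toy_genomes = {
    "species_A": ["GH5", "GH10", "GH9", "GH48"],
    "species_B": ["GH5", "GH10"],
    "species_C": ["GH5", "GH10", "GH43"],
}

pan, core = pan_and_core(toy_genomes)
print(len(pan), sorted(core))
```

Families present in the pangenome but absent from the core (here the illustrative GH9 and GH48 entries) are the ones that can distinguish one group of species from another.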
A draft genome sequence of “Candidatus Liberibacter asiaticus” from California, USA
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain HHCA, collected from a lemon tree in California, USA, is reported. The HHCA strain has a genome size of 1,118,244 bp, with G+C content of 36.6%. The HHCA genome encodes 1,191 predicted open reading frames and 51 RNA genes....
USDA-ARS's Scientific Manuscript database
Huanglongbing (HLB) is presently the most devastating citrus disease worldwide. As an intracellular plant pathogen and insect symbiont, the HLB bacterium, ‘Candidatus Liberibacter asiaticus’ (Las) retains the entire flagellum-encoding gene cluster in its significantly reduced genome. Las encodes a...
Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.
2016-01-01
The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667
De Maayer, Pieter; Chan, Wai Yin; Rubagotti, Enrico; Venter, Stephanus N; Toth, Ian K; Birch, Paul R J; Coutinho, Teresa A
2014-05-27
Pantoea ananatis is found in a wide range of natural environments, including water, soil, as part of the epi- and endophytic flora of various plant hosts, and in the insect gut. Some strains have proven effective as biological control agents and plant-growth promoters, while other strains have been implicated in diseases of a broad range of plant hosts and humans. By analysing the pan-genome of eight sequenced P. ananatis strains isolated from different sources we identified factors potentially underlying its ability to colonize and interact with hosts in both the plant and animal Kingdoms. The pan-genome of the eight compared P. ananatis strains consisted of a core genome comprising 3,876 protein coding sequences (CDSs) and a sizeable accessory genome consisting of 1,690 CDSs. We estimate that ~106 unique CDSs would be added to the pan-genome with each additional P. ananatis genome sequenced in the future. The accessory fraction is derived mainly from integrated prophages and codes mostly for proteins of unknown function. Comparison of the translated CDSs on the P. ananatis pan-genome with the proteins encoded on all sequenced bacterial genomes currently available revealed that P. ananatis carries a number of CDSs with orthologs restricted to bacteria associated with distinct hosts, namely plant-, animal- and insect-associated bacteria. These CDSs encode proteins with putative roles in transport and metabolism of carbohydrate and amino acid substrates, adherence to host tissues, protection against plant and animal defense mechanisms and the biosynthesis of potential pathogenicity determinants including insecticidal peptides, phytotoxins and type VI secretion system effectors. P. ananatis has an 'open' pan-genome typical of bacterial species that colonize several different environments. The pan-genome incorporates a large number of genes encoding proteins that may enable P. ananatis to colonize, persist in and potentially cause disease symptoms in a wide range of plant and animal hosts.
Toward a Better Compression for DNA Sequences Using Huffman Encoding
Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi
2017-01-01
Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and can be transmitted faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016). PMID:27960065
Toward a Better Compression for DNA Sequences Using Huffman Encoding.
Al-Okaily, Anas; Almarri, Badar; Al Yami, Sultan; Huang, Chun-Hsi
2017-04-01
Due to the significant amount of DNA data that are being generated by next-generation sequencing machines for genomes of lengths ranging from megabases to gigabases, there is an increasing need to compress such data so that they occupy less space and can be transmitted faster. Different implementations of Huffman encoding incorporating the characteristics of DNA sequences prove to better compress DNA data. These implementations center on the concepts of selecting frequent repeats so as to force a skewed Huffman tree, as well as the construction of multiple Huffman trees when encoding. The implementations demonstrate improvements on the compression ratios for five genomes with lengths ranging from 5 to 50 Mbp, compared with the standard Huffman tree algorithm. The research hence suggests an improvement on all such DNA sequence compression algorithms that use the conventional Huffman encoding. Accompanying software is publicly available (AL-Okaily, 2016).
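For readers unfamiliar with the baseline, the standard Huffman tree algorithm that the abstract compares against can be sketched as follows. This is the conventional algorithm only; it does not implement the authors' repeat-selection or multiple-tree variants, and the function names and the toy sequence are assumptions made for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(sequence):
    """Build a standard Huffman code table {symbol: bit string} for `sequence`."""
    freq = Counter(sequence)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique integer tie_breaker keeps dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code beneath them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the code of every symbol in `sequence`."""
    return "".join(codes[base] for base in sequence)


def decode(bits, codes):
    """Invert `encode`; relies on Huffman codes being prefix-free."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


# Toy sequence with a skewed base composition (illustrative only).
dna = "AAAAAAAACCCCGGT"
codes = huffman_codes(dna)
bits = encode(dna, codes)
print(len(bits), "bits versus", 2 * len(dna), "bits at a fixed 2 bits per base")
```

The gain over a fixed two-bit code comes entirely from skew in symbol frequencies, which is presumably why the paper's variants aim to force a more skewed tree.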
RPO41-independent maintenance of [rho-] mitochondrial DNA in Saccharomyces cerevisiae.
Fangman, W L; Henly, J W; Brewer, B J
1990-01-01
A subset of promoters in the mitochondrial DNA (mtDNA) of the yeast Saccharomyces cerevisiae has been proposed to participate in replication initiation, giving rise to a primer through site-specific cleavage of an RNA transcript. To test whether transcription is essential for mtDNA maintenance, we examined two simple mtDNA deletion ([rho-]) genomes in yeast cells. One genome (HS3324) contains a consensus promoter (ATATAAGTA) for the mitochondrial RNA polymerase encoded by the nuclear gene RPO41, and the other genome (4a) does not. As anticipated, in RPO41 cells transcripts from the HS3324 genome were more abundant than were transcripts from the 4a genome. When the RPO41 gene was disrupted, both [rho-] genomes were efficiently maintained. The level of transcripts from HS3324 mtDNA was decreased more than 400-fold in cells carrying the disrupted RPO41 gene; however, the low-level transcripts from 4a mtDNA were undiminished. These results indicate that replication of [rho-] genomes can be initiated in the absence of wild-type levels of the RPO41-encoded RNA polymerase.
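Whether a given sequence carries the consensus promoter named in this abstract (ATATAAGTA) can be checked with a simple exact-match scan of both strands. The sketch below assumes exact matching only; the function names and the two test sequences are hypothetical and are not the HS3324 or 4a genomes.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif="ATATAAGTA"):
    """Return sorted (position, strand) pairs for exact motif matches.

    Positions are 0-based offsets on the given strand; '-' hits are reported
    where the motif's reverse complement occurs in `seq`.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


# Hypothetical sequences, for illustration only.
with_promoter = "GGATATAAGTACC"
reverse_strand = "CCTACTTATATGG"
without_promoter = "GGCCGGCCGGCC"

print(find_motif(with_promoter), find_motif(reverse_strand), find_motif(without_promoter))
```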
Draft Map of Human Proteome Published | Office of Cancer Clinical Proteomics Research
In a recently published article in the journal Nature, researchers have developed a draft map of the human proteome. Striving for the protein equivalent of the Human Genome Project, an international team of researchers has created an initial catalog of the human proteome. In total, using 30 different human tissues, the researchers identified proteins encoded by 17,294 genes, which is approximately 84 percent of all of the genes in the human genome predicted to encode proteins.
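The two figures in this summary imply a total count of predicted protein-coding genes, which the text does not state. A rough back-calculation, assuming the "approximately 84 percent" is close to exact, looks like this; the variable names are illustrative and the result is only as precise as the rounded percentage.

```python
# Figures quoted in the summary above.
genes_with_detected_protein = 17_294
fraction_of_predicted_genes = 0.84  # "approximately 84 percent"

# Implied number of genes predicted to encode proteins (an estimate, not a reported value).
implied_total = round(genes_with_detected_protein / fraction_of_predicted_genes)
print(implied_total)
```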
Bhattacharya, D; Surek, B; Rüsing, M; Damberger, S; Melkonian, M
1994-01-01
Group I introns are found in organellar genomes, in the genomes of eubacteria and phages, and in nuclear-encoded rRNAs. The origin and distribution of nuclear-encoded rRNA group I introns are not understood. To elucidate their evolutionary relationships, we analyzed diverse nuclear-encoded small-subunit rRNA group I introns including nine sequences from the green-algal order Zygnematales (Charophyceae). Phylogenetic analyses of group I introns and rRNA coding regions suggest that lateral transfers have occurred in the evolutionary history of group I introns and that, after transfer, some of these elements may form stable components of the host-cell nuclear genomes. The Zygnematales introns, which share a common insertion site (position 1506 relative to the Escherichia coli small-subunit rRNA), form one subfamily of group I introns that has, after its origin, been inherited through common ancestry. Since the first Zygnematales appear in the middle Devonian within the fossil record, the "1506" group I intron presumably has been a stable component of the Zygnematales small-subunit rRNA coding region for 350-400 million years. PMID:7937917
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.
Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J
1999-01-01
Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.
2009-01-01
We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L
2008-03-01
We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
Youssef, Noha H; Blainey, Paul C; Quake, Stephen R; Elshahed, Mostafa S
2011-11-01
Members of candidate division OP11 are widely distributed in terrestrial and marine ecosystems, yet little information regarding their metabolic capabilities and ecological role within such habitats is currently available. Here, we report on the microfluidic isolation, multiple-displacement-amplification, pyrosequencing, and genomic analysis of a single cell (ZG1) belonging to candidate division OP11. Genome analysis of the ∼270-kb partial genome assembly obtained showed that it had no particular similarity to a specific phylum. Four hundred twenty-three open reading frames were identified, 46% of which had no function prediction. In-depth analysis revealed a heterotrophic lifestyle, with genes encoding endoglucanase, amylopullulanase, and laccase enzymes, suggesting a capacity for utilization of cellulose, starch, and, potentially, lignin, respectively. Genes encoding several glycolysis enzymes as well as formate utilization were identified, but no evidence for an electron transport chain was found. The presence of genes encoding various components of lipopolysaccharide biosynthesis indicates a Gram-negative bacterial cell wall. The partial genome also provides evidence for antibiotic resistance (β-lactamase, aminoglycoside phosphotransferase), as well as antibiotic production (bacteriocin) and extracellular bactericidal peptidases. Multiple mechanisms for stress response were identified, as were elements of type I and type IV secretion systems. Finally, housekeeping genes identified within the partial genome were used to demonstrate the OP11 affiliation of multiple hitherto unclassified genomic fragments from multiple database-deposited metagenomic data sets. These results provide the first glimpse into the lifestyle of a member of a ubiquitous, yet poorly understood bacterial candidate division.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martinez, Antonio D.; Berka, Randy; Henrissat, Bernard
2008-05-01
A major thrust of the white biotechnology movement involves the development of enzyme systems which depolymerize biomass to simple sugars which are subsequently converted to sustainable biofuels (e.g., ethanol) and chemical intermediates. The fungus Trichoderma reesei (syn. Hypocrea jecorina) represents a paradigm for the industrial production of highly efficient cellulases and hemicellulases needed for hydrolysis of biomass polysaccharides. Herein we describe intriguing attributes of the T. reesei genome in relation to the future of fuel biotechnology. The T. reesei genome sequence was derived using a whole genome shotgun approach combined with finishing work to generate an assembly comprising 89 scaffolds totaling 34 Mbp with few gaps. In total, 9,130 gene models were predicted using a combination of ab initio and sequence similarity-based methods and EST data. Considering the industrial utility and effectiveness of its enzymes, the T. reesei genome surprisingly encodes the fewest cellulases and hemicellulases of any fungus having the ability to hydrolyze plant cell wall polysaccharides and whose genome has been sequenced. Many genes encoding carbohydrate active enzymes are distributed non-randomly in groups or clusters that interestingly lie between regions of synteny with other Sordariomycetes. Additionally, the T. reesei genome contains a multitude of genes encoding biosynthetic pathways for secondary metabolites (possible antibacterial and antifungal compounds) which may promote successful competition and survival in the crowded and competitive soil habitat occupied by T. reesei. Our analysis coupled with the availability of genome sequence data provides a roadmap for construction of enhanced T. reesei strains for industrial applications.
Music, Nedzad; Gagnon, Carl A
2010-12-01
Porcine reproductive and respiratory syndrome (PRRS) is an economically devastating viral disease affecting the swine industry worldwide. The etiological agent, PRRS virus (PRRSV), possesses an RNA genome with nine open reading frames (ORFs). The ORF1a and ORF1b replicase-associated genes encode the polyproteins pp1a and pp1ab, respectively. pp1a is processed into nine non-structural proteins (nsps): nsp1α, nsp1β, and nsp2 to nsp8. Proteolytic cleavage of pp1ab generates products nsp9 to nsp12. The proteolytic pp1a cleavage products process and cleave pp1a and pp1ab into nsp products. The nsp9 to nsp12 are involved in virus genome transcription and replication. The 3' end of the viral genome encodes four minor and three major structural proteins. GP₂ₐ, GP₃ and GP₄ (encoded by ORF2a, 3 and 4) are glycosylated membrane-associated minor structural proteins. The fourth minor structural protein, the E protein (encoded by ORF2b), is an unglycosylated membrane-associated protein. The viral envelope contains two major structural proteins: a glycosylated major envelope protein GP₅ (encoded by ORF5) and an unglycosylated membrane M protein (encoded by ORF6). The third major structural protein is the nucleocapsid N protein (encoded by ORF7). All PRRSV non-structural and structural proteins are essential for virus replication, and PRRSV infectivity is relatively intolerant to subtle changes within the structural proteins. PRRSV virulence is multigenic and resides in both the non-structural and structural viral proteins. This review discusses the molecular characteristics and the biological and immunological functions of the PRRSV structural proteins and nsps, and their involvement in virus pathogenesis.
A highly divergent gene cluster in honey bees encodes a novel silk family.
Sutherland, Tara D; Campbell, Peter M; Weisman, Sarah; Trueman, Holly E; Sriskantha, Alagacone; Wanjura, Wolfgang J; Haritos, Victoria S
2006-11-01
The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1-4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-rich amid low GC intergenic regions. The genes encode similar proteins that are highly helical and predicted to form unusually tight coiled coils. Despite the similarity in size, structure, and composition of the encoded proteins, the genes have low primary sequence identity. We propose that the four fiber genes have arisen from gene duplication events but have subsequently diverged significantly. The silk-associated genes encode proteins likely to act as a glue (AmelSA1) and involved in silk processing (AmelSA2). Although the silks of honey bees and silkmoths both originate in larval labial glands, the silk proteins are completely different in their primary, secondary, and tertiary structures as well as the genomic arrangement of the genes encoding them. This implies independent evolutionary origins for these functionally related proteins.
Draft Genome Sequence of Mycobacterium asiaticum Strain DSM 44297.
Croce, Olivier; Robert, Catherine; Raoult, Didier; Drancourt, Michel
2014-04-17
We report the draft genome sequence of Mycobacterium asiaticum strain DSM 44297, a tropical mycobacterium seldom responsible for human infection. The genome of M. asiaticum has a size of 5,935,986 bp, with a 66.03% G+C content, encoding 5,591 proteins and 81 RNAs.
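Summary statistics such as the genome size and G+C content reported here are computed directly from the assembled sequence. The following sketch shows one common way to do it, assuming that ambiguous bases such as N are excluded from the denominator; the function name and the toy contigs are illustrative, not the M. asiaticum assembly.

```python
def gc_content(seq):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        return 0.0
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical toy contigs standing in for a draft assembly.
contigs = ["GGCCGGAT", "GCGCNNAT"]
genome_size = sum(len(contig) for contig in contigs)
overall_gc = gc_content("".join(contigs))
print(genome_size, round(overall_gc, 2))
```

Conventions differ between tools on how ambiguous bases are handled, so small discrepancies in the second decimal place between pipelines are not unusual.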
Ross, Daniel E.; Gulliver, Djuna
2016-10-06
The draft genome sequence of Pseudomonas stutzeri strain K35 was separated from a metagenome derived from a produced water microbial community of a coalbed methane well. The genome encodes a complete nitrogen fixation pathway and the upper and lower naphthalene degradation pathways.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ross, Daniel E.; Gulliver, Djuna
The draft genome sequence of Pseudomonas stutzeri strain K35 was separated from a metagenome derived from a produced water microbial community of a coalbed methane well. The genome encodes a complete nitrogen fixation pathway and the upper and lower naphthalene degradation pathways.
Genome Sequence of Enterohemorrhagic Escherichia coli NCCP15658
Song, Ju Yeon; Yoo, Ran Hee; Jang, Song Yee; Seong, Won-Keun; Kim, Seon-Young; Jeong, Haeyoung; Kang, Sung Gyun; Kim, Byung Kwon; Kwon, Soon-Kyeong; Lee, Choong Hoon; Yu, Dong Su; Park, Mi-Sun
2012-01-01
Enterohemorrhagic Escherichia coli causes severe food-borne disease in the guts of humans and animals. Here, we report the high-quality draft genome sequence of E. coli NCCP15658 isolated from a patient in the Republic of Korea. Its genome size was determined to be 5.46 Mb, and its genomic features, including genes encoding virulence factors, were analyzed. PMID:22740673
USDA-ARS's Scientific Manuscript database
In previous work, we reported on the isolation and genome sequence analysis of Bacillus cereus strain tsu1 (NCBI accession number JPYN00000000). The 36 scaffolds in the assembled tsu1 genome were all aligned with the B. cereus B4264 genome, with variations. Genes encoding xylanase and cellulase and the...
Nishimura, Yuki; Kamikawa, Ryoma; Hashimoto, Tetsuo; Inagaki, Yuji
2014-01-01
Mitochondrial (mt) genome sequences, which often bear introns, have been sampled from phylogenetically diverse eukaryotes. Thus, we can anticipate novel insights into intron evolution from previously unstudied mt genomes. We here investigated the origins and evolution of three introns in the mt genome of the haptophyte Chrysochromulina sp. NIES-1333, which was sequenced completely in this study. All the three introns were characterized as group II, on the basis of predicted secondary structure, and the conserved sequence motifs at the 5′ and 3′ termini. Our comparative studies on diverse mt genomes prompt us to propose that the Chrysochromulina mt genome laterally acquired the introns from mt genomes in distantly related eukaryotes. Many group II introns harbor intronic open reading frames for the proteins (intron-encoded proteins or IEPs), which likely facilitate the splicing of their host introns. However, we propose that a “free-standing,” IEP-like protein, which is not encoded within any introns in the Chrysochromulina mt genome, is involved in the splicing of the first cox1 intron that lacks any open reading frames. PMID:25054084
Hidden weapons of microbial destruction in plant genomes
Manners, John M
2007-01-01
Recent bioinformatic analyses of sequenced plant genomes reveal a previously unrecognized abundance of genes encoding antimicrobial cysteine-rich peptides, representing a formidable and dynamic defense arsenal against plant pests and pathogens. PMID:17903311
Lozano, Roberto; Ponce, Olga; Ramirez, Manuel; Mostajo, Nelly; Orjeda, Gisella
2012-01-01
The majority of disease resistance (R) genes identified to date in plants encode a nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domain containing protein. Additional domains such as coiled-coil (CC) and TOLL/interleukin-1 receptor (TIR) domains can also be present. In the recently sequenced Solanum tuberosum group phureja genome we used HMM models and manual curation to annotate 435 NBS-encoding R gene homologs and 142 NBS-derived genes that lack the NBS domain. Highly similar homologs for most previously documented Solanaceae R genes were identified. A surprising ∼41% (179) of the 435 NBS-encoding genes are pseudogenes primarily caused by premature stop codons or frameshift mutations. Alignment of 81.80% of the 577 homologs to S. tuberosum group phureja pseudomolecules revealed non-random distribution of the R-genes; 362 of 470 genes were found in high density clusters on 11 chromosomes. PMID:22493716
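The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are chosen here for illustration.

```python
# Counts taken directly from the abstract above.
nbs_encoding = 435   # NBS-encoding R gene homologs
nbs_derived = 142    # NBS-derived genes that lack the NBS domain
pseudogenes = 179    # NBS-encoding genes annotated as pseudogenes

# The two annotated classes together give the total number of homologs.
total_homologs = nbs_encoding + nbs_derived

# Share of NBS-encoding genes that are pseudogenes, as a percentage.
pseudogene_pct = 100.0 * pseudogenes / nbs_encoding

print(total_homologs)            # 577
print(round(pseudogene_pct, 1))  # 41.1
```

Both results agree with the abstract: 577 homologs in total, and roughly 41% of the NBS-encoding genes being pseudogenes.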
Tuning of RNA editing by ADAR is required in Drosophila
Keegan, Liam P; Brindle, James; Gallo, Angela; Leroy, Anne; Reenan, Robert A; O'Connell, Mary A
2005-01-01
RNA editing increases during development in more than 20 transcripts encoding proteins involved in rapid synaptic neurotransmission in Drosophila central nervous system and muscle. Adar (adenosine deaminase acting on RNA) mutant flies expressing only genome-encoded, unedited isoforms of ion-channel subunits are viable but show severe locomotion defects. The Adar transcript itself is edited in adult wild-type flies to generate an isoform with a serine to glycine substitution close to the ADAR active site. We show that editing restricts ADAR function since the edited isoform of ADAR is less active in vitro and in vivo than the genome-encoded, unedited isoform. Ubiquitous expression in embryos and larvae of an Adar transcript that is resistant to editing is lethal. Expression of this transcript in embryonic muscle is also lethal, with above-normal, adult-like levels of editing at sites in a transcript encoding a muscle voltage-gated calcium channel. PMID:15920480
Drissi, F; Merhej, V; Angelakis, E; El Kaoutari, A; Carrière, F; Henrissat, B; Raoult, D
2014-02-24
Some Lactobacillus species are associated with obesity and weight gain while others are associated with weight loss. Lactobacillus spp. and bifidobacteria represent a major bacterial population of the small intestine where lipids and simple carbohydrates are absorbed, particularly in the duodenum and jejunum. The objective of this study was to identify Lactobacillus spp. proteins involved in carbohydrate and lipid metabolism associated with weight modifications. We examined a total of 13 complete genomes belonging to seven different Lactobacillus spp. previously associated with weight gain or weight protection. We combined the data obtained from the Rapid Annotation using Subsystem Technology, Batch CD-Search and Gene Ontology to classify gene function in each genome. We observed major differences between the two groups of genomes. Weight gain-associated Lactobacillus spp. appear to lack enzymes involved in the catabolism of fructose, defense against oxidative stress and the synthesis of dextrin, L-rhamnose and acetate. Weight protection-associated Lactobacillus spp. encoded a significant number of glucose permease genes. Regarding lipid metabolism, thiolases were only encoded in the genomes of weight gain-associated Lactobacillus spp. In addition, we identified 18 different types of bacteriocins in the studied genomes, and weight gain-associated Lactobacillus spp. encoded more bacteriocins than weight protection-associated Lactobacillus spp. The results of this study revealed that weight protection-associated Lactobacillus spp. have developed mechanisms for enhanced glycolysis and defense against oxidative stress. Weight gain-associated Lactobacillus spp. possess a limited ability to break down fructose or glucose and might reduce ileal brake effects.
Shen, Zhicheng; Denton, Michael; Mutti, Navdeep; Pappan, Kirk; Kanost, Michael R.; Reese, John C.; Reeck, Gerald R.
2003-01-01
Endo-polygalacturonase, one of the group of enzymes known collectively as pectinases, is widely distributed in bacteria, plants and fungi. The enzyme has also been found in several weevil species and a few other insects, such as aphids, but not in Drosophila melanogaster, Anopheles gambiae, or Caenorhabditis elegans or, as far as is known, in any more primitive animal species. What, then, is the genetic origin of the polygalacturonases in weevils? Since some weevil species harbor symbiotic microorganisms, it has been suggested, reasonably, that the symbionts' genomes of both aphids and weevils, rather than the insects' genomes, could encode polygalacturonase. We report here the cloning of a cDNA that encodes endo-polygalacturonase in the rice weevil, Sitophilus oryzae (L.), and investigations based on the cloned cDNA. Our results, which include analysis of genes in antibiotic-treated rice weevils, indicate that the enzyme is, in fact, encoded by the insect genome. Given the apparent absence of the gene in much of the rest of the animal kingdom, it is therefore likely that the rice weevil polygalacturonase gene was incorporated into the weevil's genome by horizontal transfer, possibly from a fungus. PMID:15841240
Comparative genome analysis of non-toxigenic non-O1 versus toxigenic O1 Vibrio cholerae
Mukherjee, Munmun; Kakarla, Prathusha; Kumar, Sanath; Gonzalez, Esmeralda; Floyd, Jared T.; Inupakutika, Madhuri; Devireddy, Amith Reddy; Tirrell, Selena R.; Bruns, Merissa; He, Guixin; Lindquist, Ingrid E.; Sundararajan, Anitha; Schilkey, Faye D.; Mudge, Joann; Varela, Manuel F.
2015-01-01
Pathogenic strains of Vibrio cholerae are responsible for endemic and pandemic outbreaks of the disease cholera. The complete toxigenic mechanisms underlying virulence in Vibrio strains are poorly understood. The hypothesis of this work was that virulent versus non-virulent strains of V. cholerae harbor distinctive genomic elements that encode virulence. The purpose of this study was to elucidate genomic differences between the O1 serotypes and non-O1 V. cholerae PS15, a non-toxigenic strain, in order to identify novel genes potentially responsible for virulence. In this study, we compared the whole genome of the non-O1 PS15 strain to the whole genomes of toxigenic serotypes at the phylogenetic level, and found that the PS15 genome was distantly related to those of toxigenic V. cholerae. Thus we focused on a detailed gene comparison between PS15 and the distantly related O1 V. cholerae N16961. Based on sequence alignment we tentatively assigned chromosome numbers 1 and 2 to elements within the genome of non-O1 V. cholerae PS15. Further, we found that PS15 and O1 V. cholerae N16961 shared 98% identity and 766 genes, but of the genes present in N16961 that were missing in the non-O1 V. cholerae PS15 genome, 56 were predicted to encode not only for virulence–related genes (colonization, antimicrobial resistance, and regulation of persister cells) but also genes involved in the metabolic biosynthesis of lipids, nucleosides and sulfur compounds. Additionally, we found 113 genes unique to PS15 that were predicted to encode other properties related to virulence, disease, defense, membrane transport, and DNA metabolism. Here, we identified distinctive and novel genomic elements between O1 and non-O1 V. cholerae genomes as potential virulence factors and, thus, targets for future therapeutics. Modulation of such novel targets may eventually enhance eradication efforts of endemic and pandemic disease cholera in afflicted nations. PMID:25722857
Chen, Lei; Pospíšilová, Petra; Strouhal, Michal; Qin, Xiang; Mikalová, Lenka; Norris, Steven J.; Muzny, Donna M.; Gibbs, Richard A.; Fulton, Lucinda L.; Sodergren, Erica; Weinstock, George M.; Šmajs, David
2012-01-01
Background: The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis-causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. Methodology/Principal Findings: To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (dA) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function. Conclusions/Significance: Hypothetical genes, with genetic differences, consistently found between TPE and TPA strains are candidates for syphilitic treponemes virulence factors. Seventeen TPE genes were predicted under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics. PMID:22292095
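The comparison of within-group nucleotide diversity (π) to between-group divergence can be illustrated with a minimal Python sketch. It assumes pre-aligned, equal-length, gap-free sequences and uses simple uncorrected pairwise differences; the function names and toy sequences are hypothetical, and the estimator used in the study itself may differ (for example, by applying substitution-model corrections or a net-divergence adjustment).

```python
from itertools import combinations, product


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise p-distance within one group (a simple estimate of pi)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between_group_distance(group_x: list[str], group_y: list[str]) -> float:
    """Mean p-distance over all pairs with one sequence taken from each group."""
    pairs = list(product(group_x, group_y))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignments, not treponemal data.
    group_x = ["ACGTACGT", "ACGTACGA"]
    group_y = ["ACCTACGT", "ACCTTCGT"]
    print(nucleotide_diversity(group_x))
    print(nucleotide_diversity(group_y))
    print(mean_between_group_distance(group_x, group_y))
```

A between-group value several times larger than either within-group value is the kind of pattern the abstract describes for TPA versus TPE.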
Graur, Dan; Zheng, Yichen; Price, Nicholas; Azevedo, Ricardo B.R.; Zufall, Rebecca A.; Elhaik, Eran
2013-01-01
A recent slew of ENCyclopedia Of DNA Elements (ENCODE) Consortium publications, specifically the article signed by all Consortium members, put forward the idea that more than 80% of the human genome is functional. This claim flies in the face of current estimates according to which the fraction of the genome that is evolutionarily conserved through purifying selection is less than 10%. Thus, according to the ENCODE Consortium, a biological function can be maintained indefinitely without selection, which implies that at least 80 − 10 = 70% of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these “functional” regions or because no mutation in these regions can ever be deleterious. This absurd conclusion was reached through various means, chiefly by employing the seldom used “causal role” definition of biological function and then applying it inconsistently to different biochemical properties, by committing a logical fallacy known as “affirming the consequent,” by failing to appreciate the crucial difference between “junk DNA” and “garbage DNA,” by using analytical methods that yield biased errors and inflate estimates of functionality, by favoring statistical sensitivity over specificity, and by emphasizing statistical significance rather than the magnitude of the effect. Here, we detail the many logical and methodological transgressions involved in assigning functionality to almost every nucleotide in the human genome. The ENCODE results were predicted by one of its authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass-media hype, and public relations may well have to be rewritten. PMID:23431001
Barret, Matthieu; Egan, Frank; Fargier, Emilie; Morrissey, John P; O'Gara, Fergal
2011-06-01
Bacteria encode multiple protein secretion systems that are crucial for interaction with the environment and with hosts. In recent years, attention has focused on type VI secretion systems (T6SSs), which are specialized transporters widely encoded in Proteobacteria. The myriad of processes associated with these secretion systems could be explained by subclasses of T6SS, each involved in specialized functions. To assess diversity and predict function associated with different T6SSs, comparative genomic analysis of 34 Pseudomonas genomes was performed. This identified 70 T6SSs, with at least one locus in every strain, except for Pseudomonas stutzeri A1501. By comparing 11 core genes of the T6SS, it was possible to identify five main Pseudomonas phylogenetic clusters, with strains typically carrying T6SSs from more than one clade. In addition, most strains encode additional vgrG and hcp genes, which encode extracellular structural components of the secretion apparatus. Using a combination of phylogenetic and meta-analysis of transcriptome datasets it was possible to associate specific subsets of VgrG and Hcp proteins with each Pseudomonas T6SS clade. Moreover, a closer examination of the genomic context of vgrG genes in multiple strains highlights a number of additional genes associated with these regions. It is proposed that these genes may play a role in secretion or alternatively could be new T6S effectors.
Plastid–Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae
Weng, Mao-Lun; Ruhlman, Tracey A.; Jansen, Robert K.
2016-01-01
Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid–nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. PMID:27190001
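The ratio ω referred to above is the nonsynonymous substitution rate divided by the synonymous rate (dN/dS). The sketch below shows only that final division and the conventional reading of the result; the function names and the rate values are hypothetical, and estimating dN and dS themselves requires codon-model software that is not shown here.

```python
def omega(dn: float, ds: float) -> float:
    """Return the ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive for the ratio to be defined")
    return dn / ds


def interpret(w: float, tolerance: float = 0.05) -> str:
    """Conventional reading of omega; the tolerance around 1 is an arbitrary assumption."""
    if w > 1 + tolerance:
        return "consistent with positive selection"
    if w < 1 - tolerance:
        return "consistent with purifying selection"
    return "consistent with neutral evolution"


if __name__ == "__main__":
    # Hypothetical rates, not values from the Geraniaceae study.
    w = omega(dn=0.02, ds=0.10)
    print(w, interpret(w))
```

As the abstract notes, an increase in ω can arise either from intensified positive selection or from relaxed purifying selection, so the ratio alone does not distinguish the two.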
Genome-Wide Architecture of Disease Resistance Genes in Lettuce
Christopoulou, Marilena; Wo, Sebastian Reyes-Chin; Kozik, Alex; McHale, Leah K.; Truco, Maria-Jose; Wroblewski, Tadeusz; Michelmore, Richard W.
2015-01-01
Genome-wide motif searches identified 1134 genes in the lettuce reference genome of cv. Salinas that are potentially involved in pathogen recognition, of which 385 were predicted to encode nucleotide binding-leucine rich repeat receptor (NLR) proteins. Using a maximum-likelihood approach, we grouped the NLRs into 25 multigene families and 17 singletons. Forty-one percent of these NLR-encoding genes belong to three families, the largest being RGC16 with 62 genes in cv. Salinas. The majority of NLR-encoding genes are located in five major resistance clusters (MRCs) on chromosomes 1, 2, 3, 4, and 8 and cosegregate with multiple disease resistance phenotypes. Most MRCs contain primarily members of a single NLR gene family but a few are more complex. MRC2 spans 73 Mb and contains 61 NLRs of six different gene families that cosegregate with nine disease resistance phenotypes. MRC3, which is 25 Mb, contains 22 RGC21 genes and colocates with Dm13. A library of 33 transgenic RNA interference tester stocks was generated for functional analysis of NLR-encoding genes that cosegregated with disease resistance phenotypes in each of the MRCs. Members of four NLR-encoding families, RGC1, RGC2, RGC21, and RGC12 were shown to be required for 16 disease resistance phenotypes in lettuce. The general composition of MRCs is conserved across different genotypes; however, the specific repertoire of NLR-encoding genes varied particularly of the rapidly evolving Type I genes. These tester stocks are valuable resources for future analyses of additional resistance phenotypes. PMID:26449254
Carbone, Alessandra; Madden, Richard
2005-10-01
Codon bias is related to metabolic functions in translationally biased organisms, and two facts are argued. First, genes with high codon bias describe in meaningful ways the metabolic characteristics of the organism; important metabolic pathways corresponding to crucial characteristics of the lifestyle of an organism, such as photosynthesis, nitrification, anaerobic versus aerobic respiration, sulfate reduction, methanogenesis, and others, happen to involve especially biased genes. Second, gene transcriptional levels of sets of experiments representing a significant variation of biological conditions strikingly confirm, in the case of Saccharomyces cerevisiae, that metabolic preferences are detectable by purely statistical analysis: the high metabolic activity of yeast during fermentation is encoded in the high bias of enzymes involved in the associated pathways, suggesting that this genome was affected by a strong evolutionary pressure that favored a predominantly fermentative metabolism of yeast in the wild. The ensemble of metabolic pathways involving enzymes with high codon bias is rather well defined and remains consistent across many species; even species that have not been considered translationally biased, such as Helicobacter pylori, reveal some weak form of translational bias. We provide numerical evidence, supported by experimental data, of these facts and conclude that the metabolic networks of translationally biased genomes, observable today as projections of eons of evolutionary pressure, can be analyzed numerically and predictions of the role of specific pathways during evolution can be derived. The new concepts of Comparative Pathway Index, used to compare organisms with respect to their metabolic networks, and Evolutionary Pathway Index, used to detect evolutionarily meaningful bias in the genetic code from transcriptional data, are introduced.
Fourie, Gerda; van der Merwe, Nicolaas A; Wingfield, Brenda D; Bogale, Mesfin; Tudzynski, Bettina; Wingfield, Michael J; Steenkamp, Emma T
2013-09-08
The availability of mitochondrial genomes has allowed for the resolution of numerous questions regarding the evolutionary history of fungi and other eukaryotes. In the Gibberella fujikuroi species complex, the exact relationships among the so-called "African", "Asian" and "American" Clades remain largely unresolved, irrespective of the markers employed. In this study, we considered the feasibility of using mitochondrial genes to infer the phylogenetic relationships among Fusarium species in this complex. The mitochondrial genomes of representatives of the three Clades (Fusarium circinatum, F. verticillioides and F. fujikuroi) were characterized and we determined whether or not the mitochondrial genomes of these fungi have value in resolving the higher level evolutionary relationships in the complex. Overall, the mitochondrial genomes of the three species displayed a high degree of synteny, with all the genes (protein coding genes, unique ORFs, ribosomal RNA and tRNA genes) in identical order and orientation, as well as introns that share similar positions within genes. The intergenic regions and introns generally contributed significantly to the size differences and diversity observed among these genomes. Phylogenetic analysis of the concatenated protein-coding dataset separated members of the Gibberella fujikuroi complex from other Fusarium species and suggested that F. fujikuroi ("Asian" Clade) is basal in the complex. However, individual mitochondrial gene trees were largely incongruent with one another and with the concatenated gene tree, because six distinct phylogenetic trees were recovered from the various single gene datasets. The mitochondrial genomes of Fusarium species in the Gibberella fujikuroi complex are remarkably similar to those of the previously characterized Fusarium species and Sordariomycetes. 
Despite apparently representing a single replicative unit, all of the genes encoded on the mitochondrial genomes of these fungi do not share the same evolutionary history. This incongruence could be due to biased selection on some genes or recombination among mitochondrial genomes. The results thus suggest that the use of individual mitochondrial genes for phylogenetic inference could mask the true relationships between species in this complex.
Rychli, Kathrin; Müller, Anneliese; Zaiser, Andreas; Schoder, Dagmar; Allerberger, Franz; Wagner, Martin; Schmitz-Esser, Stephan
2014-01-01
A large listeriosis outbreak occurred in Austria, Germany and the Czech Republic in 2009 and 2010. The outbreak was traced back to a traditional Austrian curd cheese called “Quargel” which was contaminated with two distinct serovar 1/2a Listeria monocytogenes strains (QOC1 and QOC2). In this study we sequenced and analysed the genomes of both outbreak strains in order to investigate the extent of genetic diversity between the two strains belonging to MLST sequence types 398 (QOC2) and 403 (QOC1). Both genomes are highly similar, but also display distinct properties: The QOC1 genome is approximately 74 kbp larger than the QOC2 genome. In addition, the strains harbour 93 (QOC1) and 45 (QOC2) genes encoding strain-specific proteins. A 21 kbp region showing highest similarity to plasmid pLMIV encoding three putative internalins is integrated in the QOC1 genome. In contrast to QOC1, strain QOC2 harbours a vip homologue, which encodes a LPXTG surface protein involved in cell invasion. In accordance, in vitro virulence assays revealed distinct differences in invasion efficiency and intracellular proliferation within different cell types. The higher virulence potential of QOC1 in non-phagocytic cells may be explained by the presence of additional internalins in the pLMIV-like region, whereas the higher invasion capability of QOC2 into phagocytic cells may be due to the presence of a vip homologue. In addition, both strains show differences in stress-related gene content. Strain QOC1 encodes a so-called stress survival islet 1, whereas strain QOC2 harbours a homologue of the uncharacterized LMOf2365_0481 gene. Consistently, QOC1 shows higher resistance to acidic, alkaline and gastric stress. In conclusion, our results show that strain QOC1 and QOC2 are distinct and did not recently evolve from a common ancestor. PMID:24587155
Genome sequence of the Fleming strain of Micrococcus luteus, a simple free-living actinobacterium
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, Michael; Artsatbanov, Vladislav; Beller, Harry R.
Micrococcus luteus (NCTC2665, Fleming strain) has one of the smallest genomes of free-living actinobacteria sequenced to date, comprising a single circular chromosome of 2,501,097 bp (G+C content 73%) predicted to encode 2403 proteins. The genome shows extensive synteny with that of the closely related organism, Kocuria rhizophila, from which it was taxonomically separated relatively recently. Despite its small size, the genome harbors 73 IS elements, almost all of which are closely related to elements found in other actinobacteria. An IS element is inserted into the rrs gene of one of only two rrn operons found in M. luteus. The genome encodes only four sigma factors and fourteen response regulators, indicative of adaptation to a rather strict ecological niche (mammalian skin). The high sensitivity of M. luteus to β-lactam antibiotics may result from the presence of a reduced set of penicillin binding proteins and the absence of a wblC gene, which plays an important role in antibiotic resistance in other actinobacteria. Consistent with the restricted range of compounds it can use as a sole source of carbon for energy and growth, M. luteus has a minimal complement of genes concerned with carbohydrate transport and metabolism and its inability to utilize glucose as a sole carbon source may be due to the apparent absence of a gene encoding glucokinase. Uniquely among characterized bacteria, M. luteus appears to be able to metabolize glycogen only via trehalose, and to make trehalose only via glycogen. It has very few genes associated with secondary metabolism. In contrast to other actinobacteria, M. luteus encodes only one resuscitation-promoting factor (Rpf) required for emergence from dormancy and its complement of other dormancy-related proteins is also much reduced. M. luteus is capable of long-chain alkene biosynthesis, which is of interest for advanced biofuel production; a three gene cluster essential for this metabolism has been identified in the genome.
Learning from number board games: you learn what you encode.
Laski, Elida V; Siegler, Robert S
2014-03-01
We tested the hypothesis that encoding the numerical-spatial relations in a number board game is a key process in promoting learning from playing such games. Experiment 1 used a microgenetic design to examine the effects on learning of the type of counting procedure that children use. As predicted, having kindergartners count-on from their current number on the board while playing a 0-100 number board game facilitated their encoding of the numerical-spatial relations on the game board and improved their number line estimates, numeral identification, and count-on skill. Playing the same game using the standard count-from-1 procedure led to considerably less learning. Experiment 2 demonstrated that comparable improvement in number line estimation does not occur with practice encoding the numerals 1-100 outside of the context of a number board game. The general importance of aligning learning activities and physical materials with desired mental representations is discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Xyloglucan breakdown by endo-xyloglucanase family 74 from Aspergillus fumigatus.
Damasio, André Ricardo de Lima; Rubio, Marcelo Ventura; Gonçalves, Thiago Augusto; Persinoti, Gabriela Felix; Segato, Fernando; Prade, Rolf Alexander; Contesini, Fabiano Jares; de Souza, Amanda Pereira; Buckeridge, Marcos Silveira; Squina, Fabio Marcio
2017-04-01
Xyloglucan is the most abundant hemicellulose in primary walls of spermatophytes except for grasses. Xyloglucan-degrading enzymes are important in lignocellulosic biomass hydrolysis because they remove xyloglucan, which is abundant in monocot-derived biomass. Fungal genomes encode numerous xyloglucanase genes, belonging to at least six glycoside hydrolase (GH) families. GH74 endo-xyloglucanases cleave xyloglucan backbones with unsubstituted glucose at the -1 subsite or prefer xylosyl-substituted residues in the -1 subsite. In this work, 137 GH74-related genes were detected by examining 293 Eurotiomycete genomes and Ascomycete fungi contained one or no GH74 xyloglucanase gene per genome. Another interesting feature is that the triad of tryptophan residues along the catalytic cleft was found to be widely conserved among Ascomycetes. The GH74 from Aspergillus fumigatus (AfXEG74) was chosen as an example to conduct comprehensive biochemical studies to determine the catalytic mechanism. AfXEG74 has no CBM and cleaves the xyloglucan backbone between the unsubstituted glucose and xylose-substituted glucose at specific positions, along the XX motif when linked to regions deprived of galactosyl branches. It resembles an endo-processive activity, which after initial random hydrolysis releases xyloglucan-oligosaccharides as major reaction products. This work provides insights on phylogenetic diversity and catalytic mechanism of GH74 xyloglucanases from Ascomycete fungi.
Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE.
Trinh, Quang M; Jen, Fei-Yang Arthur; Zhou, Ziru; Chu, Kar Ming; Perry, Marc D; Kephart, Ellen T; Contrino, Sergio; Ruzanov, Peter; Stein, Lincoln D
2013-07-22
Funded by the National Institutes of Health (NIH), the aim of the Model Organism ENCyclopedia of DNA Elements (modENCODE) project is to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for both model organisms C. elegans (worm) and D. melanogaster (fly). With a total size of just under 10 terabytes of data collected and released to the public, one of the challenges faced by researchers is to extract biologically meaningful knowledge from this large data set. While the basic quality control, pre-processing, and analysis of the data has already been performed by members of the modENCODE consortium, many researchers will wish to reinterpret the data set using modifications and enhancements of the original protocols, or combine modENCODE data with other data sets. Unfortunately this can be a time consuming and logistically challenging proposition. In recognition of this challenge, the modENCODE DCC has released uniform computing resources for analyzing modENCODE data on Galaxy (https://github.com/modENCODE-DCC/Galaxy), on the public Amazon Cloud (http://aws.amazon.com), and on the private Bionimbus Cloud for genomic research (http://www.bionimbus.org). In particular, we have released Galaxy workflows for interpreting ChIP-seq data which use the same quality control (QC) and peak calling standards adopted by the modENCODE and ENCODE communities. For convenience of use, we have created Amazon and Bionimbus Cloud machine images containing Galaxy along with all the modENCODE data, software and other dependencies. Using these resources provides a framework for running consistent and reproducible analyses on modENCODE data, ultimately allowing researchers to use more of their time using modENCODE data, and less time moving it around.
PMID:23875683
Genome analysis and identification of gelatinase encoded gene in Enterobacter aerogenes
NASA Astrophysics Data System (ADS)
Shahimi, Safiyyah; Mutalib, Sahilah Abdul; Khalid, Rozida Abdul; Repin, Rul Aisyah Mat; Lamri, Mohd Fadly; Bakar, Mohd Faizal Abu; Isa, Mohd Noor Mat
2016-11-01
In this study, bioinformatic analysis of the genome sequence of E. aerogenes was performed to identify genes encoding gelatinase. Enterobacter aerogenes was isolated from hot spring water and produces gelatinase that is species-specific to porcine and fish gelatin. This bacterium therefore offers the possibility of producing enzymes specific to the gelatin of each species. The genome of E. aerogenes was partially sequenced, yielding a total sequence size of 5.0 megabase pairs (Mbp). The pre-processing pipeline yielded 87.6 Mbp of total reads and 68.8 Mbp of high-quality reads, a high-quality percentage of 78.58%. Genome assembly produced 120 contigs, 67.5% of which were longer than 1 kilobase pair (kbp), with an N50 contig length of 124,856 bp and a GC content of 55.17%. About 4,705 protein-coding genes were identified by protein prediction analysis. Two candidate genes were selected that had the highest percentage identity to gelatinase enzymes available in the Swiss-Prot and NCBI online databases. They were NODE_9_length_26866_cov_148.013245_12, a 1,029 base pair (bp) sequence encoding 342 amino acids, and NODE_24_length_155103_cov_177.082458_62, a 717 bp sequence encoding 238 amino acids. Two pairs of primers (forward and reverse) were therefore designed based on the open reading frames (ORFs) of the selected genes. Genome analysis of E. aerogenes thus identified genes encoding gelatinase.
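The assembly statistics quoted in this abstract (N50 contig length, GC content, and ORF length versus encoded amino acids) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, and the sketch assumes that the reported ORF lengths include the stop codon, which is consistent with 1029 bp corresponding to 342 amino acids and 717 bp to 238.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the contig length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_percent(sequence):
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def orf_amino_acids(orf_length_bp):
    """Return the number of residues encoded by an ORF.

    Assumption: orf_length_bp includes the terminal stop codon,
    so one codon is subtracted from the total.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1
```

Under the stated assumption, orf_amino_acids(1029) gives 342 and orf_amino_acids(717) gives 238, matching the two candidate genes reported above.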
Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.
Tang, Binhua; Wang, Xin
2015-01-01
DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (>=20) and transcription factors (>=80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights for the interplay mechanisms of transcriptional regulation and gene expression for the human cancer and disease studies.
Ranade, Sonali Sachin; García-Gil, María Rosario; Rosselló, Josep A
2016-04-01
Many genes have been lost from the prokaryote plastidial genome during the early events of endosymbiosis in eukaryotes. Some of them were definitively lost, but others were relocated and functionally integrated to the host nuclear genomes through serial events of gene transfer during plant evolution. In gymnosperms, plastid genome sequencing has revealed the loss of ndh genes from several species of Gnetales and Pinaceae, including Norway spruce (Picea abies). This study aims to trace the ndh genes in the nuclear and organellar Norway spruce genomes. The plastid genomes of higher plants contain 11 ndh genes which are homologues of mitochondrial genes encoding subunits of the proton-pumping NADH-dehydrogenase (nicotinamide adenine dinucleotide dehydrogenase) or complex I (electron transport chain). Ndh genes encode 11 NDH polypeptides forming the Ndh complex (analogous to complex I) which seems to be primarily involved in chloro-respiration processes. We considered ndh genes from the plastidial genome of four gymnosperms (Cryptomeria japonica, Cycas revoluta, Ginkgo biloba, Podocarpus totara) and a single angiosperm species (Arabidopsis thaliana) to trace putative homologs in the nuclear and organellar Norway spruce genomes using tBLASTn to assess the evolutionary fate of ndh genes in Norway spruce and to address their genomic location(s), structure, integrity and functionality. The results obtained from tBLASTn were subsequently analyzed by performing homology search for finding ndh specific conserved domains using conserved domain search. We report the presence of non-functional plastid ndh gene fragments, excepting ndhE and ndhG genes, in the nuclear genome of Norway spruce. Regulatory transcriptional elements like promoters, TATA boxes and enhancers were detected in the upstream regions of some ndh fragments. We also found transposable elements in the flanking regions of few ndh fragments suggesting nuclear rearrangements in those regions. 
This evidence supports the hypothesis that, at least in Picea, ndh translocations from the plastid to the nuclear genome have occurred, and that there might have been functional machinery at some time during evolution to accommodate them within a nuclear-encoded environment, or attempts to form it.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kublanov, Ilya V.; Sigalova, Olga M.; Gavrilov, Sergey N.; ...
2017-02-20
The genome of Caldithrix abyssi, the first cultivated representative of a phylum-level bacterial lineage, was sequenced within the framework of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) project. The genomic analysis revealed mechanisms allowing this anaerobic bacterium to ferment peptides or to implement nitrate reduction with acetate or molecular hydrogen as electron donors. The genome encoded five different [NiFe]- and [FeFe]-hydrogenases, one of which, a group 1 [NiFe]-hydrogenase, is presumably involved in lithoheterotrophic growth; three others produce H2 during fermentation, and one is apparently bidirectional. The ability to reduce nitrate is determined by a nitrate reductase of the Nap family, while nitrite reduction to ammonia is presumably catalyzed by an octaheme cytochrome c nitrite reductase εHao. The genome contained genes of respiratory polysulfide/thiosulfate reductase; however, elemental sulfur and thiosulfate were not used as the electron acceptors for anaerobic respiration with acetate or H2, probably due to the lack of the gene of the maturation protein. Nevertheless, elemental sulfur and thiosulfate stimulated growth on fermentable substrates (peptides), being reduced to sulfide, most probably through the action of the cytoplasmic sulfide dehydrogenase and/or NAD(P)-dependent [NiFe]-hydrogenase (sulfhydrogenase) encoded by the genome. Surprisingly, the genome of this anaerobic microorganism encoded all genes for cytochrome c oxidase; however, its maturation machinery seems to be non-operational due to genomic rearrangements of supplementary genes. Despite the fact that sugars were not among the substrates reported when C. abyssi was first described, our genomic analysis revealed multiple genes of glycoside hydrolases, and some of them were predicted to be secreted. This finding aided in identifying four carbohydrates that supported the growth of C. abyssi: starch, cellobiose, glucomannan and xyloglucan. The genomic analysis demonstrated the ability of C. abyssi to synthesize nucleotides and most amino acids and vitamins. Finally, the genomic sequence allowed us to perform a phylogenomic analysis, based on 38 protein sequences, which confirmed the deep branching of this lineage and justified the proposal of a novel phylum Calditrichaeota.
Complete genome sequence of yam chlorotic necrosis virus, a novel macluravirus infecting yam
USDA-ARS?s Scientific Manuscript database
Complete genomic sequence of a novel member of the genus Macluravirus was determined from yam plants with chlorotic and necrotic symptoms in China. The genomic RNA consists of 8,261 nucleotides (nt) excluding the 3’-terminal poly (A) tail, containing one long open reading frame (ORF) encoding a larg...
USDA-ARS?s Scientific Manuscript database
The features contributing to the differences in pathogenicity of the C. fetus subspecies are unknown. Putative factors involved in pathogenesis are located in genomic islands that encode type IV secretion system (T4SS) and fic-domain (filamentation induced by cyclic AMP) proteins. In the genomes of ...
Complete Genome Sequence of Staphylococcus epidermidis 1457
Galac, Madeline R.; Stam, Jason; Maybank, Rosslyn; Hinkle, Mary; Mack, Dietrich; Rohde, Holger; Roth, Amanda L.
2017-01-01
ABSTRACT Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. PMID:28572323
Genome Sequences for Five Strains of the Emerging Pathogen Haemophilus haemolyticus
Jordan, I. King; Conley, Andrew B.; Antonov, Ivan V.; Arthur, Robert A.; Cook, Erin D.; Cooper, Guy P.; Jones, Bernard L.; Knipe, Kristen M.; Lee, Kevin J.; Liu, Xing; Mitchell, Gabriel J.; Pande, Pushkar R.; Petit, Robert A.; Qin, Shaopu; Rajan, Vani N.; Sarda, Shruti; Sebastian, Aswathy; Tang, Shiyuyun; Thapliyal, Racchit; Varghese, Neha J.; Ye, Tianjun; Katz, Lee S.; Wang, Xin; Rowe, Lori; Frace, Michael; Mayer, Leonard W.
2011-01-01
We report the first whole-genome sequences for five strains, two carried and three pathogenic, of the emerging pathogen Haemophilus haemolyticus. Preliminary analyses indicate that these genome sequences encode markers that distinguish H. haemolyticus from its closest Haemophilus relatives and provide clues to the identity of its virulence factors. PMID:21952546
Complete genome sequence of a divergent strain of Japanese yam mosaic virus from China
USDA-ARS?s Scientific Manuscript database
A novel strain of Japanese yam mosaic virus (JYMV-CN) was identified in a yam plant with foliar mottle symptoms in China. The complete genomic sequence of JYMV-CN was determined. Its genomic sequence of 9701 nucleotides encodes a polyprotein of 3247 amino acids. Its organization was virtually identi...
Moura, Quézia; Fernandes, Miriam R; Cerdeira, Louise; Nhambe, Lúcia F; Ienne, Susan; Souza, Tiago A; Lincopan, Nilton
2017-09-01
Multidrug-resistant (MDR) Enterobacter aerogenes strains are frequently associated with nosocomial infections and high mortality rates, representing a serious public health problem. The aim of this study was to present the draft genome sequence of a MDR KPC-2-producing E. aerogenes isolated from a perineal swab of a hospitalised patient in Brazil. Genomic DNA was sequenced using an Illumina MiSeq platform. De novo genome assembly was carried out using the A5-Miseq pipeline, and whole-genome sequence analysis was performed using tools from the Center for Genomic Epidemiology. The strain harboured resistance genes to β-lactams, aminoglycosides, sulphonamides and trimethoprim in addition to genes encoding multidrug efflux system proteins, a quaternary ammonium transporter and heavy metal efflux system proteins. In addition, the strain harboured genes encoding diverse virulence factors. These data might allow a better understanding of the genetic basis of antimicrobial resistance and virulence in E. aerogenes strains. Copyright © 2017 International Society for Chemotherapy of Infection and Cancer. Published by Elsevier Ltd. All rights reserved.
Huguet-Tapia, Jose C.; Lefebure, Tristan; Badger, Jonathan H.; Guan, Dongli; Stanhope, Michael J.
2016-01-01
Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer. PMID:26826232
USDA-ARS?s Scientific Manuscript database
Bean pod mottle virus (BPMV) is a bipartite, positive sense (+) RNA plant virus in the Secoviridae family. Its RNA1 encodes proteins required for genome replication, whereas RNA2 primarily encodes proteins needed for virion assembly and cell-to-cell movement. However, the function of a 58 kilo-dalto...
USDA-ARS?s Scientific Manuscript database
Marek’s disease virus (MDV) encodes a ribonucleotide reductase (RR), a key regulatory enzyme in the DNA synthesis pathway. The gene coding for the RR of MDV is located in the unique long (UL) region of the genome. The large subunit is encoded by UL39 (RR1) and is predicted to comprise 860 amino acid...
The putative drug efflux systems of the Bacillus cereus group
Elbourne, Liam D. H.; Vörös, Aniko; Kroeger, Jasmin K.; Simm, Roger; Tourasse, Nicolas J.; Finke, Sarah; Henderson, Peter J. F.; Økstad, Ole Andreas; Paulsen, Ian T.; Kolstø, Anne-Brit
2017-01-01
The Bacillus cereus group of bacteria includes seven closely related species, three of which, B. anthracis, B. cereus and B. thuringiensis, are pathogens of humans, animals and/or insects. Preliminary investigations into the transport capabilities of different bacterial lineages suggested that genes encoding putative efflux systems were unusually abundant in the B. cereus group compared to other bacteria. To explore the drug efflux potential of the B. cereus group all putative efflux systems were identified in the genomes of prototypical strains of B. cereus, B. anthracis and B. thuringiensis using our Transporter Automated Annotation Pipeline. More than 90 putative drug efflux systems were found within each of these strains, accounting for up to 2.7% of their protein coding potential. Comparative analyses demonstrated that the efflux systems are highly conserved between these species; 70–80% of the putative efflux pumps were shared between all three strains studied. Furthermore, 82% of the putative efflux system proteins encoded by the prototypical B. cereus strain ATCC 14579 (type strain) were found to be conserved in at least 80% of 169 B. cereus group strains that have high quality genome sequences available. However, only a handful of these efflux pumps have been functionally characterized. Deletion of individual efflux pump genes from B. cereus typically had little impact to drug resistance phenotypes or the general fitness of the strains, possibly because of the large numbers of alternative efflux systems that may have overlapping substrate specificities. Therefore, to gain insight into the possible transport functions of efflux systems in B. cereus, we undertook large-scale qRT-PCR analyses of efflux pump gene expression following drug shocks and other stress treatments. Clustering of gene expression changes identified several groups of similarly regulated systems that may have overlapping drug resistance functions. 
In this article we review current knowledge of the small molecule efflux pumps encoded by the B. cereus group and suggest the likely functions of numerous uncharacterised pumps. PMID:28472044
Genome-wide identification, phylogeny, and expression analysis of the SWEET gene family in tomato.
Feng, Chao-Yang; Han, Jia-Xuan; Han, Xiao-Xue; Jiang, Jing
2015-12-01
The SWEET (Sugars Will Eventually Be Exported Transporters) gene family encodes membrane-embedded sugar transporters containing seven transmembrane helices harboring two MtN3 and saliva domains. SWEETs play important roles in diverse biological processes, including plant growth, development, and response to environmental stimuli. Here, we conducted an exhaustive search of the tomato genome, leading to the identification of 29 SWEET genes. We analyzed the structures, conserved domains, and phylogenetic relationships of these protein-coding genes in detail. We also analyzed the transcript levels of SWEET genes in various tissues, organs, and developmental stages to obtain information about their functions. Furthermore, we investigated the expression patterns of the SWEET genes in response to exogenous sugar and adverse environmental stress (high and low temperatures). Some family members exhibited tissue-specific expression, whereas others were more ubiquitously expressed. Numerous stress-responsive candidate genes were obtained. The results of this study provide insights into the characteristics of the SWEET genes in tomato and may serve as a basis for further functional studies of such genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Brunet, Marie A; Levesque, Sébastien A; Hunting, Darel J; Cohen, Alan A; Roucou, Xavier
2018-05-01
Technological advances promise unprecedented opportunities for whole exome sequencing and proteomic analyses of populations. Currently, data from genome and exome sequencing or proteomic studies are searched against reference genome annotations. This provides the foundation for research and clinical screening for genetic causes of pathologies. However, current genome annotations substantially underestimate the proteomic information encoded within a gene. Numerous studies have now demonstrated the expression and function of alternative (mainly small, sometimes overlapping) ORFs within mature gene transcripts. This has important consequences for the correlation of phenotypes and genotypes. Most alternative ORFs are not yet annotated because of a lack of evidence, and this absence from databases precludes their detection by standard proteomic methods, such as mass spectrometry. Here, we demonstrate how current approaches tend to overlook alternative ORFs, hindering the discovery of new genetic drivers and fundamental research. We discuss available tools and techniques to improve identification of proteins from alternative ORFs and finally suggest a novel annotation system to permit a more complete representation of the transcriptomic and proteomic information contained within a gene. Given the crucial challenge of distinguishing functional ORFs from random ones, the suggested pipeline emphasizes both experimental data and conservation signatures. The addition of alternative ORFs in databases will render identification less serendipitous and advance the pace of research and genomic knowledge. This review highlights the urgent medical and research need to incorporate alternative ORFs in current genome annotations and thus permit their inclusion in hypotheses and models, which relate phenotypes and genotypes. © 2018 Brunet et al.; Published by Cold Spring Harbor Laboratory Press.
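The point that a single mature transcript can carry several, sometimes overlapping, ORFs can be illustrated with a minimal Python sketch. This is not the annotation system or pipeline proposed in the review: the function name, the ATG-to-stop definition of an ORF, and the min_codons threshold are all assumptions made for illustration, and only the forward strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(transcript, min_codons=10):
    """List ATG-to-stop ORFs in the three forward reading frames.

    Returns sorted (start, end, frame) tuples, where start is 0-based,
    end is exclusive and includes the stop codon, and frame is 0, 1 or 2.
    min_codons counts coding codons only (the stop codon is excluded).
    """
    seq = transcript.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                # First in-frame start codon since the previous stop.
                start = pos
            elif start is not None and codon in STOP_CODONS:
                coding_codons = (pos + 3 - start) // 3 - 1
                if coding_codons >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return sorted(orfs)
```

On the toy sequence ATGAAATGCCCTAGGTAA with min_codons=1, the sketch reports one ORF in frame 0 spanning the whole sequence and a second, shorter ORF in frame 2 nested inside it, which is the kind of overlap that a one-ORF-per-transcript annotation would miss.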
Salinero, Alicia C.; Knoll, Elisabeth R.; Zhu, Z. Iris
2018-01-01
The Ty1 retrotransposons present in the genome of Saccharomyces cerevisiae belong to the large class of mobile genetic elements that replicate via an RNA intermediary and constitute a significant portion of most eukaryotic genomes. The retromobility of Ty1 is regulated by numerous host factors, including several subunits of the Mediator transcriptional co-activator complex. In spite of its known function in the nucleus, previous studies have implicated Mediator in the regulation of post-translational steps in Ty1 retromobility. To resolve this paradox, we systematically examined the effects of deleting non-essential Mediator subunits on the frequency of Ty1 retromobility and levels of retromobility intermediates. Our findings reveal that loss of distinct Mediator subunits alters Ty1 retromobility positively or negatively over a >10,000-fold range by regulating the ratio of an internal transcript, Ty1i, to the genomic Ty1 transcript. Ty1i RNA encodes a dominant negative inhibitor of Ty1 retromobility that blocks virus-like particle maturation and cDNA synthesis. These results resolve the conundrum of Mediator exerting sweeping control of Ty1 retromobility with only minor effects on the levels of Ty1 genomic RNA and the capsid protein, Gag. Since the majority of characterized intrinsic and extrinsic regulators of Ty1 retromobility do not appear to affect genomic Ty1 RNA levels, Mediator could play a central role in integrating signals that influence Ty1i expression to modulate retromobility. PMID:29462141
An Integrated Encyclopedia of DNA Elements in the Human Genome
2012-01-01
Summary The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tashkandy, Nisreen; Sabban, Sari; Fakieh, Mohammad
Flavobacterium suncheonense is a member of the family Flavobacteriaceae in the phylum Bacteroidetes. Strain GH29-5T (DSM 17707T) was isolated from greenhouse soil in Suncheon, South Korea. F. suncheonense GH29-5T is part of the Genomic Encyclopedia of Bacteria and Archaea project. The 2,880,663 bp long draft genome consists of 54 scaffolds with 2739 protein-coding genes and 82 RNA genes. The genome of strain GH29-5T has 117 genes encoding peptidases but a small number of genes encoding carbohydrate active enzymes (51 CAZymes). Metallo and serine peptidases were found most frequently. Among CAZymes, eight glycoside hydrolase families, nine glycosyl transferase families, two carbohydrate binding module families and four carbohydrate esterase families were identified. Surprisingly, polysaccharide utilization loci (PULs) were not found in strain GH29-5T. Based on the coherent physiological and genomic characteristics we suggest that F. suncheonense GH29-5T feeds on proteins rather than on saccharides and lipids.
Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas.
Calin, George A; Liu, Chang-gong; Ferracin, Manuela; Hyslop, Terry; Spizzo, Riccardo; Sevignani, Cinzia; Fabbri, Muller; Cimmino, Amelia; Lee, Eun Joo; Wojcik, Sylwia E; Shimizu, Masayoshi; Tili, Esmerina; Rossi, Simona; Taccioli, Cristian; Pichiorri, Flavia; Liu, Xiuping; Zupo, Simona; Herlea, Vlad; Gramantieri, Laura; Lanza, Giovanni; Alder, Hansjuerg; Rassenti, Laura; Volinia, Stefano; Schmittgen, Thomas D; Kipps, Thomas J; Negrini, Massimo; Croce, Carlo M
2007-09-01
Noncoding RNA (ncRNA) transcripts are thought to be involved in human tumorigenesis. We report that a large fraction of genomic ultraconserved regions (UCRs) encode a particular set of ncRNAs whose expression is altered in human cancers. Genome-wide profiling revealed that UCRs have distinct signatures in human leukemias and carcinomas. UCRs are frequently located at fragile sites and genomic regions involved in cancers. We identified certain UCRs whose expression may be regulated by microRNAs abnormally expressed in human chronic lymphocytic leukemia, and we proved that the inhibition of an overexpressed UCR induces apoptosis in colon cancer cells. Our findings argue that ncRNAs and interaction between noncoding genes are involved in tumorigenesis to a greater extent than previously thought.
The genome of Brucella melitensis.
DelVecchio, Vito G; Kapatral, Vinayak; Elzer, Philip; Patra, Guy; Mujer, Cesar V
2002-12-20
The genome of Brucella melitensis strain 16M was sequenced and found to contain 3,294,931 bp distributed over two circular chromosomes. Chromosome I comprised 2,117,144 bp and chromosome II 1,177,787 bp. A total of 3,198 ORFs were predicted. The origins of replication of the chromosomes are similar to each other and to those of other alpha-proteobacteria. Housekeeping genes such as those encoding DNA replication, protein synthesis, core metabolism, and cell-wall biosynthesis functions were found on both chromosomes. Genes encoding adhesins, invasins, and hemolysins were also identified.
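As a quick consistency check on the figures quoted in this abstract, the following minimal Python sketch sums the two chromosome sizes and compares the result with the reported genome size. The function name `genome_summary` and the derived ORF density are illustrative assumptions, not values or methods taken from the paper.

```python
# Figures copied from the abstract for B. melitensis strain 16M.
CHROMOSOME_BP = {"I": 2_117_144, "II": 1_177_787}
REPORTED_TOTAL_BP = 3_294_931
REPORTED_ORFS = 3_198


def genome_summary(chromosomes, orf_count):
    """Return total size, per-chromosome share, and ORFs per kb (hypothetical helper)."""
    total = sum(chromosomes.values())
    fractions = {name: size / total for name, size in chromosomes.items()}
    orfs_per_kb = orf_count / (total / 1000)
    return total, fractions, orfs_per_kb


total_bp, fractions, orfs_per_kb = genome_summary(CHROMOSOME_BP, REPORTED_ORFS)

# The two chromosome sizes should add up to the reported genome size.
print("sum matches reported total:", total_bp == REPORTED_TOTAL_BP)
for name, share in fractions.items():
    print(f"chromosome {name}: {share:.1%} of the genome")
print(f"predicted ORFs per kb: {orfs_per_kb:.2f}")
```

The ORF density is only a rough derived figure, since the abstract does not say how the predicted ORFs are split between the two chromosomes.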
Schmitz-Esser, Stephan; Tischler, Patrick; Arnold, Roland; Montanaro, Jacqueline; Wagner, Michael; Rattei, Thomas; Horn, Matthias
2010-01-01
Protozoa play host for many intracellular bacteria and are important for the adaptation of pathogenic bacteria to eukaryotic cells. We analyzed the genome sequence of “Candidatus Amoebophilus asiaticus,” an obligate intracellular amoeba symbiont belonging to the Bacteroidetes. The genome has a size of 1.89 Mbp, encodes 1,557 proteins, and shows massive proliferation of IS elements (24% of all genes), although the genome seems to be evolutionarily relatively stable. The genome does not encode pathways for de novo biosynthesis of cofactors, nucleotides, and almost all amino acids. “Ca. Amoebophilus asiaticus” encodes a variety of proteins with predicted importance for host cell interaction; in particular, an arsenal of proteins with eukaryotic domains, including ankyrin-, TPR/SEL1-, and leucine-rich repeats, which is hitherto unmatched among prokaryotes, is remarkable. Unexpectedly, 26 proteins that can interfere with the host ubiquitin system were identified in the genome. These proteins include F- and U-box domain proteins and two ubiquitin-specific proteases of the CA clan C19 family, representing the first prokaryotic members of this protein family. Consequently, interference with the host ubiquitin system is an important host cell interaction mechanism of “Ca. Amoebophilus asiaticus”. More generally, we show that the eukaryotic domains identified in “Ca. Amoebophilus asiaticus” are also significantly enriched in the genomes of other amoeba-associated bacteria (including chlamydiae, Legionella pneumophila, Rickettsia bellii, Francisella tularensis, and Mycobacterium avium). This indicates that phylogenetically and ecologically diverse bacteria which thrive inside amoebae exploit common mechanisms for interaction with their hosts, and it provides further evidence for the role of amoebae as training grounds for bacterial pathogens of humans. PMID:20023027
Hori, Kentaro; Yamada, Yasuyuki; Purwanto, Ratmoyo; Minakuchi, Yohei; Toyoda, Atsushi; Hirakawa, Hideki
2018-01-01
Abstract: Land plants produce specialized low molecular weight metabolites to adapt to various environmental stressors, such as UV radiation, pathogen infection, wounding and animal feeding damage. Due to the large variety of stresses, plants produce various chemicals, particularly plant species-specific alkaloids, through specialized biosynthetic pathways. In this study, using a draft genome sequence and querying known biosynthetic cytochrome P450 (P450) enzyme-encoding genes, we characterized the P450 genes involved in benzylisoquinoline alkaloid (BIA) biosynthesis in California poppy (Eschscholzia californica), as P450s are key enzymes involved in the diversification of specialized metabolism. Our in silico studies showed that all identified enzyme-encoding genes involved in BIA biosynthesis were found in the draft genome sequence of approximately 489 Mb, which covered approximately 97% of the whole genome (502 Mb). Further analyses showed that some P450 families involved in BIA biosynthesis, i.e. the CYP80, CYP82 and CYP719 families, were more enriched in the genome of E. californica than in the genome of Arabidopsis thaliana, a plant that does not produce BIAs. CYP82 family genes were highly abundant, so we measured the expression of CYP82 genes with respect to alkaloid accumulation in different plant tissues and two cell lines whose BIA production differs to estimate the functions of the genes. Further characterization revealed two highly homologous P450s (CYP82P2 and CYP82P3) that exhibited 10-hydroxylase activities with different substrate specificities. Here, we discuss the evolution of the P450 genes and the potential for further genome mining of the genes encoding the enzymes involved in BIA biosynthesis. PMID:29301019
Distant Mimivirus relative with a larger genome highlights the fundamental features of Megaviridae
Arslan, Defne; Legendre, Matthieu; Seltzer, Virginie; Abergel, Chantal; Claverie, Jean-Michel
2011-01-01
Mimivirus, a DNA virus infecting acanthamoeba, was for a long time the largest known virus both in terms of particle size and gene content. Its genome encodes 979 proteins, including the first four aminoacyl tRNA synthetases (ArgRS, CysRS, MetRS, and TyrRS) ever found outside of cellular organisms. The discovery that Mimivirus encoded trademark cellular functions prompted a wealth of theoretical studies revisiting the concept of virus and associated large DNA viruses with the emergence of early eukaryotes. However, the evolutionary significance of these unique features remained impossible to assess in the absence of a Mimivirus relative exhibiting a suitable evolutionary divergence. Here, we present Megavirus chilensis, a giant virus isolated off the coast of Chile, but capable of replicating in fresh water acanthamoeba. Its 1,259,197-bp genome is the largest viral genome fully sequenced so far. It encodes 1,120 putative proteins, of which 258 (23%) have no Mimivirus homologs. The 594 Megavirus/Mimivirus orthologs share an average of 50% of identical residues. Despite this divergence, Megavirus retained all of the genomic features characteristic of Mimivirus, including its cellular-like genes. Moreover, Megavirus exhibits three additional aminoacyl-tRNA synthetase genes (IleRS, TrpRS, and AsnRS) adding strong support to the previous suggestion that the Mimivirus/Megavirus lineage evolved from an ancestral cellular genome by reductive evolution. The main differences in gene content between Mimivirus and Megavirus genomes are due to (i) lineage-specific gains or losses of genes, (ii) lineage-specific gene family expansion or deletion, and (iii) the insertion/migration of mobile elements (intron, intein). PMID:21987820
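The proportions quoted in this abstract can be reproduced with simple arithmetic. The sketch below is illustrative only: the helper `percent` and the variable names are assumptions introduced here, and the only inputs are the counts stated in the abstract.

```python
# Counts copied from the abstract for Megavirus chilensis.
MEGAVIRUS_PROTEINS = 1_120      # putative proteins encoded by the genome
NO_MIMIVIRUS_HOMOLOG = 258      # proteins without a Mimivirus homolog
ORTHOLOG_PAIRS = 594            # Megavirus/Mimivirus orthologs


def percent(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


no_homolog_pct = percent(NO_MIMIVIRUS_HOMOLOG, MEGAVIRUS_PROTEINS)
with_homolog = MEGAVIRUS_PROTEINS - NO_MIMIVIRUS_HOMOLOG
ortholog_pct = percent(ORTHOLOG_PAIRS, MEGAVIRUS_PROTEINS)

print(f"proteins without a Mimivirus homolog: {no_homolog_pct:.1f}%")
print(f"proteins with a Mimivirus homolog: {with_homolog}")
print(f"proteins with a Mimivirus ortholog: {ortholog_pct:.1f}%")
```

The first figure rounds to the 23% given in the abstract. The gap between proteins with any homolog and those counted as orthologs is a derived observation, not a number the abstract reports.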
Khan, Muhammad Sarwar; Hameed, Waqar; Nozoe, Mikio; Shiina, Takashi
2007-05-01
The functional analysis of genes encoded by the chloroplast genome of tobacco by reverse genetics is routine. Nevertheless, for a small number of genes their deletion generates heteroplasmic genotypes, complicating their analysis. There is thus the need for additional strategies to develop deletion mutants for these genes. We have developed a homologous copy correction-based strategy for deleting/mutating genes encoded on the chloroplast genome. This system was used to produce psbA knockouts. The resulting plants are homoplasmic and lack photosystem II (PSII) activity. Further, the deletion mutants exhibit a distinct phenotype; young leaves are green, whereas older leaves are bleached, irrespective of light conditions. This suggests that senescence is promoted by the absence of psbA. Analysis of the transcript levels indicates that NEP (nuclear-encoded plastid RNA polymerase)-dependent plastid genes are up regulated in the psbA deletion mutants, whereas the bleached leaves retain plastid-encoded plastid RNA polymerase activity. Hence, the expression of NEP-dependent plastid genes may be regulated by photosynthesis, either directly or indirectly.
Frazier, Courtney L.; San Filippo, Joseph; Lambowitz, Alan M.; Mills, David A.
2003-01-01
Despite their commercial importance, there are relatively few facile methods for genomic manipulation of the lactic acid bacteria. Here, the lactococcal group II intron, Ll.ltrB, was targeted to insert efficiently into genes encoding malate decarboxylase (mleS) and tetracycline resistance (tetM) within the Lactococcus lactis genome. Integrants were readily identified and maintained in the absence of a selectable marker. Since splicing of the Ll.ltrB intron depends on the intron-encoded protein, targeted invasion with an intron lacking the intron open reading frame disrupted TetM and MleS function, and MleS activity could be partially restored by expressing the intron-encoded protein in trans. Restoration of splicing from intron variants lacking the intron-encoded protein illustrates how targeted group II introns could be used for conditional expression of any gene. Furthermore, the modified Ll.ltrB intron was used to separately deliver a phage resistance gene (abiD) and a tetracycline resistance marker (tetM) into mleS, without the need for selection to drive the integration or to maintain the integrant. Our findings demonstrate the utility of targeted group II introns as a potential food-grade mechanism for delivery of industrially important traits into the genomes of lactococci. PMID:12571038
The Tomato Terpene Synthase Gene Family
Falara, Vasiliki; Akhtar, Tariq A.; Nguyen, Thuong T.H.; Spyropoulou, Eleni A.; Bleeker, Petra M.; Schauvinhold, Ines; Matsuba, Yuki; Bonini, Megan E.; Schilmiller, Anthony L.; Last, Robert L.; Schuurink, Robert C.; Pichersky, Eran
2011-01-01
Compounds of the terpenoid class play numerous roles in the interactions of plants with their environment, such as attracting pollinators and defending the plant against pests. We show here that the genome of cultivated tomato (Solanum lycopersicum) contains 44 terpene synthase (TPS) genes, including 29 that are functional or potentially functional. Of these 29 TPS genes, 26 were expressed in at least some organs or tissues of the plant. The enzymatic functions of eight of the TPS proteins were previously reported, and here we report the specific in vitro catalytic activity of 10 additional tomato terpene synthases. Many of the tomato TPS genes are found in clusters, notably on chromosomes 1, 2, 6, 8, and 10. All TPS family clades previously identified in angiosperms are also present in tomato. The largest clade of functional TPS genes found in tomato, with 12 members, is the TPS-a clade, and it appears to encode only sesquiterpene synthases, one of which is localized to the mitochondria, while the rest are likely cytosolic. A few additional sesquiterpene synthases are encoded by TPS-b clade genes. Some of the tomato sesquiterpene synthases use z,z-farnesyl diphosphate in vitro as well as, or more efficiently than, the e,e-farnesyl diphosphate substrate. Genes encoding monoterpene synthases are also prevalent, and they fall into three clades: TPS-b, TPS-g, and TPS-e/f. With the exception of two enzymes involved in the synthesis of ent-kaurene, the precursor of gibberellins, no other tomato TPS genes could be demonstrated to encode diterpene synthases so far. PMID:21813655
Cummins, Joanne; Casey, Pat G.; Joyce, Susan A.; Gahan, Cormac G. M.
2013-01-01
Listeria monocytogenes is a Gram-positive foodborne pathogen and the causative agent of listeriosis, a disease that manifests predominantly as meningitis in the non-pregnant individual or infection of the fetus and spontaneous abortion in pregnant women. Common-source outbreaks of foodborne listeriosis are associated with significant morbidity and mortality. However, relatively little is known concerning the mechanisms that govern infection via the oral route. In order to aid functional genetic analysis of the gastrointestinal phase of infection we designed a novel signature-tagged mutagenesis (STM) system based upon the invasive L. monocytogenes 4b serotype H7858 strain. To overcome the limitations of gastrointestinal infection by L. monocytogenes in the mouse model we created a H7858 strain that is genetically optimised for oral infection in mice. Furthermore, our STM system was based upon a mariner transposon to favour numerous and random transposition events throughout the L. monocytogenes genome. Use of the STM bank to investigate oral infection by L. monocytogenes identified 21 insertion mutants that demonstrated significantly reduced potential for infection in our model. The sites of transposon insertion included lmOh7858_0671 (encoding an internalin homologous to Lmo0610), lmOh7858_0898 (encoding a putative surface-expressed LPXTG protein homologous to Lmo0842), lmOh7858_2579 (encoding the HupDGC hemin transport system) and lmOh7858_0399 (encoding a putative fructose specific phosphotransferase system). We propose that this represents an optimised STM system for functional genetic analysis of foodborne/oral infection by L. monocytogenes. PMID:24069416
Organization of the murine Cd22 locus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Law, Che-Leung; Torres, R.M.; Sundeberg, H.A.
1993-07-01
Murine CD22 (mCD22) is a B cell-associated adhesion protein with seven extracellular Ig-like domains that has 62% amino acid identity to its human homologue. Southern analysis on genomic DNA isolated from tissues and cell lines from several mouse strains using mCD22 cDNA demonstrated that the Cd22 locus encoding mCD22 is a single copy gene of ≤30 kb. Digestion of genomic DNA preparations with four restriction endonucleases revealed the presence of restriction fragment length polymorphisms (RFLP) in BALB/c, C57BL/6, and C3H strains vs DBA/2j, NZB, and NZC strains, suggesting the presence of two or more Cd22 alleles. Using a mCD22 cDNA clone derived from the BALB/c strain, the authors isolated genomic clones from a DBA/2 genomic library that contained all the exons necessary to encode the full length mCD22 cDNA. Fifteen exons, including exon 3 that encodes the translation start codon, were identified. Each extracellular Ig-like domain of mCD22 is encoded by a single exon. A comparison between the nucleotide sequences of the BALB/c CD22 cDNA and the exons of the DBA/2j CD22 genomic clones revealed an 18-nucleotide deletion in exon 4 (encoding the most distal Ig-like domain 1 of mCD22) of the DBA/2j genomic sequence in addition to a number of substitutions, insertions, and deletions in other exons. These nucleotide differences were also present in a cDNA clone isolated from total RNA of LPS-activated DBA/2j splenocytes. [...] chromosome 7, a region syntenic to human chromosome 19q, close to the previously reported loci, Lyb-8 and Mag (a homologue of Cd22). An antibody (CY34) against the Lyb-8.2 B cell marker reacted with a BHK transfectant expressing the full length mCd22 cDNA, thus demonstrating that Lyb-8 and Cd22 loci are identical. Furthermore, a rat anti-mCD22 mAb, NIM-R6, bound to sIgM+ DBA/2j B cells, confirming the expression of a CD22 protein by the Cd22a/Lyb-8a allele. 63 refs., 7 figs., 1 tab.
Drissi, F; Merhej, V; Angelakis, E; El Kaoutari, A; Carrière, F; Henrissat, B; Raoult, D
2014-01-01
BACKGROUND: Some Lactobacillus species are associated with obesity and weight gain while others are associated with weight loss. Lactobacillus spp. and bifidobacteria represent a major bacterial population of the small intestine where lipids and simple carbohydrates are absorbed, particularly in the duodenum and jejunum. The objective of this study was to identify Lactobacillus spp. proteins involved in carbohydrate and lipid metabolism associated with weight modifications. METHODS: We examined a total of 13 complete genomes belonging to seven different Lactobacillus spp. previously associated with weight gain or weight protection. We combined the data obtained from the Rapid Annotation using Subsystem Technology, Batch CD-Search and Gene Ontology to classify gene function in each genome. RESULTS: We observed major differences between the two groups of genomes. Weight gain-associated Lactobacillus spp. appear to lack enzymes involved in the catabolism of fructose, defense against oxidative stress and the synthesis of dextrin, L-rhamnose and acetate. Weight protection-associated Lactobacillus spp. encoded a significant number of glucose permease genes. Regarding lipid metabolism, thiolases were only encoded in the genome of weight gain-associated Lactobacillus spp. In addition, we identified 18 different types of bacteriocins in the studied genomes, and weight gain-associated Lactobacillus spp. encoded more bacteriocins than weight protection-associated Lactobacillus spp. CONCLUSIONS: The results of this study revealed that weight protection-associated Lactobacillus spp. have developed defense mechanisms for enhanced glycolysis and defense against oxidative stress. Weight gain-associated Lactobacillus spp. possess a limited ability to break down fructose or glucose and might reduce ileal brake effects. PMID:24567124
Woolford, Lucy; Rector, Annabel; Van Ranst, Marc; Ducki, Andrea; Bennett, Mark D.; Nicholls, Philip K.; Warren, Kristin S.; Swan, Ralph A.; Wilcox, Graham E.; O'Hara, Amanda J.
2007-01-01
Conservation efforts to prevent the extinction of the endangered western barred bandicoot (Perameles bougainville) are currently hindered by a progressively debilitating cutaneous and mucocutaneous papillomatosis and carcinomatosis syndrome observed in captive and wild populations. In this study, we detected a novel virus, designated the bandicoot papillomatosis carcinomatosis virus type 1 (BPCV1), in lesional tissue from affected western barred bandicoots using multiply primed rolling-circle amplification and PCR with the cutaneotropic papillomavirus primer pairs FAP59/FAP64 and AR-L1F8/AR-L1R9. Sequencing of the BPCV1 genome revealed a novel prototype virus exhibiting genomic properties of both the Papillomaviridae and the Polyomaviridae. Papillomaviral properties included a large genome size (∼7.3 kb) and the presence of open reading frames (ORFs) encoding canonical L1 and L2 structural proteins. The genomic organization in which structural and nonstructural proteins were encoded on different strands of the double-stranded genome and the presence of ORFs encoding the nonstructural proteins large T and small t antigens were, on the other hand, typical polyomaviral features. BPCV1 may represent the first member of a novel virus family, descended from a common ancestor of the papillomaviruses and polyomaviruses recognized today. Alternatively, it may represent the product of ancient recombination between members of these two virus families. The discovery of this virus could have implications for the current taxonomic classification of Papillomaviridae and Polyomaviridae and can provide further insight into the evolution of these ancient virus families. PMID:17898069
Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian
2013-01-01
The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence. PMID:23357949
Oka, Tomoichiro; Doan, Yen Hai; Shimoike, Takashi; Haga, Kei; Takizawa, Takenori
2017-12-01
Sapoviruses (SaVs) are enteric viruses and have been detected in various mammals. They are divided into multiple genogroups and genotypes based on the entire major capsid protein (VP1) encoding region sequences. In this study, we determined the first complete genome sequences of two genogroup V, genotype 3 (GV.3) SaV strains detected from swine fecal samples, in combination with Illumina MiSeq sequencing of the libraries prepared from viral RNA and PCR products. The lengths of the viral genome (7494 nucleotides [nt] excluding polyA tail) and short 5'-untranslated region (14 nt) as well as two predicted open reading frames are similar to those of other SaVs. The amino acid differences between the two porcine SaVs are most frequent in the central region of the VP1-encoding region. A stem-loop structure which was predicted in the first 41 nt of the 5'-terminal region of GV.3 SaVs and the other available complete genome sequences of SaVs may have a critical role in viral genome replication. Our study provides complete genome sequences of rarely reported GV.3 SaV strains and highlights the common 5'-terminal genomic feature of SaVs detected from different mammalian species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang
The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP-encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.
Gómez-Lunar, Zulema; Hernández-González, Ismael; Rodríguez-Torres, María-Dolores; Souza, Valeria; Olmedo-Álvarez, Gabriela
2016-01-01
Bacterial genomes undergo numerous events of gene losses and gains that generate genome variability among strains of the same species (microevolution). Our aim was to compare the genomes and relevant phenotypes of three Bacillus coahuilensis strains from two oligotrophic hydrological systems in the Cuatro Ciénegas Basin (México), to unveil the environmental challenges that this species copes with, and the microevolutionary differences in these genotypes. Since the strains were isolated from a low P environment, we placed emphasis on the search of different phosphorus acquisition strategies. The three B. coahuilensis strains exhibited similar numbers of coding DNA sequences, of which 82% (2,893) constituted the core genome, and 18% corresponded to accessory genes. Most of the genes in this last group were associated with mobile genetic elements (MGEs) or were annotated as hypothetical proteins. Ten percent of the pangenome consisted of strain-specific genes. Alignment of the three B. coahuilensis genomes indicated a high level of synteny and revealed the presence of several genomic islands. Unexpectedly, one of these islands contained genes that encode the 2-keto-3-deoxymannooctulosonic acid (Kdo) biosynthesis enzymes, a feature associated with cell walls of Gram-negative bacteria. Some microevolutionary changes were clearly associated with MGEs. Our analysis revealed inconsistencies between phenotype and genotype, which we suggest result from the impossibility to map regulatory features to genome analysis. Experimental results revealed variability in the types and numbers of auxotrophies between the strains that could not consistently be explained by in silico metabolic models. Several intraspecific differences in preferences for carbohydrate and phosphorus utilization were observed. Regarding phosphorus recycling, scavenging, and storage, variations were found between the three genomes.
The three strains exhibited differences regarding alkaline phosphatase that revealed that in addition to gene gain and loss, regulation adjustment of gene expression also has contributed to the intraspecific diversity of B. coahuilensis.
Miyamoto, Hiroshi; Endo, Hirotoshi; Hashimoto, Naoki; Limura, Kurin; Isowa, Yukinobu; Kinoshita, Shigeharu; Kotaki, Tomohiro; Masaoka, Tetsuji; Miki, Takumi; Nakayama, Seiji; Nogawa, Chihiro; Notazawa, Atsuto; Ohmori, Fumito; Sarashina, Isao; Suzuki, Michio; Takagi, Ryousuke; Takahashi, Jun; Takeuchi, Takeshi; Yokoo, Naoki; Satoh, Nori; Toyohara, Haruhiko; Miyashita, Tomoyuki; Wada, Hiroshi; Samata, Tetsuro; Endo, Kazuyoshi; Nagasawa, Hiromichi; Asakawa, Shuichi; Watabe, Shugo
2013-10-01
In molluscs, shell matrix proteins are associated with biomineralization, a biologically controlled process that involves nucleation and growth of calcium carbonate crystals. Identification and characterization of shell matrix proteins are important for better understanding of the adaptive radiation of a large variety of molluscs. We searched the draft genome sequence of the pearl oyster Pinctada fucata and annotated 30 different kinds of shell matrix proteins. Of these, we identified Perlucin, ependymin-related protein and SPARC as common genes shared by bivalves and gastropods; however, most gastropod shell matrix proteins were not found in the P. fucata genome. Glycine-rich proteins were conserved in the genus Pinctada. Another important finding with regard to these annotated genes was that numerous shell matrix proteins are encoded by more than one gene; e.g., three ACCBP-like proteins, three CaLPs, five chitin synthase-like proteins, two N16 proteins (pearlins), 10 N19 proteins, two nacreins, four Pifs, nine shematrins, two prismalin-14 proteins, and 21 tyrosinases. This diversity of shell matrix proteins may be implicated in the morphological diversity of mollusc shells. The annotated genes reported here can be searched in P. fucata gene models version 1.1 and genome assembly version 1.0 ( http://marinegenomics.oist.jp/pinctada_fucata ). These genes should provide a useful resource for studies of the genetic basis of biomineralization and evaluation of the role of shell matrix proteins as an evolutionary toolkit among the molluscs.
Cui, Hongguang; Hong, Ni; Wang, Guoping; Wang, Aiming
2013-05-01
Prunus necrotic ringspot virus (PNRSV) affects Prunus fruit production worldwide. To date, numerous PNRSV isolates with diverse pathological properties have been documented. To study the pathogenicity of PNRSV, which directly or indirectly determines the economic losses of infected fruit trees, we have recently sequenced the complete genome of peach isolate Pch12 and cherry isolate Chr3, belonging to the pathogenically aggressive PV32 group and mild PV96 group, respectively. Here, we constructed the Chr3- and Pch12-derived full-length cDNA clones that were infectious in the experimental host cucumber and their respective natural Prunus hosts. Pch12-derived clones induced much more severe symptoms than Chr3 in cucumber, and the pathogenicity discrepancy between Chr3 and Pch12 was associated with virus accumulation. By reassortment of genomic segments, swapping of partial genomic segments, and site-directed mutagenesis, we identified the 3' terminal nucleotide sequence (1C region) in RNA1 and amino acid K at residue 279 in RNA2-encoded P2 as the severe virulence determinants in Pch12. Gain-of-function experiments demonstrated that both the 1C region and K279 of Pch12 were required for severe virulence and high levels of viral accumulation. Our results suggest that PNRSV RNA1 and RNA2 codetermine viral pathogenicity to adapt to alternating natural Prunus hosts, likely through mediating viral accumulation.
Complete genome sequence of Paris mosaic necrosis virus, a distinct member of the genus Potyvirus
USDA-ARS's Scientific Manuscript database
The complete genomic sequence of a novel potyvirus was determined from Paris polyphylla var. yunnanensis. Its genomic RNA consists of 9,660 nucleotides (nt) excluding the 3’-terminal poly (A) tail, containing a single open reading frame (ORF) encoding a large polyprotein. The virus shares 52.1-69.7%...
Whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1 from California
USDA-ARS's Scientific Manuscript database
The draft whole-genome sequence of “Candidatus Liberibacter solanacearum” strain R1, isolated from a tomato plant in California, United States, is reported. The R1 strain genome is 1,204,257 bp in size (G+C content of 35.3%), encoding 1,101 open reading frames and 57 RNA genes....
Complete Genome Sequence of Staphylococcus epidermidis 1457.
Galac, Madeline R; Stam, Jason; Maybank, Rosslyn; Hinkle, Mary; Mack, Dietrich; Rohde, Holger; Roth, Amanda L; Fey, Paul D
2017-06-01
Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. Copyright © 2017 Galac et al.
USDA-ARS's Scientific Manuscript database
The complete genome sequence of a virus recently detected in switchgrass (Panicum virgatum) was determined and was found to be closely related to Maize rayado fino virus (MRFV), genus Marafivirus, family Tymoviridae. The genomic RNA is 6408 nucleotides long, excluding the poly (A) tail, and encodes...
Carbohydrate metabolism genes and pathways in insects: insights from the honey bee genome
Kunieda, T; Fujiyuki, T; Kucharski, R; Foret, S; Ament, S A; Toth, A L; Ohashi, K; Takeuchi, H; Kamikouchi, A; Kage, E; Morioka, M; Beye, M; Kubo, T; Robinson, G E; Maleszka, R
2006-01-01
Carbohydrate-metabolizing enzymes may have particularly interesting roles in the honey bee, Apis mellifera, because this social insect has an extremely carbohydrate-rich diet, and nutrition plays important roles in caste determination and socially mediated behavioural plasticity. We annotated a total of 174 genes encoding carbohydrate-metabolizing enzymes and 28 genes encoding lipid-metabolizing enzymes, based on orthology to their counterparts in the fly, Drosophila melanogaster, and the mosquito, Anopheles gambiae. We found that the number of genes for carbohydrate metabolism appears to be more evolutionarily labile than for lipid metabolism. In particular, we identified striking changes in gene number or genomic organization for genes encoding glycolytic enzymes, cellulase, glucose oxidase and glucose dehydrogenases, glucose-methanol-choline (GMC) oxidoreductases, fucosyltransferases, and lysozymes. PMID:17069632
Plastid-Nuclear Interaction and Accelerated Coevolution in Plastid Ribosomal Genes in Geraniaceae.
Weng, Mao-Lun; Ruhlman, Tracey A; Jansen, Robert K
2016-06-27
Plastids and mitochondria have many protein complexes that include subunits encoded by organelle and nuclear genomes. In animal cells, compensatory evolution between mitochondrial and nuclear-encoded subunits was identified and the high mitochondrial mutation rates were hypothesized to drive compensatory evolution in nuclear genomes. In plant cells, compensatory evolution between plastid and nucleus has rarely been investigated in a phylogenetic framework. To investigate plastid-nuclear coevolution, we focused on plastid ribosomal protein genes that are encoded by plastid and nuclear genomes from 27 Geraniales species. Substitution rates were compared for five sets of genes representing plastid- and nuclear-encoded ribosomal subunit proteins targeted to the cytosol or the plastid as well as nonribosomal protein controls. We found that nonsynonymous substitution rates (dN) and the ratios of nonsynonymous to synonymous substitution rates (ω) were accelerated in both plastid- (CpRP) and nuclear-encoded subunits (NuCpRP) of the plastid ribosome relative to control sequences. Our analyses revealed strong signals of cytonuclear coevolution between plastid- and nuclear-encoded subunits, in which nonsynonymous substitutions in CpRP and NuCpRP tend to occur along the same branches in the Geraniaceae phylogeny. This coevolution pattern cannot be explained by physical interaction between amino acid residues. The forces driving accelerated coevolution varied with cellular compartment of the sequence. Increased ω in CpRP was mainly due to intensified positive selection whereas increased ω in NuCpRP was caused by relaxed purifying selection. In addition, the many indels identified in plastid rRNA genes in Geraniaceae may have contributed to changes in plastid subunits. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
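The ratio ω used in the abstract above is the nonsynonymous rate divided by the synonymous rate (dN/dS). The short Python sketch below only illustrates that arithmetic and a comparison against a control gene set; the function names and all numbers are hypothetical and are not taken from the study. A single elevated ratio cannot by itself separate positive selection from relaxed purifying selection, which the abstract reports as distinct causes.

```python
def omega(dn, ds):
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds <= 0:
        return None
    return dn / ds


def fold_change_vs_control(gene_omega, control_omega):
    """How many times larger a gene's omega is than the control's omega."""
    return gene_omega / control_omega


# Hypothetical per-branch rates (made-up values, for illustration only).
plastid_rp = omega(dn=0.030, ds=0.100)   # plastid ribosomal protein gene
control = omega(dn=0.010, ds=0.100)      # nonribosomal control gene

print(plastid_rp, control, fold_change_vs_control(plastid_rp, control))
```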
Large-scale, multi-genome analysis of alternate open reading frames in bacteria and archaea.
Veloso, Felipe; Riadi, Gonzalo; Aliaga, Daniela; Lieph, Ryan; Holmes, David S
2005-01-01
Analysis of over 300,000 annotated genes in 105 bacterial and archaeal genomes reveals an unexpectedly high frequency of large (>300 nucleotides) alternate open reading frames (ORFs). Especially notable is the very high frequency of alternate ORFs in frames +3 and -1 (where the annotated gene is defined as frame +1). The occurrence of alternate ORFs is correlated with genomic G+C content and is strongly influenced by synonymous codon usage bias. The frequency of alternate ORFs in frame -1 is also influenced by the occurrence of codons encoding leucine and serine in frame +1. Although some alternate ORFs have been shown to encode proteins, many others are probably not expressed because they lack appropriate signals for transcription and translation. These latter can be mis-annotated by automatic gene finding programs leading to errors in public databases. Especially prone to mis-annotation is frame -1, because it exhibits a potential codon usage and theoretical capacity to encode proteins with an amino acid composition most similar to real genes. Some alternate ORFs are conserved across bacterial or archaeal species, and can give rise to misannotated "conserved hypothetical" genes, while others are unique to a genome and are misidentified as "hypothetical orphan" genes, contributing significantly to the orphan gene paradox.
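As a rough illustration of what scanning for large alternate ORFs involves, the Python sketch below searches all six reading frames of a sequence for ATG-to-stop stretches longer than a length threshold. The function names, the ATG-only start rule, and the frame labels (+1 to +3 by offset on the given strand, -1 to -3 by offset on the reverse complement) are assumptions made for this example; the study defines frames relative to each annotated gene, and its exact convention and ORF definition are not reproduced here.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset, min_len):
    """Yield (start, end) for ATG..stop ORFs longer than min_len nt in one frame."""
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # first start codon opens the ORF
        elif start is not None and codon in STOPS:
            end = i + 3                    # include the stop codon
            if end - start > min_len:
                yield (start, end)
            start = None                   # look for the next ORF


def six_frame_orfs(seq, min_len=300):
    """Map frame label -> list of ORF coordinates on that strand."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    result = {}
    for k in range(3):
        result[f"+{k + 1}"] = list(orfs_in_frame(seq, k, min_len))
        result[f"-{k + 1}"] = list(orfs_in_frame(rc, k, min_len))
    return result
```

Coordinates for the minus frames are reported on the reverse-complemented string, a simplification chosen to keep the sketch short.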
Torres-Cortés, Gloria; Ghignone, Stefano; Bonfante, Paola; Schüßler, Arthur
2015-06-23
For more than 450 million years, arbuscular mycorrhizal fungi (AMF) have formed intimate, mutualistic symbioses with the vast majority of land plants and are major drivers in almost all terrestrial ecosystems. The obligate plant-symbiotic AMF host additional symbionts, so-called Mollicutes-related endobacteria (MRE). To uncover putative functional roles of these widespread but yet enigmatic MRE, we sequenced the genome of DhMRE living in the AMF Dentiscutata heterogama. Multilocus phylogenetic analyses showed that MRE form a previously unidentified lineage sister to the hominis group of Mycoplasma species. DhMRE possesses a strongly reduced metabolic capacity with 55% of the proteins having unknown function, which reflects unique adaptations to an intracellular lifestyle. We found evidence for transkingdom gene transfer between MRE and their AMF host. At least 27 annotated DhMRE proteins show similarities to nuclear-encoded proteins of the AMF Rhizophagus irregularis, which itself lacks MRE. Nuclear-encoded homologs could moreover be identified for another AMF, Gigaspora margarita, and surprisingly, also the non-AMF Mortierella verticillata. Our data indicate a possible origin of the MRE-fungus association in ancestors of the Glomeromycota and Mucoromycotina. The DhMRE genome encodes an arsenal of putative regulatory proteins with eukaryotic-like domains, some of them encoded in putative genomic islands. MRE are highly interesting candidates to study the evolution and interactions between an ancient, obligate endosymbiotic prokaryote with its obligate plant-symbiotic fungal host. Our data moreover may be used for further targeted searches for ancient effector-like proteins that may be key components in the regulation of the arbuscular mycorrhiza symbiosis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fernandez-Fueyo, Elena; Ruiz-Duenas, Francisco J.; Ferreira, Patricia
Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.
Fernandez-Fueyo, Elena; Ruiz-Dueñas, Francisco J.; Ferreira, Patricia; Floudas, Dimitrios; Hibbett, David S.; Canessa, Paulo; Larrondo, Luis F.; James, Tim Y.; Seelenfreund, Daniela; Lobos, Sergio; Polanco, Rubén; Tello, Mario; Honda, Yoichi; Watanabe, Takahito; Watanabe, Takashi; Ryu, Jae San; Kubicek, Christian P.; Schmoll, Monika; Gaskell, Jill; Hammel, Kenneth E.; St. John, Franz J.; Vanden Wymelenberg, Amber; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit S.; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Lavín, José L.; Oguiza, José A.; Perez, Gumer; Pisabarro, Antonio G.; Ramirez, Lucia; Santoyo, Francisco; Master, Emma; Coutinho, Pedro M.; Henrissat, Bernard; Lombard, Vincent; Magnuson, Jon Karl; Kües, Ursula; Hori, Chiaki; Igarashi, Kiyohiko; Samejima, Masahiro; Held, Benjamin W.; Barry, Kerrie W.; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Riley, Robert; Salamov, Asaf A.; Hoffmeister, Dirk; Schwenk, Daniel; Hadar, Yitzhak; Yarden, Oded; de Vries, Ronald P.; Wiebenga, Ad; Stenlid, Jan; Eastwood, Daniel; Grigoriev, Igor V.; Berka, Randy M.; Blanchette, Robert A.; Kersten, Phil; Martinez, Angel T.; Vicuna, Rafael; Cullen, Dan
2012-01-01
Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium. PMID:22434909
Solving traveling salesman problems with DNA molecules encoding numerical values.
Lee, Ji Youn; Shin, Soo-Yong; Park, Tai Hyun; Zhang, Byoung-Tak
2004-12-01
We introduce a DNA encoding method to represent numerical values and a biased molecular algorithm based on the thermodynamic properties of DNA. DNA strands are designed to encode real values by variation of their melting temperatures. The thermodynamic properties of DNA are used for effective local search of optimal solutions using biochemical techniques, such as denaturation temperature gradient polymerase chain reaction and temperature gradient gel electrophoresis. The proposed method was successfully applied to the traveling salesman problem, an instance of optimization problems on weighted graphs. This work extends the capability of DNA computing to solving numerical optimization problems, which is contrasted with other DNA computing methods focusing on logical problem solving.
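To make the idea of encoding a number in a strand's melting temperature more concrete, the Python sketch below estimates the melting temperature of short oligonucleotides with the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in degrees Celsius, and ranks candidate strands by it. This rule is a standard rough approximation for short oligos and is swapped in here purely for illustration; the abstract does not say which thermodynamic model the authors used, and the function names are hypothetical.

```python
def wallace_tm(seq):
    """Approximate melting temperature (deg C) of a short oligo, Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def rank_by_tm(candidates):
    """Sort candidate strands from lowest to highest estimated Tm."""
    return sorted(candidates, key=wallace_tm)


# G+C-rich strands melt at higher temperatures than A+T-rich ones of equal length.
for strand in rank_by_tm(["GCGCGCGC", "ATATATAT", "ATGCATGC"]):
    print(strand, wallace_tm(strand))
```

How a particular numerical value is assigned to a strand with a given melting temperature is the paper's design choice and is not assumed here.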
The genome of the sea urchin Strongylocentrotus purpuratus.
Sodergren, Erica; Weinstock, George M; Davidson, Eric H; Cameron, R Andrew; Gibbs, Richard A; Angerer, Robert C; Angerer, Lynne M; Arnone, Maria Ina; Burgess, David R; Burke, Robert D; Coffman, James A; Dean, Michael; Elphick, Maurice R; Ettensohn, Charles A; Foltz, Kathy R; Hamdoun, Amro; Hynes, Richard O; Klein, William H; Marzluff, William; McClay, David R; Morris, Robert L; Mushegian, Arcady; Rast, Jonathan P; Smith, L Courtney; Thorndyke, Michael C; Vacquier, Victor D; Wessel, Gary M; Wray, Greg; Zhang, Lan; Elsik, Christine G; Ermolaeva, Olga; Hlavina, Wratko; Hofmann, Gretchen; Kitts, Paul; Landrum, Melissa J; Mackey, Aaron J; Maglott, Donna; Panopoulou, Georgia; Poustka, Albert J; Pruitt, Kim; Sapojnikov, Victor; Song, Xingzhi; Souvorov, Alexandre; Solovyev, Victor; Wei, Zheng; Whittaker, Charles A; Worley, Kim; Durbin, K James; Shen, Yufeng; Fedrigo, Olivier; Garfield, David; Haygood, Ralph; Primus, Alexander; Satija, Rahul; Severson, Tonya; Gonzalez-Garay, Manuel L; Jackson, Andrew R; Milosavljevic, Aleksandar; Tong, Mark; Killian, Christopher E; Livingston, Brian T; Wilt, Fred H; Adams, Nikki; Bellé, Robert; Carbonneau, Seth; Cheung, Rocky; Cormier, Patrick; Cosson, Bertrand; Croce, Jenifer; Fernandez-Guerra, Antonio; Genevière, Anne-Marie; Goel, Manisha; Kelkar, Hemant; Morales, Julia; Mulner-Lorillon, Odile; Robertson, Anthony J; Goldstone, Jared V; Cole, Bryan; Epel, David; Gold, Bert; Hahn, Mark E; Howard-Ashby, Meredith; Scally, Mark; Stegeman, John J; Allgood, Erin L; Cool, Jonah; Judkins, Kyle M; McCafferty, Shawn S; Musante, Ashlan M; Obar, Robert A; Rawson, Amanda P; Rossetti, Blair J; Gibbons, Ian R; Hoffman, Matthew P; Leone, Andrew; Istrail, Sorin; Materna, Stefan C; Samanta, Manoj P; Stolc, Viktor; Tongprasit, Waraporn; Tu, Qiang; Bergeron, Karl-Frederik; Brandhorst, Bruce P; Whittle, James; Berney, Kevin; Bottjer, David J; Calestani, Cristina; Peterson, Kevin; Chow, Elly; Yuan, Qiu Autumn; Elhaik, Eran; Graur, Dan; Reese, Justin T; Bosdet, Ian; Heesun, Shin; Marra, Marco A; Schein, Jacqueline; Anderson, Michele K; Brockton, Virginia; Buckley, Katherine M; Cohen, Avis H; Fugmann, Sebastian D; Hibino, Taku; Loza-Coll, Mariano; Majeske, Audrey J; Messier, Cynthia; Nair, Sham V; Pancer, Zeev; Terwilliger, David P; Agca, Cavit; Arboleda, Enrique; Chen, Nansheng; Churcher, Allison M; Hallböök, F; Humphrey, Glen W; Idris, Mohammed M; Kiyama, Takae; Liang, Shuguang; Mellott, Dan; Mu, Xiuqian; Murray, Greg; Olinski, Robert P; Raible, Florian; Rowe, Matthew; Taylor, John S; Tessmar-Raible, Kristin; Wang, D; Wilson, Karen H; Yaguchi, Shunsuke; Gaasterland, Terry; Galindo, Blanca E; Gunaratne, Herath J; Juliano, Celina; Kinukawa, Masashi; Moy, Gary W; Neill, Anna T; Nomura, Mamoru; Raisch, Michael; Reade, Anna; Roux, Michelle M; Song, Jia L; Su, Yi-Hsien; Townley, Ian K; Voronina, Ekaterina; Wong, Julian L; Amore, Gabriele; Branno, Margherita; Brown, Euan R; Cavalieri, Vincenzo; Duboc, Véronique; Duloquin, Louise; Flytzanis, Constantin; Gache, Christian; Lapraz, François; Lepage, Thierry; Locascio, Annamaria; Martinez, Pedro; Matassi, Giorgio; Matranga, Valeria; Range, Ryan; Rizzo, Francesca; Röttinger, Eric; Beane, Wendy; Bradham, Cynthia; Byrum, Christine; Glenn, Tom; Hussain, Sofia; Manning, Gerard; Miranda, Esther; Thomason, Rebecca; Walton, Katherine; Wikramanayke, Athula; Wu, Shu-Yu; Xu, Ronghui; Brown, C Titus; Chen, Lili; Gray, Rachel F; Lee, Pei Yun; Nam, Jongmin; Oliveri, Paola; Smith, Joel; Muzny, Donna; Bell, Stephanie; Chacko, Joseph; Cree, Andrew; Curry, Stacey; Davis, Clay; Dinh, Huyen; Dugan-Rocha, Shannon; Fowler, Jerry; Gill, Rachel; Hamilton, Cerrissa; Hernandez, Judith; Hines, Sandra; Hume, Jennifer; Jackson, Laronda; Jolivet, Angela; Kovar, Christie; Lee, Sandra; Lewis, Lora; Miner, George; Morgan, Margaret; Nazareth, Lynne V; Okwuonu, Geoffrey; Parker, David; Pu, Ling-Ling; Thorn, Rachel; Wright, Rita
2006-11-10
We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Comparative genomics of Lactobacillus
Kant, Ravi; Blom, Jochen; Palva, Airi; Siezen, Roland J.; de Vos, Willem M.
2011-01-01
The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes had sizes varying from 1.8 to 3.3 Mb and other characteristic features, such as G+C content that ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein‐encoding genes while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared in all genomes of one group but absent in all other Lactobacillus genomes, and Group‐specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for present individual comparisons as well as future analysis of new Lactobacillus genomes. PMID:21375712
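The pan genome, core genome, and signature group genes described above can be expressed as set operations once every gene has been assigned to an orthologous group. The Python sketch below shows that bookkeeping on toy data; the strain names, group identifiers, and function names are hypothetical, and the orthology assignment itself (the hard part of such a study) is assumed to have been done already.

```python
def pan_and_core(genomes):
    """genomes maps strain name -> set of orthologous group IDs.

    Returns (pan, core): groups found in any genome, and groups found in all.
    """
    sets = list(genomes.values())
    pan = set().union(*sets)
    core = set.intersection(*sets) if sets else set()
    return pan, core


def signature_genes(genomes, group):
    """Groups present in every strain of `group` (non-empty) and in no other strain."""
    inside = [genomes[s] for s in group]
    outside = [g for s, g in genomes.items() if s not in group]
    shared = set.intersection(*inside)
    others = set().union(*outside)
    return shared - others


# Toy input: three made-up strains and five made-up orthologous groups.
toy = {
    "strainA": {"og1", "og2", "og3"},
    "strainB": {"og1", "og2", "og4"},
    "strainC": {"og1", "og5"},
}
pan, core = pan_and_core(toy)
print(sorted(pan), sorted(core), sorted(signature_genes(toy, ["strainA", "strainB"])))
```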
Laurie, John D.; Ali, Shawkat; Linning, Rob; Mannhaupt, Gertrud; Wong, Philip; Güldener, Ulrich; Münsterkötter, Martin; Moore, Richard; Kahmann, Regine; Bakkeren, Guus; Schirawski, Jan
2012-01-01
Ustilago hordei is a biotrophic parasite of barley (Hordeum vulgare). After seedling infection, the fungus persists in the plant until head emergence when fungal spores develop and are released from sori formed at kernel positions. The 26.1-Mb U. hordei genome contains 7113 protein encoding genes with high synteny to the smaller genomes of the related, maize-infecting smut fungi Ustilago maydis and Sporisorium reilianum but has a larger repeat content that affected genome evolution at important loci, including mating-type and effector loci. The U. hordei genome encodes components involved in RNA interference and heterochromatin formation, normally involved in genome defense, that are lacking in the U. maydis genome due to clean excision events. These excision events were possibly a result of former presence of repetitive DNA and of an efficient homologous recombination system in U. maydis. We found evidence of repeat-induced point mutations in the genome of U. hordei, indicating that smut fungi use different strategies to counteract the deleterious effects of repetitive DNA. The complement of U. hordei effector genes is comparable to the other two smuts but reveals differences in family expansion and clustering. The availability of the genome sequence will facilitate the identification of genes responsible for virulence and evolution of smut fungi on their respective hosts. PMID:22623492
Laurie, John D; Ali, Shawkat; Linning, Rob; Mannhaupt, Gertrud; Wong, Philip; Güldener, Ulrich; Münsterkötter, Martin; Moore, Richard; Kahmann, Regine; Bakkeren, Guus; Schirawski, Jan
2012-05-01
Ustilago hordei is a biotrophic parasite of barley (Hordeum vulgare). After seedling infection, the fungus persists in the plant until head emergence when fungal spores develop and are released from sori formed at kernel positions. The 26.1-Mb U. hordei genome contains 7113 protein encoding genes with high synteny to the smaller genomes of the related, maize-infecting smut fungi Ustilago maydis and Sporisorium reilianum but has a larger repeat content that affected genome evolution at important loci, including mating-type and effector loci. The U. hordei genome encodes components involved in RNA interference and heterochromatin formation, normally involved in genome defense, that are lacking in the U. maydis genome due to clean excision events. These excision events were possibly a result of former presence of repetitive DNA and of an efficient homologous recombination system in U. maydis. We found evidence of repeat-induced point mutations in the genome of U. hordei, indicating that smut fungi use different strategies to counteract the deleterious effects of repetitive DNA. The complement of U. hordei effector genes is comparable to the other two smuts but reveals differences in family expansion and clustering. The availability of the genome sequence will facilitate the identification of genes responsible for virulence and evolution of smut fungi on their respective hosts.
Staphylococcal SCCmec elements encode an active MCM-like helicase and thus may be replicative
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mir-Sanchis, Ignacio; Roman, Christina A.; Misiura, Agnieszka
2016-08-29
Methicillin-resistant Staphylococcus aureus (MRSA) is a public-health threat worldwide. Although the mobile genomic island responsible for this phenotype, staphylococcal cassette chromosome (SCC), has been thought to be nonreplicative, we predicted DNA-replication-related functions for some of the conserved proteins encoded by SCC. We show that one of these, Cch, is homologous to the self-loading initiator helicases of an unrelated family of genomic islands, that it is an active 3'-to-5' helicase and that the adjacent ORF encodes a single-stranded DNA–binding protein. Our 2.9-Å crystal structure of intact Cch shows that it forms a hexameric ring. Cch, like the archaeal and eukaryotic MCM-family replicative helicases, belongs to the pre–sensor II insert clade of AAA+ ATPases. Additionally, we found that SCC elements are part of a broader family of mobile elements, all of which encode a replication initiator upstream of their recombinases. Replication after excision would enhance the efficiency of horizontal gene transfer.
A Cryptosporidium parvum genomic region encoding hemolytic activity.
Steele, M I; Kuhls, T L; Nida, K; Meka, C S; Halabi, I M; Mosier, D A; Elliott, W; Crawford, D L; Greenfield, R A
1995-01-01
Successful parasitization by Cryptosporidium parvum requires multiple disruptions in both host and protozoan cell membranes as cryptosporidial sporozoites invade intestinal epithelial cells and subsequently develop into asexual and sexual life stages. To identify cryptosporidial proteins which may play a role in these membrane alterations, hemolytic activity was used as a marker to screen a C. parvum genomic expression library. A stable hemolytic clone (H4) containing a 5.5-kb cryptosporidial genomic fragment was identified. The hemolytic activity encoded on H4 was mapped to a 1-kb region that contained a complete 690-bp open reading frame (hemA) ending in a common stop codon. A 21-kDa plasmid-encoded recombinant protein was expressed in maxicells containing H4. Subclones of H4 which contained only a portion of hemA did not induce hemolysis on blood agar or promote expression of the recombinant protein in maxicells. Reverse transcriptase-mediated PCR analysis of total RNA isolated from excysted sporozoites and the intestines of infected adult mice with severe combined immunodeficiency demonstrated that hemA is actively transcribed during the cryptosporidial life cycle. PMID:7558289
Zhao, Jie
2010-01-01
Arabinogalactan proteins (AGPs) comprise a family of hydroxyproline-rich glycoproteins that are implicated in plant growth and development. In this study, 69 AGPs are identified from the rice genome, including 13 classical AGPs, 15 arabinogalactan (AG) peptides, three non-classical AGPs, three early nodulin-like AGPs (eNod-like AGPs), eight non-specific lipid transfer protein-like AGPs (nsLTP-like AGPs), and 27 fasciclin-like AGPs (FLAs). The results from expressed sequence tags, microarrays, and massively parallel signature sequencing tags are used to analyse the expression of AGP-encoding genes, which is confirmed by real-time PCR. The results reveal that several rice AGP-encoding genes are predominantly expressed in anthers and display differential expression patterns in response to abscisic acid, gibberellic acid, and abiotic stresses. Based on the results obtained from this analysis, an attempt has been made to link the protein structures and expression patterns of rice AGP-encoding genes to their functions. Taken together, the genome-wide identification and expression analysis of the rice AGP gene family might facilitate further functional studies of rice AGPs. PMID:20423940
Gilbert, Maarten J.; Miller, William G.; Yee, Emma; Kik, Marja; Zomer, Aldert L.; Wagenaar, Jaap A.; Duim, Birgitta
2016-01-01
Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum lacks an S-layer-encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is > 50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between the two taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. PMID:27604878
2014-01-01
Background Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. Methods In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. Results The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. Conclusions The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants. PMID:25015379
Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua
2014-07-11
Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.
Toshchakov, Stepan V; Korzhenkov, Alexei A; Chernikova, Tatyana N; Ferrer, Manuel; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N
2017-12-01
Marine bacterium Oleiphilus messinensis ME102 (DSM 13489ᵀ), isolated from the sediments of the harbor of Messina (Italy), is a member of the order Oceanospirillales, class Gammaproteobacteria, representing the physiological group of marine obligate hydrocarbonoclastic bacteria (OHCB) alongside the members of the genera Alcanivorax, Oleispira, Thalassolituus, Cycloclasticus and Neptunomonas. These organisms play a crucial role in the natural environmental cleanup in marine systems. Despite having the largest genome (6,379,281 bp) among OHCB, O. messinensis exhibits a very narrow substrate profile. Alkane metabolism is determined by three loci: two encode P450 family monooxygenases, one of which forms a cassette with genes encoding ferredoxin and alcohol dehydrogenase, and the third carries an alkane monooxygenase (AlkB) gene clustered with genes for two rubredoxins and an NAD+-dependent rubredoxin reductase. Its genome contains the largest numbers of genomic islands (15) and mobile genetic elements (140), as compared with the more streamlined genomes of its OHCB counterparts. Among hydrocarbon-degrading Oceanospirillales, O. messinensis encodes the largest array of proteins involved in signal transduction for sensing and responding to environmental stimuli (345 vs 170 in Oleispira antarctica, the bacterium with the second highest number). This is likely an important trait for adapting to the conditions in marine sediments, which have a high physico-chemical patchiness and heterogeneity compared to those in the water column. Copyright © 2017. Published by Elsevier B.V.
Coordinated Rates of Evolution between Interacting Plastid and Nuclear Genes in Geraniaceae
Zhang, Jin; Ruhlman, Tracey A.; Sabir, Jamal; Blazier, J. Chris; Jansen, Robert K.
2015-01-01
Although gene coevolution has been widely observed within individuals and between different organisms, rarely has this phenomenon been investigated within a phylogenetic framework. The Geraniaceae is an attractive system in which to study plastid-nuclear genome coevolution due to the highly elevated evolutionary rates in plastid genomes. In plants, the plastid-encoded RNA polymerase (PEP) is a protein complex composed of subunits encoded by both plastid (rpoA, rpoB, rpoC1, and rpoC2) and nuclear genes (sig1-6). We used transcriptome and genomic data for 27 species of Geraniales in a systematic evaluation of coevolution between genes encoding subunits of the PEP holoenzyme. We detected strong correlations of dN (nonsynonymous substitutions) but not dS (synonymous substitutions) within rpoB/sig1 and rpoC2/sig2, but not for other plastid/nuclear gene pairs, and identified the correlation of dN/dS ratio between rpoB/C1/C2 and sig1/5/6, rpoC1/C2 and sig2, and rpoB/C2 and sig3 genes. Correlated rates between interacting plastid and nuclear sequences across the Geraniales could result from plastid-nuclear genome coevolution. Analyses of coevolved amino acid positions suggest that structurally mediated coevolution is not the major driver of plastid-nuclear coevolution. The detection of strong correlation of evolutionary rates between SIG and RNAP genes suggests a plausible explanation for plastome-genome incompatibility in Geraniaceae. PMID:25724640
Okamoto, Masaaki; Naito, Mariko; Miyanohara, Mayu; Imai, Susumu; Nomura, Yoshiaki; Saito, Wataru; Momoi, Yasuko; Takada, Kazuko; Miyabe-Nishiwaki, Takako; Tomonaga, Masaki; Hanada, Nobuhiro
2016-12-01
Streptococcus troglodytae TKU31 was isolated from the oral cavity of a chimpanzee (Pan troglodytes) and was found to be the most closely related species of the mutans group streptococci to Streptococcus mutans. The complete genome sequence of TKU31 consists of a single circular chromosome that is 2,097,874 base pairs long and has a G + C content of 37.18%. It possesses 2082 coding sequences (CDSs), 65 tRNAs and five rRNA operons (15 rRNAs). Two clustered regularly interspaced short palindromic repeats, six insertion sequences and two predicted prophage elements were identified. The genome of TKU31 harbors some putative virulence-associated genes, including the gtfB, gtfC and gtfD genes encoding glucosyltransferases and the gbpA, gbpB, gbpC and gbpD genes encoding glucan-binding cell wall-anchored proteins. The deduced amino acid sequence of the rhamnose-glucose polysaccharide F gene (rgpF), which is one of the serotype determinants, is 91% identical to that of the S. mutans LJ23 (serotype k) strain. However, two other virulence-associated genes, cnm and cbm, which encode collagen-binding proteins, were not found in the TKU31 genome. The complete genome sequence of S. troglodytae TKU31 has been deposited at DDBJ/European Nucleotide Archive/GenBank under the accession no. AP014612. © 2016 The Societies and John Wiley & Sons Australia, Ltd.
Wu, Yichao; Arumugam, Krithika; Tay, Martin Qi Xiang; Seshan, Hari; Mohanty, Anee; Cao, Bin
2015-04-01
Comamonas testosteroni is an important environmental bacterium capable of degrading a variety of toxic aromatic pollutants and has been demonstrated to be a promising biocatalyst for environmental decontamination. This organism is often found to be among the primary surface colonizers in various natural and engineered ecosystems, suggesting an extraordinary capability of this organism in environmental adaptation and biofilm formation. The goal of this study was to gain genetic insights into the adaptation of C. testosteroni to diverse environments and the importance of a biofilm lifestyle. Specifically, a draft genome of C. testosteroni I2 was obtained. The draft genome is 5,778,710 bp in length and comprises 110 contigs. The average G+C content was 61.88 %. A total of 5365 genes, including 5263 protein-coding genes, were predicted, and 4324 protein-coding genes (80.60 % of total genes) were associated with predicted functions. The catabolic genes responsible for biodegradation of steroids and other aromatic compounds were identified in the draft genome. Plasmid pI2 was found to encode a complete pathway for aniline degradation and a partial catabolic pathway for chloroaniline. This organism was found to be equipped with a sophisticated signaling system which helps it find ideal niches and switch between planktonic and biofilm lifestyles. A large number of putative multidrug resistance genes coding for abundant outer membrane transporters, chaperones, and heat shock proteins for the protection of cellular function were identified in the genome of strain I2. In addition, the genome of strain I2 was predicted to encode several proteins involved in producing, secreting, and taking up siderophores under iron-limiting conditions. The genome of strain I2 contains a number of genes responsible for the synthesis and secretion of exopolysaccharides, an extracellular component essential for biofilm formation. Overall, our results reveal the genomic features underlying the adaptation of C. testosteroni to diverse environments and highlight the importance of its biofilm lifestyle.
GIGGLE: a search engine for large-scale integrated genome analysis.
Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R
2018-02-01
GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.
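The kind of query described above can be illustrated with a minimal Python sketch that counts how many intervals in each of several interval sets overlap a set of query intervals, then ranks the sets by that count. This is not GIGGLE's index or its significance statistics; the function names, the half-open `(start, end)` representation, and the single-chromosome simplification are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

def overlap_count(intervals, query):
    """Count (query, interval) pairs that overlap.

    Intervals are half-open (start, end) pairs on one chromosome
    (a simplifying assumption for this sketch).
    """
    starts = sorted(s for s, _ in intervals)
    ends = sorted(e for _, e in intervals)
    total = 0
    for q_start, q_end in query:
        # Intervals starting before the query ends, minus those that
        # already ended at or before the query start.
        total += bisect_left(starts, q_end) - bisect_right(ends, q_start)
    return total

def rank_files(files, query):
    """Rank named interval sets by raw overlap count, highest first."""
    scores = {name: overlap_count(iv, query) for name, iv in files.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data standing in for genome interval files.
files = {
    "a": [(0, 10), (20, 30)],
    "b": [(5, 25)],
    "c": [(100, 200)],
}
ranking = rank_files(files, [(8, 22)])
print(ranking)
```

Raw counts are only a stand-in for a ranking criterion; a real tool would also need to account for the sizes of the interval sets being compared.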
Chiriac, Cecilia; Baricz, Andreea
2018-01-01
The draft genome assembly of Janthinobacterium sp. strain ROICE36 has 207 contigs, with a total genome size of 5,977,006 bp and a G+C content of 62%. Preliminary genome analysis identified 5,363 protein-coding genes and a total of 7 secondary metabolic gene clusters (encoding bacteriocins, nonribosomal peptide-synthetase [NRPS], terpene, hserlactone, and other ketide synthases). PMID:29650588
GIGGLE: a search engine for large-scale integrated genome analysis
Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R
2018-01-01
GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061
Bulky Trichomonad Genomes: Encoding a Swiss Army Knife.
Barratt, Joel; Gough, Rory; Stark, Damien; Ellis, John
2016-10-01
The trichomonads are a remarkably successful lineage of ancient, predominantly parasitic protozoa. Recent molecular analyses have revealed extensive duplication of certain genetic loci in trichomonads. Consequently, their genomes are exceptionally large compared to other parasitic protozoa. Retention of these large gene expansions across different trichomonad families raises the question: do these duplications afford an advantage? Many duplicated genes are linked to the parasitic lifestyle and some are regulated differently to their paralogues, suggesting they have acquired new functions. It is proposed that these large genomes encode a Swiss army knife of sorts, packed with a multitude of tools for use in many different circumstances. This may have bestowed trichomonads with the extraordinary versatility that has undoubtedly contributed to their success. Copyright © 2016 Elsevier Ltd. All rights reserved.
Mollusk genes encoding lysine tRNA (UUU) contain introns.
Matsuo, M; Abe, Y; Saruta, Y; Okada, N
1995-11-20
New intron-containing genes encoding tRNAs were discovered when genomic DNA isolated from various animal species was amplified by the polymerase chain reaction (PCR) with primers based on sequences of rabbit tRNA(Lys). From sequencing analysis of the products of PCR, we found that introns are present in several genes encoding tRNA(Lys) in mollusks, such as Loligo bleekeri (squid) and Octopus vulgaris (octopus). These introns were specific to genes encoding tRNA(Lys)(UUU) and were not present in genes encoding tRNA(Lys)(CUU). In addition, the sequences of the introns were different from one another. To confirm the results of our initial experiments, we isolated and sequenced genes encoding tRNA(Lys)(CUU) and tRNA(Lys)(UUU). The gene for tRNA(Lys)(UUU) from squid contained an intron, whose sequence was the same as that identified by PCR, and the gene formed a cluster with a corresponding pseudogene. Several DNA regions of 2.1 kb containing this cluster appeared to be tandemly arrayed in the squid genome. By contrast, the gene encoding tRNA(Lys)(CUU) did not contain an intron, as shown also by PCR. The tRNA(Lys)(UUU) that corresponded to the analyzed gene was isolated and characterized. The present study provides the first example of an intron-containing gene encoding a tRNA in mollusks and suggests the universality of introns in such genes in higher eukaryotes.
USDA-ARS's Scientific Manuscript database
The draft genome sequence of “Candidatus Liberibacter asiaticus” strain TX2351 collected from ACP in South Texas has been determined. The TX2351 genome is 1,252,043 bp in size with a 36.5% G+C content, encoding 1,184 predicted open reading frames and 51 RNA genes....
Mitochondrial genome of the blackfin tuna Thunnus atlanticus Lesson, 1831 (Perciformes, Scrombidae).
Márquez, Edna J; Isaza, Juan P; Alzate, Juan F
2016-05-01
Blackfin tuna, Thunnus atlanticus, is a widespread epipelagic oceanic species in the western Atlantic. So far, the mitochondrial genome of this species has remained unknown, although the mitogenomes of all congeners are known. The mitochondrial genome encodes 13 proteins, 21 tRNAs and 2 ribosomal RNAs, and the gene synteny is conserved with other previously reported mitogenomes of tunas.
Azevedo Antunes, Camila; Richardson, Emily J; Quick, Joshua; Fuentes-Utrilla, Pablo; Isom, Georgia L; Goodall, Emily C; Möller, Jens; Hoskisson, Paul A; Mattos-Guaraldi, Ana Luiza; Cunningham, Adam F; Loman, Nicholas J; Sangal, Vartul; Burkovski, Andreas; Henderson, Ian R
2018-02-01
The genome sequence of the human pathogen Corynebacterium diphtheriae bv. mitis strain ISS 3319 was determined and closed in this study. The genome is estimated to have 2,404,936 bp encoding 2,257 proteins. This strain also possesses a plasmid of 1,960 bp. Copyright © 2018 Azevedo Antunes et al.
Chen, Gao; Murdoch, Robert W.; Mack, E. Erin; ...
2017-09-14
Dehalobacterium formicoaceticum utilizes dichloromethane as the sole energy source in defined anoxic bicarbonate-buffered mineral salt medium. The products are formate, acetate, inorganic chloride, and biomass. The bacterium’s genome was sequenced using PacBio, assembled, and annotated. The complete genome consists of one 3.77-Mb circular chromosome harboring 3,935 predicted protein-encoding genes.
Small Genomes and Sparse Metabolisms of Sediment-Associated Bacteria from Four Candidate Phyla
Kantor, Rose S.; Wrighton, Kelly C.; Handley, Kim M.; Sharon, Itai; Hug, Laura A.; Castelle, Cindy J.; Thomas, Brian C.; Banfield, Jillian F.
2013-01-01
Cultivation-independent surveys of microbial diversity have revealed many bacterial phyla that lack cultured representatives. These lineages, referred to as candidate phyla, have been detected across many environments. Here, we deeply sequenced microbial communities from acetate-stimulated aquifer sediment to recover the complete and essentially complete genomes of single representatives of the candidate phyla SR1, WWE3, TM7, and OD1. All four of these genomes are very small, 0.7 to 1.2 Mbp, and have large inventories of novel proteins. Additionally, all lack identifiable biosynthetic pathways for several key metabolites. The SR1 genome uses the UGA codon to encode glycine, and the same codon is very rare in the OD1 genome, suggesting that the OD1 organism could also transition to alternate coding. Interestingly, the relative abundance of the members of SR1 increased with the appearance of sulfide in groundwater, a pattern mirrored by a member of the phylum Tenericutes. All four genomes encode type IV pili, which may be involved in interorganism interaction. On the basis of these results and other recently published research, metabolic dependence on other organisms may be widely distributed across multiple bacterial candidate phyla. PMID:24149512
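The effect of reading UGA as glycine instead of stop can be shown with a short Python sketch. It builds the standard codon table and applies a per-codon override; the function and variable names are hypothetical, and the override dictionary models only the single reassignment mentioned in the abstract, not a full description of any organism's code.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Assumed override for this sketch: TGA (UGA in RNA) read as glycine.
UGA_AS_GLY = {"TGA": "G"}

def translate(dna, overrides=None):
    """Translate every complete codon in frame 0; '*' marks a stop."""
    table = {**STANDARD, **(overrides or {})}
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(table[c] for c in codons)

print(translate("ATGTGATAA"))              # standard code
print(translate("ATGTGATAA", UGA_AS_GLY))  # UGA recoded as Gly
```

Gene predictions made with the standard table would truncate every protein at its first in-frame UGA, which is why the choice of table matters when annotating such genomes.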
Santangelo, G M; Tornow, J; McLaughlin, C S; Moldave, K
1991-08-30
Two promoters (A7 and A23), isolated at random from the Saccharomyces cerevisiae genome by virtue of their capacity to activate transcription, are identical to known intergenic bidirectional promoters. Sequence analysis of the genomic DNA adjacent to the A7 promoter identified a split gene encoding ribosomal (r) protein L37, which is homologous to the tRNA-binding r-proteins, L35a (from human and rat) and L32 (from frogs).
Heterologous production and characterization of two glyoxal oxidases from Pycnoporus cinnabarinus
Marianne Daou; François Piumi; Daniel Cullen; Eric Record; Craig B. Faulds
2016-01-01
The genome of the white rot fungus Pycnoporus cinnabarinus includes a large number of genes encoding enzymes implicated in lignin degradation. Among these, three genes are predicted to encode glyoxal oxidase, an enzyme previously isolated from Phanerochaete chrysosporium. The glyoxal oxidase of P. chrysosporium...
Multi-Omics Driven Assembly and Annotation of the Sandalwood (Santalum album) Genome.
Mahesh, Hirehally Basavarajegowda; Subba, Pratigya; Advani, Jayshree; Shirke, Meghana Deepak; Loganathan, Ramya Malarini; Chandana, Shankara Lingu; Shilpa, Siddappa; Chatterjee, Oishi; Pinto, Sneha Maria; Prasad, Thottethodi Subrahmanya Keshava; Gowda, Malali
2018-04-01
Indian sandalwood ( Santalum album ) is an important tropical evergreen tree known for its fragrant heartwood-derived essential oil and its valuable carving wood. Here, we applied an integrated genomic, transcriptomic, and proteomic approach to assemble and annotate the Indian sandalwood genome. Our genome sequencing resulted in the establishment of a draft map of the smallest genome for any woody tree species to date (221 Mb). The genome annotation predicted 38,119 protein-coding genes and 27.42% repetitive DNA elements. In-depth proteome analysis revealed the identities of 72,325 unique peptides, which confirmed 10,076 of the predicted genes. The addition of transcriptomic and proteogenomic approaches resulted in the identification of 53 novel proteins and 34 gene-correction events that were missed by genomic approaches. Proteogenomic analysis also helped in reassigning 1,348 potential noncoding RNAs as bona fide protein-coding messenger RNAs. Gene expression patterns at the RNA and protein levels indicated that peptide sequencing was useful in capturing proteins encoded by nuclear and organellar genomes alike. Mass spectrometry-based proteomic evidence provided an unbiased approach toward the identification of proteins encoded by organellar genomes. Such proteins are often missed in transcriptome data sets due to the enrichment of only messenger RNAs that contain poly(A) tails. Overall, the use of integrated omic approaches enhanced the quality of the assembly and annotation of this nonmodel plant genome. The availability of genomic, transcriptomic, and proteomic data will enhance genomics-assisted breeding, germplasm characterization, and conservation of sandalwood trees. © 2018 American Society of Plant Biologists. All Rights Reserved.
Wisniewski-Dyé, Florence; Lozano, Luis; Acosta-Cruz, Erika; Borland, Stéphanie; Drogue, Benoît; Prigent-Combaret, Claire; Rouy, Zoé; Barbe, Valérie; Mendoza Herrera, Alberto; González, Victor; Mavingui, Patrick
2012-01-01
Bacteria of the genus Azospirillum colonize roots of important cereals and grasses, and promote plant growth by several mechanisms, notably phytohormone synthesis. The genomes of several Azospirillum strains belonging to different species, isolated from various host plants and locations, were recently sequenced and published. In this study, an additional genome of an A. brasilense strain, isolated from maize grown on an alkaline soil in the northeast of Mexico, strain CBG497, was obtained. Comparative genomic analyses were performed on this new genome and three other genomes (A. brasilense Sp245, A. lipoferum 4B and Azospirillum sp. B510). The Azospirillum core genome was established and consists of 2,328 proteins, representing between 30% to 38% of the total encoded proteins within a genome. It is mainly chromosomally-encoded and contains 74% of genes of ancestral origin shared with some aquatic relatives. The non-ancestral part of the core genome is enriched in genes involved in signal transduction, in transport and in metabolism of carbohydrates and amino-acids, and in surface properties features linked to adaptation in fluctuating environments, such as soil and rhizosphere. Many genes involved in colonization of plant roots, plant-growth promotion (such as those involved in phytohormone biosynthesis), and properties involved in rhizosphere adaptation (such as catabolism of phenolic compounds, uptake of iron) are restricted to a particular strain and/or species, strongly suggesting niche-specific adaptation. PMID:24705077
Tashkandy, Nisreen; Sabban, Sari; Fakieh, Mohammad; ...
2016-06-16
Flavobacterium suncheonense is a member of the family Flavobacteriaceae in the phylum Bacteroidetes. Strain GH29-5ᵀ (DSM 17707ᵀ) was isolated from greenhouse soil in Suncheon, South Korea. F. suncheonense GH29-5ᵀ is part of the Genomic Encyclopedia of Bacteria and Archaea project. The 2,880,663 bp long draft genome consists of 54 scaffolds with 2739 protein-coding genes and 82 RNA genes. The genome of strain GH29-5ᵀ has 117 genes encoding peptidases but a small number of genes encoding carbohydrate active enzymes (51 CAZymes). Metallo and serine peptidases were found most frequently. Among CAZymes, eight glycoside hydrolase families, nine glycosyl transferase families, two carbohydrate binding module families and four carbohydrate esterase families were identified. Surprisingly, polysaccharide utilization loci (PULs) were not found in strain GH29-5ᵀ. Based on the coherent physiological and genomic characteristics we suggest that F. suncheonense GH29-5ᵀ feeds on proteins rather than on saccharides and lipids.
2011-01-01
The genomic DNA sequence of a novel enteric uncultured microphage, ΦCA82 from a turkey gastrointestinal system was determined utilizing metagenomics techniques. The entire circular, single-stranded nucleotide sequence of the genome was 5,514 nucleotides. The ΦCA82 genome is quite different from other microviruses as indicated by comparisons of nucleotide similarity, predicted protein similarity, and functional classifications. Only three genes showed significant similarity to microviral proteins as determined by local alignments using BLAST analysis. ORF1 encoded a predicted phage F capsid protein that was phylogenetically most similar to the Microviridae ΦMH2K member's major coat protein. The ΦCA82 genome also encoded a predicted minor capsid protein (ORF2) and putative replication initiation protein (ORF3) most similar to the microviral bacteriophage SpV4. The distant evolutionary relationship of ΦCA82 suggests that the divergence of this novel turkey microvirus from other microviruses may reflect unique evolutionary pressures encountered within the turkey gastrointestinal system. PMID:21714899
A system-level model for the microbial regulatory genome.
Brooks, Aaron N; Reiss, David J; Allard, Antoine; Wu, Wei-Ju; Salvanha, Diego M; Plaisier, Christopher L; Chandrasekaran, Sriram; Pan, Min; Kaur, Amardeep; Baliga, Nitin S
2014-07-15
Microbes can tailor transcriptional responses to diverse environmental challenges despite having streamlined genomes and a limited number of regulators. Here, we present data-driven models that capture the dynamic interplay of the environment and genome-encoded regulatory programs of two types of prokaryotes: Escherichia coli (a bacterium) and Halobacterium salinarum (an archaeon). The models reveal how the genome-wide distributions of cis-acting gene regulatory elements and the conditional influences of transcription factors at each of those elements encode programs for eliciting a wide array of environment-specific responses. We demonstrate how these programs partition transcriptional regulation of genes within regulons and operons to re-organize gene-gene functional associations in each environment. The models capture fitness-relevant co-regulation by different transcriptional control mechanisms acting across the entire genome, to define a generalized, system-level organizing principle for prokaryotic gene regulatory networks that goes well beyond existing paradigms of gene regulation. An online resource (http://egrin2.systemsbiology.net) has been developed to facilitate multiscale exploration of conditional gene regulation in the two prokaryotes. © 2014 The Authors. Published under the terms of the CC BY 4.0 license.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jovanovic, Iva; Magnuson, Jon K.; Collart, Frank R.
2009-08-01
Genome sequencing of a variety of fungi is a major initiative currently supported by the Department of Energy’s Joint Genome Institute. Encoded within the genomes of many fungi are upwards of 200 enzymes called glycoside hydrolases (GHs). GHs are known for their ability to hydrolyze the polysaccharide components of lignocellulosic biomass. Production of ethanol and “next generation” biofuels from lignocellulosic biomass represents a sustainable route to biofuels production. However, this process has to become more economical before large scale operations are put into place. Identifying and characterizing GHs with improved properties for biomass degradation is a key factor for the development of cost effective processes to convert biomass to fuels and chemicals. With the recent explosion in the number of GH encoding genes discovered by fungal genome sequencing projects, it has become apparent that improvements in GH gene annotation processes have to be developed. This will enable more informed and efficient decision making with regard to selection and utilization of these important enzymes in bioprocesses that produce fuels and chemicals from lignocellulosic feedstocks.
The Genome of Deep-Sea Vent Chemolithoautotroph Thiomicrospira crunogena XCL-2
Scott, Kathleen M; Sievert, Stefan M; Abril, Fereniki N; Ball, Lois A; Barrett, Chantell J; Blake, Rodrigo A; Boller, Amanda J; Chain, Patrick S. G; Clark, Justine A; Davis, Carisa R; Detter, Chris; Do, Kimberly F; Dobrinski, Kimberly P; Faza, Brandon I; Fitzpatrick, Kelly A; Freyermuth, Sharyn K; Harmer, Tara L; Hauser, Loren J; Hügler, Michael; Kerfeld, Cheryl A; Klotz, Martin G; Kong, William W; Land, Miriam; Lapidus, Alla; Larimer, Frank W; Longo, Dana L; Lucas, Susan; Malfatti, Stephanie A; Massey, Steven E; Martin, Darlene D; McCuddin, Zoe; Meyer, Folker; Moore, Jessica L; Ocampo, Luis H; Paul, John H; Paulsen, Ian T; Reep, Douglas K; Ren, Qinghu; Ross, Rachel L; Sato, Priscila Y; Thomas, Phaedra; Tinkham, Lance E; Zeruth, Gary T
2006-01-01
Presented here is the complete genome sequence of Thiomicrospira crunogena XCL-2, representative of ubiquitous chemolithoautotrophic sulfur-oxidizing bacteria isolated from deep-sea hydrothermal vents. This gammaproteobacterium has a single chromosome (2,427,734 base pairs), and its genome illustrates many of the adaptations that have enabled it to thrive at vents globally. It has 14 methyl-accepting chemotaxis protein genes, including four that may assist in positioning it in the redoxcline. A relative abundance of coding sequences (CDSs) encoding regulatory proteins likely controls the expression of genes encoding carboxysomes, multiple dissolved inorganic nitrogen and phosphate transporters, as well as a phosphonate operon, which provide this species with a variety of options for acquiring these substrates from the environment. Thiom. crunogena XCL-2 is unusual among obligate sulfur-oxidizing bacteria in relying on the Sox system for the oxidation of reduced sulfur compounds. The genome has characteristics consistent with an obligately chemolithoautotrophic lifestyle, including few transporters predicted to have organic allocrits, and Calvin-Benson-Bassham cycle CDSs scattered throughout the genome. PMID:17105352
Genome sequence of the Lotus spp. microsymbiont Mesorhizobium loti strain R7A.
Kelly, Simon; Sullivan, John; Ronson, Clive; Tian, Rui; Bräu, Lambert; Munk, Christine; Goodwin, Lynne; Han, Cliff; Woyke, Tanja; Reddy, Tatiparthi; Huntemann, Marcel; Pati, Amrita; Mavromatis, Konstantinos; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Reeve, Wayne
2014-01-01
Mesorhizobium loti strain R7A was isolated in 1993 in Lammermoor, Otago, New Zealand from a Lotus corniculatus root nodule and is a reisolate of the inoculant strain ICMP3153 (NZP2238) used at the site. R7A is an aerobic, Gram-negative, non-spore-forming rod. The symbiotic genes in the strain are carried on a 502-kb integrative and conjugative element known as the symbiosis island or ICEMlSym(R7A). M. loti is the microsymbiont of the model legume Lotus japonicus and strain R7A has been used extensively in studies of the plant-microbe interaction. This report reveals that the genome of M. loti strain R7A does not harbor any plasmids and contains a single scaffold of size 6,529,530 bp which encodes 6,323 protein-coding genes and 75 RNA-only encoding genes. This rhizobial genome is one of 100 sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.
Reeve, Wayne; O’Hara, Graham; Chain, Patrick; Ardley, Julie; Bräu, Lambert; Nandesena, Kemanthi; Tiwari, Ravi; Copeland, Alex; Nolan, Matt; Han, Cliff; Brettin, Thomas; Land, Miriam; Ovchinikova, Galina; Ivanova, Natalia; Mavromatis, Konstantinos; Markowitz, Victor; Kyrpides, Nikos; Melino, Vanessa; Denton, Matthew; Yates, Ron; Howieson, John
2010-01-01
Rhizobium leguminosarum bv trifolii is a soil-inhabiting bacterium that has the capacity to be an effective nitrogen fixing microsymbiont of a diverse range of annual Trifolium (clover) species. Strain WSM1325 is an aerobic, motile, non-spore forming, Gram-negative rod isolated from root nodules collected in 1993 from the Greek Island of Serifos. WSM1325 is produced commercially in Australia as an inoculant for a broad range of annual clovers of Mediterranean origin due to its superior attributes of saprophytic competence, nitrogen fixation and acid-tolerance. Here we describe the basic features of this organism, together with the complete genome sequence, and annotation. This is the first completed genome sequence for a microsymbiont of annual clovers. We reveal that its genome size is 7,418,122 bp encoding 7,232 protein-coding genes and 61 RNA-only encoding genes. This multipartite genome contains 6 distinct replicons; a chromosome of size 4,767,043 bp and 5 plasmids of size 828,924 bp, 660,973 bp, 516,088 bp, 350,312 bp and 294,782 bp. PMID:21304718
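The replicon sizes reported above can be checked against the stated total genome size with a few lines of Python. The numbers are taken directly from the abstract; the variable names and the plasmid-fraction calculation are additions for illustration.

```python
# Replicon sizes in bp, as reported for strain WSM1325.
chromosome = 4_767_043
plasmids = [828_924, 660_973, 516_088, 350_312, 294_782]
reported_total = 7_418_122

total = chromosome + sum(plasmids)
plasmid_fraction = sum(plasmids) / total

print(total == reported_total)  # the six replicons account for the full genome
print(f"{plasmid_fraction:.1%} of the genome is plasmid-borne")
```

The six replicons sum exactly to the reported 7,418,122 bp, and a little over a third of the genome resides on the five plasmids.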
Open chromatin reveals the functional maize genome
USDA-ARS's Scientific Manuscript database
Every cellular process mediated through nuclear DNA must contend with chromatin. As results from ENCODE show, open chromatin assays can efficiently integrate across diverse regulatory elements, revealing the functional non-coding genome. In this study, we use a MNase hypersensitivity assay to discover o...
'Yeast mail': a novel Saccharomyces application (NSA) to encrypt messages.
Rosemeyer, Helmut; Paululat, Achim; Heinisch, Jürgen J
2014-09-01
The universal genetic code is used by all life forms to encode biological information. It can also be used to encrypt semantic messages and convey them within organisms without anyone but the sender and recipient knowing, i.e., as a means of steganography. Several theoretical, but comparatively few experimental, approaches have been dedicated to this subject, so far. Here, we describe an experimental system to stably integrate encrypted messages within the yeast genome using a polymerase chain reaction (PCR)-based, one-step homologous recombination system. Thus, DNA sequences encoding alphabetical and/or numerical information will be inherited by yeast propagation and can be sent in the form of dried yeast. Moreover, due to the availability of triple shuttle vectors, Saccharomyces cerevisiae can also be used as an intermediate construction device for transfer of information to either Drosophila or mammalian cells as steganographic containers. Besides its classical use in alcoholic fermentation and its modern use for heterologous gene expression, we here show that baker's yeast can thus be employed in a novel Saccharomyces application (NSA) as a simple steganographic container to hide and convey messages. Copyright © 2014 Verlag Helvetica Chimica Acta AG, Zürich.
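The general idea of storing alphabetical and numerical information in a DNA sequence can be sketched in Python as follows. The abstract does not specify the encoding used in the study, so this example swaps in a simple, commonly used stand-in: each ASCII byte is written as four bases at two bits per base. The base ordering and function names are assumptions, and the scheme provides no secrecy by itself.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11

def encode(message):
    """Encode an ASCII message as DNA, four bases per character."""
    out = []
    for byte in message.encode("ascii"):
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(dna):
    """Invert encode(): read four bases at a time back into characters."""
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("ascii")

dna = encode("Hi")
print(dna)          # CAGACGGC
print(decode(dna))  # Hi
```

A practical construct would also need flanking sequences for integration and some care to avoid problematic motifs such as long homopolymer runs, neither of which this sketch addresses.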
Chapman, Brad A.; Bowers, John E.; Feltus, Frank A.; Paterson, Andrew H.
2006-01-01
Genome duplication followed by massive gene loss has permanently shaped the genomes of many higher eukaryotes, particularly angiosperms. It has long been believed that a primary advantage of genome duplication is the opportunity for the evolution of genes with new functions by modification of duplicated genes. If so, then patterns of genetic diversity among strains within taxa might reveal footprints of selection that are consistent with this advantage. Contrary to classical predictions that duplicated genes may be relatively free to acquire unique functionality, we find among both Arabidopsis ecotypes and Oryza subspecies that SNPs encode less radical amino acid changes in genes for which there exists a duplicated copy at a “paleologous” locus than in “singleton” genes. Preferential retention of duplicated genes encoding long complex proteins and their unexpectedly slow divergence (perhaps because of homogenization) suggest that a primary advantage of retaining duplicated paleologs may be the buffering of crucial functions. Functional buffering and functional divergence may represent extremes in the spectrum of duplicated gene fates. Functional buffering may be especially important during “genomic turmoil” immediately after genome duplication but continues to act ≈60 million years later, and its gradual deterioration may contribute cyclicality to genome duplication in some lineages. PMID:16467140
Yuki, Masahiro; Kuwahara, Hirokazu; Shintani, Masaki; Izawa, Kazuki; Sato, Tomoyuki; Starns, David; Hongoh, Yuichi; Ohkuma, Moriya
2015-12-01
Wood-feeding lower termites harbour symbiotic gut protists that support the termite nutritionally by degrading recalcitrant lignocellulose. These protists themselves host specific endo- and ectosymbiotic bacteria, the functions of which remain largely unknown. Here, we present draft genomes of a dominant, uncultured ectosymbiont belonging to the order Bacteroidales, 'Candidatus Symbiothrix dinenymphae', which colonizes the cell surface of the cellulolytic gut protists Dinenympha spp. We analysed four single-cell genomes of Ca. S. dinenymphae; the highest genome completeness was estimated to be 81.6-82.3%, with a predicted genome size of 4.28-4.31 Mb. The genome retains genes encoding large parts of the amino acid, cofactor and nucleotide biosynthetic pathways. In addition, the genome contains genes encoding various glycoside hydrolases such as endoglucanases and hemicellulases. The genome indicates that Ca. S. dinenymphae ferments lignocellulose-derived monosaccharides to acetate, a major carbon and energy source of the host termite. We suggest that the ectosymbiont digests lignocellulose and provides nutrients to the host termites, and hypothesize that the hydrolytic activity might also function as a pretreatment for the host protist to effectively decompose the crystalline cellulose components. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.
Pacheco-Arjona, Jose Ramon; Ramirez-Prado, Jorge Humberto
2014-01-01
The cell wall is a protective and versatile structure distributed in all fungi. The component responsible for its rigidity is chitin, a product of chitin synthase (Chsp) enzymes. There are seven classes of chitin synthase genes (CHS) and the number and type encoded in fungal genomes vary considerably from one species to another. Previous Chsp sequence analyses focused on their study as individual units, regardless of genomic context. The identification of blocks of conserved genes between genomes can provide important clues about the interactions and localization of chitin synthases. In the present study, we carried out an in silico search of all putative Chsp encoded in 54 full fungal genomes, encompassing 21 orders from five phyla. Phylogenetic studies of these Chsp were able to confidently classify 347 out of the 369 Chsp identified (94%). Patterns in the distribution of Chsp related to taxonomy were identified, the most prominent being related to the type of fungal growth. More importantly, a synteny analysis for genomic blocks centered on class IV Chsp (the most abundant and widely distributed Chsp class) identified a putative cell wall metabolism gene cluster in members of the genus Aspergillus, the first such association reported for any fungal genome. PMID:25148134
DOE Office of Scientific and Technical Information (OSTI.GOV)
Podar, Mircea; Graham, David E; Reysenbach, Anna-Louise
A hyperthermophilic member of the Nanoarchaeota from Obsidian Pool, a thermal feature in Yellowstone National Park, was characterized using single cell isolation and sequencing, together with its putative host, a Sulfolobales archaeon. This first representative of a non-marine Nanoarchaeota (Nst1) resembles Nanoarchaeum equitans by lacking most biosynthetic capabilities, the two forming a deep-branching archaeal lineage. However, the Nst1 genome is over 20% larger, encodes a complete gluconeogenesis pathway and a full complement of archaeal flagellum proteins. Comparison of the two genomes suggests that the marine and terrestrial Nanoarchaeota lineages share a common ancestor that was already a symbiont of another archaeon. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. The two distinct Nanoarchaeota-host genomic data sets offer insights into the evolution of archaeal symbiosis and parasitism and will further enable studies of the cellular and molecular mechanisms of these relationships.
2012-01-01
Background Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. 
Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
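The CA versus GT strand bias discussed in this abstract is commonly summarized with two skew statistics, AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C), computed on the major coding strand. The sketch below shows how such a classification could be made; it is not the authors' analysis, and the function names and the simple sign-based rule are assumptions.

```python
def skew(seq: str, x: str, y: str) -> float:
    """Return (count(x) - count(y)) / (count(x) + count(y)), or 0.0 if both are absent."""
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    return (nx - ny) / (nx + ny) if (nx + ny) else 0.0


def strand_bias(seq: str) -> str:
    """Label a strand 'CA', 'GT', or 'mixed' from the signs of its AT- and GC-skew."""
    at_skew = skew(seq, "A", "T")  # positive when A exceeds T
    gc_skew = skew(seq, "G", "C")  # negative when C exceeds G
    if at_skew > 0 and gc_skew < 0:
        return "CA"
    if at_skew < 0 and gc_skew > 0:
        return "GT"
    return "mixed"


# Toy sequences (made up for illustration, not real mitochondrial data).
print(strand_bias("CCCAAAGT"))
print(strand_bias("GGGTTTCA"))
```

In practice the skews are usually computed per gene or in sliding windows, since a genome-wide average can hide strand switches introduced by rearrangements.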
RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.
Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M
2016-11-02
Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5, 85.2, and 100 % of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficient, integrative bioinformatics tool for large scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury.
Beller, Harry R.; Chain, Patrick S. G.; Letain, Tracy E.; Chakicherla, Anu; Larimer, Frank W.; Richardson, Paul M.; Coleman, Matthew A.; Wood, Ann P.; Kelly, Donovan P.
2006-01-01
The complete genome sequence of Thiobacillus denitrificans ATCC 25259 is the first to become available for an obligately chemolithoautotrophic, sulfur-compound-oxidizing, β-proteobacterium. Analysis of the 2,909,809-bp genome will facilitate our molecular and biochemical understanding of the unusual metabolic repertoire of this bacterium, including its ability to couple denitrification to sulfur-compound oxidation, to catalyze anaerobic, nitrate-dependent oxidation of Fe(II) and U(IV), and to oxidize mineral electron donors. Notable genomic features include (i) genes encoding c-type cytochromes totaling 1 to 2 percent of the genome, which is a proportion greater than for almost all bacterial and archaeal species sequenced to date, (ii) genes encoding two [NiFe]hydrogenases, which is particularly significant because no information on hydrogenases has previously been reported for T. denitrificans and hydrogen oxidation appears to be critical for anaerobic U(IV) oxidation by this species, (iii) a diverse complement of more than 50 genes associated with sulfur-compound oxidation (including sox genes, dsr genes, and genes associated with the AMP-dependent oxidation of sulfite to sulfate), some of which occur in multiple (up to eight) copies, (iv) a relatively large number of genes associated with inorganic ion transport and heavy metal resistance, and (v) a paucity of genes encoding organic-compound transporters, commensurate with obligate chemolithoautotrophy. Ultimately, the genome sequence of T. denitrificans will enable elucidation of the mechanisms of aerobic and anaerobic sulfur-compound oxidation by β-proteobacteria and will help reveal the molecular basis of this organism's role in major biogeochemical cycles (i.e., those involving sulfur, nitrogen, and carbon) and groundwater restoration. PMID:16452431
A genome survey of Moniliophthora perniciosa gives new insights into Witches' Broom Disease of cacao
Mondego, Jorge MC; Carazzolle, Marcelo F; Costa, Gustavo GL; Formighieri, Eduardo F; Parizzi, Lucas P; Rincones, Johana; Cotomacci, Carolina; Carraro, Dirce M; Cunha, Anderson F; Carrer, Helaine; Vidal, Ramon O; Estrela, Raíssa C; García, Odalys; Thomazella, Daniela PT; de Oliveira, Bruno V; Pires, Acássia BL; Rio, Maria Carolina S; Araújo, Marcos Renato R; de Moraes, Marcos H; Castro, Luis AB; Gramacho, Karina P; Gonçalves, Marilda S; Neto, José P Moura; Neto, Aristóteles Góes; Barbosa, Luciana V; Guiltinan, Mark J; Bailey, Bryan A; Meinhardt, Lyndel W; Cascardo, Julio CM; Pereira, Gonçalo AG
2008-01-01
Background The basidiomycete fungus Moniliophthora perniciosa is the causal agent of Witches' Broom Disease (WBD) in cacao (Theobroma cacao). It is a hemibiotrophic pathogen that colonizes the apoplast of cacao's meristematic tissues as a biotrophic pathogen, switching to a saprotrophic lifestyle during later stages of infection. M. perniciosa and the related species M. roreri are pathogens of aerial parts of the plant, an uncommon characteristic in the order Agaricales. A genome survey (1.9× coverage) of M. perniciosa was analyzed to evaluate the overall gene content of this phytopathogen. Results Genes encoding proteins involved in retrotransposition, reactive oxygen species (ROS) resistance, drug efflux transport and cell wall degradation were identified. The great number of genes encoding cytochrome P450 monooxygenases (1.15% of gene models) indicates that M. perniciosa has great potential for detoxification and for production of toxins and hormones, which may confer a high adaptive ability to the fungus. We have also discovered new genes encoding putative secreted polypeptides rich in cysteine, as well as genes related to methylotrophy and plant hormone biosynthesis (gibberellin and auxin). Analysis of gene families indicated that M. perniciosa has amounts of carboxylesterases and repertoires of plant cell wall degrading enzymes similar to those of other hemibiotrophic fungi. In addition, an approach for normalization of gene family data using incomplete genome data was developed and applied to the M. perniciosa genome survey. Conclusion This genome survey gives an overview of the M. perniciosa genome, and reveals that a significant portion is involved in stress adaptation and plant necrosis, two necessary characteristics for a hemibiotrophic fungus to fulfill its infection cycle. Our analysis provides new evidence revealing potential adaptive traits that may play major roles in the mechanisms of pathogenicity in the M. perniciosa/cacao pathosystem. PMID:19019209
Single-cell genomics reveals co-metabolic interactions within uncultivated Marine Group A bacteria
NASA Astrophysics Data System (ADS)
Hawley, A. K.; Hallam, S. J.
2016-02-01
Marine Group A (MGA) bacteria represent a ubiquitous and abundant candidate phylum enriched in oxygen minimum zones (OMZs) and the deep ocean. Despite MGA prevalence, little is known about their ecology and biogeochemistry. Here we chart the metabolic potential of 26 MGA single-cell amplified genomes (SAGs) sourced from different environments spanning ecothermodynamic gradients including open ocean waters, OMZs and methanogenic environments including a terephthalate-degrading bioreactor. Metagenomic contig recruitment to SAGs combined with tetra-nucleotide frequency distribution patterns resolved nine MGA population genome bins. All population genomes exhibited genomic streamlining, with open ocean MGA being the most reduced. Different strategies for carbohydrate utilization, carbon fixation, energy metabolism and respiratory pathways were identified between population genome bins, including various roles in the nitrogen and sulfur cycles. MGA inhabiting OMZ oxyclines encoded genes for partial denitrification with potential to feed into anammox and nitrification as well as a polysulfide reductase with a potential role in the cryptic sulfur cycle. MGA inhabiting anoxic waters encoded NiFe hydrogenase and nitrous oxide reductase with the potential to complete partial denitrification pathways previously linked to sulfur oxidation in SUP05 bacteria. MGA from methanogenic environments encoded genes mediating cascading syntrophic interactions with fatty acid degraders and methanogens including reverse electron transport potential. The MGA phylum appears to have evolved alternative metabolic innovations adapting specific subgroups to occupy specific niches along ecothermodynamic gradients. Additionally, expression of MGA genes from different OMZ environments supports that these subgroups manifest an increasing propensity for co-metabolic interactions under energy limiting conditions that mandates a cooperative mode of existence with important implications for C, N and S cycling in marine ecosystems.
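Tetranucleotide frequency patterns, mentioned above as one signal for resolving population genome bins, can be computed by sliding a four-base window along a contig and normalizing the counts. The following is a minimal sketch of that idea and not the binning procedure used in the study; the function names and the Euclidean distance are assumptions, and real binning tools typically also merge reverse-complement k-mers and combine composition with coverage.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides over the unambiguous alphabet.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freq(seq: str) -> dict:
    """Return normalized tetranucleotide frequencies; windows with non-ACGT symbols are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    valid = {k: counts.get(k, 0) for k in KMERS}
    total = sum(valid.values())
    if total == 0:
        return {k: 0.0 for k in KMERS}
    return {k: v / total for k, v in valid.items()}


def profile_distance(p: dict, q: dict) -> float:
    """Euclidean distance between two tetranucleotide profiles."""
    return sqrt(sum((p[k] - q[k]) ** 2 for k in KMERS))


# Toy contigs (made up for illustration).
a = tetra_freq("ACGTACGTACGTACGT")
b = tetra_freq("ACGTACGTACGAACGT")
c = tetra_freq("GGGGCCCCGGGGCCCC")
print(profile_distance(a, b) < profile_distance(a, c))
```

Contigs from the same population are expected to have similar profiles, so a small distance is weak evidence that two contigs belong in the same bin.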
2011-01-01
Background Streptococcus dysgalactiae subsp. equisimilis (SDSE) causes invasive streptococcal infections, including streptococcal toxic shock syndrome (STSS), as does Lancefield group A Streptococcus pyogenes (GAS). We sequenced the entire genome of SDSE strain GGS_124 isolated from a patient with STSS. Results We found that GGS_124 consisted of a circular genome of 2,106,340 bp. Comparative analyses among bacterial genomes indicated that GGS_124 was most closely related to GAS. GGS_124 and GAS, but not other streptococci, shared a number of virulence factor genes, including genes encoding streptolysin O, NADase, and streptokinase A, distantly related to SIC (DRS), suggesting the importance of these factors in the development of invasive disease. GGS_124 contained 3 prophages, with one containing a virulence factor gene for streptodornase. All 3 prophages were significantly similar to GAS prophages that carry virulence factor genes, indicating that these prophages had transferred these genes between pathogens. SDSE was found to contain a gene encoding a superantigen, streptococcal exotoxin type G, but lacked several genes present in GAS that encode virulence factors, such as other superantigens, cysteine protease speB, and hyaluronan synthase operon hasABC. Similar to GGS_124, the SDSE strains contained larger numbers of clustered, regularly interspaced, short palindromic repeats (CRISPR) spacers than did GAS, suggesting that horizontal gene transfer via streptococcal phages between SDSE and GAS is somewhat restricted, although they share phage species. Conclusion Genome wide comparisons of SDSE with GAS indicate that SDSE is closely and quantitatively related to GAS. SDSE, however, lacks several virulence factors of GAS, including superantigens, SPE-B and the hasABC operon. CRISPR spacers may limit the horizontal transfer of phage encoded GAS virulence genes into SDSE. 
These findings may provide clues for dissecting the pathological roles of the virulence factors in SDSE and GAS that cause STSS. PMID:21223537
Shimomura, Yumi; Okumura, Kayo; Murayama, Somay Yamagata; Yagi, Junji; Ubukata, Kimiko; Kirikae, Teruo; Miyoshi-Akiyama, Tohru
2011-01-11
Streptococcus dysgalactiae subsp. equisimilis (SDSE) causes invasive streptococcal infections, including streptococcal toxic shock syndrome (STSS), as does Lancefield group A Streptococcus pyogenes (GAS). We sequenced the entire genome of SDSE strain GGS_124 isolated from a patient with STSS. We found that GGS_124 consisted of a circular genome of 2,106,340 bp. Comparative analyses among bacterial genomes indicated that GGS_124 was most closely related to GAS. GGS_124 and GAS, but not other streptococci, shared a number of virulence factor genes, including genes encoding streptolysin O, NADase, and streptokinase A, distantly related to SIC (DRS), suggesting the importance of these factors in the development of invasive disease. GGS_124 contained 3 prophages, with one containing a virulence factor gene for streptodornase. All 3 prophages were significantly similar to GAS prophages that carry virulence factor genes, indicating that these prophages had transferred these genes between pathogens. SDSE was found to contain a gene encoding a superantigen, streptococcal exotoxin type G, but lacked several genes present in GAS that encode virulence factors, such as other superantigens, cysteine protease speB, and hyaluronan synthase operon hasABC. Similar to GGS_124, the SDSE strains contained larger numbers of clustered, regularly interspaced, short palindromic repeats (CRISPR) spacers than did GAS, suggesting that horizontal gene transfer via streptococcal phages between SDSE and GAS is somewhat restricted, although they share phage species. Genome wide comparisons of SDSE with GAS indicate that SDSE is closely and quantitatively related to GAS. SDSE, however, lacks several virulence factors of GAS, including superantigens, SPE-B and the hasABC operon. CRISPR spacers may limit the horizontal transfer of phage encoded GAS virulence genes into SDSE. 
These findings may provide clues for dissecting the pathological roles of the virulence factors in SDSE and GAS that cause STSS.
Young, Michael; Artsatbanov, Vladislav; Beller, Harry R.; Chandra, Govind; Chater, Keith F.; Dover, Lynn G.; Goh, Ee-Been; Kahan, Tamar; Kaprelyants, Arseny S.; Kyrpides, Nikos; Lapidus, Alla; Lowry, Stephen R.; Lykidis, Athanasios; Mahillon, Jacques; Markowitz, Victor; Mavromatis, Konstantinos; Mukamolova, Galina V.; Oren, Aharon; Rokem, J. Stefan; Smith, Margaret C. M.; Young, Danielle I.; Greenblatt, Charles L.
2010-01-01
Micrococcus luteus (NCTC2665, “Fleming strain”) has one of the smallest genomes of free-living actinobacteria sequenced to date, comprising a single circular chromosome of 2,501,097 bp (G+C content, 73%) predicted to encode 2,403 proteins. The genome shows extensive synteny with that of the closely related organism, Kocuria rhizophila, from which it was taxonomically separated relatively recently. Despite its small size, the genome harbors 73 insertion sequence (IS) elements, almost all of which are closely related to elements found in other actinobacteria. An IS element is inserted into the rrs gene of one of only two rrn operons found in M. luteus. The genome encodes only four sigma factors and 14 response regulators, a finding indicative of adaptation to a rather strict ecological niche (mammalian skin). The high sensitivity of M. luteus to β-lactam antibiotics may result from the presence of a reduced set of penicillin-binding proteins and the absence of a wblC gene, which plays an important role in the antibiotic resistance in other actinobacteria. Consistent with the restricted range of compounds it can use as a sole source of carbon for energy and growth, M. luteus has a minimal complement of genes concerned with carbohydrate transport and metabolism and its inability to utilize glucose as a sole carbon source may be due to the apparent absence of a gene encoding glucokinase. Uniquely among characterized bacteria, M. luteus appears to be able to metabolize glycogen only via trehalose and to make trehalose only via glycogen. It has very few genes associated with secondary metabolism. In contrast to most other actinobacteria, M. luteus encodes only one resuscitation-promoting factor (Rpf) required for emergence from dormancy, and its complement of other dormancy-related proteins is also much reduced. M. luteus is capable of long-chain alkene biosynthesis, which is of interest for advanced biofuel production; a three-gene cluster essential for this metabolism has been identified in the genome. PMID:19948807
Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing.
Zhang, Jin; Ruhlman, Tracey A; Mower, Jeffrey P; Jansen, Robert K
2013-12-29
Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. 
In addition, results indicated no major improvements in breadth of coverage with data sets larger than six billion nucleotides or when sampling RNA from four tissue types rather than from a single tissue. Finally, this work demonstrates the power of cross-compartmental genomic analyses to deepen our understanding of the correlated evolution of the nuclear, plastid, and mitochondrial genomes in plants.
Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing
2013-01-01
Background Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Results Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. 
Conclusions The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. In addition, results indicated no major improvements in breadth of coverage with data sets larger than six billion nucleotides or when sampling RNA from four tissue types rather than from a single tissue. Finally, this work demonstrates the power of cross-compartmental genomic analyses to deepen our understanding of the correlated evolution of the nuclear, plastid, and mitochondrial genomes in plants. PMID:24373163
Xie, Jian-Bo; Du, Zhenglin; Bai, Lanqing; Tian, Changfu; Zhang, Yunzhi; Xie, Jiu-Yan; Wang, Tianshu; Liu, Xiaomeng; Chen, Xi; Cheng, Qi; Chen, Sanfeng; Li, Jilun
2014-01-01
We provide here a comparative genome analysis of 31 strains within the genus Paenibacillus including 11 new genomic sequences of N2-fixing strains. The heterogeneity of the 31 genomes (15 N2-fixing and 16 non-N2-fixing Paenibacillus strains) was reflected in the large size of the shell genome, which makes up approximately 65.2% of the genes in the pan-genome. Large numbers of transposable elements might be related to the heterogeneity. We discovered that a minimal and compact nif cluster comprising nine genes nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV encoding Mo-nitrogenase is conserved in the 15 N2-fixing strains. The nif cluster is under control of a σ70-dependent promoter and possesses a GlnR/TnrA-binding site in the promoter. The Suf system encoding [Fe–S] cluster is highly conserved in N2-fixing and non-N2-fixing strains. Furthermore, we demonstrate that the nif cluster enabled Escherichia coli JM109 to fix nitrogen. Phylogeny of the concatenated NifHDK sequences indicates that Paenibacillus and Frankia are sister groups. Phylogeny of the concatenated 275 single-copy core genes suggests that the ancestral Paenibacillus did not fix nitrogen. The N2-fixing Paenibacillus strains were generated by acquiring the nif cluster via horizontal gene transfer (HGT) from a source related to Frankia. During the history of evolution, the nif cluster was lost, producing some non-N2-fixing strains, and vnf encoding V-nitrogenase or anf encoding Fe-nitrogenase was acquired, causing further diversification of some strains. In addition, some N2-fixing strains have additional nif and nif-like genes which may result from gene duplications. The evolution of nitrogen fixation in Paenibacillus involves a mix of gain, loss, HGT and duplication of nif/anf/vnf genes. This study not only reveals the organization and distribution of nitrogen fixation genes in Paenibacillus, but also provides insight into the complex evolutionary history of nitrogen fixation. PMID:24651173
Bottje, Walter G.; Khatri, Bhuwan; Shouse, Stephanie A.; Seo, Dongwon; Mallmann, Barbara; Orlowski, Sara K.; Pan, Jeonghoon; Kong, Seongbae; Owens, Casey M.; Anthony, Nicholas B.; Kim, Jae K.; Kong, Byungwhi C.
2017-01-01
Background: Although small non-coding RNAs are mostly encoded by the nuclear genome, thousands of small non-coding RNAs encoded by the mitochondrial genome, termed mitosRNAs, were recently reported in human, mouse and trout. In this study, we first identified chicken mitosRNAs in breast muscle using small RNA sequencing, and the differential abundance was analyzed between modern pedigree male (PeM) broilers (characterized by rapid growth and large muscle mass) and the foundational Barred Plymouth Rock (BPR) chickens (characterized by slow growth and small muscle mass). Methods: Small RNA sequencing was performed with total RNAs extracted from breast muscles of PeM and BPR (n = 6 per group) using the 1 × 50 bp single-end read method of Illumina sequencing. Raw reads were processed by quality assessment, adapter trimming, and alignment to the chicken mitochondrial genome (GenBank Accession: X52392.1) using the NGen program. Further statistical analyses were performed using JMP Genomics 8. Differentially expressed (DE) mitosRNAs between PeM and BPR were confirmed by quantitative PCR. Results: A total of 183,416 unique small RNA sequences were identified as potential chicken mitosRNAs. After stringent filtering processes, 117 mitosRNAs showing >100 raw read counts were abundantly produced from all 37 mitochondrial genes (except the D-loop region), and the length of mitosRNAs ranged from 22 to 46 nucleotides. Of those, the abundance of 44 mitosRNAs was significantly altered in breast muscles of PeM compared to those of BPR: all mitosRNAs were higher in PeM breast except those produced from the 16S-rRNA gene. Possibly, the higher mitosRNA abundance in PeM breast may be due to a higher mitochondrial content compared to BPR. Our data demonstrate that, in addition to the 37 known mitochondrial genes, the mitochondrial genome also encodes abundant mitosRNAs that may play an important regulatory role in muscle growth via mitochondrial gene expression control. PMID:29104541
Scott, Kathleen M; Williams, John; Porter, Cody M B; Russel, Sydney; Harmer, Tara L; Paul, John H; Antonen, Kirsten M; Bridges, Megan K; Camper, Gary J; Campla, Christie K; Casella, Leila G; Chase, Eva; Conrad, James W; Cruz, Mercedez C; Dunlap, Darren S; Duran, Laura; Fahsbender, Elizabeth M; Goldsmith, Dawn B; Keeley, Ryan F; Kondoff, Matthew R; Kussy, Breanna I; Lane, Marannda K; Lawler, Stephanie; Leigh, Brittany A; Lewis, Courtney; Lostal, Lygia M; Marking, Devon; Mancera, Paola A; McClenthan, Evan C; McIntyre, Emily A; Mine, Jessica A; Modi, Swapnil; Moore, Brittney D; Morgan, William A; Nelson, Kaleigh M; Nguyen, Kimmy N; Ogburn, Nicholas; Parrino, David G; Pedapudi, Anangamanjari D; Pelham, Rebecca P; Preece, Amanda M; Rampersad, Elizabeth A; Richardson, Jason C; Rodgers, Christina M; Schaffer, Brent L; Sheridan, Nancy E; Solone, Michael R; Staley, Zachery R; Tabuchi, Maki; Waide, Ramond J; Wanjugi, Pauline W; Young, Suzanne; Clum, Alicia; Daum, Chris; Huntemann, Marcel; Ivanova, Natalia; Kyrpides, Nikos; Mikhailova, Natalia; Palaniappan, Krishnaveni; Pillay, Manoj; Reddy, T B K; Shapiro, Nicole; Stamatis, Dimitrios; Varghese, Neha; Woyke, Tanja; Boden, Rich; Freyermuth, Sharyn K; Kerfeld, Cheryl A
2018-03-09
Chemolithoautotrophic bacteria from the genera Hydrogenovibrio, Thiomicrorhabdus and Thiomicrospira are common, sometimes dominant, isolates from sulfidic habitats including hydrothermal vents, soda and salt lakes and marine sediments. Their genome sequences confirm their membership in a deeply branching clade of the Gammaproteobacteria. Several adaptations to heterogeneous habitats are apparent. Their genomes include large numbers of genes for sensing and responding to their environment (EAL- and GGDEF-domain proteins and methyl-accepting chemotaxis proteins) despite their small sizes (2.1-3.1 Mbp). An array of sulfur-oxidizing complexes are encoded, likely to facilitate these organisms' use of multiple forms of reduced sulfur as electron donors. Hydrogenase genes are present in some taxa, including group 1d and 2b hydrogenases in Hydrogenovibrio marinus and H. thermophilus MA2-6, acquired via horizontal gene transfer. In addition to high-affinity cbb3 cytochrome c oxidase, some also encode cytochrome bd-type quinol oxidase or ba3-type cytochrome c oxidase, which could facilitate growth under different oxygen tensions, or maintain redox balance. Carboxysome operons are present in most, with genes downstream encoding transporters from four evolutionarily distinct families, which may act with the carboxysomes to form CO2-concentrating mechanisms. These adaptations to habitat variability likely contribute to the cosmopolitan distribution of these organisms. © 2018 Society for Applied Microbiology and John Wiley & Sons Ltd.
Molecular evolution of nitrogen assimilatory enzymes in marine prasinophytes.
Ghoshroy, Sohini; Robertson, Deborah L
2015-01-01
Nitrogen assimilation is a highly regulated process requiring metabolic coordination of enzymes and pathways in the cytosol, chloroplast, and mitochondria. Previous studies of prasinophyte genomes revealed that genes encoding nitrate and ammonium transporters have a complex evolutionary history involving both vertical and horizontal transmission. Here we examine the evolutionary history of well-conserved nitrogen-assimilating enzymes to determine if a similarly complex history is observed. Phylogenetic analyses suggest that genes encoding glutamine synthetase (GS) III in the prasinophytes evolved by horizontal gene transfer from a member of the heterokonts. In contrast, genes encoding GSIIE, a canonical vascular plant and green algal enzyme, were found in the Micromonas genomes but have been lost from Ostreococcus. Phylogenetic analyses placed the Micromonas GSIIs in a larger chlorophyte/vascular plant clade; a similar topology was observed for ferredoxin-dependent nitrite reductase (Fd-NiR), indicating that the genes encoding GSII and Fd-NiR in these prasinophytes evolved via vertical transmission. Our results show that genes encoding the nitrogen-assimilating enzymes in Micromonas and Ostreococcus have been differentially lost as well as recruited from different evolutionary lineages, suggesting that the regulation of nitrogen assimilation in prasinophytes will differ from that of other green algae.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dyer, K.D.; Handen, J.S.; Rosenberg, H.F.
The Charcot-Leyden crystal (CLC) protein, or eosinophil lysophospholipase, is a characteristic protein of human eosinophils and basophils; recent work has demonstrated that the CLC protein is both structurally and functionally related to the galectin family of β-galactoside binding proteins. The galectins as a group share a number of features in common, including a linear ligand binding site encoded on a single exon. In this work, we demonstrate that the intron-exon structure of the gene encoding CLC is analogous to those encoding the galectins. The coding sequence of the CLC gene is divided into four exons, with the entire β-galactoside binding site encoded by exon III. We have isolated CLC β-galactoside binding sites from both orangutan (Pongo pygmaeus) and murine (Mus musculus) genomic DNAs, both encoded on single exons, and noted conservation of the amino acids shown to interact directly with the β-galactoside ligand. The most likely interpretation of these results suggests the occurrence of one or more exon duplication and insertion events, resulting in the distribution of this lectin domain to CLC as well as to the multiple galectin genes. 35 refs., 3 figs.
McClendon, T. Brooke; Mainpal, Rana; Amrit, Francis R. G.; Krause, Michael W.; Ghazi, Arjumand; Yanowitz, Judith L.
2016-01-01
The germ line efficiently combats numerous genotoxic insults to ensure the high fidelity propagation of unaltered genomic information across generations. Yet, germ cells in most metazoans also intentionally create double-strand breaks (DSBs) to promote DNA exchange between parental chromosomes, a process known as crossing over. Homologous recombination is employed in the repair of both genotoxic lesions and programmed DSBs, and many of the core DNA repair proteins function in both processes. In addition, DNA repair efficiency and crossover (CO) distribution are both influenced by local and global differences in chromatin structure, yet the interplay between chromatin structure, genome integrity, and meiotic fidelity is still poorly understood. We have used the xnd-1 mutant of Caenorhabditis elegans to explore the relationship between genome integrity and crossover formation. Known for its role in ensuring X chromosome CO formation and germ line development, we show that xnd-1 also regulates genome stability. xnd-1 mutants exhibited a mortal germ line, high embryonic lethality, high incidence of males, and sensitivity to ionizing radiation. We discovered that a hypomorphic allele of mys-1 suppressed these genome instability phenotypes of xnd-1, but did not suppress the CO defects, suggesting it serves as a separation-of-function allele. mys-1 encodes a histone acetyltransferase, whose homolog Tip60 acetylates H2AK5, a histone mark associated with transcriptional activation that is increased in xnd-1 mutant germ lines, raising the possibility that thresholds of H2AK5ac may differentially influence distinct germ line repair events. We also show that xnd-1 regulated him-5 transcriptionally, independently of mys-1, and that ectopic expression of him-5 suppressed the CO defects of xnd-1. Our work provides xnd-1 as a model in which to study the link between chromatin factors, gene expression, and genome stability. PMID:27678523
USDA-ARS's Scientific Manuscript database
As considerable progress has been made on producing draft quality genomic sequence for many food animal species, the next goal for genomics research is a greater understanding of gene regulation and expression. The EU-US Animal Biotechnology Working Group (ABWG), established by the EU-US Biotechnolo...
Chandra, Saket; Kazmi, Andaleeb Z; Ahmed, Zainab; Roychowdhury, Gargi; Kumari, Veena; Kumar, Manish; Mukhopadhyay, Kunal
2017-07-01
NB-ARC domain-containing resistance genes from the wheat genome were identified, characterized and localized on chromosome arms that displayed differential yet positive response during incompatible and compatible leaf rust interactions. Wheat (Triticum aestivum L.) is an important cereal crop; however, its production is affected severely by numerous diseases including rusts. An efficient, cost-effective and ecologically viable approach to control pathogens is through host resistance. In wheat, high numbers of resistance loci are present but only a few have been identified and cloned. A comprehensive analysis of the NB-ARC-containing genes in the complete wheat genome was accomplished in this study. Complete NB-ARC encoding genes were mined from the Ensembl Plants database to predict 604 NB-ARC containing sequences using the HMM approach. Genome-wide analysis of orthologous clusters in the NB-ARC-containing sequences of wheat and other members of the Poaceae family revealed maximum homology with Oryza sativa indica and Brachypodium distachyon. The identification of overlap between orthologous clusters enabled the elucidation of the function and evolution of resistance proteins. The distributions of the NB-ARC domain-containing sequences were found to be balanced among the three wheat sub-genomes. Wheat chromosome arms 4AL and 7BL had the most NB-ARC domain-containing contigs. The spatio-temporal expression profiling studies exemplified the positive role of these genes in resistant and susceptible wheat plants during incompatible and compatible interaction in response to the leaf rust pathogen Puccinia triticina. Two NB-ARC domain-containing sequences were modelled in silico, cloned and sequenced to analyze their fine structures. The data obtained in this study will augment isolation, characterization and application of NB-ARC resistance genes in marker-assisted selection based breeding programs for improving rust resistance in wheat.
Heo, Min-Ji; Jung, Hwi-Min; Um, Jaeyong; Lee, Sang-Woo; Oh, Min-Kyu
2017-02-17
Genome editing using CRISPR/Cas9 was successfully demonstrated in Escherichia coli to effectively produce n-butanol in a defined medium under microaerobic conditions. The butanol synthetic pathway genes, including those encoding oxygen-tolerant alcohol dehydrogenase, were overexpressed in metabolically engineered E. coli, resulting in 0.82 g/L butanol production. To increase butanol production, carbon flux from acetyl-CoA to the citric acid cycle should be redirected to acetoacetyl-CoA. For this purpose, the 5'-untranslated region sequence of gltA encoding citrate synthase was designed using an expression prediction program, UTR designer, and modified using the CRISPR/Cas9 genome editing method to reduce its expression level. E. coli strains with decreased citrate synthase expression produced more butanol and the citrate synthase activity was correlated with butanol production. These results demonstrate that redistributing carbon flux using genome editing is an efficient engineering tool for metabolite overproduction.
Deeg, Christoph M; Chow, Cheryl-Emiliane T
2018-01-01
Giant viruses are ecologically important players in aquatic ecosystems that have challenged concepts of what constitutes a virus. Herein, we present the giant Bodo saltans virus (BsV), the first characterized representative of the most abundant group of giant viruses in ocean metagenomes, and the first isolate of a klosneuvirus, a subgroup of the Mimiviridae proposed from metagenomic data. BsV infects an ecologically important microzooplankton, the kinetoplastid Bodo saltans. Its 1.39 Mb genome encodes 1227 predicted ORFs, including a complex replication machinery. Yet, much of its translational apparatus has been lost, including all tRNAs. Essential genes are invaded by homing endonuclease-encoding self-splicing introns that may defend against competing viruses. Putative anti-host factors show extensive gene duplication via a genomic accordion indicating an ongoing evolutionary arms race and highlighting the rapid evolution and genomic plasticity that has led to genome gigantism and the enigma that is giant viruses. PMID:29582753
Zhou, En -Min; Murugapiran, Senthil K.; Mefferd, Chrisabelle C.; ...
2016-02-27
Thermus amyloliquefaciens type strain YIM 77409T is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409T together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4% average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. Furthermore, a denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.
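Summary statistics like the average GC content and gene tallies reported above are simple to recompute from an assembly. The following is a minimal sketch, assuming scaffolds are available as plain nucleotide strings; the helper name `gc_content` and the toy sequences are hypothetical, and ambiguous bases such as N are excluded from the denominator, which is one of several conventions in use.

```python
def gc_content(seq):
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt if acgt else 0.0


def assembly_gc(scaffolds):
    """Length-weighted GC percentage over several scaffolds."""
    # Concatenating is equivalent to weighting each scaffold by its length.
    return gc_content("".join(scaffolds))


# Toy scaffolds (hypothetical, not the YIM 77409T assembly).
scaffolds = ["GGCCAT", "ATGCNN"]
print(round(assembly_gc(scaffolds), 1))

# The gene tally in the abstract is internally consistent:
protein_coding, rna_genes = 2257, 56
print(protein_coding + rna_genes)
```

The last two lines simply confirm that the protein-coding and RNA gene counts quoted in the abstract add up to the stated total of 2,313.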
The Genome of the Sea Urchin Strongylocentrotus purpuratus
2011-01-01
We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes. PMID:17095691
Dai, Xin; Wang, Haina; Zhang, Zhenfeng; Li, Kuan; Zhang, Xiaoling; Mora-López, Marielos; Jiang, Chengying; Liu, Chang; Wang, Li; Zhu, Yaxin; Hernández-Ascencio, Walter; Dong, Zhiyang; Huang, Li
2016-01-01
The genome of Sulfolobus sp. A20 isolated from a hot spring in Costa Rica was sequenced. This circular genome of the strain is 2,688,317 bp in size and 34.8% in G+C content, and contains 2591 open reading frames (ORFs). Strain A20 shares ~95.6% identity at the 16S rRNA gene sequence level and <30% DNA-DNA hybridization (DDH) values with the most closely related known Sulfolobus species (i.e., Sulfolobus islandicus and Sulfolobus solfataricus), suggesting that it represents a novel Sulfolobus species. Comparison of the genome of strain A20 with those of the type strains of S. solfataricus, Sulfolobus acidocaldarius, S. islandicus, and Sulfolobus tokodaii, which were isolated from geographically separated areas, identified 1801 genes conserved among all Sulfolobus species analyzed (core genes). Comparative genome analyses show that central carbon metabolism in Sulfolobus is highly conserved, and enzymes involved in the Entner-Doudoroff pathway, the tricarboxylic acid cycle and the CO2 fixation pathways are predominantly encoded by the core genes. All Sulfolobus species encode genes required for the conversion of ammonium into glutamate/glutamine. Some Sulfolobus strains have gained the ability to utilize additional nitrogen sources such as nitrate (i.e., S. islandicus strains REY15A, LAL14/1, M14.25, and M16.27) or urea (i.e., S. islandicus HEV10/4, S. tokodaii strain 7, and S. metallicus DSM 6482). The strategies for sulfur metabolism are the most diverse and least understood. S. tokodaii encodes sulfur oxygenase/reductase (SOR), whereas both S. islandicus and S. solfataricus contain genes for sulfur reductase (SRE). However, neither SOR nor SRE genes exist in the genome of strain A20, raising the possibility that an unknown pathway for the utilization of elemental sulfur may be present in the strain. The ability of Sulfolobus to utilize nitrate or sulfur is encoded by a gene cluster flanked by IS elements or their remnants. These clusters appear to have become fixed at a specific genomic site in some strains and lost in other strains during the course of evolution. The versatility in nitrogen and sulfur metabolism may represent adaptation of Sulfolobus to thriving in different habitats. PMID:27965637
Genome organization of epidemic Acinetobacter baumannii strains.
Di Nocera, Pier Paolo; Rocco, Francesco; Giannouli, Maria; Triassi, Maria; Zarrilli, Raffaele
2011-10-10
Acinetobacter baumannii is an opportunistic pathogen responsible for hospital-acquired infections. A. baumannii epidemics described world-wide were caused by a few genotypic clusters of strains. The occurrence of epidemics caused by multi-drug resistant strains assigned to novel genotypes has been reported over the last few years. In the present study, we compared whole genome sequences of three A. baumannii strains assigned to genotypes ST2, ST25 and ST78, representative of the most frequent genotypes responsible for epidemics in several Mediterranean hospitals, and four complete genome sequences of A. baumannii strains assigned to genotypes ST1, ST2 and ST77. Comparative genome analysis showed extensive synteny and identified 3068 coding regions which are conserved, at the same chromosomal position, in all A. baumannii genomes. Genome alignments also identified 63 DNA regions, ranging in size from 4 to 126 kb, all defined as genomic islands, which were present in some genomes, but were either missing or replaced by non-homologous DNA sequences in others. Some islands are involved in resistance to drugs and metals, others carry genes encoding surface proteins or enzymes involved in specific metabolic pathways, and others correspond to prophage-like elements. Accessory DNA regions encode 12 to 19% of the potential gene products of the analyzed strains. The analysis of a collection of epidemic A. baumannii strains showed that some islands were restricted to specific genotypes. The definition of the genome components of A. baumannii provides a scaffold to rapidly evaluate the genomic organization of novel clinical A. baumannii isolates. Changes in island profiling will be useful in genomic epidemiology of A. baumannii populations.
Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele
2016-03-15
Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We help close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
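The abstract does not spell out its three formulas, so the sketch below should not be read as the authors' method. It only illustrates two generic, widely used ways of quantifying how complex a DNA subsequence is: the Shannon entropy of its k-mer distribution (an information-theoretic measure) and a simple linguistic complexity ratio of observed to maximum possible distinct k-mers (a combinatorial measure). The function names, the choice of k, and the example sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=2):
    """Shannon entropy (bits) of the k-mer frequency distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def linguistic_complexity(seq, k=2, alphabet_size=4):
    """Distinct k-mers observed divided by the maximum number possible.

    The maximum is limited both by the alphabet (alphabet_size ** k)
    and by the number of k-mer positions in the sequence.
    """
    positions = len(seq) - k + 1
    if positions <= 0:
        return 0.0
    distinct = len({seq[i:i + k] for i in range(positions)})
    return distinct / min(alphabet_size ** k, positions)


# A homopolymer run is minimally complex; a mixed sequence scores higher.
low = "AAAAAAAA"
mixed = "ACGTTGCA"
print(kmer_entropy(low), kmer_entropy(mixed))
print(linguistic_complexity(low), linguistic_complexity(mixed))
```

Scores like these could be computed in sliding windows along a genome and compared with a nucleosome occupancy track, which is the kind of correlation the abstract describes.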
Gruber, Sabine; Omann, Markus; Rodrìguez, Carolina Escobar; Radebner, Theresa; Zeilinger, Susanne
2012-11-17
Species of the fungal genus Trichoderma are important industrial producers of cellulases and hemicellulases, but also widely used as biocontrol agents (BCAs) in agriculture. In the latter function Trichoderma species stimulate plant growth, induce plant defense and directly antagonize plant pathogenic fungi through their mycoparasitic capabilities. The recent release of the genome sequences of four mycoparasitic Trichoderma species now forms the basis for large-scale genetic manipulations of these important BCAs. Thus far, only a limited number of dominant selection markers, including Hygromycin B resistance (hph) and the acetamidase-encoding amdS gene, have been available for transformation of Trichoderma spp. For more extensive functional genomics studies the utilization of additional dominant markers will be essential. We established the Escherichia coli neomycin phosphotransferase II-encoding nptII gene as a novel selectable marker for the transformation of Trichoderma atroviride conferring geneticin resistance. The nptII marker cassette was stably integrated into the fungal genome and transformants exhibited unaltered phenotypes compared to the wild-type. Co-transformation of T. atroviride with nptII and a constitutively activated version of the Gα subunit-encoding tga3 gene (tga3Q207L) resulted in a high number of mitotically stable, geneticin-resistant transformants. Further analyses revealed a co-transformation frequency of 68% with 15 transformants having additionally integrated tga3Q207L into their genome. Constitutive activation of the Tga3-mediated signaling pathway resulted in increased vegetative growth and an enhanced ability to antagonize plant pathogenic host fungi. The neomycin phosphotransferase II-encoding nptII gene from Escherichia coli proved to be a valuable tool for conferring geneticin resistance to the filamentous fungus T. atroviride thereby contributing to an enhanced genetic tractability of these important BCAs.
Novel microRNA-like viral small regulatory RNAs arising during human hepatitis A virus infection.
Shi, Jiandong; Sun, Jing; Wang, Bin; Wu, Meini; Zhang, Jing; Duan, Zhiqing; Wang, Haixuan; Hu, Ningzhu; Hu, Yunzhang
2014-10-01
MicroRNAs (miRNAs), including host miRNAs and viral miRNAs, play vital roles in regulating host-virus interactions. DNA viruses encode miRNAs that regulate the viral life cycle. However, it is generally believed that cytoplasmic RNA viruses do not encode miRNAs, owing to inaccessible cellular miRNA processing machinery. Here, we provide a comprehensive genome-wide analysis and identification of miRNAs that were derived from hepatitis A virus (HAV; Hu/China/H2/1982), which is a typical cytoplasmic RNA virus. Using deep-sequencing and in silico approaches, we identified 2 novel virally encoded miRNAs, named hav-miR-1-5p and hav-miR-2-5p. Both of the novel virally encoded miRNAs were clearly detected in infected cells. Analysis of Dicer enzyme silencing demonstrated that HAV-derived miRNA biogenesis is Dicer dependent. Furthermore, we confirmed that HAV mature miRNAs were generated from viral miRNA precursors (pre-miRNAs) in host cells. Notably, naturally derived HAV miRNAs were biologically and functionally active and induced post-transcriptional gene silencing (PTGS). Genomic location analysis revealed novel miRNAs located in the coding region of the viral genome. Overall, our results show that HAV naturally generates functional miRNA-like small regulatory RNAs during infection. This is the first report of miRNAs derived from the coding region of genomic RNA of a cytoplasmic RNA virus. These observations demonstrate that a cytoplasmic RNA virus can naturally generate functional miRNAs, as DNA viruses do. These findings also contribute to improved understanding of host-RNA virus interactions mediated by RNA virus-derived miRNAs. © FASEB.
Whole-genome expression analysis of mammalian-wide interspersed repeat elements in human cell lines.
Carnevali, Davide; Conti, Anastasia; Pellegrini, Matteo; Dieci, Giorgio
2017-02-01
With more than 500,000 copies, mammalian-wide interspersed repeats (MIRs), a sub-group of SINEs, represent ∼2.5% of the human genome and one of the most numerous families of potential targets for the RNA polymerase (Pol) III transcription machinery. Since MIR elements ceased to amplify ∼130 myr ago, previous studies primarily focused on their genomic impact, while the issue of their expression has not been extensively addressed. We applied a dedicated bioinformatic pipeline to ENCODE RNA-Seq datasets of seven human cell lines and, for the first time, we were able to define the Pol III-driven MIR transcriptome at single-locus resolution. While the majority of Pol III-transcribed MIR elements are cell-specific, we discovered a small set of ubiquitously transcribed MIRs mapping within Pol II-transcribed genes in antisense orientation that could influence the expression of the overlapping gene. We also identified novel Pol III-transcribed ncRNAs, deriving from transcription of annotated MIR fragments flanked by unique MIR-unrelated sequences, and confirmed the role of Pol III-specific internal promoter elements in MIR transcription. Besides demonstrating widespread transcription at these retrotranspositionally inactive elements in human cells, the ability to profile MIR expression at single-locus resolution will facilitate their study in different cell types and states including pathological alterations. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Dia, Ndongo; Lavie, Laurence; Méténier, Guy; Toguebaye, Bhen S; Vivarès, Christian P; Cornillot, Emmanuel
2007-03-01
Microsporidia are fungi-related obligate intracellular parasites that infect numerous animals, including man. Encephalitozoon cuniculi harbours a very small genome (2.9 Mbp) with about 2,000 coding sequences (CDSs). Most repeated CDSs are of unknown function and are distributed in subterminal regions that mark the transitions between subtelomeric rDNA units and chromosome cores. A potential multigenic family (interB) encoding proteins within a size range of 579-641 aa was investigated by PCR and RT-PCR. Thirty members were finally assigned to the E. cuniculi interB family and a predominant interB transcript was found to originate from a newly identified gene on chromosome III. Microsporidian species from eight different genera infecting insects, fishes or mammals, were tested for a possible intra-phylum conservation of interB genes. Only representatives of the Encephalitozoon, Vittaforma and Brachiola genera, differing in host range but all able to invade humans, were positive. Molecular karyotyping of Brachiola algerae showed a complex set of chromosome bands, providing a haploid genome size estimate of 15-20 Mbp. In spite of this large difference in genome complexity, B. algerae and E. cuniculi shared some similar interB gene copies and a common location of interB genes in near-rDNA subterminal regions.
Kelly, Steven L.; Kelly, Diane E.
2013-01-01
The first eukaryote genome revealed three yeast cytochromes P450 (CYPs), hence the subsequent realization that some microbial fungal genomes encode these proteins in 1 per cent or more of all genes (greater than 100) has been surprising. They are unique biocatalysts undertaking a wide array of stereo- and regio-specific reactions and so hold promise in many applications. Based on ancestral activities that included 14α-demethylation during sterol biosynthesis, it is now seen that CYPs are part of the genes and metabolism of most eukaryotes. In contrast, Archaea and Eubacteria often do not contain CYPs, while those that do are frequently interesting as producers of natural products undertaking their oxidative tailoring. Apart from roles in primary and secondary metabolism, microbial CYPs are actual/potential targets of drugs/agrochemicals and CYP51 in sterol biosynthesis is exhibiting evolution to resistance in the clinic and the field. Other CYP applications include the first industrial biotransformation for corticosteroid production in the 1950s, the diversion into penicillin synthesis in early mutations in fungal strain improvement and bioremediation using bacteria and fungi. The vast untapped resource of orphan CYPs in numerous genomes is being probed and new methods for discovering function and for discovering desired activities are being investigated. PMID:23297358
USDA-ARS's Scientific Manuscript database
Wheat genomes encode pathogenesis-related protein 1 (PR-1)/receptor-like kinase (RK) hybrid proteins as first reported for hexaploid wheat. To date, no PR-1-RK-like proteins have been identified in the diploid wild wheat Triticum urartu, the A-genome progenitor of hexaploid wheat. Here we report the...
Reysenbach, Anna-Louise; Donaho, John; Kelley, John; ...
2018-03-15
A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism.
Draft Genome Sequence of the d-Xylose-Fermenting Yeast Spathaspora arborariae UFMG-HM19.1AT
Lobo, Francisco P.; Gonçalves, Davi L.; Alves, Sergio L.; Gerber, Alexandra L.; de Vasconcelos, Ana Tereza R.; Basso, Luiz C.; Franco, Glória R.; Soares, Marco A.; Cadete, Raquel M.; Rosa, Carlos A.
2014-01-01
The draft genome sequence of the yeast Spathaspora arborariae UFMG-HM19.1AT (CBS 11463 = NRRL Y-48658) is presented here. The sequenced genome size is 12.7 Mb, consisting of 41 scaffolds containing a total of 5,625 predicted open reading frames, including many genes encoding enzymes and transporters involved in d-xylose fermentation. PMID:24435867
Draft Genome Sequence of Marinobacter sp. Strain ANT_B65, Isolated from Antarctic Marine Sponge.
de França, Paula; Camilo, Esther; Fantinatti-Garboginni, Fabiana
2018-01-04
Marinobacter sp. strain ANT_B65 was isolated from a sponge collected on King George Island, Antarctica. The draft genome of 4,173,840 bp encodes 3,743 protein-coding open reading frames. The genome will provide insights into the strain's potential use in the production of natural products. Copyright © 2018 de França et al.
Genome analysis of Listeria ivanovii strain G770 that caused a deadly aortic prosthesis infection
Beye, M.; Gouriet, F.; Michelle, C.; Casalta, J.-P.; Habib, G.; Raoult, D.; Fournier, P.-E.
2016-01-01
We sequenced the genome of Listeria ivanovii strain G770, which caused a deadly infection of the thoracic aortic prosthesis of a 78-year-old man. The 2.9 Mb genome exhibited 21 specific genes among L. ivanovii strains, including five genes encoding a type I restriction modification system and one glycopeptide resistance gene. PMID:26933501
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reysenbach, Anna-Louise; Donaho, John; Kelley, John
A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism.
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K.; Fryszczyn, Bartlomiej G.; Fox, George E.; Tirumalai, Madhan R.; Liu, Yamei; Kim, Sun
2015-01-01
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. PMID:25953173
Genome Sequence of the Shiga Toxin-Producing Escherichia coli Strain NCCP15657
Kim, Byung Kwon; Song, Geun Cheol; Hong, Gun Hyong; Seong, Won-Keun; Kim, Seon-Young; Jeong, Haeyoung; Kang, Sung Gyun; Kwon, Soon-Kyeong; Lee, Choong Hoon; Song, Ju Yeon; Yu, Dong Su; Park, Mi-Sun
2012-01-01
Shiga toxin-producing Escherichia coli causes bloody diarrhea and hemolytic-uremic syndrome and serious outbreaks worldwide. Here, we report the draft genome sequence of E. coli NCCP15657 isolated from a patient. The genome has virulence genes, many in the locus of enterocyte effacement (LEE) island, encoding a metalloprotease, the Shiga toxin, and constituents of type III secretion. PMID:22740674
Hallenbeck, Patrick C; Grogger, Melanie; Mraz, Megan; Veverka, Donald
2016-02-18
The draft genome (57.7% GC, 7,647,882 bp) of the novel thermophilic cyanobacterium MTP1 was determined by metagenomics of an enrichment culture. The genome shows that it is in the family Oscillatoriales and encodes multiple heavy metal resistances as well as the capacity to make exopolysaccharides. Copyright © 2016 Hallenbeck et al.
Macronuclear Genome Sequence of the Ciliate Tetrahymena thermophila, a Model Eukaryote
Eisen, Jonathan A; Coyne, Robert S; Wu, Martin; Wu, Dongying; Thiagarajan, Mathangi; Wortman, Jennifer R; Badger, Jonathan H; Ren, Qinghu; Amedeo, Paolo; Jones, Kristie M; Tallon, Luke J; Delcher, Arthur L; Salzberg, Steven L; Silva, Joana C; Haas, Brian J; Majoros, William H; Farzad, Maryam; Carlton, Jane M; Smith, Roger K; Garg, Jyoti; Pearlman, Ronald E; Karrer, Kathleen M; Sun, Lei; Manning, Gerard; Elde, Nels C; Turkewitz, Aaron P; Asai, David J; Wilkes, David E; Wang, Yufeng; Cai, Hong; Collins, Kathleen; Stewart, B. Andrew; Lee, Suzanne R; Wilamowska, Katarzyna; Weinberg, Zasha; Ruzzo, Walter L; Wloga, Dorota; Gaertig, Jacek; Frankel, Joseph; Tsao, Che-Chia; Gorovsky, Martin A; Keeling, Patrick J; Waller, Ross F; Patron, Nicola J; Cherry, J. Michael; Stover, Nicholas A; Krieger, Cynthia J; del Toro, Christina; Ryder, Hilary F; Williamson, Sondra C; Barbeau, Rebecca A; Hamilton, Eileen P; Orias, Eduardo
2006-01-01
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance. PMID:16933976
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karpinets, Tatiana V; Park, Byung H; Syed, Mustafa H
Most bacterial symbionts of plants are phenotypically characterized by their parasitic or mutualistic relationship with the host; however, the genomic characteristics that likely discriminate mutualistic symbionts from pathogens of plants are poorly understood. This study comparatively analyzed the genomes of 54 plant-symbiotic bacteria, 27 mutualists and 27 pathogens, to discover genomic determinants of their parasitic and mutualistic nature in terms of protein family domains, KEGG orthologous groups, metabolic pathways and families of carbohydrate-active enzymes (CAZymes). We further used all bacteria with sequenced genomes, published microarrays and transcriptomics experimental datasets, and literature to validate and to explore results of the comparison. The analysis revealed that genomes of mutualists are larger in size and higher in GC content and encode greater molecular, functional and metabolic diversity than the investigated genomes of pathogens. This enriched molecular and functional enzyme diversity included constructive biosynthetic signatures of CAZymes and metabolic pathways in genomes of mutualists compared with catabolic signatures dominant in the genomes of pathogens. Another discriminative characteristic of mutualists is the co-occurrence of gene clusters required for the expression and function of nitrogenase and RuBisCO. Analysis of previously published experimental data indicates that nitrogen-fixing mutualists may employ Rubisco to fix CO2 not in the canonical Calvin-Benson-Bassham cycle but in a novel metabolic pathway, here called Rubisco-based glycolysis, to increase efficiency of sugar utilization during the symbiosis with plants. An important discriminative characteristic of plant pathogenic bacteria is two groups of genes likely encoding effector proteins involved in host invasion and a genomic locus encoding a putative secretion system that includes a DUF1525 domain protein conserved in pathogens of plants and of other organisms.
The protein belongs to the same clan of thioredoxins as the circadian clock protein kaiB found in many mutualistic symbionts and highly abundant in blood cells colonized by a human pathogen, Salmonella enterica serotype Typhi, the cause of typhoid fever.
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
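As an illustration of the exchange rules named in the abstract above, the following sketch applies a cyclic rule such as A→G→C→T→A to a DNA string. The function names and the toy sequence are hypothetical, and this is not the author's pipeline; it is only a minimal reading of what a systematic asymmetric nucleotide exchange would do to a sequence.

```python
def exchange_map(cycle):
    """Build a substitution table from a cyclic rule such as "AGCT" (A->G->C->T->A).

    Nucleotides that do not appear in the cycle are left unchanged.
    """
    table = {}
    for i, base in enumerate(cycle):
        table[base] = cycle[(i + 1) % len(cycle)]
    return table


def apply_exchange(seq, cycle):
    """Replace every nucleotide in seq by its successor in the cycle."""
    table = exchange_map(cycle)
    # Characters outside the cycle (e.g. A in a three-base rule, or N) pass through.
    return "".join(table.get(base, base) for base in seq.upper())


# Hypothetical toy sequence, not taken from the mitochondrial genome.
toy = "ATGCCA"
four_base = apply_exchange(toy, "AGCT")   # rule A->G->C->T->A
three_base = apply_exchange(toy, "CGT")   # rule C->G->T->C, A unchanged
print(four_base, three_base)
```

Because each rule is a cycle, applying a four-nucleotide rule four times (or a three-nucleotide rule three times) returns the original sequence, which is one quick sanity check on any implementation of this kind.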
In Silico Pattern-Based Analysis of the Human Cytomegalovirus Genome
Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T.; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas
2003-01-01
More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/). PMID:12634390
In silico pattern-based analysis of the human cytomegalovirus genome.
Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas
2003-04-01
More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/).
Kirilyuk, Alexander; Tolstonog, Genrich V; Damert, Annette; Held, Ulrike; Hahn, Silvia; Löwer, Roswitha; Buschmann, Christian; Horn, Axel V; Traub, Peter; Schumann, Gerald G
2008-02-01
LINE-1 (L1) is a highly successful autonomous non-LTR retrotransposon and a major force shaping mammalian genomes. Although there are about 600 000 L1 copies covering 23% of the rat genome, full-length rat L1s (L1Rn) with intact open reading frames (ORFs) representing functional master copies for retrotransposition have not been identified yet. In conjunction with studies to elucidate the role of L1 retrotransposons in tumorigenesis, we isolated and characterized 10 different cDNAs from transcribed full-length L1Rn elements in rat chloroleukemia (RCL) cells, each encoding intact ORF1 proteins (ORF1p). We identified the first functional L1Rn retrotransposon from this pool of cDNAs, determined its activity in HeLa cells and in the RCL cell line the cDNAs originated from and demonstrate that it is mobilized in the tumor cell line in which it is expressed. Furthermore, we generated monoclonal antibodies directed against L1Rn ORF1 and ORF2-encoded recombinant proteins, analyzed the expression of L1-encoded proteins and found ORF1p predominantly in the nucleus. Our results support the hypothesis that the reported explosive amplification of genomic L1Rn sequences after their transcriptional activation in RCL cells is based on L1 retrotransposition. Therefore, L1 activity might be one cause for genomic instability observed during the progression of leukemia.
Zhang, Jin; Ruhlman, Tracey A.; Sabir, Jamal S. M.; Blazier, John Chris; Weng, Mao-Lun; Park, Seongjun; Jansen, Robert K.
2016-01-01
Disruption of DNA replication, recombination, and repair (DNA-RRR) systems has been hypothesized to cause highly elevated nucleotide substitution rates and genome rearrangements in the plastids of angiosperms, but this theory remains untested. To investigate nuclear–plastid genome (plastome) coevolution in Geraniaceae, four different measures of plastome complexity (rearrangements, repeats, nucleotide insertions/deletions, and substitution rates) were evaluated along with substitution rates of 12 nuclear-encoded, plastid-targeted DNA-RRR genes from 27 Geraniales species. Significant correlations were detected for nonsynonymous (dN) but not synonymous (dS) substitution rates for three DNA-RRR genes (uvrB/C, why1, and gyrA) supporting a role for these genes in accelerated plastid genome evolution in Geraniaceae. Furthermore, correlation between dN of uvrB/C and plastome complexity suggests the presence of nucleotide excision repair system in plastids. Significant correlations were also detected between plastome complexity and 13 of the 90 nuclear-encoded organelle-targeted genes investigated. Comparisons revealed significant acceleration of dN in plastid-targeted genes of Geraniales relative to Brassicales suggesting this correlation may be an artifact of elevated rates in this gene set in Geraniaceae. Correlation between dN of plastid-targeted DNA-RRR genes and plastome complexity supports the hypothesis that the aberrant patterns in angiosperm plastome evolution could be caused by dysfunction in DNA-RRR systems. PMID:26893456
Poehlein, Anja; Daniel, Rolf
2017-01-01
Methanobrevibacter arboriphilus strain DH1 is an autotrophic methanogen that was isolated from the wetwood of methane-emitting trees. This species has been of considerable interest for its unusual oxygen tolerance and has been studied as a model organism for more than four decades. Strain DH1 is closely related to other host-associated Methanobrevibacter species from intestinal tracts of animals and the rumen, making this strain an interesting candidate for comparative analysis to identify factors important for colonizing intestinal environments. Here, the genome sequence of M. arboriphilus strain DH1 is reported. The draft genome is composed of 2,445,031 bp with an average GC content of 25.44% and predicted to harbour 1964 protein-encoding genes. Among the predicted genes, there are also more than 50 putative genes for the so-called adhesin-like proteins (ALPs). The presence of ALP-encoding genes in the genome of this non-host-associated methanogen strongly suggests that target surfaces for ALPs other than host tissues also need to be considered as potential interaction partners. The high abundance of ALPs may also indicate that these types of proteins are more characteristic for specific phylogenetic groups of methanogens rather than being indicative for a particular environment the methanogens thrive in. PMID:28634433
Mycobacterium ahvazicum sp. nov., the nineteenth species of the Mycobacterium simiae complex.
Bouam, Amar; Heidarieh, Parvin; Shahraki, Abodolrazagh Hashemi; Pourahmad, Fazel; Mirsaeidi, Mehdi; Hashemzadeh, Mohamad; Baptiste, Emeline; Armstrong, Nicholas; Levasseur, Anthony; Robert, Catherine; Drancourt, Michel
2018-03-07
Four slowly growing mycobacteria isolates were isolated from the respiratory tract and soft tissue biopsies collected from four unrelated patients in Iran. Conventional phenotypic tests indicated that these four isolates were identical to Mycobacterium lentiflavum while 16S rRNA gene sequencing yielded a unique sequence separated from that of M. lentiflavum. One representative strain AFP-003 T was characterized as comprising a 6,121,237-bp chromosome (66.24% guanosine-cytosine content) encoding for 5,758 protein-coding genes, 50 tRNA and one complete rRNA operon. A total of 2,876 proteins were found to be associated with the mobilome, including 195 phage proteins. A total of 1,235 proteins were found to be associated with virulence and 96 with toxin/antitoxin systems. The genome of AFP-003 T has the genetic potential to produce secondary metabolites, with 39 genes found to be associated with polyketide synthases and non-ribosomal peptide synthases and 11 genes encoding for bacteriocins. Two regions encoding putative prophages and three OriC regions separated by the dnaA gene were predicted. Strain AFP-003 T genome exhibits 86% average nucleotide identity with Mycobacterium genavense genome. Genetic and genomic data indicate that strain AFP-003 T is representative of a novel Mycobacterium species that we named Mycobacterium ahvazicum, the nineteenth species of the expanding Mycobacterium simiae complex.
Production of pseudoinfectious yellow fever virus with a two-component genome.
Shustov, Alexandr V; Mason, Peter W; Frolov, Ilya
2007-11-01
Application of genetically modified, deficient-in-replication flaviviruses that are incapable of developing productive, spreading infection is a promising means of designing safe and effective vaccines. Here we describe a two-component genome yellow fever virus (YFV) replication system in which each of the genomes encodes complete sets of nonstructural proteins that form the replication complex but expresses either only capsid or prM/E instead of the entire structural polyprotein. Upon delivery to the same cell, these genomes produce together all of the viral structural proteins, and cells release a combination of virions with both types of genomes packaged into separate particles. In tissue culture, this modified YFV can be further passaged at an escalating scale by using a high multiplicity of infection (MOI). However, at a low MOI, only one of the genomes is delivered into the cells, and infection cannot spread. The replicating prM/E-encoding genome produces extracellular E protein in the form of secreted subviral particles that are known to be an effective immunogen. The presented strategy of developing viruses defective in replication might be applied to other flaviviruses, and these two-component genome viruses can be useful for diagnostic or vaccine applications, including the delivery and expression of heterologous genes. In addition, the achieved separation of the capsid-coding sequence and the cyclization signal in the YFV genome provides a new means for studying the mechanism of the flavivirus packaging process.
Molecular definition of the identity and activation of natural killer cells.
Bezman, Natalie A; Kim, Charles C; Sun, Joseph C; Min-Oo, Gundula; Hendricks, Deborah W; Kamimura, Yosuke; Best, J Adam; Goldrath, Ananda W; Lanier, Lewis L
2012-10-01
Using whole-genome microarray data sets of the Immunological Genome Project, we demonstrate a closer transcriptional relationship between NK cells and T cells than between any other leukocytes, distinguished by their shared expression of genes encoding molecules with similar signaling functions. Whereas resting NK cells are known to share expression of a few genes with cytotoxic CD8(+) T cells, our transcriptome-wide analysis demonstrates that the commonalities extend to hundreds of genes, many encoding molecules with unknown functions. Resting NK cells demonstrate a 'preprimed' state compared with naive T cells, which allows NK cells to respond more rapidly to viral infection. Collectively, our data provide a global context for known and previously unknown molecular aspects of NK cell identity and function by delineating the genome-wide repertoire of gene expression of NK cells in various states.
Gilbert, Maarten J; Miller, William G; Yee, Emma; Kik, Marja; Zomer, Aldert L; Wagenaar, Jaap A; Duim, Birgitta
2016-10-05
Campylobacter iguaniorum is most closely related to the species C. fetus, C. hyointestinalis, and C. lanienae. Reptiles, chelonians and lizards in particular, appear to be a primary reservoir of this Campylobacter species. Here we report the genome comparison of C. iguaniorum strain 1485E, isolated from a bearded dragon (Pogona vitticeps), and strain 2463D, isolated from a green iguana (Iguana iguana), with the genomes of closely related taxa, in particular with reptile-associated C. fetus subsp. testudinum. In contrast to C. fetus, C. iguaniorum is lacking an S-layer encoding region. Furthermore, a defined lipooligosaccharide biosynthesis locus, encoding multiple glycosyltransferases and bounded by waa genes, is absent from C. iguaniorum. Instead, multiple predicted glycosylation regions were identified in C. iguaniorum. One of these regions is > 50 kb with deviant G + C content, suggesting acquisition via lateral transfer. These similar, but non-homologous glycosylation regions were located at the same position on the genome in both strains. Multiple genes encoding respiratory enzymes not identified to date within the C. fetus clade were present. C. iguaniorum shared highest homology with C. hyointestinalis and C. fetus. As in reptile-associated C. fetus subsp. testudinum, a putative tricarballylate catabolism locus was identified. However, despite colonizing a shared host, no recent recombination between both taxa was detected. This genomic study provides a better understanding of host adaptation, virulence, phylogeny, and evolution of C. iguaniorum and related Campylobacter taxa. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Coordinated rates of evolution between interacting plastid and nuclear genes in Geraniaceae.
Zhang, Jin; Ruhlman, Tracey A; Sabir, Jamal; Blazier, J Chris; Jansen, Robert K
2015-03-01
Although gene coevolution has been widely observed within individuals and between different organisms, rarely has this phenomenon been investigated within a phylogenetic framework. The Geraniaceae is an attractive system in which to study plastid-nuclear genome coevolution due to the highly elevated evolutionary rates in plastid genomes. In plants, the plastid-encoded RNA polymerase (PEP) is a protein complex composed of subunits encoded by both plastid (rpoA, rpoB, rpoC1, and rpoC2) and nuclear genes (sig1-6). We used transcriptome and genomic data for 27 species of Geraniales in a systematic evaluation of coevolution between genes encoding subunits of the PEP holoenzyme. We detected strong correlations of dN (nonsynonymous substitutions) but not dS (synonymous substitutions) within rpoB/sig1 and rpoC2/sig2, but not for other plastid/nuclear gene pairs, and identified the correlation of dN/dS ratio between rpoB/C1/C2 and sig1/5/6, rpoC1/C2 and sig2, and rpoB/C2 and sig3 genes. Correlated rates between interacting plastid and nuclear sequences across the Geraniales could result from plastid-nuclear genome coevolution. Analyses of coevolved amino acid positions suggest that structurally mediated coevolution is not the major driver of plastid-nuclear coevolution. The detection of strong correlation of evolutionary rates between SIG and RNAP genes suggests a plausible explanation for plastome-genome incompatibility in Geraniaceae. © 2015 American Society of Plant Biologists. All rights reserved.
Genome-based exploration of the specialized metabolic capacities of the genus Rhodococcus.
Ceniceros, Ana; Dijkhuizen, Lubbert; Petrusma, Mirjan; Medema, Marnix H
2017-08-09
Bacteria of the genus Rhodococcus are well known for their ability to degrade a large range of organic compounds. Some rhodococci are free-living, saprophytic bacteria; others are animal and plant pathogens. Recently, several studies have shown that their genomes encode putative pathways for the synthesis of a large number of specialized metabolites that are likely to be involved in microbe-microbe and host-microbe interactions. To systematically explore the specialized metabolic potential of this genus, we here performed a comprehensive analysis of the biosynthetic coding capacity across publicly available rhodococcal genomes, and compared these with those of several Mycobacterium strains as well as that of their mutual close relative Amycolicicoccus subflavus. Comparative genomic analysis shows that most predicted biosynthetic gene cluster families in these strains are clade-specific and lack any homology with gene clusters encoding the production of known natural products. Interestingly, many of these clusters appear to encode the biosynthesis of lipopeptides, which may play key roles in the diverse environments where rhodococci thrive, by acting as biosurfactants, pathogenicity factors or antimicrobials. We also identified several gene cluster families that are universally shared among all three genera, which therefore may have a more 'primary' role in their physiology. Inactivation of these clusters by mutagenesis might help to generate weaker strains that can be used as live vaccines. The genus Rhodococcus thus provides an interesting target for natural product discovery, in view of its large and mostly uncharacterized biosynthetic repertoire, its relatively fast growth and the availability of effective genetic tools for its genomic modification.
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.
Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C
2015-10-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
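The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and it assumes the 14.4% frequency was computed over all 229 screened genomes (224 bacterial plus 5 archaeal), which the abstract does not state explicitly.

```python
# Counts taken from the abstract above; variable names are illustrative.
clusters_by_class = {
    "lanthipeptide": 20,
    "sactipeptide": 11,
    "class II bacteriocin": 7,
    "class III bacteriocin": 8,
}
strains_with_clusters = 33

# Assumption: the denominator is every genome screened (bacteria + archaea).
genomes_screened = 224 + 5

total_clusters = sum(clusters_by_class.values())
frequency_pct = round(100 * strains_with_clusters / genomes_screened, 1)

print(f"total clusters: {total_clusters}")
print(f"strains encoding putative precursors: {frequency_pct}%")
```

Under that assumption the per-class counts add up to the reported total of 46 clusters, and 33 of 229 genomes rounds to the reported 14.4%.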
Understanding Cullin-RING E3 Biology through Proteomics-based Substrate Identification
Harper, J. Wade; Tan, Meng-Kwang Marcus
2012-01-01
Protein turnover through the ubiquitin-proteasome pathway controls numerous developmental decisions and biochemical processes in eukaryotes. Central to protein ubiquitylation are ubiquitin ligases, which provide specificity in targeted ubiquitylation. With more than 600 ubiquitin ligases encoded by the human genome, many of which remain to be studied, considerable effort is being placed on the development of methods for identifying substrates of specific ubiquitin ligases. In this review, we describe proteomic technologies for the identification of ubiquitin ligase targets, with a particular focus on members of the cullin-RING E3 class of ubiquitin ligases, which use F-box proteins as substrate specific adaptor proteins. Various proteomic methods are described and are compared with genetic approaches that are available. The continued development of such methods is likely to have a substantial impact on the ubiquitin-proteasome field. PMID:22962057
Understanding cullin-RING E3 biology through proteomics-based substrate identification.
Harper, J Wade; Tan, Meng-Kwang Marcus
2012-12-01
Protein turnover through the ubiquitin-proteasome pathway controls numerous developmental decisions and biochemical processes in eukaryotes. Central to protein ubiquitylation are ubiquitin ligases, which provide specificity in targeted ubiquitylation. With more than 600 ubiquitin ligases encoded by the human genome, many of which remain to be studied, considerable effort is being placed on the development of methods for identifying substrates of specific ubiquitin ligases. In this review, we describe proteomic technologies for the identification of ubiquitin ligase targets, with a particular focus on members of the cullin-RING E3 class of ubiquitin ligases, which use F-box proteins as substrate specific adaptor proteins. Various proteomic methods are described and are compared with genetic approaches that are available. The continued development of such methods is likely to have a substantial impact on the ubiquitin-proteasome field.
A singular enzymatic megacomplex from Bacillus subtilis.
Straight, Paul D; Fischbach, Michael A; Walsh, Christopher T; Rudner, David Z; Kolter, Roberto
2007-01-02
Nonribosomal peptide synthetases (NRPS), polyketide synthases (PKS), and hybrid NRPS/PKS are of particular interest, because they produce numerous therapeutic agents, have great potential for engineering novel compounds, and are the largest enzymes known. The predicted masses of known enzymatic assembly lines can reach almost 5 megadaltons, dwarfing even the ribosome (approximately 2.6 megadaltons). Despite their uniqueness and importance, little is known about the organization of these enzymes within the native producer cells. Here we report that an 80-kb gene cluster, which occupies approximately 2% of the Bacillus subtilis genome, encodes the subunits of an approximately 2.5 megadalton active hybrid NRPS/PKS. Many copies of the NRPS/PKS assemble into a single organelle-like membrane-associated complex of tens to hundreds of megadaltons. Such an enzymatic megacomplex is unprecedented in bacterial subcellular organization and has important implications for engineering novel NRPS/PKSs.
EBR1 genomic expansion and its role in virulence of Fusarium species
USDA-ARS's Scientific Manuscript database
Genome sequencing of Fusarium oxysporum revealed that pathogenic forms of this fungus harbor supernumerary chromosomes with a wide variety of genes, many of which likely encode traits required for pathogenicity or niche specialization. Specific transcription factor (TF) gene families are expanded on...
Evolution of tuf genes: ancient duplication, differential loss and gene conversion.
Lathe, W C; Bork, P
2001-08-03
The tuf gene of eubacteria, encoding the EF-Tu elongation factor, was duplicated early in the evolution of the taxon. Phylogenetic and genomic location analysis of 20 complete eubacterial genomes suggests that this ancient duplication has been differentially lost and maintained in eubacteria.
Comparative genomic analysis of the multispecies probiotic-marketed product VSL#3.
Douillard, François P; Mora, Diego; Eijlander, Robyn T; Wels, Michiel; de Vos, Willem M
2018-01-01
Several probiotic-marketed formulations available for the consumers contain live lactic acid bacteria and/or bifidobacteria. The multispecies product commercialized as VSL#3 has been used for treating various gastro-intestinal disorders. However, like many other products, the bacterial strains present in VSL#3 have only been characterized to a limited extent and their efficacy as well as their predicted mode of action remain unclear, preventing further applications or comparative studies. In this work, the genomes of all eight bacterial strains present in VSL#3 were sequenced and characterized, to advance insights into the possible mode of action of this product and also to serve as a basis for future work and trials. Phylogenetic and genomic data analysis allowed us to identify the 7 species present in the VSL#3 product as specified by the manufacturer. The 8 strains present belong to the species Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus helveticus, Bifidobacterium breve and B. animalis subsp. lactis (two distinct strains). Comparative genomics revealed that the draft genomes of the S. thermophilus and L. helveticus strains were predicted to encode most of the defence systems such as restriction modification and CRISPR-Cas systems. Genes associated with a variety of potential probiotic functions were also identified. Thus, in the three Bifidobacterium spp., gene clusters were predicted to encode tight adherence pili, known to promote bacteria-host interaction and intestinal barrier integrity, and to impact host cell development. Various repertoires of putative signalling proteins were predicted to be encoded by the genomes of the Lactobacillus spp., i.e. surface layer proteins, LPXTG-containing proteins, or sortase-dependent pili that may interact with the intestinal mucosa and dendritic cells. 
Taken altogether, the individual genomic characterization of the strains present in the VSL#3 product confirmed the product specifications, determined its coding capacity as well as identified potential probiotic functions.
Schwartze, Volker U; Winter, Sascha; Shelest, Ekaterina; Marcet-Houben, Marina; Horn, Fabian; Wehner, Stefanie; Linde, Jörg; Valiante, Vito; Sammeth, Michael; Riege, Konstantin; Nowrousian, Minou; Kaerger, Kerstin; Jacobsen, Ilse D; Marz, Manja; Brakhage, Axel A; Gabaldón, Toni; Böcker, Sebastian; Voigt, Kerstin
2014-08-01
Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 Mb and 12,379 protein-coding genes. This study reveals four major differences between the L. corymbifera genome and that of R. oryzae: (i) the presence of a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD); (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress; (iii) the content of repetitive elements is strikingly low (<5%); (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1 encoding a putative siderophore transporter and LCor00410.1 involved in siderophore metabolism. Genes encoding the transcription factors LCor08192.1 and LCor01236.1, which are similar to GATA-type regulators and to calcineurin-regulated CRZ1, respectively, indicate an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1-4 copies usually found in other fungi. Further findings are: (i) a lower content of tRNAs, but unique codons in L. corymbifera; (ii) over 25% of the proteins are apparently specific for L. corymbifera; (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. The number of secreted proteases, however, is roughly twice as high as in R. oryzae.
Wehner, Stefanie; Linde, Jörg; Valiante, Vito; Sammeth, Michael; Riege, Konstantin; Nowrousian, Minou; Kaerger, Kerstin; Jacobsen, Ilse D.; Marz, Manja; Brakhage, Axel A.; Gabaldón, Toni; Böcker, Sebastian; Voigt, Kerstin
2014-01-01
Lichtheimia species are the second most important cause of mucormycosis in Europe. To provide broader insights into the molecular basis of the pathogenicity-associated traits of the basal Mucorales, we report the full genome sequence of L. corymbifera and compare it to the genome of Rhizopus oryzae, the most common cause of mucormycosis worldwide. The genome assembly encompasses 33.6 Mb and 12,379 protein-coding genes. This study reveals four major differences between the L. corymbifera genome and that of R. oryzae: (i) the presence of a highly elevated number of gene duplications which, unlike in R. oryzae, are not due to whole genome duplication (WGD); (ii) despite the relatively high incidence of introns, alternative splicing (AS) is not frequently observed for the generation of paralogs and in response to stress; (iii) the content of repetitive elements is strikingly low (<5%); (iv) L. corymbifera is typically haploid. Novel virulence factors were identified which may be involved in the regulation of the adaptation to iron limitation, e.g. LCor01340.1 encoding a putative siderophore transporter and LCor00410.1 involved in siderophore metabolism. Genes encoding the transcription factors LCor08192.1 and LCor01236.1, which are similar to GATA-type regulators and to calcineurin-regulated CRZ1, respectively, indicate an involvement of the calcineurin pathway in the adaptation to iron limitation. Genes encoding MADS-box transcription factors are elevated up to 11 copies compared to the 1–4 copies usually found in other fungi. Further findings are: (i) a lower content of tRNAs, but unique codons in L. corymbifera; (ii) over 25% of the proteins are apparently specific for L. corymbifera; (iii) L. corymbifera contains only 2/3 of the proteases (known to be essential virulence factors) in comparison to R. oryzae. The number of secreted proteases, however, is roughly twice as high as in R. oryzae. PMID:25121733
Comparative genomic analysis of the multispecies probiotic-marketed product VSL#3
Mora, Diego; Eijlander, Robyn T.; Wels, Michiel; de Vos, Willem M.
2018-01-01
Several probiotic-marketed formulations available for the consumers contain live lactic acid bacteria and/or bifidobacteria. The multispecies product commercialized as VSL#3 has been used for treating various gastro-intestinal disorders. However, like many other products, the bacterial strains present in VSL#3 have only been characterized to a limited extent and their efficacy as well as their predicted mode of action remain unclear, preventing further applications or comparative studies. In this work, the genomes of all eight bacterial strains present in VSL#3 were sequenced and characterized, to advance insights into the possible mode of action of this product and also to serve as a basis for future work and trials. Phylogenetic and genomic data analysis allowed us to identify the 7 species present in the VSL#3 product as specified by the manufacturer. The 8 strains present belong to the species Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus helveticus, Bifidobacterium breve and B. animalis subsp. lactis (two distinct strains). Comparative genomics revealed that the draft genomes of the S. thermophilus and L. helveticus strains were predicted to encode most of the defence systems such as restriction modification and CRISPR-Cas systems. Genes associated with a variety of potential probiotic functions were also identified. Thus, in the three Bifidobacterium spp., gene clusters were predicted to encode tight adherence pili, known to promote bacteria-host interaction and intestinal barrier integrity, and to impact host cell development. Various repertoires of putative signalling proteins were predicted to be encoded by the genomes of the Lactobacillus spp., i.e. surface layer proteins, LPXTG-containing proteins, or sortase-dependent pili that may interact with the intestinal mucosa and dendritic cells. 
Taken altogether, the individual genomic characterization of the strains present in the VSL#3 product confirmed the product specifications, determined its coding capacity as well as identified potential probiotic functions. PMID:29451876
Genome-Wide Search for Genes Required for Bifidobacterial Growth under Iron-Limitation
Lanigan, Noreen; Bottacini, Francesca; Casey, Pat G.; O'Connell Motherway, Mary; van Sinderen, Douwe
2017-01-01
Bacteria evolved over millennia in the presence of the vital micronutrient iron. Iron is involved in numerous processes within the cell and is essential for nearly all living organisms. The importance of iron to the survival of bacteria is obvious from the large variety of mechanisms by which iron may be acquired from the environment. Random mutagenesis and global gene expression profiling led to the identification of a number of genes, which are essential for Bifidobacterium breve UCC2003 survival under iron-restrictive conditions. These genes encode, among others, Fe-S cluster-associated proteins, a possible ferric iron reductase, a number of cell wall-associated proteins, and various DNA replication and repair proteins. In addition, our study identified several presumed iron uptake systems which were shown to be essential for B. breve UCC2003 growth under conditions of either ferric and/or ferrous iron chelation. Of these, two gene clusters encoding putative iron-uptake systems, bfeUO and sifABCDE, were further characterised, indicating that sifABCDE is involved in ferrous iron transport, while the bfeUO-encoded transport system imports both ferrous and ferric iron. Transcription studies showed that bfeUO and sifABCDE constitute two separate transcriptional units that are induced upon dipyridyl-mediated iron limitation. In the anaerobic gastrointestinal environment ferrous iron is presumed to be of most relevance, though a mutation in the sifABCDE cluster does not affect B. breve UCC2003's ability to colonise the gut of a murine model. PMID:28620359
Li, Fuchao; Jiang, Peng; Zheng, Huajun; Wang, Shengyue; Zhao, Guoping; Qin, Song; Liu, Zhaopu
2011-07-01
Streptomyces griseoaurantiacus M045, isolated from marine sediment, produces manumycin and chinikomycin antibiotics. Here we present a high-quality draft genome sequence of S. griseoaurantiacus M045, the first marine Streptomyces species to be sequenced and annotated. The genome encodes several gene clusters for biosynthesis of secondary metabolites and has provided insight into genomic islands linking secondary metabolism to functional adaptation in marine S. griseoaurantiacus M045.
Cioncoloni, David; Galli, Giulia; Mazzocchio, Riccardo; Feurra, Matteo; Giovannelli, Fabio; Santarnecchi, Emiliano; Bonifazi, Marco; Rossi, Alessandro; Rossi, Simone
2014-10-01
We aimed to investigate the rapid effects of plasma cortisol elevations on the episodic memory phases of encoding and retrieval, and on the strength of the memory trace. Participants were asked either to select a word containing the letter "e" (shallow encoding task) or to judge if a word referred to a living entity (deep encoding task). We intravenously administered a bolus of 20 mg of cortisol either 5 min before encoding or 5 min before retrieval, in a between-subjects design. The study included only male participants tested in the late afternoon, and neutral words as stimuli. When cortisol administration occurred prior to retrieval, a main effect of group emerged. Recognition accuracy was higher for individuals who received cortisol compared to placebo. The higher discrimination accuracy for the cortisol group was significant for words encoded during the deep but not the shallow task. Cortisol administration before encoding did not affect subsequent retrieval performance (either for deep or shallow stimuli) despite a facilitatory trend. Because genomic mechanisms take some time to develop, such a mechanism cannot apply to our findings where the memory task was performed shortly after the enhancement of glucocorticoid levels. Therefore, glucocorticoids, through non-genomic fast effects, determine an enhancement in episodic memory if administered immediately prior to retrieval. This effect is more evident if the memory trace is laid down through deep encoding operations involving the recruitment of specific neural networks. Copyright © 2014 Elsevier Inc. All rights reserved.
Muller, Ryan Y; Hammond, Ming C; Rio, Donald C; Lee, Yeon J
2015-12-01
The Encyclopedia of DNA Elements (ENCODE) Project aims to identify all functional sequence elements in the human genome sequence by use of high-throughput DNA/cDNA sequencing approaches. To aid the standardization, comparison, and integration of data sets produced from different technologies and platforms, the ENCODE Consortium selected several standard human cell lines to be used by the ENCODE Projects. The Tier 1 ENCODE cell lines include GM12878, K562, and H1 human embryonic stem cell lines. GM12878 is a lymphoblastoid cell line, transformed with the Epstein-Barr virus, that was selected by the International HapMap Project for whole genome and transcriptome sequencing by use of the Illumina platform. K562 is an immortalized myelogenous leukemia cell line. The GM12878 cell line is attractive for the ENCODE Projects, as it offers potential synergy with the International HapMap Project. Despite the vast amount of sequencing data available on the GM12878 cell line through the ENCODE Project, including transcriptome data and chromatin immunoprecipitation-sequencing data for histone marks and transcription factors, no small interfering RNA (siRNA)-mediated knockdown studies have been performed in the GM12878 cell line, as cationic lipid-mediated transfection methods are inefficient for lymphoid cell lines. Here, we present an efficient and reproducible method for transfection of a variety of siRNAs into the GM12878 and K562 cell lines, which subsequently results in targeted protein depletion.
IQCJ-SCHIP1, a novel fusion transcript encoding a calmodulin-binding IQ motif protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kwasnicka-Crawford, Dorota A.; Carson, Andrew R.; Scherer, Stephen W.
The existence of transcripts that span two adjacent, independent genes is considered rare in the human genome. This study characterizes a novel human fusion gene named IQCJ-SCHIP1. IQCJ-SCHIP1 is the longest isoform of a complex transcriptional unit that bridges two separate genes that encode distinct proteins: IQCJ, a novel IQ motif-containing protein, and SCHIP1, a schwannomin-interacting protein that has been previously shown to interact with the Neurofibromatosis type 2 (NF2) protein. IQCJ-SCHIP1 is located on chromosome 3q25 and comprises a 1692-bp transcript encompassing 11 exons spanning 828 kb of the genomic DNA. We show that IQCJ-SCHIP1 mRNA is highly expressed in the brain. Protein encoded by the IQCJ-SCHIP1 gene was localized to cytoplasm and actin-rich regions and in differentiated PC12 cells was also seen in neurite extensions.
Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng
2017-10-01
The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA is comprised of 8,372 nucleotides (nt), excluding the poly (A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1 encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, of an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Sullivan, William J; Monroy, M Alexandra; Bohne, Wolfgang; Nallani, Karuna C; Chrivia, John; Yaciuk, Peter; Smith, Charles K; Queener, Sherry F
2003-05-01
We have identified and mapped a gene in Toxoplasma gondii that encodes a homologue of SRCAP (Snf2-related CBP activator protein), a member of the SNF/SWI family of chromatin remodeling factors. The genomic locus (TgSRCAP) is present as a single copy and contains 16 introns. The predicted cDNA contains an open reading frame of 8,775 bp and encodes a protein of 2,924 amino acids. We have identified additional SRCAP-like sequences in Apicomplexa for comparison by screening genomic databases. An analysis of SRCAP homologues between species reveals signature features that may be indicative of SRCAP members. Expression of mRNA encoding TgSRCAP is upregulated when tachyzoite (invasive form) parasites are induced to differentiate into bradyzoites (encysted form) in vitro. Recombinant TgSRCAP protein is functionally equivalent to the human homologue, being capable of increasing transcription mediated by CREB.
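The relationship between the reported open reading frame length and protein length is simple codon arithmetic. The helper below is a hypothetical illustration, not part of the study; it assumes the quoted ORF length includes the terminal stop codon, which is consistent with the two figures in the abstract.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumes an uninterrupted reading frame (introns already removed).
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon is counted in the ORF but does not encode an amino acid.
    return codons - 1 if includes_stop else codons


# Figures from the TgSRCAP abstract: an 8,775 bp ORF.
print(protein_length(8775))
```

With the stop codon counted, 8,775 bp corresponds to 2,925 codons and therefore 2,924 amino acids, matching the predicted protein size given above.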
The complete chloroplast genome sequence of Dianthus superbus var. longicalycinus.
Gurusamy, Raman; Lee, Do-Hyung; Park, SeonJoo
2016-05-01
The complete chloroplast genome (cpDNA) sequence of Dianthus superbus var. longicalycinus, an economically important traditional Chinese medicine, was reported and characterized. The cpDNA of Dianthus superbus var. longicalycinus is 149,539 bp, with 36.3% GC content. A pair of inverted repeats (IRs) of 24,803 bp is separated by a large single-copy region (LSC, 82,805 bp) and a small single-copy region (SSC, 17,128 bp). It encodes 85 protein-coding genes, 36 tRNA genes and 8 rRNA genes. Of the 129 individual genes, 13 contain one intron and three contain two introns.
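The region sizes quoted in this abstract follow the usual quadripartite layout of a chloroplast genome (LSC, SSC and two copies of the IR), so the total length can be checked directly. The snippet is a minimal sketch with illustrative variable names; all numbers come from the abstract.

```python
# Region lengths in bp, as reported in the abstract above.
lsc_bp = 82_805   # large single-copy region
ssc_bp = 17_128   # small single-copy region
ir_bp = 24_803    # one inverted repeat; the genome carries two copies

genome_bp = lsc_bp + ssc_bp + 2 * ir_bp

# Gene counts by category, as reported.
gene_counts = {"protein-coding": 85, "tRNA": 36, "rRNA": 8}
total_genes = sum(gene_counts.values())

print(f"genome length: {genome_bp} bp")
print(f"total genes: {total_genes}")
```

Both sums agree with the reported totals of 149,539 bp and 129 genes.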
NASA Astrophysics Data System (ADS)
Tian, Z. H.; Jiao, C. Z.
2017-07-01
RIG-I like receptors (RLRs) play key roles in sensing non-self nucleic acids in the cytoplasm and trigger antiviral innate immune responses in vertebrates, including humans. Here we carried out in silico analysis to identify and investigate the putative RLRs encoded in the genome of the marine mollusk Crassostrea gigas (cgRLRs), an invertebrate species. We found unusual duplication and variation in the domain architecture of the putative cgRLRs encoded in the genome of C. gigas. Three putative cgRLRs (GenBank accession numbers EKC24603, EKC31344.1 and EKC38304.1) have a domain architecture similar to that of human RIG-I or MDA5, and one protein (EKC34573.1) has an architecture similar to that of human LGP2. The fifth putative cgRLR (EKC38303.1) is somewhat similar to human RIG-I/MDA5 except that it has only one caspase activation and recruitment domain (CARD) at its N-terminus. Nine other proteins were identified as partially similar to RLRs but with incomplete sequences, which may reflect partial duplication events of cgRLR genes in the oyster genome.
USDA-ARS's Scientific Manuscript database
Genomic analysis indicated that Edwardsiella ictaluri encodes a putative urease pathogenicity island containing 9 open reading frames, including urea and ammonium transporters. In vitro studies with the wild-type E. ictaluri and a ureG::kan urease mutant strain indicated that E. ictaluri is significa...
USDA-ARS's Scientific Manuscript database
Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation....
Novel Thrombotic Function of a Human SNP in STXBP5 Revealed by CRISPR/Cas9 Gene Editing in Mice.
Zhu, Qiuyu Martin; Ko, Kyung Ae; Ture, Sara; Mastrangelo, Michael A; Chen, Ming-Huei; Johnson, Andrew D; O'Donnell, Christopher J; Morrell, Craig N; Miano, Joseph M; Lowenstein, Charles J
2017-02-01
To identify and characterize the effect of a SNP (single-nucleotide polymorphism) in the STXBP5 locus that is associated with altered thrombosis in humans. GWAS (genome-wide association studies) have identified numerous SNPs associated with human thrombotic phenotypes, but determining the functional significance of an individual candidate SNP can be challenging, particularly when in vivo modeling is required. Recent GWAS led to the discovery of STXBP5 as a regulator of platelet secretion in humans. Further clinical studies have identified genetic variants of STXBP5 that are linked to altered plasma von Willebrand factor levels and thrombosis in humans, but the functional significance of these variants in STXBP5 is not understood. We used CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated 9) techniques to produce a precise mouse model carrying a human coding SNP rs1039084 (encoding human p. N436S) in the STXBP5 locus associated with decreased thrombosis. Mice carrying the orthologous human mutation (encoding p. N437S in mouse STXBP5) have lower plasma von Willebrand factor levels, decreased thrombosis, and decreased platelet secretion compared with wild-type mice. This thrombosis phenotype recapitulates the phenotype of humans carrying the minor allele of rs1039084. Decreased plasma von Willebrand factor and platelet activation may partially explain the decreased thrombotic phenotype in mutant mice. Using precise mammalian genome editing, we have identified a human nonsynonymous SNP rs1039084 in the STXBP5 locus as a causal variant for a decreased thrombotic phenotype. CRISPR/Cas9 genetic editing facilitates the rapid and efficient generation of animals to study the function of human genetic variation in vascular diseases. © 2016 American Heart Association, Inc.
Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Fu, Bolei; Cullen, Dan
2013-06-01
The oxidative enzymatic machinery for degradation of organic substrates in Agaricus bisporus (Ab) is at the core of the carbon recycling mechanisms in this fungus. To date, 156 genes have been tentatively identified as part of this oxidative enzymatic machinery, which includes 26 peroxidase-encoding genes, nine copper radical oxidases [including three putative glyoxal oxidase-encoding genes (GLXs)], 12 laccases sensu stricto and 109 cytochrome P450 monooxygenases. Comparative analyses of these enzymes in Ab with those of the white-rot fungus, Phanerochaete chrysosporium, the brown-rot fungus, Postia placenta, the coprophilic litter fungus, Coprinopsis cinerea and the ectomycorrhizal fungus, Laccaria bicolor, revealed enzyme diversity consistent with adaptation to substrates rich in humic substances and partially degraded plant material. For instance, relative to wood decay fungi, Ab cytochrome P450 genes were less numerous (109 gene models), distributed among distinctive families, and lacked extensive duplication and clustering. Viewed together with P450 transcript accumulation patterns in three tested growth conditions, these observations were consistent with the unique Ab lifestyle. Based on tandem gene arrangements, a certain degree of gene duplication seems to have occurred in this fungus in the copper radical oxidase (CRO) and the laccase gene families. In Ab, high transcript levels and regulation of the heme-thiolate peroxidases, two manganese peroxidases and the three GLX-like genes are likely in response to complex natural substrates, including lignocellulose and its derivatives, thereby suggesting an important role in lignin degradation. On the other hand, the expression patterns of the related CROs suggest a developmental role in this fungus. Based on these observations, a brief comparative genomic overview of the Ab oxidative enzyme machinery is presented. Copyright © 2013 Elsevier Inc. All rights reserved.
Weyman, Philip D.; Beeri, Karen; Lefebvre, Stephane C.; ...
2014-10-10
Diatoms are unicellular photosynthetic algae with promise for green production of fuels and other chemicals. Recent genome-editing techniques have greatly improved the potential of many eukaryotic genetic systems, including diatoms, to enable knowledge-based studies and bioengineering. Using a new technique, transcription activator-like effector nucleases (TALENs), the gene encoding the urease enzyme in the model diatom, Phaeodactylum tricornutum, was targeted for interruption. The knockout cassette was identified within the urease gene by PCR and Southern blot analyses of genomic DNA. The lack of urease protein was confirmed by Western blot analyses in mutant cell lines that were unable to grow on urea as the sole nitrogen source. Untargeted metabolomic analysis revealed a build-up of urea, arginine and ornithine in the urease knockout lines. All three intermediate metabolites are upstream of the urease reaction within the urea cycle, suggesting a disruption of the cycle despite urea production. Numerous high carbon metabolites were enriched in the mutant, implying a breakdown of cellular C and N repartitioning. The presented method improves the molecular toolkit for diatoms and clarifies the role of urease in the urea cycle.
Loh, Joy; Zhao, Guoyan; Nelson, Christopher A.; Coder, Penny; Droit, Lindsay; Handley, Scott A.; Johnson, L. Steven; Vachharajani, Punit; Guzman, Hilda; Tesh, Robert B.; Wang, David; Fremont, Daved H.; Virgin, Herbert W.
2011-01-01
Gammaherpesviruses encode numerous immunomodulatory molecules that contribute to their ability to evade the host immune response and establish persistent, lifelong infections. As the human gammaherpesviruses are strictly species specific, small animal models of gammaherpesvirus infection, such as murine gammaherpesvirus 68 (γHV68) infection, are important for studying the roles of gammaherpesvirus immune evasion genes in in vivo infection and pathogenesis. We report here the genome sequence and characterization of a novel rodent gammaherpesvirus, designated rodent herpesvirus Peru (RHVP), that shares conserved genes and genome organization with γHV68 and the primate gammaherpesviruses but is phylogenetically distinct from γHV68. RHVP establishes acute and latent infection in laboratory mice. Additionally, RHVP contains multiple open reading frames (ORFs) not present in γHV68 that have sequence similarity to primate gammaherpesvirus immunomodulatory genes or cellular genes. These include ORFs with similarity to major histocompatibility complex class I (MHC-I), C-type lectins, and the mouse mammary tumor virus and herpesvirus saimiri superantigens. As these ORFs may function as immunomodulatory or virulence factors, RHVP presents new opportunities for the study of mechanisms of immune evasion by gammaherpesviruses. PMID:21209105
Hiller, Ekkehard; Istel, Fabian; Tscherner, Michael; Brunke, Sascha; Ames, Lauren; Firon, Arnaud; Green, Brian; Cabral, Vitor; Marcet-Houben, Marina; Jacobsen, Ilse D.; Quintin, Jessica; Seider, Katja; Frohner, Ingrid; Glaser, Walter; Jungwirth, Helmut; Bachellier-Bassi, Sophie; Chauvel, Murielle; Zeidler, Ute; Ferrandon, Dominique; Gabaldón, Toni; Hube, Bernhard; d'Enfert, Christophe; Rupp, Steffen; Cormack, Brendan; Haynes, Ken; Kuchler, Karl
2014-01-01
The opportunistic fungal pathogen Candida glabrata is a frequent cause of candidiasis, causing infections ranging from superficial to life-threatening disseminated disease. The inherent tolerance of C. glabrata to azole drugs makes this pathogen a serious clinical threat. To identify novel genes implicated in antifungal drug tolerance, we have constructed a large-scale C. glabrata deletion library consisting of 619 unique, individually bar-coded mutant strains, each lacking one specific gene, all together representing almost 12% of the genome. Functional analysis of this library in a series of phenotypic and fitness assays identified numerous genes required for growth of C. glabrata under normal or specific stress conditions, as well as a number of novel genes involved in tolerance to clinically important antifungal drugs such as azoles and echinocandins. We identified 38 deletion strains displaying strongly increased susceptibility to caspofungin, 28 of which lack genes encoding proteins that have not previously been linked to echinocandin tolerance. Our results demonstrate the potential of the C. glabrata mutant collection as a valuable resource in functional genomics studies of this important fungal pathogen of humans, and to facilitate the identification of putative novel antifungal drug target and virulence genes. PMID:24945925
Genetic evidence for conserved non-coding element function across species–the ears have it
Turner, Eric E.; Cox, Timothy C.
2014-01-01
Comparison of genomic sequences from diverse vertebrate species has revealed numerous highly conserved regions that do not appear to encode proteins or functional RNAs. Often these “conserved non-coding elements,” or CNEs, can direct gene expression to specific tissues in transgenic models, demonstrating they have regulatory function. CNEs are frequently found near “developmental” genes, particularly transcription factors, implying that these elements have essential regulatory roles in development. However, actual examples demonstrating CNE regulatory functions across species have been few, and recent loss-of-function studies of several CNEs in mice have shown relatively minor effects. In this Perspectives article, we discuss new findings in “fancy” rats and Highland cattle demonstrating that function of a CNE near the Hmx1 gene is crucial for normal external ear development and when disrupted can mimic loss-of function Hmx1 coding mutations in mice and humans. These findings provide important support for conserved developmental roles of CNEs in divergent species, and reinforce the concept that CNEs should be examined systematically in the ongoing search for genetic causes of human developmental disorders in the era of genome-scale sequencing. PMID:24478720
Ecophysiology of Freshwater Verrucomicrobia Inferred from Metagenome-Assembled Genomes
He, Shaomei; Stevens, Sarah L. R.; Chan, Leong-Keat; Bertilsson, Stefan; Glavina del Rio, Tijana; Tringe, Susannah G.; Malmstrom, Rex R.
2017-01-01
ABSTRACT Microbes are critical in carbon and nutrient cycling in freshwater ecosystems. Members of the Verrucomicrobia are ubiquitous in such systems, and yet their roles and ecophysiology are not well understood. In this study, we recovered 19 Verrucomicrobia draft genomes by sequencing 184 time-series metagenomes from a eutrophic lake and a humic bog that differ in carbon source and nutrient availabilities. These genomes span four of the seven previously defined Verrucomicrobia subdivisions and greatly expand knowledge of the genomic diversity of freshwater Verrucomicrobia. Genome analysis revealed their potential role as (poly)saccharide degraders in freshwater, uncovered interesting genomic features for this lifestyle, and suggested their adaptation to nutrient availabilities in their environments. Verrucomicrobia populations differ significantly between the two lakes in glycoside hydrolase gene abundance and functional profiles, reflecting the autochthonous and terrestrially derived allochthonous carbon sources of the two ecosystems, respectively. Interestingly, a number of genomes recovered from the bog contained gene clusters that potentially encode a novel porin-multiheme cytochrome c complex and might be involved in extracellular electron transfer in the anoxic humus-rich environment. Notably, most epilimnion genomes have large numbers of so-called “Planctomycete-specific” cytochrome c-encoding genes, which exhibited distribution patterns nearly opposite to those seen with glycoside hydrolase genes, probably associated with the different levels of environmental oxygen availability and carbohydrate complexity between lakes/layers. Overall, the recovered genomes represent a major step toward understanding the role, ecophysiology, and distribution of Verrucomicrobia in freshwater. IMPORTANCE Freshwater Verrucomicrobia spp. are cosmopolitan in lakes and rivers, and yet their roles and ecophysiology are not well understood, as cultured freshwater Verrucomicrobia spp. are restricted to one subdivision of this phylum. Here, we greatly expanded the known genomic diversity of this freshwater lineage by recovering 19 Verrucomicrobia draft genomes from 184 metagenomes collected from a eutrophic lake and a humic bog across multiple years. Most of these genomes represent the first freshwater representatives of several Verrucomicrobia subdivisions. Genomic analysis revealed Verrucomicrobia to be potential (poly)saccharide degraders and suggested their adaptation to carbon sources of different origins in the two contrasting ecosystems. We identified putative extracellular electron transfer genes and so-called “Planctomycete-specific” cytochrome c-encoding genes and identified their distinct distribution patterns between the lakes/layers. Overall, our analysis greatly advances the understanding of the function, ecophysiology, and distribution of freshwater Verrucomicrobia, while highlighting their potential role in freshwater carbon cycling. PMID:28959738
Identification in Marinomonas mediterranea of a novel quinoprotein with glycine oxidase activity.
Campillo-Brocal, Jonatan Cristian; Lucas-Elio, Patricia; Sanchez-Amat, Antonio
2013-08-01
A novel enzyme with lysine-epsilon oxidase activity was previously described in the marine bacterium Marinomonas mediterranea. This enzyme differs from other l-amino acid oxidases in that it is not a flavoprotein but contains a quinone cofactor. It is encoded by an operon with two genes, lodA and lodB. The first codes for the oxidase, while the second encodes a protein required for the expression of the former. Genome sequencing of M. mediterranea has revealed that it contains two additional operons encoding proteins with sequence similarity to LodA. In this study, it is shown that one of these genes, Marme_1655, encodes a protein with glycine oxidase activity. This activity differs markedly in substrate range and sensitivity to inhibitors from that of previously described glycine oxidases, which are flavoproteins synthesized by Bacillus. The results presented in this study indicate that the products of the genes with different degrees of similarity to lodA detected in bacterial genomes could constitute a reservoir of different oxidases. © 2013 The Authors. Microbiology Open published by John Wiley & Sons Ltd.
Ontology-Based Search of Genomic Metadata.
Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano
2016-01-01
The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.
Osada, Naoki; Akashi, Hiroshi
2012-01-01
Accelerated rates of mitochondrial protein evolution have been proposed to reflect Darwinian coadaptation for efficient energy production for mammalian flight and brain activity. However, several features of mammalian mtDNA (absence of recombination, small effective population size, and high mutation rate) promote genome degradation through the accumulation of weakly deleterious mutations. Here, we present evidence for "compensatory" adaptive substitutions in nuclear DNA- (nDNA) encoded mitochondrial proteins to prevent fitness decline in primate mitochondrial protein complexes. We show that high mutation rate and small effective population size, key features of primate mitochondrial genomes, can accelerate compensatory adaptive evolution in nDNA-encoded genes. We combine phylogenetic information and the 3D structure of the cytochrome c oxidase (COX) complex to test for accelerated compensatory changes among interacting sites. Physical interactions among mtDNA- and nDNA-encoded components are critical in COX evolution; amino acids in close physical proximity in the 3D structure show a strong tendency for correlated evolution among lineages. Only nuclear-encoded components of COX show evidence for positive selection and adaptive nDNA-encoded changes tend to follow mtDNA-encoded amino acid changes at nearby sites in the 3D structure. This bias in the temporal order of substitutions supports compensatory weak selection as a major factor in accelerated primate COX evolution.
Tumor Genomic Profiling in Breast Cancer Patients Using Targeted Massively Parallel Sequencing
2015-04-30
recently, we identified several novel alterations in ER+ breast tumors, including translocations in ESR1, the gene that encodes the estrogen receptor...modified our bait design to include genomic coordinates across select introns in ESR1. In addition, two recent papers from the Broad Institute published
Draft genome sequences of 50 MRSA ST5 isolates obtained from a U.S. hospital
USDA-ARS's Scientific Manuscript database
Methicillin resistant Staphylococcus aureus (MRSA) can be a commensal or pathogen in humans. Pathogenicity and disease are related to the acquisition of mobile genetic elements encoding virulence and antimicrobial resistance genes. Here, we report draft genome sequences for 50 clinical MRSA isolates...
Donaho, John A.; Kelley, John F.; St. John, Emily; Turner, Christina; Podar, Mircea; Stott, Matthew B.
2018-01-01
ABSTRACT A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism. PMID:29545298
The Calyptogena magnifica chemoautotrophic symbiont genome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Newton, I.L.; Woyke, T.; Auchtung, T.A.
2007-03-01
Chemoautotrophic endosymbionts are the metabolic cornerstone of hydrothermal vent communities, providing invertebrate hosts with nearly all of their nutrition. The Calyptogena magnifica (Bivalvia: Vesicomyidae) symbiont, Candidatus Ruthia magnifica, is the first intracellular sulfur-oxidizing endosymbiont to have its genome sequenced, revealing a suite of metabolic capabilities. The genome encodes major chemoautotrophic pathways as well as pathways for biosynthesis of vitamins, cofactors, and all 20 amino acids required by the clam.
Near-Complete Genome Sequence of a Novel Single-Stranded RNA Virus Discovered in Indoor Air
2018-01-01
ABSTRACT Viral metagenomic analysis of heating, ventilation, and air conditioning (HVAC) filters recovered the near-complete genome sequence of a novel virus, named HVAC-associated RNA virus 1 (HVAC-RV1). The HVAC-RV1 genome is most similar to those of picorna-like viruses identified in arthropods but encodes a small domain observed only in negative-sense single-stranded RNA viruses. PMID:29567746
Complete Genome Sequence of Mycobacterium chimaera Strain AH16.
Hasan, Nabeeh A; Honda, Jennifer R; Davidson, Rebecca M; Epperson, L Elaine; Bankowski, Matthew J; Chan, Edward D; Strong, Michael
2016-11-23
Mycobacterium chimaera is a nontuberculous mycobacterial species that causes cardiovascular, pulmonary, and postsurgical infections. Here, we report the first complete genome sequence of M. chimaera. This genome is 6.33 Mbp, with a G+C content of 67.56%, and encodes 4,926 protein-coding genes, as well as 74 tRNAs, one ncRNA, and three rRNA genes. Copyright © 2016 Hasan et al.
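The G+C content quoted in genome announcements such as the one above is the percentage of G and C bases among the unambiguous bases of the assembly. A minimal sketch of that calculation follows; the function name `gc_content` and the choice to ignore ambiguity codes such as N are assumptions made for illustration, not details taken from the announcement.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted;
    ambiguity codes such as N are ignored.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


if __name__ == "__main__":
    # Toy sequence, not real genome data.
    print(f"{gc_content('GGCCAATT'):.2f}%")
```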
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K; Fryszczyn, Bartlomiej G; Fox, George E; Tirumalai, Madhan R; Liu, Yamei; Kim, Sun; Kehoe, David M; Weinstock, George M
2015-05-07
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. Copyright © 2015 Yerrapragada et al.
Complete Genome Sequence of Aggregatibacter (Haemophilus) aphrophilus NJ8700
Di Bonaventura, Maria Pia; DeSalle, Rob; Pop, Mihai; Nagarajan, Niranjan; Figurski, David H.; Fine, Daniel H.; Kaplan, Jeffrey B.; Planet, Paul J.
2009-01-01
We report the finished and annotated genome sequence of Aggregatibacter aphrophilus strain NJ8700, a strain isolated from the oral flora of a healthy individual, and discuss characteristics that may affect its dual roles in human health and disease. This strain has a rough appearance, and its genome contains genes encoding a type VI secretion system and several factors that may participate in host colonization. PMID:19447908
Birney, Ewan; Stamatoyannopoulos, John A; Dutta, Anindya; Guigó, Roderic; Gingeras, Thomas R; Margulies, Elliott H; Weng, Zhiping; Snyder, Michael; Dermitzakis, Emmanouil T; Thurman, Robert E; Kuehn, Michael S; Taylor, Christopher M; Neph, Shane; Koch, Christoph M; Asthana, Saurabh; Malhotra, Ankit; Adzhubei, Ivan; Greenbaum, Jason A; Andrews, Robert M; Flicek, Paul; Boyle, Patrick J; Cao, Hua; Carter, Nigel P; Clelland, Gayle K; Davis, Sean; Day, Nathan; Dhami, Pawandeep; Dillon, Shane C; Dorschner, Michael O; Fiegler, Heike; Giresi, Paul G; Goldy, Jeff; Hawrylycz, Michael; Haydock, Andrew; Humbert, Richard; James, Keith D; Johnson, Brett E; Johnson, Ericka M; Frum, Tristan T; Rosenzweig, Elizabeth R; Karnani, Neerja; Lee, Kirsten; Lefebvre, Gregory C; Navas, Patrick A; Neri, Fidencio; Parker, Stephen C J; Sabo, Peter J; Sandstrom, Richard; Shafer, Anthony; Vetrie, David; Weaver, Molly; Wilcox, Sarah; Yu, Man; Collins, Francis S; Dekker, Job; Lieb, Jason D; Tullius, Thomas D; Crawford, Gregory E; Sunyaev, Shamil; Noble, William S; Dunham, Ian; Denoeud, France; Reymond, Alexandre; Kapranov, Philipp; Rozowsky, Joel; Zheng, Deyou; Castelo, Robert; Frankish, Adam; Harrow, Jennifer; Ghosh, Srinka; Sandelin, Albin; Hofacker, Ivo L; Baertsch, Robert; Keefe, Damian; Dike, Sujit; Cheng, Jill; Hirsch, Heather A; Sekinger, Edward A; Lagarde, Julien; Abril, Josep F; Shahab, Atif; Flamm, Christoph; Fried, Claudia; Hackermüller, Jörg; Hertel, Jana; Lindemeyer, Manja; Missal, Kristin; Tanzer, Andrea; Washietl, Stefan; Korbel, Jan; Emanuelsson, Olof; Pedersen, Jakob S; Holroyd, Nancy; Taylor, Ruth; Swarbreck, David; Matthews, Nicholas; Dickson, Mark C; Thomas, Daryl J; Weirauch, Matthew T; Gilbert, James; Drenkow, Jorg; Bell, Ian; Zhao, XiaoDong; Srinivasan, K G; Sung, Wing-Kin; Ooi, Hong Sain; Chiu, Kuo Ping; Foissac, Sylvain; Alioto, Tyler; Brent, Michael; Pachter, Lior; Tress, Michael L; Valencia, Alfonso; Choo, Siew Woh; Choo, Chiou Yu; Ucla, Catherine; Manzano, Caroline; 
Wyss, Carine; Cheung, Evelyn; Clark, Taane G; Brown, James B; Ganesh, Madhavan; Patel, Sandeep; Tammana, Hari; Chrast, Jacqueline; Henrichsen, Charlotte N; Kai, Chikatoshi; Kawai, Jun; Nagalakshmi, Ugrappa; Wu, Jiaqian; Lian, Zheng; Lian, Jin; Newburger, Peter; Zhang, Xueqing; Bickel, Peter; Mattick, John S; Carninci, Piero; Hayashizaki, Yoshihide; Weissman, Sherman; Hubbard, Tim; Myers, Richard M; Rogers, Jane; Stadler, Peter F; Lowe, Todd M; Wei, Chia-Lin; Ruan, Yijun; Struhl, Kevin; Gerstein, Mark; Antonarakis, Stylianos E; Fu, Yutao; Green, Eric D; Karaöz, Ulaş; Siepel, Adam; Taylor, James; Liefer, Laura A; Wetterstrand, Kris A; Good, Peter J; Feingold, Elise A; Guyer, Mark S; Cooper, Gregory M; Asimenos, George; Dewey, Colin N; Hou, Minmei; Nikolaev, Sergey; Montoya-Burgos, Juan I; Löytynoja, Ari; Whelan, Simon; Pardi, Fabio; Massingham, Tim; Huang, Haiyan; Zhang, Nancy R; Holmes, Ian; Mullikin, James C; Ureta-Vidal, Abel; Paten, Benedict; Seringhaus, Michael; Church, Deanna; Rosenbloom, Kate; Kent, W James; Stone, Eric A; Batzoglou, Serafim; Goldman, Nick; Hardison, Ross C; Haussler, David; Miller, Webb; Sidow, Arend; Trinklein, Nathan D; Zhang, Zhengdong D; Barrera, Leah; Stuart, Rhona; King, David C; Ameur, Adam; Enroth, Stefan; Bieda, Mark C; Kim, Jonghwan; Bhinge, Akshay A; Jiang, Nan; Liu, Jun; Yao, Fei; Vega, Vinsensius B; Lee, Charlie W H; Ng, Patrick; Shahab, Atif; Yang, Annie; Moqtaderi, Zarmik; Zhu, Zhou; Xu, Xiaoqin; Squazzo, Sharon; Oberley, Matthew J; Inman, David; Singer, Michael A; Richmond, Todd A; Munn, Kyle J; Rada-Iglesias, Alvaro; Wallerman, Ola; Komorowski, Jan; Fowler, Joanna C; Couttet, Phillippe; Bruce, Alexander W; Dovey, Oliver M; Ellis, Peter D; Langford, Cordelia F; Nix, David A; Euskirchen, Ghia; Hartman, Stephen; Urban, Alexander E; Kraus, Peter; Van Calcar, Sara; Heintzman, Nate; Kim, Tae Hoon; Wang, Kun; Qu, Chunxu; Hon, Gary; Luna, Rosa; Glass, Christopher K; Rosenfeld, M Geoff; Aldred, Shelley Force; Cooper, Sara J; Halees, 
Anason; Lin, Jane M; Shulha, Hennady P; Zhang, Xiaoling; Xu, Mousheng; Haidar, Jaafar N S; Yu, Yong; Ruan, Yijun; Iyer, Vishwanath R; Green, Roland D; Wadelius, Claes; Farnham, Peggy J; Ren, Bing; Harte, Rachel A; Hinrichs, Angie S; Trumbower, Heather; Clawson, Hiram; Hillman-Jackson, Jennifer; Zweig, Ann S; Smith, Kayla; Thakkapallayil, Archana; Barber, Galt; Kuhn, Robert M; Karolchik, Donna; Armengol, Lluis; Bird, Christine P; de Bakker, Paul I W; Kern, Andrew D; Lopez-Bigas, Nuria; Martin, Joel D; Stranger, Barbara E; Woodroffe, Abigail; Davydov, Eugene; Dimas, Antigone; Eyras, Eduardo; Hallgrímsdóttir, Ingileif B; Huppert, Julian; Zody, Michael C; Abecasis, Gonçalo R; Estivill, Xavier; Bouffard, Gerard G; Guan, Xiaobin; Hansen, Nancy F; Idol, Jacquelyn R; Maduro, Valerie V B; Maskeri, Baishali; McDowell, Jennifer C; Park, Morgan; Thomas, Pamela J; Young, Alice C; Blakesley, Robert W; Muzny, Donna M; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Jiang, Huaiyang; Weinstock, George M; Gibbs, Richard A; Graves, Tina; Fulton, Robert; Mardis, Elaine R; Wilson, Richard K; Clamp, Michele; Cuff, James; Gnerre, Sante; Jaffe, David B; Chang, Jean L; Lindblad-Toh, Kerstin; Lander, Eric S; Koriabine, Maxim; Nefedov, Mikhail; Osoegawa, Kazutoyo; Yoshinaga, Yuko; Zhu, Baoli; de Jong, Pieter J
2007-06-14
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.
Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed
2016-01-01
The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which makes their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the reference genome using the TopHat and Cufflinks software. Transcripts that were shorter than 200 nt, contained an ORF longer than 83-100 amino acids, or showed significant homology to the NCBI nr-protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
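The three removal criteria described above (length, ORF size, protein homology) can be sketched as a simple filter. This is an illustrative sketch only, not the authors' pipeline: the function names, the forward-strand-only ORF scan, and the `has_protein_hit` flag (standing in for a homology search result computed elsewhere) are all assumptions, and the later protein-coding-score step is omitted.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length in amino acids of the longest ATG-to-stop ORF.

    Assumption: only the three forward-strand frames are scanned.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the current open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_candidate_lncrna(seq: str, min_len: int = 200,
                        max_orf_aa: int = 100,
                        has_protein_hit: bool = False) -> bool:
    """Apply the three removal criteria; True means the transcript is kept."""
    if len(seq) < min_len:
        return False  # shorter than the lncRNA length cutoff
    if longest_orf_aa(seq) > max_orf_aa:
        return False  # contains a long ORF, likely protein-coding
    return not has_protein_hit  # drop transcripts with protein homology


if __name__ == "__main__":
    toy = "C" * 300  # toy transcript with no ORF
    print(is_candidate_lncrna(toy))
```

Setting `max_orf_aa` to 83 instead of 100 corresponds to the stricter end of the ORF threshold range mentioned in the abstract.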
Lipinski, Kamil A; Kaniak-Golik, Aneta; Golik, Pawel
2010-01-01
As a legacy of their endosymbiotic eubacterial origin, mitochondria possess a residual genome, encoding only a few proteins and dependent on a variety of factors encoded by the nuclear genome for its maintenance and expression. As a facultative anaerobe with well understood genetics and molecular biology, Saccharomyces cerevisiae is the model system of choice for studying nucleo-mitochondrial genetic interactions. Maintenance of the mitochondrial genome is controlled by a set of nuclear-coded factors forming intricately interconnected circuits responsible for replication, recombination, repair and transmission to buds. Expression of the yeast mitochondrial genome is regulated mostly at the post-transcriptional level, and involves many general and gene-specific factors regulating splicing, RNA processing and stability and translation. A very interesting aspect of the yeast mitochondrial system is the relationship between genome maintenance and gene expression. Deletions of genes involved in many different aspects of mitochondrial gene expression, notably translation, result in an irreversible loss of functional mtDNA. The mitochondrial genetic system viewed from the systems biology perspective is therefore very fragile and lacks robustness compared to the remaining systems of the cell. This lack of robustness could be a legacy of the reductive evolution of the mitochondrial genome, but explanations involving selective advantages of increased evolvability have also been postulated. Copyright © 2009 Elsevier B.V. All rights reserved.
Hovde, Blake T.; Deodato, Chloe R.; Hunsperger, Heather M.; Ryken, Scott A.; Yost, Will; Jha, Ramesh K.; Patterson, Johnathan; Monnat, Raymond J.; Barlow, Steven B.; Starkenburg, Shawn R.; Cattolico, Rose Ann
2015-01-01
Haptophytes are recognized as seminal players in aquatic ecosystem function. These algae are important in global carbon sequestration, form destructive harmful blooms, and given their rich fatty acid content, serve as a highly nutritive food source to a broad range of eco-cohorts. Haptophyte dominance in both fresh and marine waters is supported by the mixotrophic nature of many taxa. Despite their importance the nuclear genome sequence of only one haptophyte, Emiliania huxleyi (Isochrysidales), is available. Here we report the draft genome sequence of Chrysochromulina tobin (Prymnesiales), and transcriptome data collected at seven time points over a 24-hour light/dark cycle. The nuclear genome of C. tobin is small (59 Mb), compact (∼40% of the genome is protein coding) and encodes approximately 16,777 genes. Genes important to fatty acid synthesis, modification, and catabolism show distinct patterns of expression when monitored over the circadian photoperiod. The C. tobin genome harbors the first hybrid polyketide synthase/non-ribosomal peptide synthase gene complex reported for an algal species, and encodes potential anti-microbial peptides and proteins involved in multidrug and toxic compound extrusion. A new haptophyte xanthorhodopsin was also identified, together with two “red” RuBisCO activases that are shared across many algal lineages. The Chrysochromulina tobin genome sequence provides new information on the evolutionary history, ecology and economic importance of haptophytes. PMID:26397803
Cozens, A L; Walker, J E
1986-01-01
The nucleotide sequence has been determined of a segment of 4680 bases of the pea chloroplast genome. It adjoins a sequence described elsewhere that encodes subunits of the F0 membrane domain of the ATP-synthase complex. The sequence contains a potential gene encoding a protein which is strongly related to the S2 polypeptide of Escherichia coli ribosomes. It also encodes an incomplete protein which contains segments that are homologous to the beta'-subunit of E. coli RNA polymerase and to yeast RNA polymerases II and III. PMID:3530249
Kidokoro, Satoshi; Watanabe, Keitaro; Ohori, Teppei; Moriwaki, Takashi; Maruyama, Kyonoshin; Mizoi, Junya; Myint Phyu Sin Htwe, Nang; Fujita, Yasunari; Sekita, Sachiko; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko
2015-02-01
Soybean (Glycine max) is a globally important crop, and its growth and yield are severely reduced by abiotic stresses, such as drought, heat, and cold. The cis-acting element DRE (dehydration-responsive element)/CRT plays an important role in activating gene expression in response to these stresses. The Arabidopsis DREB1/CBF genes that encode DRE-binding proteins function as transcriptional activators in cold stress-responsive gene expression. In this study, we identified 14 DREB1-type transcription factors (GmDREB1s) from a soybean genome database. The expression of most GmDREB1 genes in soybean was strongly induced by a variety of abiotic stresses, such as cold, drought, high salt, and heat. The GmDREB1 proteins activated transcription via DREs in Arabidopsis and soybean protoplasts. Transcriptome analyses using transgenic Arabidopsis plants overexpressing GmDREB1s indicated that many of the downstream genes are cold-inducible and overlap with those of Arabidopsis DREB1A. We then comprehensively analyzed the downstream genes of GmDREB1B;1, which is closely related to DREB1A, using a transient expression system in soybean protoplasts. The expression of numerous genes induced by various abiotic stresses was increased by overexpressing GmDREB1B;1 in soybean, and DREs were the most conserved element in the promoters of these genes. The downstream genes of GmDREB1B;1 included numerous soybean-specific stress-inducible genes that encode an ABA receptor family protein, GmPYL21, and translation-related genes, such as ribosomal proteins. We confirmed that GmDREB1B;1 directly activates GmPYL21 expression and enhances ABRE-mediated gene expression in an ABA-independent manner. These results suggest that GmDREB1 proteins activate the expression of numerous soybean-specific stress-responsive genes under diverse abiotic stress conditions. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Penz, Thomas; Horn, Matthias; Schmitz-Esser, Stephan
2010-01-01
The recently sequenced genome of the obligate intracellular amoeba symbiont 'Candidatus Amoebophilus asiaticus' is unique among prokaryotic genomes due to its extremely large fraction of genes encoding proteins harboring eukaryotic domains such as ankyrin-repeats, TPR/SEL1 repeats, leucine-rich repeats, as well as F- and U-box domains, most of which likely serve in the interaction with the amoeba host. Here we provide evidence for the presence of additional proteins which are presumably presented extracellularly and should thus also be important for host cell interaction. Surprisingly, we did not find homologues of any of the well-known protein secretion systems required to translocate effector proteins into the host cell in the A. asiaticus genome, and the type VI secretion system seems to be incomplete. Here we describe the presence of a putative prophage in the A. asiaticus genome, which shows similarity to the antifeeding prophage from the insect pathogen Serratia entomophila. In S. entomophila this system is used to deliver toxins into insect hosts. This putative antifeeding-like prophage might thus represent the missing protein secretion apparatus in A. asiaticus.
CRISPR-Cas encoding of a digital movie into the genomes of a population of living bacteria.
Shipman, Seth L; Nivala, Jeff; Macklis, Jeffrey D; Church, George M
2017-07-20
DNA is an excellent medium for archiving data. Recent efforts have illustrated the potential for information storage in DNA using synthesized oligonucleotides assembled in vitro. A relatively unexplored avenue of information storage in DNA is the ability to write information into the genome of a living cell by the addition of nucleotides over time. Using the Cas1-Cas2 integrase, the CRISPR-Cas microbial immune system stores the nucleotide content of invading viruses to confer adaptive immunity. When harnessed, this system has the potential to write arbitrary information into the genome. Here we use the CRISPR-Cas system to encode the pixel values of black and white images and a short movie into the genomes of a population of living bacteria. In doing so, we push the technical limits of this information storage system and optimize strategies to minimize those limitations. We also uncover underlying principles of the CRISPR-Cas adaptation system, including sequence determinants of spacer acquisition that are relevant for understanding both the basic biology of bacterial adaptation and its technological applications. This work demonstrates that this system can capture and stably store practical amounts of real data within the genomes of populations of living cells.
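As a rough illustration of what writing pixel values into nucleotide sequence can look like, the sketch below maps 2-bit pixel values to bases, one base per pixel. The mapping table and function names are hypothetical and chosen only for illustration; this is not the encoding scheme used in the study, which the abstract does not specify.

```python
# Toy mapping: each 2-bit pixel value (0-3) becomes one base.
# This table is an illustrative assumption, not the study's scheme.
PIXEL_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}
BASE_TO_PIXEL = {base: value for value, base in PIXEL_TO_BASE.items()}


def encode_pixels(pixels):
    """Convert a flat list of pixel values in 0..3 into a DNA string."""
    return "".join(PIXEL_TO_BASE[p] for p in pixels)


def decode_pixels(sequence):
    """Recover the pixel values from a DNA string produced by encode_pixels."""
    return [BASE_TO_PIXEL[b] for b in sequence]


row = [0, 3, 3, 0, 1, 2]      # one made-up image row
dna = encode_pixels(row)      # sequence that would be supplied for acquisition
restored = decode_pixels(dna)
```

A real scheme would also need to handle sequence constraints and the ordering of acquired spacers, which this toy mapping ignores.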
Nisa-Martínez, Rafael; Jiménez-Zurdo, José I.; Martínez-Abarca, Francisco; Muñoz-Adelantado, Estefanía; Toro, Nicolás
2007-01-01
RmInt1 is a self-splicing and mobile group II intron initially identified in the bacterium Sinorhizobium meliloti, which encodes a reverse transcriptase–maturase (Intron Encoded Protein, IEP) lacking the C-terminal DNA binding (D) and DNA endonuclease domains (En). RmInt1 invades cognate intronless homing sites (ISRm2011-2) by a mechanism known as retrohoming. This work describes how the RmInt1 intron spreads in the S. meliloti genome upon acquisition by conjugation. This process was revealed by using the wild-type intron RmInt1 and engineered intron-donor constructs based on ribozyme coding sequence (ΔORF)-derivatives with higher homing efficiency than the wild-type intron. The data demonstrate that RmInt1 propagates into the S. meliloti genome primarily by retrohoming with a strand bias related to replication of the chromosome and symbiotic megaplasmids. Moreover, we show that when expressed in trans from a separate plasmid, the IEP is able to mobilize genomic ΔORF ribozymes that afterward displayed wild-type levels of retrohoming. Our results contribute to a further understanding of how group II introns spread into bacterial genomes in nature. PMID:17158161
Blazier, J Chris; Ruhlman, Tracey A; Weng, Mao-Lun; Rehman, Sumaiyah K; Sabir, Jamal S M; Jansen, Robert K
2016-04-18
Genes for the plastid-encoded RNA polymerase (PEP) persist in the plastid genomes of all photosynthetic angiosperms. However, three unrelated lineages (Annonaceae, Passifloraceae and Geraniaceae) have been identified with unusually divergent open reading frames (ORFs) in the conserved region of rpoA, the gene encoding the PEP α subunit. We used sequence-based approaches to evaluate whether these genes retain function. Both gene sequences and complete plastid genome sequences were assembled and analyzed from each of the three angiosperm families. Multiple lines of evidence indicated that the rpoA sequences are likely functional despite retaining as low as 30% nucleotide sequence identity with rpoA genes from outgroups in the same angiosperm order. The ratio of non-synonymous to synonymous substitutions indicated that these genes are under purifying selection, and bioinformatic prediction of conserved domains indicated that functional domains are preserved. One of the lineages (Pelargonium, Geraniaceae) contains species with multiple rpoA-like ORFs that show evidence of ongoing inter-paralog gene conversion. The plastid genomes containing these divergent rpoA genes have experienced extensive structural rearrangement, including large expansions of the inverted repeat. We propose that illegitimate recombination, not positive selection, has driven the divergence of rpoA.
The Complete Plastome Sequence of an Antarctic Bryophyte Sanionia uncinata (Hedw.) Loeske
Park, Mira; Park, Hyun; Lee, Hyoungseok; Lee, Byeong-ha
2018-01-01
Organellar genomes of bryophytes are poorly represented with chloroplast genomes of only four mosses, four liverworts and two hornworts having been sequenced and annotated. Moreover, while Antarctic vegetation is dominated by the bryophytes, there are few reports on the plastid genomes for the Antarctic bryophytes. Sanionia uncinata (Hedw.) Loeske is one of the most dominant moss species in the maritime Antarctic. It has been researched as an important marker for ecological studies and as an extremophile plant for studies on stress tolerance. Here, we report the complete plastome sequence of S. uncinata, which can be exploited in comparative studies to identify the lineage-specific divergence across different species. The complete plastome of S. uncinata is 124,374 bp in length with a typical quadripartite structure of 114 unique genes including 82 unique protein-coding genes, 37 tRNA genes and four rRNA genes. However, two genes encoding the α subunit of RNA polymerase (rpoA) and encoding the cytochrome b6/f complex subunit VIII (petN) were absent. We could identify nuclear genes homologous to those genes, which suggests that rpoA and petN might have been relocated from the chloroplast genome to the nuclear genome. PMID:29494552
De Novo Sequencing of a Sparassis latifolia Genome and Its Associated Comparative Analyses
Ma, Lu; Yang, Chi; Ying, Zhenghe; Jiang, Xiaoling
2018-01-01
Known to be rich in β-glucan, Sparassis latifolia (S. latifolia) is a valuable edible fungus cultivated in East Asia. A few studies have suggested that S. latifolia is effective as an antidiabetic, antihypertensive, antitumor, and antiallergic agent. However, the genetic basis of these medicinal effects is still unclear, which has become a key bottleneck for its further applications. To provide a better understanding of this fungus, we sequenced its whole genome, which has a total size of 48.13 megabases (Mb) and contains 12,471 predicted gene models. We then performed comparative and phylogenetic analyses, which indicate that S. latifolia is closely related to a few species in the antrodia clade including Fomitopsis pinicola, Wolfiporia cocos, Postia placenta, and Antrodia sinuosa. Finally, we annotated the predicted genes. Interestingly, the S. latifolia genome encodes most enzymes involved in carbohydrate and glycoconjugate metabolism and is also enriched in genes encoding enzymes critical to secondary metabolite biosynthesis and involved in indole, terpene, and type I polyketide pathways. In conclusion, the genome content of S. latifolia sheds light on the genetic basis of its reported medicinal properties and could also be used as a reference genome for comparative studies on fungi. PMID:29682127
Peng, Huizhen; Liu, Qiaolin; Xiao, Tiaoyi
2016-09-01
In this study, 15 sets of primers were used to amplify contiguous, overlapping segments of the complete mitochondrial DNA (mtDNA) of C. carpio furong(♀) × C. carpio var. singguonensis(♂) in order to characterize its mitochondrial genome. The total length of the mitochondrial genome was 16,581 bp, and the sequence was deposited in GenBank under accession number KP210473. The mitochondrial genome contained 37 genes (13 protein-coding genes, 2 ribosomal RNAs and 22 transfer RNAs) and a major non-coding control region, an organization similar to that of previously reported mitochondrial genomes. Most genes were encoded on the H-strand, except for ND6 and 8 tRNA genes, which were encoded on the L-strand. The nucleotide skewness for the coding strands of C. carpio furong(♀) × C. carpio var. singguonensis(♂) (AT-skew = 0.12, GC-skew = -0.27) was biased toward T and G. The complete mitogenome may provide important data for the study of the genetic mechanisms of C. carpio furong(♀) × C. carpio var. singguonensis(♂).
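For readers unfamiliar with skew statistics, the following minimal sketch shows one common way to compute them. It assumes the conventional definitions AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C); the abstract does not state which formulas or strand the authors used, and the example sequence is made up.

```python
def skews(sequence):
    """Return (AT-skew, GC-skew) for a DNA string.

    Assumed definitions: AT-skew = (A - T) / (A + T),
    GC-skew = (G - C) / (G + C). A skew is reported as 0.0
    when its denominator would be zero.
    """
    seq = sequence.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Made-up sequence with 4 A, 2 T, 1 G and 3 C.
at, gc = skews("AAAATTGCCC")
```

Under these assumed definitions, a positive AT-skew corresponds to more A than T, and a negative GC-skew to more C than G, on the strand analyzed.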
Experimental Induction of Genome Chaos.
Ye, Christine J; Liu, Guo; Heng, Henry H
2018-01-01
Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.
The Genomic HyperBrowser: an analysis web server for genome-scale data
Sandve, Geir K.; Gundersen, Sveinung; Johansen, Morten; Glad, Ingrid K.; Gunathasan, Krishanthi; Holden, Lars; Holden, Marit; Liestøl, Knut; Nygård, Ståle; Nygaard, Vegard; Paulsen, Jonas; Rydbeck, Halfdan; Trengereid, Kai; Clancy, Trevor; Drabløs, Finn; Ferkingstad, Egil; Kalaš, Matúš; Lien, Tonje; Rye, Morten B.; Frigessi, Arnoldo; Hovig, Eivind
2013-01-01
The immense increase in availability of genomic-scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser opens up a range of genomic investigations, related to, e.g., gene regulation, disease association or epigenetic modifications of the genome. PMID:23632163
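To make the idea of a genomic track concrete, here is a small sketch that stores two tracks as lists of coordinates and counts the bases they share. The `Interval` type, the function name, and the example coordinates are all hypothetical; this is not the HyperBrowser's API, only an illustration of the coordinate-based representation the abstract describes.

```python
from collections import namedtuple

# A track element: chromosome name plus half-open coordinates [start, end).
Interval = namedtuple("Interval", ["chrom", "start", "end"])


def overlap_bp(track_a, track_b):
    """Total base pairs shared by two tracks (naive O(n*m) scan).

    Assumes the intervals within each track do not overlap one another.
    """
    total = 0
    for a in track_a:
        for b in track_b:
            if a.chrom == b.chrom:
                total += max(0, min(a.end, b.end) - max(a.start, b.start))
    return total


# Made-up example tracks.
genes = [Interval("chr1", 100, 200), Interval("chr2", 50, 80)]
peaks = [Interval("chr1", 150, 250), Interval("chr2", 0, 60)]
shared = overlap_bp(genes, peaks)
```

For genome-scale tracks, a sorted sweep or an interval tree would replace the nested loop; the quadratic scan is kept here for readability.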
Kim, Suyeon; Chung, Han Young; Lee, Dong-Hoon; Lim, Jong Gyu; Kim, Se Keun; Ku, Hye-Jin; Kim, You-Tae; Kim, Heebal; Ryu, Sangryeol; Lee, Ju-Hoon; Choi, Sang Ho
2016-07-01
Vibrio parahaemolyticus is a Gram-negative, motile, nonspore-forming pathogen that causes foodborne illness associated with the consumption of contaminated seafood. Although many foodborne outbreaks caused by V. parahaemolyticus have been reported, the genomes of only five strains have been completely sequenced and analyzed using bioinformatics. In order to characterize the overall virulence factors and pathogenesis of V. parahaemolyticus associated with foodborne outbreaks in South Korea, a new strain, FORC_008, was isolated from flounder fish and its genome was completely sequenced. The genomic analysis revealed that the genome of FORC_008 consists of two circular DNA chromosomes of 3,266,132 bp (chromosome I) and 1,772,036 bp (chromosome II) with GC contents of 45.36% and 45.53%, respectively. The entire genome contains 4494 predicted open reading frames, 129 tRNAs and 31 rRNA genes. While strain FORC_008 does not have genes encoding thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH), its genome encodes many other virulence factors including hemolysins, pathogenesis-associated secretion systems and iron acquisition systems, suggesting that it may be a potential pathogen. This report extends the understanding of V. parahaemolyticus at the genomic level and should be helpful for rapid detection, epidemiological investigation and prevention of foodborne outbreaks in South Korea. © FEMS 2016.
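The per-chromosome figures above can be combined into genome-wide values with simple arithmetic. The sketch below uses only the lengths and GC percentages quoted in the abstract; the combined totals are derived here and are not stated in the source, and the variable names are illustrative.

```python
# (length in bp, GC content in %) as quoted in the abstract.
chromosomes = {
    "chromosome I": (3_266_132, 45.36),
    "chromosome II": (1_772_036, 45.53),
}

# Genome size is the sum of the chromosome lengths.
total_bp = sum(length for length, _ in chromosomes.values())

# Genome-wide GC content is the length-weighted mean of the per-chromosome values.
weighted_gc = sum(length * gc for length, gc in chromosomes.values()) / total_bp

print(total_bp, round(weighted_gc, 2))
```

Weighting by length matters because chromosome I is nearly twice the size of chromosome II, so a plain average of the two percentages would slightly overstate the genome-wide value.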
Springfeld, Christoph; Darai, Gholamreza; Cattaneo, Roberto
2005-01-01
Rhabdoviruses are negative-stranded RNA viruses of the order Mononegavirales and have been isolated from vertebrates, insects, and plants. Members of the genus Lyssavirus cause the invariably fatal disease rabies, and a member of the genus Vesiculovirus, Chandipura virus, has recently been associated with acute encephalitis in children. We present here the complete genome sequence and transcription map of a rhabdovirus isolated from cultivated cells of hepatocellular carcinoma tissue from a moribund tree shrew. The negative-strand genome of tupaia rhabdovirus is composed of 11,440 nucleotides and encodes six genes that are separated by one or two intergenic nucleotides. In addition to the typical rhabdovirus genes in the order N-P-M-G-L, a gene encoding a small hydrophobic putative type I transmembrane protein of approximately 11 kDa was identified between the M and G genes, and the corresponding transcript was detected in infected cells. Similar to some Vesiculoviruses and many Paramyxovirinae, the P gene has a second overlapping reading frame that can be accessed by ribosomal choice and encodes a protein of 26 kDa, predicted to be the largest C protein of these virus families. Phylogenetic analyses of the tupaia rhabdovirus N and L genes show that the virus is distantly related to the Vesiculoviruses, Ephemeroviruses, and the recently characterized Flanders virus and Oita virus and further extends the sequence territory occupied by animal rhabdoviruses. PMID:15890917
Regulatory role of XynR (YagI) in catabolism of xylonate in Escherichia coli K-12.
Shimada, Tomohiro; Momiyama, Eri; Yamanaka, Yuki; Watanabe, Hiroki; Yamamoto, Kaneyoshi; Ishihama, Akira
2017-12-01
The genome of Escherichia coli K-12 contains ten cryptic phages, altogether constituting about 3.6% of the genome in sequence. Among more than 200 predicted genes in these cryptic phages, 14 putative transcription factor (TF) genes exist, but their regulatory functions remain unidentified. As an initial step toward understanding the regulatory roles of cryptic phage-encoded TFs, we tried to identify the regulatory function of YagI, a protein of unknown function encoded by the CP4-6 cryptic prophage. After SELEX screening, YagI was found to bind mainly at a single site within the spacer of bidirectional transcription units, yagA (encoding another uncharacterized TF) and yagEF (encoding 2-keto-3-deoxy gluconate aldolase and dehydratase, respectively) within this prophage region. YagEF enzymes are involved in the catabolism of xylose downstream from xylonate. We then designated YagI as XynR (regulator of xylonate catabolism), one of the rare single-target TFs. In agreement with this predicted regulatory function, the activity of XynR was suggested to be controlled by xylonate. Even though low-affinity binding sites of XynR were identified in the E. coli K-12 genome, they all were inside open reading frames, implying that the regulation network of XynR is still fixed within the CP4-6 prophage without significant influence over the host E. coli K-12. © FEMS 2017.
Nouvel, Laurent X; Sirand-Pugnet, Pascal; Marenda, Marc S; Sagné, Eveline; Barbe, Valérie; Mangenot, Sophie; Schenowitz, Chantal; Jacob, Daniel; Barré, Aurélien; Claverol, Stéphane; Blanchard, Alain; Citti, Christine
2010-02-02
While the genomic era is accumulating a tremendous amount of data, the question of how genomics can describe a bacterial species remains to be fully addressed. The recent sequencing of the genome of the Mycoplasma agalactiae type strain has challenged our general view on mycoplasmas by suggesting that these simple bacteria are able to exchange significant amounts of genetic material via horizontal gene transfer. Yet, events that are shaping mycoplasma genomes and that underlie diversity within this species have to be fully evaluated. For this purpose, we compared two strains that are representative of the genetic spectrum encountered in this species: the type strain PG2, whose genome is already available, and a field strain, 5632, which was fully sequenced and annotated in this study. The two genomes differ by ca. 130 kbp, with that of 5632 being the largest (1006 kbp). The makeup of this additional genetic material mainly corresponds (i) to mobile genetic elements and (ii) to an expanded repertoire of gene families that encode putative surface proteins and display features of highly-variable systems. More specifically, three entire copies of a previously described integrative conjugative element are found in 5632 that account for ca. 80 kbp. Other mobile genetic elements, found in 5632 but not in PG2, are the more classical insertion sequences which are related to those found in two other ruminant pathogens, M. bovis and M. mycoides subsp. mycoides SC. In 5632, repertoires of gene families encoding surface proteins are larger due to gene duplication. Comparative proteomic analyses of the two strains indicate that the additional coding capacity of 5632 affects the overall architecture of the surface and suggests the occurrence of new phase variable systems based on single nucleotide polymorphisms. Overall, comparative analyses of two M. agalactiae strains revealed a very dynamic genome whose structure has been shaped by gene flow among ruminant mycoplasmas and expansion-reduction of gene repertoires encoding surface proteins, the expression of which is driven by localized genetic micro-events.
Reeve, Wayne; van Berkum, Peter; Ardley, Julie; ...
2017-03-04
Bradyrhizobium elkanii USDA 76 T (INSCD = ARAG00000000), the type strain for Bradyrhizobium elkanii, is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen-fixing root nodule of Glycine max (L. Merr) grown in the USA. Because of its significance as a microsymbiont of this economically important legume, B. elkanii USDA 76 T was selected as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria sequencing project. Here the symbiotic abilities of B. elkanii USDA 76 T are described, together with its genome sequence information and annotation. The 9,484,767 bp high-quality draft genome is arranged in 2 scaffolds of 25 contigs, containing 9060 protein-coding genes and 91 RNA-only encoding genes. The B. elkanii USDA 76 T genome contains a low GC content region with symbiotic nod and fix genes, indicating the presence of a symbiotic island integration. A comparison of five B. elkanii genomes that formed a clique revealed that 356 of the 9060 protein coding genes of USDA 76 T were unique, including 22 genes of an intact resident prophage. A conserved set of 7556 genes was also identified for this species, including genes encoding a general secretion pathway as well as type II, III, IV and VI secretion system proteins. The type III secretion system has previously been characterized as a host determinant for Rj and/or rj soybean cultivars. Here we show that the USDA 76 T genome contains genes encoding all the type III secretion system components, including a translocon complex protein NopX required for the introduction of effector proteins into host cells. While many bradyrhizobial strains are unable to nodulate the soybean cultivar Clark (rj1), USDA 76 T was able to elicit nodules on Clark (rj1), although in reduced numbers, when plants were grown in Leonard jars containing sand or vermiculite. In these conditions, we postulate that the presence of NopX allows USDA 76 T to introduce various effector molecules into this host to enable nodulation.
2010-01-01
Background: While the genomic era is accumulating a tremendous amount of data, the question of how genomics can describe a bacterial species remains to be fully addressed. The recent sequencing of the genome of the Mycoplasma agalactiae type strain has challenged our general view on mycoplasmas by suggesting that these simple bacteria are able to exchange significant amounts of genetic material via horizontal gene transfer. Yet, events that are shaping mycoplasma genomes and that underlie diversity within this species have to be fully evaluated. For this purpose, we compared two strains that are representative of the genetic spectrum encountered in this species: the type strain PG2, whose genome is already available, and a field strain, 5632, which was fully sequenced and annotated in this study. Results: The two genomes differ by ca. 130 kbp, with that of 5632 being the largest (1006 kbp). The makeup of this additional genetic material mainly corresponds (i) to mobile genetic elements and (ii) to an expanded repertoire of gene families that encode putative surface proteins and display features of highly-variable systems. More specifically, three entire copies of a previously described integrative conjugative element are found in 5632 that account for ca. 80 kbp. Other mobile genetic elements, found in 5632 but not in PG2, are the more classical insertion sequences which are related to those found in two other ruminant pathogens, M. bovis and M. mycoides subsp. mycoides SC. In 5632, repertoires of gene families encoding surface proteins are larger due to gene duplication. Comparative proteomic analyses of the two strains indicate that the additional coding capacity of 5632 affects the overall architecture of the surface and suggests the occurrence of new phase variable systems based on single nucleotide polymorphisms. Conclusion: Overall, comparative analyses of two M. agalactiae strains revealed a very dynamic genome whose structure has been shaped by gene flow among ruminant mycoplasmas and expansion-reduction of gene repertoires encoding surface proteins, the expression of which is driven by localized genetic micro-events. PMID:20122262
Structural insights into the multifunctional protein VP3 of birnaviruses.
Casañas, Arnau; Navarro, Aitor; Ferrer-Orta, Cristina; González, Dolores; Rodríguez, José F; Verdaguer, Núria
2008-01-01
Infectious bursal disease virus (IBDV), a member of the Birnaviridae family, is the causative agent of one of the most harmful poultry diseases. The IBDV genome encodes five mature proteins; of these, the multifunctional protein VP3 plays an essential role in virus morphogenesis. This protein, which interacts with the structural protein VP2, with the double-stranded RNA genome, and with the virus-encoded, RNA-dependent RNA polymerase, VP1, is involved not only in the formation of the viral capsid, but also in the recruitment of VP1 into the capsid and in the encapsidation of the viral genome. Here, we report the X-ray structure of the central region of VP3, residues 92-220, consisting of two alpha-helical domains connected by a long and flexible hinge that are organized as a dimer. Unexpectedly, the overall fold of the second VP3 domain shows significant structural similarities with different transcription regulation factors.
Proteogenomic characterization of human colon and rectal cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Bing; Wang, Jing; Wang, Xiaojing
2014-09-18
We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Protein sequence variants encoded by somatic genomic variations displayed reduced expression compared to protein variants encoded by germline variations. mRNA transcript abundance did not reliably predict protein expression differences between tumors. Proteomics identified five protein expression subtypes, two of which were associated with the TCGA "MSI/CIMP" transcriptional subtype, but had distinct mutation and methylation patterns and associated with different clinical outcomes. Although CNAs showed strong cis- and trans-effects on mRNA expression, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. Our analyses identified HNF4A, a novel candidate driver gene in tumors with chromosome 20q amplifications. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords novel insights into cancer biology.
Complete genome sequence of Paenibacillus sp. strain JDR-2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chow, Virginia; Nong, Guang; St. John, Franz J.
2012-01-01
Paenibacillus sp. strain JDR-2, an aggressively xylanolytic bacterium isolated from sweetgum (Liquidambar styraciflua) wood, is able to efficiently depolymerize, assimilate and metabolize 4-O-methylglucuronoxylan, the predominant structural component of hardwood hemicelluloses. A basis for this capability was first supported by the identification of genes and characterization of encoded enzymes and has been further defined by the sequencing and annotation of the complete genome, which we describe. In addition to genes implicated in the utilization of β-1,4-xylan, genes have also been identified for the utilization of other hemicellulosic polysaccharides. The genome of Paenibacillus sp. JDR-2 contains 7,184,930 bp in a single replicon with 6,288 protein-coding and 122 RNA genes. Uniquely prominent are 874 genes encoding proteins involved in carbohydrate transport and metabolism. The prevalence and organization of these genes support a metabolic potential for bioprocessing of hemicellulose fractions derived from lignocellulosic resources.
Proteins of Unknown Biochemical Function: A Persistent Problem and a Roadmap to Help Overcome It.
Niehaus, Thomas D; Thamm, Antje M K; de Crécy-Lagard, Valérie; Hanson, Andrew D
2015-11-01
The number of sequenced genomes is rapidly increasing, but functional annotation of the genes in these genomes lags far behind. Even in Arabidopsis (Arabidopsis thaliana), only approximately 40% of enzyme- and transporter-encoding genes have credible functional annotations, and this number is even lower in nonmodel plants. Functional characterization of unknown genes is a challenge, but various databases (e.g. for protein localization and coexpression) can be mined to provide clues. If homologous microbial genes exist (and about one-half the genes encoding unknown enzymes and transporters in Arabidopsis have microbial homologs), cross-kingdom comparative genomics can powerfully complement plant-based data. Multiple lines of evidence can strengthen predictions and warrant experimental characterization. In some cases, relatively quick tests in genetically tractable microbes can determine whether a prediction merits biochemical validation, which is costly and demands specialized skills. © 2015 American Society of Plant Biologists. All Rights Reserved.
O'Connell Motherway, Mary; Zomer, Aldert; Leahy, Sinead C.; Reunanen, Justus; Bottacini, Francesca; Claesson, Marcus J.; O'Brien, Frances; Flynn, Kiera; Casey, Patrick G.; Moreno Munoz, Jose Antonio; Kearney, Breda; Houston, Aileen M.; O'Mahony, Caitlin; Higgins, Des G.; Shanahan, Fergus; Palva, Airi; de Vos, Willem M.; Fitzgerald, Gerald F.; Ventura, Marco; O'Toole, Paul W.; van Sinderen, Douwe
2011-01-01
Development of the human gut microbiota commences at birth, with bifidobacteria being among the first colonizers of the sterile newborn gastrointestinal tract. To date, the genetic basis of Bifidobacterium colonization and persistence remains poorly understood. Transcriptome analysis of the Bifidobacterium breve UCC2003 2.42-Mb genome in a murine colonization model revealed differential expression of a type IVb tight adherence (Tad) pilus-encoding gene cluster designated “tad2003.” Mutational analysis demonstrated that the tad2003 gene cluster is essential for efficient in vivo murine gut colonization, and immunogold transmission electron microscopy confirmed the presence of Tad pili at the poles of B. breve UCC2003 cells. Conservation of the Tad pilus-encoding locus among other B. breve strains and among sequenced Bifidobacterium genomes supports the notion of a ubiquitous pili-mediated host colonization and persistence mechanism for bifidobacteria. PMID:21690406
Integrating mass spectrometry and genomics for cyanobacterial metabolite discovery
Bertin, Matthew J.; Kleigrewe, Karin; Leão, Tiago F.; Gerwick, Lena
2016-01-01
Filamentous marine cyanobacteria produce bioactive natural products with both potential therapeutic value and capacity to be harmful to human health. Genome sequencing has revealed that cyanobacteria have the capacity to produce many more secondary metabolites than have been characterized. The biosynthetic pathways that encode cyanobacterial natural products are mostly uncharacterized, and lack of cyanobacterial genetic tools has largely prevented their heterologous expression. Hence, a combination of cutting edge and traditional techniques has been required to elucidate their secondary metabolite biosynthetic pathways. Here, we review the discovery and refined biochemical understanding of the olefin synthase and fatty acid ACP reductase/aldehyde deformylating oxygenase pathways to hydrocarbons, and the curacin A, jamaicamide A, lyngbyabellin, columbamide, and a trans-acyltransferase macrolactone pathway encoding phormidolide. We integrate into this discussion the use of genomics, mass spectrometric networking, biochemical characterization, and isolation and structure elucidation techniques. PMID:26578313
Kang, Sung-Hwan; Atallah, Osama O; Sun, Yong-Duo; Folimonova, Svetlana Y
2018-01-15
Viruses from the family Closteroviridae show an example of intra-genome duplications of more than one gene. In addition to the hallmark coat protein gene duplication, several members possess a tandem duplication of papain-like leader proteases. In this study, we demonstrate that domains encoding the L1 and L2 proteases in the Citrus tristeza virus genome underwent a significant functional divergence at the RNA and protein levels. We show that the L1 protease is crucial for viral accumulation and establishment of initial infection, whereas its coding region is vital for virus transport. On the other hand, the second protease is indispensable for virus infection of its natural citrus host, suggesting that L2 has evolved an important adaptive function that mediates virus interaction with the woody host. Copyright © 2017 Elsevier Inc. All rights reserved.
O'Connell Motherway, Mary; Zomer, Aldert; Leahy, Sinead C; Reunanen, Justus; Bottacini, Francesca; Claesson, Marcus J; O'Brien, Frances; Flynn, Kiera; Casey, Patrick G; Munoz, Jose Antonio Moreno; Kearney, Breda; Houston, Aileen M; O'Mahony, Caitlin; Higgins, Des G; Shanahan, Fergus; Palva, Airi; de Vos, Willem M; Fitzgerald, Gerald F; Ventura, Marco; O'Toole, Paul W; van Sinderen, Douwe
2011-07-05
Development of the human gut microbiota commences at birth, with bifidobacteria being among the first colonizers of the sterile newborn gastrointestinal tract. To date, the genetic basis of Bifidobacterium colonization and persistence remains poorly understood. Transcriptome analysis of the Bifidobacterium breve UCC2003 2.42-Mb genome in a murine colonization model revealed differential expression of a type IVb tight adherence (Tad) pilus-encoding gene cluster designated "tad(2003)." Mutational analysis demonstrated that the tad(2003) gene cluster is essential for efficient in vivo murine gut colonization, and immunogold transmission electron microscopy confirmed the presence of Tad pili at the poles of B. breve UCC2003 cells. Conservation of the Tad pilus-encoding locus among other B. breve strains and among sequenced Bifidobacterium genomes supports the notion of a ubiquitous pili-mediated host colonization and persistence mechanism for bifidobacteria.
Metagenomic recovery of phage genomes of uncultured freshwater actinobacteria.
Ghai, Rohit; Mehrshad, Maliheh; Mizuno, Carolina Megumi; Rodriguez-Valera, Francisco
2017-01-01
Low-GC Actinobacteria are among the most abundant and widespread microbes in freshwaters and have largely resisted all cultivation efforts. Consequently, their phages have remained totally unknown. In this work, we have used deep metagenomic sequencing to assemble eight complete genomes of the first tailed phages that infect freshwater Actinobacteria. Their genomes encode the actinobacterial-specific transcription factor whiB, frequently found in mycobacteriophages and also in phages infecting marine pelagic Actinobacteria. Its presence suggests a common and widespread strategy of modulation of host transcriptional machinery upon infection via this transcriptional switch. We present evidence that some whiB-carrying phages infect the acI lineage of Actinobacteria. At least one of them encodes the ADP-ribosylating component of the widespread bacterial AB toxins family (for example, clostridial toxin). We posit that the presence of this toxin reflects a 'trojan horse' strategy, providing protection at the population level to the abundant host microbes against eukaryotic predators.
2008-04-01
metabolite-producing clusters encoding nonribosomal peptide or polyketide synthetases...BMA1848) encoding a subunit of acetolactate synthase III. The resultant mutant was not able to grow on minimal glucose medium and, similar to what has...caused by the wild type. BMAA1204 is a 4,200-residue CDS annotated as encoding a putative polyketide synthase (PKS) in COG family 0332. PKSs are
Kant, Ravi; Sigvart-Mattila, Pia; Paulin, Lars; Mecklin, Jukka-Pekka; Saarela, Maria; Palva, Airi; von Ossowski, Ingemar
2014-01-01
Lactobacillus rhamnosus is a ubiquitously adaptable Gram-positive bacterium and as a typical commensal can be recovered from various microbe-accessible bodily orifices and cavities. Then again, other isolates are food-borne, with some of these having been long associated with naturally fermented cheeses and yogurts. Additionally, because of perceived health benefits to humans and animals, numerous L. rhamnosus strains have been selected for use as so-called probiotics and are often taken in the form of dietary supplements and functional foods. At the genome level, it is anticipated that certain genetic variances will have provided the niche-related phenotypes that augment the flexible adaptiveness of this species, thus enabling its strains to grow and survive in their respective host environments. For this present study, we considered it functionally informative to examine and catalogue the genotype-phenotype variation existing at the cell surface between different L. rhamnosus strains, with the presumption that this might be relatable to habitat preferences and ecological adaptability. Here, we conducted a pan-genomic study involving 13 genomes from L. rhamnosus isolates with various origins. In using a benchmark strain (gut-adapted L. rhamnosus GG) for our pan-genome comparison, we had focused our efforts on a detailed examination and description of gene products for certain functionally relevant surface-exposed proteins, each of which in effect might also play a part in niche adaptability among the other strains. Perhaps most significantly of the surface protein loci we had analyzed, it would appear that the spaCBA operon (known to encode SpaCBA-called pili having a mucoadhesive phenotype) is a genomic rarity and an uncommon occurrence in L. rhamnosus. However, for any of the so-piliated L. rhamnosus strains, they will likely possess an increased niche-specific fitness, which functionally might presumably be manifested by a protracted transient colonization of the gut mucosa or some similar microhabitat. PMID:25032833
Kirkness, Ewen F; Haas, Brian J; Sun, Weilin; Braig, Henk R; Perotti, M Alejandra; Clark, John M; Lee, Si Hyeock; Robertson, Hugh M; Kennedy, Ryan C; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V; Elsik, Christine G; Graur, Dan; Hill, Catherine A; Veenstra, Jan A; Walenz, Brian; Tubío, José Manuel C; Ribeiro, José M C; Rozas, Julio; Johnston, J Spencer; Reese, Justin T; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L; Tomoyasu, Yoshinori; Kraus, Emily; Krause, Emily; Mittapalli, Omprakash; Margam, Venu M; Li, Hong-Mei; Meyer, Jason M; Johnson, Reed M; Romero-Severson, Jeanne; Vanzee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M; Yoon, Kyong S; Strycharz, Joseph P; Unger, Maria F; Christley, Scott; Lobo, Neil F; Seufferheld, Manfredo J; Wang, Naikuan; Dasch, Gregory A; Struchiner, Claudio J; Madey, Greg; Hannick, Linda I; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C; Cameron, Stephen; Bruggner, Robert V; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R; Sutton, Granger G; Lawson, Daniel; Waterhouse, Robert M; Venter, J Craig; Strausberg, Robert L; Berenbaum, May R; Collins, Frank H; Zdobnov, Evgeny M; Pittendrigh, Barry R
2010-07-06
As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.
Kirkness, Ewen F.; Haas, Brian J.; Sun, Weilin; Braig, Henk R.; Perotti, M. Alejandra; Clark, John M.; Lee, Si Hyeock; Robertson, Hugh M.; Kennedy, Ryan C.; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V.; Elsik, Christine G.; Graur, Dan; Hill, Catherine A.; Veenstra, Jan A.; Walenz, Brian; Tubío, José Manuel C.; Ribeiro, José M. C.; Rozas, Julio; Johnston, J. Spencer; Reese, Justin T.; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L.; Tomoyasu, Yoshinori; Kraus, Emily; Mittapalli, Omprakash; Margam, Venu M.; Li, Hong-Mei; Meyer, Jason M.; Johnson, Reed M.; Romero-Severson, Jeanne; VanZee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G.; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M.; Yoon, Kyong S.; Strycharz, Joseph P.; Unger, Maria F.; Christley, Scott; Lobo, Neil F.; Seufferheld, Manfredo J.; Wang, NaiKuan; Dasch, Gregory A.; Struchiner, Claudio J.; Madey, Greg; Hannick, Linda I.; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C.; Cameron, Stephen; Bruggner, Robert V.; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R.; Sutton, Granger G.; Lawson, Daniel; Waterhouse, Robert M.; Venter, J. Craig; Strausberg, Robert L.; Collins, Frank H.; Zdobnov, Evgeny M.; Pittendrigh, Barry R.
2010-01-01
As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens. PMID:20566863
Annotation and sequence diversity of transposable elements in common bean (Phaseolus vulgaris).
Gao, Dongying; Abernathy, Brian; Rohksar, Daniel; Schmutz, Jeremy; Jackson, Scott A
2014-01-01
Common bean (Phaseolus vulgaris) is an important legume crop grown and consumed worldwide. With the availability of the common bean genome sequence, the next challenge is to annotate the genome and characterize functional DNA elements. Transposable elements (TEs) are the most abundant component of plant genomes and can dramatically affect genome evolution and genetic variation. Thus, it is pivotal to identify TEs in the common bean genome. In this study, we performed a genome-wide transposon annotation in common bean using a combination of homology and sequence structure-based methods. We developed a 2.12-Mb transposon database which includes 791 representative transposon sequences and is available upon request or from www.phytozome.org. Of note, nearly all transposons in the database are previously unrecognized TEs. More than 5,000 transposon-related expressed sequence tags (ESTs) were detected which indicates that some transposons may be transcriptionally active. Two Ty1-copia retrotransposon families were found to encode the envelope-like protein which has rarely been identified in plant genomes. Also, we identified an extra open reading frame (ORF) termed ORF2 from 15 Ty3-gypsy families that was located between the ORF encoding the retrotransposase and the 3'LTR. The ORF2 was in opposite transcriptional orientation to retrotransposase. Sequence homology searches and phylogenetic analysis suggested that the ORF2 may have an ancient origin, but its function is not clear. These transposon data provide a useful resource for understanding the genome organization and evolution and may be used to identify active TEs for developing transposon-tagging system in common bean and other related genomes.
USDA-ARS's Scientific Manuscript database
Apple gene MDP0000136494 was identified as the only LysM containing protein encoding gene which was specifically up-regulated in P. ultimum infected apple root by a previous transcriptome analysis. In current study, the functional identity of MDP0000136494 was investigated using combined genomic, tr...
Márquez, Edna J; Castro, Erick R; Alzate, Juan F
2016-01-01
The queen conch Strombus gigas is an endangered marine gastropod of significant economic importance across the Greater Caribbean region. This work reports for the first time the complete mitochondrial genome of S. gigas, obtained by FLX 454 pyrosequencing. The mtDNA genome encodes for 13 proteins, 22 tRNAs and 2 ribosomal RNAs. In addition, the coding sequences and gene synteny were similar to other previously reported mitogenomes of gastropods.
Hernandez-Maldonado, Jaime; Stoneburner, Brendon; Boren, Alison; Miller, Laurence; Rosen, Michael R.; Oremland, Ronald S.; Saltikov, Chad W
2016-01-01
The full genome sequence of Ectothiorhodospira sp. strain BSL-9 is reported here. This purple sulfur bacterium encodes an arxA-type arsenite oxidase within the arxB2AB1CD gene island and is capable of carrying out “photoarsenotrophy,” that is, anoxygenic photosynthetic arsenite oxidation. Its genome is composed of 3.5 Mb and has approximately 63% G+C content.
Reysenbach, Anna-Louise; Donaho, John A; Kelley, John F; St John, Emily; Turner, Christina; Podar, Mircea; Stott, Matthew B
2018-03-15
A draft genome of a novel Dictyoglomus sp., NZ13-RE01, was obtained from a New Zealand hot spring enrichment culture. The 1,927,012-bp genome is similar in both size and G+C content to other Dictyoglomus spp. Like its relatives, Dictyoglomus sp. NZ13-RE01 encodes many genes involved in complex carbohydrate metabolism. Copyright © 2018 Reysenbach et al.
Endoribonuclease type II toxin-antitoxin systems: functional or selfish?
Ramisetty, Bhaskar Chandra Mohan; Santhosh, Ramachandran Sarojini
2017-07-01
Most bacterial genomes have multiple type II toxin-antitoxin systems (TAs) that encode two proteins which are referred to as a toxin and an antitoxin. Toxins inhibit a cellular process, while the interaction of the antitoxin with the toxin attenuates the toxin's activity. Endoribonuclease-encoding TAs cleave RNA in a sequence-dependent fashion, resulting in translational inhibition. To account for their prevalence and retention by bacterial genomes, TAs are credited with clinically significant phenomena, such as bacterial programmed cell death, persistence, biofilms and anti-addiction to plasmids. However, the programmed cell death and persistence hypotheses have been challenged because of conceptual, methodological and/or strain issues. In an alternative view, chromosomal TAs seem to be retained by virtue of addiction at two levels: via a poison-antidote combination (TA proteins) and via transcriptional reprogramming of the downstream core gene (due to integration). Any perturbation in the chromosomal TA operons could cause fitness loss due to polar effects on the downstream genes and hence be detrimental under natural conditions. The endoribonucleases encoding chromosomal TAs are most likely selfish DNA as they are retained by bacterial genomes, even though TAs do not confer a direct advantage via the TA proteins. TAs are likely used by various replicons as 'genetic arms' that allow the maintenance of themselves and associated genetic elements. TAs seem to be the 'selfish arms' that make the best use of the 'arms race' between bacterial genomes and plasmids.
Zhang, Jin; Ruhlman, Tracey A; Sabir, Jamal S M; Blazier, John Chris; Weng, Mao-Lun; Park, Seongjun; Jansen, Robert K
2016-02-17
Disruption of DNA replication, recombination, and repair (DNA-RRR) systems has been hypothesized to cause highly elevated nucleotide substitution rates and genome rearrangements in the plastids of angiosperms, but this theory remains untested. To investigate nuclear-plastid genome (plastome) coevolution in Geraniaceae, four different measures of plastome complexity (rearrangements, repeats, nucleotide insertions/deletions, and substitution rates) were evaluated along with substitution rates of 12 nuclear-encoded, plastid-targeted DNA-RRR genes from 27 Geraniales species. Significant correlations were detected for nonsynonymous (dN) but not synonymous (dS) substitution rates for three DNA-RRR genes (uvrB/C, why1, and gyrA) supporting a role for these genes in accelerated plastid genome evolution in Geraniaceae. Furthermore, correlation between dN of uvrB/C and plastome complexity suggests the presence of nucleotide excision repair system in plastids. Significant correlations were also detected between plastome complexity and 13 of the 90 nuclear-encoded organelle-targeted genes investigated. Comparisons revealed significant acceleration of dN in plastid-targeted genes of Geraniales relative to Brassicales suggesting this correlation may be an artifact of elevated rates in this gene set in Geraniaceae. Correlation between dN of plastid-targeted DNA-RRR genes and plastome complexity supports the hypothesis that the aberrant patterns in angiosperm plastome evolution could be caused by dysfunction in DNA-RRR systems. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Kobayashi, Michie; Hiraka, Yukie; Abe, Akira; Yaegashi, Hiroki; Natsume, Satoshi; Kikuchi, Hideko; Takagi, Hiroki; Saitoh, Hiromasa; Win, Joe; Kamoun, Sophien; Terauchi, Ryohei
2017-11-22
Downy mildew, caused by the oomycete pathogen Sclerospora graminicola, is an economically important disease of Gramineae crops including foxtail millet (Setaria italica). Plants infected with S. graminicola are generally stunted and often undergo a transformation of flower organs into leaves (phyllody or witches' broom), resulting in serious yield loss. To establish the molecular basis of downy mildew disease in foxtail millet, we carried out whole-genome sequencing and an RNA-seq analysis of S. graminicola. Sequence reads were generated from S. graminicola using an Illumina sequencing platform and assembled de novo into a draft genome sequence comprising approximately 360 Mbp. Of this sequence, 73% comprised repetitive elements, and a total of 16,736 genes were predicted from the RNA-seq data. The predicted genes included those encoding effector-like proteins with high sequence similarity to those previously identified in other oomycete pathogens. Genes encoding jacalin-like lectin-domain-containing secreted proteins were enriched in S. graminicola compared to other oomycetes. Of a total of 1220 genes encoding putative secreted proteins, 91 significantly changed their expression levels during the infection of plant tissues compared to the sporangia and zoospore stages of the S. graminicola lifecycle. We established the draft genome sequence of a downy mildew pathogen that infects Gramineae plants. Based on this sequence and our transcriptome analysis, we generated a catalog of in planta-induced candidate effector genes, providing a solid foundation from which to identify the effectors causing phyllody.
Tomazetto, Geizecler; Hahnke, Sarah; Wibberg, Daniel; Pühler, Alfred; Klocke, Michael; Schlüter, Andreas
2018-06-01
Proteiniphilum saccharofermentans str. M3/6T is a recently described species within the family Porphyromonadaceae (phylum Bacteroidetes), which was isolated from a mesophilic laboratory-scale biogas reactor. The genome of the strain was completely sequenced and manually annotated to reconstruct its metabolic potential regarding biomass degradation and fermentation pathways. The P. saccharofermentans str. M3/6T genome consists of a 4,414,963 bp chromosome featuring an average GC-content of 43.63%. Genome analyses revealed that the strain possesses 3396 protein-coding sequences. Among them are 158 genes assigned to the carbohydrate-active-enzyme families as defined by the CAZy database, including 116 genes encoding glycosyl hydrolases (GHs) involved in pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), starch, fructan and chitin degradation. The strain also features several transporter genes, some of which are located in polysaccharide utilization loci (PUL). PUL gene products are involved in glycan binding, transport and utilization at the cell surface. In the genome of strain M3/6T, 64 PUL are present and most of them in association with genes encoding carbohydrate-active enzymes. Accordingly, the strain was predicted to metabolize several sugars yielding carbon dioxide, hydrogen, acetate, formate, propionate and isovalerate as end-products of the fermentation process. Moreover, P. saccharofermentans str. M3/6T encodes extracellular and intracellular proteases and transporters predicted to be involved in protein and oligopeptide degradation. Comparative analyses between P. saccharofermentans str. M3/6T and its closest described relative P. acetatigenes str. DSM 18083T indicate that both strains share a similar metabolism regarding decomposition of complex carbohydrates and fermentation of sugars.
Baquerizo-Audiot, Elizabeth; Abd-Alla, Adly; Jousset, Françoise-Xavière; Cousserans, François; Tijssen, Peter; Bergoin, Max
2009-07-01
The genome of all densoviruses (DNVs) so far isolated from mosquitoes or mosquito cell lines consists of a 4-kb single-stranded DNA molecule with a monosense organization (genus Brevidensovirus, subfamily Densovirinae). We previously reported the isolation of a Culex pipiens DNV (CpDNV) that differs significantly from brevidensoviruses by (i) having an approximately 6-kb genome, (ii) lacking sequence homology, and (iii) lacking antigenic cross-reactivity with Brevidensovirus capsid polypeptides. We report here the sequence organization and transcription map of this virus. The cloned genome of CpDNV is 5,759 nucleotides (nt) long, and it possesses an inverted terminal repeat (ITR) of 285 nt and an ambisense organization of its genes. The nonstructural (NS) proteins NS-1, NS-2, and NS-3 are located in the 5' half of one strand and are organized into five open reading frames (ORFs) due to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding capsid polypeptides is located in the 5' half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively. The two NS mRNA species are spliced off a 53-nt sequence. Capsid proteins are translated from an unspliced 2.3-kb mRNA driven by the P88 promoter. CpDNV thus appears as a new type of mosquito DNV, and based on the overall organization and expression modalities of its genome, it may represent the prototype of a new genus of DNV.
Yamaguchi, Kazuaki; Chijiwa, Takahito; Ikeda, Naoki; Shibata, Hiroki; Fukumaki, Yasuyuki; Oda-Ueda, Naoko; Hattori, Shosaku; Ohno, Motonori
2014-01-01
The genes encoding group IIE phospholipase A2, abbreviated as IIE PLA2, and its 5' and 3' flanking regions of Crotalinae snakes such as Protobothrops flavoviridis, P. tokarensis, P. elegans, and Ovophis okinavensis, were found and sequenced. The genes consisted of four exons and three introns and coded for 22 or 24 amino acid residues of the signal peptides and 134 amino acid residues of the mature proteins. These IIE PLA2s show high similarity to those from mammals and Colubridae snakes. The high expression level of IIE PLA2s in Crotalinae venom glands suggests that they should work as venomous proteins. The blast analysis indicated that the gene encoding OTUD3, which is ovarian tumor domain-containing protein 3, is located in the 3' downstream of IIE PLA2 gene. Moreover, a group IIA PLA2 gene was found in the 5' upstream of IIE PLA2 gene linked to the OTUD3 gene (OTUD3) in the P. flavoviridis genome. It became evident that the specified arrangement of IIA PLA2 gene, IIE PLA2 gene, and OTUD3 in this order is common in the genomes of humans to snakes. The present finding that the genes encoding various secretory PLA2s form a cluster in the genomes of humans to birds is closely related to the previous finding that six venom PLA2 isozyme genes are densely clustered in the so-called NIS-1 fragment of the P. flavoviridis genome. It is also suggested that venom IIA PLA2 genes may be evolutionarily derived from the IIE PLA2 gene. PMID:25529307
EGASP: the human ENCODE Genome Annotation Assessment Project
Guigó, Roderic; Flicek, Paul; Abril, Josep F; Reymond, Alexandre; Lagarde, Julien; Denoeud, France; Antonarakis, Stylianos; Ashburner, Michael; Bajic, Vladimir B; Birney, Ewan; Castelo, Robert; Eyras, Eduardo; Ucla, Catherine; Gingeras, Thomas R; Harrow, Jennifer; Hubbard, Tim; Lewis, Suzanna E; Reese, Martin G
2006-01-01
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusion: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence. PMID:16925836
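As a rough illustration of the nucleotide-level measures quoted in the EGASP abstract above, the following Python sketch computes sensitivity and specificity from sets of coding positions. It assumes the convention common in gene-prediction evaluation, in which specificity is the fraction of predicted coding nucleotides that are also annotated as coding, TP / (TP + FP); the abstract itself does not spell out its definitions. The function names and the toy exon coordinates are hypothetical and are not taken from the EGASP data.

```python
def positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    return {p for start, end in exons for p in range(start, end + 1)}


def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) for two sets of coding positions.

    Assumed definitions (gene-prediction convention, not stated in the abstract):
      sensitivity = TP / (TP + FN)  -> share of annotated coding nt recovered
      specificity = TP / (TP + FP)  -> share of predicted coding nt that are annotated
    """
    tp = len(annotated & predicted)   # coding in both
    fn = len(annotated - predicted)   # annotated but missed
    fp = len(predicted - annotated)   # predicted but not annotated
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy example with made-up coordinates: 200 annotated and 200 predicted
# coding nucleotides, of which 190 overlap.
annotated = positions([(100, 199), (300, 399)])
predicted = positions([(110, 199), (300, 409)])
sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

With these toy intervals both measures come out at 0.95. Working on sets of positions keeps the sketch short; for chromosome-scale annotations an interval-based overlap count would be the more practical choice.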
Beres, Stephen B; Sylva, Gail L; Barbian, Kent D; Lei, Benfang; Hoff, Jessica S; Mammarella, Nicole D; Liu, Meng-Yao; Smoot, James C; Porcella, Stephen F; Parkins, Larye D; Campbell, David S; Smith, Todd M; McCormick, John K; Leung, Donald Y M; Schlievert, Patrick M; Musser, James M
2002-07-23
Genome sequences are available for many bacterial strains, but there has been little progress in using these data to understand the molecular basis of pathogen emergence and differences in strain virulence. Serotype M3 strains of group A Streptococcus (GAS) are a common cause of severe invasive infections with unusually high rates of morbidity and mortality. To gain insight into the molecular basis of this high-virulence phenotype, we sequenced the genome of strain MGAS315, an organism isolated from a patient with streptococcal toxic shock syndrome. The genome is composed of 1,900,521 bp, and it shares approximately 1.7 Mb of related genetic material with genomes of serotype M1 and M18 strains. Phage-like elements account for the great majority of variation in gene content relative to the sequenced M1 and M18 strains. Recombination produces chimeric phages and strains with previously uncharacterized arrays of virulence factor genes. Strain MGAS315 has phage genes that encode proteins likely to contribute to pathogenesis, such as streptococcal pyrogenic exotoxin A (SpeA) and SpeK, streptococcal superantigen (SSA), and a previously uncharacterized phospholipase A(2) (designated Sla). Infected humans had anti-SpeK, -SSA, and -Sla antibodies, indicating that these GAS proteins are made in vivo. SpeK and SSA were pyrogenic and toxic for rabbits. Serotype M3 strains with the phage-encoded speK and sla genes increased dramatically in frequency late in the 20th century, commensurate with the rise in invasive disease caused by M3 organisms. Taken together, the results show that phage-mediated recombination has played a critical role in the emergence of a new, unusually virulent clone of serotype M3 GAS.
Di Pierro, Michele; Cheng, Ryan R; Lieberman Aiden, Erez; Wolynes, Peter G; Onuchic, José N
2017-11-14
Inside the cell nucleus, genomes fold into organized structures that are characteristic of cell type. Here, we show that this chromatin architecture can be predicted de novo using epigenetic data derived from chromatin immunoprecipitation-sequencing (ChIP-Seq). We exploit the idea that chromosomes encode a 1D sequence of chromatin structural types. Interactions between these chromatin types determine the 3D structural ensemble of chromosomes through a process similar to phase separation. First, a neural network is used to infer the relation between the epigenetic marks present at a locus, as assayed by ChIP-Seq, and the genomic compartment in which those loci reside, as measured by DNA-DNA proximity ligation (Hi-C). Next, types inferred from this neural network are used as an input to an energy landscape model for chromatin organization [Minimal Chromatin Model (MiChroM)] to generate an ensemble of 3D chromosome conformations at a resolution of 50 kilobases (kb). After training the model, dubbed Maximum Entropy Genomic Annotation from Biomarkers Associated to Structural Ensembles (MEGABASE), on odd-numbered chromosomes, we predict the sequences of chromatin types and the subsequent 3D conformational ensembles for the even chromosomes. We validate these structural ensembles by using ChIP-Seq tracks alone to predict Hi-C maps, as well as distances measured using 3D fluorescence in situ hybridization (FISH) experiments. Both sets of experiments support the hypothesis of phase separation being the driving process behind compartmentalization. These findings strongly suggest that epigenetic marking patterns encode sufficient information to determine the global architecture of chromosomes and that de novo structure prediction for whole genomes may be increasingly possible. Copyright © 2017 the Author(s). Published by PNAS.
Principles of metadata organization at the ENCODE data coordination center
Hong, Eurie L.; Sloan, Cricket A.; Chan, Esther T.; Davidson, Jean M.; Malladi, Venkat S.; Strattan, J. Seth; Hitz, Benjamin C.; Gabdank, Idan; Narayanan, Aditi K.; Ho, Marcus; Lee, Brian T.; Rowe, Laurence D.; Dreszer, Timothy R.; Roe, Greg R.; Podduturi, Nikhil R.; Tanaka, Forrest; Hilton, Jason A.; Cherry, J. Michael
2016-01-01
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org PMID:26980513
USDA-ARS's Scientific Manuscript database
Introduction: Previous studies in Cronobacter sakazakii, Klebsiella spp., and Escherichia coli have identified a genomic island that confers thermotolerance to its hosts. This island has recently been identified in Salmonella enterica serovar Senftenberg ATCC 43845, a historically important, heat ...
Were protein internal repeats formed by "bricolage"?
Lavorgna, G; Patthy, L; Boncinelli, E
2001-03-01
Is evolution an engineer, or is it a tinkerer--a "bricoleur"--building up complex molecules in organisms by increasing and adapting the materials at hand? An analysis of completely sequenced genomes suggests the latter, showing that increasing repetition of modules within the proteins encoded by these genomes is correlated with increasing complexity of the organism.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghodhbane-Gtari, Faten; Beauchemin, Nicholas; Gueddou, Abdellatif
Nocardia sp. strain BMG111209 is a non-Frankia actinobacterium isolated from root nodules of Casuarina glauca in Tunisia. Here, we report the 9.1-Mbp draft genome sequence of Nocardia sp. strain BMG111209 with a G + C content of 69.19% and 8,122 candidate protein-encoding genes.
USDA-ARS's Scientific Manuscript database
The P. ultimum DAOM BR144 (=CBS 805.95 = ATCC200006) genome (42.8 Mb) encodes 15,290 genes, and has extensive sequence similarity and synteny with related Phytophthora spp., including the potato late blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86 % o...
Plant Breeding Goes Microbial.
Wei, Zhong; Jousset, Alexandre
2017-07-01
Plant breeding has traditionally improved traits encoded in the plant genome. Here we propose an alternative framework reaching novel phenotypes by modifying together genomic information and plant-associated microbiota. This concept is made possible by a novel technology that enables the transmission of endophytic microbiota to the next plant generation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Polycipiviridae: a proposed new family of polycistronic picorna-like RNA viruses
USDA-ARS's Scientific Manuscript database
Solenopsis invicta virus 2 is a single-stranded positive-sense picorna-like RNA virus with an unusual genome structure. The monopartite genome of approximately 11 kb contains four short open reading frames in its 5' one third, three of which encode proteins with homology to picornavirus-like jelly-r...
Ghodhbane-Gtari, Faten; Beauchemin, Nicholas; Gueddou, Abdellatif; ...
2016-08-04
Nocardia sp. strain BMG111209 is a non-Frankia actinobacterium isolated from root nodules of Casuarina glauca in Tunisia. Here, we report the 9.1-Mbp draft genome sequence of Nocardia sp. strain BMG111209 with a G + C content of 69.19% and 8,122 candidate protein-encoding genes.
APOLLO Network | Office of Cancer Clinical Proteomics Research
The Applied Proteogenomics OrganizationaL Learning and Outcomes (APOLLO) network is a collaboration between NCI, the Department of Defense (DoD), and the Department of Veterans Affairs (VA) to incorporate proteogenomics into patient care as a way of looking beyond the genome, to the activity and expression of the proteins that the genome encodes.
Transposases are the most abundant, most ubiquitous genes in nature.
Aziz, Ramy K; Breitbart, Mya; Edwards, Robert A
2010-07-01
Genes, like organisms, struggle for existence, and the most successful genes persist and widely disseminate in nature. The unbiased determination of the most successful genes requires access to sequence data from a wide range of phylogenetic taxa and ecosystems, which has finally become achievable thanks to the deluge of genomic and metagenomic sequences. Here, we analyzed 10 million protein-encoding genes and gene tags in sequenced bacterial, archaeal, eukaryotic and viral genomes and metagenomes, and our analysis demonstrates that genes encoding transposases are the most prevalent genes in nature. The finding that these genes, classically considered as selfish genes, outnumber essential or housekeeping genes suggests that they offer selective advantage to the genomes and ecosystems they inhabit, a hypothesis in agreement with an emerging body of literature. Their mobile nature not only promotes dissemination of transposable elements within and between genomes but also leads to mutations and rearrangements that can accelerate biological diversification and--consequently--evolution. By securing their own replication and dissemination, transposases guarantee to thrive so long as nucleic acid-based life forms exist.
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment.
Yao, Yao; Storme, Veronique; Marchal, Kathleen; Van de Peer, Yves
2016-01-01
We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population.
Batchu, Navish Kumar; Khater, Shradha; Patil, Sonal; Nagle, Vinod; Das, Gautam; Bhadra, Bhaskar; Sapre, Ajit; Dasgupta, Santanu
2018-03-05
A filamentous cyanobacterium, Geitlerinema sp. FC II, was isolated from a marine algae culture pond at Reliance Industries Limited (RIL), India. The 6.7 Mb draft genome of FC II encodes 6697 protein-coding genes. Analysis of the whole genome sequence revealed the presence of a nif gene cluster, supporting its capability to fix atmospheric nitrogen. The FC II genome contains two variants of sulfide:quinone oxidoreductases (SQR), which is a crucial electron donor in cyanobacterial metabolic processes. FC II is characterized by the presence of multiple CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats - CRISPR-associated proteins) clusters, multiple variants of genes encoding photosystem reaction centres, and biosynthetic gene clusters of alkanes, polyketides and non-ribosomal peptides. The presence of these pathways will help FC II gain an ecological advantage over other strains for biomass production in large-scale cultivation systems. Hence, FC II may be used for production of biofuel and other industrially important metabolites. Copyright © 2018 Elsevier Inc. All rights reserved.
Bottacini, Francesca; Morrissey, Ruth; Roberts, Richard John; James, Kieran; van Breen, Justin; Egan, Muireann; Lambert, Jolanda; van Limpt, Kees; Knol, Jan; Motherway, Mary O’Connell; van Sinderen, Douwe
2018-01-01
Abstract Bifidobacterium breve represents one of the most abundant bifidobacterial species in the gastro-intestinal tract of breast-fed infants, where their presence is believed to exert beneficial effects. In the present study whole genome sequencing, employing the PacBio Single Molecule, Real-Time (SMRT) sequencing platform, combined with comparative genome analysis allowed the most extensive genetic investigation of this taxon. Our findings demonstrate that genes encoding Restriction/Modification (R/M) systems constitute a substantial part of the B. breve variable gene content (or variome). Using the methylome data generated by SMRT sequencing, combined with targeted Illumina bisulfite sequencing (BS-seq) and comparative genome analysis, we were able to detect methylation recognition motifs and assign these to identified B. breve R/M systems, where in several cases such assignments were confirmed by restriction analysis. Furthermore, we show that R/M systems typically impose a very significant barrier to genetic accessibility of B. breve strains, and that cloning of a methyltransferase-encoding gene may overcome such a barrier, thus allowing future functional investigations of members of this species. PMID:29294107
Emergent adaptive behaviour of GRN-controlled simulated robots in a changing environment
Yao, Yao; Storme, Veronique; Marchal, Kathleen
2016-01-01
We developed a bio-inspired robot controller combining an artificial genome with an agent-based control system. The genome encodes a gene regulatory network (GRN) that is switched on by environmental cues and, following the rules of transcriptional regulation, provides output signals to actuators. Whereas the genome represents the full encoding of the transcriptional network, the agent-based system mimics the active regulatory network and signal transduction system also present in naturally occurring biological systems. Using such a design that separates the static from the conditionally active part of the gene regulatory network contributes to a better general adaptive behaviour. Here, we have explored the potential of our platform with respect to the evolution of adaptive behaviour, such as preying when food becomes scarce, in a complex and changing environment and show through simulations of swarm robots in an A-life environment that evolution of collective behaviour likely can be attributed to bio-inspired evolutionary processes acting at different levels, from the gene and the genome to the individual robot and robot population. PMID:28028477
Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM)
Skinnider, Michael A.; Dejong, Chris A.; Rees, Philip N.; Johnston, Chad W.; Li, Haoxin; Webster, Andrew L. H.; Wyatt, Morgan A.; Magarvey, Nathan A.
2015-01-01
Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. PMID:26442528
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shah, Bhumika S.; Tetu, Sasha G.; Harrop, Stephen J.
2014-09-25
The structure of a short-chain dehydrogenase encoded within genomic islands of A. baumannii strains has been solved to 2.4 Å resolution. This classical SDR incorporates a flexible helical subdomain. The NADP-binding site and catalytic side chains are identified. Over 15% of the genome of an Australian clinical isolate of Acinetobacter baumannii occurs within genomic islands. An uncharacterized protein encoded within one island feature common to this and other International Clone II strains has been studied by X-ray crystallography. The 2.4 Å resolution structure of SDR-WM99c reveals it to be a new member of the classical short-chain dehydrogenase/reductase (SDR) superfamily. The enzyme contains a nucleotide-binding domain and, like many other SDRs, is tetrameric in form. The active site contains a catalytic tetrad (Asn117, Ser146, Tyr159 and Lys163) and water molecules occupying the presumed NADP cofactor-binding pocket. An adjacent cleft is capped by a relatively mobile helical subdomain, which is well positioned to control substrate access.