Sequence analysis of 497 mouse brain ESTs expressed in the substantia nigra
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, G.J.; Savioz, A.; Davies, R.W.
1997-01-15
The use of subtracted, region-specific cDNA libraries combined with single-pass cDNA sequencing allows the discovery of novel genes and facilitates molecular description of the tissue or region involved. We report the sequence of 497 mouse expressed sequence tags (ESTs) from two subtracted libraries enriched for cDNAs expressed in the substantia nigra, a brain region with important roles in movement control and Parkinson disease. Of these, 238 ESTs give no database matches and therefore derive from novel genes. A further 115 ESTs show sequence similarity to ESTs from other organisms, which themselves do not yield any significant database matches to genes of known function. Fifty-six ESTs show sequence similarity to previously identified genes whose mouse homologues have not been reported. The total number of ESTs reported that are new for the mouse is 407, which, together with the 90 ESTs corresponding to known mouse genes or cDNAs, contributes to the molecular description of the substantia nigra. 21 refs., 4 tabs.
Escherichia coli K-12: a cooperatively developed annotation snapshot—2005
Riley, Monica; Abe, Takashi; Arnaud, Martha B.; Berlyn, Mary K.B.; Blattner, Frederick R.; Chaudhuri, Roy R.; Glasner, Jeremy D.; Horiuchi, Takashi; Keseler, Ingrid M.; Kosuge, Takehide; Mori, Hirotada; Perna, Nicole T.; Plunkett, Guy; Rudd, Kenneth E.; Serres, Margrethe H.; Thomas, Gavin H.; Thomson, Nicholas R.; Wishart, David; Wanner, Barry L.
2006-01-01
The goal of this group project has been to coordinate and bring up-to-date information on all genes of Escherichia coli K-12. Annotation of the genome of an organism entails identification of genes, the boundaries of genes in terms of precise start and end sites, and description of the gene products. Known and predicted functions were assigned to each gene product on the basis of experimental evidence or sequence analysis. Since both kinds of evidence are constantly expanding, no annotation is complete at any moment in time. This is a snapshot analysis based on the most recent genome sequences of two E.coli K-12 bacteria. An accurate and up-to-date description of E.coli K-12 genes is of particular importance to the scientific community because experimentally determined properties of its gene products provide fundamental information for annotation of innumerable genes of other organisms. Availability of the complete genome sequence of two K-12 strains allows comparison of their genotypes and mutant status of alleles. PMID:16397293
Premzl, Marko
2015-01-01
Using eutherian comparative genomic analysis protocol and public genomic sequence data sets, the present work attempted to update and revise two gene data sets. The most comprehensive third party annotation gene data sets of eutherian adenohypophysis cystine-knot genes (128 complete coding sequences), and d-dopachrome tautomerases and macrophage migration inhibitory factor genes (30 complete coding sequences) were annotated. For example, the present study first described primate-specific cystine-knot Prometheus genes, as well as differential gene expansions of D-dopachrome tautomerase genes. Furthermore, new frameworks of future experiments of two eutherian gene data sets were proposed. PMID:25941635
EuroPineDB: a high-coverage web database for maritime pine transcriptome
2011-01-01
Background Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases. Description EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be retrieved by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided.
Conclusions The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome. PMID:21762488
Human Chromosome 7: DNA Sequence and Biology
Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.; Fernandez, Bridget A.; Kanematsu, Emiko; Gentles, Simone; Christopoulos, Constantine C.; Choufani, Sanaa; Kwasnicka, Dorota; Zheng, Xiangqun H.; Lai, Zhongwu; Nusskern, Deborah; Zhang, Qing; Gu, Zhiping; Lu, Fu; Zeesman, Susan; Nowaczyk, Malgorzata J.; Teshima, Ikuko; Chitayat, David; Shuman, Cheryl; Weksberg, Rosanna; Zackai, Elaine H.; Grebe, Theresa A.; Cox, Sarah R.; Kirkpatrick, Susan J.; Rahman, Nazneen; Friedman, Jan M.; Heng, Henry H. Q.; Pelicci, Pier Giuseppe; Lo-Coco, Francesco; Belloni, Elena; Shaffer, Lisa G.; Pober, Barbara; Morton, Cynthia C.; Gusella, James F.; Bruns, Gail A. P.; Korf, Bruce R.; Quade, Bradley J.; Ligon, Azra H.; Ferguson, Heather; Higgins, Anne W.; Leach, Natalia T.; Herrick, Steven R.; Lemyre, Emmanuelle; Farra, Chantal G.; Kim, Hyung-Goo; Summers, Anne M.; Gripp, Karen W.; Roberts, Wendy; Szatmari, Peter; Winsor, Elizabeth J. T.; Grzeschik, Karl-Heinz; Teebi, Ahmed; Minassian, Berge A.; Kere, Juha; Armengol, Lluis; Pujana, Miguel Angel; Estivill, Xavier; Wilson, Michael D.; Koop, Ben F.; Tosi, Sabrina; Moore, Gudrun E.; Boright, Andrew P.; Zlotorynski, Eitan; Kerem, Batsheva; Kroisel, Peter M.; Petek, Erwin; Oscier, David G.; Mould, Sarah J.; Döhner, Hartmut; Döhner, Konstanze; Rommens, Johanna M.; Vincent, John B.; Venter, J. Craig; Li, Peter W.; Mural, Richard J.; Adams, Mark D.; Tsui, Lap-Chee
2010-01-01
DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate genes for developmental diseases including autism. PMID:12690205
USDA-ARS's Scientific Manuscript database
In phylogenetic analyses of the genus Streptomyces using 16S rRNA gene sequences, Streptomyces albus subsp. albus NRRL B-1811T forms a cluster with 5 other species having identical or nearly identical 16S rRNA gene sequences. Moreover, the morphological and physiological characteristics of these oth...
GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts
Naito, Yuki; Bono, Hidemasa
2012-01-01
GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users. PMID:22641850
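The mismatch-tolerant, degenerate-code search described in this abstract can be illustrated with a minimal sketch. The code below is a naive sliding-window scan written for clarity only; it is an assumption-laden illustration and not how the GGRNA server is implemented. The function name `find_matches` and its parameters are hypothetical; only the standard IUPAC nucleotide codes are taken as given.

```python
# Standard IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_matches(query, target, max_mismatches=3):
    """Return (start, mismatches) for each window of target that matches query.

    Hypothetical helper: the query may contain degenerate codes, and a window
    is accepted when at most max_mismatches positions disagree.
    """
    query, target = query.upper(), target.upper()
    hits = []
    for start in range(len(target) - len(query) + 1):
        mismatches = 0
        window = target[start:start + len(query)]
        for q, t in zip(query, window):
            if t not in IUPAC[q]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches, abandon this window
        else:
            # Loop finished without break: window is within the mismatch budget.
            hits.append((start, mismatches))
    return hits
```

A scan like this costs time proportional to the product of query and target lengths, so a production search engine would presumably rely on an index instead; the sketch only shows what "up to three mismatches" and "degenerate codes" mean operationally.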
Emended description of Pasteuria nishizawae.
Noel, Gregory R; Atibalentja, N; Domier, Leslie L
2005-07-01
The description of the Gram-positive, obligately parasitic, mycelial and endospore-forming bacterium, Pasteuria nishizawae, is emended to include additional observations on the life cycle, host specificity and endospore morphology. The nucleotide sequence of the 16S rRNA gene is also provided.
Description of Drinking Water Bacterial Communities Using 16S rRNA Gene Sequence Analyses
Descriptions of bacterial communities inhabiting water distribution systems (WDS) have mainly been accomplished using culture-based approaches. Due to the inherent selective nature of culture-based approaches, the majority of bacteria inhabiting WDS remain uncharacterized. The go...
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.
Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki
2013-07-09
The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. 
KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with useful annotation information with easy-to-use web interfaces, which helps researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase will be continuously updated and additional genomic/transcriptomic resources and analysis tools will be provided for further efficient analysis of the mechanism of insecticide resistance and the development of effective insecticides with a novel mode of action for DBM.
2013-01-01
A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, <10 in Karlin genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups. PMID:24365132
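The thresholds listed in this abstract can be written down as a simple predicate. The sketch below is only an illustration of how the stated cut-offs combine; the class `GenomeComparison`, its field names, and the function `same_species` are hypothetical, and the input values are assumed to have been computed elsewhere by appropriate tools.

```python
from dataclasses import dataclass


@dataclass
class GenomeComparison:
    """Pairwise metrics between two strains (hypothetical container)."""
    aai: float               # Average Amino Acid Identity, percent
    ani: float               # Average Nucleotide Identity, percent
    mlsa_identity: float     # identity over multiple-alignment genes, percent
    karlin_signature: float  # Karlin genomic signature dissimilarity
    ggdh: float              # in silico genome-to-genome hybridization, percent


def same_species(c):
    """Apply the cut-offs quoted in the abstract; all must hold at once."""
    return (
        c.aai > 95.0
        and c.ani > 95.0
        and c.mlsa_identity > 95.0
        and c.karlin_signature < 10.0
        and c.ggdh > 70.0
    )
```

For example, `same_species(GenomeComparison(97.0, 96.5, 98.0, 5.0, 80.0))` evaluates to `True`, while lowering `ani` to 94.0 makes it `False`. Requiring every criterion at once is one possible reading of the definition; the abstract does not say how conflicting metrics should be resolved.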
Bardet, Lucie; Cimmino, Teresa; Buffet, Clémence; Michelle, Caroline; Rathored, Jaishriram; Tandina, Fatalmoudou; Lagier, Jean-Christophe; Khelaifia, Saber; Abrahão, Jônatas; Raoult, Didier; Rolain, Jean-Marc
2018-02-01
Culturomics is a new postgenomics field that explores the microbial diversity of the human gut coupled with taxono-genomic strategy. Culturomics, and the microbiome science more generally, are anticipated to transform global health diagnostics and inform the ways in which gut microbial diversity contributes to human health and disease, and by extension, to personalized medicine. Using culturomics, we report in this study the description of strain CB1 T ( = CSUR P1334 = DSM 29075), a new species isolated from a stool specimen from a 37-year-old Brazilian woman. This description includes phenotypic characteristics and complete genome sequence and annotation. Strain CB1 T is a gram-negative aerobic and motile bacillus, exhibits neither catalase nor oxidase activities, and presents a 98.3% 16S rRNA sequence similarity with Pseudomonas putida. The 4,723,534 bp long genome contains 4239 protein-coding genes and 74 RNA genes, including 15 rRNA genes (5 16S rRNA, 4 23S rRNA, and 6 5S rRNA) and 59 tRNA genes. Strain CB1 T was named Pseudomonas massiliensis sp. nov. and classified into the family Pseudomonadaceae. This study demonstrates the usefulness of microbial culturomics in exploration of human microbiota in diverse geographies and offers new promise for incorporating new omics technologies for innovation in diagnostic medicine and global health.
Gene: a gene-centered information resource at NCBI.
Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D
2015-01-01
The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
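As a rough illustration of the programmatic access mentioned above, the following sketch builds E-utilities request URLs for the Gene database using only the Python standard library. The helper names `gene_esearch_url` and `gene_esummary_url` are hypothetical, the example search term is illustrative, and no network request is made; NCBI's usage guidelines (rate limits, API keys) are not handled here.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esearch_url(term, retmax=20):
    """Build an ESearch URL that looks up Gene IDs for a query term."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def gene_esummary_url(gene_ids):
    """Build an ESummary URL for a list of integer Gene IDs."""
    params = {
        "db": "gene",
        "id": ",".join(str(i) for i in gene_ids),
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"
```

The two-step pattern (search for identifiers, then fetch summaries by identifier) reflects the abstract's point that Gene records are keyed by unique, tracked integers.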
COGNATE: comparative gene annotation characterizer.
Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver
2017-07-17
The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. 
COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https://github.com/ZFMK/COGNATE ). The tool COGNATE allows comparing genome assemblies and structural elements on multiples levels (e.g., scaffold or contig sequence, gene). It clearly enhances comparability between analyses. Thus, COGNATE can provide the important standardization of both genome and gene structure parameter disclosure as well as data acquisition for future comparative analyses. With the establishment of comprehensive descriptive standards and the extensive availability of genomes, an encompassing database will become possible.
Thermodynamics-based models of transcriptional regulation with gene sequence.
Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing
2015-12-01
Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential-equation description of transcriptional dynamics. The sequence features of the promoter are used to derive the binding affinity on the basis of statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model is more biologically meaningful.
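To make the general idea concrete, here is a minimal textbook-style sketch of a thermodynamic occupancy term feeding a differential equation for mRNA level. It is not the authors' model: the single binding site, the equation dm/dt = alpha * occupancy - delta * m, and all names and parameter values (`alpha`, `delta`, `kT`) are assumptions chosen for illustration.

```python
import math


def occupancy(tf_conc, binding_energy, kT=1.0):
    """Equilibrium probability that a single site is bound by a factor.

    The bound state gets a Boltzmann weight tf_conc * exp(-E / kT);
    the unbound state has weight 1.
    """
    weight = tf_conc * math.exp(-binding_energy / kT)
    return weight / (1.0 + weight)


def simulate_mrna(tf_conc, binding_energy, alpha=10.0, delta=0.5,
                  m0=0.0, dt=0.01, t_end=40.0):
    """Forward-Euler integration of dm/dt = alpha * occupancy - delta * m."""
    p = occupancy(tf_conc, binding_energy)
    m = m0
    for _ in range(int(round(t_end / dt))):
        m += dt * (alpha * p - delta * m)
    return m
```

In this toy setting the steady state is alpha * occupancy / delta, so a lower (more favourable) binding energy raises occupancy and therefore expression. In a sequence-based model the binding energy would presumably be derived from the promoter sequence rather than supplied by hand.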
De novo characterization of Lentinula edodes C(91-3) transcriptome by deep Solexa sequencing.
Zhong, Mintao; Liu, Ben; Wang, Xiaoli; Liu, Lei; Lun, Yongzhi; Li, Xingyun; Ning, Anhong; Cao, Jing; Huang, Min
2013-02-01
Lentinula edodes has been used as food as well as in popular medicine; moreover, extracts isolated from its mycelium and fruiting body have shown several therapeutic properties. Yet little is understood about the genes involved in these properties, and the absence of L. edodes genomes has been a barrier to the development of functional genomics research. However, high-throughput sequencing technologies are now being widely applied to non-model species. To facilitate research on L. edodes, we leveraged Solexa sequencing technology in de novo assembly of the L. edodes C(91-3) transcriptome. In a single run, we produced more than 57 million sequencing reads. These reads were assembled into 28,923 unigene sequences (mean size = 689 bp) including 18,120 unigenes with coding sequence (CDS). Based on similarity search with known proteins, assembled unigene sequences were annotated with gene descriptions, gene ontology (GO) and clusters of orthologous group (COG) terms. Our data provide the first comprehensive sequence resource available for functional genomics studies in L. edodes, and demonstrate the utility of Illumina/Solexa sequencing for de novo transcriptome characterization and gene discovery in a non-model mushroom. Copyright © 2012 Elsevier Inc. All rights reserved.
2013-01-01
Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration does not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
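The N50 statistic quoted in the Results can be computed as sketched below. This uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length); the abstract does not state which variant the authors used, and the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This sequence pushes the cumulative length past half the total.
            return length
    raise ValueError("lengths must be non-empty")
```

For instance, for lengths 2 through 10 the total is 54, and the three longest sequences (10, 9, 8) already sum to 27, so `n50(range(2, 11))` returns 8.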
Wink, Joachim; Schumann, Peter; Atasayar, Ewelina; Klenk, Hans-Peter; Zaburannyi, Nestor; Westermann, Martin; Martin, Karin; Glaeser, Stefanie P; Kämpfer, Peter
2017-04-01
'Streptomyces caelicus' DSM 40835 was first reported as the producer of the antibiotic griselimycin by coworkers at Rhone Poulenc in 1971. The project on isolation of the antibiotic compound was stopped because of the poor solubility and selectivity of the compound towards Mycobacteria. At Sanofi-Aventis, Germany, the project was re-evaluated in 2007, and the gene cluster of griselimycin was identified and characterized, and was patented in 2013. At this time, 'S. caelicus' was an invalid name. During the strain characterization work, it was found that 'S. caelicus' belongs to the group of species of the genus Streptomyces which show an unusual heterogeneity of the 16S rRNA gene sequences. However, high 16S rRNA gene sequence similarities to Streptomyces muensis JCM 17576T and Streptomyces canchipurensis JCM 17575T were obvious. Here, we present a comparative description of 'Streptomyces caelicus' DS 9461 (=DSM 40835=NCCB 100592) with S. muensis and S. canchipurensis by use of a polyphasic taxonomy approach and additional comparison of some housekeeping genes by multilocus sequence analysis (MLSA). An emended description of Streptomyces muensis is provided as a result of this work.
Afouda, Pamela; Durand, Guillaume A; Lagier, Jean-Christophe; Labas, Noémie; Cadoret, Fréderic; Armstrong, Nicholas; Raoult, Didier; Dubourg, Grégory
2018-04-14
Intestinimonas massiliensis sp. nov. strain GD2 T is a new species of the genus Intestinimonas (the second, following Intestinimonas butyriciproducens gen. nov., sp. nov.). First isolated from the gut microbiota of a healthy subject of French origin using a culturomics approach combined with taxono-genomics, it is strictly anaerobic, nonspore-forming, rod-shaped, with catalase- and oxidase-negative reactions. Its growth was observed after preincubation in an anaerobic blood culture enriched with sheep blood (5%) and rumen fluid (5%), incubated at 37°C. Its phenotypic and genotypic descriptions are presented in this paper with a full annotation of its genome sequence. This genome is 3,104,261 bp in length and contains 3,074 predicted genes, including 3,012 protein-coding genes and 62 RNA-coding genes. Strain GD2 T produces significant amounts of butyrate and is frequently found among available 16S rRNA gene amplicon datasets, which leads to the consideration of Intestinimonas massiliensis as an important human gut commensal. © 2018 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
Alauzet, C.; Mory, F.; Teyssier, C.; Hallage, H.; Carlier, J. P.; Grollier, G.; Lozniewski, A.
2010-01-01
Nonduplicate clinical isolates of Prevotella spp. recovered from patients hospitalized between 2003 and 2006 in two French tertiary-care teaching hospitals were investigated for their susceptibility to metronidazole and the presence of nim genes. Of the 188 strains tested, 3 isolates displayed reduced susceptibility to metronidazole after 48 h of incubation, while 27 additional isolates exhibited heterogeneous resistance after prolonged incubation; all 30 of the isolates were nim negative. Among the remaining 158 isolates, 7 nim-positive isolates were detected. All of these strains were identified as Prevotella baroniae by 16S rRNA gene sequence analysis and contained a new nim gene, named nimI, as determined by DNA sequence analysis. Chromosomal localization of this single-copy gene was demonstrated in all clinical isolates as well as in type strain P. baroniae DSM 16972 by using Southern hybridization. No known associated insertion sequence elements were detected upstream of the nimI gene in any of the nim-positive strains by PCR mapping. After prolonged exposure to metronidazole, stable resistant subpopulations could be selected in nimI-positive Prevotella isolates (n = 6) as well as in nim-negative Prevotella isolates (n = 6), irrespective of their initial susceptibility to this antibiotic. This study is the first description of a new nitroimidazole resistance gene in P. baroniae which seems to be silent and which might be intrinsic in this species. Moreover, our findings highlight the fact that high-level resistance to metronidazole may be easily induced in both nim-positive and nim-negative Prevotella sp. strains. PMID:19805556
Kim, Byoung-Jun; Kim, Ga-Na; Kim, Bo-Ram; Jeon, Che Ok; Jeong, Joseph; Lee, Seon Ho; Lim, Ji-Hun; Lee, Seung-Heon; Kim, Chang Ki; Kook, Yoon-Hoh; Kim, Bum-Joon
2017-10-01
Three rapidly growing mycobacterial strains, QIA-37 T , QIA-40 and QIA-41, were isolated from the lymph nodes of three separate Korean native cattle, Hanwoo (Bos taurus coreanae). These strains were previously shown to be phylogenetically distinct but closely related to Mycobacterium chelonae ATCC 35752 T by taxonomic approaches targeting three genes (16S rRNA, hsp65 and rpoB) and were further characterized using a polyphasic approach in this study. The 16S rRNA gene sequences of all three strains showed 99.7 % sequence similarity with that of the M. chelonae type strain. A multilocus sequence typing analysis targeting 10 housekeeping genes, including hsp65 and rpoB, revealed a phylogenetic cluster of these strains with M. chelonae. DNA-DNA hybridization values of 78.2 % between QIA-37 T and M. chelonae indicated that it belongs to M. chelonae but is a novel subspecies distinct from M. chelonae. Phylogenetic analysis based on whole-genome sequences revealed a 95.44±0.06 % average nucleotide identity (ANI) value with M. chelonae, slightly higher than the 95.0 % ANI criterion for determining a novel species. In addition, distinct phenotypic characteristics such as positive growth at 37 °C, at which temperature M. chelonae does not grow, further support the taxonomic status of these strains as representatives of a novel subspecies of M. chelonae. Therefore, we propose an emended description of Mycobacterium chelonae, and descriptions of M. chelonae subsp. chelonae subsp. nov. and M. chelonae subsp. bovis subsp. nov. are presented; strains ATCC 35752 T (=CCUG 47445 T =CIP 104535 T =DSM 43804 T =JCM 6388 T =NCTC 946 T ) and QIA-37 T (=KCTC 39630 T =JCM 30986 T ) are the type strains of the two novel subspecies.
Louis, Ed
2011-01-01
In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.
Non-contiguous finished genome sequence and description of Oceanobacillus massiliensis sp. nov.
Roux, Véronique; Million, Matthieu; Robert, Catherine; Magne, Alix; Raoult, Didier
2013-01-01
Oceanobacillus massiliensis strain N’DiopT sp. nov. is the type strain of O. massiliensis sp. nov., a new species within the genus Oceanobacillus. This strain, whose genome is described here, was isolated from the fecal flora of a healthy patient. O. massiliensis is an aerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,532,675 bp long genome contains 3,519 protein-coding genes and 72 RNA genes, including between 6 and 8 rRNA operons. PMID:24976893
Kozlov, Konstantin N.; Kulakovskiy, Ivan V.; Zubair, Asif; Marjoram, Paul; Lawrie, David S.; Nuzhdin, Sergey V.; Samsonova, Maria G.
2017-01-01
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects. PMID:28898266
Detection and molecular status of Isospora sp. from the domestic pigeon (Columba livia domestica).
Matsubara, Ryuma; Fukuda, Yasuhiro; Murakoshi, Fumi; Nomura, Osamu; Suzuki, Toru; Tada, Chika; Nakai, Yutaka
2017-10-01
The domestic pigeon, Columba livia domestica, is reared for meat production, as a pet, or for racing. Few reports have characterized the parasitic protists from the genus Isospora isolated from Columbiformes. We detected Isospora-like oocysts from C. livia reared for racing. The oocyst contained two sporocysts, and each sporocyst included four sporozoites. The sporulated oocysts (n=4) were spherical; their mean diameters were 25.6 (24.0-27.2)×24.7 (23.4-26.0) μm. Micropyles, polar granules, and oocyst residuum were absent. The mean length and width of the sporocysts (n=8) were 19.5 (18.5-20.5) and 11.2 (10.2-12.1) μm, respectively. Stieda and sub-Stieda bodies were observed. Single-oocyst PCR revealed two different 18S rRNA gene sequences and one 28S rRNA gene sequence in a single oocyst of Isospora sp. Based on a phylogenetic analysis of the 18S rRNA gene, the two sequences made a group which fell within a cluster of known avian Isospora species. A tree based on the 28S rRNA gene sequence indicated that sequences from the pigeon Isospora sp. fell within a cluster of avian Isospora species. Both trees failed to clarify the phylogenetic relationships among the avian Isospora species due to limited resolution. Because the morphological description of Isospora sp. is based on only four oocysts, Isospora sp. is not proposed as a novel species here. This is the first description of Isospora sp. isolated from the domestic pigeon C. livia. Copyright © 2017 Elsevier B.V. All rights reserved.
Ntougias, Spyridon; Lapidus, Alla; Copeland, Alex; ...
2015-08-13
Members of the genus Halotalea (family Halomonadaceae) are of high significance since they can tolerate the greatest glucose and maltose concentrations ever reported for known bacteria and are involved in the degradation of industrial effluents. Here, the characteristics and the permanent-draft genome sequence and annotation of Halotalea alkalilenta AW-7T are described. The microorganism was sequenced as a part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project at the DOE Joint Genome Institute, and it is the only strain within the genus Halotalea having its genome sequenced. The genome is 4,467,826 bp long and consists of 40 scaffolds with 64.62 % average GC content. A total of 4,104 genes were predicted, comprising 4,028 protein-coding and 76 RNA genes. Most protein-coding genes (87.79 %) were assigned to a putative function. Halotalea alkalilenta AW-7T encodes the catechol and protocatechuate degradation to β-ketoadipate via the β-ketoadipate and protocatechuate ortho-cleavage degradation pathway, and it possesses the genetic ability to detoxify fluoroacetate, cyanate and acrylonitrile. Lastly, an emended description of the genus Halotalea Ntougias et al. 2007 is also provided in order to describe the delayed fermentation ability of the type strain.
Genetic Diversity of Bacterial Communities and Gene Transfer Agents in Northern South China Sea
Sun, Fu-Lin; Wang, You-Shao; Wu, Mei-Lin; Jiang, Zhao-Yu; Sun, Cui-Ci; Cheng, Hao
2014-01-01
Pyrosequencing of the 16S ribosomal RNA gene (rDNA) amplicons was performed to investigate the unique distribution of bacterial communities in northern South China Sea (nSCS) and evaluate community structure and spatial differences of bacterial diversity. Cyanobacteria, Proteobacteria, Actinobacteria, and Bacteroidetes constitute the majority of bacteria. The taxonomic description of bacterial communities revealed that more Chroococcales, SAR11 clade, Acidimicrobiales, Rhodobacterales, and Flavobacteriales are present in the nSCS waters than other bacterial groups. Rhodobacterales were less abundant in tropical water (nSCS) than in temperate and cold waters. Furthermore, the diversity of Rhodobacterales based on the gene transfer agent (GTA) major capsid gene (g5) was investigated. Four g5 gene clone libraries were constructed from samples representing different regions and yielded diverse sequences. Fourteen g5 clusters could be identified among 197 nSCS clones. These clusters were also related to known g5 sequences derived from genome-sequenced Rhodobacterales. The composition of g5 sequences in surface water varied with the g5 sequences in the sampling sites; this result indicated that the Rhodobacterales population could be highly diverse in nSCS. Phylogenetic tree analysis result indicated distinguishable diversity patterns among tropical (nSCS), temperate, and cold waters, thereby supporting the niche adaptation of specific Rhodobacterales members in unique environments. PMID:25364820
Zhang, Yongqiang; Pei, Xinwu; Zhang, Chao; Lu, Zifeng; Wang, Zhixing; Jia, Shirong; Li, Weimin
2012-01-01
Background The hypersensitive response (HR) system of Chenopodium spp. confers broad-spectrum virus resistance. However, little knowledge exists at the genomic level for Chenopodium, thus impeding the advanced molecular research of this attractive feature. Hence, we took advantage of RNA-seq to survey the foliar transcriptome of C. amaranticolor, a Chenopodium species widely used as a laboratory indicator for pathogenic viruses, in order to facilitate the characterization of the HR-type of virus resistance. Methodology and Principal Findings Using the Illumina HiSeq™ 2000 platform, we obtained 39,868,984 reads with 3,588,208,560 bp, which were assembled into 112,452 unigenes (3,847 clusters and 108,605 singletons). BlastX search against the NCBI NR database identified 61,698 sequences at an E-value cut-off of 10−5. Assembled sequences were annotated with gene descriptions, GO, COG and KEGG terms, respectively. A total of 738 resistance gene analogs (RGAs) and homologous sequences of 6 key signaling proteins within the R proteins-directed signaling pathway were identified. Based on this transcriptome data, we investigated the gene expression profiles over the stage of HR induced by Tobacco mosaic virus and Cucumber mosaic virus by using digital gene expression analysis. Numerous candidate genes specifically or commonly regulated by these two distinct viruses at early and late stages of the HR were identified, and the dynamic changes of the differentially expressed genes enriched in the pathway of plant-pathogen interaction were particularly emphasized. Conclusions To our knowledge, this study is the first description of the genetic makeup of C. amaranticolor, providing deep insight into the comprehensive gene expression information at transcriptional level in this species. The 738 RGAs as well as the differentially regulated genes, particularly the common genes regulated by both TMV and CMV, are suitable candidates which merit further functional characterization to dissect the molecular mechanisms and regulatory pathways of the HR-type of virus resistance in Chenopodium. PMID:23029338
Homophila: human disease gene cognates in Drosophila
Chien, Samson; Reiter, Lawrence T.; Bier, Ethan; Gribskov, Michael
2002-01-01
Although many human genes have been associated with genetic diseases, knowing which mutations result in disease phenotypes often does not explain the etiology of a specific disease. Drosophila melanogaster provides a powerful system in which to use genetic and molecular approaches to investigate human genetic diseases. Homophila is an intergenomic resource linking the human and fly genomes in order to stimulate functional genomic investigations in Drosophila that address questions about genetic disease in humans. Homophila provides a comprehensive linkage between the disease genes compiled in Online Mendelian Inheritance in Man (OMIM) and the complete Drosophila genomic sequence. Homophila is a relational database that allows searching based on human disease descriptions, OMIM number, human or fly gene names, and sequence similarity, and can be accessed at http://homophila.sdsc.edu. PMID:11752278
Non-contiguous finished genome sequence and description of Alistipes timonensis sp. nov.
Lagier, Jean-Christophe; Armougom, Fabrice; Mishra, Ajay Kumar; Nguyen, Thi-Tien; Raoult, Didier; Fournier, Pierre-Edouard
2012-01-01
Alistipes timonensis strain JC136T sp. nov. is the type strain of A. timonensis sp. nov., a new species within the genus Alistipes. This strain, whose genome is described here, was isolated from the fecal flora of a healthy patient. A. timonensis is an obligate anaerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,497,779 bp long genome (one chromosome but no plasmid) contains 2,742 protein-coding and 50 RNA genes, including three rRNA genes. PMID:23408657
Li, Yong; Xue, Han; Sang, Sheng-Qi; Lin, Cai-Li; Wang, Xi-Zhuo
2017-01-01
Two Gram-stain negative aerobic bacterial strains were isolated from the bark tissue of Populus × euramericana. The novel isolates were investigated using a polyphasic approach including 16S rRNA gene sequencing, genome sequencing, average nucleotide identity (ANI) and both phenotypic and chemotaxonomic assays. The genome core gene sequence and 16S rRNA gene phylogenies suggest that the novel isolates are different from the genera Snodgrassella and Stenoxybacter. Additionally, the ANI, G+C content, main fatty acids and phospholipid profile data supported the distinctiveness of the novel strain from genus Snodgrassella. Therefore, based on the data presented, the strains constitute a novel species of a novel genus within the family Neisseriaceae, for which the name Populibacter corticis gen. nov., sp. nov. is proposed. The type strain is 15-3-5T (= CFCC 13594T = KCTC 42251T).
Busse, Hans-Jürgen
2016-01-01
In this paper, the taxonomy of the genus Arthrobacter is discussed, from its first description in 1947 to the present state. Emphasis is given to intrageneric phylogeny and chemotaxonomic characteristics, concentrating on quinone systems, peptidoglycan compositions and polar lipid profiles. Internal groups within the genus Arthrobacter indicated from homogeneous chemotaxonomic traits and corresponding to phylogenetic grouping and/or high 16S rRNA gene sequence similarities are highlighted. Furthermore, polar lipid profiles and quinone systems of selected species are shown, filling some gaps concerning these chemotaxonomic traits. Based on phylogenetic groupings, 16S rRNA gene sequence similarities and homogeneity in peptidoglycan types, quinone systems and polar lipid profiles, a description of the genus Arthrobacter sensu lato and an emended description of Arthrobacter roseus are provided. Furthermore, reclassifications of selected species of the genus Arthrobacter into novel genera are proposed, namely Glutamicibacter gen. nov. (nine species), Paeniglutamicibacter gen. nov. (six species), Pseudoglutamicibacter gen. nov. (two species), Paenarthrobacter gen. nov. (six species) and Pseudarthrobacter gen. nov. (ten species).
Gene annotation from scientific literature using mappings between keyword systems.
Pérez, Antonio J; Perez-Iratxeta, Carolina; Bork, Peer; Thode, Guillermo; Andrade, Miguel A
2004-09-01
The description of genes in databases by keywords helps the non-specialist to quickly grasp the properties of a gene and increases the efficiency of computational tools that are applied to gene data (e.g. searching a gene database for sequences related to a particular biological process). However, the association of keywords to genes or protein sequences is a difficult process that ultimately implies examination of the literature related to a gene. To support this task, we present a procedure to derive keywords from the set of scientific abstracts related to a gene. Our system is based on the automated extraction of mappings between related terms from different databases using a model of fuzzy associations that can be applied with all generality to any pair of linked databases. We tested the system by annotating genes of the SWISS-PROT database with keywords derived from the abstracts linked to their entries (stored in the MEDLINE database of scientific references). The performance of the annotation procedure was much better for SWISS-PROT keywords (recall of 47%, precision of 68%) than for Gene Ontology terms (recall of 8%, precision of 67%). The algorithm can be publicly accessed and used for the annotation of sequences through a web server at http://www.bork.embl.de/kat
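The recall and precision figures quoted in the abstract above can be illustrated with a minimal sketch. It assumes keywords are compared as exact strings per gene; the function name and the example keyword sets are hypothetical and are not taken from the published system.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted vs. reference keywords."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted keywords that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference keywords that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical keywords for one gene: derived from abstracts vs. curated.
proposed = {"kinase", "ATP-binding", "membrane"}
curated = {"kinase", "ATP-binding", "transferase", "phosphorylation"}
p, r = precision_recall(proposed, curated)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three proposed keywords are correct and two of the four curated keywords are recovered, so precision exceeds recall, the same direction as the figures reported above.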
Riedel, Thomas; Spring, Stefan; Fiebig, Anne; Petersen, Jörn; Göker, Markus; Klenk, Hans-Peter
2014-06-15
Rubellimicrobium mesophilum Dastager et al. 2008 is a mesophilic and light reddish-pigmented representative of the Roseobacter group within the alphaproteobacterial family Rhodobacteraceae. Representatives of the Roseobacter group play an important role in the marine biogeochemical cycles and were found in a broad variety of marine environments associated with algal blooms, different kinds of sediments, and surfaces of invertebrates and vertebrates. Roseobacters were shown to be widely distributed, especially within the total bacterial community found in coastal waters, as well as in mixed water layers of the open ocean. Here we describe the features of R. mesophilum strain MSL-20(T) together with its genome sequence and annotation generated from a culture of DSM 19309(T). The 4,927,676 bp genome sequence consists of one chromosome and probably one extrachromosomal element. It contains 5,082 protein-coding genes and 56 RNA genes. As the previously reported G+C content differs significantly from the genome sequence-based G+C content, and as the type strain tests positively for oxidase, the species description is emended accordingly. The genome was sequenced as part of the activities of the Transregional Collaborative Research Centre 51 (TRR51) funded by the German Research Foundation (DFG).
[Advance on genome research of Yersinia pestis bacteriophage].
Tan, H L; Wang, P; Li, W
2017-04-10
Completion of the genome sequences of Yersinia pestis bacteriophages has offered researchers an unprecedented opportunity to carry out related genomic studies. This review is based on those genomic sequences and describes the essential genome features of Yersinia pestis bacteriophages from a genomic perspective. Genetic evolutionary relationships are discussed on the basis of comparative genomics. Functional descriptions derived from gene prediction and protein annotation provide evidence for further related studies.
Machado, Soraya A; Kuzmin, Yuriy; Tkach, Vasyl V; Dos Santos, Jeannie Nascimento; Gonçalves, Evonnildo Costa; de Vasconcelos Melo, Francisco Tiago
2018-05-09
A new species of the genus Serpentirhabdias Tkach, Kuzmin et Snyder, 2014, S. moi n. sp., is described from a colubroid snake Chironius exoletus from Caxiuanã National Forest, State of Pará, Brazil. The species is characterised by having a triangular oral opening, absence of the buccal capsule, presence of six minute onchia in the oesophastome, and excretory glands of approximately the same length as the oesophagus. These qualitative morphological characters, as well as some measurements, differentiate the new species from other Neotropical and Nearctic Serpentirhabdias spp. The morphological description of parasitic adults of S. moi n. sp. is complemented by the description of free-living stages including males, females, and infective larvae. Comparative analysis of partial sequences of cox1 and 12S mitochondrial genes strongly supported the status of S. moi n. sp. as a new species. Molecular phylogeny based on sequences of the nuclear DNA region spanning the 3' end of the 18S nuclear rRNA gene, ITS region (ITS1 + 5.8S + ITS2) and 5' end of the 28S gene supported monophyly of all rhabdiasid genera included in the analysis and placed the new species into the Serpentirhabdias clade as sister taxon to S. fuscovenosa. Copyright © 2018 Elsevier B.V. All rights reserved.
Sensible use of antisense: how to use oligonucleotides as research tools.
Myers, K J; Dean, N M
2000-01-01
In the past decade, there has been a vast increase in the amount of gene sequence information that has the potential to revolutionize the way diseases are both categorized and treated. Old diagnoses, largely anatomical or descriptive in nature, are likely to be superseded by the molecular characterization of the disease. The recognition that certain genes drive key disease processes will also enable the rational design of gene-specific therapeutics. Antisense oligonucleotides represent a technology that should play multiple roles in this process.
Description and physical localization of the bovine survival of motor neuron gene (SMN).
Pietrowski, D; Goldammer, T; Meinert, S; Schwerin, M; Förster, M
1998-01-01
Proximal spinal muscular atrophy (SMA) is an autosomal recessive disease in humans and other mammals, characterized by degeneration of anterior horn cells of the spinal cord. In humans, the survival of motor neuron gene (SMN) has been recognized as the SMA-determining gene and has been mapped to 5q13. In cattle, SMA is a recurrent, inherited disease that plays an important economic role in breeding programs of Brown Swiss stock. Now we have identified the full-length cDNA sequence of the bovine SMN gene. Molecular analysis and characterization of the sequence document 85% identity to its human counterpart and three evolutionarily conserved domains in different species. Physical mapping data reveal that bovine SMN is localized to chromosome region 20q12→q13, supporting the conserved synteny of this chromosomal region between humans and cattle.
76 FR 49777 - Government-Owned Inventions; Availability for Licensing
Federal Register 2010, 2011, 2012, 2013, 2014
2011-08-11
... Treatment of Melanoma Description of Technology: Using whole-exome sequencing of matched normal and.../ transcription domain-associated protein (TRRAP) gene, found the glutamate receptor ionotropic N-methyl D... therapeutic proteins that target this pathway. Potential Commercial Applications: Diagnostic array for the...
Gene Composer: database software for protein construct design, codon engineering, and gene synthesis
Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance
2009-01-01
Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. 
Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies. PMID:19383142
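The back-translation step described above (protein sequence to codon-engineered gene, subject to a minimal codon usage threshold) can be sketched as follows. This is only an illustrative simplification that picks the most frequent codon above the threshold; it is not Gene Composer's actual algorithm, and the codon frequencies, function names, and the 0.10 threshold are all hypothetical.

```python
# Hypothetical codon usage table: amino acid -> {codon: relative frequency}.
# Only a few residues are listed to keep the sketch short.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.47, "TTA": 0.14, "TTG": 0.13,
          "CTT": 0.12, "CTC": 0.10, "CTA": 0.04},
    "*": {"TAA": 0.61, "TGA": 0.30, "TAG": 0.09},
}


def back_translate(protein, usage=CODON_USAGE, min_frequency=0.10):
    """Back-translate a protein using the most frequent allowed codon."""
    codons = []
    for residue in protein:
        # Discard rare codons that fall below the usage threshold.
        allowed = {c: f for c, f in usage[residue].items()
                   if f >= min_frequency}
        if not allowed:
            raise ValueError(f"no codon for {residue!r} meets the threshold")
        codons.append(max(allowed, key=allowed.get))
    return "".join(codons)


def gc_fraction(dna):
    """Fraction of G and C bases in a DNA string."""
    return sum(base in "GC" for base in dna) / len(dna)


gene = back_translate("MKL*")
print(gene, gc_fraction(gene))
```

A real design tool would also have to balance G:C content and avoid unwanted sequence features by choosing among synonymous codons, which this greedy choice does not attempt.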
Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance
2009-04-21
To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. 
We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies.
Non contiguous-finished genome sequence and description of Enorma timonensis sp. nov.
Ramasamy, Dhamodaran; Dubourg, Gregory; Robert, Catherine; Caputo, Aurelia; Papazian, Laurent; Raoult, Didier; Fournier, Pierre-Edouard
2014-01-01
Enorma timonensis strain GD5T sp. nov., is the type strain of E. timonensis sp. nov., a new member of the genus Enorma within the family Coriobacteriaceae. This strain, whose genome is described here, was isolated from the fecal flora of a 53-year-old woman hospitalized for 3 months in an intensive care unit. E. timonensis is an obligate anaerobic rod. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,365,123 bp long genome (1 chromosome but no plasmid) contains 2,060 protein-coding and 52 RNA genes, including 4 rRNA genes. PMID:25197477
Non contiguous-finished genome sequence and description of Peptoniphilus obesi sp. nov.
Mishra, Ajay Kumar; Hugon, Perrine; Lagier, Jean-Christophe; Nguyen, Thi-Thien; Robert, Catherine; Couderc, Carine; Raoult, Didier
2013-01-01
Peptoniphilus obesi strain ph1T sp. nov., is the type strain of P. obesi sp. nov., a new species within the genus Peptoniphilus. This strain, whose genome is described here, was isolated from the fecal flora of a 26-year-old woman suffering from morbid obesity. P. obesi strain ph1T is a Gram-positive, obligate anaerobic coccus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,774,150 bp long genome (1 chromosome but no plasmid) contains 1,689 protein-coding and 29 RNA genes, including 5 rRNA genes. PMID:24019985
Woodyard, Ethan T; Rosser, Thomas G; Griffin, Matt J
2017-08-01
Neodiplostomum americanum Chandler and Rausch, 1947 has been reported from six species of owls in North America. At present, there are no molecular data for this species and gene sequence data from Neodiplostomum Railliet, 1919 are limited. A freshly deceased specimen of the Great Horned Owl Bubo virginianus Gmelin, 1788 and a freshly deceased specimen of the Eastern Screech Owl Megascops asio Linnaeus, 1758 were collected in Oktibbeha County, Mississippi in 2014 and 2016, respectively. Neodiplostomum americanum were recovered from both hosts. Herein, updated morphological descriptions are supplemented with gene sequence data from conserved (18S, ITS1-5.8S, ITS2, and 28S rRNA) and fast-evolving (cytochrome c oxidase subunit 1 mtDNA) regions. Preliminary phylogenetic analysis of the genus based on cytochrome c oxidase subunit 1 gene sequence data supports the placement of N. americanum within a discrete phylogroup of the family Diplostomidae. The life history of N. americanum is unknown and currently limited to the description of the adult stage in avian hosts. The molecular data generated in this study offer insight into the phylogenetic placement of N. americanum within the Diplostomidae and will aid in identifying different life stages in putative intermediate hosts.
Carro, Lorena; Nouioui, Imen; Sangal, Vartul; Meier-Kolthoff, Jan P; Trujillo, Martha E; Montero-Calasanz, Maria Del Carmen; Sahin, Nevzat; Smith, Darren Lee; Kim, Kristi E; Peluso, Paul; Deshpande, Shweta; Woyke, Tanja; Shapiro, Nicole; Kyrpides, Nikos C; Klenk, Hans-Peter; Göker, Markus; Goodfellow, Michael
2018-01-11
There is a need to clarify relationships within the actinobacterial genus Micromonospora, the type genus of the family Micromonosporaceae, given its biotechnological and ecological importance. Here, draft genomes of 40 Micromonospora type strains and two non-type strains are made available through the Genomic Encyclopedia of Bacteria and Archaea project and used to generate a phylogenomic tree which showed they could be assigned to well supported phyletic lines that were not evident in corresponding trees based on single and concatenated sequences of conserved genes. DNA G+C ratios derived from genome sequences showed that corresponding data from species descriptions were imprecise. Emended descriptions include precise base composition data and approximate genome sizes of the type strains. antiSMASH analyses of the draft genomes show that micromonosporae have a previously unrealised potential to synthesize novel specialized metabolites. Close to one thousand biosynthetic gene clusters were detected, including NRPS, PKS, terpene and siderophore clusters that were discontinuously distributed, thereby opening up the prospect of prioritising gifted strains for natural product discovery. The distribution of key stress-related genes provides an insight into how micromonosporae adapt to key environmental variables. Genes associated with plant interactions highlight the potential use of micromonosporae in agriculture and biotechnology.
2009-01-01
Background Sequence identification of ESTs from non-model species offers distinct challenges, particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST-unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of separating genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from large-scale microarray data. PMID:19939286
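The core idea of the record above, using correlation of expression profiles as evidence of identity, can be illustrated with a minimal sketch. This is not the authors' network and 'mountain' clustering procedure; it is a simplified nearest-neighbour variant. The function names, the 0.9 threshold, and the toy profiles are all assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sx * sy)

def suggest_identities(known, unknown, min_r=0.9):
    """For each unknown EST, suggest the best-correlated known gene.

    `min_r` is a hypothetical cut-off, not a value from the paper.
    """
    suggestions = {}
    for est_id, profile in unknown.items():
        best_name, best_r = None, min_r
        for name, ref in known.items():
            r = pearson(profile, ref)
            if r >= best_r:
                best_name, best_r = name, r
        if best_name is not None:
            suggestions[est_id] = (best_name, best_r)
    return suggestions

# Toy expression profiles across four conditions (made-up numbers).
known = {"hsp70": [1.0, 2.0, 3.0, 4.0], "actb": [4.0, 1.0, 3.0, 2.0]}
unknown = {"EST_001": [2.0, 4.1, 5.9, 8.0], "EST_002": [1.0, 1.0, 1.0, 1.0]}
result = suggest_identities(known, unknown)
```

In this toy data, EST_001 tracks hsp70 closely and receives that suggestion, while the flat profile of EST_002 yields none. A real analysis would need many more conditions, as the abstract notes that precision improved with data from multiple tissues and treatments.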
Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing.
Legendre, Matthieu; Santini, Sébastien; Rico, Alain; Abergel, Chantal; Claverie, Jean-Michel
2011-03-04
Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads), and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.
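The "11% greater" figure in the record above follows directly from the two gene counts it reports. A short check, using only the numbers given in the abstract (the variable names are illustrative):

```python
# Gene counts as stated in the abstract.
original_count = 917   # initial prediction
updated_count = 1018   # count after ultra-deep sequencing

# Relative increase over the original annotation.
increase = (updated_count - original_count) / original_count
print(f"{increase:.1%}")  # 101 / 917, about 11.0%
```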
Luz, Bruna Louise Pereira; Capel, Kátia Cristina Cruz; Stampar, Sérgio Nascimento; Kitahara, Marcelo Visentini
2016-07-01
Dendrophylliidae is one of the few monophyletic families within the Scleractinia that embraces zooxanthellate and azooxanthellate species represented by both solitary and colonial forms. Among the exclusively azooxanthellate genera, Dendrophyllia is reported worldwide from 1 to 1200 m deep. To date, although three complete mitochondrial (mt) genomes from representatives of the family are available, only that from Turbinaria peltata has been formally published. Here we describe the complete nucleotide sequence of the mt genome from Dendrophyllia arbuscula, which is 19 069 bp in length and comprises two rDNAs, two tRNAs, and 13 protein-coding genes arranged in the canonical scleractinian mt gene order. No genes overlap, resulting in the presence of 18 intergenic spacers and one of the longest scleractinian mt genomes sequenced to date.
Yang, Fengxi; Zhu, Genfa
2015-01-01
Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data are available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb of clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly among floral mutants with different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis. The dataset generated in our study provides new insights into the molecular mechanisms underlying floral patterning of Cymbidium and constitutes a valuable resource for molecular breeding of the orchid plant. PMID:26580566
Automated Gene Ontology annotation for anonymous sequence data.
Hennig, Steffen; Groth, Detlef; Lehrach, Hans
2003-07-01
Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species independent GO structure and vocabulary together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.
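The annotation strategy described above, transferring GO terms to anonymous sequences from similar reference proteins under a user-selectable sensitivity, can be sketched as follows. This is not GOblet's code or interface; the function name, the input layout (query, subject, E-value triples), the default cut-off, and the toy identifiers are assumptions for illustration only.

```python
def annotate(hits, subject_go, max_evalue=1e-10):
    """Transfer GO terms from reference proteins to query sequences.

    hits:       iterable of (query_id, subject_id, evalue) similarity hits
    subject_go: dict mapping subject_id -> set of GO term identifiers
    max_evalue: hypothetical sensitivity cut-off; hits above it are ignored
    """
    annotations = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue  # hit too weak at the chosen sensitivity
        terms = subject_go.get(subject, set())
        if terms:
            annotations.setdefault(query, set()).update(terms)
    return annotations

# Toy similarity hits and reference annotations (illustrative values).
hits = [
    ("cdna1", "P1", 1e-50),
    ("cdna1", "P2", 1e-3),
    ("cdna2", "P3", 1e-20),
]
subject_go = {
    "P1": {"GO:0005524"},
    "P2": {"GO:0003677"},
    "P3": set(),  # reference protein without GO terms
}

strict = annotate(hits, subject_go)                     # default cut-off
relaxed = annotate(hits, subject_go, max_evalue=1e-2)   # more sensitive
```

Relaxing the cut-off adds the weaker P2 hit to cdna1, which shows why the reliability of automated annotations depends on the chosen sensitivity. cdna2 stays unannotated in both runs because its only hit carries no GO terms.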
Large-scale gene function analysis with the PANTHER classification system.
Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D
2013-08-01
The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
USDA-ARS's Scientific Manuscript database
In the midst of this genomics era, major plant genome databases are collecting massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While basic browsing and sear...
Virus-Based RNA Silencing Agents and Virus-Derived Expression Vectors as Gene Therapy Vehicles.
Venkataraman, Srividhya; Ahmad, Tauqeer; AbouHaidar, Mounir G; Hefferon, Kathleen L
2017-01-01
In consideration of recent developments in understanding the genomics and proteomics of viruses, the use of viral DNA/RNA sequences, as well as their gene expression schemes, has found new inroads towards the prognosis and therapy of diseases. Correspondingly, the sphere of the patenting scenario has expanded significantly. The current review addresses patented inventions concerning the use of virus sequences as gene silencing machineries and inventions concerning the generation and application of viral sequences as expression vectors. Furthermore, this review also discusses the employment of these patents for clinical, agricultural and biotechnological applications. Considering these objectives, the Delphion Research Intellectual Property Network database was searched using keywords such as "gene silencing", "engineered viruses" and "expression vectors", and descriptions of recent patents on these topics are discussed. Despite several recent advances in the use of viruses as disease therapy vehicles and biotechnological vectors, these developments have yet to be proven effective in practice, in clinical and field trials.
Seck, E H; Diop, A; Armstrong, N; Delerce, J; Fournier, P-E; Raoult, D; Khelaifia, S
2018-05-01
Bacillus salis strain ES3T (= CSUR P1478 = DSM 100598) is the type strain of B. salis sp. nov. It is an aerobic, Gram-positive, moderately halophilic, motile and spore-forming bacterium. It was isolated from commercial table salt as part of a broad culturomics study aiming to maximize the culture conditions for the in-depth exploration of halophilic bacteria in salty food. Here we describe the phenotypic characteristics of this isolate, its complete genome sequence and annotation, together with a comparison with closely related bacteria. Phylogenetic analysis based on 16S rRNA gene sequences indicated 97.5% similarity with Bacillus aquimaris, the closest species. The 8 329 771 bp long genome (one chromosome, no plasmids) exhibits a G+C content of 39.19%. It is composed of 18 scaffolds with 29 contigs. Of the 8303 predicted genes, 8109 were protein-coding genes and 194 were RNAs. A total of 5778 genes (71.25%) were assigned a putative function.
McGarvey, J A; Franco, R B; Palumbo, J D; Hnasko, R; Stanker, L; Mitloehner, F M
2013-06-01
To describe, at high resolution, the bacterial population dynamics and chemical transformations during the ensiling of alfalfa and subsequent exposure to air. Samples of alfalfa, ensiled alfalfa and silage exposed to air were collected and their bacterial population structures compared using 16S rRNA gene libraries containing approximately 1900 sequences each. Cultural and chemical analyses were also performed to complement the 16S gene sequence data. Sequence analysis revealed significant differences (P < 0.05) in the bacterial populations at each time point. The alfalfa-derived library contained mostly sequences associated with the Gammaproteobacteria (including the genera Enterobacter, Erwinia and Pantoea); the ensiled material contained mostly sequences associated with the lactic acid bacteria (LAB) (including the genera Lactobacillus, Pediococcus and Lactococcus). Exposure to air resulted in even greater percentages of LAB, especially among the genus Lactobacillus, and a significant drop in bacterial diversity. In-depth 16S rRNA gene sequence analysis revealed significant bacterial population structure changes during ensiling and again during exposure to air. This in-depth description of the bacterial population dynamics that occurred during ensiling and simulated feed out expands our knowledge of these processes.
Szabó, Zsolt; Gyula, Péter; Robotka, Hermina; Bató, Emese; Gálik, Bence; Pach, Péter; Pekker, Péter; Papp, Ildikó; Bihari, Zoltán
2015-01-01
Methylibium sp. strain T29 was isolated from a gasoline-contaminated aquifer and proved to have excellent capabilities in degrading some common fuel oxygenates like methyl tert-butyl ether, tert-amyl methyl ether and tert-butyl alcohol along with other organic compounds. Here, we report the draft genome sequence of Methylibium sp. strain T29 together with the description of the genome properties and its annotation. The draft genome consists of 608 contigs with a total size of 4,449,424 bp and an average coverage of 150×. The genome exhibits an average G+C content of 68.7%, and contains 4754 protein-coding and 52 RNA genes, including 48 tRNA genes. In total, 71% of the protein-coding genes could be assigned to COG (Clusters of Orthologous Groups) categories. A formerly unknown circular plasmid designated pT29A was isolated and sequenced separately and found to be 86,856 bp long.
Ravi, Anuradha; Avershina, Ekaterina; Angell, Inga Leena; Ludvigsen, Jane; Manohar, Prasanth; Padmanaban, Sumathi; Nachimuthu, Ramesh; Snipen, Lars; Rudi, Knut
2018-06-01
Use of the 16S rRNA gene in microbiota studies is limited by the lack of taxonomic and functional resolution. High resolution analyses are particularly important for understanding transmission and persistence of bacteria. The aim of our work was therefore to compare a novel reduced metagenome sequencing (RMS) approach with 16S rRNA gene sequencing to determine both the metagenome genetic diversity and the mother-to-child sharing of the microbiota in a cohort of 17 mother-child pairs. We found that although both approaches gave comparable results with respect to sample separation and taxonomy, RMS gave higher resolution and the potential for genomic-/functional assignment. Using RMS we estimated that the metagenome size increased from about 60 Mbp for 4-day-old children to about 225 Mbp for mothers. The 4-day-old children shared 7% of the metagenome sequences with the mothers, while the metagenome sequence sharing was >30% among the mothers. We found 15 genomes shared across >50% of the mothers, of which 10 belonged to Clostridia. Only Bacteroides showed a direct mother-child association, with B. vulgatus being abundant in both 4-day-old children and mothers. For the functional assignments, we identified a significant association between antibiotic usage during labor, and quantity of Fosfomycin resistance genes. In conclusion, our results show a higher functional and taxonomic resolution for RMS compared to 16S rRNA gene sequencing, where RMS enabled a detailed description of mother to child gut microbiota transmission - supporting a late recruitment of most gut bacteria and an effect of antibiotic treatment during labor on infant antibiotic resistance gene patterns.
Ben Tanfous, Farah; Alonso, Carla Andrea; Achour, Wafa; Ruiz-Ripa, Laura; Torres, Carmen; Ben Hassen, Assia
2017-04-01
The aim of this study was to investigate the molecular features among Klebsiella pneumoniae and Escherichia coli strains showing a resistant/intermediate-resistant phenotype to ertapenem (R/IR-ERT), implicated in colonization/infection in patients of the Hematology and Graft Units of the National Bone Marrow Transplant Center of Tunisia (3-year period, 2011-2014). The major carbapenemase, extended-spectrum beta-lactamase, and plasmidic AmpC beta-lactamase genes were analyzed and characterized by PCR and sequencing. Genetic relatedness was determined by pulsed-field gel electrophoresis (PFGE) using XbaI and by multilocus sequence typing. The blaOXA-48 and blaKPC carbapenemase genes were detected among R/IR-ERT isolates. All R/IR-ERT K. pneumoniae strains (n = 19) had the blaOXA-48 gene, and 14/19 strains also harbored the blaCTX-M-15 gene. Eight different PFGE patterns were detected among these K. pneumoniae isolates, and they showed eight different sequence types, ST11 and ST15 being the most prevalent ones. Two out of three R/IR-ERT E. coli isolates carried blaOXA-48 and one also carried the blaCTX-M-15 gene. One E. coli strain, ascribed to the new sequence type ST5700, harbored the blaKPC-2 gene. E. coli isolates were not clonally related and belonged to different sequence types (ST5700, ST227, and ST58). To our knowledge, this is the first report in Tunisia of either KPC-2 carbapenemase in E. coli or OXA-48 carbapenemase in K. pneumoniae of lineage ST15.
Genome sequence and description of Corynebacterium ihumii sp. nov.
Padmanabhan, Roshan; Dubourg, Grégory; Lagier, Jean-Christophe; Couderc, Carine; Michelle, Caroline; Raoult, Didier; Fournier, Pierre-Edouard
2014-01-01
Corynebacterium ihumii strain GD7T sp. nov. is proposed as the type strain of a new species, which belongs to the family Corynebacteriaceae of the class Actinobacteria. This strain was isolated from the fecal flora of a 62-year-old male patient, as part of a culturomics study. Corynebacterium ihumii is a Gram-positive, facultatively anaerobic, nonsporulating bacillus. Here, we describe the features of this organism, together with the high-quality draft genome sequence, annotation and a comparison with other members of the genus Corynebacterium. The C. ihumii genome is 2,232,265 bp long (one chromosome but no plasmid) and contains 2,125 protein-coding and 53 RNA genes, including 4 rRNA genes. The whole-genome shotgun sequence of Corynebacterium ihumii strain GD7T sp. nov. has been deposited in EMBL under accession number GCA_000403725. PMID:25197488
SGDB: a database of synthetic genes re-designed for optimizing protein over-expression.
Wu, Gang; Zheng, Yuanpu; Qureshi, Imran; Zin, Htar Thant; Beck, Tyler; Bulka, Blazej; Freeland, Stephen J
2007-01-01
Here we present the Synthetic Gene Database (SGDB): a relational database that houses sequences and associated experimental information on synthetic (artificially engineered) genes from all peer-reviewed studies published to date. At present, the database comprises information from more than 200 published experiments. This resource not only provides reference material to guide experimentalists in designing new genes that improve protein expression, but also offers a dataset for analysis by bioinformaticians who seek to test ideas regarding the underlying factors that influence gene expression. The SGDB was built on the MySQL database management system. We also offer an XML schema for standardized data description of synthetic genes. Users can access the database at http://www.evolvingcode.net/codon/sgdb/index.php, or batch download all information as XML files. Moreover, users may visually compare the coding sequences of a synthetic gene and its natural counterpart with an integrated web tool at http://www.evolvingcode.net/codon/sgdb/aligner.php, and discuss questions, findings and related information on an associated e-forum at http://www.evolvingcode.net/forum/viewforum.php?f=27.
2011-01-01
Background Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935
Baum, Thierry-Pascal; Hierle, Vivien; Pasqual, Nicolas; Bellahcene, Fatena; Chaume, Denys; Lefranc, Marie-Paule; Jouvin-Marche, Evelyne; Marche, Patrice Noël; Demongeot, Jacques
2006-01-01
Background Adaptive immune repertoire diversity in vertebrate species is generated by recombination of variable (V), diversity (D) and joining (J) genes in the immunoglobulin (IG) loci of B lymphocytes and in the T cell receptor (TR) loci of T lymphocytes. These V-J and V-D-J gene rearrangements at the DNA level involve recombination signal sequences (RSS). Although many data exist, they are scattered in non-specialized resources with different nomenclatures (e.g. flat files) and are difficult to extract. Description IMGT/GeneInfo is an online information system that provides, through a user-friendly interface, exhaustive information resulting from the complex mechanisms of T cell receptor V-J and V-D-J recombinations. T cells comprise two populations which express the αβ and γδ TR, respectively. The first version of the system dealt with the Homo sapiens and Mus musculus TRA and TRB loci whose gene rearrangements allow the synthesis of the αβ TR chains. In this paper, we present the second version of IMGT/GeneInfo, in which we complete the database for the Homo sapiens and Mus musculus TRG and TRD loci and introduce a quality control procedure for existing and new data. We also add new functionalities for the analysis of the four loci, giving, to date, a very informative tool for working on the V(D)J genes of all TR loci in both human and mouse. IMGT/GeneInfo provides more than 59,000 rearrangement combinations with a full gene description which is freely available at . Conclusion IMGT/GeneInfo gathers all TR sequence information in one place, available within two mouse clicks. This is useful for biologists and bioinformaticians for the study of T lymphocyte V(D)J gene rearrangements and their applications in immune response analysis. PMID:16640788
Campello-Nunes, Pedro H; Fernandes, Noemi M; Szokoli, Franziska; Petroni, Giulio; da Silva-Neto, Inácio D
2018-05-19
Ciliates of the genus Gruberia are poorly studied. Consequently, most species lack detailed morphological descriptions, and all gene sequences in GenBank are not classified at the species level. In this study, a detailed morphological description of a population of G. lanceolata from Brazil is presented, based on live and protargol-stained organisms. We also present the 18S rRNA gene sequence and the phylogenetic position of this species. The primary characteristics of G. lanceolata from the Maricá Lagoon are as follows: an elongate fusiform body 280-870 × 40-160 μm in size; rosy cortical granules; a peristome occupying approximately 1/3-1/2 of body length; an adoral zone comprising 115-330 membranelles; a paroral membrane in 35-50 fragments; and a moniliform macronucleus with 11-16 nodules. Based on our observations and data from pertinent literature, we suggest G. beninensis to be a junior synonym of G. lanceolata.
Whiteduck-Léveillée, Kerri; Whiteduck-Léveillée, Jenni; Cloutier, Michel; Tambong, James T; Xu, Renlin; Topp, Edward; Arts, Michael T; Chao, Jerry; Adam, Zaky; Lévesque, C André; Lapen, David R; Villemur, Richard; Khan, Izhar U H
2016-03-01
A study on the taxonomic classification of Arcobacter species was performed on cultures isolated from various fecal sources, in which Arcobacter strain AF1078(T), from a human waste septic tank near Ottawa, Ontario, Canada, was characterized using a polyphasic approach. Genetic investigations showed that the 16S rRNA, atpA, cpn60, gyrA, gyrB and rpoB gene sequences of strain AF1078(T) are unique in comparison with other arcobacters. Phylogenetic analysis based on the 16S rRNA gene sequence revealed that the strain is most closely related to Arcobacter lanthieri and Arcobacter cibarius. Analyses of atpA, cpn60, gyrA, gyrB and rpoB gene sequences suggested that strain AF1078(T) formed a phylogenetic lineage independent of other species in the genus. Whole-genome sequence, DNA-DNA hybridization, fatty acid profile and phenotypic analysis further supported the conclusion that strain AF1078(T) represents a novel Arcobacter species, for which the name Arcobacter faecis sp. nov. is proposed, with type strain AF1078(T) (=LMG 28519(T); CCUG 66484(T)).
The Yak genome database: an integrative database for studying yak biology and high-altitude adaption
2012-01-01
Background The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687
Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M.
2017-01-01
Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. PMID:28961970
Pittet, Vanessa; Phister, Trevor G.; Ziola, Barry
2013-01-01
Growth of specific lactic acid bacteria in beer leads to spoiled product and economic loss for the brewing industry. Microbial growth is typically inhibited by the combined stresses found in beer (e.g., ethanol, hops, low pH, minimal nutrients); however, certain bacteria have adapted to grow in this harsh environment. Considering little is known about the mechanisms used by bacteria to grow in and spoil beer, transcriptome sequencing was performed on a variant of the beer-spoilage organism Pediococcus claussenii ATCC BAA-344T (Pc344-358). Illumina sequencing was used to compare the transcript levels in Pc344-358 growing mid-exponentially in beer to those in nutrient-rich MRS broth. Various operons demonstrated high gene expression in beer, several of which are involved in nutrient acquisition and overcoming the inhibitory effects of hop compounds. As well, genes functioning in cell membrane modification and biosynthesis demonstrated significantly higher transcript levels in Pc344-358 growing in beer. Three plasmids had the majority of their genes showing increased transcript levels in beer, whereas the two cryptic plasmids showed slightly decreased gene expression. Follow-up analysis of plasmid copy number in both growth environments revealed similar trends, where more copies of the three non-cryptic plasmids were found in Pc344-358 growing in beer. Transcriptome sequencing also enabled the addition of several genes to the P. claussenii ATCC BAA-344T genome annotation, some of which are putatively transcribed as non-coding RNAs. The sequencing results not only provide the first transcriptome description of a beer-spoilage organism while growing in beer, but they also highlight several targets for future exploration, including genes that may have a role in the general stress response of lactic acid bacteria. PMID:24040005
Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian
2016-01-01
A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense, further provides the basis for biotechnological exploitation of this species. PMID:27739446
Piombo, Edoardo; Sela, Noa; Wisniewski, Michael; Hoffmann, Maria; Gullino, Maria L.; Allard, Marc W.; Levin, Elena; Spadaro, Davide; Droby, Samir
2018-01-01
The yeast Metschnikowia fructicola was reported as an efficient biological control agent of postharvest diseases of fruits and vegetables, and it is the basis of the commercial formulated product “Shemer.” Several mechanisms of action by which M. fructicola inhibits postharvest pathogens were suggested including iron-binding compounds, induction of defense signaling genes, production of fungal cell wall degrading enzymes and relatively high amounts of superoxide anions. We assembled the whole genome sequence of two strains of M. fructicola using PacBio and Illumina shotgun sequencing technologies. Using the PacBio, a high-quality draft genome consisting of 93 contigs, with an estimated genome size of approximately 26 Mb, was obtained. Comparative analysis of M. fructicola proteins with the other three available closely related genomes revealed a shared core of homologous proteins coded by 5,776 genes. Comparing the genomes of the two M. fructicola strains using a SNP calling approach resulted in the identification of 564,302 homologous SNPs with 2,004 predicted high impact mutations. The size of the genome is exceptionally high when compared with those of available closely related organisms, and the high rate of homology among M. fructicola genes points toward a recent whole-genome duplication event as the cause of this large genome. Based on the assembled genome, sequences were annotated with a gene description and gene ontology (GO term) and clustered in functional groups. Analysis of CAZymes family genes revealed 1,145 putative genes, and transcriptomic analysis of CAZyme expression levels in M. fructicola during its interaction with either grapefruit peel tissue or Penicillium digitatum revealed a high level of CAZyme gene expression when the yeast was placed in wounded fruit tissue. PMID:29666611
Novel nonsense mutation in the katA gene of a catalase-negative Staphylococcus aureus strain.
Lagos, Jaime; Alarcón, Pedro; Benadof, Dona; Ulloa, Soledad; Fasce, Rodrigo; Tognarelli, Javier; Aguayo, Carolina; Araya, Pamela; Parra, Bárbara; Olivares, Berta; Hormazábal, Juan Carlos; Fernández, Jorge
2016-01-01
We report the first description of a rare catalase-negative strain of Staphylococcus aureus in Chile. This new variant was isolated from blood and synovial tissue samples of a pediatric patient. Sequencing analysis revealed that this catalase-negative strain is related to ST10 strain, which has earlier been described in relation to S. aureus carriers. Interestingly, sequence analysis of the catalase gene katA revealed presence of a novel nonsense mutation that causes premature translational truncation of the C-terminus of the enzyme leading to a loss of 222 amino acids. Our study suggests that loss of catalase activity in this rare catalase-negative Chilean strain is due to this novel nonsense mutation in the katA gene, which truncates the enzyme to just 283 amino acids. Copyright © 2015 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
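The two lengths quoted in this abstract (222 residues lost, 283 retained) imply a full-length KatA of 505 amino acids. The following is a minimal sketch of that arithmetic only; the variable names are illustrative, and the position of the premature stop codon is inferred from the quoted lengths rather than stated in the abstract.

```python
# Lengths quoted in the abstract (amino acids).
RETAINED_AA = 283   # length of the truncated enzyme
LOST_AA = 222       # residues lost from the C-terminus

# Implied full-length protein (assumption: no other indels are involved).
full_length_aa = RETAINED_AA + LOST_AA

# Inferred codon position of the premature stop (first codon not translated).
inferred_stop_codon = RETAINED_AA + 1

# Share of the protein that is still translated, in percent.
retained_percent = round(100 * RETAINED_AA / full_length_aa, 1)

print(full_length_aa, inferred_stop_codon, retained_percent)
```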
Prevalence and genetic diversity of Bartonella species in sika deer (Cervus nippon) in Japan.
Sato, Shingo; Kabeya, Hidenori; Yamazaki, Mari; Takeno, Shinako; Suzuki, Kazuo; Kobayashi, Shinichi; Souma, Kousaku; Masuko, Takayoshi; Chomel, Bruno B; Maruyama, Soichi
2012-12-01
We report the first description of Bartonella prevalence and genetic diversity in 64 Honshu sika deer (Cervus nippon centralis) and 18 Yezo sika deer (Cervus nippon yesoensis) in Japan. Overall, Bartonella bacteremia prevalence was 41.5% (34/82). The prevalence in wild deer parasitized with ticks and deer keds was 61.8% (34/55), whereas no isolates were detected in captive deer (0/27) free of ectoparasites. The isolates belonged to 11 genogroups based on a combination of the gltA and rpoB gene sequences. Phylogenetic analysis of concatenated sequences of the ftsZ, gltA, ribC, and rpoB genes of 11 representative isolates showed that Japanese sika deer harbor three Bartonella species, including B. capreoli and two novel Bartonella species. All Yezo deer's isolates were identical to B. capreoli B28980 strain isolated from an elk in the USA, based on the sequences of the ftsZ, gltA, and rpoB genes. In contrast, the isolates from Honshu deer showed a higher genetic diversity. Copyright © 2012 Elsevier Ltd. All rights reserved.
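The prevalence figures in this abstract can be re-derived from the counts it gives. This is a small sketch, not the authors' analysis; the helper name `prevalence` is hypothetical, and rounding to one decimal place is assumed from the way the percentages are reported.

```python
def prevalence(positive: int, total: int) -> float:
    """Return prevalence as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * positive / total, 1)

# Counts taken from the abstract.
honshu_deer, yezo_deer = 64, 18
all_deer = honshu_deer + yezo_deer          # 82 animals sampled

overall = prevalence(34, all_deer)          # bacteremic deer overall
wild = prevalence(34, 55)                   # wild deer with ticks and keds
captive = prevalence(0, 27)                 # captive deer free of ectoparasites

print(overall, wild, captive)
```

The wild and captive groups (55 + 27) account for all 82 deer, so every positive animal was a wild one.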
Brahmi, Soumia; Touati, Abdelaziz; Cadière, Axelle; Djahmi, Nassima; Pantel, Alix; Sotto, Albert; Dunyach-Remy, Catherine
2016-01-01
To determine the occurrence of carbapenem-resistant Acinetobacter baumannii in fish fished from the Mediterranean Sea near the Bejaia coast (Algeria), we studied 300 gills and gut samples that had been randomly and prospectively collected during 1 year. After screening on selective agar media, using PCR arrays and whole-genome sequencing, we identified for the first time two OXA-23-producing A. baumannii strains belonging to the widespread sequence type 2 (ST2)/international clone II and harboring aminoglycoside-modifying enzymes [aac(6′)-Ib and aac(3′)-I genes]. PMID:26787693
Bossé, Janine T.; Li, Yanwen; Walker, Stephanie; Atherton, Tom; Fernandez Crespo, Roberto; Williamson, Susanna M.; Rogers, Jon; Chaudhuri, Roy R.; Weinert, Lucy A.; Oshota, Olusegun; Holden, Matt T. G.; Maskell, Duncan J.; Tucker, Alexander W.; Wren, Brendan W.; Rycroft, Andrew N.; Langford, Paul R.
2015-01-01
Objectives The objective of this study was to determine the distribution and genetic basis of trimethoprim resistance in Actinobacillus pleuropneumoniae isolates from pigs in England. Methods Clinical isolates collected between 1998 and 2011 were tested for resistance to trimethoprim and sulphonamide. The genetic basis of trimethoprim resistance was determined by shotgun WGS analysis and the subsequent isolation and sequencing of plasmids. Results A total of 16 (out of 106) A. pleuropneumoniae isolates were resistant to both trimethoprim (MIC >32 mg/L) and sulfisoxazole (MIC ≥256 mg/L), and a further 32 were resistant only to sulfisoxazole (MIC ≥256 mg/L). Genome sequence data for the trimethoprim-resistant isolates revealed the presence of the dfrA14 dihydrofolate reductase gene. The distribution of plasmid sequences in multiple contigs suggested the presence of two distinct dfrA14-containing plasmids in different isolates, which was confirmed by plasmid isolation and sequencing. Both plasmids encoded mobilization genes, the sulphonamide resistance gene sul2, as well as dfrA14 inserted into strA, a streptomycin-resistance-associated gene, although the gene order differed between the two plasmids. One of the plasmids further encoded the strB streptomycin-resistance-associated gene. Conclusions This is the first description of mobilizable plasmids conferring trimethoprim resistance in A. pleuropneumoniae and, to our knowledge, the first report of dfrA14 in any member of the Pasteurellaceae. The identification of dfrA14 conferring trimethoprim resistance in A. pleuropneumoniae isolates will facilitate PCR screens for resistance to this important antimicrobial. PMID:25957382
Doroghazi, J. R.; Ju, K.-S.; Metcalf, W. W.
2014-01-01
In phylogenetic analyses of the genus Streptomyces using 16S rRNA gene sequences, Streptomyces albus subsp. albus NRRL B-1811T forms a cluster with five other species having identical or nearly identical 16S rRNA gene sequences. Moreover, the morphological and physiological characteristics of these other species, including Streptomyces almquistii NRRL B-1685T, Streptomyces flocculus NRRL B-2465T, Streptomyces gibsonii NRRL B-1335T and Streptomyces rangoonensis NRRL B-12378T are quite similar. This cluster is of particular taxonomic interest because Streptomyces albus is the type species of the genus Streptomyces. The related strains were subjected to multilocus sequence analysis (MLSA) utilizing partial sequences of the housekeeping genes atpD, gyrB, recA, rpoB and trpB and confirmation of previously reported phenotypic characteristics. The five strains formed a coherent cluster supported by a 100 % bootstrap value in phylogenetic trees generated from sequence alignments prepared by concatenating the sequences of the housekeeping genes, and identical tree topology was observed using various different tree-making algorithms. Moreover, all but one strain, S. flocculus NRRL B-2465T, exhibited identical sequences for all of the five housekeeping gene loci sequenced, but NRRL B-2465T still exhibited an MLSA evolutionary distance of 0.005 from the other strains, a value that is lower than the 0.007 MLSA evolutionary distance threshold proposed for species-level relatedness. These data support a proposal to reclassify S. almquistii, S. flocculus, S. gibsonii and S. rangoonensis as later heterotypic synonyms of S. albus with NRRL B-1811T as the type strain. The MLSA sequence database also demonstrated utility for quickly and conclusively confirming that numerous strains within the ARS Culture Collection had been previously misidentified as subspecies of S. albus and that Streptomyces albus subsp. pathocidicus should be redescribed as a novel species, Streptomyces pathocidini sp. nov., with the type strain NRRL B-24287T. PMID:24277863
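The species-level decision described in this abstract reduces to comparing an MLSA evolutionary distance against the 0.007 threshold. Below is a minimal sketch of that comparison; the function name is hypothetical, and treating the threshold as a strict upper bound is an assumption, since the abstract only says the observed 0.005 is "lower than" 0.007.

```python
# Threshold quoted in the abstract for species-level relatedness.
MLSA_SPECIES_THRESHOLD = 0.007

def within_species_threshold(distance: float,
                             threshold: float = MLSA_SPECIES_THRESHOLD) -> bool:
    """Return True when an MLSA distance falls below the species threshold.

    Assumption: the comparison is strict (distance < threshold).
    """
    if distance < 0:
        raise ValueError("evolutionary distance cannot be negative")
    return distance < threshold

# S. flocculus NRRL B-2465T versus the other strains (value from the abstract).
print(within_species_threshold(0.005))
# Strains with identical housekeeping gene sequences have distance 0.
print(within_species_threshold(0.0))
```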
Quaglino, Fabio; Zhao, Yan; Casati, Paola; Bulgari, Daniela; Bianco, Piero Attilio; Wei, Wei; Davis, Robert Edward
2013-08-01
Phytoplasmas classified in group 16SrXII infect a wide range of plants and are transmitted by polyphagous planthoppers of the family Cixiidae. Based on 16S rRNA gene sequence identity and biological properties, group 16SrXII encompasses several species, including 'Candidatus Phytoplasma australiense', 'Candidatus Phytoplasma japonicum' and 'Candidatus Phytoplasma fragariae'. Other group 16SrXII phytoplasma strains are associated with stolbur disease in wild and cultivated herbaceous and woody plants and with bois noir disease in grapevines (Vitis vinifera L.). Such latter strains have been informally proposed to represent a separate species, 'Candidatus Phytoplasma solani', but a formal description of this taxon has not previously been published. In the present work, stolbur disease strain STOL11 (STOL) was distinguished from reference strains of previously described species of the 'Candidatus Phytoplasma' genus based on 16S rRNA gene sequence similarity and a unique signature sequence in the 16S rRNA gene. Other stolbur- and bois noir-associated ('Ca. Phytoplasma solani') strains shared >99 % 16S rRNA gene sequence similarity with strain STOL11 and contained the signature sequence. 'Ca. Phytoplasma solani' is the only phytoplasma known to be transmitted by Hyalesthes obsoletus. Insect vectorship and molecular characteristics are consistent with the concept that diverse 'Ca. Phytoplasma solani' strains share common properties and represent an ecologically distinct gene pool. Phylogenetic analyses of 16S rRNA, tuf, secY and rplV-rpsC gene sequences supported this view and yielded congruent trees in which 'Ca. Phytoplasma solani' strains formed, within the group 16SrXII clade, a monophyletic subclade that was most closely related to, but distinct from, that of 'Ca. Phytoplasma australiense'-related strains. Based on distinct molecular and biological properties, stolbur- and bois noir-associated strains are proposed to represent a novel species level taxon, 'Ca. Phytoplasma solani'; STOL11 is designated the reference strain.
Schmid, Michael; Muri, Jonathan; Melidis, Damianos; Varadarajan, Adithi R.; Somerville, Vincent; Wicki, Adrian; Moser, Aline; Bourqui, Marc; Wenzel, Claudia; Eugster-Meier, Elisabeth; Frey, Juerg E.; Irmler, Stefan; Ahrens, Christian H.
2018-01-01
Although complete genome sequences hold particular value for an accurate description of core genomes, the identification of strain-specific genes, and as the optimal basis for functional genomics studies, they are still largely underrepresented in public repositories. Based on an assessment of the genome assembly complexity for all lactobacilli, we used Pacific Biosciences' long read technology to sequence and de novo assemble the genomes of three Lactobacillus helveticus starter strains, raising the number of completely sequenced strains to 12. The first comparative genomics study for L. helveticus—to our knowledge—identified a core genome of 988 genes and sets of unique, strain-specific genes ranging from about 30 to more than 200 genes. Importantly, the comparison of MiSeq- and PacBio-based assemblies uncovered that not only accessory but also core genes can be missed in incomplete genome assemblies based on short reads. Analysis of the three genomes revealed that a large number of pseudogenes were enriched for functional Gene Ontology categories such as amino acid transmembrane transport and carbohydrate metabolism, which is in line with a reductive genome evolution in the rich natural habitat of L. helveticus. Notably, the functional Clusters of Orthologous Groups of proteins categories “cell wall/membrane biogenesis” and “defense mechanisms” were found to be enriched among the strain-specific genes. A genome mining effort uncovered examples where an experimentally observed phenotype could be linked to the underlying genotype, such as for cell envelope proteinase PrtH3 of strain FAM8627. Another possible link identified for peptidoglycan hydrolases will require further experiments. Of note, strain FAM22155 did not harbor a CRISPR/Cas system; its loss was also observed in other L. helveticus strains and lactobacillus species, thus questioning the value of the CRISPR/Cas system for diagnostic purposes. 
Importantly, the complete genome sequences proved to be very useful for the analysis of natural whey starter cultures with metagenomics, as a larger percentage of the sequenced reads of these complex mixtures could be unambiguously assigned down to the strain level. PMID:29441050
Ribas, Laia; Pardo, Belén G; Fernández, Carlos; Alvarez-Diós, José Antonio; Gómez-Tato, Antonio; Quiroga, María Isabel; Planas, Josep V; Sitjà-Bobadilla, Ariadna; Martínez, Paulino; Piferrer, Francesc
2013-01-01
Background Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Current available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to understand better reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Results Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database (“Turbot 2 database”) was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences (“Turbot 3 database”), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50–90% of the total gene count of each pathway. 
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. Conclusions The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs. PMID:23497389
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren
2011-01-01
Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc, as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php PMID:21558151
Nozaki, T; Arase, T; Shigeta, Y; Asai, T; Leustek, T; Takeuchi, T
1998-12-08
A gene encoding adenosine-5'-triphosphate sulfurylase (AS) was cloned from the enteric protozoan parasite Entamoeba histolytica by polymerase chain reaction using degenerate oligonucleotide primers corresponding to conserved regions of the protein from a variety of organisms. The deduced amino acid sequence of E. histolytica AS revealed a calculated molecular mass of 47925 Da and an unusual basic pI of 9.38. The amebic protein sequence showed 23-48% identities with AS from bacteria, yeasts, fungi, plants, and animals with the highest identities being to Synechocystis sp. and Bacillus subtilis (48 and 44%, respectively). Four conserved blocks including putative sulfate-binding and phosphate-binding regions were highly conserved in the E. histolytica AS. The upstream region of the AS gene contained three conserved elements reported for other E. histolytica genes. A recombinant E. histolytica AS revealed enzymatic activity, measured in both the forward and reverse directions. Expression of the E. histolytica AS complemented cysteine auxotrophy of the AS-deficient Escherichia coli strains. Genomic hybridization revealed that the AS gene exists as a single copy gene. In the literature, this is the first description of an AS gene in Protozoa.
Nielsen, Tue Kjærgaard; Rasmussen, Morten; Demanèche, Sandrine; Cecillon, Sébastien; Vogel, Timothy M; Hansen, Lars Hestbjerg
2017-09-01
Bacterial degraders of chlorophenoxy herbicides have been isolated from various ecosystems, including pristine environments. Among these degraders, the sphingomonads constitute a prominent group that displays versatile xenobiotic-degradation capabilities. Four separate sequencing strategies were required to provide the complete sequence of the complex and plastic genome of the canonical chlorophenoxy herbicide-degrading Sphingobium herbicidovorans MH. The genome has an intricate organization of the chlorophenoxy-herbicide catabolic genes sdpA, rdpA, and cadABCD that encode the (R)- and (S)-enantiomer-specific 2,4-dichlorophenoxypropionate dioxygenases and four subunits of a Rieske non-heme iron oxygenase involved in 2-methyl-chlorophenoxyacetic acid degradation, respectively. Several major genomic rearrangements are proposed to help understand the evolution and mobility of these important genes and their genetic context. Single-strain mobilomic sequence analysis uncovered plasmids and insertion sequence-associated circular intermediates in this environmentally important bacterium and enabled the description of evolutionary models for pesticide degradation in strain MH and related organisms. The mobilome presented a complex mosaic of mobile genetic elements including four plasmids and several circular intermediate DNA molecules of insertion-sequence elements and transposons that are central to the evolution of xenobiotics degradation. Furthermore, two individual chromosomally integrated prophages were shown to excise and form free circular DNA molecules. This approach holds great potential for improving the understanding of genome plasticity, evolution, and microbial ecology. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Yeager, C.M.; Kornosky, J.L.; Morgan, R.E.; Cain, E.C.; Garcia-Pichel, F.; Housman, D.C.; Belnap, J.; Kuske, C.R.
2007-01-01
The identity of the numerically dominant N2-fixing bacteria in biological soil crusts of the Colorado Plateau region and two outlying areas was determined using multiple approaches, to link the environmental diversity of nifH gene sequences to cultured bacterial isolates from the regions. Of the nifH sequence-types detected in soil crusts of the Colorado Plateau, 89% (421/473) were most closely related to nifH signature sequences from cyanobacteria of the order Nostocales. N2-fixing cyanobacterial strains were cultured from crusts and their morphotypes, 16S rRNA gene and nifH gene sequences were characterized. The numerically dominant diazotrophs in the Colorado Plateau crusts fell within three clades of heterocystous cyanobacteria. Two clades are well-represented by phylogenetically and morphologically coherent strains, corresponding to the descriptions of Nostoc commune and Scytonema hyalinum, which are widely recognized as important N2-fixing components of soil crusts. A third, previously-overlooked clade was represented by a phylogenetically coherent but morphologically diverse group of strains that encompass the morphogenera Tolypothrix and Spirirestis. Many of the strains in each of these groups contained at least two nifH copies that represent different clusters in the nifH environmental survey. © 2007 Federation of European Microbiological Societies.
Pauchet, Y; Wilkinson, P; Vogel, H; Nelson, D R; Reynolds, S E; Heckel, D G; ffrench-Constant, R H
2010-02-01
The tobacco hornworm Manduca sexta is an important model for insect physiology but genomic and transcriptomic data are currently lacking. Following a recent pyrosequencing study generating immune related expressed sequence tags (ESTs), here we use this new technology to define the M. sexta larval midgut transcriptome. We generated over 387,000 midgut ESTs, using a combination of Sanger and 454 sequencing, and classified predicted proteins into those involved in digestion, detoxification and immunity. In many cases the depth of 454 pyrosequencing coverage allowed us to define the entire cDNA sequence of a particular gene. Many new M. sexta genes are described including up to 36 new cytochrome P450s, some of which have been implicated in the metabolism of host plant-derived nicotine. New lepidopteran gene families such as the beta-fructofuranosidases, previously thought to be restricted to Bombyx mori, are also described. An unexpectedly high number of ESTs were involved in immunity, for example 39 contigs encoding serpins, and the increasingly appreciated role of the midgut in insect immunity is discussed. Similar studies of other tissues will allow for a tissue by tissue description of the M. sexta transcriptome and will form an essential complementary step on the road to genome sequencing and annotation.
Epidemiology, pathology, and genetic analysis of a canine distemper epidemic in Namibia.
Gowtage-Sequeira, Sonya; Banyard, Ashley C; Barrett, Tom; Buczkowski, Hubert; Funk, Stephan M; Cleaveland, Sarah
2009-10-01
Severe population declines have resulted from the spillover of canine distemper virus (CDV) into susceptible wildlife, with both domestic and wild canids being involved in the maintenance and transmission of the virus. This study (March 2001 to October 2003) collated case data, serologic, pathologic, and molecular data to describe the spillover of CDV from domestic dogs (Canis familiaris) to black-backed jackals (Canis mesomelas) during an epidemic on the Namibian coast. Antibody prevalence in jackals peaked at 74.1%, and the clinical signs and histopathologic observations closely resembled those observed in domestic dog cases. Viral RNA was isolated from the brain of a domestic dog from the outbreak area. Sequence data from the phosphoprotein (P) gene and the hemagglutinin (H) genes were used for phylogenetic analyses. The P gene sequence from the domestic dog shared 98% identity with the sequence data available for other CDV isolates of African carnivores. For the H gene, the two sequences available from the outbreak that decimated the lion population in Tanzania in 1994 were the closest match with the Namibian sample, being 94% identical across 1,122 base pairs (bp). Phylogenetic analyses based on this region clustered the Namibian sample with the CDV that is within the morbilliviruses. This is the first description of an epidemic involving black-backed jackals in Namibia, demonstrating that this species has the capacity for rapid and large-scale dissemination of CDV. This work highlights the threat posed to endangered wildlife in Namibia by the spillover of CDV from domestic dog populations. Very few sequence data are currently available for CDV isolates from African carnivores, and this work provides the first sequence data from a Namibian CDV isolate.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research.
Manolio, Teri A; Fowler, Douglas M; Starita, Lea M; Haendel, Melissa A; MacArthur, Daniel G; Biesecker, Leslie G; Worthey, Elizabeth; Chisholm, Rex L; Green, Eric D; Jacob, Howard J; McLeod, Howard L; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S; Cooper, Gregory M; Cox, Nancy J; Herman, Gail E; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A; Nussbaum, Robert L; Ordovas, Jose M; Ramos, Erin M; Robinson, Peter N; Rubinstein, Wendy S; Seidman, Christine; Stranger, Barbara E; Wang, Haoyi; Westerfield, Monte; Bult, Carol
2017-03-23
Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations, we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. Published by Elsevier Inc.
Bedside Back to Bench: Building Bridges between Basic and Clinical Genomic Research
Manolio, Teri A.; Fowler, Douglas M.; Starita, Lea M.; Haendel, Melissa A.; MacArthur, Daniel G.; Biesecker, Leslie G.; Worthey, Elizabeth; Chisholm, Rex L.; Green, Eric D.; Jacob, Howard J.; McLeod, Howard L.; Roden, Dan; Rodriguez, Laura Lyman; Williams, Marc S.; Cooper, Gregory M.; Cox, Nancy J.; Herman, Gail E.; Kingsmore, Stephen; Lo, Cecilia; Lutz, Cathleen; MacRae, Calum A.; Nussbaum, Robert L.; Ordovas, Jose M.; Ramos, Erin M.; Robinson, Peter N.; Rubinstein, Wendy S.; Seidman, Christine; Stranger, Barbara E.; Wang, Haoyi; Westerfield, Monte; Bult, Carol
2017-01-01
Summary Genome sequencing has revolutionized the diagnosis of genetic diseases. Close collaborations between basic scientists and clinical genomicists are now needed to link genetic variants with disease causation. To facilitate such collaborations we recommend prioritizing clinically relevant genes for functional studies, developing reference variant-phenotype databases, adopting phenotype description standards, and promoting data sharing. PMID:28340351
USDA-ARS's Scientific Manuscript database
A polyphasic study was undertaken to establish the taxonomic status of Streptomyces strains isolated from arid Atacama Desert soils. Analysis of the 16S rRNA gene sequences of the isolates showed that they formed a well-defined lineage that was loosely associated with the type strains of several Str...
Park, Seong Chan; Choe, Han Na; Baik, Keun Sik; Lee, Kang Hyun; Seong, Chi Nam
2012-01-01
A rod-shaped, yellow and strictly aerobic marine bacterium, designated KYW382(T), was isolated from seawater collected from the South Sea, Republic of Korea. Cells were Gram-negative and catalase- and oxidase-positive. The major fatty acids were iso-C(15:1) G, iso-C(15:0), iso-C(17:0) 3-OH, iso-C(15:0) 3-OH and anteiso-C(15:0). The DNA G+C content was 32.4 mol%. A phylogenetic tree based on 16S rRNA gene sequences showed that strain KYW382(T) constituted an evolutionary lineage within the radiation enclosing the members of the genus Gaetbulibacter. The closest neighbour was Gaetbulibacter saemankumensis SMK-12(T) (96.1% 16S rRNA gene sequence similarity). A number of phenotypic characteristics distinguished strain KYW382(T) from the described members of the genus Gaetbulibacter. On the basis of the data presented in this study, strain KYW382(T) represents a novel species, for which the name Gaetbulibacter aestuarii sp. nov. is proposed. The type strain is KYW382(T) (=KCTC 23303(T) =JCM 17455(T)). An emended description of the genus Gaetbulibacter is also given.
Whole-exome/genome sequencing and genomics.
Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne
2013-12-01
As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.
Walker, Michael B; King, Benjamin L; Paigen, Kenneth
2012-01-01
Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Wang, Ying; Yang, Liandong; Wu, Bo; Song, Zhaobin; He, Shunping
2015-07-10
Triplophysa dalaica, a species endemic to the Qinghai-Tibetan Plateau, is informative for understanding the genetic basis of adaptation to hypoxic conditions of high altitude habitats. Here, a comprehensive gene repertoire for this plateau fish was generated using the Illumina deep paired-end high-throughput sequencing technology. De novo assembly yielded 145,256 unigenes with an average length of 1632 bp. Blast searches against GenBank non-redundant database annotated 74,594 (51.4%) unigenes encoding for 30,047 gene descriptions in T. dalaica. Functional annotation and classification of assembled sequences were performed using Gene Ontology (GO), clusters of euKaryotic Orthologous Groups (KOG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. After comparison with other fish transcriptomes, including silver carp (Hypophthalmichthys molitrix) and mud loach (Misgurnus anguillicaudatus), 2621 high-quality orthologous gene alignments were constructed among these species. 61 (2.3%) of the genes were identified as having undergone positive selection in the T. dalaica lineage. Within the positively selected genes, 13 genes were involved in hypoxia response, of which 11 were listed in HypoxiaDB. Furthermore, duplicated hif-α (hif-1αA/B and hif-2αA/B), EGLN1 and PPARA candidate genes involved in adaptation to hypoxia were identified in T. dalaica transcriptome. Branch-site model in PAML validated that hif-1αB and hif-2αA genes have undergone positive selection in T. dalaica. Finally, 37,501 simple sequence repeats (SSRs) and 19,497 high-quality single nucleotide polymorphisms (SNPs) were identified in T. dalaica. The identified SSR and SNP markers will facilitate genetic structure, population geography and ecological studies of Triplophysa fishes. Copyright © 2015 Elsevier B.V. All rights reserved.
O’Grady, Joseph F.; Hoelters, Laura S.; Swain, Martin T.
2016-01-01
Background Talitrus saltator is an amphipod crustacean that inhabits the supralittoral zone on sandy beaches in the Northeast Atlantic and Mediterranean. T. saltator exhibits endogenous locomotor activity rhythms and time-compensated sun and moon orientation, both of which necessitate at least one chronometric mechanism. Whilst their behaviour is well studied, currently there are no descriptions of the underlying molecular components of a biological clock in this animal, and very few in other crustacean species. Methods We harvested brain tissue from animals expressing robust circadian activity rhythms and used homology cloning and Illumina RNAseq approaches to sequence and identify the core circadian clock and clock-related genes in these samples. We assessed the temporal expression of these genes in time-course samples from rhythmic animals using RNAseq. Results We identified a comprehensive suite of circadian clock gene homologues in T. saltator including the ‘core’ clock genes period (Talper), cryptochrome 2 (Talcry2), timeless (Taltim), clock (Talclk), and bmal1 (Talbmal1). In addition we describe the sequence and putative structures of 23 clock-associated genes including two unusual, extended isoforms of pigment dispersing hormone (Talpdh). We examined time-course RNAseq expression data, derived from tissues harvested from behaviourally rhythmic animals, to reveal rhythmic expression of these genes with approximately circadian period in Talper and Talbmal1. Of the clock-related genes, casein kinase IIβ (TalckIIβ), ebony (Talebony), jetlag (Taljetlag), pigment dispersing hormone (Talpdh), protein phosphatase 1 (Talpp1), shaggy (Talshaggy), sirt1 (Talsirt1), sirt7 (Talsirt7) and supernumerary limbs (Talslimb) show temporal changes in expression. Discussion We report the sequences of principal genes that comprise the circadian clock of T. saltator and highlight the conserved structural and functional domains of their deduced cognate proteins. Our sequencing data contribute to the growing inventory of described comparative clocks. Expression profiling of the identified clock genes illuminates tantalising targets for experimental manipulation to elucidate the molecular and cellular control of clock-driven phenotypes in this crustacean. PMID:27761341
Glaberman, Scott; Du Pasquier, Louis; Caccone, Adalgisa
2008-01-01
Squamates are a diverse order of vertebrates, representing more than 7,000 species. Yet, descriptions of full-length major histocompatibility complex (MHC) genes in this group are nearly absent from the literature, while the number of MHC studies continues to rise in other vertebrate taxa. The lack of basic information about MHC organization in squamates inhibits investigation into the relationship between MHC polymorphism and disease, and leaves a large taxonomic gap in our understanding of amniote MHC evolution. Here, we use both cDNA and genomic sequence data to characterize a class I MHC gene (Amcr-UA) from the Galápagos marine iguana, a member of the squamate subfamily Iguaninae. Amcr-UA appears to be functional since it is expressed in the blood and contains many of the conserved peptide-binding residues that are found in classical class I genes of other vertebrates. In addition, comparison of Amcr-UA to homologous sequences from other iguanine species shows that the antigen-binding portion of this gene is under purifying selection, rather than balancing selection, and therefore may have a conserved function. A striking feature of Amcr-UA is that both the cDNA and genomic sequences lack the transmembrane and cytoplasmic domains that are necessary to anchor the class I receptor molecule into the cell membrane, suggesting that the product of this gene is secreted and consequently not involved in classical class I antigen-presentation. The truncated and conserved character of Amcr-UA lead us to define it as a nonclassical gene that is related to the few available squamate class I sequences. However, phylogenetic analysis placed Amcr-UA in a basal position relative to other published classical MHC genes from squamates, suggesting that this gene diverged near the beginning of squamate diversification. PMID:18682845
Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu
2012-01-01
Background Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, which is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgal species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results We performed the de novo assembly of E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acids, TAG and carotenoids biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions Our data provides the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provides a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352
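The abstract above reports both a mean unigene size and an N50 for the assembly. As a reminder of how the two statistics differ, here is a minimal Python sketch of the usual N50 definition (the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length). The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig/unigene lengths.

    Assumes the common definition: sort lengths from longest to
    shortest and return the length at which the cumulative sum
    first reaches at least half of the total.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy, made-up lengths (bp); not taken from the assembly described above.
toy_lengths = [100, 200, 300, 400, 500]
print("mean:", sum(toy_lengths) / len(toy_lengths))
print("N50:", n50(toy_lengths))
```

Because N50 is weighted by length, it is typically larger than the mean when an assembly contains many short sequences, which is consistent with the pattern reported above (mean 503 bp, N50 663 bp).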
Detailed transcriptome description of the neglected cestode Taenia multiceps.
Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou
2012-01-01
The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a reference genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echinococcus granulosus and Echinococcus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies.
Bossé, Janine T; Li, Yanwen; Walker, Stephanie; Atherton, Tom; Fernandez Crespo, Roberto; Williamson, Susanna M; Rogers, Jon; Chaudhuri, Roy R; Weinert, Lucy A; Oshota, Olusegun; Holden, Matt T G; Maskell, Duncan J; Tucker, Alexander W; Wren, Brendan W; Rycroft, Andrew N; Langford, Paul R
2015-08-01
The objective of this study was to determine the distribution and genetic basis of trimethoprim resistance in Actinobacillus pleuropneumoniae isolates from pigs in England. Clinical isolates collected between 1998 and 2011 were tested for resistance to trimethoprim and sulphonamide. The genetic basis of trimethoprim resistance was determined by shotgun WGS analysis and the subsequent isolation and sequencing of plasmids. A total of 16 (out of 106) A. pleuropneumoniae isolates were resistant to both trimethoprim (MIC >32 mg/L) and sulfisoxazole (MIC ≥256 mg/L), and a further 32 were resistant only to sulfisoxazole (MIC ≥256 mg/L). Genome sequence data for the trimethoprim-resistant isolates revealed the presence of the dfrA14 dihydrofolate reductase gene. The distribution of plasmid sequences in multiple contigs suggested the presence of two distinct dfrA14-containing plasmids in different isolates, which was confirmed by plasmid isolation and sequencing. Both plasmids encoded mobilization genes, the sulphonamide resistance gene sul2, as well as dfrA14 inserted into strA, a streptomycin-resistance-associated gene, although the gene order differed between the two plasmids. One of the plasmids further encoded the strB streptomycin-resistance-associated gene. This is the first description of mobilizable plasmids conferring trimethoprim resistance in A. pleuropneumoniae and, to our knowledge, the first report of dfrA14 in any member of the Pasteurellaceae. The identification of dfrA14 conferring trimethoprim resistance in A. pleuropneumoniae isolates will facilitate PCR screens for resistance to this important antimicrobial. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy.
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes
Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka
2008-01-01
Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/EMBL/DDBJ for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Ontology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at . The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
Detection of non-coding RNA in bacteria and archaea using the DETR'PROK Galaxy pipeline.
Toffano-Nioche, Claire; Luo, Yufei; Kuchly, Claire; Wallon, Claire; Steinbach, Delphine; Zytnicki, Matthias; Jacq, Annick; Gautheret, Daniel
2013-09-01
RNA-seq experiments are now routinely used for the large scale sequencing of transcripts. In bacteria or archaea, such deep sequencing experiments typically produce 10-50 million fragments that cover most of the genome, including intergenic regions. In this context, the precise delineation of the non-coding elements is challenging. Non-coding elements include untranslated regions (UTRs) of mRNAs, independent small RNA genes (sRNAs) and transcripts produced from the antisense strand of genes (asRNA). Here we present a computational pipeline (DETR'PROK: detection of ncRNAs in prokaryotes) based on the Galaxy framework that takes as input a mapping of deep sequencing reads and performs successive steps of clustering, comparison with existing annotation and identification of transcribed non-coding fragments classified into putative 5' UTRs, sRNAs and asRNAs. We provide a step-by-step description of the protocol using real-life example data sets from Vibrio splendidus and Escherichia coli. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
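The abstract above describes comparing clusters of mapped reads with existing annotation and sorting the transcribed non-coding fragments into putative 5' UTRs, sRNAs and asRNAs. The Python sketch below illustrates one simple way such a comparison could be expressed. It is not the DETR'PROK implementation: the `Interval` type, the `classify` function, the decision rules and the `max_gap` threshold are all hypothetical choices made for illustration only.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True if two intervals share at least one position (strand ignored)."""
    return a.start <= b.end and b.start <= a.end


def classify(cluster, genes, max_gap=20):
    """Assign a read cluster to a rough category (illustrative rules only).

    - overlaps a gene on the same strand      -> "annotated"
    - overlaps a gene on the opposite strand  -> "asRNA"
    - ends within max_gap bases upstream of a
      same-strand gene                        -> "5' UTR"
    - anything else (isolated, intergenic)    -> "sRNA"
    """
    hits = [g for g in genes if overlaps(cluster, g)]
    if any(g.strand == cluster.strand for g in hits):
        return "annotated"
    if hits:
        return "asRNA"
    for g in genes:
        if g.strand != cluster.strand:
            continue
        # Distance between the cluster and the gene's 5' end.
        if g.strand == "+":
            gap = g.start - cluster.end - 1
        else:
            gap = cluster.start - g.end - 1
        if 0 <= gap <= max_gap:
            return "5' UTR"
    return "sRNA"


# Made-up annotation and clusters, for demonstration only.
genes = [Interval(100, 500, "+"), Interval(800, 1200, "-")]
for c in [Interval(60, 95, "+"), Interval(300, 400, "-"), Interval(600, 700, "+")]:
    print(c, "->", classify(c, genes))
```

A real pipeline would also need to handle read-depth thresholds, minimum fragment sizes and operon structure, none of which are modelled here.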
Jiménez, Diego Javier; Andreote, Fernando Dini; Chaves, Diego; Montaña, José Salvador; Osorio-Forero, Cesar; Junca, Howard; Zambrano, María Mercedes; Baena, Sandra
2012-01-01
A taxonomic and annotated functional description of microbial life was deduced from 53 Mb of metagenomic sequence retrieved from a planktonic fraction of the Neotropical high Andean (3,973 meters above sea level) acidic hot spring El Coquito (EC). A classification of unassembled metagenomic reads using different databases showed a high proportion of Gammaproteobacteria and Alphaproteobacteria (in total read affiliation), and through taxonomic affiliation of 16S rRNA gene fragments we observed the presence of Proteobacteria, micro-algae chloroplast and Firmicutes. Reads mapped against the genomes Acidiphilium cryptum JF-5, Legionella pneumophila str. Corby and Acidithiobacillus caldus revealed the presence of transposase-like sequences, potentially involved in horizontal gene transfer. Functional annotation and hierarchical comparison with different datasets obtained by pyrosequencing in different ecosystems showed that the microbial community also contained extensive DNA repair systems, possibly to cope with ultraviolet radiation at such high altitudes. Analysis of genes involved in the nitrogen cycle indicated the presence of dissimilatory nitrate reduction to N2 (narGHI, nirS, norBCDQ and nosZ), associated with Proteobacteria-like sequences. Genes involved in the sulfur cycle (cysDN, cysNC and aprA) indicated adenylsulfate and sulfite production that were affiliated to several bacterial species. In summary, metagenomic sequence data provided insight regarding the structure and possible functions of this hot spring microbial community, describing some groups potentially involved in the nitrogen and sulfur cycling in this environment. PMID:23251687
Döcker, Dennis; Schubach, Max; Menzel, Moritz; Spaich, Christiane; Gabriel, Heinz-Dieter; Zenker, Martin; Bartholdi, Deborah; Biskup, Saskia
2015-01-01
Megalencephaly-capillary malformation (MCAP) syndrome is an overgrowth syndrome that is diagnosed by clinical criteria. Recently, somatic and germline variants in genes that are involved in the PI3K-AKT pathway (AKT3, PIK3R2 and PIK3CA) have been described to be associated with MCAP and/or other related megalencephaly syndromes. We performed trio-exome sequencing in a 6-year-old boy and his healthy parents. Clinical features were macrocephaly, cutis marmorata, angiomata, asymmetric overgrowth, developmental delay, discrete midline facial nevus flammeus, toe syndactyly and postaxial polydactyly—thus, clearly an MCAP phenotype. Exome sequencing revealed a pathogenic de novo germline variant in the PTPN11 gene (c.1529A>G; p.(Gln510Arg)), which has so far been associated with Noonan, as well as LEOPARD syndrome. Whole-exome sequencing (>100 × coverage) did not reveal any alteration in the known megalencephaly genes. However, ultra-deep sequencing results from saliva (>1000 × coverage) revealed a 22% mosaic variant in PIK3CA (c.2740G>A; p.(Gly914Arg)). To our knowledge, this report is the first description of a PTPN11 germline variant in an MCAP patient. Data from experimental studies show a complex interaction of SHP2 (gene product of PTPN11) and the PI3K-AKT pathway. We hypothesize that certain PTPN11 germline variants might drive toward additional second-hit alterations. PMID:24939587
Döcker, Dennis; Schubach, Max; Menzel, Moritz; Spaich, Christiane; Gabriel, Heinz-Dieter; Zenker, Martin; Bartholdi, Deborah; Biskup, Saskia
2015-03-01
Megalencephaly-capillary malformation (MCAP) syndrome is an overgrowth syndrome that is diagnosed by clinical criteria. Recently, somatic and germline variants in genes that are involved in the PI3K-AKT pathway (AKT3, PIK3R2 and PIK3CA) have been described to be associated with MCAP and/or other related megalencephaly syndromes. We performed trio-exome sequencing in a 6-year-old boy and his healthy parents. Clinical features were macrocephaly, cutis marmorata, angiomata, asymmetric overgrowth, developmental delay, discrete midline facial nevus flammeus, toe syndactyly and postaxial polydactyly--thus, clearly an MCAP phenotype. Exome sequencing revealed a pathogenic de novo germline variant in the PTPN11 gene (c.1529A>G; p.(Gln510Arg)), which has so far been associated with Noonan, as well as LEOPARD syndrome. Whole-exome sequencing (>100 × coverage) did not reveal any alteration in the known megalencephaly genes. However, ultra-deep sequencing results from saliva (>1000 × coverage) revealed a 22% mosaic variant in PIK3CA (c.2740G>A; p.(Gly914Arg)). To our knowledge, this report is the first description of a PTPN11 germline variant in an MCAP patient. Data from experimental studies show a complex interaction of SHP2 (gene product of PTPN11) and the PI3K-AKT pathway. We hypothesize that certain PTPN11 germline variants might drive toward additional second-hit alterations.
Gruenstaeudl, Michael; Gerschler, Nico; Borsch, Thomas
2018-06-21
The sequencing and comparison of plastid genomes are becoming a standard method in plant genomics, and many researchers are using this approach to infer plant phylogenetic relationships. Due to the widespread availability of next-generation sequencing, plastid genome sequences are being generated at breakneck pace. This trend towards massive sequencing of plastid genomes highlights the need for standardized bioinformatic workflows. In particular, documentation and dissemination of the details of genome assembly, annotation, alignment and phylogenetic tree inference are needed, as these processes are highly sensitive to the choice of software and the precise settings used. Here, we present the procedure and results of sequencing, assembling, annotating and quality-checking of three complete plastid genomes of the aquatic plant genus Cabomba as well as subsequent gene alignment and phylogenetic tree inference. We accompany our findings by a detailed description of the bioinformatic workflow employed. Importantly, we share a total of eleven software scripts for each of these bioinformatic processes, enabling other researchers to evaluate and replicate our analyses step by step. The results of our analyses illustrate that the plastid genomes of Cabomba are highly conserved in both structure and gene content.
McGregor, Glenn B; Sendall, Barbara C
2015-02-01
Three populations of the freshwater filamentous cyanobacterium Lyngbya wollei (Farlow ex Gomont) Speziale and Dyck have been putatively identified from north-eastern Australia and found to produce the potent cyanotoxin cylindrospermopsin (CYN) and its analog deoxy-cylindrospermopsin (deoxy-CYN). We investigated the phylogeny and toxicology of strains and mats isolated from two of these populations using a combination of molecular and morphological techniques. Morphologically, the strains corresponded to the type description; however, the frequency of false-branching was low and variable over time. Strains and mat samples from both sites were positive for the cyrF and cyrJ genes associated with CYN biosynthesis. Phylogenetic analysis of these genes from Australian L. wollei sequences and comparable cyanobacterial sequences revealed that the genes in L. wollei were more closely related to homologous genes in Oscillatoria sp. PCC 6506 than to homologs in Nostocalean CYN-producers. These data suggest a common evolutionary origin of CYN biosynthesis in L. wollei and Oscillatoria. In both the 16S rRNA and nifH phylogenies, the Australian L. wollei strains formed well-supported clades with United States L. wollei (= Plectonema wollei) strains. Pair-wise sequence similarities within the 16S rRNA clade containing all eleven L. wollei strains were high, ranging from 97% to 100%. This group was distantly related (<92% nucleotide similarity) to other taxa within the group previously considered under the genus Lyngbya sensu lato (C. Agardh ex Gomont). Collectively, these results suggest that this toxigenic group is evolutionarily distinct and sufficiently distant to be considered a separate genus, which we have described as Microseira gen. nov., and to which we transfer the type as M. wollei comb. nov. © 2014 State of Queensland. Journal of Phycology © 2014 Phycological Society of America.
Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou
2016-01-01
Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. 
Taken together, this article highlights the high suitability of contemporary sequencing methods in future analyses of human biology in relation to evolutionary acquired retroviruses in the human genome. © 2016 APMIS. Published by John Wiley & Sons Ltd.
Lopes-Santos, Lucilene; Castro, Daniel Bedo Assumpção; Ferreira-Tonin, Mariana; Corrêa, Daniele Bussioli Alves; Weir, Bevan Simon; Park, Duckchul; Ottoboni, Laura Maria Mariscal; Neto, Júlio Rodrigues; Destéfano, Suzete Aparecida Lanza
2017-06-01
The phylogenetic classification of the species Burkholderia andropogonis within the Burkholderia genus was reassessed using 16S rRNA gene phylogenetic analysis and multilocus sequence analysis (MLSA). Both phylogenetic trees revealed two main groups, named A and B, strongly supported by high bootstrap values (100%). Group A encompassed all of the Burkholderia species complex, while Group B comprised only B. andropogonis, which showed low percentage similarities with other species of the genus: from 92 to 95% for 16S rRNA gene sequences and 83% for conserved gene sequences. Average nucleotide identity (ANI), tetranucleotide signature frequency, and percentage of conserved proteins (POCP) analyses were also carried out, and in all three analyses B. andropogonis showed lower values when compared to the other members of the Burkholderia species complex: near 71% for ANI, from 0.484 to 0.724 for tetranucleotide signature frequency, and around 50% for POCP, reinforcing the distance observed in the phylogenetic analyses. Our findings provide an important insight into the taxonomy of B. andropogonis. It is clear from the results that this bacterial species exhibits genotypic differences and represents a new genus described herein as Robbsia andropogonis gen. nov., comb. nov.
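The tetranucleotide signature comparison mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: it correlates raw 4-mer frequencies of two sequences, whereas published implementations typically correlate z-scores that correct for expected frequencies. The function names `tetra_profile` and `pearson` are hypothetical.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_profile(seq):
    """Return relative frequencies of overlapping 4-mers (hypothetical helper).

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # correlation undefined for a constant profile
    return cov / sqrt(vx * vy)
```

Under these assumptions, `pearson(tetra_profile(genome_a), tetra_profile(genome_b))` gives a value near 1 for genomes with similar oligonucleotide usage and lower values for more distant ones.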
2012-01-01
Introduction Traditionally, genomic or transcriptomic data have been restricted to a few model or emerging model organisms, and to a handful of species of medical and/or environmental importance. Next-generation sequencing techniques have the capability of yielding massive amounts of gene sequence data for virtually any species at a modest cost. Here we provide a comparative analysis of de novo assembled transcriptomic data for ten non-model species of previously understudied animal taxa. Results cDNA libraries of ten species belonging to five animal phyla (2 Annelida [including Sipuncula], 2 Arthropoda, 2 Mollusca, 2 Nemertea, and 2 Porifera) were sequenced in different batches with an Illumina Genome Analyzer II (read length 100 or 150 bp), rendering between ca. 25 and 52 million reads per species. Read thinning, trimming, and de novo assembly were performed under different parameters to optimize output. Between 67,423 and 207,559 contigs were obtained across the ten species, post-optimization. Of those, 9,069 to 25,681 contigs retrieved blast hits against the NCBI non-redundant database, and approximately 50% of these were assigned with Gene Ontology terms, covering all major categories, and with similar percentages in all species. Local blasts against our datasets, using selected genes from major signaling pathways and housekeeping genes, revealed high efficiency in gene recovery compared to available genomes of closely related species. Intriguingly, our transcriptomic datasets detected multiple paralogues in all phyla and in nearly all gene pathways, including housekeeping genes that are traditionally used in phylogenetic applications for their purported single-copy nature. 
Conclusions We generated the first study of comparative transcriptomics across multiple animal phyla (comparing two species per phylum in most cases), established the first Illumina-based transcriptomic datasets for sponge, nemertean, and sipunculan species, and generated a tractable catalogue of annotated genes (or gene fragments) and protein families for ten newly sequenced non-model organisms, some of commercial importance (i.e., Octopus vulgaris). These comprehensive sets of genes can be readily used for phylogenetic analysis, gene expression profiling, developmental analysis, and can also be a powerful resource for gene discovery. The characterization of the transcriptomes of such a diverse array of animal species permitted the comparison of sequencing depth, functional annotation, and efficiency of genomic sampling using the same pipelines, which proved to be similar for all considered species. In addition, the datasets revealed their potential as a resource for paralogue detection, a recurrent concern in various aspects of biological inquiry, including phylogenetics, molecular evolution, development, and cellular biochemistry. PMID:23190771
Cherkaoui Jaouad, Imane; El Alloussi, Mustapha; Chafai El Alaoui, Siham; Laarabi, Fatima Zahra; Lyahyai, Jaber; Sefiani, Abdelaziz
2015-01-30
Amelogenesis imperfecta represents a group of developmental conditions, clinically and genetically heterogeneous, that affect the structure and clinical appearance of enamel. Amelogenesis imperfecta occurs as an isolated trait or as part of a genetic syndrome. Recently, disease-causing mutations in the FAM20A gene were identified in families with an autosomal recessive syndrome associating amelogenesis imperfecta and gingival fibromatosis. We report the first description of a Moroccan patient with amelogenesis imperfecta and gingival fibromatosis, in whom we performed Sanger sequencing of the entire coding sequence of FAM20A and identified a homozygous mutation in the FAM20A gene (c.34_35delCT), already reported in a family with this syndrome. Our finding confirms that mutations of the FAM20A gene are causative for amelogenesis imperfecta and gingival fibromatosis and underlines the recurrent character of the c.34_35delCT mutation in two different ethnic groups.
Lagier, Jean-Christophe; Elkarkouri, Khalid; Rivet, Romain; Couderc, Carine; Raoult, Didier; Fournier, Pierre-Edouard
2013-01-01
Senegalemassilia anaerobia strain JC110T sp.nov. is the type strain of Senegalemassilia anaerobia gen. nov., sp. nov., the type species of a new genus within the Coriobacteriaceae family, Senegalemassilia gen. nov. This strain, whose genome is described here, was isolated from the fecal flora of a healthy Senegalese patient. S. anaerobia is a Gram-positive anaerobic coccobacillus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,383,131 bp long genome contains 1,932 protein-coding and 58 RNA genes. PMID:24019984
The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).
Huntemann, Marcel; Ivanova, Natalia N; Mavromatis, Konstantinos; Tripp, H James; Paez-Espino, David; Palaniappan, Krishnaveni; Szeto, Ernest; Pillay, Manoj; Chen, I-Min A; Pati, Amrita; Nielsen, Torben; Markowitz, Victor M; Kyrpides, Nikos C
2015-01-01
The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.
Phylogenetic position and emended description of the genus Methylovorus.
Doronina, Nina V; Ivanova, Ekaterina G; Trotsenko, Yuri A
2005-03-01
The genus Methylovorus, currently represented by the restricted facultative methylotroph Methylovorus glucosotrophus Govorukhina and Trotsenko 1991 and the obligate methylotroph Methylovorus mays Doronina et al. 2001, is here established by direct sequencing of amplified 16S rRNA genes and DNA-DNA hybridization to be clearly separated from the extant ribulose monophosphate (RuMP) pathway methylobacteria and to form a distinct branch within the beta-Proteobacteria.
Systematics of Hypocrea citrina and related taxa
Overton, Barrie E.; Stewart, Elwin L.; Geiser, David M.; Jaklitsch, Walter M.
2006-01-01
Morphological studies and phylogenetic analyses of DNA sequences from three genomic regions – the internal transcribed spacer (ITS) regions of the nuclear ribosomal gene repeat, a partial sequence of RNA polymerase II subunit (rpb2), and a partial sequence of translation elongation factor (tef1) – were used to investigate the systematics of Hypocrea citrina and related species. A neotype specimen is designated for H. citrina that conforms to Persoon's description of a yellow effuse fungus occurring on leaf litter. Historical information and results obtained in this study provide the foundation for selection of a lectotype specimen from Fries's herbarium for H. lactea. The results indicate that (1) Hypocrea citrina and H. pulvinata are distinct species; (2) H. lactea sensu Fries is a synonym of the older name H. citrina; (3) H. pulvinata, H. protopulvinata, and H. americana are phylogenetically distinct species that form a well-supported polyporicolous clade; (4) H. citrina is situated in a clade closely related to H. pulvinata; and (5) H. microcitrina and H. pseudostraminea reside in a highly supported clade phylogenetically distinct from H. citrina. Hypocrea protopulvinata, H. microcitrina, H. megalocitrina, H. pseudostraminea, and a new species, H. aurantiistroma, are reported and described from North America. Variation in rpb2 and tef1 gene sequences suggests geographical subgroupings between European and North American isolates of H. pulvinata. The phylogenies inferred from ITS, rpb2, and tef1 gene sequences are concordant. Hypocrea citrina var. americana is elevated to species status, Hypocrea americana. PMID:18490988
Zhang, Linshuang; Li, Xiangyang; Zhang, Feng; Wang, Gejiao
2014-01-01
Agrobacterium radiobacter is the only known non-phytopathogenic species in the genus Agrobacterium. In this study, the whole-genome sequence of A. radiobacter type strain DSM 30147T was described and compared to the other available Agrobacterium genomes. This bacterium has a genome size of 7,122,065 bp distributed in 612 contigs, including 6,834 protein-coding genes and 41 RNA genes. It harbors a circular chromosome and a linear chromosome but not a tumor-inducing (Ti) plasmid. To the best of our knowledge, this is the first report of a genome from the A. radiobacter species. In addition, an emended description of A. radiobacter is provided. This study reveals information that enhances the current understanding of its non-phytopathogenicity and its phylogenetic position within the genus Agrobacterium. PMID:25197445
Díaz-Cárdenas, Carolina; López, Gina; Alzate-Ocampo, José David; González, Laura N; Shapiro, Nicole; Woyke, Tanja; Kyrpides, Nikos C; Restrepo, Silvia; Baena, Sandra
2017-01-01
A bacterium belonging to the phylum Synergistetes, genus Dethiosulfovibrio, was isolated in 2007 from a saline spring in Colombia. Dethiosulfovibrio salsuginis USBA 82T (DSM 21565T = KCTC 5659T) is a mesophilic, strictly anaerobic, slightly halophilic, Gram-negative bacterium with a diderm cell envelope. The strain ferments peptides, amino acids and a few organic acids. Here we present the description of the complete genome sequencing and annotation of the type species Dethiosulfovibrio salsuginis USBA 82T. The genome consisted of 2.68 Mbp with a G+C content of 53.7%. A total of 2609 genes were predicted and of those, 2543 were protein-coding genes and 66 were RNA genes. We detected in the USBA 82T genome six Synergistetes conserved signature indels (CSIs), specific for Jonquetella, Pyramidobacter and Dethiosulfovibrio. The genome of D. salsuginis contained, as expected, genes related to amino acid transport, amino acid metabolism and thiosulfate reduction. These genes represent the major gene groups of Synergistetes, related to their phenotypic traits, and interestingly, 11.8% of the genes in the genome belonged to the amino acid fermentation COG category. In addition, we identified in the genome some ammonification genes such as nitrate reductase genes. The presence of proline operon genes could be related to de novo synthesis of proline to protect the cell in response to high osmolarity. Our bioinformatics workflow included antiSMASH and BAGEL3, which allowed us to identify bacteriocin genes in the genome.
2009-01-01
Background Chickpea (Cicer arietinum L.), an important grain legume crop of the world is seriously challenged by terminal drought and salinity stresses. However, very limited number of molecular markers and candidate genes are available for undertaking molecular breeding in chickpea to tackle these stresses. This study reports generation and analysis of comprehensive resource of drought- and salinity-responsive expressed sequence tags (ESTs) and gene-based markers. Results A total of 20,162 (18,435 high quality) drought- and salinity- responsive ESTs were generated from ten different root tissue cDNA libraries of chickpea. Sequence editing, clustering and assembly analysis resulted in 6,404 unigenes (1,590 contigs and 4,814 singletons). Functional annotation of unigenes based on BLASTX analysis showed that 46.3% (2,965) had significant similarity (≤1E-05) to sequences in the non-redundant UniProt database. BLASTN analysis of unique sequences with ESTs of four legume species (Medicago, Lotus, soybean and groundnut) and three model plant species (rice, Arabidopsis and poplar) provided insights on conserved genes across legumes as well as novel transcripts for chickpea. Of 2,965 (46.3%) significant unigenes, only 2,071 (32.3%) unigenes could be functionally categorised according to Gene Ontology (GO) descriptions. A total of 2,029 sequences containing 3,728 simple sequence repeats (SSRs) were identified and 177 new EST-SSR markers were developed. Experimental validation of a set of 77 SSR markers on 24 genotypes revealed 230 alleles with an average of 4.6 alleles per marker and average polymorphism information content (PIC) value of 0.43. Besides SSR markers, 21,405 high confidence single nucleotide polymorphisms (SNPs) in 742 contigs (with ≥ 5 ESTs) were also identified. Recognition sites for restriction enzymes were identified for 7,884 SNPs in 240 contigs. 
Hierarchical clustering of 105 selected contigs provided clues about stress- responsive candidate genes and their expression profile showed predominance in specific stress-challenged libraries. Conclusion Generated set of chickpea ESTs serves as a resource of high quality transcripts for gene discovery and development of functional markers associated with abiotic stress tolerance that will be helpful to facilitate chickpea breeding. Mapping of gene-based markers in chickpea will also add more anchoring points to align genomes of chickpea and other legume species. PMID:19912666
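The polymorphism information content (PIC) reported for the SSR markers in the abstract above is conventionally computed from allele frequencies as PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The following is a minimal sketch of that standard formula, assuming the allele frequencies of a single marker are already known; the function name `pic` and the example frequencies are illustrative and not taken from the study.

```python
def pic(freqs):
    """Polymorphism information content for one marker.

    `freqs` is a list of allele frequencies that must sum to 1.
    Implements PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    squares = [p * p for p in freqs]
    homozygosity = sum(squares)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - homozygosity - cross


# Illustrative frequencies only: two equally common alleles.
example = pic([0.5, 0.5])
```

A monomorphic marker (one allele at frequency 1) gives a PIC of 0, and the value rises as alleles become more numerous and more evenly distributed.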
Ibarra-Laclette, Enrique; Méndez-Bravo, Alfonso; Pérez-Torres, Claudia Anahí; Albert, Victor A; Mockaitis, Keithanne; Kilaru, Aruna; López-Gómez, Rodolfo; Cervantes-Luevano, Jacob Israel; Herrera-Estrella, Luis
2015-08-13
Avocado (Persea americana) is an economically important tropical fruit considered to be a good source of fatty acids. Despite its importance, the molecular and cellular characterization of biochemical and developmental processes in avocado is limited due to the lack of transcriptome and genomic information. The transcriptomes of seeds, roots, stems, leaves, aerial buds and flowers were determined using different sequencing platforms. Additionally, the transcriptomes of three different stages of fruit ripening (pre-climacteric, climacteric and post-climacteric) were also analyzed. The analysis of the RNAseqatlas presented here reveals strong differences in gene expression patterns between different organs, especially between root and flower, but also reveals similarities among the gene expression patterns in other organs, such as stem, leaves and aerial buds (vegetative organs) or seed and fruit (storage organs). Important regulators, functional categories, and differentially expressed genes involved in avocado fruit ripening were identified. Additionally, to demonstrate the utility of the avocado gene expression atlas, we investigated the expression patterns of genes implicated in fatty acid metabolism and fruit ripening. A description of transcriptomic changes occurring during fruit ripening was obtained in Mexican avocado, contributing to a dynamic view of the expression patterns of genes involved in fatty acid biosynthesis and the fruit ripening process.
Generation of TALE-Based Designer Epigenome Modifiers.
Nitsch, Sandra; Mussolino, Claudio
2018-01-01
Manipulation of gene expression can be facilitated by editing the genome or the epigenome. Precise genome editing is traditionally achieved by using designer nucleases, which are generally exploited to eliminate a specific gene product. Upon the introduction of a site-specific DNA double-strand break (DSB) by the nuclease, endogenous DSB repair mechanisms are in turn harnessed to induce DNA sequence changes that can result in target gene inactivation. Minimal off-target effects can be obtained by endowing designer nucleases with the highly specific DNA-binding domain (DBD) derived from transcription activator-like effectors (TALEs). In contrast, epigenome editing allows gene expression control without inducing changes in the DNA sequence by specifically altering epigenetic marks, such as histone tail modifications or DNA methylation patterns within promoter or enhancer regions. Importantly, this approach allows both up- and downregulation of the target gene expression, and the effect is generally reversible. TALE-based designer epigenome modifiers combine the high specificity of TALE-derived DBDs with the power of epigenetic modifier domains to induce fast and long-lasting changes in the epigenetic landscape of a target gene and control its expression. Here we provide a detailed description for the generation of TALE-based designer epigenome modifiers and of a suitable reporter cell line to easily monitor their activity.
Molecular epidemiology of goat pox viruses.
Roy, P; Jaisree, S; Balakrishnan, S; Senthilkumar, K; Mahaprabhu, R; Mishra, A; Maity, B; Ghosh, T K; Karmakar, A P
2018-02-01
Goat pox disease outbreaks were observed in different places, affecting Black Bengal goats in West Bengal (WB) and Tellicherry, Vembur and non-descript breeds in Tamil Nadu (TN), causing severe lesions and mortality of up to 30%. Clinical specimens from all the outbreaks were screened by polymerase chain reaction followed by restriction fragment length polymorphism (PCR-RFLP), which confirmed the disease as goat pox. Virus isolation in the Vero cell line was carried out with ten randomly selected samples; cytopathic effects (CPE) characterized by syncytia and intracytoplasmic inclusion bodies were observed after several blind passages. Nucleotide sequencing of the complete p32 gene from two randomly selected isolates and three clinical specimens revealed the presence of goat pox virus (GTPV)-specific signature residues in all the sequences. Phylogenetic analysis using the present five sequences along with GenBank data of GTPV complete p32 gene sequences showed that all the GTPV sequences cluster together except the Pellor strain (NC004003) and the FZ Chinese strain (KC951854). The five sequences, whether from WB or TN, cluster more closely with GTPV isolates of Maharashtra state that were responsible for a cross-species outbreak of pox disease in both sheep (KF468759) and goats (KF468762) in India during the year 2010. All the Indian goat pox viruses, including the Mukteswar strain, isolated in 1946 with its sequence reported in 2004, clustered together with the GTPVs causing the recent outbreaks. It was observed that GTPVs caused similar clinical manifestations irrespective of geographical location and breed characteristics, that no variation was seen among the Indian isolates based on the p32 gene over a period of seventy years, and that disease outbreaks were not observed or reported in vaccinated goats. © 2017 Blackwell Verlag GmbH.
Calduch-Giner, Josep A.; Sitjà-Bobadilla, Ariadna; Pérez-Sánchez, Jaume
2016-01-01
High-quality sequencing reads from the intestine of European sea bass were assembled, annotated by similarity against protein reference databases and combined with nucleotide sequences from public and private databases. After redundancy filtering, 24,906 non-redundant annotated sequences encoding 15,367 different gene descriptions were obtained. These annotated sequences were used to design a custom, high-density oligo-microarray (8 × 15 K) for the transcriptomic profiling of anterior (AI), middle (MI), and posterior (PI) intestinal segments. Similar molecular signatures were found for AI and MI segments, which were combined in a single group (AI-MI), whereas the PI stood out separately, with more than 1900 differentially expressed genes at a fold-change cutoff of 2. Functional analysis revealed that molecular and cellular functions related to feed digestion and nutrient absorption and transport were over-represented in AI-MI segments. By contrast, the initiation and establishment of immune defense mechanisms became especially relevant in PI, although the microarray expression profiling validated by qPCR indicated that these functional changes are gradual from anterior to posterior intestinal segments. This functional divergence occurred in association with spatial transcriptional changes in nutrient transporters and the mucosal chemosensing system via G protein-coupled receptors. These findings help to identify key indicators of gut functions and to compare different fish feeding strategies and immune defense mechanisms acquired during the evolution of teleosts. PMID:27610085
Idris, Hamidah; Labeda, David P; Nouioui, Imen; Castro, Jean Franco; Del Carmen Montero-Calasanz, Maria; Bull, Alan T; Asenjo, Juan A; Goodfellow, Michael
2017-05-01
A polyphasic study was undertaken to determine the taxonomic status of a Streptomyces strain which had been isolated from a high altitude Atacama Desert soil and shown to have bioactive properties. The strain, isolate H9T, was found to have chemotaxonomic, cultural and morphological properties that place it in the genus Streptomyces. 16S rRNA gene sequence analyses showed that the isolate forms a distinct branch at the periphery of a well-delineated subclade in the Streptomyces 16S rRNA gene tree together with the type strains of Streptomyces crystallinus, Streptomyces melanogenes and Streptomyces noboritoensis. Multi-locus sequence analysis (MLSA) based on five house-keeping gene alleles showed that isolate H9T is closely related to the latter two type strains and to Streptomyces polyantibioticus NRRL B-24448T. The isolate was distinguished readily from the type strains of S. melanogenes, S. noboritoensis and S. polyantibioticus using a combination of phenotypic properties. Consequently, the isolate is considered to represent a new species of Streptomyces for which the name Streptomyces aridus sp. nov. is proposed; the type strain is H9T (=NCIMB 14965T =NRRL B65268T). In addition, the MLSA and phenotypic data show that the S. melanogenes and S. noboritoensis type strains belong to a single species; it is therefore proposed that S. melanogenes be recognised as a heterotypic synonym of S. noboritoensis, for which an emended description is given.
Laurie, Andrew D.; Lloyd-Jones, Gareth
1999-01-01
Cloning and molecular ecological studies have underestimated the diversity of polycyclic aromatic hydrocarbon (PAH) catabolic genes by emphasizing classical nah-like (nah, ndo, pah, and dox) sequences. Here we report the description of a divergent set of PAH catabolic genes, the phn genes, which although isofunctional to the classical nah-like genes, show very low homology. This phn locus, which contains nine open reading frames (ORFs), was isolated on an 11.5-kb HindIII fragment from phenanthrene-degrading Burkholderia sp. strain RP007. The phn genes are significantly different in sequence and gene order from previously characterized genes for PAH degradation. They are transcribed by RP007 when grown at the expense of either naphthalene or phenanthrene, while in Escherichia coli the recombinant phn enzymes have been shown to be capable of oxidizing both naphthalene and phenanthrene to predicted metabolites. The locus encodes iron sulfur protein α and β subunits of a PAH initial dioxygenase but lacks the ferredoxin and reductase components. The dihydrodiol dehydrogenase of the RP007 pathway, PhnB, shows greater similarity to analogous dehydrogenases from described biphenyl pathways than to those characterized from naphthalene/phenanthrene pathways. An unusual extradiol dioxygenase, PhnC, shows no similarity to other extradiol dioxygenases for naphthalene or biphenyl oxidation but is the first member of the recently proposed class III extradiol dioxygenases that is specific for polycyclic arene diols. Upstream of the phn catabolic genes are two putative regulatory genes, phnR and phnS. Sequence homology suggests that phnS is a LysR-type transcriptional activator and that phnR, which is divergently transcribed with respect to phnSFECDAcAdB, is a member of the σ54-dependent family of positive transcriptional regulators. 
Reverse transcriptase PCR experiments suggest that this gene cluster is coordinately expressed and is under regulatory control which may involve PhnR and PhnS. PMID:9882667
Xu, Xing-Li; Cheng, Tian-Yin; Yang, Hu; Yan, Fen; Yang, Ya
2015-06-01
Saliva plays an important role in feeding and pathogen transmission, and the identification and analysis of tick salivary gland (SG) proteins is considered a hot spot in anti-tick research. Herein, we present the first description of the SG transcriptome of Haemaphysalis flava using next-generation sequencing (NGS). A total of over 143 million high-quality reads were assembled into 54,357 unigenes, of which 20,145 (37.06%) had significant similarities to proteins in the Swiss-Prot database. 13,513 annotated sequences were associated with GO terms. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that 14,280 unigenes were assigned to 279 KEGG pathways in total. Reads per kb per million reads (RPKM) analysis showed that there were 3035 down-regulated unigenes and 2260 up-regulated unigenes in the engorged ticks (ET) compared with the semi-engorged ticks (SET). Several important genes associated with blood feeding and ingestion as secreted salivary proteins, including cysteine, longipain, 4D8, calreticulin, metalloproteases, serine protease inhibitor, enolase, heat shock protein and AV422 in SG, were identified. The qRT-PCR results confirmed that the expression patterns of these genes (except for the longipain gene) were consistent with the RNA-seq results. This de novo assembly of the SG transcriptome of H. flava not only provides more opportunities for screening and cloning functional genes, but also forms a solid basis for further insight into the changes of salivary proteins during blood-feeding. Copyright © 2015 Elsevier B.V. All rights reserved.
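The RPKM measure named in the abstract above normalizes a read count by transcript length and sequencing depth: RPKM = reads × 10⁹ / (transcript length in bp × total mapped reads). The sketch below shows that standard formula together with a log2 fold change between two conditions; the function names, the pseudocount, and the example numbers are assumptions for illustration and do not come from the study.

```python
from math import log2


def rpkm(reads, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return reads * 1e9 / (length_bp * total_mapped_reads)


def log2_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2 ratio of condition A over condition B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for unigenes with no reads in one condition.
    """
    return log2((rpkm_a + pseudocount) / (rpkm_b + pseudocount))


# Hypothetical unigene: 500 reads, 1,000 bp long, library of 10 million reads.
example_rpkm = rpkm(500, 1000, 10_000_000)
```

With this convention, a positive log2 fold change for engorged over semi-engorged libraries would correspond to an up-regulated unigene, and a negative value to a down-regulated one.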
Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A
2018-04-01
Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes, which can contribute to increased pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study, with a multidrug-resistant profile and a high-frequency PFGE clonal profile, which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. We show evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype that may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help future molecular epidemiological studies. The analysis described here provides a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.
Hirst, Marissa B.; Kita, Kelley N.; Dawson, Scott C.
2011-01-01
Protists have traditionally been identified by cultivation and classified taxonomically based on their cellular morphologies and behavior. In the past decade, however, many novel protist taxa have been identified using cultivation independent ssu rRNA sequence surveys. New rRNA “phylotypes” from uncultivated eukaryotes have no connection to the wealth of prior morphological descriptions of protists. To link phylogenetically informative sequences with taxonomically informative morphological descriptions, we demonstrate several methods for combining whole cell rRNA-targeted fluorescent in situ hybridization (FISH) with cytoskeletal or organellar immunostaining. Either eukaryote or ciliate-specific ssu rRNA probes were combined with an anti-α-tubulin antibody or phalloidin, a common actin stain, to define cytoskeletal features of uncultivated protists in several environmental samples. The eukaryote ssu rRNA probe was also combined with Mitotracker® or a hydrogenosomal-specific anti-Hsp70 antibody to localize mitochondria and hydrogenosomes, respectively, in uncultivated protists from different environments. Using rRNA probes in combination with immunostaining, we linked ssu rRNA phylotypes with microtubule structure to describe flagellate and ciliate morphology in three diverse environments, and linked Naegleria spp. to their amoeboid morphology using actin staining in hay infusion samples. We also linked uncultivated ciliates to morphologically similar Colpoda-like ciliates using tubulin immunostaining with a ciliate-specific rRNA probe. Combining rRNA-targeted FISH with cytoskeletal immunostaining or stains targeting specific organelles provides a fast, efficient, high throughput method for linking genetic sequences with morphological features in uncultivated protists. 
When linked to phylotype, morphological descriptions of protists can both complement and vet the increasing number of sequences from uncultivated protists, including those of novel lineages, identified in diverse environments. PMID:22174774
Phylogenomics and taxonomy of Lecomtelleae (Poaceae), an isolated panicoid lineage from Madagascar
Besnard, Guillaume; Christin, Pascal-Antoine; Malé, Pierre-Jean G.; Coissac, Eric; Ralimanana, Hélène; Vorontsova, Maria S.
2013-01-01
Background and Aims An accurate characterization of biodiversity requires analyses of DNA sequences in addition to classical morphological descriptions. New methods based on high-throughput sequencing may allow investigation of specimens with a large set of genetic markers to infer their evolutionary history. In the grass family, the phylogenetic position of the monotypic genus Lecomtella, a rare bamboo-like endemic from Madagascar, has never been appropriately evaluated. Until now its taxonomic treatment has remained controversial, indicating the need for re-evaluation based on a combination of molecular and morphological data. Methods The phylogenetic position of Lecomtella in Poaceae was evaluated based on sequences from the nuclear and plastid genomes generated by next-generation sequencing (NGS). In addition, a detailed morphological description of L. madagascariensis was produced, and its distribution and habit were investigated in order to assess its conservation status. Key Results The complete plastid sequence, a ribosomal DNA unit and fragments of low-copy nuclear genes (phyB and ppc) were obtained. All phylogenetic analyses place Lecomtella as an isolated member of the core panicoids, which last shared a common ancestor with other species >20 million years ago. Although Lecomtella exhibits morphological characters typical of Panicoideae, an unusual combination of traits supports its treatment as a separate group. Conclusions The study showed that NGS can be used to generate abundant phylogenetic information rapidly, opening new avenues for grass phylogenetics. These data clearly showed that Lecomtella forms an isolated lineage, which, in combination with its morphological peculiarities, justifies its treatment as a separate tribe: Lecomtelleae. New descriptions of the tribe, genus and species are presented with a typification, a distribution map and an IUCN conservation assessment. PMID:23985988
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas
2018-01-01
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. PMID:29149270
Characterization and functional analyses of a novel chicken CD8a variant X1 (CD8a1)
USDA-ARS's Scientific Manuscript database
We provide the first description of cloning, as well as structural and functional analysis of a novel variant in the chicken CD8alpha family, termed the CD8-alpha X1 (CD8alpha1) gene. Multiple alignment of CD8alpha1 with known CD8alpha and beta sequences of other species revealed relatively low con...
Stepwise evolution of pandrug-resistance in Klebsiella pneumoniae
Zowawi, Hosam M.; Forde, Brian M.; Alfaresi, Mubarak; Alzarouni, Abdulqadir; Farahat, Yasser; Chong, Teik-Min; Yin, Wai-Fong; Chan, Kok-Gan; Li, Jian; Schembri, Mark A.; Beatson, Scott A.; Paterson, David L.
2015-01-01
Carbapenem-resistant Enterobacteriaceae (CRE) pose an urgent risk to global human health. CRE that are non-susceptible to all commercially available antibiotics threaten to return us to the pre-antibiotic era. Using Single Molecule Real Time (SMRT) sequencing we determined the complete genome of a pandrug-resistant Klebsiella pneumoniae isolate, representing the first complete genome sequence of CRE resistant to all commercially available antibiotics. The precise locations of acquired antibiotic resistance elements, including mobile elements carrying genes for the OXA-181 carbapenemase, were defined. Intriguingly, we identified three chromosomal copies of an ISEcp1-blaOXA-181 mobile element, one of which has disrupted the mgrB regulatory gene, accounting for resistance to colistin. Our findings provide the first description of pandrug-resistant CRE at the genomic level, and reveal the critical role of mobile resistance elements in accelerating the emergence of resistance to other last resort antibiotics. PMID:26478520
Melegh, S; Kovács, K; Gám, T; Nyul, A; Patkó, B; Tóth, A; Damjanova, I; Mestyán, G
2014-01-01
Since November 2009 carbapenemase-producing Klebsiella pneumoniae isolates have been detected in increasing numbers at the Clinical Centre University of Pécs. Molecular typing was performed for 102 clinical isolates originating from different time periods and various departments of the Clinical Centre. Pulsed-field gel electrophoresis revealed the predominance of a single clone (101/102), identified as sequence type ST15. PCR and sequencing showed the presence of blaCTX-M-15 and blaVIM-4 genes. The blaVIM-4 was located on a class 1 integron designated In238b. To our knowledge, this is the first description of a blaVIM-4 gene in the predominant CTX-M-15 extended spectrum β-lactamase-producing Hungarian Epidemic Clone/ST15. © 2013 The Authors Clinical Microbiology and Infection © 2013 European Society of Clinical Microbiology and Infectious Diseases.
[Tale nucleases--new tool for genome editing].
Glazkova, D V; Shipulin, G A
2014-01-01
The ability to introduce targeted changes in the genome of living cells or entire organisms enables researchers to meet the challenges of basic life sciences, biotechnology and medicine. Knockdown of target genes in zygotes gives the opportunity to investigate the functions of these genes in different organisms. Replacement of a single nucleotide in the DNA sequence makes it possible to correct mutations in genes and thus to cure hereditary diseases. Adding a transgene to specific genomic loci can be used in biotechnology to generate organisms with certain properties or cell lines for biopharmaceutical production. Such manipulations of gene sequences in their natural chromosomal context became possible after the emergence of the technology called "genome editing". This technology is based on the induction of a double-strand break in a specific genomic target DNA using endonucleases that recognize unique sequences in the genome, and on subsequent restoration of DNA integrity by cellular repair mechanisms. A necessary tool for genome editing is a custom-designed endonuclease that is able to recognize selected sequences. The emergence of a new type of programmable endonuclease, constructed on the basis of bacterial proteins called TAL effectors (transcription activator-like effectors), has become an important stage in the development of the technology and has promoted the wide spread of genome editing. This article reviews the history of the discovery of TAL effectors and the creation of TALE nucleases, and describes their advantages over the zinc finger endonucleases that appeared earlier. A large section is devoted to a description of the genetic modifications that can be performed using genome editing.
Martin, Lauren; Damaso, Natalie; Mills, DeEtta
2016-10-01
Molecular methods for the detection of mammalian coat color phenotypes have expanded greatly within the past decade. Many phenotypes are associated with a single nucleotide polymorphism (SNP) in the genetic sequence. Traditionally, these mutations are detected through sequencing, hybridization assays or mini-sequencing. However, these techniques can be expensive and tedious. Previously, CE-SSCP using the F-108 polymer was able to distinguish SNPs for the melanocortin-1 receptor (mc1r) coat color gene in horses (Equus caballus) that differed by one nucleotide substitution. The objective of this study was to expand the detection of coat color SNPs in horses. CE-SSCP with the F-108 polymer was compared to mini-sequencing with the SNaPshot™ kit for the solute carrier family 45 member 2 (slc45a2/matp), type III receptor protein-tyrosine kinase (kit) and mc1r genes. The F-108 polymer reproducibly resolved homozygous and heterozygous individuals for the mc1r and kit markers, but was unable to resolve heterozygous individuals for slc45a2 at 38°C. The need for temperatures <15°C, the SNP position being close to the 5'-end, and conformational structures with similar free energy values resulted in the inability to resolve the secondary structures. Despite this limitation, the CE-SSCP method could be used to provide a rapid phenotypic description for equine forensic investigations. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Pauciullo, Alfredo; Erhardt, Georg
2015-01-01
In the present paper, we report for the first time the characterization of llama (Lama glama) caseins at the transcriptomic and genetic level. A total of 288 casein clone transcripts from two lactating llamas were analysed. The most represented mRNA populations were those correctly assembled (85.07%), and they encoded mature proteins of 215, 217, 187 and 162 amino acids for the CSN1S1, CSN2, CSN1S2 and CSN3 genes, respectively. The exonic subdivision showed a structure of 21, 9, 17 and 6 exons for the αs1-, β-, αs2- and κ-casein genes, respectively. Exon skipping and duplication events were observed. Two variants, A and B, were identified in the αs1-casein gene as a result of the alternative out-splicing of exon 18. An additional exon coding for a novel hexapeptide was found to be cryptic in the κ-casein gene, whereas one extra exon was found in the αs2-casein gene by comparison with the Camelus dromedarius sequence. A total of 28 putative phosphorylated motifs highlighted a complex heterogeneity and a potentially variable degree of post-translational modification. Ninety-six polymorphic sites were found through the comparison of the llama casein cDNAs with the homologous camel sequences, whereas the first description and characterization of the 5'- and 3'-regulatory regions allowed identification of the main putative consensus sequences involved in casein gene expression, thus opening the way to new investigations not previously undertaken in this species. PMID:25923814
Chen, Xin; Long, Hai; Gao, Ping; Deng, Guangbing; Pan, Zhifen; Liang, Junjun; Tang, Yawei; Tashi, Nyima; Yu, Maoqun
2014-01-01
Background Hulless barley is attracting increasing attention due to its unique nutritional value and potential health benefits. However, the molecular biology of barley grain development and nutrient storage is not well understood. Furthermore, the genetic potential of hulless barley has not been fully tapped for breeding. Methodology/Principal Findings In the present study, we investigated the transcriptome features during hulless barley grain development. Using Illumina paired-end RNA-Sequencing, we generated two data sets of the developing grain transcriptomes from two hulless barley landraces. A total of 13.1 and 12.9 million paired-end reads with lengths of 90 bp were generated from the two varieties and were assembled into 48,863 and 45,788 unigenes, respectively. A combined dataset of 46,485 All-Unigenes with an average length of 542 bp was generated from the two transcriptomes, and 36,278 of these were annotated with gene descriptions, conserved protein domains or gene ontology terms. Furthermore, sequences and expression levels of genes related to the biosynthesis of storage reserve compounds (starch, protein, and β-glucan) were analyzed, and their temporal and spatial patterns were deduced from the transcriptome data of cultivated barley Morex. Conclusions/Significance We established an integrated database of sequences and functional annotations and examined the expression profiles of the developing grains of Tibetan hulless barley. The characterization of genes encoding storage proteins and enzymes of starch synthesis and (1–3;1–4)-β-D-glucan synthesis provided an overview of changes in gene expression associated with grain nutrition and health properties. Furthermore, the characterization of these genes provides a gene reservoir, which will help in the quality improvement of hulless barley. PMID:24871534
2013-01-01
Background The yeast Metschnikowia fructicola is an antagonist with biological control activity against postharvest diseases of several fruits. We performed a transcriptome analysis, using RNA-Seq technology, to examine the response of M. fructicola to citrus fruit and to the postharvest pathogen Penicillium digitatum. Results More than 26 million sequencing reads were assembled into 9,674 unigenes. Approximately 50% of the unigenes could be annotated based on homology matches in the NCBI database. Based on homology, sequences were annotated with a gene description and gene ontology (GO) terms, and clustered into functional groups. An analysis of differential expression when the yeast was interacting with the fruit vs. the pathogen revealed more than 250 genes with specific expression responses. In the antagonist-pathogen interaction, genes related to transmembrane, multidrug transport and to amino acid metabolism were induced. In the antagonist-fruit interaction, genes involved in oxidative stress, iron homeostasis, zinc homeostasis, and lipid metabolism were induced. Patterns of gene expression in the two interactions were examined at the individual transcript level by quantitative real-time PCR analysis (RT-qPCR). Conclusion This study provides new insight into the biology of the tritrophic interactions that occur in a biocontrol system such as the use of the yeast M. fructicola for the control of green mold on citrus caused by P. digitatum. PMID:23496978
The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4)
Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos; ...
2015-10-26
The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. In conclusion, structural annotation is followed by assignment of protein product names and functions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, CY; Yang, H; Wei, CL
Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. 
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
2011-01-01
Background Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. 
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). Conclusions An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis. PMID:21356090
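The assembly statistics quoted above (average unigene length and N50) can be sketched as follows. This is a minimal illustration of the usual N50 definition, namely the length L such that sequences of length L or longer contain at least half of all assembled bases; the function names and the toy length list are assumptions and are not taken from the study's assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Toy unigene lengths in bp (hypothetical, for illustration only).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths), mean_length(toy_lengths))
```

Because N50 weights long sequences by the bases they contribute, it is typically larger than the mean length when an assembly contains many short singletons, which is consistent with the 506 bp N50 and 355 bp average reported here.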
Functional Annotation of the Arabidopsis Genome Using Controlled Vocabularies
Berardini, Tanya Z.; Mundodi, Suparna; Reiser, Leonore; Huala, Eva; Garcia-Hernandez, Margarita; Zhang, Peifen; Mueller, Lukas A.; Yoon, Jungwoon; Doyle, Aisling; Lander, Gabriel; Moseyko, Nick; Yoo, Danny; Xu, Iris; Zoeckler, Brandon; Montoya, Mary; Miller, Neil; Weems, Dan; Rhee, Seung Y.
2004-01-01
Controlled vocabularies are increasingly used by databases to describe genes and gene products because they facilitate identification of similar genes within an organism or among different organisms. One of The Arabidopsis Information Resource's goals is to associate all Arabidopsis genes with terms developed by the Gene Ontology Consortium that describe the molecular function, biological process, and subcellular location of a gene product. We have also developed terms describing Arabidopsis anatomy and developmental stages and use these to annotate published gene expression data. As of March 2004, we used computational and manual annotation methods to make 85,666 annotations representing 26,624 unique loci. We focus on associating genes to controlled vocabulary terms based on experimental data from the literature and use The Arabidopsis Information Resource-developed PubSearch software to facilitate this process. Each annotation is tagged with a combination of evidence codes, evidence descriptions, and references that provide a robust means to assess data quality. Annotation of all Arabidopsis genes will allow quantitative comparisons between sets of genes derived from sources such as microarray experiments. The Arabidopsis annotation data will also facilitate annotation of newly sequenced plant genomes by using sequence similarity to transfer annotations to homologous genes. In addition, complete and up-to-date annotations will make unknown genes easy to identify and target for experimentation. Here, we describe the process of Arabidopsis functional annotation using a variety of data sources and illustrate several ways in which this information can be accessed and used to infer knowledge about Arabidopsis and other plant species. PMID:15173566
Janssen, Toon; Karssen, Gerrit; Orlando, Valeria; Subbotin, Sergei A; Bert, Wim
2017-12-01
Root-lesion nematodes of the genus Pratylenchus are an important pest parasitizing a wide range of vascular plants including several economically important crops. However, morphological diagnosis of the more than 100 species is problematic due to the low number of diagnostic features, high morphological plasticity and incomplete taxonomic descriptions. In order to employ barcoding based diagnostics, a link between morphology and species specific sequences has to be established. In this study, we reconstructed a multi-gene phylogeny of the Penetrans group using nuclear ribosomal and mitochondrial gene sequences. A combination of this phylogenetic framework with molecular species delineation analysis, population genetics, morphometric information and sequences from type location material allowed us to establish the species boundaries within the Penetrans group and as such clarify long-standing controversies about the taxonomic status of P. penetrans, P. fallax and P. convallariae. Our study also reveals a remarkable amount of cryptic biodiversity within the genus Pratylenchus confirming that identification on morphology alone can be inconclusive in this taxonomically confusing genus. Copyright © 2017 Elsevier Inc. All rights reserved.
A novel ENU-induced mutation, peewee, causes dwarfism in the mouse
Bon-Ryon, Lee; Kano, Kiyoshi; Young, Jay; John, Simon; Nishina, Patsy M; Naggert, Jurgen K; Naito, Kunihiko
2010-01-01
We identified a novel autosomal recessive mutation, called peewee, that results in dwarfism without loss of fertility, in a region-specific ENU-induced mutagenesis screen. At litter size, these mice were smaller than those of other strains. Histological analysis revealed that the major organs appear normal, but abnormalities in cellular proliferation were observed in bone, liver and testis. Haplotype analysis localized the peewee gene to a 3.3-Mb region between D5Mit83 and D5Mit356.3. There are 18 genes in this linkage area, and we also performed in silico mapping using the PosMed℠ program, which searches for connections among keywords and genes in an interval, but no similar phenotype descriptions were found for these genes. Compared with the normal C57BL/6J mouse, only Slc10a4 expression was lower in the peewee mutant. Our preliminary mutation analysis examining the nucleotide sequence of three exons, two introns and an untranslated region of Slc10a4 did not find any sequence difference between the peewee mouse and the C57BL/6J mouse. Detailed analysis of peewee mice might provide novel molecular insights into the complex mechanisms regulating body growth. PMID:19513787
Complete mitochondrial genome sequence of Urechis caupo, a representative of the phylum Echiura
Boore, Jeffrey L
2004-01-01
Background Mitochondria contain small genomes that are physically separate from those of nuclei. Their comparison serves as a model system for understanding the processes of genome evolution. Although hundreds of these genome sequences have been reported, the taxonomic sampling is highly biased toward vertebrates and arthropods, with many whole phyla remaining unstudied. This is the first description of a complete mitochondrial genome sequence of a representative of the phylum Echiura, that of the fat innkeeper worm, Urechis caupo. Results This mtDNA is 15,113 nts in length and 62% A+T. It contains the 37 genes that are typical for animal mtDNAs in an arrangement somewhat similar to that of annelid worms. All genes are encoded by the same DNA strand which is rich in A and C relative to the opposite strand. Codons ending with the dinucleotide GG are more frequent than would be expected from apparent mutational biases. The largest non-coding region is only 282 nts long, is 71% A+T, and has potential for secondary structures. Conclusions Urechis caupo mtDNA shares many features with those of the few studied annelids, including the common usage of ATG start codons, unusual among animal mtDNAs, as well as gene arrangements, tRNA structures, and codon usage biases. PMID:15369601
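The base-composition figures quoted in this abstract (62% A+T for the whole mtDNA, 71% A+T for the largest non-coding region) are simple fractions over the sequence. The following is a minimal sketch of that calculation, assuming a plain nucleotide string as input; the function name and the toy sequence are hypothetical and are not taken from the paper.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for base in counted if base in "AT")
    return at_count / len(counted)


# Hypothetical toy sequence, not the Urechis caupo mtDNA.
toy = "ATTAGCATTA"
print(f"A+T content: {at_fraction(toy):.0%}")
```

Ambiguity codes such as N are skipped here rather than counted in the denominator; that is one possible convention, and the abstract does not say which one was used.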
Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun
2015-01-01
Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology has been applied in genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings The transcriptome of goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were assigned to three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose.
Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
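The assembly summary above reports an average length, an N50 and a maximum length for the unigenes. As a rough illustration of how such figures are derived from a list of sequence lengths, here is a minimal sketch; it uses the common definition of N50 (the length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total), which is assumed rather than confirmed by the abstract. The function name and the toy lengths are hypothetical.

```python
from statistics import mean


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length of
    the sequence at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, not the goose unigene set.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print("mean:", mean(toy_lengths), "N50:", n50(toy_lengths), "max:", max(toy_lengths))
```

Because N50 weights long sequences more heavily than the mean does, an N50 well above the average length (1,298 bp versus 687 bp here) is typical of assemblies with many short sequences.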
Gharout-Sait, Alima; Touati, Abdelaziz; Guillard, Thomas; Brasme, Lucien; de Champs, Christophe
2015-01-01
In this study, 922 consecutive non-duplicate clinical isolates of Enterobacteriaceae obtained from hospitalized and non-hospitalized patients at Bejaia, Algeria were analyzed for AmpC-type β-lactamases production. The ampC genes and their genetic environment were characterized using polymerase chain reaction (PCR) and sequencing. Plasmid incompatibility groups were determined by using PCR-based replicon typing. Phylogenetic grouping and multilocus sequence typing were determined for molecular typing of the plasmid-mediated AmpC (pAmpC) isolates. Of the isolates, 15 (1.6%) were identified as AmpC producers including 14 CMY-4-producing isolates and one DHA-1-producing Klebsiella pneumoniae. All AmpC-producing isolates co-expressed the broad-spectrum TEM-1 β-lactamase and three of them co-produced CTX-M and/or SHV-12 ESBL. Phylogenetic grouping and virulence genotyping of the E. coli isolates revealed that most of them belonged to groups D and B1. Multilocus sequence typing analysis of K. pneumoniae isolates identified four different sequence types (STs) with two new sequences: ST1617 and ST1618. Plasmid replicon typing indicates that blaCMY-4 gene was located on broad host range A/C plasmid, while LVPK replicon was associated with blaDHA-1. All isolates carrying blaCMY-4 displayed the transposon-like structures ISEcp1/ΔISEcp1-blaCMY-blc-sugE. Our study showed that CMY-4 was the main pAmpC in the Enterobacteriaceae isolates in Algeria. Copyright © 2015 Elsevier Editora Ltda. All rights reserved.
Zhu, Xun; Xie, Shangbo; Armengaud, Jean; Xie, Wen; Guo, Zhaojiang; Kang, Shi; Wu, Qingjun; Wang, Shaoli; Xia, Jixing; He, Rongjun; Zhang, Youjun
2016-01-01
The diamondback moth, Plutella xylostella (L.), is the major cosmopolitan pest of brassica and other cruciferous crops. Its larval midgut is a dynamic tissue that interfaces with a wide variety of toxicological and physiological processes. The draft sequence of the P. xylostella genome was recently released, but its annotation remains challenging because of the low sequence coverage of this branch of life and the poor description of exon/intron splicing rules for these insects. Peptide sequencing by computational assignment of tandem mass spectra to genome sequence information provides an experimentally independent approach for confirming or refuting protein predictions, a concept that has been termed proteogenomics. In this study, we carried out an in-depth proteogenomic analysis to complement genome annotation of P. xylostella larval midgut based on shotgun HPLC-ESI-MS/MS data by means of a multialgorithm pipeline. A total of 876,341 tandem mass spectra were searched against the predicted P. xylostella protein sequences and a whole-genome six-frame translation database. Based on a data set comprising 2694 novel genome search specific peptides, we discovered 439 novel protein-coding genes and corrected 128 existing gene models. To get the most accurate data to seed further insect genome annotation, more than half of the novel protein-coding genes, i.e. 235 of 439, were further validated after RT-PCR amplification and sequencing of the corresponding transcripts. Furthermore, we validated 53 novel alternative splicing events. Finally, a total of 6764 proteins were identified, resulting in one of the most comprehensive proteogenomic studies of a nonmodel animal. As the first tissue-specific proteogenomics analysis of P. xylostella, this study provides the fundamental basis for high-throughput proteomics and functional genomics approaches aimed at deciphering the molecular mechanisms of resistance and controlling this pest. PMID:26902207
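A six-frame translation database, as mentioned in this abstract, contains the conceptual translation of a nucleotide sequence in all three reading frames of both strands, so that peptides can be matched without relying on existing gene models. The sketch below shows the idea on a toy sequence. It assumes the standard genetic code and is not the pipeline used in the study; all names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons only; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq: str) -> dict:
    """Return translations keyed by frame label: +1, +2, +3, -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical toy sequence, not P. xylostella data.
for label, protein in six_frame("ATGGCCTAA").items():
    print(label, protein)
```

In a real search database each frame would typically be split at stop codons and filtered by minimum length; those steps are left out of this sketch.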
Goker, Markus; Lu, Megan; Fiebig, Anne; ...
2014-06-15
Methanoplanus limicola Wildgruber et al. 1984 is a mesophilic methanogen that was isolated from a swamp composed of drilling waste near Naples, Italy, shortly after the Archaea were recognized as a separate domain of life. Methanoplanus is the type genus in the family Methanoplanaceae, a taxon that fell into disuse since modern 16S rRNA gene sequences-based taxonomy was established. Methanoplanus is now placed within the Methanomicrobiaceae, a family that is so far poorly characterized at the genome level. The only other type strain of the genus with a sequenced genome, Methanoplanus petrolearius SEBR 4847 T, turned out to be misclassified and required reclassification to Methanolacinia. Both, Methanoplanus and Methanolacinia, needed taxonomic emendations due to a significant deviation of the G+C content of their genomes from previously published (pregenome-sequence era) values. Until now genome sequences were published for only four of the 33 species with validly published names in the Methanomicrobiaceae. Here we describe the features of M. limicola, together with the improved-high-quality draft genome sequence and annotation of the type strain, M3 T. The 3,200,946 bp long chromosome (permanent draft sequence) with its 3,064 protein-coding and 65 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
2013-01-01
Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571
Scheps, Karen G; De Paula, Silvia M; Bitsman, Alicia R; Freigeiro, Daniel H; Basack, F Nora; Pennesi, Sandra P; Varela, Viviana
2013-01-01
We describe a novel frameshift mutation on the HBA1 gene (c.187delG), causative of α-thalassemia (α-thal) in a Black Cuban family with multiple sequence variants in the HBA genes and the Hb S [β6(A3)Glu→Val, GAG>GTG; HBB: c.20A>T] mutation. The deletion of the first base of codon 62 resulted in a frameshift at amino acid 62 with a putative premature termination codon (PTC) at amino acid 66 on the same exon (p.W62fsX66), which most likely triggers nonsense mediated decay of the resulting mRNA. This study also presents the first report of the α212 patchwork allele in Latin America and the description of two new sequence variants in the HBA2 region (c.-614G>A in the promoter region and c.95+39 C>T on the first intron).
Detailed Transcriptome Description of the Neglected Cestode Taenia multiceps
Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou
2012-01-01
Background The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. Methodology/Principal Findings We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a reference genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echinococcus granulosus and Echinococcus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. Conclusions/Significance This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied.
This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies. PMID:23049872
USDA-ARS's Scientific Manuscript database
The new anamorphic yeast Kuraishia piskuri, f.a., sp. nov. is described for three strains that were isolated from insect frass from trees growing in Florida, USA (type strain, NRRL YB-2544, CBS 13714). Species placement was based on phylogenetic analysis of nuclear gene sequences for the D1/D2 domai...
Bayesian Networks Predict Neuronal Transdifferentiation.
Ainsworth, Richard I; Ai, Rizi; Ding, Bo; Li, Nan; Zhang, Kai; Wang, Wei
2018-05-30
We employ the language of Bayesian networks to systematically construct gene-regulation topologies from deep-sequencing single-nucleus RNA-Seq data for human neurons. From the perspective of the cell-state potential landscape, we identify attractors that correspond closely to different neuron subtypes. Attractors are also recovered for cell states from an independent data set, confirming our model's accurate description of global genetic regulations across differing cell types of the neocortex (not included in the training data). Our model recovers experimentally confirmed genetic regulations, and community analysis reveals genetic associations in common pathways. Via a comprehensive scan of all theoretical three-gene perturbations of gene knockout and overexpression, we discover novel neuronal transdifferentiation recipes (including perturbations of SATB2, GAD1, POU6F2 and ADARB2) for excitatory projection neuron and inhibitory interneuron subtypes. Copyright © 2018, G3: Genes, Genomes, Genetics.
Zhang, Chunxiao; Sheng, Chaolan; Wang, Wei; Hu, Hongbo; Peng, Huasong; Zhang, Xuehong
2015-01-01
Streptomyces lomondensis S015 synthesizes the broad-spectrum phenazine antibiotic lomofungin. Whole genome sequencing of this strain revealed a genomic locus consisting of 23 open reading frames that includes the core phenazine biosynthesis gene cluster lphzGFEDCB. lomo10, encoding a putative flavin-dependent monooxygenase, was also identified in this locus. Inactivation of lomo10 by in-frame partial deletion resulted in the biosynthesis of a new phenazine metabolite, 1-carbomethoxy-6-formyl-4,9-dihydroxy-phenazine, along with the absence of lomofungin. This result suggests that lomo10 is responsible for the hydroxylation of lomofungin at its C-7 position. This is the first description of a phenazine hydroxylation gene in Streptomyces, and the results of this study lay the foundation for further investigation of phenazine metabolite biosynthesis in Streptomyces. PMID:26305803
Evolutionary conservation of vertebrate notochord genes in the ascidian Ciona intestinalis.
Kugler, Jamie E; Passamaneck, Yale J; Feldman, Taya G; Beh, Jeni; Regnier, Todd W; Di Gregorio, Anna
2008-11-01
To reconstruct a minimum complement of notochord genes evolutionarily conserved across chordates, we scanned the Ciona intestinalis genome using the sequences of 182 genes reported to be expressed in the notochord of different vertebrates and identified 139 candidate notochord genes. For 66 of these Ciona genes expression data were already available, hence we analyzed the expression of the remaining 73 genes and found notochord expression for 20. The predicted products of the newly identified notochord genes range from the transcription factors Ci-XBPa and Ci-miER1 to extracellular matrix proteins. We examined the expression of the newly identified notochord genes in embryos ectopically expressing Ciona Brachyury (Ci-Bra) and in embryos expressing a repressor form of this transcription factor in the notochord, and we found that while a subset of the genes examined are clearly responsive to Ci-Bra, other genes are not affected by alterations in its levels. We provide a first description of notochord genes that are not evidently influenced by the ectopic expression of Ci-Bra and we propose alternative regulatory mechanisms that might control their transcription. Copyright 2008 Wiley-Liss, Inc.
Molecular Diagnosis of Cystic Fibrosis.
Deignan, Joshua L; Grody, Wayne W
2016-01-01
This unit describes a recommended approach to identifying causal genetic variants in an individual suspected of having cystic fibrosis. An introduction to the genetics and clinical presentation of cystic fibrosis is initially presented, followed by a description of the two main strategies used in the molecular diagnosis of cystic fibrosis: (1) an initial targeted variant panel used to detect only the most common cystic fibrosis-causing variants in the CFTR gene, and (2) sequencing of the entire coding region of the CFTR gene to detect additional rare causal CFTR variants. Finally, the unit concludes with a discussion regarding the analytic and clinical validity of these approaches. Copyright © 2016 John Wiley & Sons, Inc.
SpliceDisease database: linking RNA splicing and disease.
Wang, Juan; Zhang, Jie; Li, Kaibo; Zhao, Wei; Cui, Qinghua
2012-01-01
RNA splicing is an important aspect of gene regulation in many organisms. Splicing of RNA is regulated by complicated mechanisms involving numerous RNA-binding proteins and the intricate network of interactions among them. Mutations in cis-acting splicing elements or its regulatory proteins have been shown to be involved in human diseases. Defects in pre-mRNA splicing process have emerged as a common disease-causing mechanism. Therefore, a database integrating RNA splicing and disease associations would be helpful for understanding not only the RNA splicing but also its contribution to disease. In SpliceDisease database, we manually curated 2337 splicing mutation disease entries involving 303 genes and 370 diseases, which have been supported experimentally in 898 publications. The SpliceDisease database provides information including the change of the nucleotide in the sequence, the location of the mutation on the gene, the reference Pubmed ID and detailed description for the relationship among gene mutations, splicing defects and diseases. We standardized the names of the diseases and genes and provided links for these genes to NCBI and UCSC genome browser for further annotation and genomic sequences. For the location of the mutation, we give direct links of the entry to the respective position/region in the genome browser. The users can freely browse, search and download the data in SpliceDisease at http://cmbi.bjmu.edu.cn/sdisease.
Verma, Pankaj; Pandey, Prashant Kumar; Gupta, Arvind Kumar; Seong, Chi Nam; Park, Seong Chan; Choe, Han Na; Baik, Keun Sik; Patole, Milind Shivaji; Shouche, Yogesh Shreepad
2012-10-01
We have carried out a polyphasic taxonomic characterization of Bacillus beijingensis DSM 19037(T) and Bacillus ginsengi DSM 19038(T), which are closely related phylogenetically to Bhargavaea cecembensis LMG 24411(T). All three strains are Gram-stain-positive, non-motile, moderately halotolerant and non-spore-forming. 16S rRNA gene sequence analyses showed that the strains constituted a coherent cluster, with sequence similarities between 99.7 and 98.7 %. The percentage similarity on the basis of amino acid sequences deduced from partial gyrB gene nucleotide sequences of these three type strains was 96.1-92.7 %. Phylogenetic trees based on the 16S rRNA gene and GyrB amino acid sequences, obtained by using three different algorithms, were consistent and showed that these three species constituted a deeply rooted cluster separated from the clades represented by the genera Bacillus, Planococcus, Planomicrobium, Sporosarcina, Lysinibacillus, Viridibacillus, Kurthia and Geobacillus, supporting their placement in the genus Bhargavaea. All three type strains have menaquinone MK-8 as the major respiratory quinone and showed similar fatty acid profiles. The main polar lipids present in the three type strains were diphosphatidylglycerol and phosphatidylglycerol, and the three strains showed peptidoglycan type A4α with L-lysine as the diagnostic diamino acid. The DNA G+C contents of Bacillus beijingensis DSM 19037(T), Bacillus ginsengi DSM 19038(T) and Bhargavaea cecembensis LMG 24411(T) were 53.1, 50.2 and 53.7 mol%, respectively. The level of DNA-DNA hybridization among the three strains was 57-39 %, indicating that they are members of different species of the genus Bhargavaea. The phenotypic data are consistent with the placement of these three species in a single genus and support their differentiation at the species level. 
On the basis of these data, we have emended the description of the genus Bhargavaea and propose the reclassification of Bacillus beijingensis and Bacillus ginsengi to the genus Bhargavaea, as Bhargavaea beijingensis comb. nov. (type strain ge10(T) = DSM 19037(T) = CGMCC 1.6762(T)) and Bhargavaea ginsengi comb. nov. (type strain ge14(T) = DSM 19038(T) = CGMCC 1.6763(T)).
Alsarraf, Mohammed; Mohallal, Eman M.E.; Mierzejewska, Ewa J.; Behnke-Borowczyk, Jolanta; Welc-Falęciak, Renata; Bednarska, Małgorzata; Dziewit, Lukasz; Zalat, Samy; Gilbert, Francis; Behnke, Jerzy M.
2017-01-01
Abstract Bartonella spp. are parasites of mammalian erythrocytes and endothelial cells, transmitted by blood-feeding arthropod ectoparasites. Different species of rodents may constitute the main hosts of Bartonella, including several zoonotic species of Bartonella. The aim of this study was to identify and compare Bartonella species and genotypes isolated from rodent hosts from the South Sinai, Egypt. Prevalence of Bartonella infection was assessed in rodents (837 Acomys dimidiatus, 73 Acomys russatus, 111 Dipodillus dasyurus, and 65 Sekeetamys calurus) trapped in 2000, 2004, 2008, and 2012 in four dry montane wadis around St. Katherine town in the Sinai Mountains. Total DNA was extracted from blood samples, and PCR amplification and sequencing of the Bartonella-specific 860-bp gene fragment of rpoB and the 810-bp gene fragment of gltA were used for molecular and phylogenetic analyses. The overall prevalence of Bartonella in rodents was 7.2%. Prevalence differed between host species, being 30.6%, 10.8%, 9.6%, and 3.6% in D. dasyurus, S. calurus, A. russatus, and A. dimidiatus, respectively. The phylogenetic analyses of six samples of Bartonella (five from D. dasyurus and one from S. calurus) based on a fragment of the rpoB gene, revealed the existence of two distinct genetic groups (with 95–96% reciprocal sequence identity), clustering with several unidentified isolates obtained earlier from the same rodent species, and distant from species that have already been described (90–92% of sequence identity to the closest match from the GenBank reference database). Thus, molecular and phylogenetic analyses led to the description of two species: Candidatus Bartonella fadhilae n. sp. and Candidatus Bartonella sanaae n. sp. The identification of their vectors and the medical significance of these species need further investigation. PMID:28541836
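The identity figures quoted in this abstract (95–96% between the two genetic groups, 90–92% to the closest GenBank match) are percentages of matching positions between aligned sequences. Here is a minimal sketch of that calculation, assuming the two sequences have already been aligned to equal length with "-" marking gaps. Columns containing a gap are skipped, which is only one of several conventions in use; the abstract does not state which was applied. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns in which either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gap columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not Bartonella rpoB data.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Counting gap columns as mismatches instead would lower the reported value, so identities from different tools are only comparable when the convention is known.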
The genomic sequence of ectromelia virus, the causative agent of mousepox.
Chen, Nanhai; Danila, Maria I; Feng, Zehua; Buller, R Mark L; Wang, Chunlin; Han, Xiaosi; Lefkowitz, Elliot J; Upton, Chris
2003-12-05
Ectromelia virus is the causative agent of mousepox, an acute exanthematous disease of mouse colonies in Europe, Japan, China, and the U.S. The Moscow, Hampstead, and NIH79 strains are the most thoroughly studied with the Moscow strain being the most infectious and virulent for the mouse. In the late 1940s mousepox was proposed as a model for the study of the pathogenesis of smallpox and generalized vaccinia in humans. Studies in the last five decades from a succession of investigators have resulted in a detailed description of the virologic and pathologic disease course in genetically susceptible and resistant inbred and out-bred mice. We report the DNA sequence of the left-hand end, the predicted right-hand terminal repeat, and central regions of the genome of the Moscow strain of ectromelia virus (approximately 177,500 bp), which together with the previously sequenced right-hand end, yields a genome of 209,771 bp. We identified 175 potential genes specifying proteins of between 53 and 1924 amino acids, and 29 regions containing sequences related to genes predicted in other poxviruses, but unlikely to encode for functional proteins in ectromelia virus. The translated protein sequences were compared with the protein database for structure/function relationships, and these analyses were used to investigate poxvirus evolution and to attempt to explain at the cellular and molecular level the well-characterized features of the ectromelia virus natural life cycle.
Castro, Rosario; Navelsaker, Sofie; Krasnov, Aleksei; Du Pasquier, Louis; Boudinot, Pierre
2017-10-01
During the last decades, gene and cDNA cloning identified TCR and Ig genes across vertebrates; genome sequencing of TCR and Ig loci in many species revealed the different organizations selected during evolution under the pressure of generating diverse repertoires of Ag receptors. By detecting clonotypes over a wide range of frequency, deep sequencing of Ig and TCR transcripts provides a new way to compare the structure of expressed repertoires in species of various sizes, at different stages of development, with different physiologies, and displaying multiple adaptations to the environment. In this review, we provide a short overview of the technologies currently used to produce global description of immune repertoires, describe how they have already been used in comparative immunology, and we discuss the future potential of such approaches. The development of these methodologies in new species holds promise for new discoveries concerning particular adaptations. As an example, understanding the development of adaptive immunity across metamorphosis in frogs has been made possible by such approaches. Repertoire sequencing is now widely used, not only in basic research but also in the context of immunotherapy and vaccination. Analysis of fish responses to pathogens and vaccines has already benefited from these methods. Finally, we also discuss potential advances based on repertoire sequencing of multigene families of immune sensors and effectors in invertebrates. Copyright © 2017 Elsevier Ltd. All rights reserved.
Golby, Paul; Nunez, Javier; Cockle, Paul J.; Ewer, Katie; Logan, Karen; Hogarth, Philip; Vordermeier, H. Martin; Hinds, Jason; Hewinson, R. Glyn; Gordon, Stephen V.
2011-01-01
Genome sequencing of Mycobacterium tuberculosis complex members has accelerated the search for new disease-control tools. Antigen mining is one area that has benefited enormously from access to genome data. As part of an ongoing antigen mining programme, we screened genes that were previously identified by transcriptome analysis as upregulated in response to an in vitro acid shock for their in vivo expression profile and antigenicity. We show that the genes encoding two methyltransferases, Mb1438c/Rv1403c and Mb1440c/Rv1404c, were highly upregulated in a mouse model of infection, and were antigenic in M. bovis-infected cattle. As the genes encoding these antigens were highly upregulated in vivo, we sought to define their genetic regulation. A mutant was constructed that was deleted for their putative regulator, Mb1439/Rv1404; loss of the regulator led to increased expression of the flanking methyltransferases and a defined set of distal genes. This work has therefore generated both applied and fundamental outputs, with the description of novel mycobacterial antigens that can now be moved into field trials, but also with the description of a regulatory network that is responsive to both in vivo and in vitro stimuli. PMID:18375799
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick
2018-01-04
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
MEPD: a Medaka gene expression pattern database
Henrich, Thorsten; Ramialison, Mirana; Quiring, Rebecca; Wittbrodt, Beate; Furutani-Seiki, Makoto; Wittbrodt, Joachim; Kondoh, Hisato
2003-01-01
The Medaka Expression Pattern Database (MEPD) stores and integrates information of gene expression during embryonic development of the small freshwater fish Medaka (Oryzias latipes). Expression patterns of genes identified by ESTs are documented by images and by descriptions through parameters such as staining intensity, category and comments and through a comprehensive, hierarchically organized dictionary of anatomical terms. Sequences of the ESTs are available and searchable through BLAST. ESTs in the database are clustered upon entry and have been blasted against public data-bases. The BLAST results are updated regularly, stored within the database and searchable. The MEPD is a project within the Medaka Genome Initiative (MGI) and entries will be interconnected to integrated genomic map databases. MEPD is accessible through the WWW at http://medaka.dsp.jst.go.jp/MEPD. PMID:12519950
Pöggeler, S
2000-06-01
In order to analyze the involvement of pheromones in cell recognition and mating in a homothallic fungus, two putative pheromone precursor genes, named ppg1 and ppg2, were isolated from a genomic library of Sordaria macrospora. The ppg1 gene is predicted to encode a precursor pheromone that is processed by a Kex2-like protease to yield a pheromone that is structurally similar to the alpha-factor of the yeast Saccharomyces cerevisiae. The ppg2 gene encodes a 24-amino-acid polypeptide that contains a putative farnesylated and carboxy methylated C-terminal cysteine residue. The sequences of the predicted pheromones display strong structural similarity to those encoded by putative pheromones of heterothallic filamentous ascomycetes. Both genes are expressed during the life cycle of S. macrospora. This is the first description of pheromone precursor genes encoded by a homothallic fungus. Southern-hybridization experiments indicated that ppg1 and ppg2 homologues are also present in other homothallic ascomycetes.
Two Endophytic Diaporthe Species Isolated from the Leaves of Astragalus membranaceus in Korea
Kim, Jin-Hee; Kim, Dong-Yeo; Park, Hyeok
2017-01-01
We characterized two endophyte fungi from the leaves of Astragalus membranaceus in Korea. The isolated strains were identified on the basis of the morphological characters and sequences analysis of the internal transcribed spacer and large subunit regions of the rDNA and β-tubulin gene. To the best of our knowledge, this is the first report of Diaporthe oncostoma and Diaporthe infecunda in Korea, and we have provided descriptions and figures. PMID:29371813
Standard Mutation Nomenclature in Molecular Diagnostics
Ogino, Shuji; Gulley, Margaret L.; den Dunnen, Johan T.; Wilson, Robert B.
2007-01-01
To translate basic research findings into clinical practice, it is essential that information about mutations and variations in the human genome are communicated easily and unequivocally. Unfortunately, there has been much confusion regarding the description of genetic sequence variants. This is largely because research articles that first report novel sequence variants do not often use standard nomenclature, and the final genomic sequence is compiled over many separate entries. In this article, we discuss issues crucial to clear communication, using examples of genes that are commonly assayed in clinical laboratories. Although molecular diagnostics is a dynamic field, this should not inhibit the need for and movement toward consensus nomenclature for accurate reporting among laboratories. Our aim is to alert laboratory scientists and other health care professionals to the important issues and provide a foundation for further discussions that will ultimately lead to solutions. PMID:17251329
The Giardia genome project database.
McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L
2000-08-15
The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.
Comparative immunogenomics of molluscs.
Schultz, Jonathan H; Adema, Coen M
2017-10-01
Comparative immunology, studying both vertebrates and invertebrates, provided the earliest descriptions of phagocytosis as a general immune mechanism. However, the large scale of animal diversity challenges all-inclusive investigations and the field of immunology has developed by mostly emphasizing study of a few vertebrate species. In addressing the lack of comprehensive understanding of animal immunity, especially that of invertebrates, comparative immunology helps toward management of invertebrates that are food sources, agricultural pests, pathogens, or transmit diseases, and helps interpret the evolution of animal immunity. Initial studies showed that the Mollusca (second largest animal phylum), and invertebrates in general, possess innate defenses but lack the lymphocytic immune system that characterizes vertebrate immunology. Recognizing the reality of both common and taxon-specific immune features, and applying up-to-date cell and molecular research capabilities, in-depth studies of a select number of bivalve and gastropod species continue to reveal novel aspects of molluscan immunity. The genomics era heralded a new stage of comparative immunology; large-scale efforts yielded an initial set of full molluscan genome sequences that is available for analyses of full complements of immune genes and regulatory sequences. Next-generation sequencing (NGS), due to lower cost and effort required, allows individual researchers to generate large sequence datasets for growing numbers of molluscs. RNAseq provides expression profiles that enable discovery of immune genes and genome sequences reveal distribution and diversity of immune factors across molluscan phylogeny. Although computational de novo sequence assembly will benefit from continued development and automated annotation may require some experimental validation, NGS is a powerful tool for comparative immunology, especially increasing coverage of the extensive molluscan diversity. 
To date, immunogenomics revealed new levels of complexity of molluscan defense by indicating sequence heterogeneity in individual snails and bivalves, and members of expanded immune gene families are expressed differentially to generate pathogen-specific defense responses. Copyright © 2017 Elsevier Ltd. All rights reserved.
Kim, MyongChol; Pak, SeHong; Rim, SongGuk; Ren, Lvzhi; Jiang, Fan; Chang, Xulu; Liu, Ping; Zhang, Yumin; Fang, Chengxiang; Zheng, Congyi; Peng, Fang
2015-06-01
A pale yellow, Gram-reaction-negative, non-motile, aerobic bacterium, designated MC 3726T, was isolated from a tundra soil near Ny-Ålesund, Svalbard Archipelago, Norway (78 °N). Growth occurred at 4-37 °C (optimum 25-30 °C) and at pH 5.0-9.0 (optimum pH 8.0). Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain MC 3726T belonged to the genus Luteolibacter in the family Verrucomicrobiaceae. The 16S rRNA gene sequence of this strain showed 93.18, 92.54 and 92.44 % similarity to those of Luteolibacter cuticulihirudinis E100T, Luteolibacter pohnpeiensis A4T-83T and Luteolibacter yonseiensis EBTL01T, respectively. The cell wall of strain MC 3726T contained meso-diaminopimelic acid as the diagnostic amino acid. Strain MC 3726T contained iso-C14:0 (38.28 %), C16:0 (15.89 %), C16:1ω9c (14.24 %), iso-C16:0 (10.42 %) and anteiso-C15:0 (5.75 %) as the predominant cellular fatty acids, MK-9 and MK-10 as the major respiratory quinones, and phosphatidylethanolamine, phosphatidylmethylethanolamine, phosphatidylglycerol and diphosphatidylglycerol as the main polar lipids. The DNA G+C content was 60.7 mol %. On the basis of phenotypic, chemotaxonomic and phylogenetic data, strain MC 3726T is considered to represent a novel species of the genus Luteolibacter, for which the name Luteolibacter arcticus sp. nov. is proposed. The type strain is MC 3726T ( = CCTCC AB 2014275T = LMG 28638T). An emended description of the genus Luteolibacter is also provided, along with emended descriptions of Luteolibacter cuticulihirudinis, Luteolibacter yonseiensis and Luteolibacter pohnpeiensis.
van der Linden, Mark; Otten, Julia; Bergmann, Carina; Latorre, Cristina; Liñares, Josefina
2017-01-01
The identification of commensal streptococci species is an everlasting problem due to their ability to genetically transform. A new challenge in this respect is the recent description of Streptococcus pseudopneumoniae as a new species, which was distinguished from closely related pathogenic S. pneumoniae and commensal S. mitis by a variety of physiological and molecular biological tests. Forty-one atypical S. pneumoniae isolates have been collected at the German National Reference Center for Streptococci (GNRCS). Multilocus sequence typing (MLST) confirmed 35 isolates as the species S. pseudopneumoniae. A comparison with the pbp2x sequences from 120 commensal streptococci isolated from different continents revealed that pbp2x is distinct among penicillin-susceptible S. pseudopneumoniae isolates. Four penicillin-binding protein x (PBPx) alleles of penicillin-sensitive S. mitis account for most of the diverse sequence blocks in resistant S. pseudopneumoniae, S. pneumoniae, and S. mitis, and S. infantis and S. oralis sequences were found in S. pneumoniae from Japan. PBP2x genes of the family of mosaic genes related to pbp2x in the S. pneumoniae clone Spain23F-1 were observed in S. oralis and S. infantis as well, confirming its global distribution. Thirty-eight sites were altered within the PBP2x transpeptidase domains of penicillin-resistant strains, excluding another 37 sites present in the reference genes of sensitive strains. Specific mutational patterns were detected depending on the parental sequence blocks, in agreement with distinct mutational pathways during the development of beta-lactam resistance. The majority of the mutations clustered around the active site, whereas others are likely to affect stability or interactions with the C-terminal domain or partner proteins. PMID:28193649
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.
Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de
2006-03-31
Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
Racsa, Lori D; Luu, Hung S; Park, Jason Y; Mitui, Midori; Timmons, Charles F
2014-06-01
Hemoglobin (Hb) Austin was defined in 1977, using amino acid sequencing of samples from 3 unrelated Mexican-Americans, as a substitution of serine for arginine at position 40 of the β-globin chain (Arg40Ser). Its electrophoretic migration on both cellulose acetate (pH 8.4) and citrate agar (pH 6.2) was reported between Hb F and Hb A, and this description persists in reference literature. Objectives: To review the clinical features and redefine the diagnostic characteristics of Hb Austin. Eight samples from 6 unrelated individuals and 2 siblings, all with Hispanic surnames, were submitted for abnormal Hb identification between June 2010 and September 2011. High-performance liquid chromatography, isoelectric focusing (IEF), citrate agar electrophoresis, and bidirectional DNA sequencing of the entire β-globin gene were performed. DNA sequencing confirmed all 8 individuals to be heterozygous for Hb Austin (Arg40Ser). Retention time on high-performance liquid chromatography and migration on citrate agar electrophoresis were consistent with that identification. Migration on IEF, however, was not between Hb F and Hb A, as predicted from the report of cellulose acetate electrophoresis. By IEF, Hb Austin migrated anodal to ("faster than") Hb A. Hemoglobin Austin (Arg40Ser) appears on IEF as a "fast," anodally migrating, Hb variant, just as would be expected from its amino acid substitution. The cited historic report is, at best, not applicable to IEF and is probably erroneous. Our observation of 8 cases in 16 months suggests that this variant may be relatively common in some Hispanic populations, making its recognition important. Furthermore, gene sequencing is proving itself a powerful and reliable tool for definitive identification of Hb variants.
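The statement that anodal migration is "just as would be expected from its amino acid substitution" can be made explicit with a charge calculation. The following is a minimal sketch, assuming simplified integer side-chain charges near neutral pH (histidine is treated as neutral); the table and the function name `charge_shift` are illustrative and do not come from the study.

```python
# Approximate side-chain charges near neutral pH (simplifying assumption;
# residues not listed are treated as uncharged).
SIDE_CHAIN_CHARGE = {"Arg": +1, "Lys": +1, "Asp": -1, "Glu": -1}

def charge_shift(original, replacement):
    """Net charge change when `original` is replaced by `replacement`."""
    return SIDE_CHAIN_CHARGE.get(replacement, 0) - SIDE_CHAIN_CHARGE.get(original, 0)

shift = charge_shift("Arg", "Ser")  # Hb Austin: Arg40Ser
# A more negative protein moves further toward the anode.
direction = "anodal" if shift < 0 else "cathodal" if shift > 0 else "unchanged"
print(shift, direction)
```

Under these assumptions, losing the positively charged arginine leaves the variant chain one unit more negative than Hb A, which agrees with the "fast" anodal position seen on IEF.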
Exploiting Multisite Gateway and pENFRUIT plasmid collection for fruit genetic engineering.
Estornell, Leandro H; Granell, Antonio; Orzaez, Diego
2012-01-01
MultiSite Gateway cloning techniques based on homologous recombination facilitate the combinatorial assembly of basic genetic pieces (i.e., promoters, CDS, and terminators) into gene expression or gene silencing cassettes. pENFRUIT is a collection of MultiSite Triple Gateway Entry vectors dedicated to genetic engineering in fruits. It comprises a number of fruit-operating promoters as well as C-terminal tags adapted to the Gateway standard. In this way, flanking regulatory/labeling sequences can be easily Gateway-assembled with a given gene of interest for its ectopic expression or silencing in fruits. The resulting gene constructs can be analyzed in stable transgenic plants or in transient expression assays, the latter allowing fast testing of the increasing number of combinations arising from MultiSite methodology. A detailed description of the use of MultiSite cloning methodology for the assembly of pENFRUIT elements is presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pichon, L.; Carn, G.; Bouric, P.
1996-03-01
Positional cloning strategies for the hemochromatosis gene have previously concentrated on a target area restricted to a maximum genomic expanse of 400 kb around the HLA-A and HLA-F loci. Recently, the candidate region has been extended to 2-3 Mb on the distal side of the MHC. In this study, 10 coding sequences [hemochromatosis candidate genes (HCG) I to X] were isolated by cDNA selection using YACs covering the HLA-A/HLA-F subregion. Two of these (HCG II and HCG IV) belong to multigene families, as well as other sequences already described in this region, i.e., P5, pMC 6.7, and HLA class I. Fingerprinting of the four YACs overlapping the region was performed and allowed partial localization of the different multigene family sequences on each YAC without defining their exact positions. Fingerprinting on cosmids isolated from the ICRF chromosome 6-specific cosmid library allowed more precise localization of the redundant sequences in all of the multigene families and revealed their apparent organization in clusters. Further examination of these intertwined sequences demonstrated that this structural organization resulted from a succession of complex phenomena, including duplications and contractions. This study presents a precise description of the structural organization of the HLA-A/HLA-F region and a determination of the sequences involved in the megabase size polymorphism observed among the A3, A24, and A31 haplotypes. 29 refs., 2 figs., 2 tabs.
Developmental staging of male murine embryonic gonad by SAGE analysis
Lee, Tin-Lap; Li, Yunmin; Alba, Diana; Vong, Queenie P.; Wu, Shao-Ming; Baxendale, Vanessa; Rennert, Owen M.; Lau, Yun-Fai Chris; Chan, Wai-Yee
2012-01-01
Despite the identification of key genes such as Sry integral to embryonic gonadal development, the genomic classification and identification of chromosomal activation of this process is still poorly understood. To better understand the genetic regulation of gonadal development, we performed Serial Analysis of Gene Expression (SAGE) to profile the genes and novel transcripts, and an average of 152,000 tags from male embryonic gonads at E10.5 (embryonic day 10.5), E11.5, E12.5, E13.5, E15.5 and E17.5 were analyzed. A total of 275,583 non-singleton tags that do not map to any annotated sequence were identified in the six gonad libraries, and 47,255 tags were mapped to 24,975 annotated sequences, among which 987 sequences were uncharacterized. Utilizing an unsupervised pattern identification technique, we established molecular staging of male gonadal development. Rather than providing a static descriptive analysis, we developed algorithms to cluster the SAGE data and assign SAGE tags to a corresponding chromosomal position; these data are displayed in chromosome graphic format. A prominent increase in global genomic activity from E10.5 to E17.5 was observed. Important chromosomal regions related to the developmental processes were identified and validated based on established mouse models with developmental disorders. These regions may represent markers for early diagnosis for disorders of male gonad development as well as potential treatment targets. PMID:19376482
Bonnin, Rémy A; Bogaerts, Pierre; Girlich, Delphine; Huang, Te-Din; Dortet, Laurent; Glupczynski, Youri; Naas, Thierry
2018-06-01
Carbapenemase-producing Pseudomonadaceae have increasingly been reported worldwide, with an ever-increasing heterogeneity of carbapenem resistance mechanisms, depending on the bacterial species and the geographical location. OXA-198 is a plasmid-encoded class D β-lactamase involved in carbapenem resistance in one Pseudomonas aeruginosa isolate from Belgium. In the setting of a multicenter survey of carbapenem resistance in P. aeruginosa strains in Belgian hospitals in 2013, three additional OXA-198-producing P. aeruginosa isolates originating from patients hospitalized in one hospital were detected. To reveal the molecular mechanism underlying the reduced susceptibility to carbapenems, MIC determinations, whole-genome sequencing, and PCR analyses to confirm the genetic organization were performed. The plasmid harboring the bla OXA-198 gene was characterized, along with the genetic relatedness of the four P. aeruginosa isolates. The bla OXA-198 gene was harbored on a class 1 integron carried by an ∼49-kb IncP-type plasmid proposed as IncP-11. The same plasmid was present in all four P. aeruginosa isolates. Multilocus sequence typing revealed that the isolates all belonged to sequence type 446, and single-nucleotide polymorphism analysis revealed only a few differences between the isolates. This report describes the structure of a 49-kb plasmid harboring the bla OXA-198 gene and presents the first description of OXA-198-producing P. aeruginosa isolates associated with a hospital-associated cluster episode. Copyright © 2018 American Society for Microbiology.
Highly virulent M1 Streptococcus pyogenes isolates resistant to clindamycin.
Plainvert, C; Martin, C; Loubinoux, J; Touak, G; Dmytruk, N; Collobert, G; Fouet, A; Ploy, M-C; Poyart, C
2015-01-01
Emm1-type group A Streptococcus (GAS), or Streptococcus pyogenes, is mostly responsible for invasive infections such as necrotizing fasciitis (NF) and streptococcal toxic shock syndrome (STSS). The recommended treatment of severe invasive GAS infections is a combination of clindamycin and penicillin. Until 2012, almost all emm1 isolates were susceptible to clindamycin. We aimed to identify the phenotypic and genotypic characteristics of emm1 GAS clone resistant to clindamycin. GAS strains were characterized by emm sequence typing, detection of genes encoding pyrogenic exotoxins or superantigens. Cluster analysis was performed by pulsed-field gel electrophoresis (PFGE) and multilocus sequence typing (MLST). Antibiotic susceptibility was assessed using disk diffusion and resistance genes were detected by PCR. A total of 1321 GAS invasive isolates were analyzed between January 2011 and December 2012. The overall number of invasive isolates resistant to clindamycin was 52 (3.9%); seven of them were emm1 isolates. All isolates had the same genomic markers: macrolide resistance due to the presence of the erm(B) gene, emm subtype 1.0, the same toxin or superantigen profile, PFGE pattern and sequence type. This is the first description of highly virulent GAS emm1 isolates resistant to clindamycin in France. This article strengthens the need for monitoring the epidemiology of invasive GAS strains as they could lead to changes in treatment guidelines. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Xie, W; Fletcher, B S; Andersen, R D; Herschman, H R
1994-10-01
We recently reported the cloning of a mitogen-inducible prostaglandin synthase gene, TIS10/PGS2. In addition to growth factors and tumor promoters, the v-src oncogene induces TIS10/PGS2 expression in 3T3 cells. Deletion analysis, using luciferase reporters, identifies a region between -80 and -40 nucleotides 5' of the TIS10/PGS2 transcription start site that mediates pp60v-src induction in 3T3 cells. This region contains the sequence CGTCACGTG, which includes overlapping ATF/CRE (CGTCA) and E-box (CACGTG) sequences. Gel shift-oligonucleotide competition experiments with nuclear extracts from cells stably transfected with a temperature-sensitive v-src gene demonstrate that the CGTCACGTG sequence can bind proteins at both the ATF/CRE and E-box sequences. Dominant-negative CREB and Myc proteins that bind DNA, but do not transactivate, block v-src induction of a luciferase reporter driven by the first 80 nucleotides of the TIS10/PGS2 promoter. Mutational analysis distinguishes which TIS10/PGS2 cis-acting element mediates pp60v-src induction. E-box mutation has no effect on the fold induction in response to pp60v-src. In contrast, ATF/CRE mutation attenuates the pp60v-src response. Antibody supershift and methylation interference experiments demonstrate that CREB and at least one other ATF transcription factor in these extracts bind to the TIS10/PGS2 ATF/CRE element. Expression of a dominant-negative ras gene also blocks TIS10/PGS2 induction by v-src. Our data suggest that Ras mediates pp60v-src activation of an ATF transcription factor, leading to induced TIS10/PGS2 expression via the ATF/CRE element of the TIS10/PGS2 promoter. This is the first description of v-src activation of gene expression via an ATF/CRE element.
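The overlap between the ATF/CRE and E-box motifs within CGTCACGTG can be checked directly. The sketch below is an illustration only: the helper `find_motif` and the variable names are hypothetical, and the sequences are the ones quoted in the abstract.

```python
import re

def find_motif(seq, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets successive matches overlap.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]

ELEMENT = "CGTCACGTG"  # promoter element quoted in the abstract
MOTIFS = {"ATF/CRE": "CGTCA", "E-box": "CACGTG"}

hits = {name: find_motif(ELEMENT, motif) for name, motif in MOTIFS.items()}

# Positions covered by each motif, and the bases the two motifs share.
covered = {
    name: set(range(start, start + len(MOTIFS[name])))
    for name, starts in hits.items()
    for start in starts
}
shared = sorted(covered["ATF/CRE"] & covered["E-box"])
shared_bases = "".join(ELEMENT[i] for i in shared)
print(hits, shared_bases)
```

The two sites share the central "CA" dinucleotide, which is why the mutational analysis in the abstract had to alter each element separately to tell their contributions apart.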
Fleckenstein, E C; Dirks, W G; Drexler, H G
2000-02-01
The biochemical properties and protein structure of the tartrate-resistant acid phosphatase (TRAP), an iron-containing lysosomal glycoprotein in cells of the mononuclear phagocyte system, are well known. In contrast, little is known about the physiology and genic structure of this unique enzyme. In some diseases, like hairy cell leukemia, Gaucher's disease and osteoclastoma, cytochemically detected TRAP expression is used as a disease-associated marker. In order to begin to elucidate the regulation of this gene we generated different deletion constructs of the TRAP 5'-flanking region, placed them upstream of the luciferase reporter gene and assayed them for their ability to direct luciferase expression in human 293 cells. Treatment of these cells with the iron-modulating reagents transferrin and hemin causes opposite effects on the TRAP promoter activity. Two regulatory GAGGC tandem repeat sequences (the hemin responsive elements, HRE) within the 5'-flanking region of the human TRAP gene were identified. Studies with specific HRE-deletion constructs of the human TRAP 5'-flanking region upstream of the luciferase reporter gene document the functionality of these HRE-sequences which are apparently responsible for mediating transcriptional inhibition upon exposure to hemin. In addition to the previously published functional characterization of the murine TRAP HRE motifs, these results provide the first description of a new iron/hemin-responsive transcriptional regulation in the human TRAP gene.
Mating-Type Genes and MAT Switching in Saccharomyces cerevisiae
Haber, James E.
2012-01-01
Mating type in Saccharomyces cerevisiae is determined by two nonhomologous alleles, MATa and MATα. These sequences encode regulators of the two different haploid mating types and of the diploids formed by their conjugation. Analysis of the MATa1, MATα1, and MATα2 alleles provided one of the earliest models of cell-type specification by transcriptional activators and repressors. Remarkably, homothallic yeast cells can switch their mating type as often as every generation by a highly choreographed, site-specific homologous recombination event that replaces one MAT allele with different DNA sequences encoding the opposite MAT allele. This replacement process involves the participation of two intact but unexpressed copies of mating-type information at the heterochromatic loci, HMLα and HMRa, which are located at opposite ends of the same chromosome-encoding MAT. The study of MAT switching has yielded important insights into the control of cell lineage, the silencing of gene expression, the formation of heterochromatin, and the regulation of accessibility of the donor sequences. Real-time analysis of MAT switching has provided the most detailed description of the molecular events that occur during the homologous recombinational repair of a programmed double-strand chromosome break. PMID:22555442
Chen, Jianjun; Wang, Qiwei; Cabrera, Patricia E.; Zhong, Zilin; Sun, Wenmin; Jiao, Xiaodong; Chen, Yabin; Govindarajan, Gowthaman; Naeem, Muhammad Asif; Khan, Shaheen N.; Ali, Muhammad Hassaan; Assir, Muhammad Zaman; Rahman, Fawad Ur; Qazi, Zaheeruddin A.; Riazuddin, Sheikh; Akram, Javed; Riazuddin, S. Amer; Hejtmancik, J. Fielding
2017-01-01
Purpose: To identify the genetic origins of autosomal recessive congenital cataracts (arCC) in the Pakistani population. Methods: Based on the hypothesis that most arCC patients in consanguineous families in the Punjab areas of Pakistan should be homozygous for causative mutations, affected individuals were screened for homozygosity of nearby highly informative microsatellite markers and then screened for pathogenic mutations by DNA sequencing. A total of 83 unmapped consanguineous families were screened for mutations in 33 known candidate genes. Results: Patients in 32 arCC families were homozygous for markers near at least 1 of the 33 known CC genes. Sequencing the included genes revealed homozygous cosegregating sequence changes in 10 families, 2 of which had the same variation. These included five missense, one nonsense, two frame shift, and one splice site mutations, eight of which were novel, in EPHA2, FOXE3, FYCO1, TDRD7, MIP, GALK1, and CRYBA4. Conclusions: The above results confirm the usefulness of homozygosity mapping for identifying genetic defects underlying autosomal recessive disorders in consanguineous families. In our ongoing study of arCC in Pakistan, including 83 arCC families that underwent homozygosity mapping, 3 mapped using genome-wide linkage analysis in unpublished data, and 30 previously reported families, mutations were detected in approximately 37.1% (43/116) of all families studied, suggesting that additional genes might be responsible in the remaining families. The most commonly mutated gene was FYCO1 (14%), followed by CRYBB3 (5.2%), GALK1 (3.5%), and EPHA2 (2.6%). This provides the first comprehensive description of the genetic architecture of arCC in the Pakistani population. PMID:28418495
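The overall detection rate quoted in the conclusions can be reproduced from the family counts given in the same sentence. This is a small arithmetic sketch; the dictionary keys and variable names are illustrative labels, not terms from the paper.

```python
# Family counts as stated in the abstract's conclusions.
families = {
    "homozygosity_mapping": 83,
    "genome_wide_linkage": 3,
    "previously_reported": 30,
}
total_families = sum(families.values())
families_with_mutation = 43

detection_rate = round(100 * families_with_mutation / total_families, 1)
print(total_families, detection_rate)
```

The three groups sum to the 116 families used as the denominator, and 43 of 116 rounds to the reported 37.1%.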
VitisExpDB: A database resource for grape functional genomics
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-01-01
Background: The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. Description: VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties along with information on their percent nucleotide identities, phylogenetic relationship and common primers can be retrieved. The database also includes information on probe sequence and annotation features of the high density 60-mer gene expression chip consisting of a non-redundant set of ~20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database along with other sequence analysis tools have been added. In addition, users can submit their ESTs to the database. Conclusion: The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection and for grape genome annotation and gene function identification. The VitisExpDB database is available through our website. PMID:18307813
Linking microarray reporters with protein functions.
Gaj, Stan; van Erk, Arie; van Haaften, Rachel I M; Evelo, Chris T A
2007-09-26
The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways. This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset, that was derived from and crosslinked back to the highly curated UniProt database. The resulting alignments were filtered using high quality alignment criteria and further compared with the outcome of a more traditional approach, where reporter sequences were BLASTed against EnsEMBL followed by locating the corresponding protein (UniProt) entry for the high quality hits. Combining the results of both methods resulted in successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the amount of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of the total reporters are now linked towards GO nodes and 7.1% on local pathways. Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.
Ziada-Bouchaar, H; Sifi, K; Filali, T; Hammada, T; Satta, D; Abadi, N
2017-01-01
Hereditary non-polyposis colorectal cancer (HNPCC) is an autosomal dominant disorder characterized by the early onset of colorectal cancer (CRC) linked to germline defects in mismatch repair (MMR) genes. We present here the first molecular study of the correlation between CRC and mutations in these genes, performed in twenty-one unrelated Algerian families. The presence of germline mutations in the MMR genes MLH1, MSH2 and MSH6 was tested by sequencing all exons plus adjacent intronic sequences, and multiplex ligation-dependent probe amplification (MLPA) was used to test for large genomic rearrangements. Pathogenic mutations were identified in 20% of families with clinical suspicion of HNPCC. Two novel variants, described for the first time in Algerian families, were identified: c.881_884delTCAGinsCATTCCT in MLH1 and a large deletion in MSH6 in a patient with young-onset CRC. Moreover, the MSH2 variants c.942+3A>T and c.1030C>T, the most frequently described ones, were also detected in Algerian families. Furthermore, HNPCC families with MSH6 germline mutations may show an age of onset comparable to that of patients with MLH1 and MSH2 mutations. In this study, we confirmed that MSH2, MLH1, and MSH6 contribute to CRC susceptibility. This work represents the implementation of a diagnostic algorithm for the identification of Lynch syndrome patients in Algerian families.
Cristancho, Marco A.; Botero-Rozo, David Octavio; Giraldo, William; Tabima, Javier; Riaño-Pachón, Diego Mauricio; Escobar, Carolina; Rozo, Yomara; Rivera, Luis F.; Durán, Andrés; Restrepo, Silvia; Eilam, Tamar; Anikster, Yehoshua; Gaitán, Alvaro L.
2014-01-01
Coffee leaf rust caused by the fungus Hemileia vastatrix is the most damaging disease to coffee worldwide. The pathogen has recently appeared in multiple outbreaks in coffee producing countries resulting in significant yield losses and increases in costs related to its control. New races/isolates are constantly emerging as evidenced by the presence of the fungus in plants that were previously resistant. Genomic studies are opening new avenues for the study of the evolution of pathogens, the detailed description of plant-pathogen interactions and the development of molecular techniques for the identification of individual isolates. For this purpose we sequenced 8 different H. vastatrix isolates using NGS technologies and gathered partial genome assemblies due to the large repetitive content in the coffee rust hybrid genome; 74.4% of the assembled contigs harbor repetitive sequences. A hybrid assembly of 333 Mb was built based on the 8 isolates; this assembly was used for subsequent analyses. Analysis of the conserved gene space showed that the hybrid H. vastatrix genome, though highly fragmented, had a satisfactory level of completion with 91.94% of core protein-coding orthologous genes present. RNA-Seq from urediniospores was used to guide the de novo annotation of the H. vastatrix gene complement. In total, 14,445 genes organized in 3921 families were uncovered; a considerable proportion of the predicted proteins (73.8%) were homologous to other Pucciniales species genomes. Several gene families related to the fungal lifestyle were identified, particularly 483 predicted secreted proteins that represent candidate effector genes and will provide interesting hints to decipher virulence in the coffee rust fungus. The genome sequence of Hva will serve as a template to understand the molecular mechanisms used by this fungus to attack the coffee plant, to study the diversity of this species and for the development of molecular markers to distinguish races/isolates. 
PMID:25400655
Zhang, Jian-Ming; Zhang, Zhi-Shan; Deng, Yan-Qin; Wu, Shou-Li; Wang, Wei; Yan, Yan-Sheng
2017-08-30
Rabies is a global fatal infectious viral disease that is characterized by a high mortality after onset of clinical symptoms. Recently, there has been an increase in the incidence of rabies in China. The aim of this study was to investigate the incidence of human rabies and characterize the rabies virus nucleoprotein gene in dogs sampled from Fujian Province, Southeast China from 2002 to 2012. Data pertaining to human rabies cases in Fujian Province during the period from 2002 through 2012 were collected, and the epidemiological profiles were described. The saliva and brain specimens were collected from dogs in Quanzhou, Longyan and Sanming cities of the province, and the rabies virus antigen was determined in the canine saliva specimens using an ELISA assay. Rabies virus RNA was extracted from canine brain specimens, and rabies virus nucleoprotein gene was amplified using a nested RT-PCR assay, followed by sequencing and genotyping. A total of 226 human rabies cases were reported in Fujian Province from 2002 to 2012, in which 197 cases were detected in three cities of Quanzhou, Longyan and Sanming. ELISA assay revealed positive rabies virus antigen in six of eight rabid dogs and 165 of 3492 seemingly healthy dogs. The full-length gene fragment of the rabies virus nucleoprotein gene was amplified from the brain specimens of seven rabid dogs and 12 seemingly healthy dogs. Sequence alignment and phylogenetic analysis revealed that these 19 rabies virus nucleoprotein genes all belonged to genotype I, and were classified into three genetic groups. Sequencing analysis showed a 99.7% to 100% intra-group and an 86.4% to 89.3% inter-group homology. This study is the first description pertaining to the epidemiological characteristics of human rabies cases and characterization of the rabies virus nucleoprotein gene in dogs in Fujian Province, Southeast China. 
Our findings may provide valuable knowledge for the development of strategies targeting the prevention and control of rabies.
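Intra-group and inter-group similarity figures such as those quoted above are typically derived from pairwise comparisons of aligned sequences. A minimal sketch of one common definition (identical positions divided by positions where neither sequence has a gap) is shown below; the function name is hypothetical and the abstract does not specify which definition or software the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise or multiple alignment.

    Columns in which either sequence has a gap ('-') are ignored. This is
    one of several conventions in use; it is assumed here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Applying such a function to every pair of sequences within a genetic group, and to every pair drawn from different groups, gives the kind of intra-group and inter-group ranges reported in the abstract.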
Harris, D James; Damas-Moreira, Isabel; Maia, João P M C; Perera, Ana
2014-02-01
Hepatozoon spp. are identified for the first time in the amphibian order Gymnophiona, or caecilians, from the Seychelles island of Silhouette. Estimate of relationships derived from partial 18S rRNA gene sequences indicate these are not related to Hepatozoon spp. from frogs or to other Hepatozoon spp. from reptiles in the Seychelles. Assessment of mature gamonts from blood smears indicate that these can be recognized as a new species, Hepatozoon seychellensis n. sp.
Geomyces destructans sp. nov. associated with bat white-nose syndrome
Gargas, Andrea; Trest, M.T.; Christensen, M.; Volk, T.J.; Blehert, David S.
2009-01-01
We describe and illustrate the new species Geomyces destructans. Bats infected with this fungus present with powdery conidia and hyphae on their muzzles, wing membranes, and/or pinnae, leading to description of the accompanying disease as white-nose syndrome, a cause of widespread mortality among hibernating bats in the northeastern US. Based on rRNA gene sequence (ITS and SSU) characters the fungus is placed in the genus Geomyces, yet its distinctive asymmetrically curved conidia are unlike those of any described Geomyces species.
Genome-Based Taxonomic Classification of Bacteroidetes
Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; ...
2016-12-20
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
Genome-Based Taxonomic Classification of Bacteroidetes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved.
Genome-Based Taxonomic Classification of Bacteroidetes
Hahnke, Richard L.; Meier-Kolthoff, Jan P.; García-López, Marina; Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia N.; Woyke, Tanja; Kyrpides, Nikos C.; Klenk, Hans-Peter; Göker, Markus
2016-01-01
The bacterial phylum Bacteroidetes, characterized by a distinct gliding motility, occurs in a broad variety of ecosystems, habitats, life styles, and physiologies. Accordingly, taxonomic classification of the phylum, based on a limited number of features, proved difficult and controversial in the past, for example, when decisions were based on unresolved phylogenetic trees of the 16S rRNA gene sequence. Here we use a large collection of type-strain genomes from Bacteroidetes and closely related phyla for assessing their taxonomy based on the principles of phylogenetic classification and trees inferred from genome-scale data. No significant conflict between 16S rRNA gene and whole-genome phylogenetic analysis is found, whereas many but not all of the involved taxa are supported as monophyletic groups, particularly in the genome-scale trees. Phenotypic and phylogenomic features support the separation of Balneolaceae as new phylum Balneolaeota from Rhodothermaeota and of Saprospiraceae as new class Saprospiria from Chitinophagia. Epilithonimonas is nested within the older genus Chryseobacterium and without significant phenotypic differences; thus merging the two genera is proposed. Similarly, Vitellibacter is proposed to be included in Aequorivita. Flexibacter is confirmed as being heterogeneous and dissected, yielding six distinct genera. Hallella seregens is a later heterotypic synonym of Prevotella dentalis. Compared to values directly calculated from genome sequences, the G+C content mentioned in many species descriptions is too imprecise; moreover, corrected G+C content values have a significantly better fit to the phylogeny. Corresponding emendations of species descriptions are provided where necessary. Whereas most observed conflict with the current classification of Bacteroidetes is already visible in 16S rRNA gene trees, as expected whole-genome phylogenies are much better resolved. PMID:28066339
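The contrast drawn above between published G+C values and values calculated directly from genome sequences rests on a simple computation. The sketch below is an illustrative assumption of how such a value can be obtained from a sequence string; the function name is hypothetical and this is not the authors' pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) enter the denominator, so runs of
    'N' or other ambiguity codes do not bias the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)
```

For a multi-contig genome the base counts would be summed over all contigs before dividing, rather than averaging per-contig percentages, so that long contigs carry proportionally more weight.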
Gostinčar, Cene; Ohm, Robin A; Kogej, Tina; Sonjak, Silva; Turk, Martina; Zajc, Janja; Zalar, Polona; Grube, Martin; Sun, Hui; Han, James; Sharma, Aditi; Chiniquy, Jennifer; Ngan, Chew Yee; Lipzen, Anna; Barry, Kerrie; Grigoriev, Igor V; Gunde-Cimerman, Nina
2014-07-01
Aureobasidium pullulans is a black-yeast-like fungus used for production of the polysaccharide pullulan and the antimycotic aureobasidin A, and as a biocontrol agent in agriculture. It can cause opportunistic human infections, and it inhabits various extreme environments. To promote the understanding of these traits, we performed de-novo genome sequencing of the four varieties of A. pullulans. The 25.43-29.62 Mb genomes of these four varieties of A. pullulans encode between 10266 and 11866 predicted proteins. Their genomes encode most of the enzyme families involved in degradation of plant material and many sugar transporters, and they have genes possibly associated with degradation of plastic and aromatic compounds. Proteins believed to be involved in the synthesis of pullulan and siderophores, but not of aureobasidin A, are predicted. Putative stress-tolerance genes include several aquaporins and aquaglyceroporins, large numbers of alkali-metal cation transporters, genes for the synthesis of compatible solutes and melanin, all of the components of the high-osmolarity glycerol pathway, and bacteriorhodopsin-like proteins. All of these genomes contain a homothallic mating-type locus. The differences between these four varieties of A. pullulans are large enough to justify their redefinition as separate species: A. pullulans, A. melanogenum, A. subglaciale and A. namibiae. The redundancy observed in several gene families can be linked to the nutritional versatility of these species and their particular stress tolerance. The availability of the genome sequences of the four Aureobasidium species should improve their biotechnological exploitation and promote our understanding of their stress-tolerance mechanisms, diverse lifestyles, and pathogenic potential.
PIGD: a database for intronless genes in the Poaceae.
Yan, Hanwei; Jiang, Cuiping; Li, Xiaoyu; Sheng, Lei; Dong, Qing; Peng, Xiaojian; Li, Qian; Zhao, Yang; Jiang, Haiyang; Cheng, Beijiu
2014-10-01
Intronless genes are a feature of prokaryotes; however, they are widespread and unequally distributed among eukaryotes and represent an important resource to study the evolution of gene architecture. Although many databases on exons and introns exist, there is currently no cohesive database that collects intronless genes in plants into a single database. In this study, we present the Poaceae Intronless Genes Database (PIGD), a user-friendly web interface to explore information on intronless genes from different plants. Five Poaceae species, Sorghum bicolor, Zea mays, Setaria italica, Panicum virgatum and Brachypodium distachyon, are included in the current release of PIGD. Gene annotations and sequence data were collected and integrated from different databases. The primary focus of this study was to provide gene descriptions and gene product records. In addition, functional annotations, subcellular localization prediction and taxonomic distribution are reported. PIGD allows users to readily browse, search and download data. BLAST and comparative analyses are also provided through this online database, which is available at http://pigd.ahau.edu.cn/. PIGD provides a solid platform for the collection, integration and analysis of intronless genes in the Poaceae. As such, this database will be useful for subsequent bio-computational analysis in comparative genomics and evolutionary studies.
Delahay, Robin M; Croxall, Nicola J; Stephens, Amberley D
2018-01-01
The genome of the gastric pathogen Helicobacter pylori is characterised by considerable variation of both gene sequence and content, much of which is contained within three large genomic islands comprising the cag pathogenicity island (cag PAI) and two mobile integrative and conjugative elements (ICEs) termed tfs3 and tfs4. All three islands are implicated as virulence factors, although whereas the cag PAI is well characterised, understanding of how the tfs elements influence H. pylori interactions with different human hosts is significantly confounded by limited definition of their distribution, diversity and structural representation in the global H. pylori population. To gain a global perspective of tfs ICE population dynamics we established a bioinformatics workflow to extract and precisely define the full tfs pan-gene content contained within a global collection of 221 draft and complete H. pylori genome sequences. Complete (ca. 35-55 kbp) and remnant tfs ICE clusters were reconstructed from a dataset comprising > 12,000 genes, from which orthologous gene complements and distinct alleles descriptive of different tfs ICE types were defined and classified in comparative analyses. The genetic variation within defined ICE modular segments was subsequently used to provide a complete description of tfs ICE diversity and a comprehensive assessment of their phylogeographic context. Our further examination of the apparent ICE modular types identified an ancient and complex history of ICE residence, mobility and interaction within particular H. pylori phylogeographic lineages and further, provided evidence of both contemporary inter-lineage and inter-species ICE transfer and displacement. Our collective results establish a clear view of tfs ICE diversity and phylogeographic representation in the global H. pylori population, and provide a robust contextual framework for elucidating the functional role of the tfs ICEs particularly as it relates to the risk of gastric disease associated with different tfs ICE genotypes.
Non-contiguous finished genome sequence and description of Collinsella massiliensis sp. nov.
Padmanabhan, Roshan; Dubourg, Gregory; Nguyen, Thi-Thien; Couderc, Carine; Rossi-Tamisier, Morgane; Caputo, Aurelia; Raoult, Didier; Fournier, Pierre-Edouard
2014-01-01
Collinsella massiliensis strain GD3T is the type strain of Collinsella massiliensis sp. nov., a new species within the genus Collinsella. This strain, whose genome is described here, was isolated from the fecal flora of a 53-year-old French Caucasoid woman who had been admitted to intensive care unit for Guillain-Barré syndrome. Collinsella massiliensis is a Gram-positive, obligate anaerobic, non motile and non sporulating bacillus. Here, we describe the features of this organism, together with the complete genome sequence and annotation. The genome is 2,319,586 bp long (1 chromosome, no plasmid), exhibits a G+C content of 65.8% and contains 2,003 protein-coding and 54 RNA genes, including 1 rRNA operon. PMID:25197489
Non-contiguous finished genome sequence and description of Collinsella massiliensis sp. nov.
Padmanabhan, Roshan; Dubourg, Gregory; Nguyen, Thi-Thien; Couderc, Carine; Rossi-Tamisier, Morgane; Caputo, Aurelia; Raoult, Didier; Fournier, Pierre-Edouard
2014-06-15
Collinsella massiliensis strain GD3(T) is the type strain of Collinsella massiliensis sp. nov., a new species within the genus Collinsella. This strain, whose genome is described here, was isolated from the fecal flora of a 53-year-old French Caucasoid woman who had been admitted to intensive care unit for Guillain-Barré syndrome. Collinsella massiliensis is a Gram-positive, obligate anaerobic, non motile and non sporulating bacillus. Here, we describe the features of this organism, together with the complete genome sequence and annotation. The genome is 2,319,586 bp long (1 chromosome, no plasmid), exhibits a G+C content of 65.8% and contains 2,003 protein-coding and 54 RNA genes, including 1 rRNA operon.
Yang, Seung-Jo; Cho, Jang-Cheon
2008-02-01
A Gram-negative, yellow-coloured, chemoheterotrophic, non-motile, strictly aerobic, rod-shaped bacterium, designated IMCC1914(T), was isolated from coastal surface seawater of the Yellow Sea, Korea. The temperature, pH and NaCl ranges for growth were 3-37 °C, pH 8.0-11.0 and 0.5-4.0 %. The DNA G+C content of the strain was 38.1 mol% and the major cellular fatty acids were iso-C(15 : 1) (32.1 %), iso-C(15 : 0) (20.6 %) and iso-C(17 : 0) 3-OH (7.8 %). Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain IMCC1914(T) was related most closely to Gaetbulibacter saemankumensis SMK-12(T), with a sequence similarity of 96.2 %. On the basis of phylogenetic data and several distinct phenotypic characteristics, strain IMCC1914(T) (=KCCM 42380(T) =NBRC 102040(T)) could be assigned to the genus Gaetbulibacter as the type strain of a novel species, for which the name Gaetbulibacter marinus sp. nov. is proposed. In addition, an emended description of the genus Gaetbulibacter is presented.
Metabolism and Genetics of Helicobacter pylori: the Genome Era
Marais, Armelle; Mendz, George L.; Hazell, Stuart L.; Mégraud, Francis
1999-01-01
The publication of the complete sequence of Helicobacter pylori 26695 in 1997 and more recently that of strain J99 has provided new insight into the biology of this organism. In this review, we attempt to analyze and interpret the information provided by sequence annotations and to compare these data with those provided by experimental analyses. After a brief description of the general features of the genomes of the two sequenced strains, the principal metabolic pathways are analyzed. In particular, the enzymes encoded by H. pylori involved in fermentative and oxidative metabolism, lipopolysaccharide biosynthesis, nucleotide biosynthesis, aerobic and anaerobic respiration, and iron and nitrogen assimilation are described, and the areas of controversy between the experimental data and those provided by the sequence annotation are discussed. The role of urease, particularly in pH homeostasis, and other specialized mechanisms developed by the bacterium to maintain its internal pH are also considered. The replicational, transcriptional, and translational apparatuses are reviewed, as is the regulatory network. The numerous findings on the metabolism of the bacteria and the paucity of gene expression regulation systems are indicative of the high level of adaptation to the human gastric environment. Arguments in favor of the diversity of H. pylori and molecular data reflecting possible mechanisms involved in this diversity are presented. Finally, we compare the numerous experimental data on the colonization factors and those provided from the genome sequence annotation, in particular for genes involved in motility and adherence of the bacterium to the gastric tissue. PMID:10477311
2009-01-01
Background The full power of modern genetics has been applied to the study of speciation in only a small handful of genetic model species - all of which speciated allopatrically. Here we report the first large expressed sequence tag (EST) study of a candidate for ecological sympatric speciation, the apple maggot Rhagoletis pomonella, using massively parallel pyrosequencing on the Roche 454-FLX platform. To maximize transcript diversity we created and sequenced separate libraries from larvae, pupae, adult heads, and headless adult bodies. Results We obtained 239,531 sequences which assembled into 24,373 contigs. A total of 6810 unique protein coding genes were identified among the contigs and long singletons, corresponding to 48% of all known Drosophila melanogaster protein-coding genes. Their distribution across GO classes suggests that we have obtained a representative sample of the transcriptome. Among these sequences are many candidates for potential R. pomonella "speciation genes" (or "barrier genes") such as those controlling chemosensory and life-history timing processes. Furthermore, we identified important marker loci including more than 40,000 single nucleotide polymorphisms (SNPs) and over 100 microsatellites. An initial search for SNPs at which the apple and hawthorn host races differ suggested at least 75 loci warranting further work. We also determined that developmental expression differences remained even after normalization; transcripts expected to show different expression levels between larvae and pupae in D. melanogaster also did so in R. pomonella. Preliminary comparative analysis of transcript presences and absences revealed evidence of gene loss in Drosophila and gain in the higher dipteran clade Schizophora. Conclusions These data provide a much needed resource for exploring mechanisms of divergence in this important model for sympatric ecological speciation. Our description of ESTs from a substantial portion of the R. pomonella transcriptome will facilitate future functional studies of candidate genes for olfaction and diapause-related life history timing, and will enable large scale expression studies. Similarly, the identification of new SNP and microsatellite markers will facilitate future population and quantitative genetic studies of divergence between the apple and hawthorn-infesting host races. PMID:20035631
Kim, Shin-Hee; Nayak, Subhashree; Paldurai, Anandan; Nayak, Baibaswata; Samuel, Arthur; Aplogan, Gilbert L.; Awoume, Kodzo A.; Webby, Richard J.; Ducatez, Mariette F.; Collins, Peter L.
2012-01-01
The complete genome sequence of an African Newcastle disease virus (NDV) strain isolated from a chicken in Togo in 2009 was determined. The genome is 15,198 nucleotides (nt) in length and is classified in genotype VII in the class II cluster. Compared to common vaccine strains, the African strain contains a previously described 6-nt insert in the downstream untranslated region of the N gene and a novel 6-nt insert in the HN-L intergenic region. Genome length differences are a marker of the natural history of NDV. This is the first description of a class II NDV strain with a genome of 15,198 nt and a 6-nt insert in the HN-L intergenic region. Sequence divergence relative to vaccine strains was substantial, likely contributes to outbreaks, and illustrates the continued evolution of new NDV strains in West Africa. PMID:22997417
Zhu, Xun; Xie, Shangbo; Armengaud, Jean; Xie, Wen; Guo, Zhaojiang; Kang, Shi; Wu, Qingjun; Wang, Shaoli; Xia, Jixing; He, Rongjun; Zhang, Youjun
2016-06-01
The diamondback moth, Plutella xylostella (L.), is the major cosmopolitan pest of brassica and other cruciferous crops. Its larval midgut is a dynamic tissue that interfaces with a wide variety of toxicological and physiological processes. The draft sequence of the P. xylostella genome was recently released, but its annotation remains challenging because of the low sequence coverage of this branch of life and the poor description of exon/intron splicing rules for these insects. Peptide sequencing by computational assignment of tandem mass spectra to genome sequence information provides an experimentally independent approach for confirming or refuting protein predictions, a concept that has been termed proteogenomics. In this study, we carried out an in-depth proteogenomic analysis to complement genome annotation of the P. xylostella larval midgut based on shotgun HPLC-ESI-MS/MS data by means of a multialgorithm pipeline. A total of 876,341 tandem mass spectra were searched against the predicted P. xylostella protein sequences and a whole-genome six-frame translation database. Based on a data set comprising 2694 novel genome-search-specific peptides, we discovered 439 novel protein-coding genes and corrected 128 existing gene models. To get the most accurate data to seed further insect genome annotation, more than half of the novel protein-coding genes, i.e. 235 of 439, were further validated after RT-PCR amplification and sequencing of the corresponding transcripts. Furthermore, we validated 53 novel alternative splicings. Finally, a total of 6764 proteins were identified, resulting in one of the most comprehensive proteogenomic studies of a nonmodel animal. As the first tissue-specific proteogenomics analysis of P. xylostella, this study provides the fundamental basis for high-throughput proteomics and functional genomics approaches aimed at deciphering the molecular mechanisms of resistance and controlling this pest.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
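The whole-genome six-frame translation database mentioned in the preceding abstract is built by translating each genomic sequence in three reading frames on each strand. The sketch below illustrates the principle with the standard genetic code; the function names are hypothetical, and a real proteogenomic pipeline would additionally split translations at stop codons and apply length filters, which are omitted here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna: str) -> dict:
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Peptides that match only such translations, and not the predicted protein set, are the kind of evidence that points to unannotated or mis-modelled genes.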
Carapelli, Antonio; Comandi, Sara; Convey, Peter; Nardi, Francesco; Frati, Francesco
2008-01-01
Background Mitogenomics data, i.e. complete mitochondrial genome sequences, are popular molecular markers used for phylogenetic, phylogeographic and ecological studies in different animal lineages. Their comparative analysis has been used to shed light on the evolutionary history of given taxa and on the molecular processes that regulate the evolution of the mitochondrial genome. A considerable literature is available in the fields of invertebrate biochemical and ecophysiological adaptation to extreme environmental conditions, exemplified by those of the Antarctic. Nevertheless, limited molecular data are available from terrestrial Antarctic species, and this study represents the first attempt towards the description of a mitochondrial genome from one of the most widespread and common collembolan species of Antarctica. Results In this study we describe the mitochondrial genome of the Antarctic collembolan Cryptopygus antarcticus Willem, 1901. The genome contains the standard set of 37 genes usually present in animal mtDNAs and a large non-coding fragment putatively corresponding to the region (A+T-rich) responsible for the control of replication and transcription. All genes are arranged in the gene order typical of Pancrustacea. Three additional short non-coding regions are present at gene junctions. Two of these are located in positions of abrupt shift of the coding polarity of genes oriented on opposite strands suggesting a role in the attenuation of the polycistronic mRNA transcription(s). In addition, remnants of an additional copy of trnL(uag) are present between trnS(uga) and nad1. Nucleotide composition is biased towards a high A% and T% (A+T = 70.9%), as typically found in hexapod mtDNAs. There is also a significant strand asymmetry, with the J-strand being more abundant in A and C. 
Within the A+T-rich region, some short sequence fragments appear to be similar (in position and primary sequence) to those involved in the origin of the N-strand replication of the Drosophila mtDNA. Conclusion The mitochondrial genome of C. antarcticus shares several features with other pancrustacean genomes, although the presence of unusual non-coding regions is also suggestive of molecular rearrangements that probably occurred before the differentiation of major collembolan families. Closer examination of gene boundaries also confirms previous observations on the presence of unusual start and stop codons, and suggests a role for tRNA secondary structures as potential cleavage signals involved in the maturation of the primary transcript. Sequences potentially involved in the regulation of replication/transcription are present both in the A+T-rich region and in other areas of the genome. Their position is similar to that observed in a limited number of insect species, suggesting unique replication/transcription mechanisms for basal and derived hexapod lineages. This initial description and characterization of the mitochondrial genome of C. antarcticus will constitute the essential foundation prerequisite for investigations of the evolutionary history of one of the most speciose collembolan genera present in Antarctica and other localities of the Southern Hemisphere. PMID:18593463
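The nucleotide bias and strand asymmetry reported above are commonly quantified as A+T percentage, AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C), computed on one strand. The sketch below assumes these widely used definitions; the abstract itself does not state the formulas, and the function name is hypothetical.

```python
from collections import Counter


def composition_and_skew(sequence: str) -> dict:
    """Return A+T percentage, AT-skew and GC-skew for one DNA strand.

    AT-skew = (A - T) / (A + T); GC-skew = (G - C) / (G + C).
    A strand richer in A and C has positive AT-skew and negative GC-skew.
    """
    counts = Counter(sequence.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    if a + t == 0 or g + c == 0:
        raise ValueError("need at least one A/T and one G/C base")
    total = a + c + g + t
    return {
        "at_percent": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }
```

Under these definitions, a J-strand that is more abundant in A and C, as described in the abstract, would show a positive AT-skew together with a negative GC-skew.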
Maurya, Anand Prakash; Das Talukdar, Anupam; Chanda, Debadatta Dhar; Chakravarty, Atanu; Bhattacharjee, Amitabha
2016-01-01
The present study aimed to investigate the genetic context, association with IS26 and horizontal transmission of SHV-148 among Escherichia coli in a tertiary referral hospital in India. Phenotypic characterisation of extended-spectrum beta-lactamases (ESBLs) was carried out as per CLSI criteria. Molecular characterisation of blaSHV and integrons was carried out by polymerase chain reaction (PCR) assay and confirmed by sequencing. Linkage of IS26 with blaSHV-148 was established by PCR. Purified products were cloned into the pGEM-T vector and sequenced. Strain typing was performed by pulsed-field gel electrophoresis with XbaI digestion. Transferability experiments and antimicrobial susceptibility testing were performed. A total of 33 isolates showed the presence of the SHV-148 variant by sequencing and all were Class 1 integron borne. PCR and sequencing results suggested that all blaSHV-148 showed linkage with IS26 and were present in the upstream portion of the gene cassette and were also horizontally transferable through F type of Inc group. Susceptibility results suggest that tigecycline was most effective. The present study is the first report of SHV-148-mediated extended-spectrum cephalosporin resistance from India. Association of this resistance gene with IS26 and a Class 1 integron, and its carriage within an IncF plasmid, indicates a potential mobilising unit for horizontal transfer.
Molecular evidence for piroplasms in wild Reeves' muntjac (Muntiacus reevesi) in China.
Yang, Ji-fei; Li, You-quan; Liu, Zhi-jie; Liu, Jun-long; Guan, Gui-quan; Chen, Ze; Luo, Jian-xun; Wang, Xiao-long; Yin, Hong
2014-10-01
DNA from liver samples of 17 free-ranging wild Reeves' muntjac (Muntiacus reevesi) was used for PCR amplification of the piroplasm 18S rRNA gene. Of 17 samples, 14 (82.4%) showed a specific PCR product; these products were cloned and sequenced. BLAST analysis of the sequences obtained showed similarities to Babesia sp., Theileria capreoli, Theileria uilenbergi and Theileria sp. BO302-SE. Phylogenetic analysis showed that the Babesia sp. detected in the present study was distantly separated from known Babesia species of wild and domestic animals. Six sequences showed 100% similarity to T. capreoli while five sequences were separated from all known Theileria species and constituted an independent clade with Theileria sp. BO302-SE derived from roe deer in Italy; two sequences were close to T. uilenbergi with 97% similarity. This is the first description of hemoparasite infection in free-ranging wild Reeves' muntjac in China. Our results indicate that wild Reeves' muntjac may play an important reservoir role for hemoparasites. Crown Copyright © 2014. Published by Elsevier Ireland Ltd. All rights reserved.
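The similarity percentages in this abstract come from BLAST, whose exact settings are not given. As a rough illustration of what such a figure measures, the sketch below computes percent identity between two sequences that are assumed to be already aligned. Gap columns are skipped here, which is one possible convention and an assumption of this sketch; BLAST itself reports identity over the alignment length. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not 18S rRNA data).
identity = percent_identity("ACGTACGTAC", "ACGTACGTAT")

# The prevalence quoted in the abstract: 14 positive samples out of 17.
prevalence = round(100.0 * 14 / 17, 1)
print(identity, prevalence)
```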
2012-01-01
Background MicroRNAs (miRNAs) are one of the functional non-coding small RNAs involved in the epigenetic control of the plant genome. Although plants contain both evolutionary conserved miRNAs and species-specific miRNAs within their genomes, computational methods often only identify evolutionary conserved miRNAs. The recent sequencing of the Brassica rapa genome enables us to identify miRNAs and their putative target genes. In this study, we sought to provide a more comprehensive prediction of B. rapa miRNAs based on high throughput small RNA deep sequencing. Results We sequenced small RNAs from five types of tissue: seedlings, roots, petioles, leaves, and flowers. By analyzing 2.75 million unique reads that mapped to the B. rapa genome, we identified 216 novel and 196 conserved miRNAs that were predicted to target approximately 20% of the genome’s protein coding genes. Quantitative analysis of miRNAs from the five types of tissue revealed that novel miRNAs were expressed in diverse tissues but their expression levels were lower than those of the conserved miRNAs. Comparative analysis of the miRNAs between the B. rapa and Arabidopsis thaliana genomes demonstrated that redundant copies of conserved miRNAs in the B. rapa genome may have been deleted after whole genome triplication. Novel miRNA members seemed to have spontaneously arisen from the B. rapa and A. thaliana genomes, suggesting the species-specific expansion of miRNAs. We have made this data publicly available in a miRNA database of B. rapa called BraMRs. The database allows the user to retrieve miRNA sequences, their expression profiles, and a description of their target genes from the five tissue types investigated here. Conclusions This is the first report to identify novel miRNAs from Brassica crops using genome-wide high throughput techniques. The combination of computational methods and small RNA deep sequencing provides robust predictions of miRNAs in the genome. 
The finding of numerous novel miRNAs, many with few target genes and low expression levels, suggests the rapid evolution of miRNA genes. The development of a miRNA database, BraMRs, enables us to integrate miRNA identification, target prediction, and functional annotation of target genes. BraMRs will represent a valuable public resource with which to study the epigenetic control of B. rapa and other closely related Brassica species. The database is available at the following link: http://bramrs.rna.kr [1]. PMID:23163954
Manamgoda, D.S.; Rossman, A.Y.; Castlebury, L.A.; Crous, P.W.; Madrid, H.; Chukeatirote, E.; Hyde, K.D.
2014-01-01
The genus Bipolaris includes important plant pathogens with worldwide distribution. Species recognition in the genus has been uncertain due to the lack of molecular data from ex-type cultures as well as overlapping morphological characteristics. In this study, we revise the genus Bipolaris based on DNA sequence data derived from living cultures of fresh isolates, available ex-type cultures from worldwide collections and observation of type and additional specimens. Combined analyses of ITS, GPDH and TEF gene sequences were used to reconstruct the molecular phylogeny of the genus Bipolaris for species with living cultures. The GPDH gene is determined to be the best single marker for species of Bipolaris. Generic boundaries between Bipolaris and Curvularia are revised and presented in an updated combined ITS and GPDH phylogenetic tree. We accept 47 species in the genus Bipolaris and clarify the taxonomy, host associations, geographic distributions and species’ synonymies. Modern descriptions and illustrations are provided for 38 species in the genus with notes provided for the other taxa when recent descriptions are available. Bipolaris cynodontis, B. oryzae, B. victoriae, B. yamadae and B. zeicola are epi- or neotypified and a lectotype is designated for B. stenospila. Excluded and doubtful species are listed with notes on taxonomy and phylogeny. Seven new combinations are introduced in the genus Curvularia to accommodate the species of Bipolaris transferred based on the phylogenetic analysis. A taxonomic key is provided for the morphological identification of species within the genus. PMID:25492990
Margesin, Rosa; Zhang, De-Chao; Frasson, David; Brouchkov, Anatoli
2016-02-01
The bacterial strain N1-38T was isolated from ancient Siberian permafrost sediment. The strain was Gram-reaction-negative, motile by gliding, rod-shaped and psychrophilic, and showed good growth over a temperature range of -5 to 25 °C. Phylogenetic analysis of 16S rRNA gene sequences revealed that strain N1-38T was most closely related to members of the genus Glaciimonas and shared the highest 16S rRNA gene sequence similarities with the type strains of Glaciimonas alpina (99.3 %), Glaciimonas immobilis (98.9 %) and Glaciimonas singularis (96.5 %). The predominant cellular fatty acids of strain N1-38T were summed feature 3 (C16:1 ω7c and/or iso-C15:0 2-OH), C16:0 and C18:1 ω7c. The major respiratory quinone was ubiquinone 8 and the major polar lipids were phosphatidylethanolamine and diphosphatidylglycerol. The genomic DNA G+C content was 53.0 mol%. Combined data of phenotypic, phylogenetic and DNA-DNA relatedness studies demonstrated that strain N1-38T represents a novel species of the genus Glaciimonas, for which the name Glaciimonas frigoris sp. nov. is proposed. The type strain is N1-38T (= LMG 28868T = CCOS 838T). An emended description of the genus Glaciimonas is also provided.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Onda, M.; Kudo, S.; Fukuda, M.
Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.
Girlich, Delphine; Dortet, Laurent; Poirel, Laurent; Nordmann, Patrice
2015-01-01
This study aimed to decipher the mechanisms, and their associated genetic determinants, responsible for β-lactam resistance in a Proteus mirabilis clinical isolate. The entire genetic structure surrounding the β-lactam resistance genes was characterized by PCR, gene walking and DNA sequencing. Genes encoding the carbapenemase NDM-1 and the ESBL VEB-6 were located in a 38.5 kb MDR structure, which itself was inserted into a new variant of the Proteus genomic island 1 (PGI1). This new PGI1-PmPEL variant of 64.4 kb was chromosomally located, as an external circular form in the P. mirabilis isolate, suggesting potential mobility. This is the first known description of the bla(NDM-1) gene in a genomic island structure, which might further enhance the spread of the bla(NDM-1) carbapenemase gene among enteric pathogens. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved.
Gharsa, H; Slama, K Ben; Gómez-Sanz, E; Gómez, P; Klibi, N; Zarazaga, M; Boudabous, A; Torres, C
2015-07-01
Staphylococcus intermedius group (SIG) bacteria can colonise the nares of some animals but are also emerging pathogens in humans and animals. To analyse SIG nasal carriage in healthy donkeys destined for food consumption in Tunisia and to characterise recovered isolates. Nasal swabs from 100 healthy donkeys were tested for SIG recovery, and isolates were identified by biochemical and molecular methods. Antimicrobial susceptibility of isolates was tested and detection of antimicrobial resistance and virulence genes was performed. Isolates were typed at the clonal level by multilocus sequence typing and SmaI pulsed-field gel electrophoresis. Staphylococcus delphini and Staphylococcus pseudintermedius (included in SIG) were obtained in 19% and 2% of the tested samples, respectively, and one isolate per sample was characterised. All isolates were meticillin susceptible and mecA negative. Most S. delphini and S. pseudintermedius isolates showed susceptibility to all antimicrobials tested, with the exception of 2 isolates resistant to tetracycline (tet(M) gene) or fusidic acid. The following toxin genes were identified (percentage of isolates): lukS-I (100%), lukF-I (9.5%), siet (100%), se-int (90%), seccanine (19%) and expA (9.5%). Thirteen different pulsed-field gel electrophoresis profiles were identified among the 21 SIG isolates. Additionally, the following 9 different sequence types (STs) were detected by multilocus sequence typing, 6 of them new: ST219 (6 isolates), ST12 (5 isolates), ST220 (3 isolates), ST13, ST50, ST193, ST196, ST218 and ST221 (one isolate each). Staphylococcus delphini and S. pseudintermedius are common nasal colonisers of donkeys, generally susceptible to the antimicrobials tested; nevertheless, these SIG isolates contain virulence genes, including the recently described exfoliative gene (expA) and several enterotoxin genes, with potential implications for public health. This is the first description of S. delphini in Tunisia. 
© 2014 EVJ Ltd.
Targeted Exon Sequencing in Usher Syndrome Type I
Bujakowska, Kinga M.; Consugar, Mark; Place, Emily; Harper, Shyana; Lena, Jaclyn; Taub, Daniel G.; White, Joseph; Navarro-Gomez, Daniel; Weigel DiFranco, Carol; Farkas, Michael H.; Gai, Xiaowu; Berson, Eliot L.; Pierce, Eric A.
2014-01-01
Purpose. Patients with Usher syndrome type I (USH1) have retinitis pigmentosa, profound congenital hearing loss, and vestibular ataxia. This syndrome is currently thought to be associated with at least six genes, which are encoded by over 180 exons. Here, we present the use of state-of-the-art techniques in the molecular diagnosis of a cohort of 47 USH1 probands. Methods. The cohort was studied with selective exon capture and next-generation sequencing of currently known inherited retinal degeneration genes, comparative genomic hybridization, and Sanger sequencing of new USH1 exons identified by human retinal transcriptome analysis. Results. With this approach, we were able to genetically solve 14 of the 47 probands by confirming the biallelic inheritance of mutations. We detected two likely pathogenic variants in an additional 19 patients, for whom family members were not available for cosegregation analysis to confirm biallelic inheritance. Ten patients, in addition to primary disease–causing mutations, carried rare likely pathogenic USH1 alleles or variants in other genes associated with deaf-blindness, which may influence disease phenotype. Twenty-one of the identified mutations were novel among the 33 definite or likely solved patients. Here, we also present a clinical description of the studied cohort at their initial visits. Conclusions. We found a remarkable genetic heterogeneity in the studied USH1 cohort with multiplicity of mutations, of which many were novel. No obvious influence of genotype on phenotype was found, possibly due to small sample sizes of the genotypes under study. PMID:25468891
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scheuner, Carmen; Tindall, Brian J.; Lu, Megan; ...
2014-12-08
Planctomyces brasiliensis Schlesner 1990 belongs to the order Planctomycetales, which differs from other bacterial taxa by several distinctive features such as internal cell compartmentalization, multiplication by forming buds directly from the spherical, ovoid or pear-shaped mother cell and a cell wall consisting of a proteinaceous layer rather than a peptidoglycan layer. The first strains of P. brasiliensis, including the type strain IFAM 1448 T, were isolated from a water sample of Lagoa Vermelha, a salt pit near Rio de Janeiro, Brazil. This is the second completed genome sequence of a type strain of the genus Planctomyces to be published and the sixth type strain genome sequence from the family Planctomycetaceae. The 6,006,602 bp long genome with its 4,811 protein-coding and 54 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. Phylogenomic analyses indicate that the classification within the Planctomycetaceae is partially in conflict with its evolutionary history, as the positioning of Schlesneria renders the genus Planctomyces paraphyletic. A re-analysis of published fatty-acid measurements also does not support the current arrangement of the two genera. A quantitative comparison of phylogenetic and phenotypic aspects indicates that the three Planctomyces species with type strains available in public culture collections should be placed in separate genera. Thus the genera Gimesia, Planctopirus and Rubinisphaera are proposed to accommodate P. maris, P. limnophilus and P. brasiliensis, respectively. Pronounced differences between the reported G + C content of Gemmata obscuriglobus, Singulisphaera acidiphila and Zavarzinella formosa and G + C content calculated from their genome sequences call for emendation of their species descriptions.
Lastly, in addition to other features, the range of G + C values reported for the genera within the Planctomycetaceae indicates that the descriptions of the family and the order should be emended.
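The comparison this abstract describes, between a reported G + C value and one calculated from a genome sequence, can be sketched as follows. This is an illustrative outline only: the FASTA text, the reported value and the function name are hypothetical, and the abstract does not state how the authors computed their values.

```python
import io


def gc_percent_from_fasta(handle) -> float:
    """G + C content (mol%) over all records of a FASTA file-like object.

    Header lines (starting with '>') are skipped, and ambiguous bases such
    as N are excluded from both numerator and denominator.
    """
    gc = 0
    total = 0
    for line in handle:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        for base in line.upper():
            if base in "GC":
                gc += 1
                total += 1
            elif base in "AT":
                total += 1
    if total == 0:
        raise ValueError("no A, C, G or T bases found")
    return 100.0 * gc / total


# Toy two-contig FASTA (hypothetical, not a Planctomycetaceae genome).
toy_fasta = io.StringIO(">contig1\nGGCCAT\n>contig2\nGCNNAT\n")
calculated = gc_percent_from_fasta(toy_fasta)

reported = 50.0  # hypothetical value standing in for a published description
difference = abs(calculated - reported)
print(f"calculated {calculated:.1f} mol%, differs from reported by {difference:.1f}")
```

How large a difference should trigger an emended description is a taxonomic judgment that this sketch does not attempt to encode.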
Development of Laboratory Investigations in Disorders of Sex Development.
Audí, Laura; Camats, Núria; Fernández-Cancio, Mónica; Granada, María L
2018-01-01
Scientific knowledge to understand the biological basis of sex development was prompted by the observation of variants different from the 2 most frequent body types, and this became one of the fields first studied by modern pediatric endocrinology. The clinical observation was supported by professionals working in different areas of laboratory sciences which led to the description of adrenal and gonadal steroidogenesis, the enzymes involved, and the different deficiencies. Steroid hormone measurements evolved from colorimetry to radioimmunoassay (RIA) and automated immunoassays, although gas and liquid chromatography coupled to mass spectrometry are now the gold standard techniques for steroid measurements. Peptide hormones and growth factors were purified, and their measurement evolved from RIA to automated immunoassays. Hormone action mechanisms were described, and their specific receptors were characterized and assayed in experimental materials and in patient tissues and cell cultures. The discovery of the genetic basis for variant sex developments began with the description of the sex chromosomes. Molecular technology allowed cloning of genes coding for the different proteins involved in sex determination and development. Experimental animal models aided in verifying the roles of proteins and also suggested new genes to be investigated. New candidate genes continue to be described based on experimental models and on next-generation sequencing of patient DNAs. © 2017 S. Karger AG, Basel.
The human genome: a multifractal analysis
2011-01-01
Background Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate the possibility that the human genome shows a similar behavior to that observed in the nematode. Results We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content. In contrast, little or no relationship was found for LINE, MIR, MER and LTR elements or for DNA regions poor in genetic information. Gene function, cluster of orthologous genes, metabolic pathways, and exons tended to increase their frequencies with ranges of multifractality and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and classification for human chromosomes are proposed. Conclusions Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization where many regions coexist that are far from equilibrium and 2) a non-linear organization that has significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful. PMID:21999602
Diop, Khoudia; Diop, Awa; Levasseur, Anthony; Mediannikov, Oleg; Robert, Catherine; Armstrong, Nicholas; Couderc, Carine; Bretelle, Florence; Raoult, Didier; Fournier, Pierre-Edouard; Fenollar, Florence
2018-03-01
Microbial culturomics is a new subfield of postgenomic medicine and omics biotechnology application that has broadened our awareness on bacterial diversity of the human microbiome, including the human vaginal flora bacterial diversity. Using culturomics, a new obligate anaerobic Gram-stain-negative rod-shaped bacterium designated strain khD1 T was isolated in the vagina of a patient with bacterial vaginosis and characterized using taxonogenomics. The most abundant cellular fatty acids were C 15:0 anteiso (36%), C 16:0 (19%), and C 15:0 iso (10%). Based on an analysis of the full-length 16S rRNA gene sequences, phylogenetic analysis showed that the strain khD1 T exhibited 90% sequence similarity with Prevotella loescheii, the phylogenetically closest validated Prevotella species. With 3,763,057 bp length, the genome of strain khD1 T contained (mol%) 48.7 G + C and 3248 predicted genes, including 3194 protein-coding and 54 RNA genes. Given the phenotypical and biochemical characteristic results as well as genome sequencing, strain khD1 T is considered to represent a novel species within the genus Prevotella, for which the name Prevotella lascolaii sp. nov. is proposed. The type strain is khD1 T ( = CSUR P0109, = DSM 101754). These results show that microbial culturomics greatly improves the characterization of the human microbiome repertoire by isolating potential putative new species. Further studies will certainly clarify the microbial mechanisms of pathogenesis of these new microbes and their role in health and disease. Microbial culturomics is an important new addition to the diagnostic medicine toolbox and warrants attention in future medical, global health, and integrative biology postgraduate teaching curricula.
Linking microarray reporters with protein functions
Gaj, Stan; van Erk, Arie; van Haaften, Rachel IM; Evelo, Chris TA
2007-01-01
Background The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways. Results This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset, that was derived from and crosslinked back to the highly curated UniProt database. The resulting alignments were filtered using high quality alignment criteria and further compared with the outcome of a more traditional approach, where reporter sequences were BLASTed against EnsEMBL followed by locating the corresponding protein (UniProt) entry for the high quality hits. Combining the results of both methods resulted in successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the amount of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of the total reporters are now linked towards GO nodes and 7.1% on local pathways. Conclusion Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/. PMID:17897448
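The idea of combining two annotation routes and measuring the resulting coverage can be sketched in a few lines. This is a hedged illustration and not the authors' pipeline: the reporter names, the placeholder accessions, the function names and the rule that the first route takes precedence on conflicts are all assumptions made for the example.

```python
def combine_annotations(primary: dict, secondary: dict) -> dict:
    """Merge two reporter -> accession maps.

    Assumption: when both routes annotate a reporter, the primary route wins.
    """
    combined = dict(secondary)
    combined.update(primary)
    return combined


def coverage(annotations: dict, all_reporters: list) -> float:
    """Percentage of reporters that carry an annotation."""
    if not all_reporters:
        raise ValueError("reporter list is empty")
    annotated = sum(1 for reporter in all_reporters if annotations.get(reporter))
    return 100.0 * annotated / len(all_reporters)


# Hypothetical reporters and placeholder accessions (not real UniProt IDs).
reporters = ["rep1", "rep2", "rep3", "rep4", "rep5"]
embl_route = {"rep1": "ACC_A", "rep2": "ACC_B"}
ensembl_route = {"rep2": "ACC_B", "rep3": "ACC_C"}

combined = combine_annotations(embl_route, ensembl_route)
print(coverage(embl_route, reporters), coverage(combined, reporters))
```

In this toy case coverage rises from 40% with one route to 60% with both, which mirrors the kind of gain the abstract reports without reproducing its numbers.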
Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Ormeño-Orrillo, Ernesto; Parma, Marcia Maria; Melo, Itamar Soares; Martínez-Romero, Esperanza; Hungria, Mariangela
2015-12-01
Biological nitrogen fixation is a key process for agricultural production and environmental sustainability, but there are comparatively few studies of symbionts of tropical pasture legumes, as well as few described species of the genus Bradyrhizobium, although it is the predominant rhizobial genus in the tropics. A detailed polyphasic study was conducted with two strains of the genus Bradyrhizobium used in commercial inoculants for tropical pastures in Brazil, CNPSo 1112T, isolated from perennial soybean (Neonotonia wightii), and CNPSo 2833T, from desmodium (Desmodium heterocarpon). Based on 16S-rRNA gene phylogeny, both strains were grouped in the Bradyrhizobium elkanii superclade, but were not clearly clustered with any known species. Multilocus sequence analysis of three (glnII, gyrB and recA) and five (plus atpD and dnaK) housekeeping genes confirmed that the strains are positioned in two distinct clades. Comparison with intergenic transcribed spacer sequences of type strains of described species of the genus Bradyrhizobium showed similarity lower than 93.1 %, and differences were confirmed by BOX-PCR analysis. Nucleotide identity of three housekeeping genes with type strains of described species ranged from 88.1 to 96.2 %. Average nucleotide identity of genome sequences showed values below the threshold for distinct species of the genus Bradyrhizobium ( < 90.6 %), and the value between the two strains was also below this threshold (91.2 %). Analysis of nifH and nodC gene sequences positioned the two strains in a clade distinct from other species of the genus Bradyrhizobium. Morphophysiological, genotypic and genomic data supported the description of two novel species in the genus Bradyrhizobium, Bradyrhizobium tropiciagri sp. nov. (type strain CNPSo 1112T = SMS 303T = BR 1009T = SEMIA 6148T = LMG 28867T) and Bradyrhizobium embrapense sp. nov. (type strain CNPSo 2833T = CIAT 2372T = BR 2212T = SEMIA 6208T = U674T = LMG 2987).
Shewmaker, P L; Whitney, A M; Humrighouse, B W
2016-03-01
Phenotypic, genotypic, and antimicrobial characteristics of six phenotypically distinct human clinical isolates that most closely resembled the type strain of Streptococcus halichoeri isolated from a seal are presented. Sequencing of the 16S rRNA, rpoB, sodA, and recN genes; comparative whole-genome analysis; conventional biochemical and Rapid ID 32 Strep identification methods; and antimicrobial susceptibility testing were performed on the human isolates, the type strain of S. halichoeri, and type strains of closely related species. The six human clinical isolates were biochemically indistinguishable from each other and showed 100% 16S rRNA, rpoB, sodA, and recN gene sequence similarity. Comparative 16S rRNA gene sequencing analysis revealed 98.6% similarity to S. halichoeri CCUG 48324(T), 97.9% similarity to S. canis ATCC 43496(T), and 97.8% similarity to S. ictaluri ATCC BAA-1300(T). A 3,530-bp fragment of the rpoB gene was 98.8% similar to the S. halichoeri type strain, 84.6% to the S. canis type strain, and 83.8% to the S. ictaluri type strain. The S. halichoeri type strain and the human clinical isolates were susceptible to the antimicrobials tested based on CLSI guidelines for Streptococcus species viridans group with the exception of tetracycline and erythromycin. The human isolates were phenotypically distinct from the type strain isolated from a seal; comparative whole-genome sequence analysis confirmed that the human isolates were S. halichoeri. On the basis of these results, a novel subspecies, Streptococcus halichoeri subsp. hominis, is proposed for the human isolates and Streptococcus halichoeri subsp. halichoeri is proposed for the gray seal isolates. The type strain of the novel subspecies is SS1844(T) = CCUG 67100(T) = LMG 28801(T). Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Chapman, Jarrod A; Kirkness, Ewen F; Simakov, Oleg; Hampson, Steven E; Mitros, Therese; Weinmaier, Thomas; Rattei, Thomas; Balasubramanian, Prakash G; Borman, Jon; Busam, Dana; Disbennett, Kathryn; Pfannkoch, Cynthia; Sumin, Nadezhda; Sutton, Granger G; Viswanathan, Lakshmi Devi; Walenz, Brian; Goodstein, David M; Hellsten, Uffe; Kawashima, Takeshi; Prochnik, Simon E; Putnam, Nicholas H; Shu, Shengquiang; Blumberg, Bruce; Dana, Catherine E; Gee, Lydia; Kibler, Dennis F; Law, Lee; Lindgens, Dirk; Martinez, Daniel E; Peng, Jisong; Wigge, Philip A; Bertulat, Bianca; Guder, Corina; Nakamura, Yukio; Ozbek, Suat; Watanabe, Hiroshi; Khalturin, Konstantin; Hemmrich, Georg; Franke, André; Augustin, René; Fraune, Sebastian; Hayakawa, Eisuke; Hayakawa, Shiho; Hirose, Mamiko; Hwang, Jung Shan; Ikeo, Kazuho; Nishimiya-Fujisawa, Chiemi; Ogura, Atshushi; Takahashi, Toshio; Steinmetz, Patrick R H; Zhang, Xiaoming; Aufschnaiter, Roland; Eder, Marie-Kristin; Gorny, Anne-Kathrin; Salvenmoser, Willi; Heimberg, Alysha M; Wheeler, Benjamin M; Peterson, Kevin J; Böttger, Angelika; Tischler, Patrick; Wolf, Alexander; Gojobori, Takashi; Remington, Karin A; Strausberg, Robert L; Venter, J Craig; Technau, Ulrich; Hobmayer, Bert; Bosch, Thomas C G; Holstein, Thomas W; Fujisawa, Toshitaka; Bode, Hans R; David, Charles N; Rokhsar, Daniel S; Steele, Robert E
2010-03-25
The freshwater cnidarian Hydra was first described in 1702 and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals. Today, Hydra is an important model for studies of axial patterning, stem cell biology and regeneration. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann-Mangold organizer, pluripotency genes and the neuromuscular junction.
Peter, S; Bezdan, D; Oberhettinger, P; Vogel, W; Dörfel, D; Dick, J; Marschal, M; Liese, J; Weidenmaier, C; Autenrieth, I; Ossowski, S; Willmann, M
2018-06-01
Citrobacter spp. harbouring metallo-β-lactamases (MBLs) have been reported from various countries and different sources, but their isolation from clinical specimens remains a rare event in Europe. MBL-harbouring Enterobacteriaceae are considered a major threat in infection control as therapeutic options are often limited to colistin. In this study, whole-genome sequencing was applied to characterise five clinical isolates of multidrug-resistant Citrobacter werkmanii obtained from rectal swabs. Four strains possessed a class 1 integron with a novel bla VIM-48 MBL resistance gene and the aminoglycoside acetyltransferase gene aacA4, whilst one isolate harboured a bla IMP-8 MBL. Resistance to colistin evolved in one strain isolated from a patient who had received colistin orally for 8 days. Genomic comparison of this strain with a colistin-susceptible pre-treatment isolate from the same patient revealed 66 single nucleotide polymorphisms (SNPs) and 26 indels, indicating the presence of a mutator phenotype. This was confirmed by the finding of a SNP in the mutL gene that led to a significantly truncated protein. Additionally, an amino acid change from glycine to serine at position 53 was observed in PmrA. Mutations in the pmrA gene have been previously described as mediating colistin resistance in different bacterial species and are the most likely reason for the susceptibility change observed. To the best of our knowledge, this is the first description of a colistin-resistant Citrobacter spp. isolated from a human sample. This study demonstrates the power of applying next-generation sequencing in a hospital setting to trace and understand evolving resistance at the level of individual patients. Copyright © 2018 Elsevier B.V. and International Society of Chemotherapy. All rights reserved.
Gostinčar, Cene; Ohm, Robin A.; Kogej, Tina; ...
2014-07-01
Aureobasidium pullulans is a black-yeast-like fungus used for production of the polysaccharide pullulan and the antimycotic aureobasidin A, and as a biocontrol agent in agriculture. It can cause opportunistic human infections, and it inhabits various extreme environments. To promote the understanding of these traits, we performed de novo genome sequencing of the four varieties of A. pullulans. The 25.43-29.62 Mb genomes of these four varieties of A. pullulans encode between 10266 and 11866 predicted proteins. Their genomes encode most of the enzyme families involved in degradation of plant material and many sugar transporters, and they have genes possibly associated with degradation of plastic and aromatic compounds. Proteins believed to be involved in the synthesis of pullulan and siderophores, but not of aureobasidin A, are predicted. Putative stress-tolerance genes include several aquaporins and aquaglyceroporins, large numbers of alkali-metal cation transporters, genes for the synthesis of compatible solutes and melanin, all of the components of the high-osmolarity glycerol pathway, and bacteriorhodopsin-like proteins. All of these genomes contain a homothallic mating-type locus. The differences between these four varieties of A. pullulans are large enough to justify their redefinition as separate species: A. pullulans, A. melanogenum, A. subglaciale and A. namibiae. We observed redundancy in several gene families that can be linked to the nutritional versatility of these species and their particular stress tolerance. In conclusion, the availability of the genome sequences of the four Aureobasidium species should improve their biotechnological exploitation and promote our understanding of their stress-tolerance mechanisms, diverse lifestyles, and pathogenic potential.
Livestock rabies outbreaks in Shanxi province, China.
Feng, Ye; Shi, Yanyan; Yu, Mingyang; Xu, Weidi; Gong, Wenjie; Tu, Zhongzhong; Ding, Laixi; He, Biao; Guo, Huancheng; Tu, Changchun
2016-10-01
Dogs play an important role in rabies transmission throughout the world. In addition to the severe human rabies situation in China, spillover of rabies virus from dogs in recent years has caused rabies outbreaks in sheep, cattle and pigs, showing that there is an increasing threat to other domestic animals. Two livestock rabies outbreaks were caused by dogs in Shanxi province, China from April to October in 2015, resulting in the deaths of 60 sheep, 10 cattle and one donkey. Brain samples from one infected bovine and the donkey were determined to be rabies virus (RABV) positive by fluorescent antibody test (FAT) and reverse transcription polymerase chain reaction (RT-PCR). The complete RABV N genes of the two field strains, together with those of two previously confirmed Shanxi dog strains, were amplified, sequenced and compared phylogenetically with published sequences of the N gene of RABV strains from Shanxi and surrounding provinces. All of the strains from Shanxi province grouped closely, sharing 99.6 %-100 % sequence identity, indicating the wide distribution and transmission of dog-mediated rabies in these areas. This is the first description of donkey rabies symptoms with phylogenetic analysis of RABVs in Shanxi province and surrounding regions. The result emphasizes the need for mandatory dog rabies vaccination and improved public education to eradicate dog rabies transmission.
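The 99.6 %-100 % sequence identity reported above is a pairwise figure computed over aligned N gene sequences. As a minimal sketch of how such a percentage can be obtained, the function below compares two pre-aligned sequences of equal length. The function name and the convention of skipping gapped columns are assumptions made for illustration; the abstract does not state which software or identity definition the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences.

    Assumed convention (not taken from the abstract): columns where
    either sequence has a gap ('-') are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other definitions (for example, counting gaps as mismatches) give different values for the same alignment, so the convention should be stated alongside any reported figure.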
Identifying currents in the gene pool for bacterial populations using an integrative approach.
Tang, Jing; Hanage, William P; Fraser, Christophe; Corander, Jukka
2009-08-01
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.
Rylková, K; Tůmová, E; Brožová, A; Jankovská, I; Vadlejch, J; Čadková, Z; Frýdlová, J; Peřinková, P; Langrová, I; Chodová, D; Nechybová, S; Scháňková, Š
2015-11-01
Trichuris sp. individuals were collected from Myocastor coypus from fancy breeder farms in the Czech Republic. Using morphological and biometrical methods, 30 female and 30 male nematodes were identified as Trichuris myocastoris. This paper presents the first molecular description of this species. The ribosomal DNA (rDNA) region, consisting of internal transcribed spacer (ITS)-1, the 5.8S gene and ITS-2, was sequenced. Based on an analysis of 651 bp, T. myocastoris was found to be different from any other Trichuris species for which published sequencing of the ITS region is available. The phylogenetic relationships were estimated using maximum parsimony and Bayesian analyses. T. myocastoris was found to be significantly more closely related to Trichuris of rodents than to those of ruminants.
Kurtzman, Cletus P
2016-07-01
DNA sequence analyses have demonstrated that species of the polyphyletic anamorphic ascomycete genus Candida may be members of described teleomorphic genera, members of the Candida tropicalis clade upon which the genus Candida is circumscribed, or members of isolated clades that represent undescribed genera. From phylogenetic analysis of gene sequences from nuclear large subunit rRNA, mitochondrial small subunit rRNA and cytochrome oxidase II, Candida auringiensis (NRRL Y-17674(T), CBS 6913(T)), Candida salmanticensis (NRRL Y-17090(T), CBS 5121(T)), and Candida tartarivorans (NRRL Y-27291(T), CBS 7955(T)) were shown to be members of an isolated clade and are proposed for reclassification in the genus Groenewaldozyma gen. nov. (MycoBank MB 815817). Neighbouring taxa include species of the Wickerhamiella clade and Candida blankii.
An outbreak of food-borne gastroenteritis due to sapovirus among junior high school students.
Usuku, Shuzo; Kumazaki, Makoto; Kitamura, Katsuhiko; Tochikubo, Osamu; Noguchi, Yuzo
2008-11-01
The human sapovirus (SaV) causes acute gastroenteritis mainly in infants and young children. A food-borne outbreak of gastroenteritis associated with SaV occurred among junior high school students in Yokohama, Japan, during and after a study trip. The nucleotide sequences of the partial capsid gene derived from the students exhibited 98% homology to a SaV genogroup IV strain, Hu/Angelholm/SW278/2004/SE, which was isolated from an adult with gastroenteritis in Solna, Sweden. An identical nucleotide sequence was detected from a food handler at the hotel restaurant, suggesting that the causative agent of the outbreak was transmitted from the food handler. This is the first description of a food-borne outbreak associated with the SaV genogroup IV strain in Japan.
Reid, Allecia E.; Taber, Jennifer M.; Ferrer, Rebecca A.; Biesecker, Barbara B.; Lewis, Katie L.; Biesecker, Leslie G.; Klein, William M. P.
2018-01-01
Objective Genomic sequencing is becoming increasingly accessible, highlighting the need to understand the social and psychological factors that drive interest in receiving testing results. These decisions may depend on perceived descriptive norms (how most others behave) and injunctive norms (what is approved of by others). We predicted that descriptive norms would be directly associated with intentions to learn genomic sequencing results, whereas injunctive norms would be associated indirectly, via attitudes. These differential associations with intentions versus attitudes were hypothesized to be strongest when individuals held ambivalent attitudes toward obtaining results. Methods Participants enrolled in a genomic sequencing trial (n=372) reported intentions to learn medically actionable, non-medically actionable, and carrier sequencing results. Descriptive norms items referenced other study participants. Injunctive norms were analyzed separately for close friends and family members. Attitudes, attitudinal ambivalence, and sociodemographic covariates were also assessed. Results In structural equation models, both descriptive norms and friend injunctive norms were associated with intentions to receive all sequencing results (ps<.004). Attitudes consistently mediated all friend injunctive norms-intentions associations, but not the descriptive norms-intentions associations. Attitudinal ambivalence moderated the association between friend injunctive norms (p≤.001), but not descriptive norms (p=.16), and attitudes. Injunctive norms were significantly associated with attitudes when ambivalence was high, but were unrelated when ambivalence was low. Results replicated for family injunctive norms. Conclusions Descriptive and injunctive norms play roles in genomic sequencing decisions. Considering mediators and moderators of these processes enhances ability to optimize use of normative information to support informed decision making. PMID:29745680
Saeung, Atiporn; Srisuka, Wichai; Low, Van Lun; Maleewong, Wanchai; Takaoka, Hiroyuki
2017-08-01
The female and larva of Simulium (Gomphostilbia) udomi Takaoka & Choochote from Thailand are described for the first time. The female of this species is similar to those of S. (G.) asakoae Takaoka & Davies from Peninsular Malaysia, Thailand, Hong Kong and Vietnam, and S. (G.) chiangdaoense Takaoka & Srisuka from Thailand. The larva of this species is similar to S. (G.) curtatum Jitklang et al. and S. (G.) nr. asakoae 2 from Thailand in having a medium-long postgenal cleft. Taxonomic notes are given to separate this species from these related species. The COI gene sequence of S. (G.) udomi is compared with those of eight species of the S. asakoae species-group and three species of the S. ceylonicum species-group. This species is transferred from the S. ceylonicum species-group to the S. asakoae species-group based on the adult female and male morphological characters, comparisons of the genetic distances and phylogenetic relationships inferred from the COI gene sequences. Copyright © 2017 Elsevier B.V. All rights reserved.
Molecular characterization of Trichuris serrata.
Ketzis, Jennifer K; Verma, Ashutosh; Burgess, Graham
2015-05-01
Trichuris serrata, a whipworm of cats, can cause inflammation in the cecum and upper portion of the large intestine. It is unknown if the virulence and pathology of T. serrata differ from Trichuris campanula, the other species in cats. Distinguishing the species based on egg size is challenging. In addition, Trichuris eggs can be difficult to distinguish from Capillaria spp. This paper presents the first molecular description of T. serrata. The 18S ribosomal RNA (rRNA) gene was sequenced from male adult worms sourced from two unrelated cats on St. Kitts. Based on the analysis of 651 base pairs, T. serrata was found to be different than any other Trichuris species for which published sequencing of the 18S rRNA gene is available. A dendrogram was developed using Molecular Evolutionary Genetics Analysis version 6.0, and evolutionary history was inferred using the minimum evolution method. T. serrata was found to be most closely related to Trichuris vulpis, the Trichuris of dogs. Further development of the methodology could enable distinguishing T. serrata, T. campanula, and Capillaria spp. infections in cats and aid in diagnosis.
Solovyeva, Evgeniya N; Dunayev, Evgeniy N; Nazarov, Roman A; Mehdi Radjabizadeh; Poyarkov, Nikolay A
2018-01-01
The morphological and genetic variation of the wide-ranging Secret Toad-headed agama, Phrynocephalus mystaceus, which inhabits sand deserts of south-eastern Europe, the Middle East, Middle Asia, and western China, is reviewed. Based on morphological differences and high divergence in COI (mtDNA) gene sequences, a new subspecies of Ph. mystaceus is described from Khorasan Razavi Province in Iran. Partial sequences of the COI mtDNA gene of 31 specimens of Ph. mystaceus from 17 localities from all major parts of the species range were analyzed. Genetic distances show a deep divergence between Ph. mystaceus khorasanus ssp. n. from Khorasan Razavi Province and all other populations of Ph. mystaceus. The new subspecies can be distinguished from other populations of Ph. mystaceus by a combination of several morphological features. Molecular and morphological analyses do not support the validity of other Ph. mystaceus subspecies described from Middle Asia and the Caspian basin. Geographic variation in the Ph. mystaceus species complex and the status of previously described subspecies are discussed.
AbsIDconvert: An absolute approach for converting genetic identifiers at different granularities
2012-01-01
Background High-throughput molecular biology techniques yield vast amounts of data, often by detecting small portions of ribonucleotides corresponding to specific identifiers. Existing bioinformatic methodologies categorize and compare these elements using inferred descriptive annotation given this sequence information irrespective of the fact that it may not be representative of the identifier as a whole. Results All annotations, no matter the granularity, can be aligned to genomic sequences and therefore annotated by genomic intervals. We have developed AbsIDconvert, a methodology for converting between genomic identifiers by first mapping them onto a common universal coordinate system using an interval tree which is subsequently queried for overlapping identifiers. AbsIDconvert has many potential uses, including gene identifier conversion, identification of features within a genomic region, and cross-species comparisons. The utility is demonstrated in three case studies: 1) comparative genomic study mapping Plasmodium gene sequences to corresponding human and mosquito transcriptional regions; 2) cross-species study of Incyte clone sequences; and 3) analysis of human Ensembl transcripts mapped by Affymetrix® and Agilent microarray probes. AbsIDconvert currently supports ID conversion of 53 species for a given list of input identifiers, genomic sequence, or genome intervals. Conclusion AbsIDconvert provides an efficient and reliable mechanism for conversion between identifier domains of interest. The flexibility of this tool allows for custom definition of identifier domains contingent upon the availability and determination of a genomic mapping interval. As the genomes and the sequences for genetic elements are further refined, this tool will become increasingly useful and accurate. AbsIDconvert is freely available as a web application or downloadable as a virtual machine at: http://bioinformatics.louisville.edu/abid/. PMID:22967011
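The conversion idea described above, placing every identifier on a shared genomic coordinate system and then looking up overlapping intervals, can be illustrated with a short sketch. This is not the AbsIDconvert implementation: the abstract describes an interval tree, whereas the sketch below substitutes a per-chromosome list sorted by start coordinate with a linear scan, which is simpler but slower. The `Feature` type, the function names, and the 1-based inclusive coordinate convention are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Feature(NamedTuple):
    identifier: str  # e.g. a gene, transcript, or probe ID (hypothetical)
    domain: str      # identifier domain, e.g. "gene" or "probe"
    chrom: str
    start: int       # assumed 1-based, inclusive
    end: int         # assumed 1-based, inclusive


def build_index(features):
    """Group features by chromosome and sort each group by start coordinate."""
    index = defaultdict(list)
    for feature in features:
        index[feature.chrom].append(feature)
    for chrom_features in index.values():
        chrom_features.sort(key=lambda f: f.start)
    return index


def convert(index, query, target_domain):
    """Return identifiers of target_domain whose interval overlaps the query."""
    hits = []
    for candidate in index.get(query.chrom, []):
        if candidate.start > query.end:
            break  # sorted by start, so no later feature can overlap
        if candidate.end >= query.start and candidate.domain == target_domain:
            hits.append(candidate.identifier)
    return hits
```

With made-up features, a probe at chr1:450-474 would convert to any gene whose interval covers part of that range, for example `convert(index, probe, "gene")`. A real interval tree answers the same query in logarithmic rather than linear time per chromosome, which matters at genome scale.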
Ben Hania, Wajdi; Joseph, Manon; Schumann, Peter; Bunk, Boyke; Fiebig, Anne; Spröer, Cathrin; Klenk, Hans-Peter; Fardeau, Marie-Laure; Spring, Stefan
2015-01-01
During a study of the anaerobic microbial community of a lithifying hypersaline microbial mat of Lake 21 on the Kiritimati atoll (Kiribati Republic, Central Pacific) strain L21-RPul-D2(T) was isolated. The closest phylogenetic neighbor was Spirochaeta africana Z-7692(T) that shared a 16S rRNA gene sequence identity value of 90% with the novel strain and thus was only distantly related. A comprehensive polyphasic study including determination of the complete genome sequence was initiated to characterize the novel isolate. Cells of strain L21-RPul-D2(T) had a size of 0.2-0.25 × 8-9 μm, were helical, motile, stained Gram-negative and produced an orange carotenoid-like pigment. Optimal conditions for growth were 35°C, a salinity of 50 g/l NaCl and a pH around 7.0. Preferred substrates for growth were carbohydrates and a few carboxylic acids. The novel strain had an obligate fermentative metabolism and produced ethanol, acetate, lactate, hydrogen and carbon dioxide during growth on glucose. Strain L21-RPul-D2(T) was aerotolerant, but oxygen did not stimulate growth. Major cellular fatty acids were C14:0, iso-C15:0, C16:0 and C18:0. The major polar lipids were an unidentified aminolipid, phosphatidylglycerol, an unidentified phospholipid and two unidentified glycolipids. Whole-cell hydrolysates contained L-ornithine as diagnostic diamino acid of the cell wall peptidoglycan. The complete genome sequence was determined and annotated. The genome comprised one circular chromosome with a size of 3.78 Mbp that contained 3450 protein-coding genes and 50 RNA genes, including 2 operons of ribosomal RNA genes. The DNA G + C content was determined from the genome sequence as 51.9 mol%. There were no predicted genes encoding cytochromes or enzymes responsible for the biosynthesis of respiratory lipoquinones. Based on significant differences to the uncultured type species of the genus Spirochaeta, S. plicatilis, as well as to any other phylogenetically related cultured species it is suggested to place strain L21-RPul-D2(T) (=DSM 27196(T) = JCM 18663(T)) in a novel species and genus, for which the name Salinispira pacifica gen. nov., sp. nov. is proposed.
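The abstract above reports a DNA G + C content derived directly from the genome sequence. A minimal sketch of that calculation follows; the function name and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the study.

```python
def gc_content_percent(sequence: str) -> float:
    """G + C content as a percentage of unambiguous bases (A, C, G, T).

    Assumption: ambiguity codes such as 'N' are excluded from both the
    numerator and the denominator.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

For a complete genome, the same function would be applied to the concatenated replicon sequences; mol% G + C and percent G + C by base count are equivalent for double-stranded DNA.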
Everest, Gareth J; Curtis, Sarah M; De Leo, Filomena; Urzì, Clara; Meyers, Paul R
2013-10-01
A novel actinobacterium, strain BC640(T), was isolated from a biofilm sample collected in 2009 in the Saint Callistus Roman catacombs. Analysis of the 16S rRNA gene sequence showed that the strain belonged to the genus Kribbella. Phylogenetic analysis using the 16S rRNA gene and concatenated gyrB, rpoB, relA, recA and atpD gene sequences showed that strain BC640(T) was most closely related to the type strains of Kribbella yunnanensis and Kribbella sandramycini. Based on gyrB genetic distance analysis, strain BC640(T) was shown to be distinct from all Kribbella type strains. DNA-DNA hybridization experiments confirmed that strain BC640(T) represents a genomic species distinct from its closest phylogenetic relatives, K. yunnanensis DSM 15499(T) (53.5±7.8 % DNA relatedness) and K. sandramycini DSM 15626(T) (33.5±5.0 %). Physiological comparisons further showed that strain BC640(T) is phenotypically distinct from the type strains of K. yunnanensis and K. sandramycini. Strain BC640(T) ( = DSM 26744(T) = NRRL B-24917(T)) is thus presented as the type strain of a novel species, for which the name Kribbella albertanoniae sp. nov. is proposed.
Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming
2015-01-01
Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular level and information on the underlying molecular mechanisms of jellyfish stinging. The findings of this study may also be used in comparative studies of gene expression profiling among different jellyfish species. PMID:26551022
Liu, Guoyan; Zhou, Yonghong; Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming
2015-01-01
Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington's, Alzheimer's and Parkinson's diseases. This is the first description of degenerative disease-associated genes in jellyfish. We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular level and information on the underlying molecular mechanisms of jellyfish stinging. The findings of this study may also be used in comparative studies of gene expression profiling among different jellyfish species.
[New perspectives on molecular and genic therapies in Down syndrome].
Delabar, Jean Maurice
2010-04-01
Trisomy 21 was first described as a syndrome in the middle of the nineteenth century and was associated with a chromosomal anomaly one hundred years later; the most salient feature of this syndrome is mental retardation of variable intensity. Molecular mapping and DNA sequencing have made it possible to identify the gene content of chromosome 21. Quantitative molecular analyses indicated that trisomy induces overexpression of a large proportion of the triplicated genes and also deregulates pathways involving non-HSA21 genes. Together with the physiological description of murine models overexpressing orthologous genes, these data have made it possible to formulate hypotheses on the cause of cognitive impairment. From these hypotheses, and using murine models, it is now possible to assess the efficacy of various therapeutic strategies. This paper reviews these new perspectives, starting with strategies targeting the level of HSA21 RNAs or HSA21 proteins; it then describes methods targeting the activities either of proteins involved in cell cycle pathways or of proteins controlling synaptic plasticity. It is promising that strategies targeting specific genes or specific pathways are already giving positive results.
Hodgkin's disease biology: recent advances.
Jox, A; Wolf, J; Diehl, V
1997-11-01
The cellular origin of H-RS cells has been questioned for a long time. Recently, using single-cell amplification of Ig genes, evidence was obtained that H-RS cells clonally arise from B-cells. Sequence analysis of rearranged Ig genes demonstrated that H-RS cells develop within the germinal centre. H-RS cells in classical HD grow despite loss of function of their rearranged Ig genes. In contrast, the mutation pattern of rearranged Ig genes in L & H cells of lymphocyte-predominant HD frequently shows ongoing mutations, indicating that these cells are still antigen selected. These molecular differences show that LP HD genetically differs from classical HD. H-RS cells escape from apoptosis within the germinal centre. However, the events leading to malignant transformation are still unknown. The association between EBV and HD has been repeatedly described, but the occurrence of EBV negative cases is hard to explain just by loss of EBV. The analysis of chromosomal aberrations in H-RS cells did not result in the description of a specific 'HD-gene'. Also the role of the T-lymphocytes surrounding the H-RS cells has remained an open question.
Heyrman, Jeroen; Logan, Niall A; Rodríguez-Díaz, Marina; Scheldeman, Patsy; Lebbe, Liesbeth; Swings, Jean; Heyndrickx, Marc; De Vos, Paul
2005-01-01
A group of 24 strains was isolated from deteriorated mural paintings situated in Spain (necropolis of Carmona) and Germany (church of Greene-Kreiensen). (GTG)5-PCR genomic fingerprinting was performed on these strains to assess their genomic variability and the strains were delineated into four groups. Representatives were studied by 16S rRNA gene sequencing and were found to be closely related to Bacillus simplex and the species 'Bacillus macroides' (strain NCIMB 8796) and 'Bacillus maroccanus' (names not validly published) according to a fasta search. The close similarity between B. simplex, 'B. macroides' NCIMB 8796, 'B. maroccanus' and the mural painting isolates was confirmed by additional (GTG)5-PCR, ARDRA, FAME and SDS-PAGE analyses. Furthermore, these techniques revealed that strains of 'Bacillus carotarum', another name that has not been validly published, also showed high similarity to this group of organisms. On the other hand, it was shown that the strains labelled 'B. macroides' in different collections do not all belong to the same species. Strain NCIMB 8796 can be allocated to B. simplex, while strain DSM 54 (=ATCC 12905) shares the highest 16S rRNA gene sequence similarity with Bacillus sphaericus and Bacillus fusiformis (both around 98.6 %). On the basis of further DNA-DNA hybridization data and the study of phenotypic characteristics, one group of five mural painting strains was attributed to a novel species in the genus Bacillus, for which the name Bacillus muralis sp. nov. is proposed. Finally, the remaining mural painting strains, one (LMG 18508=NCIMB 8796) of two strains belonging to 'B. macroides' and strains belonging to 'B. maroccanus' and 'B. carotarum' are allocated to the species B. simplex and an emended description of B. simplex is given.
Wallace, Robert J; Snelling, Timothy J; McCartney, Christine A; Tapio, Ilma; Strozzi, Francesco
2017-01-16
Methane emissions from ruminal fermentation contribute significantly to total anthropogenic greenhouse gas (GHG) emissions. New meta-omics technologies are beginning to revolutionise our understanding of the rumen microbial community structure, metabolic potential and metabolic activity. Here we explore these developments in relation to GHG emissions. Microbial rumen community analyses based on small subunit ribosomal RNA sequence analysis are not yet predictive of methane emissions from individual animals or treatments. Few metagenomics studies have been directly related to GHG emissions. In these studies, the main genes that differed in abundance between high and low methane emitters included archaeal genes involved in methanogenesis, with others that were not apparently related to methane metabolism. Unlike the taxonomic analysis up to now, the gene sets from metagenomes may have predictive value. Furthermore, metagenomic analysis predicts metabolic function better than only a taxonomic description, because different taxa share genes with the same function. Metatranscriptomics, the study of mRNA transcript abundance, should help to understand the dynamics of microbial activity rather than the gene abundance; to date, only one study has related the expression levels of methanogenic genes to methane emissions, where gene abundance failed to do so. Metaproteomics describes the proteins present in the ecosystem, and is therefore arguably a better indication of microbial metabolism. Both two-dimensional polyacrylamide gel electrophoresis and shotgun peptide sequencing methods have been used for ruminal analysis. In our unpublished studies, both methods showed an abundance of archaeal methanogenic enzymes, but neither was able to discriminate high and low emitters.
Metabolomics can take several forms that appear to have predictive value for methane emissions; ruminal metabolites, milk fatty acid profiles, faecal long-chain alcohols and urinary metabolites have all shown promising results. Rumen microbial amino acid metabolism lies at the root of excessive nitrogen emissions from ruminants, yet only indirect inferences for nitrogen emissions can be drawn from meta-omics studies published so far. Annotation of meta-omics data depends on databases that are generally weak in rumen microbial entries. The Hungate 1000 project and Global Rumen Census initiatives are therefore essential to improve the interpretation of sequence/metabolic information.
Faës, Pascal; Deleu, Carole; Aïnouche, Abdelkader; Le Cahérec, Françoise; Montes, Emilie; Clouet, Vanessa; Gouraud, Anne-Marie; Albert, Benjamin; Orsel, Mathilde; Lassalle, Gilles; Leport, Laurent; Bouchereau, Alain; Niogret, Marie-Françoise
2015-02-01
Six BnaProDH1 and two BnaProDH2 genes were identified in the Brassica napus genome. The BnaProDH1 genes are mainly expressed in pollen and roots, while BnaProDH2 gene expression is associated with leaf vascular tissues at senescence. Proline dehydrogenase (ProDH) catalyzes the first step in the catabolism of proline. The ProDH gene family in oilseed rape (Brassica napus) was characterized and compared to other Brassicaceae ProDH sequences to establish the phylogenetic relationships between genes. Six BnaProDH1 genes and two BnaProDH2 genes were identified in the B. napus genome. Expression of the three paralogous pairs of BnaProDH1 genes and the two homoeologous BnaProDH2 genes was measured by real-time quantitative RT-PCR in plants at vegetative and reproductive stages. The BnaProDH2 genes are specifically expressed in vasculature in an age-dependent manner, while BnaProDH1 genes are strongly expressed in pollen grains and roots. Compared to the abundant expression of BnaProDH1, the overall expression of BnaProDH2 is low except in roots and senescent leaves. The BnaProDH1 paralogs showed different levels of expression, with BnaA&C.ProDH1.a the most strongly expressed and BnaA&C.ProDH1.c the least. The promoters of two BnaProDH1 and two BnaProDH2 genes were fused with the uidA reporter gene (GUS) to characterize organ and tissue expression profiles in transformed B. napus plants. The transformants with promoters from different genes showed contrasting patterns of GUS activity, which corresponded to the spatial expression of their respective transcripts. ProDHs probably have non-redundant functions in different organs and at different phenological stages. In terms of molecular evolution, all BnaProDH sequences appear to have undergone strong purifying selection and some copies are becoming subfunctionalized. This detailed description of oilseed rape ProDH genes provides new elements to investigate the function of proline metabolism in plant development.
A primer on thermodynamic-based models for deciphering transcriptional regulatory logic.
Dresch, Jacqueline M; Richards, Megan; Ay, Ahmet
2013-09-01
A rigorous analysis of transcriptional regulation at the DNA level is crucial to the understanding of many biological systems. Mathematical modeling has offered researchers a new approach to understanding this central process. In particular, thermodynamic-based modeling represents the most biophysically informed approach aimed at connecting DNA level regulatory sequences to the expression of specific genes. The goal of this review is to give biologists a thorough description of the steps involved in building, analyzing, and implementing a thermodynamic-based model of transcriptional regulation. The data requirements for this modeling approach are described, the derivation for a specific regulatory region is shown, and the challenges and future directions for the quantitative modeling of gene regulation are discussed. Copyright © 2013 Elsevier B.V. All rights reserved.
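To make the idea concrete, here is a minimal worked example that is not taken from the review: a hypothetical regulatory region with a single binding site for one activator A, under the common assumption that the transcription rate is proportional to the equilibrium probability that the site is occupied. The symbols K_A (association constant), [A] (activator concentration) and E_max (maximal expression) are introduced here for illustration only.

```latex
% States of the single site and their statistical weights:
%   empty: 1        bound by A: K_A [A]
Z = 1 + K_A [A]
\qquad
P_{\text{bound}} = \frac{K_A [A]}{Z} = \frac{K_A [A]}{1 + K_A [A]}
\qquad
E = E_{\max} \, P_{\text{bound}}
```

When K_A [A] = 1 the site is occupied half of the time, so E = E_max / 2. Models of real regulatory regions extend the partition function Z with one term for each allowed configuration of bound factors.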
Ruckmani, Arunachalam; Kaur, Ishwinder; Schumann, Peter; Klenk, Hans-Peter; Mayilraj, Shanmugam
2011-10-01
During the course of a study on the bacterial diversity in Western Ghats, India, an actinobacterial strain, designated PC IW02(T), was isolated and characterized by a polyphasic taxonomic approach. Strain PC IW02(T) was a non-motile, Gram-positive, short rod that formed creamish white to yellow coloured colonies. 16S rRNA gene sequence analysis showed that the novel strain showed highest sequence similarity with type strains of members of the genus Dermacoccus: Dermacoccus barathri (96.6 %), Dermacoccus profundi (96.5 %), Dermacoccus abyssi (96.4 %) and Dermacoccus nishinomiyaensis (95.9 %). The phylogenetic tree suggested that strain PC IW02(T) could represent a member of a new genus of the family Dermacoccaceae with the genus Demetria as closest clade. Pairwise sequence alignment with Demetria terragena HKI 0089(T) and Kytococcus sedentarius DSM 20547(T) showed similarities of 94.2 and 93.7 %, respectively. Strain PC IW02(T) had MK-8(H(4)) as the major menaquinone. The major fatty acids were iso-C(16 : 0) (43.4 %), iso-C(16 : 1) H (17.2 %) and anteiso-C(17 : 0) (9.9 %). The diagnostic cell-wall amino acid at position 3 of the peptide subunit was lysine; the interpeptide bridge consisted of Gly-Ser-Asp. The polar lipids present were diphosphatidylglycerol, phosphatidylglycerol, phosphatidylinositol, phosphatidylinositol mannosides and phosphatidylserine, along with two unknown phospholipids. The genomic DNA G+C content of the isolate was 77 mol%. On the basis of phenotypic characteristics, including chemotaxonomic data, and 16S rRNA gene sequence similarities, strain PC IW02(T) represents a novel species in a new genus of the family Dermacoccaceae for which the name Calidifontibacter indicus gen. nov., sp. nov. is proposed. The type strain of Calidifontibacter indicus is PC IW02(T) ( = MTCC 8338(T) = DSM 22967(T) = JCM 16038(T)). An emended description of the family Dermacoccaceae is provided.
A high-density genetic map of Arachis duranensis, a diploid ancestor of cultivated peanut
2012-01-01
Background Cultivated peanut (Arachis hypogaea) is an allotetraploid species whose ancestral genomes are most likely derived from the A-genome species, A. duranensis, and the B-genome species, A. ipaensis. The very recent (several millennia) evolutionary origin of A. hypogaea has imposed a bottleneck for allelic and phenotypic diversity within the cultigen. However, wild diploid relatives are a rich source of alleles that could be used for crop improvement and their simpler genomes can be more easily analyzed while providing insight into the structure of the allotetraploid peanut genome. The objective of this research was to establish a high-density genetic map of the diploid species A. duranensis based on de novo generated EST databases. Arachis duranensis was chosen for mapping because it is the A-genome progenitor of cultivated peanut and also in order to circumvent the confounding effects of gene duplication associated with allopolyploidy in A. hypogaea. Results More than one million expressed sequence tag (EST) sequences generated from normalized cDNA libraries of A. duranensis were assembled into 81,116 unique transcripts. Mining this dataset, 1236 EST-SNP markers were developed between two A. duranensis accessions, PI 475887 and Grif 15036. An additional 300 SNP markers also were developed from genomic sequences representing conserved legume orthologs. Of the 1536 SNP markers, 1054 were placed on a genetic map. In addition, 598 EST-SSR markers identified in A. hypogaea assemblies were included in the map along with 37 disease resistance gene candidate (RGC) and 35 other previously published markers. In total, 1724 markers spanning 1081.3 cM over 10 linkage groups were mapped. Gene sequences that provided mapped markers were annotated using similarity searches in three different databases, and gene ontology descriptions were determined using the Medicago Gene Atlas and TAIR databases. Synteny analysis between A. duranensis, Medicago and Glycine revealed significant stretches of conserved gene clusters spread across the peanut genome. A higher level of colinearity was detected between A. duranensis and Glycine than with Medicago. Conclusions The first high-density, gene-based linkage map for A. duranensis was generated that can serve as a reference map for both wild and cultivated Arachis species. The markers developed here are valuable resources for the peanut, and more broadly, to the legume research community. The A-genome map will have utility for fine mapping in other peanut species and has already had application for mapping a nematode resistance gene that was introgressed into A. hypogaea from A. cardenasii. PMID:22967170
Van Roy, N; Van Limbergen, H; Vandesompele, J; Van Gele, M; Poppe, B; Salwen, H; Laureys, G; Manoel, N; De Paepe, A; Speleman, F
2001-10-01
Cancer cell lines are essential gene discovery tools and have often served as models in genetic and functional studies of particular tumor types. One of the future challenges is comparison and interpretation of gene expression data with the available knowledge on the genomic abnormalities in these cell lines. In this context, accurate description of these genomic abnormalities is required. Here, we show that a combination of M-FISH with banding analysis, standard FISH, and CGH allowed a detailed description of the genetic alterations in 16 neuroblastoma cell lines. In total, 14 cryptic chromosome rearrangements were detected, including a balanced t(2;4)(p24.3;q34.3) translocation in cell line NBL-S, with the 2p24 breakpoint located at about 40 kb from MYCN. The chromosomal origin of 22 marker chromosomes and 41 cytogenetically undefined translocated segments was determined. Translocations involving the short arm of chromosome 2 were observed in six cell lines (38%) with and five (31%) without MYCN amplification, leading to partial chromosome arm 2p gain in all but one cell line and loss of material in the various partner chromosomes, including 1p and 11q. These 2p gains were often masked in the CGH profiles due to MYCN amplification. The commonly overrepresented region was chromosome segment 2pter-2p22, which contains the MYCN gene, and five out of eleven 2p breakpoints clustered to the interface of chromosome bands 2p16 and 2p21. In neuroblastoma cell line SJNB-12, with double minutes (dmins) but no MYCN amplification, the dmins were shown to be derived from 16q22-q23 sequences. The ATBF1 gene, an AT-binding transcription factor involved in normal neurogenesis and located at 16q22.2, was shown to be present in the amplicon. This is the first report describing the possible implication of ATBF1 in neuroblastoma cells.
We conclude that a combined approach of M-FISH, cytogenetics, and CGH allowed a more complete and accurate description of the genetic alterations occurring in the investigated cell lines. Copyright 2001 Wiley-Liss, Inc.
Proença, Diogo Neves; Nobre, Maria Fernanda; Morais, Paula V
2014-04-01
Bacterial strain A37T2(T) was isolated from the endophytic microbial community of a Pinus pinaster tree trunk and characterized. Strain A37T2(T) was Gram-stain-negative, formed rod-shaped cells, and grew optimally at 26-30 °C and at pH 5.5-7.5. The G+C content of the DNA was 46.6 mol%. The major respiratory quinone was menaquinone 7 (MK-7) and the major fatty acids were C16 : 1ω5c and iso-C15 : 0, representing 61.7 % of the total fatty acids. The polar lipids consisted of phosphatidylethanolamine, four unidentified aminophospholipids, one unidentified phospholipid, two unidentified aminolipids and three unidentified lipids. Phylogenetic analysis based on 16S rRNA gene sequences showed that strain A37T2(T) belonged to the family Chitinophagaceae, forming a distinct branch with Chitinophaga niabensis JS13-10(T) within the genus Chitinophaga. Strain A37T2(T) shared between 92.7 and 95.1 % 16S rRNA gene sequence similarity with the type strains of species of the genus Chitinophaga. The phylogenetic, phenotypic and chemotaxonomic data presented indicate that strain A37T2(T) represents a novel species of the genus Chitinophaga, for which the name Chitinophaga costaii sp. nov. is proposed. The type strain is A37T2(T) ( = CIP 110584(T) = LMG 27458(T)). An emended description of Chitinophaga niabensis JS13-10(T) is also proposed.
L'Haridon, Stéphane; Chalopin, Morgane; Colombo, Delphine; Toffin, Laurent
2014-06-01
A novel, strictly anaerobic, methylotrophic marine methanogen, strain SLH33(T), was isolated from deep sediment samples covered by an orange microbial mat collected from the Napoli Mud Volcano. Cells of strain SLH33(T) were Gram-stain-negative, motile, irregular cocci that occurred singly. Cells utilized trimethylamine, dimethylamine, monomethylamine, methanol, betaine, N,N-dimethylethanolamine and choline (N,N,N-trimethylethanolamine) as substrates for growth and methanogenesis. The optimal growth temperature was 30 °C; maximum growth rate was obtained at pH 7.0 in the presence of 0.5 M Na(+). The DNA G+C content of strain SLH33(T) was 43.4 mol%. Phylogenetic analyses based on 16S rRNA gene sequences placed strain SLH33(T) within the genus Methanococcoides. The novel isolate was related most closely to Methanococcoides methylutens TMA-10(T) (98.8% 16S rRNA gene sequence similarity) but distantly related to Methanococcoides burtonii DSM 6242(T) (97.6%) and Methanococcoides alaskense AK-5(T) (97.6%). DNA-DNA hybridization studies indicated that strain SLH33(T) represents a novel species, given that it shared less than 16% DNA-DNA relatedness with Methanococcoides methylutens TMA-10(T). The name Methanococcoides vulcani sp. nov. is proposed for this novel species, with strain SLH33(T) ( = DSM 26966(T) = JCM 19278(T)) as the type strain. An emended description of the genus Methanococcoides is also proposed. © 2014 IUMS.
Chen, Yi-Guang; Zhang, Yu-Qin; Yi, Lang-Bo; Li, Zhao-Yang; Wang, Yong-Xiao; Xiao, Huai-Dong; Chen, Qi-Hui; Cui, Xiao-Long; Li, Wen-Jun
2010-03-01
A facultatively anaerobic, moderately halophilic, Gram-positive, endospore-forming, motile, catalase- and oxidase-positive, rod-shaped bacterium, strain JSM 072002(T), was isolated from a sea anemone (Anthopleura xanthogrammica) collected from the South China Sea. Strain JSM 072002(T) was able to grow with 0.5-15 % (w/v) NaCl and at pH 6.0-10.0 and 15-50 degrees C; optimum growth was observed with 2-5 % (w/v) NaCl and at pH 7.5 and 35 degrees C. meso-Diaminopimelic acid was present in the cell-wall peptidoglycan. The major cellular fatty acids were iso-C(15 : 0) and anteiso-C(15 : 0). The predominant respiratory quinone was menaquinone 7 and the genomic DNA G+C content was 41.3 mol%. Phylogenetic analysis based on 16S rRNA gene sequences indicated that strain JSM 072002(T) should be assigned to the genus Pontibacillus and revealed relatively low 16S rRNA gene sequence similarities (<97 %) with the type strains of the three recognized Pontibacillus species (Pontibacillus chungwhensis BH030062(T), 96.8 %; Pontibacillus marinus KCTC 3917(T), 96.7 %; Pontibacillus halophilus JSM 076056(T), 96.0 %). The combination of phylogenetic analysis, DNA-DNA relatedness values, phenotypic characteristics and chemotaxonomic data supports the view that strain JSM 072002(T) represents a novel species of the genus Pontibacillus, for which the name Pontibacillus litoralis sp. nov. is proposed. The type strain is JSM 072002(T) (=DSM 21186(T)=KCTC 13237(T)). An emended description of the genus Pontibacillus is also presented.
Nedashkovskaya, Olga I; Kukhlevskiy, Andrey D; Zhukova, Natalia V; Kim, So-Jeong; Rhee, Sung-Keun
2013-06-01
A strictly aerobic, Gram-stain-negative, rod-shaped and red-orange pigmented bacterium, designated strain KMM 6395(T), was isolated from the green alga Cladophora stimpsoni and subjected to a polyphasic taxonomic study. Phylogenetic analysis based on 16S rRNA gene sequencing revealed that the novel strain was affiliated to the family Hyphomonadaceae of the class Alphaproteobacteria, being most closely related to the type strain of the single species of the genus Litorimonas, Litorimonas taeanensis G5(T), with 16S rRNA gene sequence similarity of 96.8 %. Strain KMM 6395(T) grew with 1-5 % NaCl and at 4-35 °C, and hydrolysed starch and Tween 80. The DNA G+C content was 48.7 mol%. The prevalent fatty acids were C18:1 ω7c, C19:1 ω8c and C18:1 ω7c 10-methyl. The polar lipid profile was characterized by the presence of phosphatidylglycerol, monoglycosyldiglyceride, glucuronopyranosyldiglyceride and an unidentified glycolipid. The major respiratory quinone was Q-10. The significant molecular distinctiveness between the novel isolate and its nearest neighbour, L. taeanensis G5(T), was strongly supported by the differences in physiological and biochemical tests. Therefore, strain KMM 6395(T) represents a novel species of the genus Litorimonas, for which the name Litorimonas cladophorae sp. nov. is proposed. The type strain is KMM 6395(T) (=KCTC 23968(T) = LMG 26985(T)). Emended descriptions of the genus Litorimonas and L. taeanensis are also provided.
Wirshing, Herman H; Baker, Andrew C
2014-08-01
Molecular phylogenies of scleractinian corals often fail to agree with traditional phylogenies derived from morphological characters. These discrepancies are generally attributed to non-homologous or morphologically plastic characters used in taxonomic descriptions. Consequently, morphological convergence of coral skeletons among phylogenetically unrelated groups is considered to be the major evolutionary process confounding molecular and morphological hypotheses. A strategy that may help identify cases of convergence and/or diversification in coral morphology is to compare phylogenies of existing "neutral" genetic markers used to estimate genealogic phylogenetic history with phylogenies generated from non-neutral genes involved in calcification (biomineralization). We tested the hypothesis that differences among calcification gene phylogenies with respect to the "neutral" trees may represent convergent or divergent functional strategies among calcification gene proteins that may correlate to aspects of coral skeletal morphology. Partial sequences of two nuclear genes previously determined to be involved in the calcification process in corals, "Cnidaria-III" membrane-bound/secreted α-carbonic anhydrase (CIII-MBSα-CA) and bone morphogenic protein (BMP) 2/4, were PCR-amplified, cloned and sequenced from 31 scleractinian coral species in 26 genera and 9 families. For comparison, "neutral" gene phylogenies were generated from sequences from two protein-coding "non-calcification" genes, one nuclear (β-tubulin) and one mitochondrial (cytochrome b), from the same individuals. Cloned CIII-MBSα-CA sequences were found to be non-neutral, and phylogenetic analyses revealed CIII-MBSα-CAs to exhibit a complex evolutionary history with clones distributed between at least 2 putative gene copies. However, for several coral taxa only one gene copy was recovered. With CIII-MBSα-CA, several recovered clades grouped taxa that differed from the "non-calcification" loci. 
In some cases, these taxa shared aspects of their skeletal morphology (i.e., convergence or diversification relative to the "non-calcification" loci), but in other cases they did not. For example, the "non-calcification" loci recovered Atlantic and Pacific mussids as separate evolutionary lineages, whereas with CIII-MBSα-CA, clones of two species of Atlantic mussids (Isophyllia sinuosa and Mycetophyllia sp.) and two species of Pacific mussids (Acanthastrea echinata and Lobophyllia hemprichii) were united in a distinct clade (except for one individual of Mycetophyllia). However, this clade also contained other taxa which were not unambiguously correlated with morphological features. BMP2/4 also contained clones that likely represent different gene copies. However, many of the sequences showed no significant deviation from neutrality, and reconstructed phylogenies were similar to the "non-calcification" tree topologies with a few exceptions. Although individual calcification genes are unlikely to precisely explain the diverse morphological features exhibited by scleractinian corals, this study demonstrates an approach for identifying cases where morphological taxonomy may have been misled by convergent and/or divergent molecular evolutionary processes in corals. Studies such as this may help illuminate our understanding of the likely complex evolution of genes involved in the calcification process, and enhance our knowledge of the natural history and biodiversity within this central ecological group. Published by Elsevier Inc.
Mouse Vk gene classification by nucleic acid sequence similarity.
Strohal, R; Helmberg, A; Kroemer, G; Kofler, R
1989-01-01
Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
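The family definition in this abstract (members sharing greater than 80% overall nucleic acid sequence similarity) can be illustrated with a short sketch. The abstract does not say how families were assembled, so the single-linkage grouping below is an assumption, as are the function names and the simplification that sequences are already aligned to equal length.

```python
from __future__ import annotations

from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identical columns between two pre-aligned sequences ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Columns where both sequences have a gap carry no information; skip them.
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def group_into_families(seqs: dict[str, str], threshold: float = 80.0) -> list[set[str]]:
    """Single-linkage grouping (an assumption): any pair above the threshold is merged."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return sorted(families.values(), key=sorted)


# Toy, made-up sequences; not real Vk gene data.
toy = {"v1": "ACGTACGTAC", "v2": "ACGTACGTAA", "v3": "TTTTTTTTTT"}
families = group_into_families(toy)
```

Single linkage can chain distantly related sequences into one family through intermediates, which is one way the blurred family boundaries described in the abstract could show up in practice.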
AutoFACT: An Automatic Functional Annotation and Classification Tool
Koski, Liisa B; Gray, Michael W; Lang, B Franz; Burger, Gertraud
2005-01-01
Background Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets. Results We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only between 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%. Conclusion AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at . PMID:15960857
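The contrast drawn in this abstract between top-BLAST-hit annotation and combining reports from several databases can be sketched as follows. This is not AutoFACT's actual procedure (the tool is written in PERL and its rules are not given here); the `Hit` record, the list of uninformative terms, the bit-score cutoff and the fallback labels are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical terms that mark a description as functionally uninformative.
UNINFORMATIVE_TERMS = ("hypothetical", "unknown", "unnamed", "uncharacterized")


@dataclass
class Hit:
    database: str      # name of the database the hit came from
    description: str   # free-text description of the matched sequence
    bit_score: float   # similarity score reported for the hit


def is_informative(description: str) -> bool:
    text = description.lower()
    return not any(term in text for term in UNINFORMATIVE_TERMS)


def choose_annotation(hits: list[Hit], database_priority: list[str],
                      min_bit_score: float = 40.0) -> str:
    """Pick the informative description from the highest-priority database."""
    rank = {db: i for i, db in enumerate(database_priority)}
    candidates = [h for h in hits
                  if h.database in rank and h.bit_score >= min_bit_score]
    informative = [h for h in candidates if is_informative(h.description)]
    if informative:
        # Prefer the earlier database; break ties by the higher bit score.
        best = min(informative, key=lambda h: (rank[h.database], -h.bit_score))
        return best.description
    # Hypothetical fallback labels for sequences without a usable description.
    return "unassigned protein" if candidates else "unclassified"


# Made-up hits for one query sequence.
hits = [
    Hit("db_a", "hypothetical protein", 300.0),
    Hit("db_b", "proline dehydrogenase", 120.0),
    Hit("db_c", "oxidoreductase", 30.0),
]
annotation = choose_annotation(hits, ["db_a", "db_b", "db_c"])
```

In this toy case a top-hit approach would report "hypothetical protein", while the combined approach falls through to the informative description from the second database.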
2014-01-01
Background The lined sea anemone Edwardsiella lineata is an informative model system for evolutionary-developmental studies of parasitism. In this species, it is possible to compare alternate developmental pathways leading from a larva to either a free-living polyp or a vermiform parasite that inhabits the mesoglea of a ctenophore host. Additionally, E. lineata is confamilial with the model cnidarian Nematostella vectensis, providing an opportunity for comparative genomic, molecular and organismal studies. Description We generated a reference transcriptome for E. lineata via high-throughput sequencing of RNA isolated from five developmental stages (parasite; parasite-to-larva transition; larva; larva-to-adult transition; adult). The transcriptome comprises 90,440 contigs assembled from >15 billion nucleotides of DNA sequence. Using a molecular clock approach, we estimated the divergence between E. lineata and N. vectensis at 215–364 million years ago. Based on gene ontology and metabolic pathway analyses and gene family surveys (bHLH-PAS, deiodinases, Fox genes, LIM homeodomains, minicollagens, nuclear receptors, Sox genes, and Wnts), the transcriptome of E. lineata is comparable in depth and completeness to N. vectensis. Analyses of protein motifs revealed extensive conservation between the proteins of these two edwardsiid anemones, although we show the NF-κB protein of E. lineata reflects the ancestral structure, while the NF-κB protein of N. vectensis has undergone a split that separates the DNA-binding domain from the inhibitory domain. All contigs have been deposited in a public database (EdwardsiellaBase), where they may be searched according to contig ID, gene ontology, protein family motif (Pfam), enzyme commission number, and BLAST. The alignment of the raw reads to the contigs can also be visualized via JBrowse.
Conclusions The transcriptomic data and database described here provide a platform for studying the evolutionary developmental genomics of a derived parasitic life cycle. In addition, these data from E. lineata will aid in the interpretation of evolutionary novelties in gene sequence or structure that have been reported for the model cnidarian N. vectensis (e.g., the split NF-κB locus). Finally, we include custom computational tools to facilitate the annotation of a transcriptome based on high-throughput sequencing data obtained from a “non-model system.” PMID:24467778
Draft genome sequence of type strain HBR26T and description of Rhizobium aethiopicum sp. nov.
Aserse, Aregu Amsalu; Woyke, Tanja; Kyrpides, Nikos C.; ...
2017-01-26
Rhizobium aethiopicum sp. nov. is a newly proposed species within the genus Rhizobium. This species includes six rhizobial strains, which were isolated from root nodules of the legume plant Phaseolus vulgaris growing in soils of Ethiopia. The species fixes nitrogen effectively in symbiosis with the host plant P. vulgaris, and is composed of aerobic, Gram-negative staining, rod-shaped bacteria. The genome of type strain HBR26T of R. aethiopicum sp. nov. was one of the rhizobial genomes sequenced as a part of the DOE JGI 2014 Genomic Encyclopedia project designed for soil and plant-associated and newly described type strains. The genome sequence is arranged in 62 scaffolds and consists of 6,557,588 bp, with a 61% G + C content and 6221 protein-coding and 86 RNA genes. The genome of HBR26T contains repABC genes (plasmid replication genes) homologous to the genes found in five different Rhizobium etli CFN42T plasmids, suggesting that HBR26T may have five additional replicons other than the chromosome. In the genome of HBR26T, the nodulation genes nodB, nodC, nodS, nodI, nodJ and nodD are located in the same module, and organized in a similar way as nod genes found in the genome of other known common bean-nodulating rhizobial species. The nodA gene is found in a different scaffold, but it is also very similar to nodA genes of other bean-nodulating rhizobial strains. Though HBR26T is distinct on the phylogenetic tree and based on ANI analysis (the highest value 90.2% ANI with CFN42T) from other bean-nodulating species, these nod genes and most nitrogen-fixing genes found in the genome of HBR26T share high identity with the corresponding genes of known bean-nodulating rhizobial species (96-100% identity). This suggests that symbiotic genes might be shared between bean-nodulating rhizobia through horizontal gene transfer. R. aethiopicum sp. nov. was grouped into the genus Rhizobium but was distinct from all recognized species of that genus by phylogenetic analyses of combined sequences of the housekeeping genes recA and glnII. The closest reference type strains for HBR26T were R. etli CFN42T (94% similarity of the combined recA and glnII sequences) and Rhizobium bangladeshense BLR175T (93%). Genomic ANI calculation based on protein-coding genes also revealed that the closest reference strains were R. bangladeshense BLR175T and R. etli CFN42T with ANI values 91.8 and 90.2%, respectively. Nevertheless, the ANI values between HBR26T and BLR175T or CFN42T are far lower than the cutoff value of ANI (≥ 96%) between strains in the same species, confirming that HBR26T belongs to a novel species. Thus, on the basis of phylogenetic, comparative genomic analyses and ANI results, we formally propose the creation of R. aethiopicum sp. nov. with strain HBR26T (=HAMBI 3550T =LMG 29711T) as the type strain. The genome assembly and annotation data are deposited in the DOE JGI portal and also available at European Nucleotide Archive under accession numbers FMAJ01000001-FMAJ01000062.
Hayashi, Kazukuni; Busse, Hans-Jürgen; Golke, Jan; Anderson, James; Wan, Xuehua; Hou, Shaobin; Chain, Patrick S G; Prescott, Rebecca D; Donachie, Stuart P
2018-01-01
A Gram-negative, rod-shaped bacterium, designated KH87 T , was isolated from a fishing hook that had been baited and suspended in seawater off O'ahu, Hawai'i. Based on a comparison of 1524 nt of the 16S rRNA gene sequence of strain KH87 T , its nearest neighbours were the Gammaproteobacteria Rheinheimera nanhaiensis E407-8 T (96.2 % identity), Rheinheimera chironomi K19414 T (96.0 %), Rheinheimera pacifica KMM 1406 T (95.8 %), Rheinheimera muenzenbergensis E49 T (95.7 %), Alishewanella solinquinati KMK6 T (94.9 %) and Arsukibacterium ikkense GCM72 T (94.6 %). Cells of KH87 T were motile by a single polar flagellum, strictly aerobic, and catalase- and oxidase-positive. Growth occurred between 4 and 39 °C, and in a circumneutral pH range. Major fatty acids in whole cells of strain KH87 T were cis-9-hexadecenoic acid, hexadecanoic acid and cis-11-octadecenoic acid. The quinone system contained mostly menaquinone MK-7, and a minor amount of ubiquinone Q-8. The polar lipid profile contained the major lipids phosphatidylglycerol, phosphatidylserine, phosphatidylethanolamine, an unidentified aminolipid, and a lipid not containing phosphate, an amino group or a sugar moiety. Putrescine was the major polyamine. Physiological, biochemical and genomic data, including obligate halophily, absence of amylolytic activity, a quinone system dominated by MK-7 and DNA G+C content (42.0 mol%) distinguished KH87 T from extant Rheinheimera species; strain KH87 T was also distinguished by a multi-locus sequence analysis of aligned and concatenated 16S rRNA, gyrB, rpoB and rpoD gene sequences. Based on phenotypic and genotypic differences, the species Rheinheimera salexigens sp. nov. is proposed to accommodate KH87 T as the type strain (=ATCC BAA-2715 T =CIP 111115 T ). An emended description of the genus Rheinheimera is also proposed.
Behrendt, Undine; Schumann, Peter; Stieglmeier, Michaela; Pukall, Rüdiger; Augustin, Jürgen; Spröer, Cathrin; Schwendner, Petra; Moissl-Eichinger, Christine; Ulrich, Andreas
2010-10-01
In the course of studying the influence of N-fertilization on N(2) and N(2)O flux rates in relation to soil bacterial community composition of a long-term fertilization experiment in fen peat grassland, a strain group was isolated that was related to a strain isolated from a spacecraft assembly clean room during diversity studies of microorganisms, which withstood cleaning and bioburden reduction strategies. Both the fen soil isolates and the clean room strain revealed versatile physiological capacities in N-transformation processes by performing heterotrophic nitrification, respiratory ammonification and denitrification activity. Phylogenetic analysis based on 16S rRNA gene sequences demonstrated that the investigated isolates belonged to the genus Paenibacillus. Sequence similarities lower than 97% in comparison to established species indicated a separate species position. Except for the peptidoglycan type (A4alpha L-Lys-D-Asp), chemotaxonomic features of the isolates matched the genus description, but differences in several physiological characteristics separated them from related species and supported their novel species status. Despite a high 16S rRNA gene sequence similarity between the clean room isolate ES_MS17(T) and the representative fen soil isolate N3/975(T), DNA-DNA hybridization studies revealed genetic differences at the species level. These differences were substantiated by MALDI-TOF MS analysis, ribotyping and several distinct physiological characteristics. On the basis of these results, it was concluded that the fen soil isolates and the clean room isolate ES_MS17(T) represented two novel species for which the names Paenibacillus uliginis sp. nov. (type strain N3/975(T)=DSM 21861(T)=LMG 24790(T)) and Paenibacillus purispatii sp. nov. (type strain ES_MS17(T)=DSM 22991(T)=CIP 110057(T)) are proposed. Copyright © 2010 Elsevier GmbH. All rights reserved.
Targeted exon sequencing in Usher syndrome type I.
Bujakowska, Kinga M; Consugar, Mark; Place, Emily; Harper, Shyana; Lena, Jaclyn; Taub, Daniel G; White, Joseph; Navarro-Gomez, Daniel; Weigel DiFranco, Carol; Farkas, Michael H; Gai, Xiaowu; Berson, Eliot L; Pierce, Eric A
2014-12-02
Patients with Usher syndrome type I (USH1) have retinitis pigmentosa, profound congenital hearing loss, and vestibular ataxia. This syndrome is currently thought to be associated with at least six genes, which are encoded by over 180 exons. Here, we present the use of state-of-the-art techniques in the molecular diagnosis of a cohort of 47 USH1 probands. The cohort was studied with selective exon capture and next-generation sequencing of currently known inherited retinal degeneration genes, comparative genomic hybridization, and Sanger sequencing of new USH1 exons identified by human retinal transcriptome analysis. With this approach, we were able to genetically solve 14 of the 47 probands by confirming the biallelic inheritance of mutations. We detected two likely pathogenic variants in an additional 19 patients, for whom family members were not available for cosegregation analysis to confirm biallelic inheritance. Ten patients, in addition to primary disease-causing mutations, carried rare likely pathogenic USH1 alleles or variants in other genes associated with deaf-blindness, which may influence disease phenotype. Twenty-one of the identified mutations were novel among the 33 definite or likely solved patients. Here, we also present a clinical description of the studied cohort at their initial visits. We found a remarkable genetic heterogeneity in the studied USH1 cohort with multiplicity of mutations, of which many were novel. No obvious influence of genotype on phenotype was found, possibly due to small sample sizes of the genotypes under study. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.
Peeters, Charlotte; Meier-Kolthoff, Jan P.; Verheyde, Bart; De Brandt, Evie; Cooper, Vaughn S.; Vandamme, Peter
2016-01-01
Partial gyrB gene sequence analysis of 17 isolates from human and environmental sources revealed 13 clusters of strains and identified them as Burkholderia glathei clade (BGC) bacteria. The taxonomic status of these clusters was examined by whole-genome sequence analysis, determination of the G+C content, whole-cell fatty acid analysis and biochemical characterization. The whole-genome sequence-based phylogeny was assessed using the Genome Blast Distance Phylogeny (GBDP) method and an extended multilocus sequence analysis (MLSA) approach. The results demonstrated that these 17 BGC isolates represented 13 novel Burkholderia species that could be distinguished by both genotypic and phenotypic characteristics. BGC strains exhibited a broad metabolic versatility and developed beneficial, symbiotic, and pathogenic interactions with different hosts. Our data also confirmed that there is no phylogenetic subdivision in the genus Burkholderia that distinguishes beneficial from pathogenic strains. We therefore propose to formally classify the 13 novel BGC Burkholderia species as Burkholderia arvi sp. nov. (type strain LMG 29317T = CCUG 68412T), Burkholderia hypogeia sp. nov. (type strain LMG 29322T = CCUG 68407T), Burkholderia ptereochthonis sp. nov. (type strain LMG 29326T = CCUG 68403T), Burkholderia glebae sp. nov. (type strain LMG 29325T = CCUG 68404T), Burkholderia pedi sp. nov. (type strain LMG 29323T = CCUG 68406T), Burkholderia arationis sp. nov. (type strain LMG 29324T = CCUG 68405T), Burkholderia fortuita sp. nov. (type strain LMG 29320T = CCUG 68409T), Burkholderia temeraria sp. nov. (type strain LMG 29319T = CCUG 68410T), Burkholderia calidae sp. nov. (type strain LMG 29321T = CCUG 68408T), Burkholderia concitans sp. nov. (type strain LMG 29315T = CCUG 68414T), Burkholderia turbans sp. nov. (type strain LMG 29316T = CCUG 68413T), Burkholderia catudaia sp. nov. (type strain LMG 29318T = CCUG 68411T) and Burkholderia peredens sp. nov. (type strain LMG 29314T = CCUG 68415T). Furthermore, we present emended descriptions of the species Burkholderia sordidicola, Burkholderia zhejiangensis and Burkholderia grimmiae. The GenBank/EMBL/DDBJ accession numbers for the 16S rRNA and gyrB gene sequences determined in this study are LT158612-LT158624 and LT158625-LT158641, respectively. PMID:27375597
Li, Qianqian; Liu, Jianguo; Zhang, Litao; Liu, Qian
2014-01-01
Background Algae in the order Trentepohliales have a broad geographic distribution and are generally characterized by the presence of abundant β-carotene. The many monographs published to date have mainly focused on their morphology, taxonomy, phylogeny, distribution and reproduction; molecular studies of this order are still rare. High-throughput RNA sequencing (RNA-Seq) technology provides a powerful and efficient method for transcript analysis and gene discovery in Trentepohlia jolithus. Methods/Principal Findings Illumina HiSeq 2000 sequencing generated 55,007,830 Illumina PE raw reads, which were assembled into 41,328 assembled unigenes. Based on NR annotation, 53.28% of the unigenes (22,018) could be assigned to gene ontology classes with 54 subcategories and 161,451 functional terms. A total of 26,217 (63.44%) assembled unigenes were mapped to 128 KEGG pathways. Furthermore, a set of 5,798 SSRs in 5,206 unigenes and 131,478 putative SNPs were identified. Moreover, the fact that all of the C4 photosynthesis genes exist in T. jolithus suggests a complex carbon acquisition and fixation system. Similarities and differences between T. jolithus and other algae in carotenoid biosynthesis are also described in depth. Conclusions/Significance This is the first broad transcriptome survey for T. jolithus, increasing the amount of molecular data available for the class Ulvophyceae. As well as providing resources for functional genomics studies, the functional genes and putative pathways identified here will contribute to a better understanding of carbon fixation and fatty acid and carotenoid biosynthesis in T. jolithus. PMID:25254555
Lubin, Ira M; Aziz, Nazneen; Babb, Lawrence J; Ballinger, Dennis; Bisht, Himani; Church, Deanna M; Cordes, Shaun; Eilbeck, Karen; Hyland, Fiona; Kalman, Lisa; Landrum, Melissa; Lockhart, Edward R; Maglott, Donna; Marth, Gabor; Pfeifer, John D; Rehm, Heidi L; Roy, Somak; Tezak, Zivana; Truty, Rebecca; Ullman-Cullere, Mollie; Voelkerding, Karl V; Worthey, Elizabeth A; Zaranek, Alexander W; Zook, Justin M
2017-05-01
A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Transcriptome changes during fruit development and ripening of sweet orange (Citrus sinensis).
Yu, Keqin; Xu, Qiang; Da, Xinlei; Guo, Fei; Ding, Yuduan; Deng, Xiuxin
2012-01-10
The transcriptome of the fruit pulp of the sweet orange variety Anliu (WT) and that of its red fleshed mutant Hong Anliu (MT) were compared to understand the dynamics and differential expression of genes expressed during fruit development and ripening. The transcriptomes of WT and MT were sampled at four developmental stages using an Illumina sequencing platform. A total of 19,440 and 18,829 genes were detected in MT and WT, respectively. Hierarchical clustering analysis revealed 24 expression patterns for the set of all genes detected, of which 20 were in common between MT and WT. Over 89% of the genes showed differential expression during fruit development and ripening in the WT. Functional categorization of the differentially expressed genes revealed that cell wall biosynthesis, carbohydrate and citric acid metabolism, carotenoid metabolism, and the response to stress were the most differentially regulated processes occurring during fruit development and ripening. A description of the transcriptomic changes occurring during fruit development and ripening was obtained in sweet orange, along with a dynamic view of the gene expression differences between the wild type and a red fleshed mutant. © 2012 Yu et al; licensee BioMed Central Ltd.
2014-01-01
Background Alternative splicing is an important process in higher eukaryotes that allows several transcripts to be obtained from one gene. A specific case of alternative splicing is mutually exclusive splicing, in which exactly one exon out of a cluster of neighbouring exons is spliced into the mature transcript. Recently, a new algorithm for the prediction of these exons has been developed based on the preconditions that the exons of the cluster have similar lengths, sequence homology, and conserved splice sites, and that they are translated in the same reading frame. Description In this contribution we introduce Kassiopeia, a database and web application for the generation, storage, and presentation of genome-wide analyses of mutually exclusive exomes. Currently, Kassiopeia provides access to the mutually exclusive exomes of twelve Drosophila species, the thale cress Arabidopsis thaliana, the nematode Caenorhabditis elegans, and human. Mutually exclusive spliced exons (MXEs) were predicted based on gene reconstructions from Scipio. Based on the standard prediction values, with which 83.5% of the annotated MXEs of Drosophila melanogaster were reconstructed, the exomes contain surprisingly more MXEs than previously supposed and identified. The user can search Kassiopeia using BLAST or browse the genes of each species, optionally adjusting the parameters used for the prediction to reveal more divergent or only very similar exon candidates. Conclusions We developed a pipeline to predict MXEs in the genomes of several model organisms and a web interface, Kassiopeia, for their visualization. For each gene Kassiopeia provides a comprehensive gene structure scheme, the sequences and predicted secondary structures of the MXEs, and, if available, further evidence for MXE candidates from cDNA/EST data, predictions of MXEs in homologous genes of closely related species, and RNA secondary structure predictions. Kassiopeia can be accessed at http://www.motorprotein.de/kassiopeia.
PMID:24507667
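The preconditions named in this abstract (similar lengths, sequence homology, same reading frame) can be illustrated with a small pairwise check. This is a rough sketch under stated assumptions and not the published algorithm: the function name `is_mxe_candidate`, both thresholds, and the use of `difflib.SequenceMatcher` as a stand-in for a homology score are all hypothetical, "same reading frame" is approximated as equal exon length modulo 3, and the conserved-splice-site criterion is not checked at all.

```python
from difflib import SequenceMatcher

def is_mxe_candidate(exon_a, exon_b, max_len_diff=20, min_similarity=0.5):
    """Rough pairwise test for two neighbouring exons (hypothetical thresholds)."""
    # Criterion 1: the exons have similar lengths
    similar_length = abs(len(exon_a) - len(exon_b)) <= max_len_diff
    # Criterion 2 (approximation): swapping one exon for the other keeps the frame
    same_frame = len(exon_a) % 3 == len(exon_b) % 3
    # Criterion 3 (stand-in): crude sequence similarity instead of a real alignment
    similarity = SequenceMatcher(None, exon_a, exon_b).ratio()
    return similar_length and same_frame and similarity >= min_similarity

# Two made-up 18-nt exons differing by a single substitution
exon_1 = "ATGGCTGCTAAAGGTGAA"
exon_2 = "ATGGCTGCAAAAGGTGAA"
print(is_mxe_candidate(exon_1, exon_2))
```

Loosening or tightening the two thresholds mirrors the adjustable prediction parameters the abstract mentions for revealing more divergent or only very similar exon candidates.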
Kurylo, Chad M.; Alexander, Noah; Dass, Randall A.; Parks, Matthew M.; Altman, Roger A.; Vincent, C. Theresa; Mason, Christopher E.; Blanchard, Scott C.
2016-01-01
Escherichia coli strain MRE600 was originally identified for its low RNase I activity and has therefore been widely adopted by the biomedical research community as a preferred source for the expression and purification of transfer RNAs and ribosomes. Despite its widespread use, surprisingly little information about its genome or genetic content exists. Here, we present the first de novo assembly and description of the MRE600 genome and epigenome. To provide context to these studies of MRE600, we include comparative analyses with E. coli K-12 MG1655 (K12). Pacific Biosciences Single Molecule, Real-Time sequencing reads were assembled into one large chromosome (4.83 Mb) and three smaller plasmids (89.1, 56.9, and 7.1 kb). Interestingly, the 7.1-kb plasmid possesses genes encoding a colicin E1 protein and its associated immunity protein. The MRE600 genome has a G + C content of 50.8% and contains a total of 5,181 genes, including 4,913 protein-encoding genes and 268 RNA genes. We identified 41,469 modified DNA bases (0.83% of total) and found that MRE600 lacks the gene for type I methyltransferase, EcoKI. Phylogenetic, taxonomic, and genetic analyses demonstrate that MRE600 is a divergent E. coli strain that displays features of the closely related genus, Shigella. Nevertheless, comparative analyses between MRE600 and E. coli K12 show that these two strains exhibit nearly identical ribosomal proteins, ribosomal RNAs, and highly homologous tRNA species. Substantiating prior suggestions that MRE600 lacks RNase I activity, the RNase I-encoding gene, rna, contains a single premature stop codon early in its open-reading frame. PMID:26802429
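The figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are made up, and it assumes that "0.83% of total" refers to all bases of the assembly (chromosome plus plasmids), using the rounded sizes quoted in the abstract.

```python
# Figures quoted in the abstract; sizes are rounded, so the total is approximate.
chromosome_bp = 4.83e6
plasmids_bp = [89.1e3, 56.9e3, 7.1e3]
protein_genes = 4913
rna_genes = 268
modified_bases = 41469

# Gene count: protein-encoding genes plus RNA genes
total_genes = protein_genes + rna_genes

# Assumption: the modified-base percentage is relative to the whole assembly
genome_bp = chromosome_bp + sum(plasmids_bp)
modified_percent = 100 * modified_bases / genome_bp

print(total_genes, round(modified_percent, 2))
```

Under that assumption, the gene total and the modified-base percentage agree with the values reported in the abstract.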
Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Sun, Xiaoqing; Yuan, Jianbo; Li, Fuhua; Xiang, Jianhai
2015-01-01
Molting is one of the most important biological processes in shrimp growth and development. All shrimp undergo cyclic molting periodically to shed and replace their exoskeletons. This process is essential for growth, metamorphosis, and reproduction in shrimp. However, the molecular mechanisms underlying shrimp molting remain poorly understood. In this study, we investigated global expression changes in the transcriptomes of the Pacific white shrimp, Litopenaeus vannamei, the most commonly cultured shrimp species worldwide. The transcriptome of whole L. vannamei was investigated by RNA-sequencing (RNA-seq) throughout the molting cycle, including the inter-molt (C), pre-molt (D0, D1, D2, D3, D4), and post-molt (P1 and P2) stages, and 93,756 unigenes were identified. Among these genes, we identified 5,117 genes differentially expressed (log2ratio ≥1 and FDR ≤0.001) in adjacent molt stages. The results were compared against the National Center for Biotechnology Information (NCBI) non-redundant protein/nucleotide sequence database, Swiss-Prot, PFAM database, the Gene Ontology database, and the Kyoto Encyclopedia of Genes and Genomes database in order to annotate gene descriptions, associate them with gene ontology terms, and assign them to pathways. The expression patterns for genes involved in several molecular events critical for molting, such as hormone regulation, triggering events, implementation phases, skelemin, immune responses were characterized and considered as mechanisms underlying molting in L. vannamei. Comparisons with transcriptomic analyses in other arthropods were also performed. The characterization of major transcriptional changes in genes involved in the molting cycle provides candidates for future investigation of the molecular mechanisms. 
The data generated in this study will serve as an important transcriptomic resource for the shrimp research community to facilitate gene and genome annotation and to characterize key molecular processes underlying shrimp development. PMID:26650402
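The differential-expression criterion quoted in this abstract (log2ratio ≥1 and FDR ≤0.001) can be written as a small filter. This is a minimal sketch, not the authors' code: the function name and argument layout are hypothetical, it assumes the log2 ratio is compared by absolute value so that both up- and down-regulation count, and it requires strictly positive expression values.

```python
import math

def is_differentially_expressed(expr_a, expr_b, fdr,
                                min_abs_log2=1.0, max_fdr=0.001):
    """Apply the abstract's thresholds to one gene in two adjacent molt stages.

    expr_a, expr_b: positive expression values in the two stages (hypothetical units).
    fdr: false discovery rate already computed for this comparison.
    """
    log2_ratio = math.log2(expr_b / expr_a)
    # Assumption: the threshold applies to the magnitude of the change
    return abs(log2_ratio) >= min_abs_log2 and fdr <= max_fdr

print(is_differentially_expressed(10.0, 40.0, 0.0005))
```

A gene passes only when both conditions hold, so a large fold change with a weak FDR, or a strong FDR with a small fold change, is filtered out.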
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
Elsayed, Liena E O; Mohammed, Inaam N; Hamed, Ahlam A A; Elseed, Maha A; Johnson, Adam; Mairey, Mathilde; Mohamed, Hassab Elrasoul S A; Idris, Mohamed N; Salih, Mustafa A M; El-Sadig, Sarah M; Koko, Mahmoud E; Mohamed, Ashraf Y O; Raymond, Laure; Coutelier, Marie; Darios, Frédéric; Siddig, Rayan A; Ahmed, Ahmed K M A; Babai, Arwa M A; Malik, Hiba M O; Omer, Zulfa M B M; Mohamed, Eman O E; Eltahir, Hanan B; Magboul, Nasr Aldin A; Bushara, Elfatih E; Elnour, Abdelrahman; Rahim, Salah M Abdel; Alattaya, Abdelmoneim; Elbashir, Mustafa I; Ibrahim, Muntaser E; Durr, Alexandra; Audhya, Anjon; Brice, Alexis; Ahmed, Ammar E; Stevanin, Giovanni
2016-01-01
Hereditary spastic paraplegias (HSP) are the second most common type of motor neuron disease recognized worldwide. We investigated a total of 25 consanguineous families from Sudan. We used next-generation sequencing to screen 74 HSP-related genes in 23 families. Linkage analysis and candidate gene sequencing was performed in two other families. We established a genetic diagnosis in six families with autosomal recessive HSP (SPG11 in three families and TFG/SPG57, SACS and ALS2 in one family each). A heterozygous mutation in a gene involved in an autosomal dominant HSP (ATL1/SPG3A) was also identified in one additional family. Six out of seven identified variants were novel. The c.64C>T (p.(Arg22Trp)) TFG/SPG57 variant (PB1 domain) is the second identified that underlies HSP, and we demonstrated its impact on TFG oligomerization in vitro. Patients did not present with visual impairment as observed in a previously reported SPG57 family (c.316C>T (p.(Arg106Cys)) in coiled-coil domain), suggesting unique contributions of the PB1 and coiled-coil domains in TFG complex formation/function and a possible phenotype correlation to variant location. Some families manifested marked phenotypic variations implying the possibility of modifier factors complicated by high inbreeding. Finally, additional genetic heterogeneity is expected in HSP Sudanese families. The remaining families might unravel new genes or uncommon modes of inheritance.
Elsayed, Liena E O; Mohammed, Inaam N; Hamed, Ahlam A A; Elseed, Maha A; Johnson, Adam; Mairey, Mathilde; Mohamed, Hassab Elrasoul S A; Idris, Mohamed N; Salih, Mustafa A M; El-sadig, Sarah M; Koko, Mahmoud E; Mohamed, Ashraf Y O; Raymond, Laure; Coutelier, Marie; Darios, Frédéric; Siddig, Rayan A; Ahmed, Ahmed K M A; Babai, Arwa M A; Malik, Hiba M O; Omer, Zulfa M B M; Mohamed, Eman O E; Eltahir, Hanan B; Magboul, Nasr Aldin A; Bushara, Elfatih E; Elnour, Abdelrahman; Rahim, Salah M Abdel; Alattaya, Abdelmoneim; Elbashir, Mustafa I; Ibrahim, Muntaser E; Durr, Alexandra; Audhya, Anjon; Brice, Alexis; Ahmed, Ammar E; Stevanin, Giovanni
2017-01-01
Hereditary spastic paraplegias (HSP) are the second most common type of motor neuron disease recognized worldwide. We investigated a total of 25 consanguineous families from Sudan. We used next-generation sequencing to screen 74 HSP-related genes in 23 families. Linkage analysis and candidate gene sequencing was performed in two other families. We established a genetic diagnosis in six families with autosomal recessive HSP (SPG11 in three families and TFG/SPG57, SACS and ALS2 in one family each). A heterozygous mutation in a gene involved in an autosomal dominant HSP (ATL1/SPG3A) was also identified in one additional family. Six out of seven identified variants were novel. The c.64C>T (p.(Arg22Trp)) TFG/SPG57 variant (PB1 domain) is the second identified that underlies HSP, and we demonstrated its impact on TFG oligomerization in vitro. Patients did not present with visual impairment as observed in a previously reported SPG57 family (c.316C>T (p.(Arg106Cys)) in coiled-coil domain), suggesting unique contributions of the PB1 and coiled-coil domains in TFG complex formation/function and a possible phenotype correlation to variant location. Some families manifested marked phenotypic variations implying the possibility of modifier factors complicated by high inbreeding. Finally, additional genetic heterogeneity is expected in HSP Sudanese families. The remaining families might unravel new genes or uncommon modes of inheritance. PMID:27601211
Grace, Mark A; Doosey, Michael H; Bart, Henry L; Naylor, Gavin J P
2015-04-22
The description of the pocket shark genus Mollisquama (M. parini Dolganov, 1984) is based on a single known specimen collected from the Nazca Ridge of the southeast Pacific Ocean. A second Mollisquama specimen has been captured in the central Gulf of Mexico establishing a considerable range extension and a parturition locality because the specimen has a healed vitelline scar. Both the holotype of M. parini and the Gulf of Mexico specimen possess the remarkable pocket gland with its large slit-like external opening located just above the pectoral fin. Features found on the Gulf of Mexico specimen that were not noted in the description of M. parini include a series of ventral abdominal photophore agglomerations and a modified dermal denticle surrounded by a radiating arrangement of denticles just posterior to the mouth. Based on a morphometric and meristic comparison of the Gulf of Mexico specimen with information in the description of M. parini, the Gulf of Mexico specimen is identified as Mollisquama sp. due to differences in tooth morphology and vertebral counts. Phylogenetic analysis of NADH2 gene sequences places Mollisquama sister to Dalatias plus Isistius within the family Dalatiidae.
Meta sequence analysis of human blood peptides and their parent proteins.
Bowden, Peter; Pendrak, Voitek; Zhu, Peihong; Marshall, John G
2010-04-18
Sequence analysis of the blood peptides and their qualities will be key to understanding the mechanisms that contribute to error in LC-ESI-MS/MS. Analysis of peptides and their proteins at the level of sequences is much more direct and informative than the comparison of disparate accession numbers. A portable database of all blood peptide and protein sequences with descriptor fields and gene ontology terms might be useful for designing immunological or MRM assays from human blood. The results of twelve studies of human blood peptides and/or proteins identified by LC-MS/MS and correlated against a disparate array of genetic libraries were parsed and matched to proteins from the human ENSEMBL, SwissProt and RefSeq databases by SQL. The reported peptide and protein sequences were organized into an SQL database with full protein sequences and up to five unique peptides in order of prevalence along with the peptide count for each protein. Structured query language or BLAST was used to acquire descriptive information in current databases. Sampling error at the level of peptides is the largest source of disparity between groups. Chi Square analysis of peptide to protein distributions confirmed the significant agreement between groups on identified proteins. Copyright 2010. Published by Elsevier B.V.
Gene finding in metatranscriptomic sequences.
Ismail, Wazim Mohammed; Ye, Yuzhen; Tang, Haixu
2014-01-01
Metatranscriptomic sequencing is a highly sensitive bioassay of functional activity in a microbial community, providing complementary information to the metagenomic sequencing of the community. The acquisition of the metatranscriptomic sequences will enable us to refine the annotations of the metagenomes, and to study the gene activities and their regulation in complex microbial communities and their dynamics. In this paper, we present TransGeneScan, a software tool for finding genes in assembled transcripts from metatranscriptomic sequences. By incorporating several features of metatranscriptomic sequencing, including strand-specificity, short intergenic regions, and putative antisense transcripts into a Hidden Markov Model, TransGeneScan can predict a sense transcript containing one or multiple genes (in an operon) or an antisense transcript. We tested TransGeneScan on a mock metatranscriptomic data set containing three known bacterial genomes. The results showed that TransGeneScan performs better than metagenomic gene finders (MetaGeneMark and FragGeneScan) on predicting protein coding genes in assembled transcripts, and achieves comparable or even higher accuracy than gene finders for microbial genomes (Glimmer and GeneMark). These results imply that, with the assistance of metatranscriptomic sequencing, we can obtain a broad and precise picture of the genes (and their functions) in a microbial community. TransGeneScan is available as open-source software on SourceForge at https://sourceforge.net/projects/transgenescan/.
Nikou, Mahdi Moshtaghi; Ramezani, Mohaddaseh; Amoozegar, Mohammad Ali; Rasouli, Mehrnoush; Fazeli, Seyed Abolhassan Shahzadeh; Schumann, Peter; de la Haba, Rafael R; Ventosa, Antonio
2015-10-01
A Gram-stain-positive actinobacterial strain, Miq-4T, was isolated from soil around Meighan wetland in the centre of Iran. Strain Miq-4T was strictly aerobic, catalase- and oxidase-positive. The isolate grew in the presence of 3–15 % (w/v) NaCl, at 20–40 °C and pH 6.0–11.0. The optimum NaCl, temperature and pH for growth were 7.0 %, 30 °C and 7.0–8.5, respectively. The cell wall of strain Miq-4T contained meso-diaminopimelic acid as the diamino acid and glucose and ribose as the whole-cell sugars. The polar lipid pattern consisted of diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, phosphatidylinositol and phosphatidylinositol mannoside. Strain Miq-4T synthesized cellular fatty acids of anteiso- and iso-branched types, including anteiso-C17 : 0, anteiso-C15 : 0 and iso-C16 : 0, and the major respiratory quinone was MK-9(H4). The G+C content of the genomic DNA was 68.2 mol%. Phylogenetic analysis based on 16S rRNA gene sequences and characteristic patterns of 16S rRNA gene signature nucleotides revealed that strain Miq-4T belongs to the family Glycomycetaceae and showed the closest phylogenetic similarity with Haloglycomyces albus YIM 92370T (94.1 % 16S rRNA gene sequence similarity). On the basis of phylogenetic analysis and phenotypic and chemotaxonomic characteristics, strain Miq-4T represents a novel species of a new genus in the family Glycomycetaceae, for which the name Salininema proteolyticum gen. nov., sp. nov. is proposed. The type strain of the type species is Miq-4T ( = IBRC-M 10908T = LMG 28391T). An emended description of the family Glycomycetaceae is also proposed in order to include features of the new genus.
GAN: a platform of genomics and genetics analysis and application in Nicotiana
Yang, Shuai; Zhang, Xingwei; Li, Huayang; Chen, Yudong
2018-01-01
Nicotiana is an important Solanaceae genus, and plays a significant role in modern biological research. Massive Nicotiana biological data have emerged from in-depth genomics and genetics studies. From big data to big discovery, large-scale analysis and application with new platforms is critical. Based on data accumulation, a comprehensive platform of Genomics and Genetics Analysis and Application in Nicotiana (GAN) has been developed, and is publicly available at http://biodb.sdau.edu.cn/gan/. GAN consists of four main sections: (i) Sources, a total of 5267 germplasm lines, along with detailed descriptions of associated characteristics, are all available on the Germplasm page, which can be queried using eight different inquiry modes. Seven fully sequenced species with accompanying sequences and detailed genomic annotation are available on the Genomics page. (ii) Genetics, detailed descriptions of 10 genetic linkage maps, constructed by different parents, 2239 KEGG metabolic pathway maps and 209 945 gene families across all catalogued genes, along with two co-linearity maps combining N. tabacum with available tomato and potato linkage maps are available here. Furthermore, 3 963 119 genome-SSRs, 10 621 016 SNPs, 12 388 PIPs and 102 895 reverse transcription-polymerase chain reaction primers, are all available to be used and searched on the Markers page. (iii) Tools, the genome browser JBrowse and five online bioinformatics tools, Blast, Primer3, SSR-detect, Nucl-Protein and E-PCR, are provided on the JBrowse and Tools pages. (iv) Auxiliary, all the datasets are shown on a Statistics page, and are available for download on a Download page. In addition, the user’s manual is provided on a Manual page in English and Chinese. GAN provides a user-friendly Web interface for searching, browsing and downloading the genomics and genetics datasets in Nicotiana. 
As far as we can ascertain, GAN is the most comprehensive source of bio-data available, and the most applicable resource for breeding, gene mapping, gene cloning, the study of the origin and evolution of polyploidy, and related studies in Nicotiana. Database URL: http://biodb.sdau.edu.cn/gan/ PMID:29688356
Sequence Composition and Gene Content of the Short Arm of Rye (Secale cereale) Chromosome 1
Fluch, Silvia; Kopecky, Dieter; Burg, Kornel; Šimková, Hana; Taudien, Stefan; Petzold, Andreas; Kubaláková, Marie; Platzer, Matthias; Berenyi, Maria; Krainer, Siegfried; Doležel, Jaroslav; Lelley, Tamas
2012-01-01
Background The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. Methodology/Principal Findings Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. Conclusions The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye. PMID:22328922
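The coverage figure quoted in the abstract above can be checked with simple arithmetic. The sketch below derives the arm length implied by 195,313,589 bp at 0.43× coverage, and the fraction of bases expected to be sampled at least once under an idealized Poisson (Lander-Waterman style) model. That model is an assumption introduced here for illustration; it is not stated in the abstract, it ignores amplification bias, and it does not reproduce the authors' 95% gene-identification estimate, which depends on gene and read lengths not given in the abstract. The function names are hypothetical.

```python
import math


def implied_target_length(total_bases: float, coverage: float) -> float:
    """Length of the sequenced target implied by total bases and fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases / coverage


def expected_fraction_covered(coverage: float) -> float:
    """Fraction of bases hit at least once, assuming reads land uniformly
    at random (Poisson idealization; an assumption, not the authors' model)."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    arm_length = implied_target_length(195_313_589, 0.43)
    print(f"implied 1RS arm length: {arm_length / 1e6:.1f} Mb")
    print(f"expected fraction of bases sampled: {expected_fraction_covered(0.43):.3f}")
```

Under these assumptions roughly a third of individual bases are sampled, which is consistent with the idea that gene detection at low coverage relies on hitting any part of a gene rather than covering it completely.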
Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.
Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J
2018-05-10
Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. The objective was to examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. The measures examined were type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015 the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%). 
Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.
Takiya, Daniela M.; Nessimian, Jorge L.
2016-01-01
Metrichia is assigned to the Ochrotrichiinae, a group of almost exclusively Neotropical microcaddisflies. Metrichia comprises over 100 described species and, despite its diversity, only one species has been described from Brazil so far. In this paper, we provide descriptions for 20 new species from 8 Brazilian states: M. acuminata sp. nov., M. azul sp. nov., M. bonita sp. nov., M. bracui sp. nov., M. caraca sp. nov., M. circuliforme sp. nov., M. curta sp. nov., M. farofa sp. nov., M. forceps sp. nov., M. formosinha sp. nov., M. goiana sp. nov., M. itabaiana sp. nov., M. longissima sp. nov., M. peluda sp. nov., M. rafaeli sp. nov., M. simples sp. nov., M. talhada sp. nov., M. tere sp. nov., M. ubajara sp. nov., and M. vulgaris sp. nov. DNA barcode sequences (577 bp of the mitochondrial gene COI) were generated for 13 of the new species and two previously known species of Metrichia resulting in 64 sequences. In addition, COI sequences were obtained for other genera of Ochrotrichiinae (Angrisanoia, Nothotrichia, Ochrotrichia, Ragatrichia, and Rhyacopsyche). DNA sequences and morphological data were integrated to evaluate species delimitations. K2P pairwise distances were calculated to generate a neighbor-joining tree. COI sequences also were submitted to ABGD and GMYC methods to assess ‘potential species’ delimitation. Analyses showed a conspicuous barcoding gap among Metrichia sequences (highest intraspecific divergence: 4.8%; lowest interspecific divergence: 12.6%). Molecular analyses also allowed the association of larvae and adults of Metrichia bonita sp. nov. from Mato Grosso do Sul, representing the first record of microcaddisfly larvae occurring in calcareous tufa (or travertine). ABGD results agreed with the morphological delimitation of Metrichia species, while GMYC estimated a slightly higher number of species, suggesting the division of two morphological species, each one into two potential species. 
Because this could be due to unbalanced sampling and the lack of morphological diagnostic characters, we have maintained these two species as undivided. PMID:27169001
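The K2P pairwise distances mentioned in the abstract above follow the standard Kimura 2-parameter formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences. The sketch below is a minimal illustration of that formula and of the barcoding-gap check; it is not the authors' pipeline. The function names and the choice to skip gaps and ambiguity codes are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: gaps and ambiguity codes are ignored
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


def has_barcoding_gap(max_intraspecific: float, min_interspecific: float) -> bool:
    """True when every between-species distance exceeds every within-species one."""
    return min_interspecific > max_intraspecific


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
    print(has_barcoding_gap(4.8, 12.6))  # divergences reported in the abstract, in %
```

A matrix of such distances is the usual input to neighbor-joining; tree construction itself is left out of this sketch.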
El-Sherry, S; Ogedengbe, M E; Hafeez, M A; Sayf-Al-Din, M; Gad, N; Barta, J R
2015-02-01
Unlike with Eimeria species infecting chickens, specific identification and nomenclature of Eimeria species infecting turkeys is complicated, and in the absence of molecular data, imprecise. In an attempt to reconcile contradictory data reported on oocyst morphometrics and biological descriptions of various Eimeria species infecting turkey, we established single oocyst derived lines of 5 important Eimeria species infecting turkeys, Eimeria meleagrimitis (USMN08-01 strain), Eimeria adenoeides (Guelph strain), Eimeria gallopavonis (Weybridge strain), Eimeria meleagridis (USAR97-01 strain), and Eimeria dispersa (Briston strain). Short portions (514 bp) of mitochondrial cytochrome c oxidase subunit I gene (mt COI) from each were amplified and sequenced. Comparison of these sequences showed sufficient species-specific sequence variation to recommend these short mt COI sequences as species-specific markers. Uniformity of oocyst features (dimensions and oocyst structure) of each pure line was observed. Additional morphological features of the oocysts of these species are described as useful for the microscopic differentiation of these Eimeria species. Combined molecular and morphometric data on these single species lines compared with the original species descriptions and more recent data have helped to clarify some confusing, and sometimes conflicting, features associated with these Eimeria spp. For example, these new data suggest that the KCH and KR strains of E. adenoeides reported previously represent 2 distinct species, E. adenoeides and E. meleagridis, respectively. Likewise, analysis of the Weybridge strain of E. adenoeides, which has long been used as a reference strain in various studies conducted on the pathogenicity of E. adenoeides, indicates that this coccidium is actually a strain of E. gallopavonis. We highly recommend mt COI sequence-based genotyping be incorporated into all studies using Eimeria spp. 
of turkeys to confirm species identifications and so that any resulting data can be associated correctly with a single named Eimeria species. © 2015 Poultry Science Association Inc.
Spread of Plasmids Carrying Multiple GES Variants
Cuzon, Gaelle; Bogaerts, Pierre; Bauraing, Caroline; Huang, Te-Din; Glupczynski, Youri
2016-01-01
Five GES-producing Enterobacteriaceae isolates that displayed an extended-spectrum β-lactamase (ESBL) phenotype harbored two GES variants: GES-7 ESBL and GES-6 carbapenemase. In all isolates, the two GES alleles were located on the same integron that was inserted into an 80-kb IncM1 self-conjugative plasmid. Whole-genome sequencing suggested in vivo horizontal gene transfer of the plasmid along with clonal diffusion of Enterobacter cloacae. To our knowledge, this is the first description in Europe of clustered Enterobacteriaceae isolates carrying two GES β-lactamases, of which one has extended activity toward carbapenems. PMID:27216071
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.
Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción
2016-02-27
In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. 
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Klenk, Hans-Peter; Lapidus, Alla; Chertkov, Olga; Copeland, Alex; Del Rio, Tijana Glavina; Nolan, Matt; Lucas, Susan; Chen, Feng; Tice, Hope; Cheng, Jan-Fang; Han, Cliff; Bruce, David; Goodwin, Lynne; Pitluck, Sam; Pati, Amrita; Ivanova, Natalia; Mavromatis, Konstantinos; Daum, Chris; Chen, Amy; Palaniappan, Krishna; Chang, Yun-Juan; Land, Miriam; Hauser, Loren; Jeffries, Cynthia D; Detter, John C; Rohde, Manfred; Abt, Birte; Pukall, Rüdiger; Göker, Markus; Bristow, James; Markowitz, Victor; Hugenholtz, Philip; Eisen, Jonathan A
2011-10-15
Bacillus tusciae Bonjour & Aragno 1994 is a hydrogen-oxidizing, thermoacidophilic spore former that lives as a facultative chemolithoautotroph in solfataras. Although 16S rRNA gene sequencing was well established at the time of the initial description of the organism, 16S sequence data were not available and the strain was placed into the genus Bacillus based on limited chemotaxonomic information. Despite the now obvious misplacement of strain T2 as a member of the genus Bacillus in 16S rRNA-based phylogenetic trees, the misclassification remained uncorrected for many years, which was likely due to the extremely difficult, analysis-hampering cultivation conditions and poor growth rate of the strain. Here we provide a taxonomic re-evaluation of strain T2T (= DSM 2912 = NBRC 15312) and propose its reclassification as the type strain of a new species, Kyrpidia tusciae, and the type species of the new genus Kyrpidia, which is a sister-group of Alicyclobacillus. The family Alicyclobacillaceae da Costa and Rainey, 2010 is emended. The 3,384,766 bp genome with its 3,323 protein-coding and 78 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.
LPA and PLG sequence variation and kringle IV-2 copy number in two populations.
Crawford, Dana C; Peng, Ze; Cheng, Jan-Fang; Boffelli, Dario; Ahearn, Magdalena; Nguyen, Dan; Shaffer, Tristan; Yi, Qian; Livingston, Robert J; Rieder, Mark J; Nickerson, Deborah A
2008-01-01
Lp(a) levels have long been recognized as a potential risk factor for coronary heart disease that is almost completely under genetic control. Much of the genetics impacting Lp(a) levels has been attributed to the highly polymorphic LPA kringle IV-2 copy number variant, and most of the variance in Lp(a) levels in populations of European descent is inversely correlated with kringle IV copy number. However, less of the variance is explained in African-descent populations for the same structural variation. African-descent populations have, on average, higher levels of Lp(a), suggesting other genetic factors contribute to Lp(a) level variability across populations. To identify potential cis-acting factors, we re-sequenced the gene LPA for single nucleotide polymorphism (SNP) discovery in 23 European-Americans and 24 African-Americans. We also re-sequenced the neighboring gene plasminogen (PLG) and genotyped the kringle IV copy number variant in the same reference samples. These data are the most comprehensive description of sequence variation in LPA and its relationship with the kringle IV copy number variant. With these data, we demonstrate that only a fraction of LPA sequence diversity has been previously documented. Also, we identify several high frequency SNPs present in the African-American sample but absent in the European-American sample. Finally, we show that SNPs within PLG are not in linkage disequilibrium with SNPs in LPA, and we show that kringle IV copy number variation is not in linkage disequilibrium with either LPA or PLG SNPs. Together, these data suggest that LPA SNPs could independently contribute to Lp(a) levels in the general population. Copyright (c) 2008 S. Karger AG, Basel.
Broderick, Nichole A; Raffa, Kenneth F; Goodman, Robert M; Handelsman, Jo
2004-01-01
Little is known about bacteria associated with Lepidoptera, the large group of mostly phytophagous insects comprising the moths and butterflies. We inventoried the larval midgut bacteria of a polyphagous foliivore, the gypsy moth (Lymantria dispar L.), whose gut is highly alkaline, by using traditional culturing and culture-independent methods. We also examined the effects of diet on microbial composition. Analysis of individual third-instar larvae revealed a high degree of similarity of microbial composition among insects fed on the same diet. DNA sequence analysis indicated that most of the PCR-amplified 16S rRNA genes belong to the gamma-Proteobacteria and low G+C gram-positive divisions and that the cultured members represented more than half of the phylotypes identified. Less frequently detected taxa included members of the alpha-Proteobacterium, Actinobacterium, and Cytophaga/Flexibacter/Bacteroides divisions. The 16S rRNA gene sequences from 7 of the 15 cultured organisms and 8 of the 9 sequences identified by PCR amplification diverged from previously reported bacterial sequences. The microbial composition of midguts differed substantially among larvae feeding on a sterilized artificial diet, aspen, larch, white oak, or willow. 16S rRNA analysis of cultured isolates indicated that an Enterococcus species, and culture-independent analysis indicated that an Enterobacter sp., were present in all larvae, regardless of the feeding substrate; the sequences of these two phylotypes varied less than 1% among individual insects. These results provide the first comprehensive description of the microbial diversity of a lepidopteran midgut and demonstrate that the plant species in the diet influences the composition of the gut bacterial community.
Bartonella dromedarii sp. nov. isolated from domesticated camels (Camelus dromedarius) in Israel.
Rasis, Michal; Rudoler, Nir; Schwartz, David; Giladi, Michael
2014-11-01
Bartonella spp. are fastidious, Gram-negative bacilli that cause a wide spectrum of diseases in humans. Most Bartonella spp. have adapted to a specific host, generally a domestic or wild mammal. Dromedary camels (Camelus dromedarius) have become a focus of growing public-health interest because they have been identified as a reservoir host for the Middle East respiratory syndrome coronavirus. Nevertheless, data on camel zoonoses are limited. We aimed to study the occurrence of Bartonella bacteremia among dromedaries in Israel. Nine of 51 (17.6%) camels were found to be bacteremic with Bartonella spp.; bacteremia levels ranged from five to >1000 colony-forming units/mL. Phylogenetic reconstruction based on the concatenated sequences of gltA and rpoB genes demonstrated that the dromedary Bartonella isolates are closely related to other ruminant-derived Bartonella spp., with B. bovis being the nearest relative. Using electron microscopy, the novel isolates were shown to be flagellated, whereas B. bovis is nonflagellated. Sequence comparison analysis of the housekeeping genes ftsZ, ribC, and groEL showed the highest homology to B. chomelii, B. capreoli, and B. birtlesii, respectively. Sequence analysis of gltA and rpoB revealed ∼96% identity to B. bovis, a previously suggested cutoff value for sequence-based differentiation of Bartonella spp., suggesting that this approach does not have sufficient discriminatory power for differentiating ruminant-related Bartonella spp. A comprehensive multilocus sequence typing (MLST) analysis based on nine genetic loci (gltA, rpoB, ftsZ, internal transcribed spacer (ITS), 16S rRNA, ribC, groEL, nuoG, and SsrA) identified seven sequence types of the new dromedary isolates. This is the first description of a Bartonella sp. from camelids. On the basis of a distinct reservoir and ecological niche, sequence analyses, and expression of flagella, we designate these isolates as a novel Bartonella sp. named Bartonella dromedarii sp. nov. 
Further studies are required to explore its zoonotic potential.
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.
Anderson, Olin D; Huo, Naxin; Gu, Yong Q
2013-06-01
The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Optimization of Multilocus Sequence Analysis for Identification of Species in the Genus Vibrio
Gabriel, Michael W.; Matsui, George Y.; Friedman, Robert
2014-01-01
Multilocus sequence analysis (MLSA) is an important method for identification of taxa that are not well differentiated by 16S rRNA gene sequences alone. In this procedure, concatenated sequences of selected genes are constructed and then analyzed. The effects that the number and the order of genes used in MLSA have on reconstruction of phylogenetic relationships were examined. The recA, rpoA, gapA, 16S rRNA gene, gyrB, and ftsZ sequences from 56 species of the genus Vibrio were used to construct molecular phylogenies, and these were evaluated individually and using various gene combinations. Phylogenies from two-gene sequences employing recA and rpoA in both possible gene orders were different. The addition of the gapA gene sequence, producing all six possible concatenated sequences, reduced the differences in phylogenies to degrees of statistical (bootstrap) support for some nodes. The overall statistical support for the phylogenetic tree, assayed on the basis of a reliability score (calculated from the number of nodes having bootstrap values of ≥80 divided by the total number of nodes) increased with increasing numbers of genes used, up to a maximum of four. No further improvement was observed from addition of the fifth gene sequence (ftsZ), and addition of the sixth gene (gyrB) resulted in lower proportions of strongly supported nodes. Reductions in the numbers of strongly supported nodes were also observed when maximum parsimony was employed for tree construction. Use of a small number of gene sequences in MLSA resulted in accurate identification of Vibrio species. PMID:24951781
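The two mechanical steps described in the abstract above, concatenating per-gene sequences in a chosen gene order and scoring a tree by the share of strongly supported nodes, can be sketched as follows. This is an illustrative sketch only: the function names and the input layout (a dictionary of per-gene alignments keyed by taxon) are assumptions, the sequences in the usage example are placeholders, and tree building and bootstrapping are assumed to happen in external software.

```python
from typing import Dict, List, Sequence


def concatenate_loci(
    sequences_by_gene: Dict[str, Dict[str, str]], gene_order: Sequence[str]
) -> Dict[str, str]:
    """Join per-gene aligned sequences for each taxon in the given gene order.

    Only taxa present for every listed gene are kept.
    """
    if not gene_order:
        raise ValueError("gene_order must list at least one gene")
    shared_taxa = set.intersection(*(set(sequences_by_gene[g]) for g in gene_order))
    return {
        taxon: "".join(sequences_by_gene[g][taxon] for g in gene_order)
        for taxon in sorted(shared_taxa)
    }


def reliability_score(bootstrap_values: List[float], threshold: float = 80) -> float:
    """Nodes with bootstrap support >= threshold divided by the total number of nodes."""
    if not bootstrap_values:
        raise ValueError("at least one node is required")
    strong = sum(1 for value in bootstrap_values if value >= threshold)
    return strong / len(bootstrap_values)


if __name__ == "__main__":
    # Placeholder data, not real Vibrio sequences.
    loci = {
        "recA": {"taxon_1": "AC", "taxon_2": "AT"},
        "rpoA": {"taxon_1": "GT", "taxon_2": "GA"},
    }
    print(concatenate_loci(loci, ["recA", "rpoA"]))
    print(concatenate_loci(loci, ["rpoA", "recA"]))  # same genes, other order
    print(reliability_score([95, 80, 79, 50]))
```

Keeping the gene order as an explicit argument makes it straightforward to generate every ordering of a gene set and compare the resulting trees, which is the comparison the abstract describes.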
Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri
2015-12-01
Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), Diversity (D, in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
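The idea of bundling germline V genes that cannot be told apart at a given read length can be illustrated with a much simpler stand-in than the likelihood-based method the abstract describes. The sketch below compares only the last `read_length` bases of each aligned germline sequence and merges genes whose Hamming distance in that window is at most `max_mismatches`, a crude proxy for sequencing error or mutation. The function names, the choice of the 3' window, and the fixed mismatch threshold are all assumptions made for this example, and the sequences in the usage example are placeholders.

```python
from itertools import combinations
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def ambiguous_groups(
    germlines: Dict[str, str], read_length: int, max_mismatches: int = 0
) -> List[List[str]]:
    """Group germline genes that look alike within the sequenced window.

    germlines maps gene name to an aligned sequence; all must share one length.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    names = sorted(germlines)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        window_a = germlines[a][-read_length:]
        window_b = germlines[b][-read_length:]
        if hamming(window_a, window_b) <= max_mismatches:
            parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


if __name__ == "__main__":
    # Placeholder sequences, not real germline V genes.
    toy = {"V1": "ACGTAAAA", "V2": "TCGTAAAA", "V3": "ACGTAATT"}
    print(ambiguous_groups(toy, read_length=4))
    print(ambiguous_groups(toy, read_length=8))
    print(ambiguous_groups(toy, read_length=8, max_mismatches=1))
```

Because merging is transitive, two genes can end up in the same group through a shared near neighbour even when they differ from each other by more than the threshold; a stricter grouping rule would need a different clustering step.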
Tashkandy, Nisreen; Sabban, Sari; Fakieh, Mohammad; ...
2016-06-16
Flavobacterium suncheonense is a member of the family Flavobacteriaceae in the phylum Bacteroidetes. Strain GH29-5T (DSM 17707T) was isolated from greenhouse soil in Suncheon, South Korea. F. suncheonense GH29-5T is part of the Genomic Encyclopedia of Bacteria and Archaea project. The 2,880,663 bp long draft genome consists of 54 scaffolds with 2739 protein-coding genes and 82 RNA genes. The genome of strain GH29-5T has 117 genes encoding peptidases but a small number of genes encoding carbohydrate active enzymes (51 CAZymes). Metallo and serine peptidases were found most frequently. Among CAZymes, eight glycoside hydrolase families, nine glycosyl transferase families, two carbohydrate binding module families and four carbohydrate esterase families were identified. Surprisingly, polysaccharide utilization loci (PULs) were not found in strain GH29-5T. Based on the coherent physiological and genomic characteristics, we suggest that F. suncheonense GH29-5T feeds on proteins rather than saccharides and lipids.
Chapman, Jarrod A.; Kirkness, Ewen F.; Simakov, Oleg; Hampson, Steven E.; Mitros, Therese; Weinmaier, Therese; Rattei, Thomas; Balasubramanian, Prakash G.; Borman, Jon; Busam, Dana; Disbennett, Kathryn; Pfannkoch, Cynthia; Sumin, Nadezhda; Sutton, Granger G.; Viswanathan, Lakshmi Devi; Walenz, Brian; Goodstein, David M.; Hellsten, Uffe; Kawashima, Takeshi; Prochnik, Simon E.; Putnam, Nicholas H.; Shu, Shengquiang; Blumberg, Bruce; Dana, Catherine E.; Gee, Lydia; Kibler, Dennis F.; Law, Lee; Lindgens, Dirk; Martinez, Daniel E.; Peng, Jisong; Wigge, Philip A.; Bertulat, Bianca; Guder, Corina; Nakamura, Yukio; Ozbek, Suat; Watanabe, Hiroshi; Khalturin, Konstantin; Hemmrich, Georg; Franke, André; Augustin, René; Fraune, Sebastian; Hayakawa, Eisuke; Hayakawa, Shiho; Hirose, Mamiko; Hwang, Jung Shan; Ikeo, Kazuho; Nishimiya-Fujisawa, Chiemi; Ogura, Atshushi; Takahashi, Toshio; Steinmetz, Patrick R. H.; Zhang, Xiaoming; Aufschnaiter, Roland; Eder, Marie-Kristin; Gorny, Anne-Kathrin; Salvenmoser, Willi; Heimberg, Alysha M.; Wheeler, Benjamin M.; Peterson, Kevin J.; Böttger, Angelika; Tischler, Patrick; Wolf, Alexander; Gojobori, Takashi; Remington, Karin A.; Strausberg, Robert L.; Venter, J. Craig; Technau, Ulrich; Hobmayer, Bert; Bosch, Thomas C. G.; Holstein, Thomas W.; Fujisawa, Toshitaka; Bode, Hans R.; David, Charles N.; Rokhsar, Daniel S.; Steele, Robert E.
2015-01-01
The freshwater cnidarian Hydra was first described in 1702 and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals. Today, Hydra is an important model for studies of axial patterning, stem cell biology and regeneration. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann–Mangold organizer, pluripotency genes and the neuromuscular junction. PMID:20228792
González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere
2014-12-17
Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fusion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon.
Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the process of speciation within the family. A candidate Vat gene was also identified using sequence previously unavailable, which demonstrates the advantages of genome assembly refinements when analyzing complex regions such as those containing clusters of highly similar genes.
Chiu, Shih-Hau; Chen, Chien-Chi; Yuan, Gwo-Fang; Lin, Thy-Hou
2006-06-15
The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation scheme is urgently needed to close the gap between the volume of new sequences produced and reliable functional annotation. This work proposes rules for automatically classifying fungal genes. The approach involves elucidating the enzyme classification rules hidden in the UniProt protein knowledgebase and then applying them for classification. The association algorithm Apriori is used to mine the relationship between the enzyme class and significant InterPro entries. The candidate rules are evaluated for their classificatory capacity. Five datasets were collected from Swiss-Prot to establish the annotation rules and were treated as the training sets. The TrEMBL entries were treated as the testing set. A correct enzyme classification rate of 70% was obtained for the prokaryote datasets and a similar rate of about 80% was obtained for the eukaryote datasets. The fungus training dataset, which lacks an enzyme class description, was also used to evaluate the fungus candidate rules. A total of 88 out of 5085 test entries were matched with the fungus rule set. These were otherwise poorly annotated using their functional descriptions. The feasibility of using the method presented here to classify enzyme classes based on the enzyme domain rules is evident. The rules may also be employed by protein annotators in manual annotation or implemented in an automatic annotation flowchart.
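The rule-mining idea in the abstract above (relating InterPro entries to an enzyme class) can be illustrated with a minimal sketch. This is not the authors' implementation and not full Apriori: it only scores single-entry rules of the form "InterPro entry → enzyme class" by support and confidence, which is the first level of an Apriori search. The identifiers (`IPR_A`, `EC 2`), the thresholds, and the function name `mine_rules` are hypothetical placeholders.

```python
from collections import Counter


def mine_rules(records, min_support=0.2, min_confidence=0.8):
    """Score one-item rules 'InterPro entry -> enzyme class'.

    records: list of (set_of_interpro_ids, enzyme_class) pairs.
    Returns sorted (entry, enzyme_class, support, confidence) tuples
    that meet both thresholds (thresholds are illustrative assumptions).
    """
    n = len(records)
    entry_counts = Counter()   # proteins carrying each entry
    pair_counts = Counter()    # proteins carrying entry AND class
    for entries, ec in records:
        for entry in entries:
            entry_counts[entry] += 1
            pair_counts[(entry, ec)] += 1

    rules = []
    for (entry, ec), count in pair_counts.items():
        support = count / n                       # P(entry and class)
        confidence = count / entry_counts[entry]  # P(class | entry)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entry, ec, support, confidence))
    return sorted(rules)


# Toy training set with made-up identifiers (not real annotations).
TOY = [
    ({"IPR_A", "IPR_B"}, "EC 2"),
    ({"IPR_A"}, "EC 2"),
    ({"IPR_A", "IPR_C"}, "EC 2"),
    ({"IPR_C"}, "EC 3"),
    ({"IPR_B", "IPR_C"}, "EC 3"),
]
RULES = mine_rules(TOY)
for entry, ec, support, confidence in RULES:
    print(f"{entry} -> {ec}  support={support:.2f}  confidence={confidence:.2f}")
```

On the toy data only `IPR_A -> EC 2` passes both thresholds; a rule kept this way could then be applied to unannotated test entries that carry the same InterPro entry.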
Bhumika, V; Srinivas, T N R; Ravinder, K; Kumar, P Anil
2013-06-01
A novel marine, Gram-stain-negative, oxidase- and catalase-positive, rod-shaped bacterium, designated strain AK6(T), was isolated from marine aquaculture pond water collected in Andhra Pradesh, India. The fatty acids were dominated by iso-C15:0, iso-C17:1ω9c, iso-C15:1 G, iso-C17:0 3-OH and anteiso-C15:0. Strain AK6(T) contained MK-7 as the sole respiratory quinone and phosphatidylethanolamine, one unidentified aminophospholipid, one unidentified phospholipid and seven unidentified lipids as polar lipids. The DNA G+C content of strain AK6(T) was 45.6 mol%. Phylogenetic analysis showed that strain AK6(T) formed a distinct branch within the family Cyclobacteriaceae and clustered with Aquiflexum balticum DSM 16537(T) and other members of the family Cyclobacteriaceae. 16S rRNA gene sequence analysis confirmed that Aquiflexum balticum DSM 16537(T) was the nearest neighbour, with pairwise sequence similarity of 90.1%, while sequence similarity with the other members of the family was <88.5%. Based on differentiating phenotypic characteristics and phylogenetic inference, strain AK6(T) is proposed as a representative of a new genus and species of the family Cyclobacteriaceae, as Mariniradius saccharolyticus gen. nov., sp. nov. The type strain of Mariniradius saccharolyticus is AK6(T) (=MTCC 11279(T)=JCM 17389(T)). Emended descriptions of the genus Aquiflexum and Aquiflexum balticum are also proposed.
Delimiting regulatory sequences of the Drosophila melanogaster Ddc gene.
Hirsh, J; Morgan, B A; Scholnick, S B
1986-01-01
We delimited sequences necessary for in vivo expression of the Drosophila melanogaster dopa decarboxylase gene Ddc. The expression of in vitro-altered genes was assayed following germ line integration via P-element vectors. Sequences between -209 and -24 were necessary for normally regulated expression, although genes lacking these sequences could be expressed at 10 to 50% of wild-type levels at specific developmental times. These genes showed components of normal developmental expression, which suggests that they retain some regulatory elements. All Ddc genes lacking the normal immediate 5'-flanking sequences were grossly deficient in larval central nervous system expression. Thus, this upstream region must contain at least one element necessary for this expression. A mutated Ddc gene without a normal TATA box-like sequence used the normal RNA start points, indicating that this sequence is not required for start point specificity. PMID:3099170
Partial characterization of new adenoviruses found in lizards.
Ball, Inna; Behncke, Helge; Schmidt, Volker; Geflügel, F T A; Papp, Tibor; Stöhr, Anke C; Marschang, Rachel E
2014-06-01
In the years 2011-2012, a consensus nested polymerase chain reaction was used for the detection of adenovirus (AdV) infection in reptiles. During this screening, three new AdVs were detected. One of these viruses was detected in three lizards from a group of green striped tree dragons (Japalura splendida). Another was detected in a green anole (Anolis carolinensis). A third virus was detected in a Jackson's chameleon (Chamaeleo jacksonii). Analysis of a portion of the DNA-dependent DNA polymerase genes of each of these viruses revealed that they all were different from one another and from all previously described reptilian AdVs. Phylogenetic analysis of the partial DNA polymerase gene sequence showed that all newly detected viruses clustered within the genus Atadenovirus. This is the first description of AdVs in these lizard species.
Andersen, Birgitte; Nielsen, Kristian F; Thrane, Ulf; Szaro, Tim; Taylor, John W; Jarvis, Bruce B
2003-01-01
Twenty-five Stachybotrys isolates from two previous studies have been examined and compared, using morphological, chemical and phylogenetic methods. The results show that S. chartarum sensu lato can be segregated into two chemotypes and one new species. The new species, S. chlorohalonata, differs morphologically from S. chartarum by having smooth conidia, being more restricted in growth and producing a green extracellular pigment on the medium CYA. S. chlorohalonata and S. chartarum also have different tri5, chs1 and tub1 gene fragment sequences. The two chemotypes of S. chartarum, chemotype S and chemotype A, have similar morphology but differ in production of metabolites. Chemotype S produces macrocyclic trichothecenes, satratoxins and roridins, while chemotype A produces atranones and dolabellanes. There is no difference between the two chemotypes in the tub1 gene fragment, but there is a one nucleotide difference in each of the tri5 and the chs1 gene fragments.
Reutter, Heiko; Keppler-Noreuil, Kim; E. Keegan, Catherine; Thiele, Holger; Yamada, Gen; Ludwig, Michael
2016-01-01
The Bladder-Exstrophy-Epispadias Complex (BEEC) represents the severe end of the uro-rectal malformation spectrum, and has a profound impact on continence, and on sexual and renal function. While previous reports of familial occurrence, increased recurrence among first-degree relatives, high concordance rates among monozygotic twins, and chromosomal aberrations were suggestive of causative genetic factors, the recent identification of copy number variations (CNVs), susceptibility regions and genes through the systematic application of array-based analysis, candidate gene and genome-wide association studies (GWAS) provides strong evidence. These findings in human BEEC cohorts are underscored by the recent description of BEEC(-like) murine knock-out models. Here, we discuss the current knowledge of the potential molecular mechanisms mediating abnormal uro-rectal development leading to the BEEC, demonstrating the importance of the ISL1 pathway in human and mouse, and propose SLC20A1 and CELSR3 as the first BEEC candidate genes, identified through systematic whole-exome sequencing (WES) in BEEC patients. PMID:27013921
Vives-Corrons, Joan-Lluis; Koralkova, Pavla; Grau, Josep M.; Mañú Pereira, Maria del Mar; Van Wijk, Richard
2013-01-01
Phosphofructokinase deficiency is a very rare autosomal recessive disorder that belongs to a group of rare inborn errors of metabolism called glycogen storage diseases. Here we report on a new mutation in the phosphofructokinase (PFK) gene PFKM identified in a 65-year-old woman who suffered from lifelong intermittent muscle weakness and painful spasms of random occurrence, episodic dark urine, and slight haemolytic anemia. After ruling out the most common causes of chronic haemolytic anemia, the study of a panel of 24 enzyme activities showed a markedly decreased PFK activity in red blood cells (RBCs) from the patient. DNA sequence analysis of the PFKM gene subsequently revealed a novel homozygous mutation: c.926A>G; p.Asp309Gly. This mutation is predicted to severely affect enzyme catalysis, thereby accounting for the observed enzyme deficiency. This case represents a prime example of classical PFK deficiency and is the first reported case of this very rare red blood cell disorder in Spain. PMID:24427140
Mauldin, E A; Wang, P; Evans, E; Cantner, C A; Ferracone, J D; Credille, K M; Casal, M L
2015-07-01
A minority of patients with nonsyndromic autosomal recessive congenital ichthyosis (ARCI) display mutations in NIPAL4 (ICHTHYIN). This protein plays a role in epidermal lipid metabolism, although the mechanism is unknown. The study describes a moderate form of ARCI in an extended pedigree of American Bulldogs that is linked to the gene encoding ichthyin. The gross phenotype was manifest as a disheveled pelage shortly after birth, generalized scaling, and adherent brown scale with erythema of the abdominal skin. Pedigree analysis indicated an autosomal recessive mode of inheritance. Ultrastructurally, the epidermis showed discontinuous lipid bilayers, unprocessed lipid within corneocytes, and abnormal lamellar bodies. Linkage analysis, performed by choosing simple sequence repeat markers and single-nucleotide polymorphisms near genes known to cause ARCI, revealed an association with NIPAL4. NIPAL4 was identified and sequenced using standard methods. No mutation was identified within the gene, but affected dogs had a SINE element 5' upstream of exon 1 in a highly conserved region. Of 545 DNA samples from American Bulldogs, 32 dogs (17 females, 15 males) were homozygous for the polymerase chain reaction fragment. All affected dogs were homozygous, with parents heterozygous for the insertion. Immunolabeling revealed an absence of ichthyin in the epidermis. This is the first description of ARCI associated with decreased expression of NIPAL4 in nonhuman species. © The Author(s) 2014.
Hu, Xiaozhong; Fan, Yangbo; Warren, Alan
2015-08-01
The benthic urostylid ciliate Apoholosticha sinica Fan et al., 2014 was isolated from a salt marsh at Blakeney, UK, and reinvestigated using light microscopy and small-subunit rRNA gene sequencing. Morphologically, it corresponds well with the original description. Several stages of divisional morphogenesis and physiological reorganization were also observed from which the following could be deduced: (i) the oral apparatus is completely newly built in the proter; (ii) frontal-ventral-transverse cirral anlage II does not produce a buccal cirrus; (iii) each of the posteriormost three or four anlagen contributes one transverse cirrus at its posterior end; (iv) a row of frontoterminal cirri originates from the rearmost frontal-ventral-transverse cirral anlage; (v) the last midventral row is formed from the penultimate frontal-ventral-transverse cirral anlage. Based on new data, two diagnostic features were added to the genus definition: (i) the midventral complex is composed of midventral pairs and midventral row and (ii) pretransverse ventral cirri are absent. Based on a combination of morphological and morphogenetic data, the genus Apoholosticha is assigned to the recently erected subfamily Nothoholostichinae Paiva et al., 2014, which is consistent with sequence comparison and phylogenetic analyses based on SSU rRNA gene data. It is also concluded that this benthic species, previously reported only from China, is not an endemic form.
Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen
2009-06-01
The aim was to obtain the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis by RACE PCR and then to characterize the gene. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene was obtained by 3'-RACE and 5'-RACE from Dysosma versipellis. This is the first report of the full cDNA sequence of Secoisolariciresinol Dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids, with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the Secoisolariciresinol Dehydrogenase gene was highly expressed in stem. Alignment of the amino acid sequence of Secoisolariciresinol Dehydrogenase indicated that there may be significant amino acid sequence differences among species. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis was obtained.
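The lengths reported in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that the ORF length counts one 3-bp stop codon in addition to the 278 codons and that the 112-bp 3' UTR figure already includes the poly(A) tail; both are assumptions, since the abstract does not state them explicitly. Variable names are illustrative.

```python
# Figures taken from the abstract.
UTR5_BP = 42            # 5' untranslated region
UTR3_BP = 112           # 3' untranslated region (assumed to include poly(A))
PROTEIN_AA = 278        # encoded amino acids
PROTEIN_MW_DA = 29253.3 # reported molecular weight

# Assumption: the ORF is 278 codons plus one stop codon.
STOP_CODON_BP = 3

orf_bp = PROTEIN_AA * 3 + STOP_CODON_BP
total_bp = UTR5_BP + orf_bp + UTR3_BP
mean_residue_da = PROTEIN_MW_DA / PROTEIN_AA

print(f"ORF length: {orf_bp} bp")
print(f"5' UTR + ORF + 3' UTR: {total_bp} bp")
print(f"Mean residue mass: {mean_residue_da:.1f} Da")
```

Under these assumptions the parts sum to 42 + 837 + 112 = 991 bp, which agrees with the reported full length.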
Gene and translation initiation site prediction in metagenomic sequences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John
2012-01-01
Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.
Scholz, Christian F P; Kilian, Mogens
2016-11-01
The genus Propionibacterium in the family Propionibacteriaceae consists of species of various habitats, including mature cheese, cattle rumen and human skin. Traditionally, these species have been grouped as either classical or cutaneous propionibacteria based on characteristic phenotypes and source of isolation. To re-evaluate the taxonomy of the family and to elucidate the interspecies relatedness we compared 162 public whole-genome sequences of strains representing species of the family Propionibacteriaceae. We found substantial discrepancies between the phylogenetic signals of 16S rRNA gene sequence analysis and our high-resolution core-genome analysis. To accommodate these discrepancies, and to address the long-standing issue of the taxonomically problematic Propionibacterium propionicum, we propose three novel genera, Acidipropionibacterium gen. nov., Cutibacterium gen. nov. and Pseudopropionibacterium gen. nov., and an amended description of the genus Propionibacterium. Furthermore, our genome-based analyses support the mounting evidence that the subdivision of Propionibacterium freudenreichii into subspecies is not warranted. Our proposals are supported by phylogenetic analyses, DNA G+C content, peptidoglycan composition and patterns of the gene losses and acquisitions in the cutaneous propionibacteria during their adaptation to the human host.
Paarup, Maiken; Friedrich, Michael W; Tindall, Brian J; Finster, Kai
2006-01-01
A psychrotolerant, obligately anaerobic, acetogenic bacterium designated strain SyrA5 was isolated from black anoxic sediment of a brackish fjord. Cells were Gram-positive, non-spore-forming rods. The isolate utilized H(2)/CO(2), CO, fructose, glucose, ethanol, ethylene glycol, glycerol, pyruvate, lactate, betaine and the methyl groups of several methoxylated benzoic derivatives such as syringate, trimethoxybenzoate and vanillate. The optimum temperature for growth was 29 degrees C, whilst slow growth occurred at 2 degrees C. The strain grew optimally with NaCl concentrations below 2.7% (w/v), but growth occurred up to 4.3% (w/v) NaCl. Growth was observed in the range from pH 5.9 to 8.5, with an optimum at pH 8. The G+C content was 44.1 mol%. Based upon 16S rRNA gene sequence analysis and DNA-DNA reassociation studies, the organism was classified in the genus Acetobacterium. Strain SyrA5 shared a 16S rRNA sequence similarity with A. carbinolicum of 100%, a fthfs gene (which codes for the N5,N10 tetrahydrofolate synthetase) sequence identity of 98.5-98.7% (amino acid sequence similarities were 99.4-100%) and a RNA-DNA hybridization homology of 64-68%. Despite a number of phenotypic differences between strain SyrA5 and A. carbinolicum, we propose including strain SyrA5 as a subspecies of A. carbinolicum, for which we propose the name Acetobacterium carbinolicum subspecies kysingense. The type strain is SyrA5 (=DSM 16427(T), ATCC BAA-990).
SIDR: simultaneous isolation and parallel sequencing of genomic DNA and total RNA from single cells.
Han, Kyung Yeon; Kim, Kyu-Tae; Joung, Je-Gun; Son, Dae-Soon; Kim, Yeon Jeong; Jo, Areum; Jeon, Hyo-Jeong; Moon, Hui-Sung; Yoo, Chang Eun; Chung, Woosung; Eum, Hye Hyeon; Kim, Sangmin; Kim, Hong Kwan; Lee, Jeong Eon; Ahn, Myung-Ju; Lee, Hae-Ock; Park, Donghyun; Park, Woong-Yang
2018-01-01
Simultaneous sequencing of the genome and transcriptome at the single-cell level is a powerful tool for characterizing genomic and transcriptomic variation and revealing correlative relationships. However, it remains technically challenging to analyze both the genome and transcriptome in the same cell. Here, we report a novel method for simultaneous isolation of genomic DNA and total RNA (SIDR) from single cells, achieving high recovery rates with minimal cross-contamination, as is crucial for accurate description and integration of the single-cell genome and transcriptome. For reliable and efficient separation of genomic DNA and total RNA from single cells, the method uses hypotonic lysis to preserve nuclear lamina integrity and subsequently captures the cell lysate using antibody-conjugated magnetic microbeads. Evaluating the performance of this method using real-time PCR demonstrated that it efficiently recovered genomic DNA and total RNA. Thorough data quality assessments showed that DNA and RNA simultaneously fractionated by the SIDR method were suitable for genome and transcriptome sequencing analysis at the single-cell level. The integration of single-cell genome and transcriptome sequencing by SIDR (SIDR-seq) showed that genetic alterations, such as copy-number and single-nucleotide variations, were more accurately captured by single-cell SIDR-seq compared with conventional single-cell RNA-seq, although copy-number variations positively correlated with the corresponding gene expression levels. These results suggest that SIDR-seq is potentially a powerful tool to reveal genetic heterogeneity and phenotypic information inferred from gene expression patterns at the single-cell level. © 2018 Han et al.; Published by Cold Spring Harbor Laboratory Press.
PMID:29208629
Novel primers for complete mitochondrial cytochrome b gene sequencing in mammals
Naidu, Ashwin; Fitak, Robert R.; Munguia-Vega, Adrian; Culver, Melanie
2011-01-01
Sequence-based species identification relies on the extent and integrity of sequence data available in online databases such as GenBank. When identifying species from a sample of unknown origin, partial DNA sequences obtained from the sample are aligned against existing sequences in databases. When the sequence from the matching species is not present in the database, high-scoring alignments with closely related sequences might produce unreliable results on species identity. For species identification in mammals, the cytochrome b (cyt b) gene has been identified to be highly informative; thus, large amounts of reference sequence data from the cyt b gene are much needed. To enhance availability of cyt b gene sequence data on a large number of mammalian species in GenBank and other such publicly accessible online databases, we identified a primer pair for complete cyt b gene sequencing in mammals. Using this primer pair, we successfully PCR amplified and sequenced the complete cyt b gene from 40 of 44 mammalian species representing 10 orders of mammals. We submitted 40 complete, correctly annotated, cyt b protein coding sequences to GenBank. To our knowledge, this is the first single primer pair to amplify the complete cyt b gene in a broad range of mammalian species. This primer pair can be used for the addition of new cyt b gene sequences and to enhance data available on species represented in GenBank. The availability of novel and complete gene sequences as high-quality reference data can improve the reliability of sequence-based species identification.
Pervasive sequence patents cover the entire human genome.
Rosenfeld, Jeffrey A; Mason, Christopher E
2013-01-01
The scope and eligibility of patents for genetic sequences have been debated for decades, but a critical case regarding gene patents (Association of Molecular Pathologists v. Myriad Genetics) is now reaching the US Supreme Court. Recent court rulings have supported the assertion that such patents can provide intellectual property rights on sequences as small as 15 nucleotides (15mers), but an analysis of all current US patent claims and the human genome presented here shows that 15mer sequences from all human genes match at least one other gene. The average gene matches 364 other genes as 15mers; the breast-cancer-associated gene BRCA1 has 15mers matching at least 689 other genes. Longer sequences (1,000 bp) still showed extensive cross-gene matches. Furthermore, 15mer-length claims from bovine and other animal patents could also claim as much as 84% of the genes in the human genome. In addition, when we expanded our analysis to full-length patent claims on DNA from all US patents to date, we found that 41% of the genes in the human genome have been claimed. Thus, current patents for both short and long nucleotide sequences are extraordinarily non-specific and create an uncertain, problematic liability for genomic medicine, especially in regard to targeted re-sequencing and other sequence diagnostic assays.
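The kind of 15mer cross-matching described in the abstract above can be sketched as a k-mer index over gene sequences. This is a simplified illustration, not the authors' pipeline: it considers only the forward strand, uses exact matches, and runs on made-up toy sequences. The names `kmers`, `shared_kmer_partners`, and `GENES` are hypothetical.

```python
def kmers(seq, k=15):
    """Return the set of all length-k substrings of seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_partners(genes, k=15):
    """Map each gene name to the set of other genes sharing at least one k-mer.

    genes: dict of gene name -> nucleotide sequence.
    """
    # Index: k-mer -> names of the genes that contain it.
    index = {}
    for name, seq in genes.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)

    partners = {name: set() for name in genes}
    for names in index.values():
        if len(names) > 1:          # k-mer occurs in more than one gene
            for name in names:
                partners[name] |= names - {name}
    return partners


# Toy sequences (invented for illustration, not real genes).
SHARED = "ACGTTAGCCTAGGAT"          # a 15mer present in gene_a and gene_b
GENES = {
    "gene_a": "TTT" + SHARED + "CCC",
    "gene_b": "GGA" + SHARED + "AAT",
    "gene_c": "CATCATCATCATCATCATCAT",
}
PARTNERS = shared_kmer_partners(GENES)
for name, others in sorted(PARTNERS.items()):
    print(name, "matches", len(others), "other gene(s)")
```

Counting `len(others)` per gene over a whole genome would give a per-gene figure of the same kind as the "matches N other genes" numbers quoted in the abstract; a fuller analysis would presumably also need to handle reverse complements.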
Reinprecht, Yarmilla; Yadegari, Zeinab; Perry, Gregory E.; Siddiqua, Mahbuba; Wright, Lori C.; McClean, Phillip E.; Pauls, K. Peter
2013-01-01
Legumes contain a variety of phytochemicals derived from the phenylpropanoid pathway that have important effects on human health as well as seed coat color, plant disease resistance and nodulation. However, the information about the genes involved in this important pathway is fragmentary in common bean (Phaseolus vulgaris L.). The objectives of this research were to isolate genes that function in and control the phenylpropanoid pathway in common bean, determine their genomic locations in silico in common bean and soybean, and analyze sequences of the 4CL gene family in two common bean genotypes. Sequences of phenylpropanoid pathway genes available for common bean or other plant species were aligned, and the conserved regions were used to design sequence-specific primers. The PCR products were cloned and sequenced and the gene sequences along with common bean gene-based (g) markers were BLASTed against the Glycine max v.1.0 genome and the P. vulgaris v.1.0 (Andean) early release genome. In addition, gene sequences were BLASTed against the OAC Rex (Mesoamerican) genome sequence assembly. In total, fragments of 46 structural and regulatory phenylpropanoid pathway genes were characterized in this way and placed in silico on common bean and soybean sequence maps. The maps contain over 250 common bean g and SSR (simple sequence repeat) markers and identify the positions of more than 60 additional phenylpropanoid pathway gene sequences, plus the putative locations of seed coat color genes. The majority of cloned phenylpropanoid pathway gene sequences were mapped to one location in the common bean genome but had two positions in soybean. The comparison of the genomic maps confirmed previous studies, which show that common bean and soybean share genomic regions, including those containing phenylpropanoid pathway gene sequences, with conserved synteny. 
Indels identified in the comparison of Andean and Mesoamerican common bean 4CL gene sequences might be used to develop inter-pool phenylpropanoid pathway gene-based markers. We anticipate that the information obtained by this study will simplify and accelerate selections of common bean with specific phenylpropanoid pathway alleles to increase the contents of beneficial phenylpropanoids in common bean and other legumes. PMID:24046770
Cui, Mingming; Hu, Ping; Wang, Tao; Tao, Jing; Zong, Shixiang
2017-01-01
Seabuckthorn carpenter moth, Eogystia hippophaecolus (Lepidoptera: Cossidae), is an important pest of sea buckthorn (Hippophae rhamnoides), which is a shrub that has significant ecological and economic value in China. E. hippophaecolus is highly cold tolerant, but limited studies have been conducted to elucidate the molecular mechanisms underlying its cold resistance. Here we sequenced the E. hippophaecolus transcriptome using RNA-Seq technology and performed de novo assembly from the short paired-end reads. We investigated the larval response to cold stress by comparing gene expression profiles between treatments. We obtained 118,034 unigenes, of which 22,161 were annotated with gene descriptions, conserved domains, gene ontology terms, and metabolic pathways. These resulted in 57 GO terms and 193 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. By comparing transcriptome profiles for differential gene expression, we identified many differentially expressed proteins and genes, including heat shock proteins and cuticular proteins which have previously been reported to be involved in cold resistance of insects. This study provides a global transcriptome analysis and an assessment of differential gene expression in E. hippophaecolus under cold stress. We found seven differentially expressed genes in common between developmental stages, which were verified with qPCR. Our findings facilitate future genomic studies aimed at improving our understanding of the molecular mechanisms underlying the response of insects to low temperatures. PMID:29131867
Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.
2016-01-01
Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175
Peng, Zhi-yu; Zhou, Xin; Li, Linchuan; Yu, Xiangchun; Li, Hongjiang; Jiang, Zhiqiang; Cao, Guangyu; Bai, Mingyi; Wang, Xingchun; Jiang, Caifu; Lu, Haibin; Hou, Xianhui; Qu, Lijia; Wang, Zhiyong; Zuo, Jianru; Fu, Xiangdong; Su, Zhen; Li, Songgang; Guo, Hongwei
2009-01-01
Plant hormones are small organic molecules that influence almost every aspect of plant growth and development. Genetic and molecular studies have revealed a large number of genes that are involved in responses to numerous plant hormones, including auxin, gibberellin, cytokinin, abscisic acid, ethylene, jasmonic acid, salicylic acid, and brassinosteroid. Here, we develop an Arabidopsis hormone database, which aims to provide a systematic and comprehensive view of genes participating in plant hormonal regulation, as well as morphological phenotypes controlled by plant hormones. Based on data from mutant studies, transgenic analysis and gene ontology (GO) annotation, we have identified a total of 1026 genes in the Arabidopsis genome that participate in plant hormone functions. Meanwhile, a phenotype ontology is developed to precisely describe myriad hormone-regulated morphological processes with standardized vocabularies. A web interface (http://ahd.cbi.pku.edu.cn) would allow users to quickly get access to information about these hormone-related genes, including sequences, functional category, mutant information, phenotypic description, microarray data and linked publications. Several applications of this database in studying plant hormonal regulation and hormone cross-talk will be presented and discussed. PMID:19015126
Exome-wide DNA capture and next generation sequencing in domestic and wild species.
Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon
2011-07-05
Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.
He, Hairong; Zhang, Yuejing; Ma, Zhaoxu; Li, Chuang; Liu, Chongxi; Zhou, Ying; Li, Lianjie; Wang, Xiangjing; Xiang, Wensheng
2015-05-01
A novel actinomycete, designated strain NEAU-B-8(T), was isolated from the rhizosphere soil of a peace lily (Spathiphyllum kochii) collected from Heilongjiang province, north-east China. Key morphological and physiological characteristics as well as chemotaxonomic features of strain NEAU-B-8(T) were congruent with the description of the genus Actinomycetospora, such as the major fatty acids, the whole-cell hydrolysates, the predominant menaquinone and the phospholipid profile. The 16S rRNA gene sequence analysis revealed that strain NEAU-B-8(T) shared the highest sequence similarities with Actinomycetospora lutea JCM 17982(T) (99.3% 16S rRNA gene sequence similarity), Actinomycetospora chlora TT07I-57(T) (98.4%), Actinomycetospora straminea IY07-55(T) (98.3%) and Actinomycetospora chibensis TT04-21(T) (98.2%); similarities to type strains of other species of this genus were lower than 98%. The phylogenetic tree based on 16S rRNA gene sequences showed that strain NEAU-B-8(T) formed a distinct branch with A. lutea JCM 17982(T) that was supported by a high bootstrap value of 97% in the neighbour-joining tree and was also recovered with the maximum-likelihood algorithm. However, the DNA-DNA relatedness between strain NEAU-B-8(T) and A. lutea JCM 17982(T) was found to be 50.6 ± 1.2%. Moreover, strain NEAU-B-8(T) differs from the other most closely related strains in phenotypic properties, such as maximum NaCl tolerance, hydrolysis of aesculin and decomposition of urea. On the basis of the morphological, physiological, chemotaxonomic, phylogenetic and DNA-DNA hybridization data, we conclude that strain NEAU-B-8(T) represents a novel species of the genus Actinomycetospora, named Actinomycetospora rhizophila sp. nov. The type strain is NEAU-B-8(T) (= CGMCC 4.7134(T) = DSM 46673(T)). © 2015 IUMS.
Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.
2015-01-01
This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
Chen, Ming; Henry, Nathan; Almsaeed, Abdullah; Zhou, Xiao; Wegrzyn, Jill; Ficklin, Stephen
2017-01-01
Tripal is an open source software package for developing biological databases with a focus on genetic and genomic data. It consists of a set of core modules that deliver essential functions for loading and displaying data records and associated attributes including organisms, sequence features and genetic markers. Beyond the core modules, community members are encouraged to contribute extension modules to build on the Tripal core and to customize Tripal for individual community needs. To expand the utility of the Tripal software system, particularly for RNASeq data, we developed two new extension modules. Tripal Elasticsearch enables fast, scalable searching of the entire content of a Tripal site as well as the construction of customized advanced searches of specific data types. We demonstrate the use of this module for searching assembled transcripts by functional annotation. A second module, Tripal Analysis Expression, houses and displays records from gene expression assays such as RNA sequencing. This includes biological source materials (biomaterials), gene expression values and protocols used to generate the data. In the case of an RNASeq experiment, this would reflect the individual organisms and tissues used to produce sequencing libraries, the normalized gene expression values derived from the RNASeq data analysis and a description of the software or code used to generate the expression values. The module will load data from common flat file formats including standard NCBI Biosample XML. Data loading, display options and other configurations can be controlled by authorized users in the Drupal administrative backend. Both modules are open source, include usage documentation, and can be found in the Tripal organization’s GitHub repository. Database URL: Tripal Elasticsearch module: https://github.com/tripal/tripal_elasticsearch Tripal Analysis Expression module: https://github.com/tripal/tripal_analysis_expression PMID:29220446
Nedashkovskaya, Olga I; Kim, Song-Gun; Zhukova, Natalia V; Lee, Jung-Sook; Mikhailov, Valery V
2016-11-01
A strictly aerobic, Gram-stain-negative, rod-shaped, motile by gliding and yellow-pigmented bacterium, designated strain 7Alg 4T, was isolated from the green alga Cladophora stimpsonii. Phylogenetic analysis based on 16S rRNA gene sequences revealed that the novel strain was affiliated to the family Flavobacteriaceae of the phylum Bacteroidetes, and was most closely related to the recognized species of the genera Lacinutrix and Flavirhabdus, with 16S rRNA gene sequence similarities of 95.1-98.1 % and 97.0 %, respectively. Strain 7Alg 4T grew in the presence of 1-5 % NaCl and at 4-32 °C, and hydrolysed aesculin, gelatin, starch and Tween 80. The prevalent fatty acids were iso-C15 : 1 G, iso-C15 : 0, iso-C17 : 0 3-OH, iso-C15 : 0 3-OH and C15 : 0. The polar lipid profile was characterized by the presence of phosphatidylethanolamine, three unidentified aminolipids and four unidentified lipids. The major respiratory quinone was MK-6. The DNA G+C content was 31.9 mol%. On the basis of the differences in 16S rRNA gene sequences, chemotaxonomic and phenotypic characteristics, it is suggested that strain 7Alg 4T represents a novel species of the genus Lacinutrix, for which the name Lacinutrix cladophorae sp. nov. is proposed. The type strain is 7Alg 4T (=KCTC 23036T=KMM 6381T). Reclassification of Flavirhabdus iliipiscaria as Lacinutrix iliipiscaria comb. nov. and an emended description of the genus Lacinutrix are also proposed.
An expression database for roots of the model legume Medicago truncatula under salt stress
2009-01-01
Background Medicago truncatula is a model legume whose genome is currently being sequenced by an international consortium. Abiotic stresses such as salt stress limit plant growth and crop productivity, including those of legumes. We anticipate that studies on M. truncatula will shed light on other economically important legumes across the world. Here, we report the development of a database called MtED that contains gene expression profiles of the roots of M. truncatula based on time-course salt stress experiments using the Affymetrix Medicago GeneChip. Our hope is that MtED will provide information to assist in improving abiotic stress resistance in legumes. Description The results of our microarray experiment with roots of M. truncatula under 180 mM sodium chloride were deposited in the MtED database. Additionally, sequence and annotation information regarding microarray probe sets were included. MtED provides functional category analysis based on Gene and GeneBins Ontology, and other Web-based tools for querying and retrieving query results, browsing pathways and transcription factor families, showing metabolic maps, and comparing and visualizing expression profiles. Utilities like mapping probe sets to genome of M. truncatula and In-Silico PCR were implemented by BLAT software suite, which were also available through MtED database. Conclusion MtED was built in the PHP script language and as a MySQL relational database system on a Linux server. It has an integrated Web interface, which facilitates ready examination and interpretation of the results of microarray experiments. It is intended to help in selecting gene markers to improve abiotic stress resistance in legumes. MtED is available at http://bioinformatics.cau.edu.cn/MtED/. PMID:19906315
Curated eutherian third party data gene data sets.
Premzl, Marko
2016-03-01
The freely available eutherian genomic sequence data sets have advanced the scientific field of genomics. Of note, future revisions of gene data sets were expected, owing to the incompleteness of public eutherian genomic sequence assemblies and potential genomic sequence errors. The eutherian comparative genomic analysis protocol was proposed as guidance for protecting against potential genomic sequence errors in public eutherian genomic sequences. The protocol was applicable in updates of 7 major eutherian gene data sets, including 812 complete coding sequences deposited in the European Nucleotide Archive as curated third party data gene data sets.
Opik, M; Metsis, M; Daniell, T J; Zobel, M; Moora, M
2009-10-01
* Knowledge of the diversity of arbuscular mycorrhizal fungi (AMF) in natural ecosystems is a major bottleneck in mycorrhizal ecology. Here, we aimed to apply 454 sequencing, which provides a new level of descriptive power, to assess the AMF diversity in a boreonemoral forest.
* 454 sequencing reads of the small subunit ribosomal RNA (SSU rRNA) gene of Glomeromycota were assigned to sequence groups by BLAST searches against a custom-made annotated sequence database.
* We detected 47 AMF taxa in the roots of 10 plant species in a 10 × 10 m plot, which is almost the same as the number of plant species in the whole studied forest. There was a significant difference between AMF communities in the roots of forest specialist plant species and in the roots of habitat generalist plant species. Forest plant species hosted 22 specialist AMF taxa, and the generalist plants shared all but one AMF taxon with forest plants, including globally distributed generalist fungi. Those AMF taxa that have been globally recorded only in forest ecosystems were significantly over-represented in the roots of forest plant species.
* Our findings suggest that partner specificity in AM symbiosis may occur at the level of ecological groups, rather than at the species level, of both plant and fungal partners.
Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon
2014-11-01
The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
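The user-adjustable limits described in the abstract above (minimum and maximum exon length, maximum bp per gene, maximum total bp) can be illustrated with a minimal sketch. This is not EXONSAMPLER's code or interface: the `Exon` class, the `sample_exons` and `to_bed` functions, and the coordinates are all hypothetical, and the selection here is a simple greedy pass in input order. Unlike the real program, this sketch skips over-long exons rather than sampling them partially.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    """Hypothetical exon record; coordinates are BED-style (0-based, end-exclusive)."""
    gene: str
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start


def sample_exons(exons, min_len, max_len, max_bp_per_gene, max_total_bp):
    """Greedily keep exons, in input order, that respect every limit."""
    picked = []
    bp_per_gene = {}
    total_bp = 0
    for exon in exons:
        if not (min_len <= exon.length <= max_len):
            continue  # outside the acceptable exon length range
        used = bp_per_gene.get(exon.gene, 0)
        if used + exon.length > max_bp_per_gene:
            continue  # would exceed the per-gene budget
        if total_bp + exon.length > max_total_bp:
            continue  # would exceed the budget for the whole collection
        picked.append(exon)
        bp_per_gene[exon.gene] = used + exon.length
        total_bp += exon.length
    return picked


def to_bed(exons):
    """Format the selection as tab-separated BED lines (chrom, start, end, name)."""
    return [f"{e.chrom}\t{e.start}\t{e.end}\t{e.gene}" for e in exons]


# Made-up coordinates, for illustration only.
exons = [
    Exon("IFNG", "chr5", 100, 300),   # 200 bp, kept
    Exon("IFNG", "chr5", 500, 560),   # 60 bp, shorter than min_len
    Exon("IFNG", "chr5", 900, 1200),  # 300 bp, would push IFNG past 400 bp
    Exon("TLR1", "chr6", 50, 350),    # 300 bp, kept
]
selected = sample_exons(exons, min_len=100, max_len=1000,
                        max_bp_per_gene=400, max_total_bp=10_000)
bed_lines = to_bed(selected)
```

A real design would also need the ordering options the abstract mentions (5 prime, 3 prime, external or internal exons), which could be handled by sorting each gene's exons before the greedy pass.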
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones
Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio
2004-01-01
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
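The proportions quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the variable names are illustrative, and the combined coding-region total is a derived figure that the abstract itself does not report.

```python
# Counts taken directly from the abstract above.
validated_gene_candidates = 21_037
loci_without_good_orf = 1_377
mapped_variants_in_genes = 72_027
nonsynonymous_snps = 13_215
nonsense_snps = 315
coding_indels = 452

# Share of gene candidates lacking a good protein-coding ORF
# (reported as 6.5% in the abstract).
pct_without_orf = 100 * loci_without_good_orf / validated_gene_candidates

# Derived here, not stated in the abstract: the sum of the three
# coding-region variant classes and its share of all mapped variants.
coding_variants = nonsynonymous_snps + nonsense_snps + coding_indels
pct_coding_variants = 100 * coding_variants / mapped_variants_in_genes

print(f"{pct_without_orf:.1f}% of gene candidates lack a good ORF")
print(f"{coding_variants} listed coding-region variants "
      f"({pct_coding_variants:.1f}% of mapped variants)")
```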
Nearing saturation of cancer driver gene discovery.
Hsiehchen, David; Hsieh, Antony
2018-06-15
Extensive sequencing efforts of cancer genomes such as The Cancer Genome Atlas (TCGA) have been undertaken to uncover bona fide cancer driver genes, which has enhanced our understanding of cancer and revealed therapeutic targets. However, the number of driver gene mutations is bounded, indicating that there must be a point when further sequencing efforts will be excessive. We found that there was a significant positive correlation between sample size and identified driver gene mutations across 33 cancers sequenced by the TCGA, which is expected if additional sequencing is still leading to the identification of more driver genes. However, the rate of new cancer driver genes being discovered with larger samples is declining rapidly. Our analysis provides a general guide for determining which cancer types would likely benefit from additional sequencing efforts, particularly those with relatively high rates of cancer driver gene discovery. Our results argue that past strategies of indiscriminately sequencing as many specimens as possible for all cancer types are becoming inefficient. In addition, without significant investments into applying our knowledge of cancer genomes, we risk sequencing more cancer genomes for the sake of sequencing rather than meaningful patient benefit.
Description of new genera and species of marine cyanobacteria from the Portuguese Atlantic coast.
Brito, Ângela; Ramos, Vitor; Mota, Rita; Lima, Steeve; Santos, Arlete; Vieira, Jorge; Vieira, Cristina P; Kaštovský, Jan; Vasconcelos, Vitor M; Tamagnini, Paula
2017-06-01
Aiming at increasing the knowledge on marine cyanobacteria from temperate regions, we previously isolated and characterized 60 strains from the Portuguese foreshore and evaluated their potential to produce secondary metabolites. About 15% of the obtained 16S rRNA gene sequences showed less than 97% similarity to sequences in the databases, revealing novel biodiversity. Herein, seven of these strains were extensively characterized and their classification was re-evaluated. The present study led to the proposal of five new taxa, three genera (Geminobacterium, Lusitaniella, and Calenema) and two species (Hyella patelloides and Jaaginema litorale). Geminobacterium atlanticum LEGE 07459 is a chroococcalean that shares morphological characteristics with other unicellular cyanobacterial genera but has a distinct phylogenetic position and particular ultrastructural features. The description of the Pleurocapsales Hyella patelloides LEGE 07179 includes novel molecular data for members of this genus. The filamentous isolates of Lusitaniella coriacea - LEGE 07167, 07157 and 06111 - constitute a very distinct lineage, and seem to be ubiquitous on the Portuguese coast. Jaaginema litorale LEGE 07176 has distinct characteristics compared to its marine counterparts, and our analysis indicates that this genus is polyphyletic. The Synechococcales Calenema singularis possesses wider trichomes than Leptolyngbya, and its phylogenetic position reinforces the establishment of this new genus. Copyright © 2017 Elsevier Inc. All rights reserved.
SMITH: a LIMS for handling next-generation sequencing workflows
2014-01-01
Background Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods SMITH is a web application with a MySQL server at the backend. Wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project developed SMITH. The data base schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized making it easier for biologists and analysts to navigate the data. Automation also helps saving time. 
The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine that performs de-multiplexing, quality control, alignments, etc. Conclusions SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis. PMID:25471934
SMITH: a LIMS for handling next-generation sequencing workflows.
Venco, Francesco; Vaskin, Yuriy; Ceol, Arnaud; Muller, Heiko
2014-01-01
Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). SMITH is a web application with a MySQL server at the backend. Wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project developed SMITH. The data base schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized making it easier for biologists and analysts to navigate the data. Automation also helps saving time. 
The workflows are available through an API provided by the workflow management system. The parameters and input data are passed to the workflow engine that performs de-multiplexing, quality control, alignments, etc. SMITH standardizes, automates, and speeds up sequencing workflows. Annotation of data with key-value pairs facilitates meta-analysis.
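The attribute-value annotation described in this abstract can be illustrated with a minimal sketch. It is not SMITH's actual schema: the table and column names (`sample`, `sample_attribute`, `attr_key`, `attr_value`) are hypothetical, and an in-memory SQLite database stands in for the MySQL backend so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a sample table and an attribute-value table.

    Table and column names are illustrative assumptions, not SMITH's schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sample (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE sample_attribute (
            sample_id  INTEGER NOT NULL REFERENCES sample(id),
            attr_key   TEXT NOT NULL,
            attr_value TEXT NOT NULL
        );
        """
    )
    return conn


def add_sample(conn, sample_id, name):
    """Register a sample."""
    conn.execute("INSERT INTO sample (id, name) VALUES (?, ?)", (sample_id, name))


def annotate(conn, sample_id, key, value):
    """Attach one free-text key-value pair to a sample."""
    conn.execute(
        "INSERT INTO sample_attribute (sample_id, attr_key, attr_value) VALUES (?, ?, ?)",
        (sample_id, key, value),
    )


def find_samples(conn, key, value):
    """Return the names of all samples carrying the given key-value pair."""
    rows = conn.execute(
        "SELECT s.name FROM sample s "
        "JOIN sample_attribute a ON a.sample_id = s.id "
        "WHERE a.attr_key = ? AND a.attr_value = ? "
        "ORDER BY s.name",
        (key, value),
    )
    return [row[0] for row in rows]
```

Because keys are not fixed columns, a new kind of metadata can be recorded without altering the schema, which is the flexibility the abstract attributes to the attribute-value design.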
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Trivedi, G.; Sreedasyam, A.
2010-07-06
Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes.
The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
2018-01-01
FAM230C, a long intergenic non-coding RNA (lincRNA) gene in human chromosome 13 (chr13), is a member of lincRNA genes termed family with sequence similarity 230. An analysis using bioinformatics search tools and alignment programs was undertaken to determine properties of FAM230C and its related genes. Results reveal that the DNA translocation element, the Translocation Breakpoint Type A (TBTA) sequence, which consists of satellite DNA, Alu elements, and AT-rich sequences, is embedded in the FAM230C gene. Eight lincRNA genes related to FAM230C also carry the TBTA sequences. These genes were formed from a large segment of the 3’ half of the FAM230C sequence duplicated in chr22, and are specifically in regions of low copy repeats (LCR22s), in or close to the 22q11.2 region. 22q11.2 is a chromosomal segment that undergoes a high rate of DNA translocation and is prone to genetic deletions. FAM230C-related genes present in other chromosomes do not carry the TBTA motif and were formed from the 5’ half region of the FAM230C sequence. These findings identify a high specificity in lincRNA gene formation by gene sequence duplication in different chromosomes. PMID:29668722
Carlson, M; Celenza, J L; Eng, F J
1985-01-01
The SUC gene family of Saccharomyces contains six structural genes for invertase (SUC1 through SUC5 and SUC7) which are located on different chromosomes. Most yeast strains do not carry all six SUC genes and instead carry natural negative (suc0) alleles at some or all SUC loci. We determined the physical structures of SUC and suc0 loci. Except for SUC2, which is an unusual member of the family, all of the SUC genes are located very close to telomeres and are flanked by homologous sequences. On the centromere-proximal side of the gene, the conserved region contains X sequences, which are sequences found adjacent to telomeres (C. S. M. Chan and B.-K. Tye, Cell 33:563-573, 1983). On the other side of the gene, the homology includes about 4 kilobases of flanking sequence and then extends into a Y' element, which is an element often found distal to the X sequence at telomeres (Chan and Tye, Cell 33:563-573, 1983). Thus, these SUC genes and flanking sequences are embedded in telomere-adjacent sequences. Chromosomes carrying suc0 alleles (except suc20) lack SUC structural genes and portions of the conserved flanking sequences. The results indicate that the dispersal of SUC genes to different chromosomes occurred by rearrangements of chromosome telomeres. Images PMID:3018485
Mahrouki, Sihem; Perilli, Mariagrazia; Bourouis, Amel; Chihi, Hela; Ferjani, Mustapha; Ben Moussa, Mohamed; Amicosante, Gianfranco; Belhadj, Omrane
2013-08-01
The aim of this study was to investigate the prevalence and the emergence of plasmid-mediated quinolone resistance among broad-spectrum beta-lactam-resistant Proteus mirabilis and Morganella morganii clinical isolates recovered in the Military Hospital in Tunisia. Of 200 strains examined, 50 exhibited resistance to quinolones. Quinolone resistance determinants (qnr and aac(6')-Ib-cr) were characterized by multiplex PCR and sequencing. Chromosomal quinolone resistance mutations in the quinolone resistance-determining region (QRDR) and class 1 integron characterization were analysed by PCR and sequencing. The clonal relationship between the isolates was studied by pulsed-field gel electrophoresis (PFGE). Fourteen isolates harboured qnrA6 and among them 8 (57%) were extended-spectrum beta-lactamase (ESBL) producers, whilst 12 (85%) isolates harboured blaDHA-1. Mutations in the QRDR were detected in gyrA (Ser83Ile, Glu87Lys), gyrB (Ser464Phe), and parC (Ser80Ile). qnrA6 and blaDHA-1 genes were found embedded in complex sul1-type class 1 integrons. A gene cassette carrying aac(6')-Ib-cr was found located in the class 1 integron upstream of the qacEΔ1 gene. According to the PFGE analysis, the isolates were clonally unrelated. This is the first description in North Africa of class 1 integrons carrying blaDHA-1, qnrA6 gene, and aac(6')-Ib-cr determinants in clinical strains of Proteus mirabilis and Morganella morganii.
Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R
2005-09-01
We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
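The idea of finding oligo sequences overrepresented in promoters of coregulated genes can be sketched as follows. This is an illustrative simplification and not the MotifFinder program itself: the scoring (fraction of promoters containing a k-mer, compared against a background set with a pseudocount) and all function names are assumptions made for the example.

```python
from collections import Counter


def kmer_presence(promoters, k):
    """Count in how many promoters each k-mer occurs at least once."""
    counts = Counter()
    for seq in promoters:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per promoter.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_kmers(foreground, background, k, min_ratio=2.0):
    """Return {k-mer: enrichment ratio} for k-mers overrepresented in the foreground.

    The ratio compares the fraction of foreground promoters containing the k-mer
    with the corresponding background fraction; a pseudocount of one promoter
    avoids division by zero (an assumption of this sketch).
    """
    fg = kmer_presence(foreground, k)
    bg = kmer_presence(background, k)
    results = {}
    for kmer, n in fg.items():
        fg_frac = n / len(foreground)
        bg_frac = (bg.get(kmer, 0) + 1) / (len(background) + 1)
        ratio = fg_frac / bg_frac
        if ratio >= min_ratio:
            results[kmer] = ratio
    return results
```

With toy promoters that all contain the ACGT core and a background that lacks it, `enriched_kmers(foreground, background, 4)` reports ACGT as enriched. A real analysis would add a significance test and examine the nucleotides flanking the core, as the abstract describes.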
Sequence-based model of gap gene regulatory network.
Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria
2014-01-01
The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature.
We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3) functionally important sites are not exclusively located in cis-regulatory elements, but are rather dispersed throughout the regulatory region. It is of importance that some of the sites with high functional impact in hb, Kr and kni regulatory regions coincide with strong sites annotated and verified in DNase I footprint assays.
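The site regulatory weight defined in this abstract can be written out as a short sketch. The abstract does not state how the difference is normalized, so dividing by the residual sum of squares of the full model is an assumption made here; the function names are likewise illustrative.

```python
def rss(observed, predicted):
    """Residual sum of squares between observed and predicted expression values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def regulatory_weight(observed, predicted_all_sites, predicted_without_site):
    """Normalized increase in RSS when one binding site is excluded from the model.

    Assumption: the difference is normalized by the RSS of the model that uses
    all annotated sites. The abstract only says 'normalized difference'.
    """
    rss_all = rss(observed, predicted_all_sites)
    rss_without = rss(observed, predicted_without_site)
    if rss_all == 0:
        raise ValueError("RSS of the full model is zero; the weight is undefined")
    return (rss_without - rss_all) / rss_all
```

Under this convention a weight near zero means that removing the site barely changes the fit, while a large positive weight marks a site whose removal degrades it.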
Patel, Sejal; Roncaglia, Paola; Lovering, Ruth C
2015-06-06
People with an autistic spectrum disorder (ASD) display a variety of characteristic behavioral traits, including impaired social interaction, communication difficulties and repetitive behavior. This complex neurodevelopmental disorder is known to be associated with a combination of genetic and environmental factors. Neurexins and neuroligins play a key role in synaptogenesis and neurexin-neuroligin adhesion is one of several processes that have been implicated in autism spectrum disorders. In this report we describe the manual annotation of a selection of gene products known to be associated with autism and/or the neurexin-neuroligin-SHANK complex and demonstrate how a focused annotation approach leads to the creation of more descriptive Gene Ontology (GO) terms, as well as an increase in both the number of gene product annotations and their granularity, thus improving the data available in the GO database. The manual annotations we describe will impact on the functional analysis of a variety of future autism-relevant datasets. Comprehensive gene annotation is an essential aspect of genomic and proteomic studies, as the quality of gene annotations incorporated into statistical analysis tools affects the effective interpretation of data obtained through genome wide association studies, next generation sequencing, proteomic and transcriptomic datasets.
Walter, Lutz; Petersen, Beatrix
2017-02-01
The killer immunoglobulin-like receptors (KIR) as well as their MHC class I ligands display enormous genetic diversity and polymorphism in macaque species. Signals resulting from interaction between KIR or CD94/NKG2 receptors and their cognate MHC class I proteins essentially regulate the activity of natural killer (NK) cells. Macaque and human KIR share many features, such as clonal expression patterns, gene copy number variations, specificity for particular MHC class I allotypes, or epistasis between KIR and MHC class I genes that influence susceptibility and resistance to immunodeficiency virus infection. In this review article we also annotated publicly available rhesus macaque BAC clone sequences and provide the first description of the CD94-NKG2 genomic region. Besides the presence of genes that are orthologous to human NKG2A and NKG2F, this region contains three NKG2C paralogues. Hence, the genome of rhesus macaques contains moderately expanded and diversified NKG2 genes in addition to highly diversified KIR genes. The presence of two diversified NK cell receptor families in one species has not been described before and is expected to require a complex MHC-dependent regulation of NK cells. © 2016 John Wiley & Sons Ltd.
Wang, Min; Li, Min; Liu, Yue-Sheng; Lei, Si-Min; Xiao, Yan-Feng
2017-11-01
The aim of the study was to provide a descriptive analysis of familial male-limited precocious puberty (FMPP), which is a rare inherited disease caused by heterozygous constitutively activating mutations of the luteinizing hormone/choriogonadotropin receptor gene (LHCGR). The patient was a ten-month-old boy, presenting with penile enlargement, pubic hair formation, and spontaneous erections. Based on the clinical manifestations and laboratory data, including sexual characteristics, serum testosterone levels, GnRH stimulation test, and bone age, this boy was diagnosed with peripheral precocious puberty. Subsequently the precocious puberty-related genes were analyzed by direct DNA sequencing of amplified PCR products from the patient and his parents. Genetic analysis revealed a novel heterozygous missense mutation c.1732G>C (Asp578His) of the LHCGR gene exon11 in the patient, which had never been reported. His parents had no mutations. After combined treatment with aromatase inhibitor letrozole and anti-androgen spironolactone for six months, the patient's symptoms were controlled. The findings in this study expand the mutation spectrum of the LHCGR gene, and provide molecular evidence for the etiologic diagnosis as well as for the genetic counseling and prenatal diagnosis in the family.
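The reported variant c.1732G>C (Asp578His) can be checked for internal consistency with a small sketch. It assumes the usual coding-DNA numbering in which position 1 is the A of the ATG start codon. The helper names are illustrative, and because the abstract does not state which aspartate codon occurs in the reference sequence, both are tested.

```python
ASP_CODONS = {"GAT", "GAC"}  # standard genetic code: aspartate
HIS_CODONS = {"CAT", "CAC"}  # standard genetic code: histidine


def codon_of(cdna_pos):
    """Map a 1-based coding-DNA position to (codon number, position within codon)."""
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions are 1-based")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_number, pos_in_codon = codon_of(1732)
# Position 1732 is the first base of codon 578, and replacing the leading G
# of either aspartate codon with C yields a histidine codon.
consistent = codon_number == 578 and all(
    substitute(codon, pos_in_codon, "C") in HIS_CODONS for codon in ASP_CODONS
)
```

The nucleotide-level and protein-level descriptions in the abstract therefore agree with each other.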
EGenBio: A Data Management System for Evolutionary Genomics and Biodiversity
Nahum, Laila A; Reynolds, Matthew T; Wang, Zhengyuan O; Faith, Jeremiah J; Jonna, Rahul; Jiang, Zhi J; Meyer, Thomas J; Pollock, David D
2006-01-01
Background: Evolutionary genomics requires management and filtering of large numbers of diverse genomic sequences for accurate analysis and inference on evolutionary processes of genomic and functional change. We developed Evolutionary Genomics and Biodiversity (EGenBio) to begin to address this. Description: EGenBio is a system for manipulation and filtering of large numbers of sequences, integrating curated sequence alignments and phylogenetic trees, managing evolutionary analyses, and visualizing their output. EGenBio is organized into three conceptual divisions, Evolution, Genomics, and Biodiversity. The Genomics division includes tools for selecting pre-aligned sequences from different genes and species, and for modifying and filtering these alignments for further analysis. Species searches are handled through queries that can be modified based on a tree-based navigation system and saved. The Biodiversity division contains tools for analyzing individual sequences or sequence alignments, whereas the Evolution division contains tools involving phylogenetic trees. Alignments are annotated with analytical results and modification history using our PRAED format. A miscellaneous Tools section and Help framework are also available. EGenBio was developed around our comparative genomic research and a prototype database of mtDNA genomes. It utilizes MySQL-relational databases and dynamic page generation, and calls numerous custom programs. Conclusion: EGenBio was designed to serve as a platform for tools and resources to ease combined analysis in evolution, genomics, and biodiversity. PMID:17118150
Hennebert, Elise; Maldonado, Barbara; Ladurner, Peter; Flammang, Patrick; Santos, Romana
2015-01-01
Adhesive secretions occur in both aquatic and terrestrial animals, in which they perform diverse functions. Biological adhesives can therefore be remarkably complex and involve a large range of components with different functions and interactions. However, being mainly protein based, biological adhesives can be characterized by classical molecular methods. This review compiles experimental strategies that were successfully used to identify, characterize and obtain the full-length sequence of adhesive proteins from nine biological models: echinoderms, barnacles, tubeworms, mussels, sticklebacks, slugs, velvet worms, spiders and ticks. A brief description and practical examples are given for a variety of tools used to study adhesive molecules at different levels from genes to secreted proteins. In most studies, proteins, extracted from secreted materials or from adhesive organs, are analysed for the presence of post-translational modifications and submitted to peptide sequencing. The peptide sequences are then used directly for a BLAST search in genomic or transcriptomic databases, or to design degenerate primers to perform RT-PCR, both allowing the recovery of the sequence of the cDNA coding for the investigated protein. These sequences can then be used for functional validation and recombinant production. In recent years, the dual proteomic and transcriptomic approach has emerged as the best way leading to the identification of novel adhesive proteins and retrieval of their complete sequences. PMID:25657842
Archaebacterial rhodopsin sequences: Implications for evolution
NASA Technical Reports Server (NTRS)
Lanyi, J. K.
1991-01-01
It was proposed over 10 years ago that the archaebacteria represent a separate kingdom which diverged very early from the eubacteria and eukaryotes. It follows that investigations of archaebacterial characteristics might reveal features of early evolution. So far, two genes, one for bacteriorhodopsin and another for halorhodopsin, both from Halobacterium halobium, have been sequenced. We cloned and sequenced the gene coding for the polypeptide of another one of these rhodopsins, a halorhodopsin in Natronobacterium pharaonis. Peptide sequencing of cyanogen bromide fragments, and immuno-reactions of the protein and synthetic peptides derived from the C-terminal gene sequence, confirmed that the open reading frame was the structural gene for the pharaonis halorhodopsin polypeptide. The flanking DNA sequences of this gene, as well as those of other bacterial rhodopsins, were compared to previously proposed archaebacterial consensus sequences. In pairwise comparisons of the open reading frame with DNA sequences for bacterio-opsin and halo-opsin from Halobacterium halobium, silent divergences were calculated. These indicate very considerable evolutionary distance between each pair of genes, even in the same organism. In spite of this, the three protein sequences show extensive similarities, indicating strong selective pressures.
Li, Wen Hui; Jia, Wan Zhong; Qu, Zi Gang; Xie, Zhi Zhou; Luo, Jian Xun; Yin, Hong; Sun, Xiao Lin; Blaga, Radu; Fu, Bao Quan
2013-04-01
A total of 16 Taenia multiceps isolates collected from naturally infected sheep or goats in Gansu Province, China were characterized by sequences of mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The complete cox1 gene was amplified for individual T. multiceps isolates by PCR, ligated to pMD18T vector, and sequenced. Sequence analysis indicated that out of 16 T. multiceps isolates 10 unique cox1 gene sequences of 1,623 bp were obtained with sequence variation of 0.12-0.68%. The results showed that the cox1 gene sequences were highly conserved among the examined T. multiceps isolates. However, they were quite different from those of the other Taenia species. Phylogenetic analysis based on complete cox1 gene sequences revealed that T. multiceps isolates were composed of 3 genotypes and distinguished from the other Taenia species.
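The sequence variation figures in this abstract can be related to absolute site counts with a short sketch. It assumes that variation was measured as the pairwise percentage of differing sites over the full 1,623 bp alignment, which the abstract does not state explicitly; the function names are illustrative.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * mismatches / len(seq_a)


def variation_range(sequences):
    """Return (min, max) pairwise percent difference across a set of aligned sequences."""
    values = [percent_difference(a, b) for a, b in combinations(sequences, 2)]
    return min(values), max(values)


# For a 1,623 bp gene, 2 differing sites round to 0.12% and 11 round to 0.68%.
low = round(100.0 * 2 / 1623, 2)
high = round(100.0 * 11 / 1623, 2)
```

Under that assumption, the reported range of 0.12-0.68% would correspond to roughly 2 to 11 differing nucleotides between pairs of unique cox1 sequences.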
Li, Wen Hui; Jia, Wan Zhong; Qu, Zi Gang; Xie, Zhi Zhou; Luo, Jian Xun; Yin, Hong; Sun, Xiao Lin; Blaga, Radu
2013-01-01
A total of 16 Taenia multiceps isolates collected from naturally infected sheep or goats in Gansu Province, China were characterized by sequences of mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The complete cox1 gene was amplified for individual T. multiceps isolates by PCR, ligated to pMD18T vector, and sequenced. Sequence analysis indicated that out of 16 T. multiceps isolates 10 unique cox1 gene sequences of 1,623 bp were obtained with sequence variation of 0.12-0.68%. The results showed that the cox1 gene sequences were highly conserved among the examined T. multiceps isolates. However, they were quite different from those of the other Taenia species. Phylogenetic analysis based on complete cox1 gene sequences revealed that T. multiceps isolates were composed of 3 genotypes and distinguished from the other Taenia species. PMID:23710087
Computational analyses of mammalian lactate dehydrogenases: human, mouse, opossum and platypus LDHs.
Holmes, Roger S; Goldberg, Erwin
2009-10-01
Computational methods were used to predict the amino acid sequences and gene locations for mammalian lactate dehydrogenase (LDH) genes and proteins using genome sequence databanks. Human LDHA, LDHC and LDH6A genes were located in tandem on chromosome 11, while LDH6B and LDH6C genes were on chromosomes 15 and 12, respectively. Opossum LDHC and LDH6B genes were located in tandem with the opossum LDHA gene on chromosome 5 and contained 7 (LDHA and LDHC) or 8 (LDH6B) exons. An amino acid sequence prediction for the opossum LDH6B subunit gave an extended N-terminal sequence, similar to the human and mouse LDH6B sequences, which may support the export of this enzyme into mitochondria. The platypus genome contained at least 3 LDH genes encoding LDHA, LDHB and LDH6B subunits. Phylogenetic studies and sequence analyses indicated that LDHA, LDHB and LDH6B genes are present in all mammalian genomes examined, including a monotreme species (platypus), whereas the LDHC gene may have arisen more recently in marsupial mammals.
Computational analyses of mammalian lactate dehydrogenases: human, mouse, opossum and platypus LDHs
Holmes, Roger S; Goldberg, Erwin
2009-01-01
Computational methods were used to predict the amino acid sequences and gene locations for mammalian lactate dehydrogenase (LDH) genes and proteins using genome sequence databanks. Human LDHA, LDHC and LDH6A genes were located in tandem on chromosome 11, while LDH6B and LDH6C genes were on chromosomes 15 and 12, respectively. Opossum LDHC and LDH6B genes were located in tandem with the opossum LDHA gene on chromosome 5 and contained 7 (LDHA and LDHC) or 8 (LDH6B) exons. An amino acid sequence prediction for the opossum LDH6B subunit gave an extended N-terminal sequence, similar to the human and mouse LDH6B sequences, which may support the export of this enzyme into mitochondria. The platypus genome contained at least 3 LDH genes encoding LDHA, LDHB and LDH6B subunits. Phylogenetic studies and sequence analyses indicated that LDHA, LDHB and LDH6B genes are present in all mammalian genomes examined, including a monotreme species (platypus), whereas the LDHC gene may have arisen more recently in marsupial mammals. PMID:19679512
Rowe, Will; Baker, Kate S; Verner-Jeffreys, David; Baker-Austin, Craig; Ryan, Jim J; Maskell, Duncan; Pearce, Gareth
2015-01-01
Antimicrobial resistance remains a growing and significant concern in human and veterinary medicine. Current laboratory methods for the detection and surveillance of antimicrobial resistant bacteria are limited in their effectiveness and scope. With the rapidly developing field of whole genome sequencing beginning to be utilised in clinical practice, the ability to interrogate sequencing data quickly and easily for the presence of antimicrobial resistance genes will become increasingly important and useful for informing clinical decisions. Additionally, use of such tools will provide insight into the dynamics of antimicrobial resistance genes in metagenomic samples such as those used in environmental monitoring. Here we present the Search Engine for Antimicrobial Resistance (SEAR), a pipeline and web interface for detection of horizontally acquired antimicrobial resistance genes in raw sequencing data. The pipeline provides gene information, abundance estimation and the reconstructed sequence of antimicrobial resistance genes; it also provides web links to additional information on each gene. The pipeline utilises clustering and read mapping to annotate full-length genes relative to a user-defined database. It also uses local alignment of annotated genes to a range of online databases to provide additional information. We demonstrate SEAR's application in the detection and abundance estimation of antimicrobial resistance genes in two novel environmental metagenomes, 32 human faecal microbiome datasets and 126 clinical isolates of Shigella sonnei. We have developed a pipeline that contributes to the improved capacity for antimicrobial resistance detection afforded by next generation sequencing technologies, allowing for rapid detection of antimicrobial resistance genes directly from sequencing data. SEAR uses raw sequencing data via an intuitive interface so can be run rapidly without requiring advanced bioinformatic skills or resources. 
Finally, we show that SEAR is effective in detecting antimicrobial resistance genes in metagenomic and isolate sequencing data from both environmental metagenomes and sequencing data from clinical isolates.
USDA-ARS's Scientific Manuscript database
Hemoglobin-y gene of channel catfish, Ictalurus punctatus, was cloned and sequenced. Total RNA from head kidneys was isolated, reverse transcribed and amplified. The sequence of the channel catfish hemoglobin-y gene consists of 600 nucleotides. Analysis of the nucleotide sequence reveals one o...
Joshi, R K; Mohanty, S; Subudhi, E; Nayak, S
2010-09-08
Turmeric (Curcuma longa), an important asexually reproducing spice crop of the family Zingiberaceae is highly susceptible to bacterial and fungal pathogens. The identification of resistance gene analogs holds great promise for development of resistant turmeric cultivars. Degenerate primers designed based on known resistance genes (R-genes) were used in combinations to elucidate resistance gene analogs from Curcuma longa cultivar surama. The three primers resulted in amplicons with expected sizes of 450-600 bp. The nucleotide sequence of these amplicons was obtained through sequencing; their predicted amino acid sequences compared to each other and to the amino acid sequences of known R-genes revealed significant sequence similarity. The finding of conserved domains, viz., kinase-1a, kinase-2 and hydrophobic motif, provided evidence that the sequences belong to the NBS-LRR class gene family. The presence of tryptophan as the last residue of kinase-2 motif further qualified them to be in the non-TIR-NBS-LRR subfamily of resistance genes. A cluster analysis based on the neighbor-joining method was carried out using Curcuma NBS analogs together with several resistance gene analogs and known R-genes, which classified them into two distinct subclasses, corresponding to clades N3 and N4 of non-TIR-NBS sequences described in plants. The NBS analogs that we isolated can be used as guidelines to eventually isolate numerous R-genes in turmeric.
Miñana-Galbis, David; Farfàn, Maribel; Lorén, J Gaspar; Fusté, M Carmen
2010-03-01
The use of reference strains is a critical element for the quality control of different assays, from the development of molecular methods to the evaluation of antimicrobial activities. Most of the strains used in these assays are not type strains and some of them are cited erroneously because of subsequent reclassifications and descriptions of novel species. In this study, we propose that the reference strain Aeromonas hydrophila CIP 57.50 be reclassified as Aeromonas salmonicida CIP 57.50 based on phenotypic characterization and sequence analyses of the cpn60, dnaJ, gyrB and rpoD genes.
Beneitez, David; Carrera, Alícia; Duran-Suárez, Joan Ramón; Paz, Victoria; León, Antonio; García Talavera, Juan
2006-01-01
Hb Hope [beta136(H14)Gly --> Asp (GGT --> GAT)] has been found alone or in combination with other globin gene mutations in several African-American families, as well as in Japanese, Thai, Laotian, Cuban and Mauritanian families. We report the hematological and molecular characteristics of a heterozygous association of Hb Hope with beta0-thalassemia (thal) in a Spanish patient, in whom the level of expression of abnormal hemoglobin (Hb) by cation exchange high performance liquid chromatography (HPLC) and electrophoresis suggested initially a homozygous expression of the abnormal Hb, although sequencing of the polymerase chain reaction (PCR)-amplified beta-globin gene demonstrated a heterozygous genotype for Hb Hope. To the best of our knowledge, this is the first description of a case of Hb Hope in a Spanish family.
Santana, Flávia A; Nunes, Francis M F; Vieira, Carlos U; Machado, Maria Alice M S; Kerr, Warwick E; Silva, Wilson A; Bonetti, Ana Maria
2006-03-01
We have compared gene expression, using the Differential Display Reverse Transcriptase-Polymerase Chain Reaction (DDRT-PCR) technique, by means of mRNA profiles in Melipona scutellaris during ontogenetic postembryonic development, in adult workers, and in both natural and Juvenile Hormone III-induced adult queens. Six of the nine ESTs described here were differentially expressed in phase L1 or L2, or in both, suggesting that key mechanisms in the development of Melipona scutellaris are regulated in these stages. The combination HT11G-AP05 revealed in L1 and L2 a product that matches a thioredoxin reductase protein domain in Clostridium sporogenes, an important protein during cellular oxidoreduction processes. This study represents the first molecular evidence of differential gene expression profiles toward a description of the genetic developmental traits in the genus Melipona.
Bleeker-Wagemakers, E M; Zweije-Hofman, I; Gal, A
1988-11-01
A 15-year-old male patient with the typical ocular symptoms of Norrie disease is described. Additionally, he presents severe mental retardation, growth disturbances, hypogonadism, and increased susceptibility to infections. This complex syndrome is apparently segregating through three generations: four other male relatives of the patient were blind from birth and died from recurrent infections between the ages of three to 15 months. The DNA sequence of the DXS7 locus (L1.28 probe), known to be closely linked to the Norrie gene, was not found in the patient's DNA. This result suggests that the more complex clinical picture seen is the result of a deletion of the X chromosome spanning DXS7, the Norrie gene, and several neighbouring loci. A detailed clinical description of the patient is given and compared to that of similar cases.
Lv, Qiang; Chen, Ming; Xu, Haiyan; Song, Yuqin; Sun, Zhihong; Dan, Tong; Sun, Tiansong
2013-07-04
Using 16S rRNA, dnaA, murC and pyrG gene sequences, we examined the phylogenetic relationships among closely related Leuconostoc citreum species. Seven Leu. citreum strains originally isolated from sourdough were characterized by PCR amplification of the dnaA, murC and pyrG genes, and the amplified sequences were determined to assess their suitability as phylogenetic markers. We then estimated genetic distances and constructed phylogenetic trees from the 16S rRNA gene and the three housekeeping genes, combining our data with the corresponding published sequences. The topologies of the three housekeeping gene trees were consistent with that of the 16S rRNA gene tree. Sequence homology among closely related Leu. citreum species differed between the genes, ranging from 75.5% to 97.2% for dnaA, 50.2% to 99.7% for murC, 65.0% to 99.8% for pyrG and 98.5% to 100% for the 16S rRNA gene. The phylogenetic relationships inferred from the three housekeeping genes were highly consistent with those from the 16S rRNA gene, while the genetic distances for the housekeeping genes were much higher than for the 16S rRNA gene. Consequently, the dnaA, murC and pyrG genes are suitable for the classification and identification of closely related Leu. citreum species.
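The per-gene similarity ranges reported above are pairwise comparisons between aligned sequences. As a minimal sketch only, assuming the sequences are already aligned to equal length and that gap columns are skipped (the abstract does not state how the authors computed their values), percent identity can be calculated as follows. The function name and the gap convention are illustrative assumptions.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same alignment, so they have equal
    length. This is an illustrative measure, not the method of the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences carry a residue
    matches = 0   # compared columns where the residues agree
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Applied to every pair of strains for one gene, the minimum and maximum of these values would give a range of the kind quoted in the abstract.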
Lactobacillus heilongjiangensis sp. nov., isolated from Chinese pickle.
Gu, Chun Tao; Li, Chun Yan; Yang, Li Jie; Huo, Gui Cheng
2013-11-01
A Gram-stain-positive bacterial strain, S4-3(T), was isolated from traditional pickle in Heilongjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, pheS gene sequence analysis, rpoA gene sequence analysis, dnaK gene sequence analysis, fatty acid methyl ester (FAME) analysis, determination of DNA G+C content, DNA-DNA hybridization and an analysis of phenotypic features. Strain S4-3(T) showed 97.9-98.7 % 16S rRNA gene sequence similarities, 84.4-94.1 % pheS gene sequence similarities and 94.4-96.9 % rpoA gene sequence similarities to the type strains of Lactobacillus nantensis, Lactobacillus mindensis, Lactobacillus crustorum, Lactobacillus futsaii, Lactobacillus farciminis and Lactobacillus kimchiensis. dnaK gene sequence similarities between S4-3(T) and Lactobacillus nantensis LMG 23510(T), Lactobacillus mindensis LMG 21932(T), Lactobacillus crustorum LMG 23699(T), Lactobacillus futsaii JCM 17355(T) and Lactobacillus farciminis LMG 9200(T) were 95.4, 91.5, 90.4, 91.7 and 93.1 %, respectively. Based upon the data obtained in the present study, a novel species, Lactobacillus heilongjiangensis sp. nov., is proposed and the type strain is S4-3(T) ( = LMG 26166(T) = NCIMB 14701(T)).
Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C
2003-01-01
Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L
1980-01-01
The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes are identical to those in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs: a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA, and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clones h19 and h22 are compared, areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone gene. PMID:7443547
Kim, Young-Ok; Park, Sooyeon; Nam, Bo-Hye; Jung, Yong-Taek; Kim, Dong-Gyun; Bae, Kyung Sook; Yoon, Jung-Hoon
2014-06-01
A Gram-stain-negative, non-motile, coccoid, ovoid or rod-shaped bacterial strain, designated RSS3-C1(T), was isolated from a golden sea squirt (Halocynthia aurantium) collected from the East Sea, South Korea. Strain RSS3-C1(T) was found to grow optimally at 20-25 °C, at pH 7.0-8.0 and in the presence of 2.0% (w/v) NaCl. Phylogenetic trees based on 16S rRNA gene sequences revealed that strain RSS3-C1(T) clustered with the type strains of Lutimonas vermicola and Aestuariicola saemankumensis. Strain RSS3-C1(T) exhibited 98.8% 16S rRNA gene sequence similarity to each type strain. Strain RSS3-C1(T) contained MK-6 as the predominant menaquinone and iso-C(15 : 0), iso-C(17 : 0) 3-OH and anteiso-C(15 : 0) as the major fatty acids. The major polar lipids of strain RSS3-C1(T) were phosphatidylethanolamine and two unidentified lipids. The DNA G+C content of strain RSS3-C1(T) was 39.2 mol%, and DNA-DNA relatedness to the type strains of Lutimonas vermicola and Aestuariicola saemankumensis was 21±5.3 and 26±7.5 %, respectively. The differential phenotypic properties, together with its phylogenetic and genetic distinctiveness, revealed that strain RSS3-C1(T) is separated from Lutimonas vermicola and Aestuariicola saemankumensis. On the basis of the data presented, strain RSS3-C1(T) is considered to represent a novel species of the genus Lutimonas, for which the name Lutimonas halocynthiae sp. nov. is proposed. The type strain is RSS3-C1(T) ( = KCTC 32537(T) = CECT 8444(T)). In this study, it is also proposed that Aestuariicola saemankumensis should be reclassified as a member of the genus Lutimonas, as Lutimonas saemankumensis comb. nov. (type strain SMK-142(T) = KCTC 22171(T) = CCUG 55329(T)), and the description of the genus Lutimonas is emended.
The pig X and Y Chromosomes: structure, sequence, and evolution
Skinner, Benjamin M.; Sargent, Carole A.; Churcher, Carol; Hunt, Toby; Herrero, Javier; Loveland, Jane E.; Dunn, Matt; Louzada, Sandra; Fu, Beiyuan; Chow, William; Gilbert, James; Austin-Guest, Siobhan; Beal, Kathryn; Carvalho-Silva, Denise; Cheng, William; Gordon, Daria; Grafham, Darren; Hardy, Matt; Harley, Jo; Hauser, Heidi; Howden, Philip; Howe, Kerstin; Lachani, Kim; Ellis, Peter J.I.; Kelly, Daniel; Kerry, Giselle; Kerwin, James; Ng, Bee Ling; Threadgold, Glen; Wileman, Thomas; Wood, Jonathan M.D.; Yang, Fengtang; Harrow, Jen; Affara, Nabeel A.; Tyler-Smith, Chris
2016-01-01
We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes—both single copy and amplified—on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution. PMID:26560630
Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci
Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.
2000-01-01
The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. 
The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850
Mouse mammary tumor virus-like gene sequences are present in lung patient specimens
2011-01-01
Background Previous studies have reported the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we search for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This study was based on our previous study reporting that the INER51 lung cancer cell line, from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results The MMTV-like env gene sequences were detected in three out of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, which were assigned as INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, as compared to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion In summary, we identified the presence of MMTV-like gene sequences in 2 out of 11 (18%) of the lung carcinomas and 1 out of 7 (14%) of the acute inflammatory lung infiltrate specimens studied from a Mexican population. PMID:21943279
Sequence heterogeneity in the two 16S rRNA genes of Phormium yellow leaf phytoplasma.
Liefting, L W; Andersen, M T; Beever, R E; Gardner, R C; Forster, R L
1996-01-01
Phormium yellow leaf (PYL) phytoplasma causes a lethal disease of the monocotyledon, New Zealand flax (Phormium tenax). The 16S rRNA genes of PYL phytoplasma were amplified from infected flax by PCR and cloned, and the nucleotide sequences were determined. DNA sequencing and Southern hybridization analysis of genomic DNA indicated the presence of two copies of the 16S rRNA gene. The two 16S rRNA genes exhibited sequence heterogeneity in 4 nucleotide positions and could be distinguished by the restriction enzymes BpmI and BsrI. This is the first record in which sequence heterogeneity in the 16S rRNA genes of a phytoplasma has been determined by sequence analysis. A phylogenetic tree based on 16S rRNA gene sequences showed that PYL phytoplasma is most closely related to the stolbur and German grapevine yellows phytoplasmas, which form the stolbur subgroup of the aster yellows group. This phylogenetic position of PYL phytoplasma was supported by 16S/23S spacer region sequence data. PMID:8795200
HGVS Recommendations for the Description of Sequence Variants: 2016 Update.
den Dunnen, Johan T; Dalgleish, Raymond; Maglott, Donna R; Hart, Reece K; Greenblatt, Marc S; McGowan-Jordan, Jean; Roux, Anne-Francoise; Smith, Timothy; Antonarakis, Stylianos E; Taschner, Peter E M
2016-06-01
The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sharing of the variants detected. The sequence variant nomenclature system proposed in 2000 by the Human Genome Variation Society has been widely adopted and has developed into an internationally accepted standard. The recommendations are currently commissioned through a Sequence Variant Description Working Group (SVD-WG) operating under the auspices of three international organizations: the Human Genome Variation Society (HGVS), the Human Variome Project (HVP), and the Human Genome Organization (HUGO). Requests for modifications and extensions go through the SVD-WG following a standard procedure including a community consultation step. Version numbers are assigned to the nomenclature system to allow users to specify the version used in their variant descriptions. Here, we present the current recommendations, HGVS version 15.11, and briefly summarize the changes that were made since the 2000 publication. Most focus has been on removing inconsistencies and tightening definitions allowing automatic data processing. An extensive version of the recommendations is available online, at http://www.HGVS.org/varnomen.
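The abstract notes that tighter definitions allow automatic data processing. As a hedged illustration of what that can look like, the sketch below parses only a small subset of DNA-level descriptions (single-nucleotide substitutions, and deletions or duplications of a position or range). The regular expression, the class and the function names are assumptions made for this example; they are not part of the recommendations and cover far less than the full nomenclature.

```python
import re
from typing import NamedTuple, Optional


class SimpleVariant(NamedTuple):
    """Hypothetical container for a parsed variant description."""
    reference: str       # reference sequence identifier before the colon
    coord_type: str      # coordinate prefix, e.g. "c" or "g"
    start: int
    end: int
    kind: str            # "sub", "del" or "dup"
    ref: Optional[str]   # reference base (substitutions only)
    alt: Optional[str]   # new base (substitutions only)


# Illustrative pattern for a small subset of descriptions; an assumption of
# this sketch, not a complete grammar of the nomenclature.
_VARIANT_RE = re.compile(
    r"^(?P<reference>[^:\s]+):(?P<coord>[cgmn])\."
    r"(?P<start>\d+)(?:_(?P<end>\d+))?"
    r"(?:(?P<ref>[ACGT])>(?P<alt>[ACGT])|(?P<op>del|dup))$"
)


def parse_simple_variant(description: str) -> SimpleVariant:
    """Parse a substitution, deletion or duplication; raise ValueError otherwise."""
    match = _VARIANT_RE.match(description.strip())
    if match is None:
        raise ValueError(f"unsupported or malformed description: {description!r}")

    start = int(match["start"])
    end = int(match["end"]) if match["end"] else start
    if end < start:
        raise ValueError("range end precedes range start")

    if match["op"]:
        kind, ref, alt = match["op"], None, None
    else:
        if match["end"]:
            raise ValueError("a substitution describes a single position")
        kind, ref, alt = "sub", match["ref"], match["alt"]

    return SimpleVariant(match["reference"], match["coord"], start, end, kind, ref, alt)
```

For example, `parse_simple_variant("NM_004006.2:c.4375C>T")` yields a `sub` record at position 4375, and `parse_simple_variant("NM_004006.2:c.20_23dup")` yields a `dup` record spanning positions 20 to 23. Both strings are used here purely as illustrative inputs.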
2014-01-01
Background Deciphering of the information content of eukaryotic promoters has remained confined to universal landmarks and conserved sequence elements such as enhancers and transcription factor binding motifs, which are considered sufficient for gene activation and regulation. Gene-specific sequences, interspersed between the canonical transacting factor binding sites or adjoining them within a promoter, are generally taken to be devoid of any regulatory information and have therefore been largely ignored. An unanswered question therefore is, do gene-specific sequences within a eukaryotic promoter have a role in gene activation? Here, we present an exhaustive experimental analysis of a gene-specific sequence adjoining the heat shock element (HSE) in the proximal promoter of the small heat shock protein gene, αB-crystallin (cryab). These sequences are highly conserved between the rodents and the humans. Results Using human retinal pigment epithelial cells in culture as the host, we have identified a 10-bp gene-specific promoter sequence (GPS), which, unlike an enhancer, controls expression from the promoter of this gene, only when in appropriate position and orientation. Notably, the data suggests that GPS in comparison with the HSE works in a context-independent fashion. Additionally, when moved upstream, about a nucleosome length of DNA (−154 bp) from the transcription start site (TSS), the activity of the promoter is markedly inhibited, suggesting its involvement in local promoter access. Importantly, we demonstrate that deletion of the GPS results in complete loss of cryab promoter activity in transgenic mice. Conclusions These data suggest that gene-specific sequences such as the GPS, identified here, may have critical roles in regulating gene-specific activity from eukaryotic promoters. PMID:24589182
Schiex, Thomas; Gouzy, Jérôme; Moisan, Annick; de Oliveira, Yannick
2003-07-01
We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in GC-rich bacterial genomes, the gene model used in FrameD also allows genes to be predicted in the presence of frameshifts and partially undetermined sequences, which makes it well suited to gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences. Like recent eukaryotic gene prediction programs, FrameD can also take protein similarity information into account in both its prediction and its graphical output. Its performance is evaluated on different bacterial genomes. The web site (http://genopole.toulouse.inra.fr/bioinfo/FrameD/FD) allows direct prediction, sequence correction and translation, and the ability to learn new models for new organisms.
Reanalysis of RNA-Sequencing Data Reveals Several Additional Fusion Genes with Multiple Isoforms
Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli
2012-01-01
RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts. PMID:23119097
Holland, M J; Holland, J P; Thill, G P; Jackson, K A
1981-02-10
Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. Finally, there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5' noncoding portions of these glycolytic genes.
Chiu, Shih-Hau; Chen, Chien-Chi; Yuan, Gwo-Fang; Lin, Thy-Hou
2006-01-01
Background The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation scheme is urgently needed to reduce the gap between the amount of new sequences produced and reliable functional annotation. This work proposes rules for automatically classifying fungus genes. The approach involves elucidating the enzyme classification rules hidden in the UniProt protein knowledgebase and then applying them for classification. The association algorithm Apriori is used to mine the relationship between the enzyme class and significant InterPro entries. The candidate rules are evaluated for their classificatory capacity. Results Five datasets were collected from Swiss-Prot for establishing the annotation rules. These were treated as the training sets. The TrEMBL entries were treated as the testing set. A correct enzyme classification rate of 70% was obtained for the prokaryote datasets and a similar rate of about 80% was obtained for the eukaryote datasets. The fungus training dataset, which lacks an enzyme class description, was also used to evaluate the fungus candidate rules. A total of 88 out of 5085 test entries were matched with the fungus rule set. These were otherwise poorly annotated using their functional descriptions. Conclusion The feasibility of using the method presented here to classify enzyme classes based on the enzyme domain rules is evident. The rules may also be employed by protein annotators in manual annotation or implemented in an automatic annotation flowchart. PMID:16776838
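The rule mining described above relates InterPro entries to enzyme classes. The sketch below is not the Apriori algorithm itself, which grows candidate itemsets level by level. It only scores rules with a single InterPro entry as antecedent using the support and confidence measures that association rule mining relies on. The record layout, the thresholds and the function name are assumptions chosen for illustration, not details taken from the study.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (set of InterPro entry IDs, enzyme class label).
Record = Tuple[Set[str], str]
Rule = Tuple[str, str, float, float]  # (entry, enzyme class, support, confidence)


def single_entry_rules(
    records: Iterable[Record],
    min_support: float = 0.1,
    min_confidence: float = 0.8,
) -> List[Rule]:
    """Score rules of the form 'entry -> enzyme class' by support and confidence."""
    records = list(records)
    total = len(records)

    entry_counts: Counter = Counter()  # proteins carrying each entry
    pair_counts: Counter = Counter()   # proteins carrying the entry AND the class
    for entries, enzyme_class in records:
        for entry in entries:
            entry_counts[entry] += 1
            pair_counts[(entry, enzyme_class)] += 1

    rules: List[Rule] = []
    for (entry, enzyme_class), together in pair_counts.items():
        support = together / total                   # P(entry and class)
        confidence = together / entry_counts[entry]  # P(class | entry)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entry, enzyme_class, support, confidence))

    # Most reliable rules first; ties broken deterministically by name.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))
```

In a workflow like the one the abstract outlines, rules retained from a curated training set would then be applied to unannotated entries, and the fraction classified correctly would serve as the evaluation figure.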
Chung, Eu Jin; Park, Tae Soon; Kim, Kyung Hyun; Jeon, Che Ok; Lee, Hae-In; Chang, Woo-Suk; Aslam, Zubair; Chung, Young Ryun
2015-09-01
A polyphasic approach was used to characterize a novel nitrogen-fixing bacterial strain, designated YC6995(T), isolated from the rhizosphere soil of Iris ensata var. spontanea (Makino) Nakai inhabiting a wetland located at an altitude of 960 m on Jiri Mountain, Korea. Strain YC6995(T) cells were Gram-negative, and rod-shaped, with motility provided by a single polar flagellum. Optimal growth conditions were 30 °C and pH 7.0. The major fatty acids of strain YC6995(T) were C18:1 ω7c, C18:1 2-OH and C16:0 3-OH. The major respiratory quinone was ubiquinone-10 (Q-10). The polar lipids were phosphatidylethanolamine, phosphatidyldimethylethanolamine, phosphatidylcholine, phosphatidylglycerol and unidentified glycolipids. The genomic DNA G+C content was 64.1 mol%. Phylogenetic analysis based on 16S rRNA gene sequences showed strain YC6995(T) to form a phyletic lineage with Nitrospirillum amazonense DSM 2787(T) with a high sequence similarity (97.2 %), but it displayed low sequence similarity with other remotely related genera, including Azospirillum (<93 %), Rhodocista (93.1-93.4 %), and Skermanella (91.2-93.3 %) in the family Alphaproteobacteria. Based on the phenotypic, chemotaxonomic, and phylogenetic evidences, strain YC6995(T) represents a novel species within the genus Nitrospirillum, for which the name Nitrospirillum irinus sp. nov. is proposed. The type strain is YC6995(T) (= KACC 13777(T) = DSM 22198(T)). An emended description of the genus Nitrospirillum is also proposed.
Tan, Qian-Qian; Zhu, Li; Li, Yi; Liu, Wen; Ma, Wei-Hua; Lei, Chao-Liang; Wang, Xiao-Ping
2015-01-01
The cabbage beetle Colaphellus bowringi Baly is a serious insect pest of crucifers and undergoes reproductive diapause in soil. An understanding of the molecular mechanisms of diapause regulation, insecticide resistance, and other physiological processes is helpful for developing new management strategies for this beetle. However, the lack of genomic information and valid reference genes limits knowledge on the molecular bases of these physiological processes in this species. Using Illumina sequencing, we obtained more than 57 million sequence reads derived from C. bowringi, which were assembled into 39,390 unique sequences. A Clusters of Orthologous Groups classification was obtained for 9,048 of these sequences, covering 25 categories, and 16,951 were assigned to 255 Kyoto Encyclopedia of Genes and Genomes pathways. Eleven candidate reference gene sequences from the transcriptome were then identified through reverse transcriptase polymerase chain reaction. Among these candidate genes, EF1α, ACT1, and RPL19 proved to be the most stable reference genes for different reverse transcriptase quantitative polymerase chain reaction experiments in C. bowringi. Conversely, aTUB and GAPDH were the least stable reference genes. The abundant putative C. bowringi transcript sequences reported enrich the genomic resources of this beetle. Importantly, the larger number of gene sequences and valid reference genes provide a valuable platform for future gene expression studies, especially with regard to exploring the molecular mechanisms of different physiological processes in this species.
Nomoto, R; Kagawa, H; Yoshida, T
2008-01-01
This study investigated the difference between Lancefield group C Streptococcus dysgalactiae (GCSD) strains isolated from diseased fish and from animals by sequencing and phylogenetic analysis of the sodA gene. The sodA genes of Strep. dysgalactiae strains isolated from fish and animals were amplified and their nucleotide sequences were determined. Although 100% sequence identity was observed among fish GCSD strains, the sequences from animal isolates varied relative to the fish isolate sequences. Thus, all fish GCSD strains were clearly separated from GCSD strains of other origins by phylogenetic tree analysis. In addition, an original primer set was designed from the determined sequences to specifically amplify the sodA gene of fish GCSD strains. The primer set yielded amplification products only from fish GCSD strains. Sequence analysis of the sodA gene demonstrated the genetic divergence between Strep. dysgalactiae strains isolated from fish and mammals. Moreover, an original oligonucleotide primer set that can simply detect the genotype of fish GCSD strains was designed. This study shows that Strep. dysgalactiae isolated from diseased fish can be distinguished from conventional GCSD strains by differences in the sodA gene sequence.
A Newly Described Bovine Type 2 Scurs Syndrome Segregates with a Frame-Shift Mutation in TWIST1
Capitan, Aurélien; Grohs, Cécile; Weiss, Bernard; Rossignol, Marie-Noëlle; Reversé, Patrick; Eggen, André
2011-01-01
The developmental pathways involved in horn development are complex and still poorly understood. Here we report the description of a new dominant inherited syndrome in the bovine Charolais breed that we have named type 2 scurs. Clinical examination revealed that, despite a strong phenotypic variability, all affected individuals show both horn abnormalities similar to classical scurs phenotype and skull interfrontal suture synostosis. Based on a genome-wide linkage analysis using Illumina BovineSNP50 BeadChip genotyping data from 57 half-sib and full-sib progeny, this locus was mapped to a 1.7 Mb interval on bovine chromosome 4. Within this region, the TWIST1 gene encoding a transcription factor was considered as a strong candidate gene since its haploinsufficiency is responsible for the human Saethre-Chotzen syndrome, characterized by skull coronal suture synostosis. Sequencing of the TWIST1 gene identified a c.148_157dup (p.A56RfsX87) frame-shift mutation predicted to completely inactivate this gene. Genotyping 17 scurred and 20 horned founders of our pedigree as well as 48 unrelated horned controls revealed a perfect association between this mutation and the type 2 scurs phenotype. Subsequent genotyping of 32 individuals born from heterozygous parents showed that homozygous mutated progeny are completely absent, which is consistent with the embryonic lethality reported in Drosophila and mouse suffering from TWIST1 complete insufficiency. Finally, data from previous studies on model species and a fine description of type 2 scurs symptoms allowed us to propose different mechanisms to explain the features of this syndrome. In conclusion, this first report on the identification of a potential causal mutation affecting horn development in cattle offers a unique opportunity to better understand horn ontogenesis. PMID:21814570
Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru
2007-01-01
The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming
2012-07-01
We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.
Cloning and sequencing of a cellobiohydrolase gene from Trichoderma harzianum FP108
Patrick Guilfoile; Ron Burns; Zu-Yi Gu; Matt Amundson; Fu-Hsian Chang
1999-01-01
A cbh1 cellobiohydrolase gene was cloned and sequenced from the fungus Trichoderma harzianum FP108. The cloning was performed by PCR amplification of T. harzianum genomic DNA, using PCR primers whose sequence was based on the cbh1 gene from Trichoderma reesei. The 3' end of the gene was isolated by inverse...
Pikuta, Elena V; Lyu, Zhe; Williams, Melissa D; Patel, Nisha B; Liu, Yuchen; Hoover, Richard B; Busse, Hans-Jürgen; Lawson, Paul A; Whitman, William B
2017-05-01
A novel psychrotolerant bacterium, strain ISLP-3T, was isolated from a sample of naturally formed ice sculpture on the shore of Lake Podprudnoye in Antarctica. Cells were motile, stained Gram-positive, non-spore-forming, straight or slightly curved rods with the shape of a baseball bat. The new isolate was facultatively anaerobic and catalase-positive. Growth occurred at 3-35 °C with an optimum at 22-24 °C, 0-2 % (w/v) NaCl with an optimum at 0.3 % and pH 6.2-9.5 with an optimum at pH 7.5. Strain ISLP-3T grew on several carbon sources, with the best growth on cellobiose. The isolate possessed ureolytic activity but growth was inhibited by urea. The strain was sensitive to ampicillin, gentamycin, kanamycin, rifampicin, tetracycline and chloramphenicol. Major fatty acids were anteiso-C15 : 0, iso-C16 : 0, C16 : 0, C14 : 0 and iso-C15 : 0. The predominant menaquinone was MK-9(H4). The genomic G+C content was 69.5 mol%. The 16S rRNA gene showed 99 % sequence similarity to that of Sanguibacter suarezii ST-26T, but their recA genes shared ≤91 % sequence similarity, suggesting that this new isolate represents a novel species within the genus Sanguibacter. This conclusion was supported by average nucleotide identity, which was ≤91 % to the most closely related strain. The name Sanguibacter gelidistatuariae sp. nov. is proposed for the novel species with the type strain ISLP-3T (=ATCC TSD-17T=DSM 100501T=JCM 30887T). The complete genome draft sequence of ISLP-3T was deposited under IMG OID 2657245272. Emendments to the descriptions of related taxa have been made based on experimental data from our comparative analysis.
Generation and validation of homozygous fluorescent knock-in cells using CRISPR-Cas9 genome editing.
Koch, Birgit; Nijmeijer, Bianca; Kueblbeck, Moritz; Cai, Yin; Walther, Nike; Ellenberg, Jan
2018-06-01
Gene tagging with fluorescent proteins is essential for investigations of the dynamic properties of cellular proteins. CRISPR-Cas9 technology is a powerful tool for inserting fluorescent markers into all alleles of the gene of interest (GOI) and allows functionality and physiological expression of the fusion protein. It is essential to evaluate such genome-edited cell lines carefully in order to preclude off-target effects caused by (i) incorrect insertion of the fluorescent protein, (ii) perturbation of the fusion protein by the fluorescent proteins or (iii) nonspecific genomic DNA damage by CRISPR-Cas9. In this protocol, we provide a step-by-step description of our systematic pipeline to generate and validate homozygous fluorescent knock-in cell lines. We have used the paired Cas9D10A nickase approach to efficiently insert tags into specific genomic loci via homology-directed repair (HDR) with minimal off-target effects. It is time-consuming and costly to perform whole-genome sequencing of each cell clone to check for spontaneous genetic variations occurring in mammalian cell lines. Therefore, we have developed an efficient validation pipeline of the generated cell lines consisting of junction PCR, Southern blotting analysis, Sanger sequencing, microscopy, western blotting analysis and live-cell imaging for cell-cycle dynamics. This protocol takes between 6 and 9 weeks. With this protocol, up to 70% of the targeted genes can be tagged homozygously with fluorescent proteins, thus resulting in physiological levels and phenotypically functional expression of the fusion proteins.
Le Ber, Isabelle; Camuzat, Agnès; Guerreiro, Rita; Bouya-Ahmed, Kawtar; Bras, Jose; Nicolas, Gael; Gabelle, Audrey; Didic, Mira; De Septenville, Anne; Millecamps, Stéphanie; Lenglet, Timothée; Latouche, Morwena; Kabashi, Edor; Campion, Dominique; Hannequin, Didier; Hardy, John; Brice, Alexis
2013-11-01
Mutations in the SQSTM1 gene, coding for p62, are a cause of Paget disease of bone and amyotrophic lateral sclerosis (ALS). Recently, SQSTM1 mutations were confirmed in ALS, and mutations were also identified in 3 patients with frontotemporal dementia (FTD), suggesting a role for SQSTM1 in FTD. The objective was to evaluate the exact contribution of SQSTM1 to FTD and FTD with ALS (FTD-ALS) in an independent cohort of patients. A SQSTM1 mutation was first identified in a multiplex family with FTD by use of whole-exome sequencing. To evaluate the frequency of SQSTM1 mutations, we sequenced this gene in a cohort of patients with FTD or FTD-ALS, with no mutations in known FTD and ALS genes. The setting was a primary care or referral center. The overall cohort comprised 188 French patients, including 132 probands with FTD and 56 probands with FTD-ALS. The main outcome measures were the frequency of SQSTM1 mutations in patients with FTD or FTD-ALS and the description of associated phenotypes. We identified 4 heterozygous missense mutations in 4 unrelated families with FTD; only 1 family had clinical symptoms of Paget disease of bone, and only 1 family had clinical symptoms of FTD-ALS, possibly owing to the low penetrance of some of the clinical manifestations. Although the frequency of the mutations is low in our series (4 of 188 patients [2%]), our results, similar to those already reported, support a direct pathogenic role of p62 in different types of FTD.
Comparison of alternative approaches for analysing multi-level RNA-seq data
Mohorianu, Irina; Bretman, Amanda; Smith, Damian T.; Fowler, Emily K.; Dalmay, Tamas
2017-01-01
RNA sequencing (RNA-seq) is widely used for RNA quantification in the environmental, biological and medical sciences. It enables the description of genome-wide patterns of expression and the identification of regulatory interactions and networks. The aim of RNA-seq data analyses is to achieve rigorous quantification of genes/transcripts to allow a reliable prediction of differential expression (DE), despite variation in levels of noise and inherent biases in sequencing data. This can be especially challenging for datasets in which gene expression differences are subtle, as in the behavioural transcriptomics test dataset from D. melanogaster that we used here. We investigated the power of existing approaches for quality checking mRNA-seq data and explored additional, quantitative quality checks. To accommodate nested, multi-level experimental designs, we incorporated sample layout into our analyses. We employed a subsampling without replacement-based normalization and an identification of DE that accounted for the hierarchy and amplitude of effect sizes within samples, then evaluated the resulting differential expression call in comparison to existing approaches. In a final step to test for broader applicability, we applied our approaches to a published set of H. sapiens mRNA-seq samples. The dataset-tailored methods improved sample comparability and delivered a robust prediction of subtle gene expression changes. The proposed approaches have the potential to improve key steps in the analysis of RNA-seq data by incorporating the structure and characteristics of biological experiments. PMID:28792517
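The idea behind normalization by subsampling without replacement can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `subsample_counts`, the toy gene counts, and the choice of the smallest library size as the common target are all assumptions made for the example.

```python
import random
from collections import Counter


def subsample_counts(counts, target_total, seed=0):
    """Draw target_total reads without replacement from a gene -> count table."""
    total = sum(counts.values())
    if target_total > total:
        raise ValueError("target_total exceeds the number of available reads")
    # Expand the count table into one label per read, then sample reads.
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    drawn = Counter(random.Random(seed).sample(reads, target_total))
    # Report every gene, including those that received no reads.
    return {gene: drawn.get(gene, 0) for gene in counts}


# Hypothetical toy libraries with different sequencing depths.
samples = {
    "sample_a": {"g1": 50, "g2": 30, "g3": 20},
    "sample_b": {"g1": 120, "g2": 60, "g3": 20},
}

# Assumption: bring every library down to the smallest library size.
target = min(sum(c.values()) for c in samples.values())
normalized = {name: subsample_counts(c, target) for name, c in samples.items()}

for name, table in normalized.items():
    print(name, table, "total =", sum(table.values()))
```

Because reads are drawn without replacement, no gene can end up with more reads than it had originally, and a library that is already at the target depth is returned unchanged.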
Ruppitsch, W; Stöger, A; Indra, A; Grif, K; Schabereiter-Gurtner, C; Hirschl, A; Allerberger, F
2007-03-01
In a bioterrorism event a rapid tool is needed to identify relevant dangerous bacteria. The aim of the study was to assess the usefulness of partial 16S rRNA gene sequence analysis and the suitability of diverse databases for identifying dangerous bacterial pathogens. For rapid identification purposes a 500-bp fragment of the 16S rRNA gene of 28 isolates comprising Bacillus anthracis, Brucella melitensis, Burkholderia mallei, Burkholderia pseudomallei, Francisella tularensis, Yersinia pestis, and eight genus-related and unrelated control strains was amplified and sequenced. The obtained sequence data were submitted to three public and two commercial sequence databases for species identification. The most frequent reason for incorrect identification was the lack of the respective 16S rRNA gene sequences in the database. Sequence analysis of a 500-bp 16S rDNA fragment allows the rapid identification of dangerous bacterial species. However, for discrimination of closely related species sequencing of the entire 16S rRNA gene, additional sequencing of the 23S rRNA gene or sequencing of the 16S-23S rRNA intergenic spacer is essential. This work provides comprehensive information on the suitability of partial 16S rDNA analysis and diverse databases for rapid and accurate identification of dangerous bacterial pathogens.
Deng, Peng; Tan, Xiaoqing; Wu, Ying; Bai, Qunhua; Jia, Yan; Xiao, Hong
2015-03-01
The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted to GenBank under accession number KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumoniae and Raoultella ornithinolytica, which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have an 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function.
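The reported ORF length, protein length and molecular weight are mutually consistent, as the short check below illustrates. It assumes the 567 bp include a terminal stop codon and uses a rule-of-thumb average residue mass of about 110 Da; that average is an assumption for illustration, not a value from the study, and the calculated 20.4 kDa in the abstract comes from the actual sequence.

```python
# Values reported in the abstract.
orf_length_bp = 567
reported_residues = 188
reported_mass_kda = 20.4

# Assumption: the ORF length includes one terminal stop codon.
codons = orf_length_bp // 3
residues = codons - 1

# Assumption: rule-of-thumb average mass per amino acid residue (~110 Da).
AVG_RESIDUE_MASS_DA = 110.0
approx_mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000

print(f"codons: {codons}, residues: {residues}")
print(f"approximate mass: {approx_mass_kda:.1f} kDa (reported {reported_mass_kda} kDa)")
```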
DENG, PENG; TAN, XIAOQING; WU, YING; BAI, QUNHUA; JIA, YAN; XIAO, HONG
2015-01-01
The ChrT gene encodes a chromate reductase enzyme which catalyzes the reduction of Cr(VI). The chromate reductase is also known as flavin mononucleotide (FMN) reductase (FMN_red). The aim of the present study was to clone the full-length ChrT DNA from Serratia sp. CQMUS2 and analyze the deduced amino acid sequence and three-dimensional structure. The putative ChrT gene fragment of Serratia sp. CQMUS2 was isolated by polymerase chain reaction (PCR), according to the known FMN_red gene sequence from Serratia sp. AS13. The flanking sequences of the ChrT gene were obtained by high efficiency TAIL-PCR, while the full-length gene of ChrT was cloned in Escherichia coli for subsequent sequencing. The nucleotide sequence of ChrT was submitted to GenBank under accession number KF211434. Sequence analysis of the gene and amino acids was conducted using the Basic Local Alignment Search Tool, and open reading frame (ORF) analysis was performed using ORF Finder software. The ChrT gene was found to be an ORF of 567 bp that encodes a 188-amino acid enzyme with a calculated molecular weight of 20.4 kDa. In addition, the ChrT protein was hypothesized to be an NADPH-dependent FMN_red and a member of the flavodoxin-2 superfamily. The amino acid sequence of ChrT showed high sequence similarity to the FMN reductase genes of Klebsiella pneumoniae and Raoultella ornithinolytica, which belong to the flavodoxin-2 superfamily. Furthermore, ChrT was shown to have an 85.6% similarity to the three-dimensional structure of Escherichia coli ChrR, sharing four common enzyme active sites for chromate reduction. Therefore, ChrT gene cloning and protein structure determination demonstrated the ability of the gene for chromate reduction. The results of the present study provide a basis for further studies on ChrT gene expression and protein function. PMID:25667630
Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L
2007-01-01
Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution.
The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168
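The Gene Index figures reported above can be tallied with a few lines of arithmetic. The sketch assumes that all 14,502 sequenced ESTs entered the clustering step, which the abstract does not state; the redundancy figure is therefore only indicative.

```python
# Counts reported in the abstract.
total_ests = 14502
tentative_consensus = 2575
singletons = 6032

# Unique sequences are the TC sequences plus the singletons.
unique_sequences = tentative_consensus + singletons

# Assumption: every sequenced EST was used to build the index, so the
# fraction of ESTs that did not add a new unique sequence is:
redundancy = 1 - unique_sequences / total_ests

print(f"unique sequences: {unique_sequences}")
print(f"approximate library redundancy: {redundancy:.1%}")
```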
Girlich, Delphine; Bonnin, Rémy A; Bogaerts, Pierre; De Laveleye, Morgane; Huang, Daniel T; Dortet, Laurent; Glaser, Philippe; Glupczynski, Youri; Naas, Thierry
2017-02-01
Horizontal gene transfer may occur between distantly related bacteria, thus leading to genetic plasticity and in some cases to acquisition of novel resistance traits. Proteus mirabilis is an enterobacterial species responsible for human infections that may express various acquired β-lactam resistance genes, including different classes of carbapenemase genes. Here we report a Proteus mirabilis clinical isolate (strain 1091) displaying resistance to penicillin, including temocillin, together with reduced susceptibility to carbapenems and susceptibility to expanded-spectrum cephalosporins. Using biochemical tests, significant carbapenem hydrolysis was detected in P. mirabilis 1091. Since PCR failed to detect acquired carbapenemase genes commonly found in Enterobacteriaceae, we used a whole-genome sequencing approach that revealed the presence of blaOXA-58 class D carbapenemase gene, so far identified only in Acinetobacter species. This gene was located on a 3.1-kb element coharboring a blaAmpC-like gene. Remarkably, these two genes were bracketed by putative XerC-XerD binding sites and inserted at a XerC-XerD site located between the terminase-like small- and large-subunit genes of a bacteriophage. Increased expression of the two bla genes resulted from a 6-time tandem amplification of the element as revealed by Southern blotting. This is the first isolation of a clinical P. mirabilis strain producing OXA-58, a class D carbapenemase, and the first description of a XerC-XerD-dependent insertion of antibiotic resistance genes within a bacteriophage. This study revealed a new role for the XerC-XerD recombinase in bacteriophage biology. Copyright © 2017 American Society for Microbiology.
Girlich, Delphine; Bogaerts, Pierre; De Laveleye, Morgane; Huang, Daniel T.; Glupczynski, Youri
2016-01-01
ABSTRACT Horizontal gene transfer may occur between distantly related bacteria, thus leading to genetic plasticity and in some cases to acquisition of novel resistance traits. Proteus mirabilis is an enterobacterial species responsible for human infections that may express various acquired β-lactam resistance genes, including different classes of carbapenemase genes. Here we report a Proteus mirabilis clinical isolate (strain 1091) displaying resistance to penicillin, including temocillin, together with reduced susceptibility to carbapenems and susceptibility to expanded-spectrum cephalosporins. Using biochemical tests, significant carbapenem hydrolysis was detected in P. mirabilis 1091. Since PCR failed to detect acquired carbapenemase genes commonly found in Enterobacteriaceae, we used a whole-genome sequencing approach that revealed the presence of blaOXA-58 class D carbapenemase gene, so far identified only in Acinetobacter species. This gene was located on a 3.1-kb element coharboring a blaAmpC-like gene. Remarkably, these two genes were bracketed by putative XerC-XerD binding sites and inserted at a XerC-XerD site located between the terminase-like small- and large-subunit genes of a bacteriophage. Increased expression of the two bla genes resulted from a 6-time tandem amplification of the element as revealed by Southern blotting. This is the first isolation of a clinical P. mirabilis strain producing OXA-58, a class D carbapenemase, and the first description of a XerC-XerD-dependent insertion of antibiotic resistance genes within a bacteriophage. This study revealed a new role for the XerC-XerD recombinase in bacteriophage biology. PMID:27855079
Sequencing of individual chromosomes of plant pathogenic Fusarium oxysporum.
Kashiwa, Takeshi; Kozaki, Toshinori; Ishii, Kazuo; Turgeon, B Gillian; Teraoka, Tohru; Komatsu, Ken; Arie, Tsutomu
2017-01-01
A small chromosome in reference isolate 4287 of F. oxysporum f. sp. lycopersici (Fol) has been designated as a 'pathogenicity chromosome' because it carries several pathogenicity related genes such as the Secreted In Xylem (SIX) genes. Sequence assembly of small chromosomes in other isolates, based on a reference genome template, is difficult because of karyotype variation among isolates and a high number of sequences associated with transposable elements. These factors often result in misassembly of sequences, making it unclear whether other isolates possess the same pathogenicity chromosome harboring SIX genes as in the reference isolate. To overcome this difficulty, single chromosome sequencing after Contour-clamped Homogeneous Electric Field (CHEF) separation of chromosomes was performed, followed by de novo assembly of sequences. The assembled sequences of individual chromosomes were consistent with results of probing gels of CHEF separated chromosomes with SIX genes. Individual chromosome sequencing revealed that several SIX genes are located on a single small chromosome in two pathogenic forms of F. oxysporum, beyond the reference isolate 4287, and in the cabbage yellows fungus F. oxysporum f. sp. conglutinans. The particular combination of SIX genes on each small chromosome varied. Moreover, not all SIX genes were found on small chromosomes; depending on the isolate, some were on big chromosomes. This suggests that recombination of chromosomes and/or translocation of SIX genes may occur frequently. Our method improves sequence comparison of small chromosomes among isolates. Copyright © 2016 Elsevier Inc. All rights reserved.
Vega, Ana I; Medrano, Celia; Navarrete, Rosa; Desviat, Lourdes R; Merinero, Begoña; Rodríguez-Pombo, Pilar; Vitoria, Isidro; Ugarte, Magdalena; Pérez-Cerdá, Celia; Pérez, Belen
2016-10-01
Glycogen storage disease (GSD) is an umbrella term for a group of genetic disorders that involve the abnormal metabolism of glycogen; to date, 23 types of GSD have been identified. The nonspecific clinical presentation of GSD and the lack of specific biomarkers mean that Sanger sequencing is now widely relied on for making a diagnosis. However, this gene-by-gene sequencing technique is both laborious and costly, which is a consequence of the number of genes to be sequenced and the large size of some genes. This work reports the use of massive parallel sequencing to diagnose patients at our laboratory in Spain using either a customized gene panel (targeted exome sequencing) or the Illumina Clinical-Exome TruSight One Gene Panel (clinical exome sequencing (CES)). Sequence variants were matched against biochemical and clinical hallmarks. Pathogenic mutations were detected in 23 patients. Twenty-two mutations were recognized (mostly loss-of-function mutations), including 11 that were novel in GSD-associated genes. In addition, CES detected five patients with mutations in ALDOB, LIPA, NKX2-5, CPT2, or ANO5. Although these genes are not involved in GSD, they are associated with overlapping phenotypic characteristics such as hepatic, muscular, and cardiac dysfunction. These results show that next-generation sequencing, in combination with the detection of biochemical and clinical hallmarks, provides an accurate, high-throughput means of making genetic diagnoses of GSD and related diseases. Genet Med 18(10), 1037-1043.
Wang, Tingting; Liu, Minxuan; Liu, Jing; Zhang, Zongwen
2017-01-01
Buckwheat is an important minor crop with pharmaceutical functions due to rutin enrichment in the seed. Seeds of common buckwheat cultivars (Fagopyrum esculentum, Fes) usually have much lower rutin content than tartary buckwheat (F. tataricum, Ft). We previously found a wild species of common buckwheat (F. esculentum ssp. ancestrale, Fea), with seeds that are high in rutin, similar to Ft. In the present study, we investigated the mechanism by which rutin production varies among Fea, a Ft variety (Xide) and a Fes variety (No.2 Pingqiao) using RNA sequencing of filling stage seeds. Sequencing generated approximately 43.78 Gb of clean bases; these data were pooled and assembled into 180,568 transcripts and 109,952 unigenes. We established seed gene expression profiles of each buckwheat sample and assessed genes involved in flavonoid biosynthesis, storage protein production, the CYP450 family, starch and sucrose metabolism, and transcription factors. Differentially expressed genes between Fea and Fes were further analyzed because these two are more closely related to each other than to Ft. Expression levels of the flavonoid biosynthesis gene FLS1 (Flavonol synthase 1) were similar in Fea and Ft, and much higher than in Fes, which was validated by qRT-PCR. This suggests that FLS1 transcript levels may be associated with rutin accumulation in filling stage seeds of buckwheat species. Further, we explored transcription factors by iTAK, and multiple gene families were identified as being involved in the coordinate regulation of metabolism and development. Our extensive transcriptomic data sets provide a complete description of metabolically related genes that are differentially expressed in filling stage buckwheat seeds and suggest that FLS1 is a key controller of rutin synthesis in buckwheat species. FLS1 can effectively convert dihydroflavonoids into flavonol products. 
These findings provide a basis for further studies of flavonoid biosynthesis in buckwheat breeding to help accelerate flavonoid metabolic engineering that would increase rutin content in cultivars of common buckwheat. PMID:29261741
Kikuchi, Taisei; Cotton, James A.; Dalzell, Jonathan J.; Hasegawa, Koichi; Kanzaki, Natsumi; McVeigh, Paul; Takanashi, Takuma; Tsai, Isheng J.; Assefa, Samuel A.; Cock, Peter J. A.; Otto, Thomas Dan; Hunt, Martin; Reid, Adam J.; Sanchez-Flores, Alejandro; Tsuchihara, Kazuko; Yokoi, Toshiro; Larsson, Mattias C.; Miwa, Johji; Maule, Aaron G.; Sahashi, Norio; Jones, John T.; Berriman, Matthew
2011-01-01
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite. PMID:21909270
Molecular exploration of hidden diversity in the Indo-West Pacific sciaenid clade
Lo, Pei-Chun; Liu, Shu-Hui; Nor, Siti Azizah Mohd; Chen, Wei-Jen
2017-01-01
The family Sciaenidae, known as croakers or drums, is one of the largest perciform fish families. A recent multi-gene based study investigating the phylogeny and biogeography of global sciaenids revealed that the origin and early diversification of this family occurred in tropical America during the Late Oligocene-Early Miocene before undergoing range expansions to other seas including the Indo-West Pacific, where high species richness is observed. Despite this clarification of the overall evolutionary history of the family, knowledge of the taxonomy and phylogeny of sciaenid genera endemic to the Indo-West Pacific is still limited due to lack of a thorough survey of all taxa. In this study, we used DNA-based approaches to investigate the evolutionary relationships, to explore the species diversity, and to elucidate the taxonomic status of sciaenid species/genera within the Indo-West Pacific clade. Three datasets were built for these objectives: the combined dataset (248 samples from 45 currently recognized species) from one nuclear gene (RAG1) and one mitochondrial gene (COI); the dataset with only RAG1 gene sequences (245 samples from 44 currently recognized species); and the dataset with only COI gene sequences (308 samples from 51 currently recognized species). The latter was primarily used for our biodiversity exploration with two different species delimitation methods (Automatic Barcode Gap Discovery, ABGD, and Generalized Mixed Yule Coalescent, GMYC). The results were further evaluated with the help of four supplementary criteria for species delimitation (genetic similarity, monophyly inferred from individual gene and combined data trees, geographic distribution, and morphology). Our final results confirmed the validity of 32 currently recognized species and identified several potential new species awaiting formal description. We also reexamined the taxonomic status of the genera Larimichthys, Nibea, Protonibea and Megalonibea, suggested a revision of Nibea, and proposed a new genus, Pseudolarimichthys. PMID:28453569
Kikuchi, Taisei; Cotton, James A; Dalzell, Jonathan J; Hasegawa, Koichi; Kanzaki, Natsumi; McVeigh, Paul; Takanashi, Takuma; Tsai, Isheng J; Assefa, Samuel A; Cock, Peter J A; Otto, Thomas Dan; Hunt, Martin; Reid, Adam J; Sanchez-Flores, Alejandro; Tsuchihara, Kazuko; Yokoi, Toshiro; Larsson, Mattias C; Miwa, Johji; Maule, Aaron G; Sahashi, Norio; Jones, John T; Berriman, Matthew
2011-09-01
Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology, which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.
Generation of a foveomacular transcriptome
Bernstein, Steven; Wong, Paul W.
2014-01-01
Purpose Organizing molecular biologic data is a growing challenge since the rate of data accumulation is steadily increasing. Information relevant to a particular biologic query can be difficult to extract from the comprehensive databases currently available. We present a data collection and organization model designed to ameliorate these problems and apply it to generate an expressed sequence tag (EST)–based foveomacular transcriptome. Methods Using Perl, MySQL, EST libraries, screening, and human foveomacular gene expression as a model system, we generated a foveomacular transcriptome database enriched for molecularly relevant data. Results Using foveomacula as a gene expression model tissue, we identified and organized 6,056 genes expressed in that tissue. Of those identified genes, 3,480 had not been previously described as expressed in the foveomacula. Internal experimental controls as well as comparison of our data set to published data sets suggest we do not yet have a complete description of the foveomacula transcriptome. Conclusions We present an organizational method designed to amplify the utility of data pertinent to a specific research interest. Our method is generic enough to be applicable to a variety of conditions yet focused enough to allow for specialized study. PMID:24991187
Deng, Yuhua; Yan, Hui; Gu, Jinbao; Xu, Jiabao; Wu, Kun; Tu, Zhijian; James, Anthony A.; Chen, Xiaoguang
2013-01-01
Aedes albopictus is a major vector of dengue and Chikungunya viruses. Olfaction plays a vital role in guiding mosquito behaviors and contributes to their ability to transmit pathogens. Odorant-binding proteins (OBPs) are abundant in insect olfactory tissues and involved in the first step of odorant reception. While comprehensive descriptions are available of OBPs from Aedes aegypti, Culex quinquefasciatus and Anopheles gambiae, only a few genes from Ae. albopictus have been reported. In this study, twenty-one putative AalbOBP genes were cloned using their homologues in Ae. aegypti to query an Ae. albopictus partial genome sequence. Two antenna-specific OBPs, AalbOBP37 and AalbOBP39, display a remarkable similarity in their overall folding and binding pockets, according to molecular modeling. Binding affinity assays indicated that AalbOBP37 and AalbOBP39 had overlapping ligand affinities and were affected by different pH conditions. Electroantennograms (EAG) and behavioral tests showed that these two genes are involved in olfactory reception. An improved understanding of the Ae. albopictus OBPs is expected to contribute to the development of more efficient and environmentally friendly mosquito control strategies. PMID:23935894
Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha trifallax
Braun, Jasper; Nabergall, Lukas; Neme, Rafik; Landweber, Laura F.; Saito, Masahico; Jonoska, Nataša
2018-01-01
Ciliates have two different types of nuclei per cell, with one acting as a somatic, transcriptionally active nucleus (macronucleus; abbr. MAC) and another serving as a germline nucleus (micronucleus; abbr. MIC). Furthermore, Oxytricha trifallax undergoes extensive genome rearrangements during sexual conjugation and post-zygotic development of daughter cells. These rearrangements are necessary because the precursor MIC loci are often both fragmented and scrambled, with respect to the corresponding MAC loci. Such genome architectures are remarkably tolerant of encrypted MIC loci, because RNA-guided processes during MAC development reorganize the gene fragments in the correct order to resemble the parental MAC sequence. Here, we describe the germline organization of several nested and highly scrambled genes in Oxytricha trifallax. These include cases with multiple layers of nesting, plus highly interleaved or tangled precursor loci that appear to deviate from previously described patterns. We present mathematical methods to measure the degree of nesting between precursor MIC loci, and revisit a method for a mathematical description of scrambling. After applying these methods to the chromosome rearrangement maps of O. trifallax we describe cases of nested arrangements with up to five layers of embedded genes, as well as the most scrambled loci in O. trifallax. PMID:29545465
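The abstract above mentions measuring the degree of nesting between precursor loci but does not give the definition. As a rough illustration only, and not the authors' measure, the sketch below treats each locus as a single genomic interval and counts how many other intervals fully contain it. The function name, the interval representation, and the example coordinates are all hypothetical.

```python
def nesting_depths(loci):
    """Map each locus name to the number of other loci whose span contains it.

    `loci` maps a name to a (start, end) interval. This containment count is an
    assumed, simplified stand-in for a nesting measure, not the published one.
    """
    depths = {}
    for name, (start, end) in loci.items():
        depths[name] = sum(
            1
            for other, (s, e) in loci.items()
            # count `other` only if it spans this locus and is not identical to it
            if other != name and s <= start and end <= e and (s, e) != (start, end)
        )
    return depths


# Hypothetical coordinates: C sits inside B, which sits inside A; D sits inside A only.
example = {"A": (0, 100), "B": (10, 50), "C": (20, 30), "D": (60, 70)}
depths = nesting_depths(example)
layers = max(depths.values())  # deepest level of embedding in this toy example
```

Real scrambled loci are split into many segments, so a faithful measure would have to work on segment maps rather than on one interval per locus.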
Gona, Floriana; Caio, Carla; Iannolo, Gioacchin; Monaco, Francesco; Di Mento, Giuseppina; Cuscino, Nicola; Fontana, Ignazio; Panarello, Giovanna; Maugeri, Gaetano; Mezzatesta, Maria Lina; Stefani, Stefania; Conaldi, Pier Giulio
2017-10-01
Dissemination of resistance to carbapenems among Enterobacteriaceae through plasmids is an increasingly important concern in health care worldwide. Here we report the first description of an IncX3 plasmid carrying the blaKPC-3 gene in a strain of Serratia marcescens isolated from a kidney-liver transplanted patient at the transplantation centre ISMETT (Istituto Mediterraneo per i Trapianti e Terapie ad Alta Specializzazione, Palermo, Italy). To localize the transposable element containing the resistance-associated gene, Next-Generation Sequencing of the bacterial DNA was performed. S. marcescens was positive for blaKPC-3 and blaSHV-11 genes. The molecular analysis demonstrated that the blaKPC-3 gene of this bacterial strain was located in one copy of the Tn-3-like element Tn4401-a carried in a plasmid that is 53 392 bp in size and showed the typical IncX3 scaffold. Our data demonstrated the presence of a new blaKPC-3-harbouring IncX3 plasmid in S. marcescens. The possible dissemination among Enterobacteriaceae of this type of plasmid should be monitored and evaluated in terms of clinical risk.
Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T
1987-01-01
The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.
Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R
1999-12-16
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs
Powell, Bradford C; Hutchison, Clyde A
2006-01-19
Background Experimental verification of gene products has not kept pace with the rapid growth of microbial sequence information. However, existing annotations of gene locations contain sufficient information to screen for probable errors. Furthermore, comparisons among genomes become more informative as more genomes are examined. We studied all open reading frames (ORFs) of at least 30 codons from the genomes of 27 sequenced bacterial strains. We grouped the potential peptide sequences encoded from the ORFs by forming Clusters of Orthologous Groups (COGs). We used this grouping in order to find homologous relationships that would not be distinguishable from noise when using simple BLAST searches. Although COG analysis was initially developed to group annotated genes, we applied it to the task of grouping anonymous DNA sequences that may encode proteins. Results "Mixed COGs" of ORFs (clusters in which some sequences correspond to annotated genes and some do not) are attractive targets when seeking errors of gene prediction. Examination of mixed COGs reveals some situations in which genes appear to have been missed in current annotations and a smaller number of regions that appear to have been annotated as gene loci erroneously. This technique can also be used to detect potential pseudogenes or sequencing errors. Our method uses an adjustable parameter for degree of conservation among the studied genomes (stringency). We detail results for one level of stringency at which we found 83 potential genes which had not previously been identified, 60 potential pseudogenes, and 7 sequences with existing gene annotations that are probably incorrect. Conclusion Systematic study of sequence conservation offers a way to improve existing annotations by identifying potentially homologous regions where the annotation of the presence or absence of a gene is inconsistent among genomes. PMID:16423288
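The first step described in this abstract, collecting every ORF of at least 30 codons, can be sketched as below. This is a minimal illustration and not the authors' pipeline. It assumes an ORF runs from an ATG to the first in-frame stop codon, that the standard stop codons apply, and that both strands are scanned. The function names and the coordinate convention (0-based positions on the scanned strand) are choices made here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code stops (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List (strand, frame, start, end) for ATG-to-stop ORFs of >= min_codons codons.

    The codon count excludes the stop codon; `end` is exclusive and includes the stop.
    Coordinates refer to the strand being scanned, not always the input strand.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # close it and keep scanning
    return orfs


# Toy sequences: one ORF of exactly 30 codons, and one of 29 that is filtered out.
long_enough = find_orfs("ATG" + "GCT" * 29 + "TAA")
too_short = find_orfs("ATG" + "GCT" * 28 + "TAA")
```

The translated ORFs would then be clustered across genomes. That clustering step depends on the COG construction procedure and is not sketched here.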
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E
1985-01-01
The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
Plant nitrogen regulatory P-PII genes
Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun
2001-01-01
The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
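The identity percentages quoted in this abstract come from pairwise sequence comparisons. The sketch below shows one common way to compute such a figure from two sequences that are already aligned. The convention used (skipping columns that contain a gap, and the `-` gap symbol) is an assumption made here; the abstract does not say how its percentages were calculated.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned columns that contain no gap.

    Both inputs must be rows of the same alignment, hence of equal length.
    Other conventions (e.g. dividing by the shorter sequence length) also exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue                 # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignments (not real hsp70 data): 4 of 5 positions match; gapped column ignored.
ungapped = percent_identity("MAKAA", "MAKGA")
gapped = percent_identity("MA-KA", "MAGKA")
```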
Odronitz, Florian; Kollmar, Martin
2006-01-01
Background Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Description Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationship. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. Conclusion We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein. PMID:17134497
van der Ley, P
1988-11-01
Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
USDA-ARS's Scientific Manuscript database
Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...
USDA-ARS's Scientific Manuscript database
The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
Primer development to obtain complete coding sequence of HA and NA genes of influenza A/H3N2 virus.
Agustiningsih, Agustiningsih; Trimarsanto, Hidayat; Setiawaty, Vivi; Artika, I Made; Muljono, David Handojo
2016-08-30
Influenza is an acute respiratory illness and has become a serious public health problem worldwide. Studying the HA and NA genes of influenza A virus is essential because these genes frequently undergo mutations. This study describes the development of primer sets for RT-PCR to obtain the complete coding sequences of the Hemagglutinin (HA) and Neuraminidase (NA) genes of influenza A/H3N2 virus from Indonesia. The primers were developed based on worldwide influenza A/H3N2 sequences from the Global Initiative on Sharing All Influenza Data (GISAID) and further tested using archived Indonesian influenza A/H3N2 samples from influenza-like illness (ILI) surveillance from 2008 to 2009. An optimum RT-PCR condition was acquired for all HA and NA fragments designed to cover the complete coding sequences of the HA and NA genes. Of 145 influenza A/H3N2 samples tested, 71 were successfully sequenced for the complete coding sequences of both the HA and NA genes. The developed primer sets were suitable for obtaining complete coding sequences of HA and NA genes of Indonesian samples from 2008 to 2009.
Lu, Hongsheng; Sato, Yoshinori; Fujimura, Reiko; Nishizawa, Tomoyasu; Kamijo, Takashi; Ohta, Hiroyuki
2011-02-01
A Gram-negative, aerobic, heterotrophic bacterium, designated KP1-19(T), was isolated from a 22-year-old volcanic deposit at a site lacking vegetation on the island of Miyake, Japan. Strain KP1-19(T) was able to use thiosulfate (optimum concentration 10 mM) as an additional energy source. 16S rRNA gene sequence analysis indicated that strain KP1-19(T) was closely related to Limnobacter thiooxidans CS-K2(T) within the class Betaproteobacteria (97.7 % 16S rRNA gene sequence similarity). The cellular fatty acid profile was characteristic of the genus Limnobacter: the major fatty acids (>5 %) were C(16 : 0), C(16 : 1)ω7c and C(18 : 1)ω7c and minor amounts of C(10 : 0) 3-OH were also found. DNA-DNA relatedness between strain KP1-19(T) and L. thiooxidans LMG 19593(T) was 18 %. Therefore, strain KP1-19(T) represents a novel species, for which the name Limnobacter litoralis sp. nov. is proposed. The type strain is KP1-19(T) (=LMG 24869(T) =NBRC 105857(T) =CIP 109929(T)).
Hirose, Masato; Fukiage, Ryuma; Katoh, Toru; Kajihara, Hiroshi
2014-01-01
We describe Phoronis emigi sp. n. as the eighth member of the genus based on specimens collected from a sandy bottom at 33.2 m depth in Tomioka Bay, Amakusa, Japan. The new species is morphologically similar to P. psammophila Cori, 1889, but can be distinguished from the latter by the number of longitudinal muscle bundles in the body wall (56–72 vs. 25–50 in P. psammophila) and the position of the nephridiopores (situated level with the anus vs. lower than the anus in P. psammophila). Using sequences of the nuclear 18S and 28S rRNA genes and the mitochondrial cytochrome c oxidase subunit I (COI) gene, we inferred the relationship of P. emigi to other phoronids by the maximum likelihood method and Bayesian analysis. The analyses showed that P. emigi is closely related to P. hippocrepia Wright, 1856 and P. psammophila Cori, 1889. We describe the morphology of the topotypes and additional material for P. ijimai Oka, 1897. Neither our morphological observations of P. ijimai nor the phylogenetic analyses based on 18S and COI sequences contradict the view that P. vancouverensis Pixell, 1912 is conspecific with P. ijimai, a synonymy that has long been disputed. PMID:24715799
Luco, Sophie; Delmas, Olivier; Vidalain, Pierre-Olivier; Tangy, Frédéric; Weil, Robert; Bourhy, Hervé
2012-01-01
NF-κB transcription factors are crucial for many cellular processes. NF-κB is activated by viral infections to induce expression of antiviral cytokines. Here, we identified a novel member of the human NF-κB family, denoted RelAp43, the nucleotide sequence of which contains several exons as well as an intron of the RelA gene. RelAp43 is expressed in all cell lines and tissues tested and exhibits all the properties of a NF-κB protein. Although its sequence does not include a transactivation domain, identifying it as a class I member of the NF-κB family, it is able to potentiate RelA-mediated transactivation and stabilize dimers comprising p50. Furthermore, RelAp43 stimulates the expression of HIAP1, IRF1, and IFN-β - three genes involved in cell immunity against viral infection. It is also targeted by the matrix protein of lyssaviruses, the agents of rabies, resulting in an inhibition of the NF-κB pathway. Taken together, our data provide the description of a novel functional member of the NF-κB family, which plays a key role in the induction of anti-viral innate immune response. PMID:23271966
Allegrucci, Giuliana; Rampini, Mauro; Di Russo, Claudio; Lana, Enrico; Cocchi, Sara; Sbordoni, Valerio
2014-01-01
The genus Dolichopoda (Orthoptera; Rhaphidophoridae) is present in Italy with 9 species distributed from northwestern Italy (Piedmont and Liguria) to the southernmost Apennines (Calabria), occurring also in the Tyrrhenian coastal areas and in Sardinia. Three morphologically very close taxa have been described in Piedmont and Liguria, i.e., D. ligustica ligustica, D. ligustica septentrionalis and D. azami azami. To investigate the delimitation of the northwestern species of Dolichopoda, we performed both morphological and molecular analyses. Morphological analysis was carried out by considering diagnostic characters generally used to distinguish different taxa, such as the shape of the epiphallus in males and the subgenital plate in females. Molecular analysis was performed by sequencing three mitochondrial genes: 12S rRNA and 16S rRNA, partially sequenced, and the entire COI gene. Results from both morphological and molecular analyses highlighted a very homogeneous group of populations, although genetically structured. Three geographically distributed haplogroups could be distinguished, and based on these results we suggest a new taxonomic arrangement. All populations, due to the priority of description, should be assigned to D. azami azami Saulcy, 1893 and, to preserve the names ligustica and septentrionalis, corresponding to different genetic haplogroups, we assign them to D. azami ligustica stat. n. Baccetti & Capra, 1959 and to D. azami septentrionalis stat. n. Baccetti & Capra, 1959. PMID:25197209
Motato, Karina Edith; Milani, Christian; Ventura, Marco; Valencia, Francia Elena; Ruas-Madiedo, Patricia; Delgado, Susana
2017-12-01
"Suero Costeño" (SC) is a traditional soured cream made from raw milk on the Northern-Caribbean coast of Colombia. The natural microbiota that characterizes this popular Colombian fermented milk is unknown, although several culturing studies have previously been attempted. In this work, the microbiota associated with SC from three manufacturers in two regions, "Planeta Rica" (Córdoba) and "Caucasia" (Antioquia), was analysed by means of culturing methods in combination with high-throughput sequencing and DGGE analysis of 16S rRNA gene amplicons. The bacterial ecosystem of SC samples was revealed to be composed of lactic acid bacteria belonging to the Streptococcaceae and Lactobacillaceae families, with the proportions and genera varying among manufacturers and regions of production. Members of the Lactobacillus acidophilus group, Lactococcus lactis, Streptococcus infantarius and Streptococcus salivarius characterized this artisanal product. In comparison with culturing, the use of in-depth molecular culture-independent techniques provides a more realistic picture of the overall bacterial communities residing in SC. Besides the descriptive purpose, these approaches will facilitate a rational strategy (culture media and growing conditions) for the isolation of indigenous strains that allow standardization in the manufacture of SC. Copyright © 2017 Elsevier Ltd. All rights reserved.
Viver, Tomeu; Orellana, Luis; González-Torres, Pedro; Díaz, Sara; Urdiain, Mercedes; Farías, María Eugenia; Benes, Vladimir; Kaempfer, Peter; Shahinpei, Azadeh; Ali Amoozegar, Mohammad; Amann, Rudolf; Antón, Josefa; Konstantinidis, Konstantinos T; Rosselló-Móra, Ramon
2018-05-01
The application of tandem MALDI-TOF MS screening with 16S rRNA gene sequencing of selected isolates has been demonstrated to be an excellent approach for retrieving novelty from large-scale culturing. The application of such methodologies in different hypersaline samples allowed the isolation of the culture-recalcitrant Salinibacter ruber second phylotype (EHB-2) for the first time, as well as a new species recently isolated from the Argentinian Altiplano hypersaline lakes. In this study, the genome sequences of the different species of the phylum Rhodothermaeota were compared and the genetic repertoire along the evolutionary gradient was analyzed together with each intraspecific variability. Altogether, the results indicated an open pan-genome for the family Salinibacteraceae, as well as the codification of relevant traits such as diverse rhodopsin genes, CRISPR-Cas systems and spacers, and one T6SS secretion system that could give ecological advantages to an EHB-2 isolate. For the new Salinibacter species, we propose the name Salinibacter altiplanensis sp. nov. (the designated type strain is AN15 T =CECT 9105 T =IBRC-M 11031 T ). Copyright © 2018 Elsevier GmbH. All rights reserved.
MIPS: curated databases and comprehensive secondary data resources in 2010
Mewes, H. Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F.X.; Stümpflen, Volker; Antonov, Alexey
2011-01-01
The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38 000 000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de). PMID:21109531
MIPS: analysis and annotation of proteins from whole genomes in 2005
Mewes, H. W.; Frishman, D.; Mayer, K. F. X.; Münsterkötter, M.; Noubibou, O.; Pagel, P.; Rattei, T.; Oesterheld, M.; Ruepp, A.; Stümpflen, V.
2006-01-01
The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system is maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene-centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information, enabling the representation of complex information such as functional modules or networks (Genome Research Environment System); (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix, and the CABiNet network analysis framework; and (iii) the compilation and manual annotation of information related to interactions such as protein–protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de). PMID:16381839
Liu, Weiwei; Yi, Zhenzhen; Xu, Dapeng; Clamp, John C; Li, Jiqiu; Lin, Xiaofeng; Song, Weibo
2015-01-01
Oligotrich ciliates are common marine microplankters, but their biodiversity and evolutionary relationships have not been well-documented. Morphological descriptions and small subunit rRNA gene sequences of two new species representing two new strombidiid genera, Sinistrostrombidium cupiformum gen. nov., sp. nov. and Antestrombidium agathae gen. nov., sp. nov. are presented, and their taxonomy and molecular phylogeny are analyzed. Sinistrostrombidium gen. nov. is characterized by a sinistrally spiraled girdle kinety and a longitudinal ventral kinety. Antestrombidium gen. nov. is distinguished by tripartite somatic kineties (circular and ventral kineties plus dextrally spiraled girdle kinety). Sinistrostrombidium and Antestrombidium branched separately from one another in phylogenetic trees, clustering with different clades of strombidiids. The new genera added to the diversities of ciliary patterns and small subunit rRNA gene sequences in strombidiids leads to presentation of a new hypothesis about evolution of the 12 known strombidiid genera, based on ciliary pattern and partly supported by molecular evidence. In addition, our new morphological and molecular analyses support establishment of a new order Lynnellida ord. nov., characterized by an open adoral zone of membranelles without differentiation of anterior and ventral membranelles, for Lynnella, but we remain unable to assign the genus to a subclass with confidence. PMID:26121340
Sequencing and comparing whole mitochondrial genomes of animals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica
2005-04-22
Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.
Nucleotide sequence of the gag gene and gag-pol junction of feline leukemia virus.
Laprevotte, I; Hampe, A; Sherr, C J; Galibert, F
1984-01-01
The nucleotide sequence of the gag gene of feline leukemia virus and its flanking sequences were determined and compared with the corresponding sequences of two strains of feline sarcoma virus and with that of the Moloney strain of murine leukemia virus. A high degree of nucleotide sequence homology between the feline leukemia virus and murine leukemia virus gag genes was observed, suggesting that retroviruses of domestic cats and laboratory mice have a common, proximal evolutionary progenitor. The predicted structure of the complete feline leukemia virus gag gene precursor suggests that the translation of nonglycosylated and glycosylated gag gene polypeptides is initiated at two different AUG codons. These initiator codons fall in the same reading frame and are separated by a 222-base-pair segment which encodes an amino terminal signal peptide. The nucleotide sequence predicts the order of amino acids in each of the individual gag-coded proteins (p15, p12, p30, p10), all of which derive from the gag gene precursor. Stable stem-and-loop secondary structures are proposed for two regions of viral RNA. The first falls within sequences at the 5' end of the viral genome, together with adjacent palindromic sequences which may play a role in dimer linkage of RNA subunits. The second includes coding sequences at the gag-pol junction and is proposed to be involved in translation of the pol gene product. Sequence analysis of the latter region shows that the gag and pol genes are translated in different reading frames. Classical consensus splice donor and acceptor sequences could not be localized to regions which would permit synthesis of the expected gag-pol precursor protein. Alternatively, we suggest that the pol gene product (RNA-dependent DNA polymerase) could be translated by a frameshift suppressing mechanism which could involve cleavage modification of stems and loops in a manner similar to that observed in tRNA processing. PMID:6328019
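The reading-frame statement in this abstract comes down to simple arithmetic: two start codons share a frame when the distance between them is a multiple of three. The following is a minimal sketch of that check, not the authors' analysis. The absolute coordinates are hypothetical, since the abstract reports only the 222-base-pair spacing.

```python
def same_reading_frame(first_start: int, second_start: int) -> bool:
    """Return True if two codon start positions lie in the same reading frame."""
    return (second_start - first_start) % 3 == 0


# Hypothetical coordinates: only the 222-bp spacing is taken from the abstract.
first_aug = 100
second_aug = first_aug + 222

in_frame = same_reading_frame(first_aug, second_aug)
codons_between = (second_aug - first_aug) // 3
print(in_frame, codons_between)
```

The same test, applied to a gag stop codon and a pol start, would return False for genes translated in different frames, as the abstract reports for the gag-pol junction.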
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kamb, A.; Weir, M.; Rudy, B.
1989-06-01
The study of gene family members has been aided by the isolation of related genes on the basis of DNA homology. The authors have adapted the polymerase chain reaction to screen animal genomes very rapidly and reliably for likely gene family members. Using conserved amino acid sequences to design degenerate oligonucleotide primers, they have shown that the genome of the nematode Caenorhabditis elegans contains sequences homologous to many Drosophila genes involved in pattern formation, including the segment polarity gene wingless (vertebrate int-1), and homeobox sequences characteristic of the Antennapedia, engrailed, and paired families. In addition, they have used this method to show that C. elegans contains at least five different sequences homologous to genes in the tyrosine kinase family. Lastly, they have isolated six potassium channel sequences from humans, a result that validates the utility of the method with large genomes and suggests that human potassium channel gene diversity may be extensive.
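Designing degenerate primers from a conserved amino acid motif involves a trade-off: the more codons each residue can use, the larger the pool of oligonucleotides needed to cover every possible encoding. The sketch below counts the distinct nucleotide sequences that encode a peptide under the standard genetic code. The function name and the example motifs are hypothetical and are not taken from the study.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def encoding_sequence_count(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode the given peptide."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# Hypothetical motifs, chosen only to contrast low and high degeneracy.
low = encoding_sequence_count("MWDYK")   # 1 * 1 * 2 * 2 * 2
high = encoding_sequence_count("LSRLS")  # 6 * 6 * 6 * 6 * 6
print(low, high)
```

Motifs rich in methionine and tryptophan keep the count small, which is one reason such residues are commonly favored when choosing primer sites.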
Petroni, Roberta Cardoso; da Rosa, Susana Elaine Alves; de Carvalho, Flavia Pereira; Santana, Rúbia Anita Ferraz; Hyppolito, Joyce Esteves; Nascimento, Claudia Mac Donald Bley; Hamerschlak, Nelson; Campregher, Paulo Vidal
2017-01-01
Hereditary hyperferritinemia-cataract syndrome is an autosomal dominant genetic disorder associated with mutations in the 5’UTR region of the ferritin light chain gene. These mutations cause the ferritin levels to increase even in the absence of iron overload. Patients also develop bilateral cataract early due to accumulation of ferritin in the lens, and many are misdiagnosed as having hemochromatosis and thus not properly treated. The first cases were described in 1995 and several mutations have already been identified. However, this syndrome is still poorly understood. We report two cases from unrelated Brazilian families with clinical suspicion of the syndrome, which were treated in our department. For the definitive diagnosis, the affected patients, their parents and siblings underwent Sanger sequencing of the 5’UTR region for detection of the ferritin light chain gene mutation. Previously described single nucleotide polymorphism-like mutations were found in the affected patients. The test assisted in making the accurate diagnosis of the disease, and its description is important so that the test can be incorporated into clinical practice. PMID:28746593
Expression cloning of human B cell immunoglobulins.
Wardemann, Hedda; Kofer, Juliane
2013-01-01
The majority of lymphomas originate from B cells at the germinal center stage or beyond. Preferential selection of B cell clones by a limited set of antigens has been suggested to drive lymphoma development. However, little is known about the specificity of the antibodies expressed by lymphoma cells, and the role of antibody-specificity in lymphomagenesis remains elusive. Here, we describe a strategy to characterize the antibody reactivity of human B cells. The approach allows the unbiased characterization of the human antibody repertoire on a single cell level through the generation of recombinant monoclonal antibodies from single primary human B cells of defined origin. This protocol offers a detailed description of the method starting from the flow cytometric isolation of single human B cells, to the RT-PCR-based amplification of the expressed Igh, Igκ, and Igλ chain genes, and Ig gene expression vector cloning for the in vitro production of monoclonal antibodies. The strategy may be used to obtain information on the clonal evolution of B cell lymphomas by single cell Ig gene sequencing and on the antibody reactivity of human lymphoma B cells.
Description of Kribbella italica sp. nov., isolated from a Roman catacomb.
Everest, Gareth J; Curtis, Sarah M; De Leo, Filomena; Urzì, Clara; Meyers, Paul R
2015-02-01
A novel actinobacterium, strain BC637(T), was isolated from a biodeteriogenic biofilm sample collected in 2009 in the Saint Callixstus Roman catacomb. The strain was found to belong to the genus Kribbella by analysis of the 16S rRNA gene. Phylogenetic analysis using the 16S rRNA gene and the gyrB, rpoB, relA, recA and atpD concatenated gene sequences showed that strain BC637(T) was most closely related to the type strains of Kribbella lupini and Kribbella endophytica. DNA-DNA hybridization experiments confirmed that strain BC637(T) is a genomic species that is distinct from its closest phylogenetic relatives, K. endophytica DSM 23718(T) (63 % DNA relatedness) and K. lupini LU14(T) (63 % DNA relatedness). Physiological comparisons showed that strain BC637(T) is phenotypically distinct from the type strains of K. endophytica and K. lupini. Thus, strain BC637(T) represents the type strain of a novel species, for which the name Kribbella italica sp. nov. is proposed ( = DSM 28967(T) = NRRL B-59155(T)). © 2015 IUMS.
Kouvelis, Vassili N; Sialakouma, Aphrodite; Typas, Milton A
2008-07-01
The recent revision of Verticillium sect. Prostrata led to the introduction of the genus Lecanicillium, which comprises the majority of the entomopathogenic strains. Sixty-five strains previously classified as Verticillium lecanii or Verticillium sp. from different geographical regions and hosts were examined and their phylogenetic relationships were determined using sequences from three mitochondrial (mt) genes [the small rRNA subunit (rns), the NADH dehydrogenase subunits 1 (nad1) and 3 (nad3)] and the ITS region. In general, single gene phylogenetic trees differentiated and placed the strains examined in well-supported (by BS analysis) groups of L. lecanii, L. longisporum, L. muscarium, and L. nodulosum, although in some cases a few uncertainties still remained. nad1 was the most informative single gene in phylogenetic analyses and was also found to contain group I introns with putative open reading frames (ORFs) encoding for GIY-YIG endonucleases. The combined use of mt gene sequences resolved taxonomic uncertainties arisen from ITS analysis and, alone or in combination with ITS sequences, helped in placing uncharacterised Verticillium lecanii and Verticillium sp. firmly into Lecanicillium species. Combined gene data from all the mt genes and all the mt genes and the ITS region together, were very similar. Furthermore, a relaxed correlation with host specificity -- at least for Homoptera -- was indicated for the rns and the combined mt gene sequences. Thus, the usefulness of mt gene sequences as a convenient molecular tool in phylogenetic studies of entomopathogenic fungi was demonstrated.
Krinitsyna, A A; Mel'nikova, N V; Belenikin, M S; Poltronieri, P; Santino, A; Kudriavtseva, A V; Savilova, A M; Speranskaia, A S
2013-01-01
Kunitz-type proteinase inhibitor proteins of group A (KPI-A) are involved in the protection of potato plants from pathogens and pests. Although sequences of a large number of the KPI-A genes from different species of cultivated potato (Solanum tuberosum subsp. tuberosum) and a few genes from tomato (Solanum lycopersicum) are known to date, information about the allelic diversity of these genes in other species of the genus Solanum is lacking. In our work, the consensus sequences of the KPI-A genes were established in two species of subgenus Potatoe sect. Petota (Solanum tuberosum subsp. andigenum--5 genes and Solanum stoloniferum--2 genes) and in the subgenus Solanum (Solanum nigrum--5 genes) by amplification, cloning, sequencing and subsequent analysis. The determined sequences of KPI-A genes were 97-100% identical to known sequences of the cultivated potato of sect. Petota (cultivated potato Solanum tuberosum subsp. tuberosum) and sect. Etuberosum (S. palustre). The interspecific variability of these genes did not exceed the intraspecific variability for all studied species except Solanum lycopersicum. The distribution of highly variable and conserved sequences in the mature protein-encoding regions was uniform for all investigated KPI-A genes. However, our attempts to amplify the homologous genes using the same primers and the genomes of Solanum dulcamarum, Solanum lycopersicum and Mandragora officinarum resulted in no product formation. Phylogenetic analysis of KPI-A diversity showed that the sequences of S. lycopersicum form an independent cluster, whereas KPI-A of S. nigrum and species of sect. Etuberosum and sect. Petota are closely related and do not form species-specific subclusters. Although Solanum nigrum is resistant to all known races of the oomycete Phytophthora infestans, which causes one of the most economically important diseases of solanaceous plants, the amino acid sequences encoded by KPI-A genes from its genome show few or no differences from those of cultivated potatoes, which are affected by P. infestans.
Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags
de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.
2000-01-01
Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html). PMID:11070084
Minson, A C; Darby, G K; Wildy, P
1979-11-01
Two independently derived cell lines which carry the herpes simplex type 2 thymidine kinase gene have been examined for the presence of HSV-2-specific DNA sequences. Both cell lines contained 1 to 3 copies per cell of a sequence lying within map co-ordinates 0.2 to 0.4 of the HSV-2 genome. Revertant cells, which contained no detectable thymidine kinase, did not contain this DNA sequence. The failure of EcoR1-restricted HSV-2 DNA to act as a donor of the thymidine kinase gene in transformation experiments suggests that the gene lies close to the EcoR1 restriction site within this sequence at a map position of approx. 0.3. The HSV-2 kinase gene is therefore approximately co-linear with the HSV-1 gene.
Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I
2013-01-01
Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.
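The 21-nucleotide criterion in the abstract above can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the RNA alphabet, and the exact-match k-mer lookup are assumptions made only for illustration. It reports every position in one strand of a long dsRNA where a 21-nt window is perfectly complementary to some stretch of a target transcript.

```python
# Minimal sketch (hypothetical helper names): find windows of a dsRNA strand
# that are perfectly complementary to a target mRNA over k nucleotides.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def complementary_windows(dsrna_strand: str, target_mrna: str, k: int = 21) -> list[int]:
    """Return start positions in dsrna_strand whose k-nt window pairs perfectly
    with some k-nt stretch of target_mrna."""
    # Index every k-mer of the target once so each window is a set lookup.
    target_kmers = {target_mrna[i:i + k] for i in range(len(target_mrna) - k + 1)}
    hits = []
    for i in range(len(dsrna_strand) - k + 1):
        window = dsrna_strand[i:i + k]
        # A window pairs with the target if its reverse complement occurs there.
        if reverse_complement(window) in target_kmers:
            hits.append(i)
    return hits
```

A real screen would also need to handle both strands of the dsRNA and ambiguous bases; this sketch deliberately omits both.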
Chen, H T; Alexander, C B; Mage, R G
1995-06-15
Normal rabbits preferentially rearrange the 3'-most VH gene, VH1, to encode Igs with VHa allotypes, which constitute the majority of rabbit serum Igs. A gene conversion-like mechanism is employed to diversify the primary Ab repertoire. In mutant Alicia rabbits that derived from a rabbit with VHa2 allotype, the VH1 gene was deleted. Our previous studies showed that the first functional gene (VH4) or VH4-like genes were rearranged in 2- to 8-wk-old homozygous Alicia. The VH1a2-like sequences that were found in splenic mRNA from 6-wk and older Alicia rabbits still had some residues that were typical of VH4. The appearances of sequences resembling that of VH1a2 may have been caused by gene conversions that altered the sequences of the rearranged VH or there may have been rearrangement of upstream VH1a2-like genes later in development. To investigate this further, we constructed a cosmid library and isolated a VH1a2-like gene, VH12-1-6, with a sequence almost identical to VH1a2. This gene had a deleted base in the heptamer of its recombination signal sequence. However, even if this defect diminished or eliminated its ability to rearrange, the a2-like gene could have acted as a donor for gene-conversion-like alteration of rearranged VH genes. Sequence comparisons suggested that this gene or a gene like it could have acted as a donor for gene conversion in mutant Alicia and in normal rabbits.
Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred
2014-11-20
Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition to this, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.
Glynn, Neil C; Comstock, Jack C; Sood, Sushma G; Dang, Phat M; Chaparro, Jose X
2008-01-01
Resistance gene analogues (RGAs) have been isolated from many crops and offer potential in breeding for disease resistance through marker-assisted selection, either as closely linked or as perfect markers. Many R-gene sequences contain kinase domains, and indeed kinase genes have been reported as being proximal to R-genes, making kinase analogues an additionally promising target. The first step towards utilizing RGAs as markers for disease resistance is isolation and characterization of the sequences. Sugarcane clone US01-1158 was identified as resistant to yellow leaf caused by the sugarcane yellow leaf virus (SCYLV) and moderately resistant to rust caused by Puccinia melanocephala Sydow & Sydow. Degenerate primers that had previously proved useful for isolating RGAs and kinase analogues in wheat and soybean were used to amplify DNA from sugarcane (Saccharum spp.) clone US01-1158. Sequences generated from 1512 positive clones were assembled into 134 contigs of between two and 105 sequences. Comparison of the contig consensuses with the NCBI sequence database using BLASTx showed that 20 had sequence homology to nucleotide binding site and leucine rich repeat (NBS-LRR) RGAs, and eight to kinase genes. Alignment of the deduced amino acid sequences with similar sequences from the NCBI database allowed the identification of several conserved domains. The alignment and resulting phenetic tree showed that many of the sequences had greater similarity to sequences from other species than to one another. The use of degenerate primers is a useful method for isolating novel sugarcane RGA and kinase gene analogues. Further studies are needed to evaluate the role of these genes in disease resistance.
Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua
2017-02-01
In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified from 58 miRNAs with Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. Of the miRNA sequences, 55 were verified with a poly(A) tailing test, a success rate of 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression level of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 groups of miRNAs and targets were found to be consistent with the model in which a miRNA negatively controls the expression of its target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Auernik, Kathryne S.; Maezato, Yukari; Blum, Paul H.; Kelly, Robert M.
2008-02-01
Despite their taxonomic description, not all members of the order Sulfolobales are capable of oxidizing reduced sulfur species, which, in addition to iron oxidation, is a desirable trait of biomining microorganisms. However, the complete genome sequence of the extremely thermoacidophilic archaeon Metallosphaera sedula DSM 5348 (2.2 Mb, ∼2,300 open reading frames [ORFs]) provides insights into biologically catalyzed metal sulfide oxidation. Comparative genomics was used to identify pathways and proteins involved (directly or indirectly) with bioleaching. As expected, the M. sedula genome contains genes related to autotrophic carbon fixation, metal tolerance, and adhesion. Also, terminal oxidase cluster organization indicates the presence of hybrid quinol-cytochrome oxidase complexes. Comparisons with the mesophilic biomining bacterium Acidithiobacillus ferrooxidans ATCC 23270 indicate that the M. sedula genome encodes at least one putative rusticyanin, involved in iron oxidation, and a putative tetrathionate hydrolase, implicated in sulfur oxidation. The fox gene cluster, involved in iron oxidation in the thermoacidophilic archaeon Sulfolobus metallicus, was also identified. These iron- and sulfur-oxidizing components are missing from genomes of nonleaching members of the Sulfolobales, such as Sulfolobus solfataricus P2 and Sulfolobus acidocaldarius DSM 639. Whole-genome transcriptional response analysis showed that 88 ORFs were up-regulated twofold or more in M. sedula upon addition of ferrous sulfate to yeast extract-based medium; these included genes for components of terminal oxidase clusters predicted to be involved with iron oxidation, as well as genes predicted to be involved with sulfur metabolism. Many hypothetical proteins were also differentially transcribed, indicating that aspects of the iron and sulfur metabolism of M. sedula remain to be identified and characterized. PMID:18083856
Drancourt, M
2012-03-01
With plague being not only a subject of interest for historians, but still a disease of public health concern in several countries, mainly in Africa, there were hopes that analyses of the Yersinia pestis genomes would put an end to this deadly epidemic pathogen. Genomics revealed that Y. pestis isolates evolved from Yersinia pseudotuberculosis in Central Asia some millennia ago, after the acquisition of two Y. pestis-specific plasmids balanced genomic reduction parallel with the expansion of insertion sequences, illustrating the modern concept that, except for the acquisition of plasmid-borne toxin-encoding genes, the increased virulence of Y. pestis resulted from gene loss rather than gene acquisition. The telluric persistence of Y. pestis reminds us of this close relationship, and matters in terms of plague epidemiology. Whereas biotype Orientalis isolates spread worldwide, the Antiqua and Medievalis isolates showed more limited expansion. In addition to animal ectoparasites, human ectoparasites such as the body louse may have participated in this expansion and in devastating historical epidemics. The recent analysis of a Black Death genome indicated that it was more closely related to the Orientalis branch than to the Medievalis branch. Modern Y. pestis isolates grossly exhibit the same gene content, but still undergo micro-evolution in geographically limited areas by differing in the genome architecture, owing to inversions near insertion sequences and the stabilization of the YpfPhi prophage in Orientalis biotype isolates. Genomics have provided several new molecular tools for the genotyping and phylogeographical tracing of isolates and description of plague foci. However, genomics and post-genomics approaches have not yet provided new tools for the prevention, diagnosis and management of plague patients and the plague epidemics still raging in some sub-Saharan countries. © 2012 The Author. 
Clinical Microbiology and Infection © 2012 European Society of Clinical Microbiology and Infectious Diseases.
Application of industrial scale genomics to discovery of therapeutic targets in heart failure.
Mehraban, F; Tomlinson, J E
2001-12-01
In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.
Mumps virus F gene and HN gene sequencing as a molecular tool to study mumps virus transmission.
Gouma, Sigrid; Cremer, Jeroen; Parkkali, Saara; Veldhuijzen, Irene; van Binnendijk, Rob S; Koopmans, Marion P G
2016-11-01
Various mumps outbreaks have occurred in the Netherlands since 2004, particularly among persons who had received 2 doses of measles, mumps, and rubella (MMR) vaccination. Genomic typing of pathogens can be used to track outbreaks, but the established genotyping of mumps virus based on the small hydrophobic (SH) gene sequences did not provide sufficient resolution. Therefore, we expanded the sequencing to include fusion (F) gene and haemagglutinin-neuraminidase (HN) gene sequences in addition to the SH gene sequences from 109 mumps virus genotype G strains obtained between 2004 and mid 2015 in the Netherlands. When the molecular information from these 3 genes was combined, we were able to identify separate mumps virus clusters and track mumps virus transmission. The analyses suggested that multiple mumps virus introductions occurred in the Netherlands between 2004 and 2015 resulting in several mumps outbreaks throughout this period, whereas during some local outbreaks the molecular data pointed towards endemic circulation. Combined analysis of epidemiological data and sequence data collected in 2015 showed good support for the phylogenetic clustering. Copyright © 2016 Elsevier B.V. All rights reserved.
Esmaeili Rastaghi, Ahmad Reza; Spotin, Adel; Khataminezhad, Mohammad Reza; Jafarpour, Mostafa; Alaeenovin, Elnaz; Najafzadeh, Narmin; Samei, Neda; Taleshi, Neda; Mohammadi, Somayeh; Parvizi, Parviz
2017-10-01
Leishmaniasis, an emerging and reemerging disease, is increasing worldwide, with high prevalence and new incidence in recent years. For epidemiological investigation and accurate identification of Leishmania species, three nuclear and mitochondrial genes (ITS-rDNA, Hsp70, and Cyt b) were employed and analyzed from clinical samples in three important Zoonotic Cutaneous Leishmaniasis (ZCL) foci of Iran. In this cross-sectional/descriptive study conducted in 2014-15, serous smears of lesions were prepared directly from suspected ZCL patients in Turkmen in the northeast, Abarkouh in the center and Shush district in the southwest of Iran, and DNA was extracted. Two nuclear genes (ITS-rDNA and Hsp70) and one mitochondrial gene (Cyt b) of Leishmania parasites were amplified. RFLP was performed on PCR-positive samples. PCR products were sequenced, aligned and edited with Sequencher 4.1.4, and phylogenetic analyses were performed using MEGA 5.05 software. Overall, 203 out of 360 clinical samples from suspected patients were Leishmania positive using routine laboratory methods and 231 samples were positive by molecular techniques. L. major, L. tropica, and L. turanica were firmly identified by employing different genes and phylogenetic analyses. By combining different genes, Leishmania parasites were identified accurately. The sensitivity and specificity of the three genes were evaluated and showed advantages over routine laboratory methods. The ITS-rDNA gene is more appropriate for firm identification of Leishmania species.
Yoshida, Naoto; Shimura, Hanako; Masuta, Chikara
2018-06-01
Allexiviruses are economically important garlic viruses that are involved in garlic mosaic diseases. In this study, we characterized the allexivirus cysteine-rich protein (CRP) gene located just downstream of the coat protein (CP) gene in the viral genome. We determined the nucleotide sequences of the CP and CRP genes from numerous allexivirus isolates and performed a phylogenetic analysis. According to the resulting phylogenetic tree, we found that allexiviruses were clearly divided into two major groups (group I and group II) based on the sequences of the CP and CRP genes. In addition, the allexiviruses in group II had distinct sequences just before the CRP gene, while group I isolates did not. The inserted sequence between the CP and CRP genes was partially complementary to garlic 18S rRNA. Using a potato virus X vector, we showed that the CRPs affected viral accumulation and symptom induction in Nicotiana benthamiana, suggesting that the allexivirus CRP is a pathogenicity determinant. We assume that the inserted sequences before the CRP gene may have been generated during viral evolution to alter the termination-reinitiation mechanism for coupled translation of CP and CRP.
Recognition of Yeast Species from Gene Sequence Comparisons
USDA-ARS's Scientific Manuscript database
This review discusses recognition of yeast species from gene sequence comparisons, which have been responsible for doubling the number of known yeasts over the past decade. The resolution provided by various single gene sequences is examined for both ascomycetous and basidiomycetous species, and th...
Cloning and characterization of two novel DNases from Streptococcus pyogenes.
Hasegawa, Tadao; Torii, Keizo; Hashikawa, Shinnosuke; Iinuma, Yoshitsugu; Ohta, Michio
2002-06-01
The proteins in the culture supernatant (exoproteins) from Streptococcus pyogenes serotype M1 were separated by two-dimensional gel electrophoresis, and their N-terminal amino acid sequences were determined. The amino acid sequences were compared to sequences in the S. pyogenes genome database. The coding sequence showed similarity to sequences of two genes, mf2-v (mf2 variant) and mf3, which had sequence similarity to genes encoding mitogenic factor (MF); MF has DNase activity. The recombinant genes were expressed in Escherichia coli and the proteins were synthesized. Mf2-v and Mf3 had DNase activity. The activity of Mf2-v was localized to the C-terminal half of the protein. The mf3 gene was shown to be present in most clinically isolated strains of S. pyogenes tested, and the mf2 gene was detected in 20% of the isolates. The products of the mf2 and mf3 genes in clinically isolated S. pyogenes strains were thus shown to be DNases.
Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W
1998-08-01
The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
Wesener, Thomas; Le, Daniel Minh-Tu; Loria, Stephanie F.
2014-01-01
The Malagasy giant pill-millipede genus Sphaeromimus de Saussure & Zehntner, 1902 is revised. Seven new species, S. titanus sp. n., S. vatovavy sp. n., S. lavasoa sp. n., S. andohahela sp. n., S. ivohibe sp. n., S. saintelucei sp. n., and S. andrahomana sp. n. were discovered, in one case with the help of sequence data, in the rainforests of southeastern Madagascar. The species are described using light- and scanning electron microscopy. A key to all 10 species of the genus is presented. All but one (S. andohahela) of the newly discovered species are microendemics, each occurring in isolated forest fragments. The mitochondrial COI barcoding gene was amplified and sequenced for 18 Sphaeromimus specimens, and a dataset containing COI sequences of 28 specimens representing all Sphaeromimus species (except S. vatovavy) was analyzed. All species are genetically monophyletic. Interspecific uncorrected genetic distances were moderate (4–10%) to high (18–25%), whereas intraspecific variation was low (0–3.5%). Sequence data allowed the correct identification of three colour morphs of S. musicus, as well as the identity of a cave specimen, which although aberrant in its morphology and colouration, was genetically identical to the holotype of S. andrahomana. PMID:25009417
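The "uncorrected genetic distances" quoted above are conventionally p-distances: the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming pre-aligned sequences of equal length and skipping gap or N positions; the function name and the gap handling are illustrative choices, not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences:
    differing sites divided by compared sites (gaps and N are skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # site is uninformative in at least one sequence
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared
```

Under this definition, two COI fragments that differ at 66 of 660 compared sites would have a distance of 0.10, the upper end of the "moderate" interspecific range reported above.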
Palaniappan, Krishna; Meier-Kolthoff, Jan P.; Teshima, Hazuki; Nolan, Matt; Lapidus, Alla; Tice, Hope; Del Rio, Tijana Glavina; Cheng, Jan-Fang; Han, Cliff; Tapia, Roxanne; Goodwin, Lynne A.; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Pagani, Ioanna; Ivanova, Natalia; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Rohde, Manfred; Mayilraj, Shanmugam; Spring, Stefan; Detter, John C.; Göker, Markus; Bristow, James; Eisen, Jonathan A.; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C.; Klenk, Hans-Peter; Woyke, Tanja
2013-10-16
Thermanaerovibrio velox Zavarzina et al. 2000 is a member of the Synergistaceae, a family in the phylum Synergistetes that is already well-characterized at the genome level. Members of this phylum were described as Gram-negative staining anaerobic bacteria with a rod/vibrioid cell shape and possessing an atypical outer cell envelope. They inhabit a large variety of anaerobic environments including soil, oil wells, wastewater treatment plants and animal gastrointestinal tracts. They are also found to be linked to sites of human diseases such as cysts, abscesses, and areas of periodontal disease. The moderately thermophilic and organotrophic T. velox shares most of its morphologic and physiologic features with the closely related species, T. acidaminovorans. In addition to Su883T, the type strain of T. acidaminovorans, strain Z-9701T is the second type strain in the genus Thermanaerovibrio to have its genome sequence published. Here we describe the features of this organism, together with the non-contiguous genome sequence and annotation. The 1,880,838 bp long chromosome (non-contiguous finished sequence) with its 1,751 protein-coding and 59 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. PMID:24501645
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius
Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.
2010-01-01
Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
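The ORF criterion mentioned above (contigs with an ORF longer than 300 bp) can be sketched as follows. This is only an illustration under stated assumptions: an ORF is taken to run from an ATG to the first in-frame stop codon, only the three forward frames are scanned, and the standard stop codons are used. The abstract does not say how the authors defined or searched for ORFs, and `longest_orf_length` is a hypothetical name.

```python
# Standard-code stop codons (assumption: nuclear standard genetic code).
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length in bp (start and stop codons included) of the longest
    ATG-to-stop ORF in the three forward reading frames; 0 if none."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the current in-frame ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                best = max(best, i + 3 - start)
                start = None  # look for the next ORF in this frame
    return best
```

A contig would then pass the filter when `longest_orf_length(contig) > 300`. Because EST contigs are often partial, a practical analysis might also accept ORFs that run off either end of the sequence, which this sketch does not do.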
Identification of Genetic Elements Associated with EPSPS Gene Amplification
Gaines, Todd A.; Wright, Alice A.; Molin, William T.; Lorentz, Lothar; Riggins, Chance W.; Tranel, Patrick J.; Beffa, Roland; Westra, Philip; Powles, Stephen B.
2013-01-01
Weed populations can have high genetic plasticity and rapid responses to environmental selection pressures. For example, 100-fold amplification of the 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) gene evolved in the weed species Amaranthus palmeri to confer resistance to glyphosate, the world’s most important herbicide. However, the gene amplification mechanism is unknown. We sequenced the EPSPS gene and genomic regions flanking EPSPS loci in A. palmeri, and searched for mobile genetic elements or repetitive sequences. The EPSPS gene was 10,229 bp, containing 8 exons and 7 introns. The gene amplification likely proceeded through a DNA-mediated mechanism, as introns exist in the amplified gene copies and the entire amplified sequence is at least 30 kb in length. Our data support the presence of two EPSPS loci in susceptible (S) A. palmeri, and that only one of these was amplified in glyphosate-resistant (R) A. palmeri. The EPSPS gene amplification event likely occurred recently, as no sequence polymorphisms were found within introns of amplified EPSPS copies from R individuals. Sequences with homology to miniature inverted-repeat transposable elements (MITEs) were identified next to EPSPS gene copies only in R individuals. Additionally, a putative Activator (Ac) transposase and a repetitive sequence region were associated with amplified EPSPS genes. The mechanism controlling this DNA-mediated amplification remains unknown. Further investigation is necessary to determine if the gene amplification may have proceeded via DNA transposon-mediated replication, and/or unequal recombination between different genomic regions resulting in replication of the EPSPS gene. PMID:23762434
Evolution dynamics of a model for gene duplication under adaptive conflict
NASA Astrophysics Data System (ADS)
Ancliff, Mark; Park, Jeong-Man
2014-06-01
We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
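As a rough illustration of the kind of fitness landscape this abstract describes, the sketch below computes Hamming distances from a sequence to two reference sequences and combines them into a single fitness value. The function names, the binary alphabet, and the linear form of the fitness are assumptions made for illustration only; the abstract does not state the functional form the authors used.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def fitness(seq, ref1, ref2):
    """Hypothetical fitness: mean closeness to the two reference sequences.

    Assumption (not from the abstract): each gene function contributes
    linearly as 1 - d/L, where d is the Hamming distance and L the length.
    """
    length = len(seq)
    closeness1 = 1 - hamming(seq, ref1) / length
    closeness2 = 1 - hamming(seq, ref2) / length
    return 0.5 * (closeness1 + closeness2)
```

With maximally distant references, a single sequence that matches one reference exactly is as far as possible from the other, which is one simple way to picture the adaptive conflict faced by a single-copy pleiotropic gene.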
Hodgetts, Jennifer; Boonham, Neil; Mumford, Rick; Harrison, Nigel; Dickinson, Matthew
2008-08-01
Phytoplasma phylogenetics has focused primarily on sequences of the non-coding 16S rRNA gene and the 16S-23S rRNA intergenic spacer region (16-23S ISR), and primers that enable amplification of these regions from all phytoplasmas by PCR are well established. In this study, primers based on the secA gene have been developed into a semi-nested PCR assay that results in a sequence of the expected size (about 480 bp) from all 34 phytoplasmas examined, including strains representative of 12 16Sr groups. Phylogenetic analysis of secA gene sequences showed similar clustering of phytoplasmas when compared with clusters resolved by similar sequence analyses of a 16-23S ISR-23S rRNA gene contig or of the 16S rRNA gene alone. The main differences between trees were in the branch lengths, which were elongated in the 16-23S ISR-23S rRNA gene tree when compared with the 16S rRNA gene tree and elongated still further in the secA gene tree, despite this being a shorter sequence. The improved resolution in the secA gene-derived phylogenetic tree resulted in the 16SrII group splitting into two distinct clusters, while phytoplasmas associated with coconut lethal yellowing-type diseases split into three distinct groups, thereby supporting past proposals that they represent different candidate species within 'Candidatus Phytoplasma'. The ability to differentiate 16Sr groups and subgroups by virtual RFLP analysis of secA gene sequences suggests that this gene may provide an informative alternative molecular marker for pathogen identification and diagnosis of phytoplasma diseases.
Targeted Re-Sequencing Emulsion PCR Panel for Myopathies: Results in 94 Cases.
Punetha, Jaya; Kesari, Akanchha; Uapinyoying, Prech; Giri, Mamta; Clarke, Nigel F; Waddell, Leigh B; North, Kathryn N; Ghaoui, Roula; O'Grady, Gina L; Oates, Emily C; Sandaradura, Sarah A; Bönnemann, Carsten G; Donkervoort, Sandra; Plotz, Paul H; Smith, Edward C; Tesi-Rocha, Carolina; Bertorini, Tulio E; Tarnopolsky, Mark A; Reitter, Bernd; Hausmanowa-Petrusewicz, Irena; Hoffman, Eric P
2016-05-27
Molecular diagnostics in the genetic myopathies often requires testing of the largest and most complex transcript units in the human genome (DMD, TTN, NEB). Iteratively targeting single genes for sequencing has traditionally entailed high costs and long turnaround times. Exome sequencing has begun to supplant single targeted genes, but there are concerns regarding coverage and needed depth of the very large and complex genes that frequently cause myopathies. We aimed to evaluate the efficiency of next-generation sequencing technologies in providing molecular diagnostics for patients with previously undiagnosed myopathies. We tested a targeted re-sequencing approach, using a 45 gene emulsion PCR myopathy panel, with subsequent sequencing on the Illumina platform in 94 undiagnosed patients. We compared the targeted re-sequencing approach to exome sequencing for 10 of these patients. We detected likely pathogenic mutations in 33 of 94 patients, a molecular diagnostic rate of approximately 35%. The remaining patients showed variants of unknown significance (35/94 patients) or no mutations detected in the 45 genes tested (26/94 patients). Mutation detection rates were similar for targeted re-sequencing and whole exome sequencing; however, exome sequencing showed better distribution of reads and fewer exon dropouts. Given that costs of highly parallel re-sequencing and whole exome sequencing are similar, and that exome sequencing now takes considerably less laboratory processing time than targeted re-sequencing, we recommend exome sequencing as the standard approach for molecular diagnostics of myopathies.
Tomazetto, Geizecler; Wibberg, Daniel; Schlüter, Andreas; Oliveira, Valéria M
2015-01-01
A fosmid metagenomic library was constructed with total community DNA obtained from a municipal wastewater treatment plant (MWWTP), with the aim of identifying new FeFe-hydrogenase genes encoding the enzymes most important for hydrogen metabolism. The dataset generated by pyrosequencing of a fosmid library was mined to identify environmental gene tags (EGTs) assigned to FeFe-hydrogenase. The majority of EGTs representing FeFe-hydrogenase genes were affiliated with the class Clostridia, suggesting that this group is the main hydrogen producer in the MWWTP analyzed. Based on assembled sequences, three FeFe-hydrogenase genes were predicted based on detection of the L2 motif (MPCxxKxxE) in the encoded gene product, confirming true FeFe-hydrogenase sequences. These sequences were used to design specific primers to detect fosmids encoding FeFe-hydrogenase genes predicted from the dataset. Three identified fosmids were completely sequenced. The cloned genomic fragments within these fosmids are closely related to members of the Spirochaetaceae, Bacteroidales and Firmicutes, and their FeFe-hydrogenase sequences are characterized by the structure type M3, which is common to clostridial enzymes. FeFe-hydrogenase sequences found in this study represent hitherto undetected sequences, indicating the high genetic diversity regarding these enzymes in MWWTP. Results suggest that MWWTP have to be considered as reservoirs for new FeFe-hydrogenase genes. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
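The L2 motif screen mentioned in this abstract can be pictured as a simple pattern search over predicted protein sequences. The sketch below is a minimal illustration, assuming that "x" in MPCxxKxxE stands for any single residue; the function name and the example sequence in the usage note are hypothetical, and this is not the authors' pipeline.

```python
import re

# L2 motif as written in the abstract: MPCxxKxxE.
# Assumption: each "x" matches any single amino acid residue.
L2_MOTIF = re.compile(r"MPC..K..E")


def find_l2_motif(protein):
    """Return the 0-based start position of every L2 motif match."""
    return [m.start() for m in L2_MOTIF.finditer(protein.upper())]
```

For a made-up sequence such as "AAMPCAAKGGEAA", the function reports a single match starting at position 2; a sequence without the motif yields an empty list.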
Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi
2015-07-01
A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of a lotus field. To obtain detailed information on the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There exist twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there is a significant difference in the distribution of RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information on a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
Transcriptomic profiles of human foreskin fibroblast cells in response to orf virus.
Chen, Daxiang; Long, Mingjian; Xiao, Bin; Xiong, Yufeng; Chen, Huiqin; Chen, Yu; Kuang, Zhenzhan; Li, Ming; Wu, Yingsong; Rock, Daniel L; Gong, Daoyuan; Wang, Yong; He, Haijian; Liu, Fang; Luo, Shuhong; Hao, Wenbo
2017-08-29
Orf virus has been utilized as a safe and efficient viral vector against not only diverse infectious diseases, but also against tumors. However, the nature of the genes triggered by the vector in human cells is poorly characterized. Using RNA sequencing technology, we compared specific changes in the transcriptomic profiles in human foreskin fibroblast cells following infection by the orf virus. The results indicated that orf virus upregulates or downregulates expression of a variety of genes, including genes involved in antiviral immune response, apoptosis, cell cycle and a series of signaling pathways, such as the IFN and p53-signaling pathways. The orf virus stimulates or inhibits immune gene expression such as chemokines, chemokine receptors, cytokines, cytokine receptors, and molecules involved in antigen uptake and processing after infection. Expression of pro-apoptotic genes increased at 8 hours post-infection. The p53 signaling pathway was activated to induce apoptosis at the same time. However, the cell cycle program was promoted after infection, which may be due to the immunomodulatory genes of the orf virus. This presents the first description of transcription profile changes in human foreskin fibroblast cells after orf virus infection and provides an in-depth analysis of the interaction between the host and orf virus. These data offer new insights into the understanding of the mechanisms of infection by orf virus and identify potential targets for future studies.
Targeting Conserved Genes in Penicillium Species.
Peterson, Stephen W
2017-01-01
Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of dideoxynucleotide-labeled fragments or NGS. The sequences are compared to a database of validated isolates. Identification of species indicates the potential of the fungus to make particular mycotoxins.
Karaca, Gürsel; Jonathan, Rinita; Paul, Bernard
2009-06-01
Pythium stipitatum is a slow-growing oomycete and has been isolated from soil samples and plant materials from France, Tunisia, Turkey and India. Its morphological characteristics are reminiscent of those of Pythium ramificatum, discovered in Algeria by the corresponding author. Unfortunately, the Algerian isolate was not deposited in any culture collection and ultimately got lost. Those were the days when molecular description of fungi was not a fashion; hence, no molecular characteristics of the Algerian isolates were deposited to the GenBank. Moreover, its coralloid antheridial branches made it an easy prey to be considered as synonymous to Pythium minus. Because there are no living strains of P. ramificatum, and no sequence at the GenBank, it is being treated as 'nomen invalidum' here. However, we have now isolated the same type of oomycete from four different countries and we have sufficient evidence, both molecular and morphological, to describe it as a new species, quite different from P. minus. In this article, we are giving the morphological and molecular evidence to separate it as a distinct species, P. stipitatum, belonging to the 'Clade E' of the genus Pythium. Taxonomic description of this oomycete, its comparison with related species, and the sequence of the internal transcribed spacer region of its rRNA gene, are discussed here.
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing.
Jäger, Marten; Ott, Claus-Eric; Grünhagen, Johannes; Hecht, Jochen; Schell, Hanna; Mundlos, Stefan; Duda, Georg N; Robinson, Peter N; Lienau, Jasmin
2011-03-24
The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. This information will provide a basis for future molecular research involving the sheep as a model organism.
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing
2011-01-01
Background: The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Results: Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Conclusions: Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. This information will provide a basis for future molecular research involving the sheep as a model organism. PMID:21435219
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.
Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao
2017-01-01
The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
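To make the idea of mining SSRs from EST sequences concrete, the sketch below scans a nucleotide sequence for short tandem repeats with a regular expression. The thresholds (motifs of 2 to 6 nt, at least 4 tandem copies) and the function name are assumptions chosen for illustration; the abstract does not give the criteria used in the chapter.

```python
import re

# Assumed thresholds (not from the abstract): motif of 2-6 nt, >= 4 copies.
# The lazy quantifier makes the shortest repeating unit the reported motif.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(seq):
    """Return (start, motif, copies) for each tandem repeat found."""
    hits = []
    for m in SSR_PATTERN.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAA...
        copies = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, copies))
    return hits
```

In practice the flanking sequence around each hit would then be used for primer design, which this sketch does not attempt.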
USDA-ARS's Scientific Manuscript database
Lipase (lip) and lipase-specific foldase (lif) genes of a biodegradable polyhydroxyalkanoate- (PHA-) synthesizing Pseudomonas resinovorans NRRL B-2649 were cloned using primers based on consensus sequences, followed by PCR-based genome walking. Sequence analyses showed a putative Lip gene-product (...
COI (cytochrome oxidase-I) sequence based studies of Carangid fishes from Kakinada coast, India.
Persis, M; Chandra Sekhar Reddy, A; Rao, L M; Khedkar, G D; Ravinder, K; Nasruddin, K
2009-09-01
Mitochondrial DNA, cytochrome oxidase-1 gene sequences were analyzed for species identification and phylogenetic relationship among the very high food value and commercially important Indian carangid fish species. Sequence analysis of COI gene very clearly indicated that all the 28 fish species fell into five distinct groups, which are genetically distant from each other and exhibited identical phylogenetic reservation. All the COI gene sequences from 28 fishes provide sufficient phylogenetic information and evolutionary relationship to distinguish the carangid species unambiguously. This study proves the utility of mtDNA COI gene sequence based approach in identifying fish species at a faster pace.
Mohkam, Milad; Nezafat, Navid; Berenjian, Aydin; Mobasher, Mohammad Ali; Ghasemi, Younes
2016-03-01
Some Bacillus species, especially those of the Bacillus subtilis and Bacillus pumilus groups, have highly similar 16S rRNA gene sequences and are therefore hard to identify by 16S rDNA sequence analysis. To overcome this drawback, rpoB and recA sequence analysis along with randomly amplified polymorphic DNA (RAPD) fingerprinting was examined as an alternative method for differentiating Bacillus species. The 16S rRNA, rpoB and recA genes were amplified via polymerase chain reaction using their specific primers. The resulting PCR amplicons were sequenced, and phylogenetic analysis was performed with MEGA 6 software. Identification based on 16S rRNA gene sequencing was underpinned by rpoB and recA gene sequencing as well as the RAPD-PCR technique. Subsequently, concatenation and phylogenetic analysis showed that the extent of diversity and similarity was better resolved by the rpoB and recA primers, a result that was also reinforced by RAPD-PCR. However, in one case these approaches failed to identify an isolate; combining them with the phenotypic method resolved this issue. Overall, RAPD fingerprinting and rpoB and recA analysis, along with concatenated gene sequence analysis, discriminated closely related Bacillus species, which highlights the significance of the multigenic method in more precisely distinguishing Bacillus strains. This research emphasizes that RAPD fingerprinting and rpoB and recA sequence analysis are superior to 16S rRNA gene sequence analysis for suitable and effective identification of Bacillus species, as recommended for probiotic products.
Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Mu-Chiou; Wang, Li-Tin; Huang, Lina; Lee, Fwu-Ling
2012-10-01
To clearly identify specific species and subspecies of the Lactobacillus acidophilus group using phenotypic and genotypic (16S rDNA sequence analysis) techniques alone is difficult. The aim of this study was to use the recA gene for species discrimination in the L. acidophilus group, as well as to develop a species-specific primer and single nucleotide polymorphism primer based on the recA gene sequence for species and subspecies identification. The average sequence similarity for the recA gene among type strains was 80.0%, and most members of the L. acidophilus group could be clearly distinguished. The species-specific primer was designed according to the recA gene sequencing, which was employed for polymerase chain reaction with the template DNA of Lactobacillus strains. A single 231-bp species-specific band was found only in L. delbrueckii. A SNaPshot mini-sequencing assay using recA as a target gene was also developed. The specificity of the mini-sequencing assay was evaluated using 31 strains of L. delbrueckii species and was able to unambiguously discriminate strains belonging to the subspecies L. delbrueckii subsp. bulgaricus. The phylogenetic relationships of most strains in the L. acidophilus group can be resolved using recA gene sequencing, and a novel method to identify the species and subspecies of the L. delbrueckii and L. delbrueckii subsp. bulgaricus was developed by species-specific polymerase chain reaction combined with SNaPshot mini-sequencing. Copyright © 2012 Society of Chemical Industry.
Methods and compositions for regulating gene expression in plant cells
NASA Technical Reports Server (NTRS)
Dai, Shunhong (Inventor); Beachy, Roger N. (Inventor); Luis, Maria Isabel Ordiz (Inventor)
2010-01-01
Novel chimeric plant promoter sequences are provided, together with plant gene expression cassettes comprising such sequences. In certain preferred embodiments, the chimeric plant promoters comprise the BoxII cis element and/or derivatives thereof. In addition, novel transcription factors are provided, together with nucleic acid sequences encoding such transcription factors and plant gene expression cassettes comprising such nucleic acid sequences. In certain preferred embodiments, the novel transcription factors comprise the acidic domain, or fragments thereof, of the RF2a transcription factor. Methods for using the chimeric plant promoter sequences and novel transcription factors in regulating the expression of at least one gene of interest are provided, together with transgenic plants comprising such chimeric plant promoter sequences and novel transcription factors.
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B
1995-01-01
Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
Gene Deletion in Barley Mediated by LTR-retrotransposon BARE
Shang, Yi; Yang, Fei; Schulman, Alan H.; Zhu, Jinghuan; Jia, Yong; Wang, Junmei; Zhang, Xiao-Qi; Jia, Qiaojun; Hua, Wei; Yang, Jianming; Li, Chengdao
2017-01-01
A poly-row branched spike (prbs) barley mutant was obtained from soaking a two-rowed barley inflorescence in a solution of maize genomic DNA. Positional cloning and sequencing demonstrated that the prbs mutant resulted from a 28 kb deletion including the inflorescence architecture gene HvRA2. Sequence annotation revealed that the HvRA2 gene is flanked by two LTR (long terminal repeat) retrotransposons (BARE) sharing 89% sequence identity. A recombination between the integrase (IN) gene regions of the two BARE copies resulted in the formation of an intact BARE and loss of HvRA2. No maize DNA was detected in the recombination region although the flanking sequences of HvRA2 gene showed over 73% of sequence identity with repetitive sequences on 10 maize chromosomes. It is still unknown whether the interaction of retrotransposons between barley and maize has resulted in the recombination observed in the present study. PMID:28252053
Guo, Yahong; Tsuruga, Ayako; Yamaguchi, Shigeharu; Oba, Koji; Iwai, Kasumi; Sekita, Setsuko; Mizukami, Hajime
2006-06-01
Chloroplast chlB gene encoding subunit B of light-independent protochlorophyllide reductase was amplified from herbarium and crude drug specimens of Ephedra sinica, E. intermedia, E. equisetina, and E. przewalskii. Sequence comparison of the chlB gene indicated that all the E. sinica specimens have the same sequence type (Type S) distinctive from other species, while there are two sequence types (Type E1 and Type E2) in E. equisetina. E. intermedia and E. przewalskii revealed an identical sequence type (Type IP). E. sinica was also identified by digesting the chlB fragment with Bcl I. A novel method for DNA authentication of Ephedra Herb based on the sequences of the chloroplast chlB gene and internal transcribed spacer of nuclear rRNA genes was developed and successfully applied for identification of the crude drugs obtained in the Chinese market.
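The Bcl I digestion step can be pictured as an in silico restriction digest: species whose chlB fragment carries the recognition site give a different fragment pattern from those that lack it. The sketch below is a minimal illustration, assuming a linear fragment and the standard Bcl I recognition sequence TGATCA cut after the first T; the function name and the toy sequences in the usage note are hypothetical, and the actual position of the site in chlB is not given in the abstract.

```python
BCL_I_SITE = "TGATCA"  # Bcl I recognition sequence
CUT_OFFSET = 1         # Bcl I cuts after the first T (T^GATCA)


def digest(seq, site=BCL_I_SITE, offset=CUT_OFFSET):
    """Return fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]
```

A toy 14 nt sequence with one site, "AAAATGATCAGGGG", yields fragments of 5 and 9 nt, while a sequence without the site is returned as a single uncut fragment.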
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man
Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.
2000-01-01
The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
Taylor, Robin L; Bailey, Jeffrey Craig; Freshwater, David Wilson
2017-06-01
Identification of Cladophora species is challenging due to conservation of gross morphology, few discrete autapomorphies, and environmental influences on morphology. Twelve species of marine Cladophora were reported from North Carolina waters. Cladophora specimens were collected from inshore and offshore marine waters for DNA sequence and morphological analyses. The nuclear-encoded rRNA internal transcribed spacer regions (ITS) were sequenced for 105 specimens and used in molecular assisted identification. The ITS1 and ITS2 region was highly variable, and sequences were sorted into ITS Sets of Alignable Sequences (SASs). Sequencing of short hyper-variable ITS1 sections from Cladophora type specimens was used to positively identify species represented by SASs when the types were made available. Secondary structures for the ITS1 locus were also predicted for each specimen and compared to predicted structures from Cladophora sequences available in GenBank. Nine ITS SASs were identified and representative specimens chosen for phylogenetic analyses of 18S and 28S rRNA gene sequences to reveal relationships with other Cladophora species. Phylogenetic analyses indicated that marine Cladophorales were polyphyletic and separated into two clades, the Cladophora clade and the "Siphonocladales" clade. Morphological analyses were performed to assess the consistency of character states within species, and complement the DNA sequence analyses. These analyses revealed intra- and interspecific character state variation, and that combined molecular and morphological analyses were required for the identification of species. One new report, Cladophora dotyana, and one new species Cladophora subtilissima sp. nov., were revealed, and increased the biodiversity of North Carolina marine Cladophora to 14 species. © 2017 Phycological Society of America.
Hecht, Jochen; Kuhl, Heiner; Haas, Stefan A; Bauer, Sebastian; Poustka, Albert J; Lienau, Jasmin; Schell, Hanna; Stiege, Asita C; Seitz, Volkhard; Reinhardt, Richard; Duda, Georg N; Mundlos, Stefan; Robinson, Peter N
2006-07-05
The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes for the initiation of osteogenesis. The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.
DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1.
Choudhary, M; Kaplan, S
2000-02-15
This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1 (T). The photosynthesis gene cluster is located within an approximately 73 kb Ase I genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The data were compared with the corresponding genes/ORFs from a different strain of R. sphaeroides and Rhodobacter capsulatus, a close relative of R. sphaeroides. A detailed analysis of the gene organization in the photosynthesis region revealed a similar gene order in both species with some notable differences located to the pucBAC-cycA region. In addition, photosynthesis gene regulatory protein (PpsR, FNR, IHF) binding motifs in upstream sequences of a number of photosynthesis genes have been identified and shown to differ between these two species. The difference in gene organization relative to pucBAC and cycA suggests that this region originated independently of the photosynthesis gene cluster of R. sphaeroides.
Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito
2002-01-01
Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471
The Extrapolation of Elementary Sequences
NASA Technical Reports Server (NTRS)
Laird, Philip; Saul, Ronald
1992-01-01
We study sequence extrapolation as a stream-learning problem. Input examples are a stream of data elements of the same type (integers, strings, etc.), and the problem is to construct a hypothesis that both explains the observed sequence of examples and extrapolates the rest of the stream. A primary objective -- and one that distinguishes this work from previous extrapolation algorithms -- is that the same algorithm be able to extrapolate sequences over a variety of different types, including integers, strings, and trees. We define a generous family of constructive data types, and define as our learning bias a stream language called elementary stream descriptions. We then give an algorithm that extrapolates elementary descriptions over constructive datatypes and prove that it learns correctly. For freely-generated types, we prove a polynomial time bound on descriptions of bounded complexity. An especially interesting feature of this work is the ability to provide quantitative measures of confidence in competing hypotheses, using a Bayesian model of prediction.
Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng
2012-01-01
To provide a resource of sisal-specific expressed sequence data and to facilitate new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced from RNA pooled from multiple Agave sisalana tissues, using the SMART™ method and duplex-specific nuclease (DSN), to increase the efficiency of normalization and maximize the number of independent genes. This procedure maintained the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. Unigene functions were predicted by comparing their sequences to functional domain databases and were extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed four putative MADS-box genes and a knotted-like homeobox (knox) gene among a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
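The non-redundancy figure quoted in this abstract follows from the clone and unigene counts. The sketch below assumes non-redundancy is defined as unigenes divided by sequenced clones; the abstract does not state the formula, so the definition and the function name `non_redundancy` are assumptions.

```python
def non_redundancy(unigenes: int, clones: int) -> float:
    """Percentage of sequenced clones that represent distinct unigenes."""
    if clones <= 0:
        raise ValueError("clones must be positive")
    return 100.0 * unigenes / clones


# Counts taken from the abstract: 3320 unigenes from 3875 sequenced clones.
sisal_pct = non_redundancy(3320, 3875)
print(f"{sisal_pct:.1f}%")  # prints 85.7%
```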
Liu, Juan; Qi, Zhe-Chen; Zhao, Yun-Peng; Fu, Cheng-Xin; Jenny Xiang, Qiu-Yun
2012-09-01
The complete nucleotide sequence of the chloroplast genome (cpDNA) of Smilax china L. (Smilacaceae) is reported. It is the first complete cp genome sequence in Liliales. Genomic analyses were conducted to examine the rate and pattern of cpDNA genome evolution in Smilax relative to other major lineages of monocots. The cpDNA genomic sequences were combined with those available for Lilium to evaluate the phylogenetic position of Liliales and to investigate the influence of taxon sampling, gene sampling, gene function, natural selection, and substitution rate on phylogenetic inference in monocots. Phylogenetic analyses using sequence data of gene groups partitioned according to gene function, selection force, and total substitution rate demonstrated evident impacts of these factors on phylogenetic inference of monocots and the placement of Liliales, suggesting potential evolutionary convergence or adaptation of some cpDNA genes in monocots. Our study also demonstrated that reduced taxon sampling reduced the bootstrap support for the placement of Liliales in the cpDNA phylogenomic analysis. Analyses of sequences of 77 protein genes with some missing data and sequences of 81 genes (all protein genes plus the rRNA genes) support a sister relationship of Liliales to the commelinids-Asparagales clade, consistent with the APG III system. Analyses of 63 cpDNA protein genes for 32 taxa with few missing data, however, support a sister relationship of Liliales (represented by Smilax and Lilium) to Dioscoreales-Pandanales. Topology tests indicated that these two alignments do not significantly differ given any of these three cpDNA genomic sequence data sets. Furthermore, we found no saturation effect of the data, suggesting that the cpDNA genomic sequence data used in the study are appropriate for monocot phylogenetic study and long-branch attraction is unlikely to be the cause to explain the result of two well-supported, conflict placements of Liliales. 
Further analyses using sufficient nuclear data remain necessary to evaluate these two phylogenetic hypotheses regarding the position of Liliales and to address the causes of signal conflict among genes and partitions. Copyright © 2012 Elsevier Inc. All rights reserved.
2009-01-01
Background Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. 
We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg
2009-08-06
Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 
The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes.
Population connectivity of the plating coral Agaricia lamarcki from southwest Puerto Rico
NASA Astrophysics Data System (ADS)
Hammerman, Nicholas M.; Rivera-Vicens, Ramon E.; Galaska, Matthew P.; Weil, Ernesto; Appledoorn, Richard S.; Alfaro, Monica; Schizas, Nikolaos V.
2018-03-01
Identifying genetic connectivity and discrete population boundaries is an important objective for management of declining Caribbean reef-building corals. A double digest restriction-associated DNA sequencing protocol was utilized to generate 321 single nucleotide polymorphisms to estimate patterns of horizontal and vertical gene flow in the brooding Caribbean plate coral, Agaricia lamarcki. Individual colonies (n = 59) were sampled from eight locations throughout southwestern Puerto Rico from six shallow (10-20 m) and two mesophotic habitats (30-40 m). Descriptive summary statistics (fixation index, FST), analysis of molecular variance, and analysis through landscape and ecological associations and discriminant analysis of principal components estimated high population connectivity with subtle subpopulation structure among all sampling localities.
Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics
Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed
2016-01-01
In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of next generation sequencing combined with effective computational approaches, a wide variety of plant species have been fully sequenced, giving a wealth of sequence data on the structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may provide a complete view at the systems biology level. Reverse genetics methodologies have been investigated extensively for assigning biological function to a specific gene or gene product. We provide here an updated overview of these high-throughput strategies, highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003
USDA-ARS's Scientific Manuscript database
Polymerase chain reaction amplification of conserved genes and sequence analysis provides a very powerful tool for the identification of toxigenic as well as non-toxigenic Penicillium species. Sequences are obtained by amplification of the gene fragment, sequencing via capillary electrophoresis of d...
A draft sequence of the rice genome (Oryza sativa L. ssp. indica).
Yu, Jun; Hu, Songnian; Wang, Jun; Wong, Gane Ka-Shu; Li, Songgang; Liu, Bin; Deng, Yajun; Dai, Li; Zhou, Yan; Zhang, Xiuqing; Cao, Mengliang; Liu, Jing; Sun, Jiandong; Tang, Jiabin; Chen, Yanjiong; Huang, Xiaobing; Lin, Wei; Ye, Chen; Tong, Wei; Cong, Lijuan; Geng, Jianing; Han, Yujun; Li, Lin; Li, Wei; Hu, Guangqiang; Huang, Xiangang; Li, Wenjie; Li, Jian; Liu, Zhanwei; Li, Long; Liu, Jianping; Qi, Qiuhui; Liu, Jinsong; Li, Li; Li, Tao; Wang, Xuegang; Lu, Hong; Wu, Tingting; Zhu, Miao; Ni, Peixiang; Han, Hua; Dong, Wei; Ren, Xiaoyu; Feng, Xiaoli; Cui, Peng; Li, Xianran; Wang, Hao; Xu, Xin; Zhai, Wenxue; Xu, Zhao; Zhang, Jinsong; He, Sijie; Zhang, Jianguo; Xu, Jichen; Zhang, Kunlin; Zheng, Xianwu; Dong, Jianhai; Zeng, Wanyong; Tao, Lin; Ye, Jia; Tan, Jun; Ren, Xide; Chen, Xuewei; He, Jun; Liu, Daofeng; Tian, Wei; Tian, Chaoguang; Xia, Hongai; Bao, Qiyu; Li, Gang; Gao, Hui; Cao, Ting; Wang, Juan; Zhao, Wenming; Li, Ping; Chen, Wei; Wang, Xudong; Zhang, Yong; Hu, Jianfei; Wang, Jing; Liu, Song; Yang, Jian; Zhang, Guangyu; Xiong, Yuqing; Li, Zhijie; Mao, Long; Zhou, Chengshu; Zhu, Zhen; Chen, Runsheng; Hao, Bailin; Zheng, Weimou; Chen, Shouyi; Guo, Wei; Li, Guojie; Liu, Siqi; Tao, Ming; Wang, Jian; Zhu, Lihuang; Yuan, Longping; Yang, Huanming
2002-04-05
We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.
Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun
2010-01-01
A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Molecular characterization of the vitamin D receptor (VDR) gene in Holstein cows.
Ali, Mayar O; El-Adl, Mohamed A; Ibrahim, Hussam M M; Elseedy, Youssef Y; Rizk, Mohamed A; El-Khodery, Sabry A
2018-06-01
Vitamin D plays a vital role in calcium homeostasis, growth, and immunoregulation. Because little is known about the vitamin D receptor (VDR) gene in cattle, the aim of the present investigation was to present the molecular characterization of exons 5 and 6 of the VDR gene in Holstein cows. DNA extraction, genomic sequencing, phylogenetic analysis, synteny mapping and single nucleotide gene polymorphism analysis of the VDR gene were performed to assess blood samples collected from 50 clinically healthy Holstein cows. The results revealed the presence of a 450-base pair (bp) nucleotide sequence that resembled exons 5 and 6 with intron 5 enclosed between these exons. Sequence alignment and phylogenetic analysis revealed a close relationship between the sequenced VDR region and that found in Hereford cattle. A close association between this region and the corresponding region in small ruminants was also documented. Moreover, a single nucleotide polymorphism (SNP) that caused the replacement of a glutamate with an arginine in the deduced amino acid sequence was detected at position 7 of exon 5. In conclusion, Holstein and Hereford cattle differ with respect to exon 5 of the VDR gene. Phylogenetic analysis of the VDR gene based on nucleotide sequence produced different results from prior analyses based on amino acid sequence. Copyright © 2018 Elsevier Ltd. All rights reserved.
Thick Descriptions: A Language for Articulating Ethnographic Media Technology.
ERIC Educational Resources Information Center
Goldman-Segall, Ricki
"Thick descriptions" are descriptions that are layered enough to draw conclusions and uncover the intentions of a given act, event, or process. In a video environment, thick descriptions are images, gestures, or sequences that convey meaning. Neither the quantity nor the resolution of the images makes the descriptions thick. Thickness is…
Ralph, Duncan K; Matsen, Frederick A
2016-01-01
VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.
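The per-base annotation task described above can be illustrated with Viterbi decoding on a toy hidden Markov model. This is a sketch under loud assumptions: it is not the partis model or its API, it collapses the states to V, N and J (no D segment, no per-allele or per-position parameters), and every probability below is invented so that the example is easy to follow.

```python
import math

STATES = ["V", "N", "J"]

# Hypothetical toy parameters (assumptions, not fitted to any data).
START = {"V": 1.0, "N": 0.0, "J": 0.0}
TRANS = {
    "V": {"V": 0.8, "N": 0.2, "J": 0.0},
    "N": {"V": 0.0, "N": 0.6, "J": 0.4},
    "J": {"V": 0.0, "N": 0.0, "J": 1.0},
}
EMIT = {
    "V": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "N": {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    "J": {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
}


def _log(p):
    """Log probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start, trans, emit):
    """Return the most probable state path for obs as a string of labels."""
    scores = [{s: _log(start[s]) + _log(emit[s][obs[0]]) for s in STATES}]
    back = []
    for sym in obs[1:]:
        prev, cur, ptr = scores[-1], {}, {}
        for s in STATES:
            # Best predecessor state for landing in s at this position.
            best = max(STATES, key=lambda r: prev[r] + _log(trans[r][s]))
            cur[s] = prev[best] + _log(trans[best][s]) + _log(emit[s][sym])
            ptr[s] = best
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


annotation = viterbi("AAAGGTTT", START, TRANS, EMIT)
print(annotation)  # prints VVVNNJJJ
```

Replacing the shared emission tables with per-allele, per-position categorical distributions is the kind of parameter-rich extension the abstract argues for; the decoding step itself stays the same.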
Gao, M L; Zhong, X M; Ma, X; Ning, H J; Zhu, D; Zou, J Z
2016-06-02
To make a genetic diagnosis of Alagille syndrome (ALGS) patients using target gene sequence capture and next generation sequencing technology. Target gene sequence capture and next generation sequencing were used to analyze ALGS genes in 4 patients who were hospitalized at the Affiliated Hospital, Capital Institute of Pediatrics between January 2014 and December 2015; 2 cases had a clinical diagnosis of typical ALGS and 2 of atypical ALGS. Blood samples were collected from the patients and their parents, and genomic DNA was extracted from lymphocytes. Target gene sequence capture and next generation sequencing were performed. Sanger sequencing was used to confirm the results in the patients and their parents. Cholestasis, heart defects, inverted triangular face and butterfly vertebrae were the main clinical features in the 4 male patients. Age at first hospital visit ranged from 3 months and 14 days to 3 years and 1 month. The age of onset ranged from 3 days to 42 days (median 23 days). According to the clinical diagnostic criteria of ALGS, patient 1 and patient 2 were considered typical ALGS. The other 2 patients were considered atypical ALGS. Four Jagged 1 (JAG1) pathogenic mutations were detected. Three different missense mutations were detected in patients 1 to 3 (c.839C>T (p.W280X), c.703G>A (p.R235X), c.1720C>T (p.V574M)). The JAG1 mutation of patient 3 was reported for the first time. Patient 4 had one novel insertion mutation (c.1779_1780insA (p.Ile594AsnfsTer23)). Parental analysis verified that the JAG1 missense mutations of the 3 patients were de novo. The results of Sanger sequencing were consistent with those of next generation sequencing. Target gene sequence capture combined with next generation sequencing can detect two pathogenic genes in ALGS and simultaneously test genes of other related diseases in infantile cholestatic diseases, offering high throughput, high efficiency and low cost.
It may provide molecular diagnosis and treatment for clinicians with good clinical application prospects.
Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic
Yebra, Gonzalo; Hodcroft, Emma B.; Ragonnet-Cronin, Manon L.; Pillay, Deenan; Brown, Andrew J. Leigh; Fraser, Christophe; Kellam, Paul; de Oliveira, Tulio; Dennis, Ann; Hoppe, Anne; Kityo, Cissy; Frampton, Dan; Ssemwanga, Deogratius; Tanser, Frank; Keshani, Jagoda; Lingappa, Jairam; Herbeck, Joshua; Wawer, Maria; Essex, Max; Cohen, Myron S.; Paton, Nicholas; Ratmann, Oliver; Kaleebu, Pontiano; Hayes, Richard; Fidler, Sarah; Quinn, Thomas; Novitsky, Vladimir; Haywards, Andrew; Nastouli, Eleni; Morris, Steven; Clark, Duncan; Kozlakidis, Zisis
2016-01-01
HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows other genes to be used. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences. PMID:28008945
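The sub-sampling design described above (fixed sampling depths, 100 replicates each) can be sketched as follows. This is an illustrative sketch only: the sequence identifiers, the seed, the rounding rule and the function name `subsample_replicates` are assumptions, and the abstract does not say how the authors drew their replicates.

```python
import random


def subsample_replicates(seq_ids, depth, n_replicates, seed=0):
    """Draw n_replicates random subsets, each holding a `depth` fraction of seq_ids.

    Sampling is without replacement within a replicate; replicates are
    drawn independently from the same seeded generator for reproducibility.
    """
    if not 0 < depth <= 1:
        raise ValueError("depth must be in (0, 1]")
    rng = random.Random(seed)
    k = max(1, round(len(seq_ids) * depth))
    return [sorted(rng.sample(seq_ids, k)) for _ in range(n_replicates)]


# Hypothetical identifiers standing in for the 4662 simulated sequences.
ids = [f"seq{i:04d}" for i in range(4662)]
replicates = {d: subsample_replicates(ids, d, 100) for d in (1.0, 0.6, 0.2, 0.05)}
print({d: len(reps[0]) for d, reps in replicates.items()})
```

Each replicate would then be passed to tree inference and compared against the known tree; those steps depend on external tools and are not shown.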
Yebra, Gonzalo; Hodcroft, Emma B; Ragonnet-Cronin, Manon L; Pillay, Deenan; Brown, Andrew J Leigh
2016-12-23
HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows other genes to be used. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree's using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.
2011-01-01
Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 (for F. esculentum) and 229 (F. tataricum) thousands of reads with average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousands of contigs for each species, with 7.5-8.2× coverage. Comparative analysis of two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationships of the Caryophyllales and asterids now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly was performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Liu, Fei; Wu, Xiao-Li; Liu, Ying; Chen, Da-Xia; Zhang, De-Li; Yang, Da-Jian
2016-02-01
Isaria farinosa is a pathogen of the host of Ophiocordyceps sinensis. This study analysed progress in the molecular biology of I. farinosa using bibliometrics and the I. farinosa sequences (including gene sequences) deposited in NCBI. The results indicated that countries differed in the number of papers published and in the kinds and numbers of sequences (including gene sequences) deposited. China had published the most papers and deposited the most sequences (including gene sequences). The United States had deposited the most functional genes. Research on the pathogen focused mainly on biological control, and molecular studies concentrated on phylogenetic classification. In recent years, some protease and chitinase genes have been studied. As the effect of I. farinosa on the health of O. sinensis grows, and as its whole sequence and more of its pharmacological activities become publicly known, research on the molecular biology of I. farinosa is expected to become deeper and broader. Copyright© by the Chinese Pharmaceutical Association.
Jose, Jency; Jalali, S K; Shivalingaswamy, T M; Kumar, N K Krishna; Bhatnagar, R; Bandyopadhyay, A
2013-06-01
A PCR-based method for detecting viral DNA of the nucleopolyhedroviruses (NPVs) of three lepidopterans, Spodoptera litura, Amsacta albistriga and Helicoverpa armigera, was developed using specific primers for the late expression factor-8 (lef-8) gene of the three NPVs. Amplicons of 689, 699 and 665 bp were obtained, respectively, and the nucleotide sequences were submitted to GenBank, where accession numbers were obtained. The lef-8 sequences of S. litura NPV and H. armigera NPV matched their respective references in the GenBank database, confirming their identity; the sequence of A. albistriga NPV was the first sequence submitted to the GenBank database. Sequence similarity analysis of the three NPV lef-8 genes sequenced in the present study revealed no significant similarity among them; however, A. albistriga NPV and S. litura NPV were found to be closely related. CLUSTAL alignment of the sequences revealed general relatedness among the NPV lef-8 genes. The study confirmed that the lef-8 gene can be used for quick and correct discriminatory identification of insect viruses.
Methods for Discovery of Novel Cellulosomal Cellulases Using Genomics and Biochemical Tools.
Ben-David, Yonit; Dassa, Bareket; Bensoussan, Lizi; Bayer, Edward A; Moraïs, Sarah
2018-01-01
Cell wall degradation by cellulases is extensively explored owing to its potential contribution to biofuel production. The cellulosome is an extracellular multienzyme complex that can degrade the plant cell wall very efficiently, and cellulosomal enzymes are therefore of great interest. Cellulosomal cellulases are defined as enzymes that contain a dockerin module, which can interact with a cohesin module present in multiple copies in a noncatalytic protein termed scaffoldin. The assembly of the cellulosomal cellulases into the cellulosomal complex occurs via specific protein-protein interactions. Cellulosome systems were initially described in only a few anaerobic cellulolytic bacteria. However, owing to ongoing genome sequencing and metagenomic projects, the discovery of novel cellulosome-producing bacteria and the description of their cellulosomal genes have increased dramatically in recent years. In this chapter, methods for discovery of novel cellulosomal cellulases from a DNA sequence by bioinformatics and biochemical tools are described. Their biochemical characterization is also described, including both the enzymatic activity of the putative cellulases and their assembly into mature designer cellulosomes.