complete genome analyses: Topics by Science.gov

Sample records for complete genome analyses

Complete Genome Sequence of Acinetobacter baumannii CIP 70.10, a Susceptible Reference Strain for Comparative Genome Analyses.

PubMed

Krahn, Thomas; Wibberg, Daniel; Maus, Irena; Winkler, Anika; Pühler, Alfred; Poirel, Laurent; Schlüter, Andreas

2015-07-30

The complete genome sequence for the reference strain Acinetobacter baumannii CIP 70.10 (ATCC 15151) was established. The strain was isolated in France in 1970, is susceptible to most antimicrobial compounds, and is therefore of importance for comparative genome analyses with clinical multidrug-resistant (MDR) A. baumannii strains to study resistance development and acquisition in this emerging human pathogen. Copyright © 2015 Krahn et al.
Complete mitochondrial genome sequences of three bats species and whole genome mitochondrial analyses reveal patterns of codon bias and lend support to a basal split in Chiroptera.

PubMed

Meganathan, P R; Pagan, Heidi J T; McCulloch, Eve S; Stevens, Richard D; Ray, David A

2012-01-15

Order Chiroptera is a unique group of mammals whose members have attained self-powered flight as their main mode of locomotion. Much speculation persists regarding bat evolution; however, lack of sufficient molecular data hampers evolutionary and conservation studies. Of ~1200 species, complete mitochondrial genome sequences are available for only eleven. Additional sequences should be generated if we are to resolve many questions concerning these fascinating mammals. Herein, we describe the complete mitochondrial genomes of three bats: Corynorhinus rafinesquii, Lasiurus borealis and Artibeus lituratus. We also compare the currently available mitochondrial genomes and analyze codon usage in Chiroptera. C. rafinesquii, L. borealis and A. lituratus mitochondrial genomes are 16438 bp, 17048 bp and 16709 bp, respectively. Genome organization and gene arrangements are similar to other bats. Phylogenetic analyses using complete mitochondrial genome sequences support previously established phylogenetic relationships and suggest utility in future studies focusing on the evolutionary aspects of these species. Comprehensive analyses of available bat mitochondrial genomes reveal distinct nucleotide patterns and synonymous codon preferences corresponding to different chiropteran families. These patterns suggest that mutational and selection forces are acting to different extents within Chiroptera and shape their mitochondrial genomes. Copyright © 2011 Elsevier B.V. All rights reserved.
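One common way to quantify synonymous codon preferences of the kind described above is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The abstract does not state which measure the authors used, so the following is only an illustrative sketch. It assumes NCBI translation table 2 (vertebrate mitochondrial); the function name `rscu` and the toy input are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in TCAG order.
# Differences from the standard code: TGA = Trp, ATA = Met, AGA/AGG = stop.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). Codons with ambiguous bases are ignored, and
    amino acids that never occur in the input are omitted from the result.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    result = {}
    for aa in set(AMINO_ACIDS):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result
```

For a toy sequence such as `"CTACTACTACTT"` (three CTA codons and one CTT, all leucine), the six-codon leucine family gives RSCU values above 1 for the over-used CTA and 0 for the unused leucine codons. Comparing such profiles across families is one way to make the "distinct synonymous codon preferences" concrete.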
From algae to angiosperms–inferring the phylogeny of green plants (Viridiplantae) from 360 plastid genomes

PubMed Central

2014-01-01

Background Next-generation sequencing has provided a wealth of plastid genome sequence data from an increasingly diverse set of green plants (Viridiplantae). Although these data have helped resolve the phylogeny of numerous clades (e.g., green algae, angiosperms, and gymnosperms), their utility for inferring relationships across all green plants is uncertain. Viridiplantae originated 700-1500 million years ago and may comprise as many as 500,000 species. This clade represents a major source of photosynthetic carbon and contains an immense diversity of life forms, including some of the smallest and largest eukaryotes. Here we explore the limits and challenges of inferring a comprehensive green plant phylogeny from available complete or nearly complete plastid genome sequence data. Results We assembled protein-coding sequence data for 78 genes from 360 diverse green plant taxa with complete or nearly complete plastid genome sequences available from GenBank. Phylogenetic analyses of the plastid data recovered well-supported backbone relationships and strong support for relationships that were not observed in previous analyses of major subclades within Viridiplantae. However, there also is evidence of systematic error in some analyses. In several instances we obtained strongly supported but conflicting topologies from analyses of nucleotides versus amino acid characters, and the considerable variation in GC content among lineages and within single genomes affected the phylogenetic placement of several taxa. Conclusions Analyses of the plastid sequence data recovered a strongly supported framework of relationships for green plants. 
This framework includes: i) the placement of Zygnematophyceae as sister to land plants (Embryophyta), ii) a clade of extant gymnosperms (Acrogymnospermae) with cycads + Ginkgo sister to remaining extant gymnosperms and with gnetophytes (Gnetophyta) sister to non-Pinaceae conifers (Gnecup trees), and iii) within the monilophyte clade (Monilophyta), Equisetales + Psilotales are sister to Marattiales + leptosporangiate ferns. Our analyses also highlight the challenges of using plastid genome sequences in deep-level phylogenomic analyses, and we provide suggestions for future analyses that will likely incorporate plastid genome sequence data for thousands of species. We particularly emphasize the importance of exploring the effects of different partitioning and character coding strategies. PMID:24533922
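Because the abstract reports that variation in GC content among lineages affected the placement of several taxa, a simple first diagnostic is to tabulate GC content per sequence, overall and at third codon positions. The sketch below is not the authors' procedure; the function names `gc_fraction` and `gc3_fraction` are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
def gc_fraction(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T) in seq.

    Gaps and ambiguity codes are ignored; returns 0.0 if no
    unambiguous base is present.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc3_fraction(cds):
    """GC fraction at third codon positions of an in-frame coding sequence."""
    return gc_fraction(cds[2::3])
```

Taxa whose values differ sharply from the rest of the data set are candidates for compositional bias, which is one reason analyses of nucleotides and of amino acids can disagree.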
Comparative Genomics of Erwinia amylovora and Related Erwinia Species—What do We Learn?

PubMed Central

Zhao, Youfu; Qi, Mingsheng

2011-01-01

Erwinia amylovora, the causal agent of fire blight disease of apples and pears, is one of the most important plant bacterial pathogens with worldwide economic significance. Recent reports on the complete or draft genome sequences of four species in the genus Erwinia, including E. amylovora, E. pyrifoliae, E. tasmaniensis, and E. billingiae, have provided us near complete genetic information about this pathogen and its closely-related species. This review describes in silico subtractive hybridization-based comparative genomic analyses of eight genomes currently available, and highlights what we have learned from these comparative analyses, as well as genetic and functional genomic studies. Sequence analyses reinforce the assumption that E. amylovora is a relatively homogeneous species and support the current classification scheme of E. amylovora and its related species. The potential evolutionary origin of these Erwinia species is also proposed. The current understanding of the pathogen, its virulence mechanism and host specificity from genome sequencing data is summarized. Future research directions are also suggested. PMID:24710213
Complete genome sequences of four avian paramyxoviruses of serotype 10 isolated from Rockhopper Penguins on the Falkland Islands

USDA-ARS's Scientific Manuscript database

The first complete genome sequences of four Avian paramyxovirus serotype 10 (APMV-10) isolates are described here. The viruses were isolated from Rockhopper Penguins sampled in 2007 on the Falkland Islands. All four genomes are 15,456 nucleotides in length and phylogenetic analyses show them to be c...
Complete Genome Sequences of Four Avian Paramyxoviruses of Serotype 10 Isolated from Rockhopper Penguins on the Falkland Islands

PubMed Central

Goraichuk, Iryna V.; Dimitrov, Kiril M.; Sharma, Poonam; Miller, Patti J.; Swayne, David E.; Suarez, David L.

2017-01-01

ABSTRACT The first complete genome sequences of four avian paramyxovirus serotype 10 (APMV-10) isolates are described here. The viruses were isolated from rockhopper penguins on the Falkland Islands, sampled in 2007. All four genomes are 15,456 nucleotides in length, and phylogenetic analyses show them to be closely related. PMID:28572332
First complete genome sequence of vanilla mosaic strain of Dasheen mosaic virus isolated from the Cook Islands.

PubMed

Puli'uvea, Christopher; Khan, Subuhi; Chang, Wee-Leong; Valmonte, Gardette; Pearson, Michael N; Higgins, Colleen M

2017-02-01

We present the first complete genome of vanilla mosaic virus (VanMV). The VanMV genomic structure is consistent with that of a potyvirus, containing a single open reading frame (ORF) encoding a polyprotein of 3139 amino acids. Motif analyses indicate the polyprotein can be cleaved into the expected ten individual proteins; other recognised potyvirus motifs are also present. As expected, the VanMV genome shows high sequence similarity to the published Dasheen mosaic virus (DsMV) genome sequences; comparisons with DsMV continue to support VanMV as a vanilla infecting strain of DsMV. Phylogenetic analyses indicate that VanMV and DsMV share a common ancestor, with VanMV having the closest relationship with DsMV strains from the South Pacific.
Phylogenetic utility, and variability in structure and content, of complete mitochondrial genomes among genetic lineages of the Hawaiian anchialine shrimp Halocaridina rubra Holthuis 1963 (Atyidae:Decapoda).

PubMed

Justice, Joshua L; Weese, David A; Santos, Scott Ross

2016-07-01

The Atyidae are caridean shrimp possessing hair-like setae on their claws and are important contributors to ecological services in tropical and temperate fresh and brackish water ecosystems. Complete mitochondrial genomes have only been reported from five of the 449 species in the family, thus limiting understanding of mitochondrial genome evolution and the phylogenetic utility of complete mitochondrial sequences in the Atyidae. Here, comparative analyses of complete mitochondrial genomes from eight genetic lineages of Halocaridina rubra, an atyid endemic to the anchialine ecosystem of the Hawaiian Archipelago, are presented. Although gene number, order, and orientation were syntenic among genomes, three regions were identified and further quantified where conservation was substantially lower: (1) high length and sequence variability in the tRNA-Lys and tRNA-Asp intergenic region; (2) a 317-bp insertion between the NAD6 and CytB genes confined to a single lineage and representing a partial duplication of CytB; and (3) the putative control region. Phylogenetic analyses utilizing complete mitochondrial sequences provided new insights into relationships among the H. rubra genetic lineages, with the topology of one clade correlating to the geologic sequence of the islands. However, deeper nodes in the phylogeny lacked bootstrap support. Overall, our results from H. rubra suggest intra-specific mitochondrial genomic diversity could be underestimated across the Metazoa since the vast majority of complete genomes are from just a single individual of a species.
The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution.

PubMed

Verde, Ignazio; Abbott, Albert G; Scalabrin, Simone; Jung, Sook; Shu, Shengqiang; Marroni, Fabio; Zhebentyayeva, Tatyana; Dettori, Maria Teresa; Grimwood, Jane; Cattonaro, Federica; Zuccolo, Andrea; Rossini, Laura; Jenkins, Jerry; Vendramin, Elisa; Meisel, Lee A; Decroocq, Veronique; Sosinski, Bryon; Prochnik, Simon; Mitros, Therese; Policriti, Alberto; Cipriani, Guido; Dondini, Luca; Ficklin, Stephen; Goodstein, David M; Xuan, Pengfei; Del Fabbro, Cristian; Aramini, Valeria; Copetti, Dario; Gonzalez, Susana; Horner, David S; Falchi, Rachele; Lucas, Susan; Mica, Erica; Maldonado, Jonathan; Lazzari, Barbara; Bielenberg, Douglas; Pirona, Raul; Miculan, Mara; Barakat, Abdelali; Testolin, Raffaele; Stella, Alessandra; Tartarini, Stefano; Tonutti, Pietro; Arús, Pere; Orellana, Ariel; Wells, Christina; Main, Dorrie; Vizzotto, Giannina; Silva, Herman; Salamini, Francesco; Schmutz, Jeremy; Morgante, Michele; Rokhsar, Daniel S

2013-05-01

Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.
Swine and Poultry Pathogens: the Complete Genome Sequences of Two Strains of Mycoplasma hyopneumoniae and a Strain of Mycoplasma synoviae†

PubMed Central

Vasconcelos, Ana Tereza R.; Ferreira, Henrique B.; Bizarro, Cristiano V.; Bonatto, Sandro L.; Carvalho, Marcos O.; Pinto, Paulo M.; Almeida, Darcy F.; Almeida, Luiz G. P.; Almeida, Rosana; Alves-Filho, Leonardo; Assunção, Enedina N.; Azevedo, Vasco A. C.; Bogo, Maurício R.; Brigido, Marcelo M.; Brocchi, Marcelo; Burity, Helio A.; Camargo, Anamaria A.; Camargo, Sandro S.; Carepo, Marta S.; Carraro, Dirce M.; de Mattos Cascardo, Júlio C.; Castro, Luiza A.; Cavalcanti, Gisele; Chemale, Gustavo; Collevatti, Rosane G.; Cunha, Cristina W.; Dallagiovanna, Bruno; Dambrós, Bibiana P.; Dellagostin, Odir A.; Falcão, Clarissa; Fantinatti-Garboggini, Fabiana; Felipe, Maria S. S.; Fiorentin, Laurimar; Franco, Gloria R.; Freitas, Nara S. A.; Frías, Diego; Grangeiro, Thalles B.; Grisard, Edmundo C.; Guimarães, Claudia T.; Hungria, Mariangela; Jardim, Sílvia N.; Krieger, Marco A.; Laurino, Jomar P.; Lima, Lucymara F. A.; Lopes, Maryellen I.; Loreto, Élgion L. S.; Madeira, Humberto M. F.; Manfio, Gilson P.; Maranhão, Andrea Q.; Martinkovics, Christyanne T.; Medeiros, Sílvia R. B.; Moreira, Miguel A. M.; Neiva, Márcia; Ramalho-Neto, Cicero E.; Nicolás, Marisa F.; Oliveira, Sergio C.; Paixão, Roger F. C.; Pedrosa, Fábio O.; Pena, Sérgio D. J.; Pereira, Maristela; Pereira-Ferrari, Lilian; Piffer, Itamar; Pinto, Luciano S.; Potrich, Deise P.; Salim, Anna C. M.; Santos, Fabrício R.; Schmitt, Renata; Schneider, Maria P. C.; Schrank, Augusto; Schrank, Irene S.; Schuck, Adriana F.; Seuanez, Hector N.; Silva, Denise W.; Silva, Rosane; Silva, Sérgio C.; Soares, Célia M. A.; Souza, Kelly R. L.; Souza, Rangel C.; Staats, Charley C.; Steffens, Maria B. R.; Teixeira, Santuza M. R.; Urmenyi, Turan P.; Vainstein, Marilene H.; Zuccherato, Luciana W.; Simpson, Andrew J. G.; Zaha, Arnaldo

2005-01-01

This work reports the results of analyses of three complete mycoplasma genomes, a pathogenic (7448) and a nonpathogenic (J) strain of the swine pathogen Mycoplasma hyopneumoniae and a strain of the avian pathogen Mycoplasma synoviae; the genome sizes of the three strains were 920,079 bp, 897,405 bp, and 799,476 bp, respectively. These genomes were compared with other sequenced mycoplasma genomes reported in the literature to examine several aspects of mycoplasma evolution. Strain-specific regions, including integrative and conjugal elements, and genome rearrangements and alterations in adhesin sequences were observed in the M. hyopneumoniae strains, and all of these were potentially related to pathogenicity. Genomic comparisons revealed that reduction in genome size implied loss of redundant metabolic pathways, with maintenance of alternative routes in different species. Horizontal gene transfer was consistently observed between M. synoviae and Mycoplasma gallisepticum. Our analyses indicated a likely transfer event of hemagglutinin-coding DNA sequences from M. gallisepticum to M. synoviae. PMID:16077101
Complete genome sequence of Lactobacillus paracasei CAUH35, a new strain isolated from traditional fermented dairy product koumiss in China.

PubMed

Wang, Guohong; Xiong, Yao; Xu, Qi; Yin, Jia; Hao, Yanling

2015-11-20

Lactobacillus paracasei CAUH35 was isolated from homemade koumiss, a traditional fermented dairy product with beneficial effects on human health. The genome consists of a circular 2,770,411 bp chromosome and four plasmids. Genome analysis revealed the presence of gene clusters involved in the production of exopolysaccharides and bacteriocin. The complete genome sequence of L. paracasei CAUH35 will provide a genetic basis for further comparative and functional genomic analyses. Copyright © 2015. Published by Elsevier B.V.
Characterization of the first complete genome sequence of an Impatiens necrotic spot orthotospovirus isolate from the United States and worldwide phylogenetic analyses of INSV isolates.

PubMed

Zhao, Kaixi; Margaria, Paolo; Rosa, Cristina

2018-05-10

Impatiens necrotic spot orthotospovirus (INSV) can impact economically important ornamental plants and vegetables worldwide. Characterization studies on INSV are limited. For most INSV isolates, there are no complete genome sequences available. This lack of genomic information has a negative impact on the understanding of INSV genetic diversity and evolution. Here we report the first complete nucleotide sequence of a US INSV isolate. INSV-UP01 was isolated from an impatiens in Pennsylvania, US. RT-PCR was used to clone its full-length genome and Vector NTI to assemble overlapping sequences. Phylogenetic trees were constructed by using MEGA7 software to show the phylogenetic relationships with other available INSV sequences worldwide. This US isolate has genome and biological features typical of the INSV species and clusters in the Western Hemisphere clade, but its origin appears to be recent. Furthermore, INSV-UP01 might have been involved in a recombination event with an Italian isolate belonging to the Asian clade. Our analyses support the view that INSV isolates infect a broad plant-host range, group by geographic origin rather than by host, and are subject to frequent recombination events. These results justify the need to generate and analyze complete genome sequences of orthotospoviruses in general and INSV in particular.
High-Throughput Sequencing of Six Bamboo Chloroplast Genomes: Phylogenetic Implications for Temperate Woody Bamboos (Poaceae: Bambusoideae)

PubMed Central

Li, De-Zhu

2011-01-01

Background Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies. Methodology/Principal Findings Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae. Conclusions/Significance The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily. 
The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly. PMID:21655229
The complete chloroplast genome of the Dendrobium strongylanthum (Orchidaceae: Epidendroideae).

PubMed

Li, Jing; Chen, Chen; Wang, Zhe-Zhi

2016-07-01

Complete chloroplast genome sequences are very useful for studying the phylogeny and evolution of species. In this study, the complete chloroplast genome of Dendrobium strongylanthum was constructed from whole-genome Illumina sequencing data. The chloroplast genome is 153 058 bp in length with 37.6% GC content and contains two inverted repeats (IRs) of 26 316 bp each. The IR regions are separated by a large single-copy (LSC, 85 836 bp) region and a small single-copy (SSC, 14 590 bp) region. A total of 130 chloroplast genes were successfully annotated, including 84 protein-coding genes, 38 tRNA genes, and eight rRNA genes. Phylogenetic analyses showed that the chloroplast genome of Dendrobium strongylanthum is related to that of Dendrobium officinale.
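The region lengths reported above can be checked against the total: in the usual quadripartite chloroplast layout the genome length is LSC + SSC + 2 × IR, and the three gene categories should sum to the annotated total. The short check below uses only the figures from the abstract; the function name `quadripartite_length` is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Figures as reported in the abstract (bp).
LSC, SSC, IR, TOTAL = 85_836, 14_590, 26_316, 153_058
computed_total = quadripartite_length(LSC, SSC, IR)

# Gene counts as reported: protein-coding, tRNA, rRNA.
gene_counts = {"protein_coding": 84, "tRNA": 38, "rRNA": 8}
computed_genes = sum(gene_counts.values())

print(computed_total == TOTAL)   # region lengths agree with the reported total
print(computed_genes == 130)     # gene categories agree with the reported total
```

Both checks hold for the reported values (85 836 + 14 590 + 2 × 26 316 = 153 058 bp; 84 + 38 + 8 = 130 genes), so the abstract's figures are internally consistent.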
Impact of direct-to-consumer genomic testing at long term follow-up.

PubMed

Bloss, Cinnamon S; Wineinger, Nathan E; Darst, Burcu F; Schork, Nicholas J; Topol, Eric J

2013-06-01

There are few empirical data to inform the debate surrounding the use and regulation of direct-to-consumer (DTC) genome-wide disease risk tests. This study aimed to determine the long term psychological, behavioural, and clinical impacts of genomic risk testing for common disease. The Scripps Genomic Health Initiative is a prospective longitudinal cohort study of adults who purchased the Navigenics Health Compass, a commercially available genomic test. Web based assessments were administered at baseline, short (3 months), and long term (1 year) follow-up. 2240 participants completed either or both follow-ups and a subset of 1325 completed long term follow-up. There were no significant differences from baseline in anxiety (p=0.50), fat intake (p=0.34), or exercise (p=0.39) at long term follow-up, and 96.8% of the sample had no test related distress. Longitudinal linear mixed model analyses were consistent with results of cross-sectional analyses. Screening test completion was associated with sharing genomic test results with a physician (36.0% shared; p<0.001) and perceived utility of the test (61.5% high perceived utility; p=0.002), but was not associated with the genomic risk estimate values themselves. Over a third of DTC genomic test recipients shared their results with their own physician during an approximate 1 year follow-up period, and this sharing was associated with higher screening test completion. Genomic testing was not associated with long term psychological risks, and most participants reportedly perceived the test to be of high personal utility.
The complete mitochondrial genomes of three parasitic nematodes of birds: a unique gene order and insights into nematode phylogeny

PubMed Central

2013-01-01

Background Analyses of mitochondrial (mt) genome sequences in recent years challenge the current working hypothesis of Nematoda phylogeny proposed from morphology, ecology and nuclear small subunit rRNA gene sequences, and raise the need to sequence additional mt genomes for a broad range of nematode lineages. Results We sequenced the complete mt genomes of three Ascaridia species (family Ascaridiidae) that infest chickens, pigeons and parrots, respectively. These three Ascaridia species have an identical arrangement of mt genes to each other but differ substantially from other nematodes. Phylogenetic analyses of the mt genome sequences of the Ascaridia species, together with 62 other nematode species, support the monophylies of seven high-level taxa of the phylum Nematoda: 1) the subclass Dorylaimia; 2) the orders Rhabditida, Trichinellida and Mermithida; 3) the suborder Rhabditina; and 4) the infraorders Spiruromorpha and Oxyuridomorpha. Analyses of mt genome sequences, however, reject the monophylies of the suborders Spirurina and Tylenchina, and the infraorders Rhabditomorpha, Panagrolaimomorpha and Tylenchomorpha. Monophyly of the infraorder Ascaridomorpha varies depending on the methods of phylogenetic analysis. The Ascaridomorpha was more closely related to the infraorders Rhabditomorpha and Diplogasteromorpha (suborder Rhabditina) than it was to the other two infraorders of the Spirurina: Oxyuridomorpha and Spiruromorpha. The closer relationship among Ascaridomorpha, Rhabditomorpha and Diplogasteromorpha was also supported by a shared common pattern of mitochondrial gene arrangement. Conclusions Analyses of mitochondrial genome sequences and gene arrangement have provided novel insights into the phylogenetic relationships among several major lineages of nematodes. Many lineages of nematodes, however, are underrepresented or not represented in these analyses. Expanding taxon sampling is necessary for future phylogenetic studies of nematodes with mt genome sequences. PMID:23800363
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm

PubMed Central

Woo, Hannah L.; O’Dell, Kaela B.; Utturkar, Sagar; McBride, Kathryn R.; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Woyke, Tanja; Brown, Steven D.

2016-01-01

Thalassospira sp. strain KO164 was isolated from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses into this deep-ocean bacterium’s ability to degrade recalcitrant organics such as lignin. PMID:27881538
The First Complete Mitochondrial Genome Sequences for Stomatopod Crustaceans: Implications for Phylogeny

DOE Office of Scientific and Technical Information (OSTI.GOV)

Swinstrom, Kirsten; Caldwell, Roy; Fourcade, H. Matthew

2005-09-07

We report the first complete mitochondrial genome sequences of stomatopods and compare their features to each other and to those of other crustaceans. Phylogenetic analyses of the concatenated mitochondrial protein-coding sequences were used to explore relationships within the Stomatopoda, within the malacostracan crustaceans, and among crustaceans and insects. Although these analyses support the monophyly of both Malacostraca and, within it, Stomatopoda, they also confirm the view of a paraphyletic Crustacea, with Malacostraca being more closely related to insects than to the branchiopod crustaceans.
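Concatenating protein-coding sequences, as mentioned above, means joining per-gene alignments end to end so that each taxon contributes one long row. The sketch below shows one simple way to build such a matrix. It is not the authors' pipeline, the function name `concatenate_alignments` is hypothetical, and filling absent genes with gap characters is an assumption made for illustration.

```python
def concatenate_alignments(gene_alignments, missing="-"):
    """Join per-gene alignments into one sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Within a gene all sequences must have equal length. A taxon that
    lacks a gene is padded with `missing` characters for that gene.
    Genes are joined in the dictionary's insertion order.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    parts = {t: [] for t in taxa}
    for gene, aln in gene_alignments.items():
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        width = lengths.pop()
        for t in taxa:
            parts[t].append(aln.get(t, missing * width))
    return {t: "".join(p) for t, p in parts.items()}
```

Keeping every gene block the same width across taxa preserves column homology, which is what lets a single tree be inferred from the combined data.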
Mitochondrial DNA Evidence Supports the Hypothesis that Triodontophorus Species Belong to Cyathostominae

PubMed Central

Gao, Yuan; Zhang, Yan; Yang, Xin; Qiu, Jian-Hua; Duan, Hong; Xu, Wen-Wen; Chang, Qiao-Cheng; Wang, Chun-Ren

2017-01-01

Equine strongyles, significant nematode pathogens of horses, are characterized by high quantities and species abundance, but classification of this group of parasitic nematodes is debated. Mitochondrial (mt) genome DNA data are often used to address classification controversies. Thus, the objectives of this study were to determine the complete mt genomes of three Cyathostominae nematode species (Cyathostomum catinatum, Cylicostephanus minutus, and Poteriostomum imparidentatum) of horses and reconstruct the phylogenetic relationship of Strongylidae with other nematodes in Strongyloidea to test the hypothesis that Triodontophorus spp. belong to Cyathostominae using the mt genomes. The mt genomes of Cy. catinatum, Cs. minutus, and P. imparidentatum were 13,838, 13,826, and 13,817 bp in length, respectively. Complete mt nucleotide sequence comparison of all Strongylidae nematodes revealed that sequence identity ranged from 77.8 to 91.6%. The mt genome sequences of Triodontophorus species had relatively high identity with Cyathostominae nematodes, rather than Strongylus species of the same subfamily (Strongylinae). Comparative analyses of mt genome organization for Strongyloidea nematodes sequenced to date revealed that members of this superfamily possess identical gene arrangements. Phylogenetic analyses using mtDNA data indicated that the Triodontophorus species clustered with Cyathostominae species instead of Strongylus species. The present study is the first to determine the complete mt genome sequences of Cy. catinatum, Cs. minutus, and P. imparidentatum, which will provide novel genetic markers for further studies of Strongylidae taxonomy, population genetics, and systematics. Importantly, sequence comparison and phylogenetic analyses based on mtDNA sequences supported the hypothesis that Triodontophorus belongs to Cyathostominae. PMID:28824575
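Sequence identity figures such as the 77.8 to 91.6% range above are typically computed from pairwise alignments. The abstract does not say how gaps were handled, so the sketch below makes an explicit assumption: columns containing a gap in either sequence are skipped. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    Assumption for this sketch: any column with a gap ('-') in either
    sequence is excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions (for example, counting gapped columns as mismatches) give lower values for the same alignment, so the gap rule should be reported alongside any identity figure.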
Complete cpDNA genome sequence of Smilax china and phylogenetic placement of Liliales--influences of gene partitions and taxon sampling.

PubMed

Liu, Juan; Qi, Zhe-Chen; Zhao, Yun-Peng; Fu, Cheng-Xin; Jenny Xiang, Qiu-Yun

2012-09-01

The complete nucleotide sequence of the chloroplast genome (cpDNA) of Smilax china L. (Smilacaceae) is reported. It is the first complete cp genome sequence in Liliales. Genomic analyses were conducted to examine the rate and pattern of cpDNA genome evolution in Smilax relative to other major lineages of monocots. The cpDNA genomic sequences were combined with those available for Lilium to evaluate the phylogenetic position of Liliales and to investigate the influence of taxon sampling, gene sampling, gene function, natural selection, and substitution rate on phylogenetic inference in monocots. Phylogenetic analyses using sequence data of gene groups partitioned according to gene function, selection force, and total substitution rate demonstrated evident impacts of these factors on phylogenetic inference of monocots and the placement of Liliales, suggesting potential evolutionary convergence or adaptation of some cpDNA genes in monocots. Our study also demonstrated that reduced taxon sampling reduced the bootstrap support for the placement of Liliales in the cpDNA phylogenomic analysis. Analyses of sequences of 77 protein genes with some missing data and sequences of 81 genes (all protein genes plus the rRNA genes) support a sister relationship of Liliales to the commelinids-Asparagales clade, consistent with the APG III system. Analyses of 63 cpDNA protein genes for 32 taxa with few missing data, however, support a sister relationship of Liliales (represented by Smilax and Lilium) to Dioscoreales-Pandanales. Topology tests indicated that these two alignments do not significantly differ given any of these three cpDNA genomic sequence data sets. Furthermore, we found no saturation effect of the data, suggesting that the cpDNA genomic sequence data used in the study are appropriate for monocot phylogenetic study and that long-branch attraction is unlikely to explain the two well-supported, conflicting placements of Liliales. Further analyses using sufficient nuclear data remain necessary to evaluate these two phylogenetic hypotheses regarding the position of Liliales and to address the causes of signal conflict among genes and partitions. Copyright © 2012 Elsevier Inc. All rights reserved.

Origins of the Xylella fastidiosa prophage-like regions and their impact in genome differentiation

USDA-ARS's Scientific Manuscript database

Xylella fastidiosa is a Gram negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome compositi...
Ten new complete mitochondrial genomes of pulmonates (Mollusca: Gastropoda) and their impact on phylogenetic relationships

PubMed Central

2011-01-01

Background Reconstructing the higher relationships of pulmonate gastropods has been difficult. The use of morphology is problematic due to high homoplasy. Molecular studies have suffered from low taxon sampling. Forty-eight complete mitochondrial genomes are available for gastropods, ten of which are pulmonates. Here are presented the new complete mitochondrial genomes of the ten following species of pulmonates: Salinator rhamphidia (Amphiboloidea); Auriculinella bidentata, Myosotella myosotis, Ovatella vulcani, and Pedipes pedipes (Ellobiidae); Peronia peronii (Onchidiidae); Siphonaria gigas (Siphonariidae); Succinea putris (Stylommatophora); Trimusculus reticulatus (Trimusculidae); and Rhopalocaulis grandidieri (Veronicellidae). Also, 94 new pulmonate-specific primers across the entire mitochondrial genome are provided, which were designed for amplifying entire mitochondrial genomes through short reactions and closing gaps after shotgun sequencing. Results The structural features of the 10 new mitochondrial genomes are provided. All genomes share similar gene orders. Phylogenetic analyses were performed including the 10 new genomes and 17 genomes from Genbank (outgroups, opisthobranchs, and other pulmonates). Bayesian Inference and Maximum Likelihood analyses, based on the concatenated amino-acid sequences of the 13 protein-coding genes, produced the same topology. The pulmonates are paraphyletic and basal to the opisthobranchs that are monophyletic at the tip of the tree. Siphonaria, traditionally regarded as a basal pulmonate, is nested within opisthobranchs. Pyramidella, traditionally regarded as a basal (non-euthyneuran) heterobranch, is nested within pulmonates. Several hypotheses are rejected, such as the Systellommatophora, Geophila, and Eupulmonata. The Ellobiidae is polyphyletic, but the false limpet Trimusculus reticulatus is closely related to some ellobiids. 
Conclusions Despite recent efforts for increasing the taxon sampling in euthyneuran (opisthobranchs and pulmonates) molecular phylogenies, several of the deeper nodes are still uncertain, because of low support values as well as some incongruence between analyses based on complete mitochondrial genomes and those based on individual genes (18S, 28S, 16S, CO1). Additional complete genomes are needed for pulmonates (especially for Williamia, Otina, and Smeagol), as well as basal heterobranchs closely related to euthyneurans. Increasing the number of markers for gastropod (and more broadly mollusk) phylogenetics also is necessary in order to resolve some of the deeper nodes, although this is clearly not an easy task. Step by step, however, new relationships are being unveiled, such as the close relationships between the false limpet Trimusculus and ellobiids, the nesting of pyramidelloids within pulmonates, and the close relationships of Siphonaria to sacoglossan opisthobranchs. The additional genomes presented here show that some species share an identical mitochondrial gene order due to convergence. PMID:21985526
Near-Complete Genome Sequence of Thalassospira sp. Strain KO164 Isolated from a Lignin-Enriched Marine Sediment Microcosm.

PubMed

Woo, Hannah L; O'Dell, Kaela B; Utturkar, Sagar; McBride, Kathryn R; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Palaniappan, Krishnaveni; Varghese, Neha; Mikhailova, Natalia; Stamatis, Dimitrios; Reddy, T B K; Ngan, Chew Yee; Daum, Chris; Shapiro, Nicole; Markowitz, Victor; Ivanova, Natalia; Kyrpides, Nikos; Woyke, Tanja; Brown, Steven D; Hazen, Terry C

2016-11-23

Thalassospira sp. strain KO164 was isolated from eastern Mediterranean seawater and sediment laboratory microcosms enriched on insoluble organosolv lignin under oxic conditions. The near-complete genome sequence presented here will facilitate analyses into this deep-ocean bacterium's ability to degrade recalcitrant organics such as lignin. Copyright © 2016 Woo et al.
The complete mitochondrial genome of the dwarf tapeworm Hymenolepis nana--a neglected zoonotic helminth.

PubMed

Cheng, Tian; Liu, Guo-Hua; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan

2016-03-01

Hymenolepis nana, commonly known as the dwarf tapeworm, is one of the most common tapeworms of humans and rodents and can cause hymenolepiasis. Although this zoonotic tapeworm is of socio-economic significance in many countries of the world, its genetics, systematics, epidemiology, and biology are poorly understood. In the present study, we sequenced and characterized the complete mitochondrial (mt) genome of H. nana. The mt genome is 13,764 bp in size and encodes 36 genes, including 12 protein-coding genes, 2 ribosomal RNA genes, and 22 transfer RNA genes. All genes are transcribed in the same direction. The gene order and genome content are identical to those of its congener Hymenolepis diminuta. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference, maximum likelihood, and maximum parsimony showed the division of class Cestoda into two orders and supported the monophyly of both orders, Cyclophyllidea and Pseudophyllidea. Analyses of mt genome sequences also support the monophyly of the three families Taeniidae, Hymenolepididae, and Diphyllobothriidae. This novel mt genome provides a useful genetic marker for studying the molecular epidemiology, systematics, and population genetics of the dwarf tapeworm and should have implications for the diagnosis, prevention, and control of hymenolepiasis in humans.
Complete chloroplast genome of Prunus yedoensis Matsum.(Rosaceae), wild and endemic flowering cherry on Jeju Island, Korea.

PubMed

Cho, Myong-Suk; Hyun Cho, Chung; Yeon Kim, Su; Su Yoon, Hwan; Kim, Seung-Chul

2016-09-01

The complete chloroplast genome sequence of the wild flowering cherry, Prunus yedoensis Matsum., which is native and endemic to Jeju Island, Korea, is reported in this study. The genome is 157 786 bp in length with 36.7% GC content, and is composed of an LSC region of 85 908 bp, an SSC region of 19 120 bp and two IR copies of 26 379 bp each. The cp genome contains 131 genes, including 86 coding genes, 8 rRNA genes and 37 tRNA genes. A maximum likelihood analysis was conducted to verify the phylogenetic position of the newly sequenced cp genome of P. yedoensis using 11 representative complete cp genome sequences within the family Rosaceae. The genus Prunus exhibited monophyly, and the resulting phylogenetic relationships agreed with previous phylogenetic analyses within Rosaceae.
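The region lengths reported above can be checked against the stated total, because a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The following is a minimal Python sketch of that arithmetic using only the figures from the abstract; the function name is illustrative and not taken from the study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a plastid genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Figures reported for Prunus yedoensis in the abstract above.
total = quadripartite_length(lsc=85_908, ssc=19_120, ir=26_379)
print(total)  # 157786, the reported genome size

# The gene tally can be checked the same way: coding + rRNA + tRNA genes.
gene_total = 86 + 8 + 37
print(gene_total)  # 131
```

Both sums agree with the totals given in the record.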
Comparative analysis of chloroplast genomes of the genus Citrus and its close relatives.

PubMed

Liu, Xiaogang; Wu, Hongkun; Luo, Yan; Xi, Wanpeng; Zhou, Zhiqin

2017-01-01

The genus Citrus and its close relatives are economically and nutritionally important fruit trees. However, the huge controversy over the phylogeny of key wild species, as well as the genetic relationship between the cultivated species and their putative wild progenitors, remains unresolved. Comparative analyses of chloroplast (cp) genomes have been useful in resolving various phylogenetic issues. Thus far, the cp genomes of only two Citrus species have been sequenced. In this study, we sequenced six complete cp genomes, four belonging to the genus Citrus, and two belonging to the genera Fortunella and Poncirus, respectively. These newly sequenced genomes together with the two publicly available were used for comparative analyses of the genus Citrus and its close relatives. All eight cp genomes share similar basic structure, gene order and gene content. Phylogenetic analyses supported the monophyly of the three genera in the order Sapindales within the major clade Malvidae.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).

PubMed

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-04-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae)

PubMed Central

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-01-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke has not been known yet. The present study has determined the complete mt genome sequences of E. hortense and assessed the phylogenetic relationships with other digenean species for which the complete mt genome sequences are available in GenBank using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum gathered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequences of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575
Genome sequencing of the extinct Eurasian wild aurochs illuminates the phylogeography and evolution of cattle

USDA-ARS's Scientific Manuscript database

Interrogation of modern and ancient bovine genome sequences provides a valuable model to study the evolution of cattle. Here, we analyse the first complete wild aurochs (Bos primigenius) genome sequence using DNA extracted from a ~ 6,750 year-old humerus bone retrieved from a cave site in Derbyshire...
Translational Genomics Research Institute (TGen): Quantified Cancer Cell Line Encyclopedia (CCLE) RNA-seq Data | Office of Cancer Genomics

Cancer.gov

Many applications analyze quantified transcript-level abundances to make inferences. Having completed this computation across the large sample set, the CTD2 Center at the Translational Genomics Research Institute presents the quantified data in a straightforward, consolidated form for these types of analyses.
Complete genome sequence of Bacillus velezensis QST713: A biocontrol agent that protects Agaricus bisporus crops against the green mould disease.

PubMed

Pandin, Caroline; Le Coq, Dominique; Deschamps, Julien; Védie, Régis; Rousseau, Thierry; Aymerich, Stéphane; Briandet, Romain

2018-04-24

Bacillus subtilis QST713 is extensively used as a biological control agent in agricultural fields including in the button mushroom culture, Agaricus bisporus. This last use exploits its inhibitory activity against microbial pathogens such as Trichoderma aggressivum f. europaeum, the main button mushroom green mould competitor. Here, we report the complete genome sequence of this bacterium with a genome size of 4 233 757 bp, 4263 predicted genes and an average GC content of 45.9%. Based on phylogenomic analyses, strain QST713 is finally designated as Bacillus velezensis. Genomic analyses revealed two clusters encoding potential new antimicrobials with NRPS and TransATPKS synthetase. B. velezensis QST713 genome also harbours several genes previously described as being involved in surface colonization and biofilm formation. This strain shows a strong ability to form in vitro spatially organized biofilm and to antagonize T. aggressivum. The availability of this genome sequence could bring new elements to understand the interactions with micro or/and macroorganisms in crops. Copyright © 2018 Elsevier B.V. All rights reserved.
Complete Genome Sequence of Porcine Parvovirus N Strain Isolated from Guangxi, China

PubMed Central

Su, Qian-Lian; Li, Bin; Liang, Jia-Xing; He, Ying; Qin, Yi-Bin; Lu, Bing-Xia

2015-01-01

We report here the complete genomic sequence of the porcine parvovirus (PPV) N strain, isolated in 1989 from the viscera of a stillborn fetus farrowed by a gilt in Guangxi, southern China. Phylogenetic analyses suggest that the PPV-N strain is closely related to attenuated PPV NADL-2 strains. The PPV-N strain has good immunogenicity, genetic stability, and safety. PMID:25573932
The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

PubMed

Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

2017-01-01

The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.
Comparative and genetic analysis of the four sequenced Paenibacillus polymyxa genomes reveals a diverse metabolism and conservation of genes relevant to plant-growth promotion and competitiveness.

PubMed

Eastman, Alexander W; Heinrichs, David E; Yuan, Ze-Chun

2014-10-03

Members of the genus Paenibacillus are important plant growth-promoting rhizobacteria that can serve as bio-reactors. Paenibacillus polymyxa promotes the growth of a variety of economically important crops. Our lab recently completed the genome sequence of Paenibacillus polymyxa CR1. As of January 2014, four P. polymyxa genomes have been completely sequenced but no comparative genomic analyses have been reported. Here we report the comparative and genetic analyses of four sequenced P. polymyxa genomes, which revealed a significantly conserved core genome. Complex metabolic pathways and regulatory networks were highly conserved and allow P. polymyxa to rapidly respond to dynamic environmental cues. Genes responsible for phytohormone synthesis, phosphate solubilization, iron acquisition, transcriptional regulation, σ-factors, stress responses, transporters and biomass degradation were well conserved, indicating an intimate association with plant hosts and the rhizosphere niche. In addition, genes responsible for antimicrobial resistance and non-ribosomal peptide/polyketide synthesis are present in both the core and accessory genome of each strain. Comparative analyses also reveal variations in the accessory genome, including large plasmids present in strains M1 and SC2. Furthermore, a considerable number of strain-specific genes and genomic islands are irregularly distributed throughout each genome. Although a variety of plant-growth promoting traits are encoded by all strains, only P. polymyxa CR1 encodes the unique nitrogen fixation cluster found in other Paenibacillus sp. Our study revealed that genomic loci relevant to host interaction and ecological fitness are highly conserved within the P. polymyxa genomes analysed, despite variations in the accessory genome. This work suggests that plant-growth promotion by P. polymyxa is mediated largely through phytohormone production, increased nutrient availability and bio-control mechanisms. 
This study provides an in-depth understanding of the genome architecture of this species, thus facilitating future genetic engineering and applications in agriculture, industry and medicine. Furthermore, this study highlights the current gap in our understanding of complex plant biomass metabolism in Gram-positive bacteria.
Translational Genomics Research Institute: Quantified Cancer Cell Line Encyclopedia (CCLE) RNA-seq Data | Office of Cancer Genomics

Cancer.gov

Many applications analyze quantified transcript-level abundances to make inferences. Having completed this computation across the large sample set, the CTD2 Center at the Translational Genomics Research Institute presents the quantified data in a straightforward, consolidated form for these types of analyses.
Clustering of Pan- and Core-genome of Lactobacillus provides Novel Evolutionary Insights for Differentiation.

PubMed

Inglin, Raffael C; Meile, Leo; Stevens, Marc J A

2018-04-24

Bacterial taxonomy aims to classify bacteria based on true evolutionary events and relies on a polyphasic approach that includes phenotypic, genotypic and chemotaxonomic analyses. Until now, complete genomes are largely ignored in taxonomy. The genus Lactobacillus consists of 173 species and many genomes are available to study taxonomy and evolutionary events. We analyzed and clustered 98 completely sequenced genomes of the genus Lactobacillus and 234 draft genomes of 5 different Lactobacillus species, i.e. L. reuteri, L. delbrueckii, L. plantarum, L. rhamnosus and L. helveticus. The core-genome of the genus Lactobacillus contains 266 genes and the pan-genome 20'800 genes. Clustering of the Lactobacillus pan- and core-genome resulted in two highly similar trees. This shows that evolutionary history is traceable in the core-genome and that clustering of the core-genome is sufficient to explore relationships. Clustering of core- and pan-genomes at species' level resulted in similar trees as well. Detailed analyses of the core-genomes showed that the functional class "genetic information processing" is conserved in the core-genome but that "signaling and cellular processes" is not. The latter class encodes functions that are involved in environmental interactions. Evolution of lactobacilli seems therefore directed by the environment. The type species L. delbrueckii was analyzed in detail and its pan-genome based tree contained two major clades whose members contained different genes yet identical functions. In addition, evidence for horizontal gene transfer between strains of L. delbrueckii, L. plantarum, and L. rhamnosus, and between species of the genus Lactobacillus is presented. Our data provide evidence for evolution of some lactobacilli according to a parapatric-like model for species differentiation. Core-genome trees are useful to detect evolutionary relationships in lactobacilli and might be useful in taxonomic analyses. 
Lactobacillus' evolution is directed by the environment and HGT.
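The core- and pan-genome counts in the record above follow from simple set operations: the core-genome is the set of gene families present in every genome, and the pan-genome is the set present in at least one. The sketch below illustrates that definition in Python under the assumption that each genome has already been reduced to a set of gene-family labels. The strain names and gene labels are made-up toy values, not data from the study, and the clustering step the authors used is not reproduced here.

```python
from typing import Dict, Set, Tuple


def core_and_pan(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan): families found in every genome, and in at least one."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)  # shared by all genomes
    pan = set.union(*gene_sets)          # present in any genome
    return core, pan


# Hypothetical toy input: three strains with a few gene-family labels each.
toy = {
    "strain_A": {"rpoB", "gyrA", "lacZ"},
    "strain_B": {"rpoB", "gyrA", "bglA"},
    "strain_C": {"rpoB", "gyrA"},
}
core, pan = core_and_pan(toy)
print(sorted(core))  # ['gyrA', 'rpoB']
print(sorted(pan))   # ['bglA', 'gyrA', 'lacZ', 'rpoB']
```

With real data the core shrinks and the pan grows as genomes are added, which is why the abstract reports a core of 266 genes against a pan-genome of about 20,800.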
Re-annotation, improved large-scale assembly and establishment of a catalogue of noncoding loci for the genome of the model brown alga Ectocarpus.

PubMed

Cormier, Alexandre; Avia, Komlan; Sterck, Lieven; Derrien, Thomas; Wucher, Valentin; Andres, Gwendoline; Monsoor, Misharl; Godfroy, Olivier; Lipinska, Agnieszka; Perrineau, Marie-Mathilde; Van De Peer, Yves; Hitte, Christophe; Corre, Erwan; Coelho, Susana M; Cock, J Mark

2017-04-01

The genome of the filamentous brown alga Ectocarpus was the first to be completely sequenced from within the brown algal group and has served as a key reference genome both for this lineage and for the stramenopiles. We present a complete structural and functional reannotation of the Ectocarpus genome. The large-scale assembly of the Ectocarpus genome was significantly improved and genome-wide gene re-annotation using extensive RNA-seq data improved the structure of 11 108 existing protein-coding genes and added 2030 new loci. A genome-wide analysis of splicing isoforms identified an average of 1.6 transcripts per locus. A large number of previously undescribed noncoding genes were identified and annotated, including 717 loci that produce long noncoding RNAs. Conservation of lncRNAs between Ectocarpus and another brown alga, the kelp Saccharina japonica, suggests that at least a proportion of these loci serve a function. Finally, a large collection of single nucleotide polymorphism-based markers was developed for genetic analyses. These resources are available through an updated and improved genome database. This study significantly improves the utility of the Ectocarpus genome as a high-quality reference for the study of many important aspects of brown algal biology and as a reference for genomic analyses across the stramenopiles. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Bioinformatic Workflows for Generating Complete Plastid Genome Sequences-An Example from Cabomba (Cabombaceae) in the Context of the Phylogenomic Analysis of the Water-Lily Clade.

PubMed

Gruenstaeudl, Michael; Gerschler, Nico; Borsch, Thomas

2018-06-21

The sequencing and comparison of plastid genomes are becoming a standard method in plant genomics, and many researchers are using this approach to infer plant phylogenetic relationships. Due to the widespread availability of next-generation sequencing, plastid genome sequences are being generated at breakneck pace. This trend towards massive sequencing of plastid genomes highlights the need for standardized bioinformatic workflows. In particular, documentation and dissemination of the details of genome assembly, annotation, alignment and phylogenetic tree inference are needed, as these processes are highly sensitive to the choice of software and the precise settings used. Here, we present the procedure and results of sequencing, assembling, annotating and quality-checking of three complete plastid genomes of the aquatic plant genus Cabomba as well as subsequent gene alignment and phylogenetic tree inference. We accompany our findings by a detailed description of the bioinformatic workflow employed. Importantly, we share a total of eleven software scripts for each of these bioinformatic processes, enabling other researchers to evaluate and replicate our analyses step by step. The results of our analyses illustrate that the plastid genomes of Cabomba are highly conserved in both structure and gene content.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer.

PubMed

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph; Aury, Jean-Marc

2017-02-01

Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. © The Author 2017. Published by Oxford University Press.
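The abstract above compares assembly contiguity without naming the metric used. A common way to summarize contiguity is N50, the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly. The Python sketch below assumes N50 purely for illustration; the contig lengths are invented toy values, not results from the study.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total reaches half the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


# Hypothetical assemblies of the same 300-unit genome.
long_read = [100, 80, 50, 30, 20, 20]
short_read = [40] * 5 + [20] * 5

print(n50(long_read))                     # 80
print(n50(short_read))                    # 40
print(n50(long_read) / n50(short_read))   # 2.0
```

A ratio of this kind is one way a statement such as "14 times higher" could be derived, although the authors may have used a different measure.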
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer

PubMed Central

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph

2017-01-01

Abstract Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459

What can we learn about lyssavirus genomes using 454 sequencing?

PubMed

Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

2012-01-01

The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses," within the network "Lyssaviruses--a potential re-emerging public health threat" is to provide high quality complete genome sequences from lyssaviruses. These sequences are analysed in-depth with regard to the diversity of the viral populations as to both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate that not only can high quality full-length lyssavirus genome sequences be generated, but efficient analysis of the viral population also becomes feasible.
Complete genome sequence of porcine parvovirus N strain isolated from Guangxi, China.

PubMed

Su, Qian-Lian; Li, Bin; Zhao, Wu; Liang, Jia-Xing; He, Ying; Qin, Yi-Bin; Lu, Bing-Xia

2015-01-08

We report here the complete genomic sequence of the porcine parvovirus (PPV) N strain, isolated in 1989 from the viscera of a stillborn fetus farrowed by a gilt in Guangxi, southern China. Phylogenetic analyses suggest that the PPV-N strain is closely related to attenuated PPV NADL-2 strains. The PPV-N strain has good immunogenicity, genetic stability, and safety. Copyright © 2015 Su et al.
Genomic Diversity in the Endosymbiotic Bacterium Rhizobium leguminosarum.

PubMed

Sánchez-Cañizares, Carmen; Jorrín, Beatriz; Durán, David; Nadendla, Suvarna; Albareda, Marta; Rubio-Sanz, Laura; Lanza, Mónica; González-Guerrero, Manuel; Prieto, Rosa Isabel; Brito, Belén; Giglio, Michelle G; Rey, Luis; Ruiz-Argüeso, Tomás; Palacios, José M; Imperial, Juan

2018-01-24

Rhizobium leguminosarum bv. viciae is a soil α-proteobacterium that establishes a diazotrophic symbiosis with different legumes of the Fabeae tribe. The number of genome sequences from rhizobial strains available in public databases is constantly increasing, although complete, fully annotated genome structures from rhizobial genomes are scarce. In this work, we report and analyse the complete genome of R. leguminosarum bv. viciae UPM791. Whole genome sequencing can provide new insights into the genetic features contributing to symbiotically relevant processes such as bacterial adaptation to the rhizosphere, mechanisms for efficient competition with other bacteria, and the ability to establish a complex signalling dialogue with legumes, to enter the root without triggering plant defenses, and, ultimately, to fix nitrogen within the host. Comparison of the complete genome sequences of two strains of R. leguminosarum bv. viciae, 3841 and UPM791, highlights the existence of different symbiotic plasmids and a common core chromosome. Specific genomic traits, such as plasmid content or a distinctive regulation, define differential physiological capabilities of these endosymbionts. Among them, strain UPM791 presents unique adaptations for recycling the hydrogen generated in the nitrogen fixation process.
Genomic Characterization of Travel-Associated Dengue Viruses Isolated from the Entry-Exit Ports in Fujian Province, China, 2013-2015.

PubMed

Gao, Bo; Zhang, Jianming; Wang, Yuping; Chen, Fan; Zheng, Chaohui; Xie, Lianhui

2017-09-25

Over the past decade, indigenous dengue outbreaks have occurred occasionally in Fujian province in southeastern China because of sporadic imported dengue viruses (DENV). In this study, 3 DENV-2 and 2 DENV-4 strains were isolated from suspected febrile travelers at 2 ports of entry in Fujian between 2013 and 2015. Complete viral genome sequences of these new isolates were obtained with Sanger chemistry. Genomic sequence analyses revealed that these strains belonged to genotypes 2-Cosmopolitan and 4-II. Consistent with the patients' travel information, phylogenetic analyses of the complete coding regions also indicated that most of the new isolates were genetically similar to the circulating strains in Southeast Asia rather than to the previous Chinese strains that were available. Therefore, phylogenetic analyses of the imported DENV demonstrated that multiple introductions of DENV emerged continuously in Fujian, and highlighted the importance of dengue surveillance at entry-exit ports in the subtropical regions of southern China.
Dereplication, Aggregation and Scoring Tool (DAS Tool) v1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

SIEBER, CHRISTIAN

Communities of uncultivated microbes are critical to ecosystem function and microorganism health, and a key objective of metagenomic studies is to analyze organism-specific metabolic pathways and reconstruct community interaction networks. This requires accurate assignment of genes to genomes, yet existing binning methods often fail to predict a reasonable number of genomes and report many bins of low quality and completeness. Furthermore, the performance of existing algorithms varies between samples and biotypes. Here, we present a dereplication, aggregation and scoring strategy, DAS Tool, that combines the strengths of a flexible set of established binning algorithms. DAS Tool applied to a constructed community generated more accurate bins than any automated method. Further, when applied to samples of different complexity, including soil, natural oil seeps, and the human gut, DAS Tool recovered substantially more near-complete genomes than any single binning method alone. Included were three genomes from a novel lineage. The ability to reconstruct many near-complete genomes from metagenomics data will greatly advance genome-centric analyses of ecosystems.
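The abstract above names the three steps (dereplication, aggregation, scoring) but does not give the scoring function. The following Python sketch only illustrates the general idea under stated assumptions: candidate bins from several binners are scored by how many single-copy marker genes they contain, the best bin is kept, its contigs are removed from the remaining candidates, and the process repeats. The names `score_bin` and `select_bins`, the duplication penalty, and the `min_score` threshold are hypothetical and are not taken from DAS Tool itself.

```python
def score_bin(contigs, markers_on_contig, n_markers, dup_penalty=1.0):
    """Hypothetical score: (distinct markers - penalty * duplicated copies) / marker set size."""
    counts = {}
    for contig in contigs:
        for gene in markers_on_contig.get(contig, ()):
            counts[gene] = counts.get(gene, 0) + 1
    distinct = len(counts)
    extra_copies = sum(n - 1 for n in counts.values())
    return (distinct - dup_penalty * extra_copies) / n_markers


def select_bins(candidates, markers_on_contig, n_markers, min_score=0.5):
    """Greedily pick the best-scoring bin, then remove its contigs from all other candidates."""
    remaining = {name: set(contigs) for name, contigs in candidates.items()}
    chosen = []
    while remaining:
        scores = {name: score_bin(contigs, markers_on_contig, n_markers)
                  for name, contigs in remaining.items()}
        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(scores), key=scores.get)
        if scores[best] < min_score:
            break
        picked = remaining.pop(best)
        chosen.append((best, sorted(picked), scores[best]))
        # Dereplicate: a contig may belong to only one selected bin.
        remaining = {name: contigs - picked
                     for name, contigs in remaining.items() if contigs - picked}
    return chosen


# Toy input (invented for illustration): two binners, four marker genes.
markers_on_contig = {"c1": ["m1", "m2"], "c2": ["m3"], "c3": ["m4"],
                     "c4": ["m1"], "c5": ["m1", "m2", "m3"], "c6": ["m4"]}
candidates = {"binnerA_1": ["c1", "c2"], "binnerB_1": ["c1", "c2", "c3"],
              "binnerB_2": ["c5", "c6"], "binnerA_2": ["c5", "c4"]}
selected = select_bins(candidates, markers_on_contig, n_markers=4)
print([name for name, _, _ in selected])
```

In this toy case the selection keeps one bin per underlying genome and discards the overlapping, less complete candidates. A real pipeline would use a curated single-copy gene set and the tool's own scoring.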
The Complete Chloroplast Genome of Wild Rice (Oryza minuta) and Its Comparison to Related Species.

PubMed

Asaf, Sajjad; Waqas, Muhammad; Khan, Abdul L; Khan, Muhammad A; Kang, Sang-Mo; Imran, Qari M; Shahzad, Raheem; Bilal, Saqib; Yun, Byung-Wook; Lee, In-Jung

2017-01-01

Oryza minuta, a tetraploid wild relative of cultivated rice (family Poaceae), possesses a BBCC genome and contains genes that confer resistance to bacterial blight (BB) and white-backed (WBPH) and brown (BPH) plant hoppers. Based on the importance of this wild species, this study aimed to understand the phylogenetic relationships of O. minuta with other Oryza species through an in-depth analysis of the composition and diversity of the chloroplast (cp) genome. The analysis revealed a cp genome size of 135,094 bp with a typical quadripartite structure consisting of a pair of inverted repeats separated by small and large single-copy regions, 139 representative genes, and 419 randomly distributed microsatellites. The genomic organization, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. Approximately 30 forward, 28 tandem and 20 palindromic repeats were detected in the O. minuta cp genome. Comparison of the complete O. minuta cp genome with those of eleven other Oryza species showed a high degree of sequence similarity and relatively high divergence of intergenic spacers. Phylogenetic analyses based on the complete genome sequence, 65 shared genes and the matK gene showed the same topologies, with O. minuta forming a single clade with parental O. punctata. Thus, the complete O. minuta cp genome provides interesting insights and valuable information that can be used to identify related species and reconstruct its phylogeny.
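The abstract reports a microsatellite count without saying how the microsatellites were detected. As a minimal sketch of one common approach, the Python code below scans a sequence for perfect tandem repeats of 1 to 6 bp motifs using regular-expression backreferences. The repeat thresholds in `MIN_REPEATS` and the function names are assumptions chosen for illustration, not the settings used in the study.

```python
import re

# Assumed thresholds: minimum number of repeat units for each motif length.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip 'AA', 'ATAT', ... already covered by shorter motifs
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: an (AT)6 repeat and an (A)12 run embedded in flanking bases.
toy = "GGC" + "AT" * 6 + "CCG" + "A" * 12 + "TTG"
print(find_ssrs(toy))
```

Start positions are 0-based. Imperfect or compound repeats, which dedicated SSR tools also report, are outside the scope of this sketch.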
Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

PubMed

Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

2015-12-11

High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

PubMed

Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

PubMed Central

Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423
Complete sequences of organelle genomes from the medicinal plant Rhazya stricta (Apocynaceae) and contrasting patterns of mitochondrial genome evolution across asterids.

PubMed

Park, Seongjun; Ruhlman, Tracey A; Sabir, Jamal S M; Mutwakil, Mohammed H Z; Baeshen, Mohammed N; Sabir, Meshaal J; Baeshen, Nabih A; Jansen, Robert K

2014-05-28

Rhazya stricta is native to arid regions in South Asia and the Middle East and is used extensively in folk medicine to treat a wide range of diseases. In addition to generating genomic resources for this medicinally important plant, analyses of the complete plastid and mitochondrial genomes and a nuclear transcriptome from Rhazya provide insights into inter-compartmental transfers between genomes and the patterns of evolution among eight asterid mitochondrial genomes. The 154,841 bp plastid genome is highly conserved with gene content and order identical to the ancestral organization of angiosperms. The 548,608 bp mitochondrial genome exhibits a number of phenomena including the presence of recombinogenic repeats that generate a multipartite organization, transferred DNA from the plastid and nuclear genomes, and bidirectional DNA transfers between the mitochondrion and the nucleus. The mitochondrial genes sdh3 and rps14 have been transferred to the nucleus and have acquired targeting presequences. In the case of rps14, two copies are present in the nucleus; only one has a mitochondrial targeting presequence and may be functional. Phylogenetic analyses of both nuclear and mitochondrial copies of rps14 across angiosperms suggest Rhazya has experienced a single transfer of this gene to the nucleus, followed by a duplication event. Furthermore, the phylogenetic distribution of gene losses and the high level of sequence divergence in targeting presequences suggest multiple, independent transfers of both sdh3 and rps14 across asterids. Comparative analyses of mitochondrial genomes of eight sequenced asterids indicate a complicated evolutionary history in this large angiosperm clade with considerable diversity in genome organization and size, repeat, gene and intron content, and amount of foreign DNA from the plastid and nuclear genomes.
Organelle genomes of Rhazya stricta provide valuable information for improving the understanding of mitochondrial genome evolution among angiosperms. The genomic data have enabled a rigorous examination of the gene transfer events. Rhazya is unique among the eight sequenced asterids in the types of events that have shaped the evolution of its mitochondrial genome. Furthermore, the organelle genomes of R. stricta provide valuable genomic resources for utilizing this important medicinal plant in biotechnology applications.
Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia.

PubMed

Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee

2016-01-01

Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B; however, the limited number of HRV-C complete genomes (complete 5' and 3' non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16, are the first complete HRV-C genomes sequenced for these types. The genomes of all seven Malaysian isolates displayed nucleotide similarity of 63-81% among themselves and 63-96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selection was frequently detected in HRV-C complete coding sequences, indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection.
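Similarity ranges such as those quoted above are typically derived from all pairwise comparisons within a multiple sequence alignment. The Python sketch below shows one simple way to compute such a range. It assumes the sequences are already aligned to equal length, counts a gap opposite a base as a mismatch, and skips columns where both sequences have a gap. Studies differ in how they treat gaps, so this is an illustrative convention rather than the method used by the authors.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical columns between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # column is a gap in both sequences: not informative
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Return (min, max) percent identity over all pairs in a {name: sequence} dict."""
    values = [percent_identity(aligned[p], aligned[q])
              for p, q in combinations(aligned, 2)]
    return min(values), max(values)


# Toy alignment (invented sequences, for illustration only).
toy_alignment = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACGAACCA"}
print(identity_range(toy_alignment))
```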
Comparative Genetic Analyses of Human Rhinovirus C (HRV-C) Complete Genome from Malaysia

PubMed Central

Khaw, Yam Sim; Chan, Yoke Fun; Jafar, Faizatul Lela; Othman, Norlijah; Chee, Hui Yee

2016-01-01

Human rhinovirus-C (HRV-C) has been implicated in more severe illnesses than HRV-A and HRV-B; however, the limited number of HRV-C complete genomes (complete 5′ and 3′ non-coding region and open reading frame sequences) has hindered the in-depth genetic study of this virus. This study aimed to sequence seven complete HRV-C genomes from Malaysia and compare their genetic characteristics with the 18 published HRV-Cs. Seven Malaysian HRV-C complete genomes were obtained with newly redesigned primers. The seven genomes were classified as HRV-C6, C12, C22, C23, C26, C42, and pat16 based on the VP4/VP2 and VP1 pairwise distance threshold classification. Five of the seven Malaysian isolates, namely, 3430-MY-10/C22, 8713-MY-10/C23, 8097-MY-11/C26, 1570-MY-10/C42, and 7383-MY-10/pat16, are the first complete HRV-C genomes sequenced for these types. The genomes of all seven Malaysian isolates displayed nucleotide similarity of 63–81% among themselves and 63–96% with other HRV-Cs. Malaysian HRV-Cs had similar putative immunogenic sites, putative receptor utilization and potential antiviral sites as other HRV-Cs. The genomic features of Malaysian isolates were similar to those of other HRV-Cs. Negative selection was frequently detected in HRV-C complete coding sequences, indicating that these sequences were under functional constraint. The present study showed that HRV-Cs from Malaysia have diverse genetic sequences but share conserved genomic features with other HRV-Cs. This genetic information could provide further aid in the understanding of HRV-C infection. PMID:27199901
Complete mitochondrial genome sequences from five Eimeria species (Apicomplexa; Coccidia; Eimeriidae) infecting domestic turkeys

PubMed Central

2014-01-01

Background: Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Methods: Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Results: Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp. The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common).
No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Conclusions: Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology. PMID:25034633
Complete mitochondrial genome sequences from five Eimeria species (Apicomplexa; Coccidia; Eimeriidae) infecting domestic turkeys.

PubMed

Ogedengbe, Mosun E; El-Sherry, Shiem; Whale, Julia; Barta, John R

2014-07-17

Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp. The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common).
No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology.
GapBlaster-A Graphical Gap Filler for Prokaryote Genomes.

PubMed

de Sá, Pablo H C G; Miranda, Fábio; Veras, Adonney; de Melo, Diego Magalhães; Soares, Siomar; Pinheiro, Kenny; Guimarães, Luis; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

2016-01-01

The advent of NGS (Next Generation Sequencing) technologies has resulted in an exponential increase in the number of complete genomes available in biological databases. This advance has allowed the development of several computational tools enabling analyses of large amounts of data in each of the various steps, from processing and quality filtering to gap filling and manual curation. The tools developed for gap closure are very useful as they result in more complete genomes, which will influence downstream analyses of genomic plasticity and comparative genomics. However, the gap filling step remains a challenge for genome assembly, often requiring manual intervention. Here, we present GapBlaster, a graphical application to evaluate and close gaps. GapBlaster was developed in the Java programming language. The software uses contigs obtained in the assembly of the genome to perform an alignment against a draft of the genome/scaffold, using BLAST or Mummer to close gaps. Then, all identified alignments of contigs that extend through the gaps in the draft sequence are presented to the user for further evaluation via the GapBlaster graphical interface. GapBlaster presents significant results compared to other similar software and has the advantage of offering a graphical interface for manual curation of the gaps. The GapBlaster program, the user guide and the test datasets are freely available at https://sourceforge.net/projects/gapblaster2015/. It requires Sun JDK 8 and Blast or Mummer.
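Gap closure starts from knowing where the gaps are. In draft scaffolds, gaps are conventionally written as runs of N. The Python sketch below locates such runs and summarizes them. It is a minimal illustration of that first step only, with hypothetical function names, and does not reproduce GapBlaster's alignment-based procedure.

```python
import re


def find_gaps(scaffold, min_len=1):
    """Return 0-based, half-open (start, end) coordinates of runs of N in a scaffold."""
    return [(m.start(), m.end())
            for m in re.finditer(r"[Nn]+", scaffold)
            if m.end() - m.start() >= min_len]


def gap_summary(scaffold):
    """Count the gaps and the total number of gap bases in a scaffold."""
    gaps = find_gaps(scaffold)
    return {"gaps": len(gaps), "gap_bases": sum(end - start for start, end in gaps)}


# Toy scaffold with two gaps (invented for illustration).
toy_scaffold = "ACGTNNNNACGTNNAC"
print(find_gaps(toy_scaffold))
print(gap_summary(toy_scaffold))
```

A contig whose alignment spans the flanks on both sides of a reported interval is a candidate for closing that gap.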
Complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus Weber, 1933 (Gasterosteiformes:Syngnathidae).

PubMed

Liu, Shuaishuai; Zhang, Yanhong; Wang, Changming; Lin, Qiang

2016-07-01

The complete mitochondrial genome sequence of the hedgehog seahorse Hippocampus spinosissimus was determined for the first time in this study. The total length of the H. spinosissimus mitogenome is 16,527 bp, and it consists of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 1 control region. The gene order and composition of the H. spinosissimus mitogenome were similar to those of most other vertebrates. The overall base composition of the H. spinosissimus mitogenome is 32.1% A, 30.3% T, 14.9% G and 22.7% C, with a slight A + T-rich feature (62.4%). Phylogenetic analyses based on complete mitochondrial genome sequences showed that H. spinosissimus has a close genetic relationship to H. ingens and H. kuda.
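The A + T figure above is simply the sum of the A and T percentages (32.1% + 30.3% = 62.4%). The Python sketch below shows how base composition and A + T content can be computed from a sequence. The function names are hypothetical and the toy sequence is invented; ambiguous symbols such as N are ignored by assumption.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, G and C in a sequence, ignoring other symbols."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ATGC"}


def at_content(seq):
    """A + T percentage, the sum of the A and T entries of the base composition."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Toy sequence (invented for illustration).
toy_seq = "ATGCATTA"
print(base_composition(toy_seq))
print(at_content(toy_seq))
```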
The complete and fully assembled genome sequence of Aeromonas salmonicida subsp. pectinolytica and its comparative analysis with other Aeromonas species: investigation of the mobilome in environmental and pathogenic strains.

PubMed

Pfeiffer, Friedhelm; Zamora-Lagos, Maria-Antonia; Blettinger, Martin; Yeroslaviz, Assa; Dahl, Andreas; Gruber, Stephan; Habermann, Bianca H

2018-01-05

Due to the predominant usage of short-read sequencing to date, most bacterial genome sequences reported in the last years remain at the draft level. This precludes certain types of analyses, such as the in-depth analysis of genome plasticity. Here we report the finalized genome sequence of the environmental strain Aeromonas salmonicida subsp. pectinolytica 34mel, for which only a draft genome with 253 contigs is currently available. Successful completion of the transposon-rich genome critically depended on the PacBio long read sequencing technology. Using finalized genome sequences of A. salmonicida subsp. pectinolytica and other Aeromonads, we report the detailed analysis of the transposon composition of these bacterial species. Mobilome evolution is exemplified by a complex transposon, which has shifted from pathogenicity-related to environmental-related gene content in A. salmonicida subsp. pectinolytica 34mel. Obtaining the complete, circular genome of A. salmonicida subsp. pectinolytica allowed us to perform an in-depth analysis of its mobilome. We demonstrate the mobilome-dependent evolution of this strain's genetic profile from pathogenic to environmental.
Complete mitochondrial genomes of two flat-backed millipedes by next-generation sequencing (Diplopoda, Polydesmida)

PubMed Central

Dong, Yan; Zhu, Lixin; Bai, Yu; Ou, Yongyue; Wang, Changbao

2016-01-01

A lack of mitochondrial genome data from myriapods is hampering progress across genetic, systematic, phylogenetic and evolutionary studies. Here, the complete mitochondrial genomes of two millipedes, Asiomorpha coarctata Saussure, 1860 (Diplopoda: Polydesmida: Paradoxosomatidae) and Xystodesmus sp. (Diplopoda: Polydesmida: Xystodesmidae) were assembled with high coverage using Illumina sequencing data. The mitochondrial genomes of the two newly sequenced species are circular molecules of 15,644 bp and 15,791 bp, within which the typical mitochondrial genome complement of 13 protein-coding genes, 22 tRNAs and two ribosomal RNA genes could be identified. The mitochondrial genome of Asiomorpha coarctata is the first complete sequence in the family Paradoxosomatidae (Diplopoda: Polydesmida) and the gene order of the two flat-backed millipedes is novel among known myriapod mitochondrial genomes. Unique translocations have occurred, including inversion of one half of the two genomes with respect to other millipede genomes. Inversion of the entire side of a genome (trnF-nad5-trnH-nad4-nad4L, trnP, nad1-trnL2-trnL1-rrnL-trnV-rrnS, trnQ, trnC and trnY) could constitute a common event in the order Polydesmida. Last, our phylogenetic analyses recovered the monophyletic Progoneata, subphylum Myriapoda and four internal classes. PMID:28138271
The complete genome sequence of a south Indian isolate of Rice tungro spherical virus reveals evidence of genetic recombination between distinct isolates.

PubMed

Sailaja, B; Anjum, Najreen; Patil, Yogesh K; Agarwal, Surekha; Malathi, P; Krishnaveni, D; Balachandran, S M; Viraktamath, B C; Mangrauthia, Satendra K

2013-12-01

In this study, the complete genome of a south Indian isolate of Rice tungro spherical virus (RTSV) from Andhra Pradesh (AP) was sequenced, and the predicted amino acid sequence was analysed. The RTSV RNA genome consists of 12,171 nt without the poly(A) tail, encoding a putative typical polyprotein of 3,470 amino acids. Furthermore, cleavage sites and sequence motifs of the polyprotein were predicted. Multiple alignment with other RTSV isolates showed a nucleotide sequence identity of 95% to east Indian isolates and 90% to Philippines isolates. A phylogenetic tree based on the complete genome sequence showed that Indian isolates clustered together, while the Vt6 and PhilA isolates of the Philippines formed two separate clusters. Twelve recombination events were detected in the RNA genome of RTSV using the Recombination Detection Program version 3. Recombination analysis suggested a significant role of the 5' end and central region of the genome in virus evolution. Further, the AP and Odisha isolates appeared as important RTSV isolates involved in the diversification of this virus in India through recombination. The addition of the complete genome of the first south Indian isolate provided an opportunity to establish the molecular evolution of RTSV through recombination analysis and phylogenetic relationships.
Complete mitochondrial DNA sequences of the Victoria tilapia (Oreochromis variabilis) and Redbelly Tilapia (Tilapia zilli): genome characterization and phylogeny analysis.

PubMed

Kinaro, Zachary Omambia; Xue, Liangyi; Volatiana, Josies Ancella

2016-07-01

Cichlid fishes have played an important role in evolutionary biology, population studies and the aquaculture industry. East African species represent a model suited for studying adaptive radiation and speciation, and cichlid genome projects are rapidly producing closely related genomes that raise questions about phenotype-genotype relations. The complete mitochondrial genomes presented here are for two closely related but eco-morphologically distinct Lake Victoria basin cichlids, Oreochromis variabilis, an endangered native species, and Tilapia zilli, an invasive species, both of which are important economic fishes in local areas. The complete mitochondrial genomes determined for O. variabilis and T. zilli are 16,626 and 16,619 bp, respectively. Both mitogenomes contain 13 protein-coding genes, 22 tRNAs, 2 rRNAs and a non-coding control region, which are typical of vertebrate mitogenomes. Phylogenetic analyses of the two species revealed that though both lie within family Cichlidae, they are remotely related.
CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

PubMed Central

2012-01-01

Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tools. The edited annotations can then be uploaded to CPGAVAS for update and re-analysis repeatedly. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920
Highly effective sequencing whole chloroplast genomes of angiosperms by nine novel universal primer pairs.

PubMed

Yang, Jun-Bo; Li, De-Zhu; Li, Hong-Tao

2014-09-01

Chloroplast genomes supply indispensable information that helps improve phylogenetic resolution and can even serve as organelle-scale barcodes. Next-generation sequencing technologies have helped promote sequencing of complete chloroplast genomes, but compared with the number of angiosperms, relatively few chloroplast genomes have been sequenced. There are two major reasons for the paucity of completely sequenced chloroplast genomes: (i) massive amounts of fresh leaves are needed for chloroplast sequencing and (ii) there are considerable gaps in the sequenced chloroplast genomes of many plants because of the difficulty of isolating high-quality chloroplast DNA, preventing complete chloroplast genomes from being assembled. To overcome these obstacles, all known angiosperm chloroplast genomes available to date were analysed, and then we designed nine universal primer pairs corresponding to the highly conserved regions. Using these primers, angiosperm whole chloroplast genomes can be amplified using long-range PCR and sequenced using next-generation sequencing methods. The primers showed high universality, which was tested using 24 species representing major clades of angiosperms. To validate the functionality of the primers, eight species representing major groups of angiosperms, that is, early-diverging angiosperms, magnoliids, monocots, Saxifragales, fabids, malvids and asterids, were sequenced and their complete chloroplast genomes were assembled. In our trials, only 100 mg of fresh leaves was used. The results show that the universal primer set provided an easy, effective and feasible approach for sequencing whole chloroplast genomes in angiosperms. The designed universal primer pairs provide a possibility to accelerate genome-scale data acquisition and will therefore improve phylogenetic resolution and species identification in angiosperms. © 2014 John Wiley & Sons Ltd.
Whole genome sequencing data and de novo draft assemblies for 66 teleost species

PubMed Central

Malmstrøm, Martin; Matschiner, Michael; Tørresen, Ole K.; Jakobsen, Kjetill S.; Jentoft, Sissel

2017-01-01

Teleost fishes comprise more than half of all vertebrate species, yet genomic data are only available for 0.2% of their diversity. Here, we present whole genome sequencing data for 66 new species of teleosts, vastly expanding the availability of genomic data for this important vertebrate group. We report on de novo assemblies based on low-coverage (9–39×) sequencing and present detailed methodology for all analyses. To facilitate further utilization of this data set, we present statistical analyses of the gene space completeness and verify the expected phylogenetic position of the sequenced genomes in a large mitogenomic context. We further present a nuclear marker set used for phylogenetic inference and evaluate each gene tree in relation to the species tree to test for homogeneity in the phylogenetic signal. Collectively, these analyses illustrate the robustness of this highly diverse data set and enable extensive reuse of the selected phylogenetic markers and the genomic data in general. This data set covers all major teleost lineages and provides unprecedented opportunities for comparative studies of teleosts. PMID:28094797
Complete genome sequence of Menghai rhabdovirus, a novel mosquito-borne rhabdovirus from China.

PubMed

Sun, Qiang; Zhao, Qiumin; An, Xiaoping; Guo, Xiaofang; Zuo, Shuqing; Zhang, Xianglilan; Pei, Guangqian; Liu, Wenli; Cheng, Shi; Wang, Yunfei; Shu, Peng; Mi, Zhiqiang; Huang, Yong; Zhang, Zhiyi; Tong, Yigang; Zhou, Hongning; Zhang, Jiusong

2017-04-01

Menghai rhabdovirus (MRV) was isolated from Aedes albopictus in Menghai county of Yunnan Province, China, in August 2010. Whole-genome sequencing of MRV was performed using an Ion PGM™ Sequencer. We found that MRV is a single-stranded, negative-sense RNA virus. The complete genome of MRV has 10,744 nt, with short inverted repeat termini, encoding five typical rhabdovirus proteins (N, P, M, G, and L) and an additional small hypothetical protein. Nucleotide BLAST analysis using the BLASTn method showed that the genome sequence most similar to that of MRV is that of Arboretum virus (NC_025393.1), with a Max score of 322, query coverage of 14%, and 66% identity. Genomic and phylogenetic analyses both demonstrated that MRV should be considered a member of a novel species of the family Rhabdoviridae.
One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly.

PubMed

Koren, Sergey; Phillippy, Adam M

2015-02-01

Like a jigsaw puzzle with large pieces, a genome sequenced with long reads is easier to assemble. However, recent sequencing technologies have favored lowering per-base cost at the expense of read length. This has dramatically reduced sequencing cost, but resulted in fragmented assemblies, which negatively affect downstream analyses and hinder the creation of finished (gapless, high-quality) genomes. In contrast, emerging long-read sequencing technologies can now produce reads tens of kilobases in length, enabling the automated finishing of microbial genomes for under $1000. This promises to improve the quality of reference databases and facilitate new studies of chromosomal structure and variation. We present an overview of these new technologies and the methods used to assemble long reads into complete genomes. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
Sputnik: a database platform for comparative plant genomics.

PubMed

Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F X

2003-01-01

Two million plant ESTs, from 20 different plant species, and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics.
Sputnik: a database platform for comparative plant genomics

PubMed Central

Rudd, Stephen; Mewes, Hans-Werner; Mayer, Klaus F.X.

2003-01-01

Two million plant ESTs, from 20 different plant species, and totalling more than 1000 Mbp of DNA sequence, represent a formidable transcriptomic resource. Sputnik uses the potential of this sequence resource to fill some of the information gap in the un-sequenced plant genomes and to serve as the foundation for in silico comparative plant genomics. The complexity of the individual EST collections has been reduced using optimised EST clustering techniques. Annotation of cluster sequences is performed by exploiting and transferring information from the comprehensive knowledgebase already produced for the completed model plant genome (Arabidopsis thaliana) and by performing additional state-of-the-art sequence analyses relevant to today's plant biologist. Functional predictions, comparative analyses and associative annotations for 500 000 plant EST derived peptides make Sputnik (http://mips.gsf.de/proj/sputnik/) a valid platform for contemporary plant genomics. PMID:12519965
Genome-reconstruction for eukaryotes from complex natural microbial communities.

PubMed

West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F

2018-04-01

Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities. © 2018 West et al.; Published by Cold Spring Harbor Laboratory Press.
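As an illustration of what a k-mer-based feature can look like, the Python sketch below turns a contig into a vector of relative k-mer frequencies. It is a minimal sketch under stated assumptions, not EukRep's implementation: the function name kmer_frequencies, the default k = 5, and the handling of ambiguous bases are hypothetical choices, and the classification step that would separate eukaryotic from prokaryotic fragments is not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 5) -> dict[str, float]:
    """Return the relative frequency of every possible DNA k-mer in seq.

    Windows containing anything other than A, C, G or T are skipped
    (an assumption made for this sketch).
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Emit all 4**k k-mers so that every contig yields a vector of equal length.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


# Toy contig with one ambiguous base; k=2 keeps the output small.
freqs = kmer_frequencies("ACGTACGTNACGT", k=2)
```

Fixed-length vectors of this kind could then be passed to any standard classifier trained on sequences of known origin.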
Complete Genome Sequence of Biofilm-Forming Strain Staphylococcus haemolyticus S167.

PubMed

Hong, Jisoo; Kim, Jonguk; Kim, Byung-Yong; Park, Jin-Woo; Ryu, Jae-Gee; Roh, Eunjung

2016-06-16

Staphylococcus haemolyticus S167 has the ability to produce biofilms in large quantities. Genomic analyses revealed information on the biofilm-related genes of S. haemolyticus S167. Detailed studies of biofilm formation at the molecular level could provide a foundation for biofilm control research. Copyright © 2016 Hong et al.
New Hepatitis E Virus Genotype in Camels, the Middle East

PubMed Central

Lau, Susanna K.P.; Teng, Jade L.L.; Tsang, Alan K. L.; Joseph, Marina; Wong, Emily Y.M.; Tang, Ying; Sivakumar, Saritha; Xie, Jun; Bai, Ru; Wernery, Renate; Wernery, Ulrich; Yuen, Kwok-Yung

2014-01-01

In a molecular epidemiology study of hepatitis E virus (HEV) in dromedaries in Dubai, United Arab Emirates, HEV was detected in fecal samples from 3 camels. Complete genome sequencing of 2 strains showed >20% overall nucleotide difference to known HEVs. Comparative genomic and phylogenetic analyses revealed a previously unrecognized HEV genotype. PMID:24856611
MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

PubMed

Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

2008-11-27

The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
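The backbone/variable distinction can be sketched in a few lines of Python. This is a hedged illustration, not MOSAIC's algorithm: it assumes the backbone segments are already known as half-open (start, end) intervals on one genome, treats the genome as linear for simplicity, and uses a hypothetical function name and invented toy coordinates.

```python
def variable_segments(genome_length, backbone):
    """Return the regions of a genome not covered by backbone intervals.

    backbone: iterable of (start, end) pairs, 0-based, end exclusive.
    The genome is treated as linear (a simplification for this sketch).
    """
    segments = []
    cursor = 0
    for start, end in sorted(backbone):
        if start > cursor:
            # Gap between the previous backbone segment and this one.
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < genome_length:
        # Trailing region after the last backbone segment.
        segments.append((cursor, genome_length))
    return segments


# Invented coordinates, for illustration only.
backbone = [(0, 1200), (1500, 4000), (4200, 5000)]
variable = variable_segments(5200, backbone)
```

Because the segments are defined by nucleotide coordinates, both coding and non-coding sequence falls into one class or the other.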
The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

PubMed

Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

2016-10-11

Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced, a stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.
Evolution and phylogeny of the mud shrimps (Crustacea: Decapoda) revealed from complete mitochondrial genomes.

PubMed

Lin, Feng-Jiau; Liu, Yuan; Sha, Zhongli; Tsang, Ling Ming; Chu, Ka Hou; Chan, Tin-Yam; Liu, Ruiyu; Cui, Zhaoxia

2012-11-16

The evolutionary history and relationships of the mud shrimps (Crustacea: Decapoda: Gebiidea and Axiidea) are contentious, with previous attempts revealing mixed results. The mud shrimps were once classified in the infraorder Thalassinidea. Recent molecular phylogenetic analyses, however, suggest separation of the group into two individual infraorders, Gebiidea and Axiidea. Mitochondrial (mt) genome sequence and structure can be especially powerful in resolving higher systematic relationships that may offer new insights into the phylogeny of the mud shrimps and the other decapod infraorders, and test the hypothesis of dividing the mud shrimps into two infraorders. We present the complete mitochondrial genome sequences of five mud shrimps, Austinogebia edulis, Upogebia major, Thalassina kelanang (Gebiidea), Nihonotrypaea thermophilus and Neaxius glyptocercus (Axiidea). All five genomes encode a standard set of 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and a putative control region. Except for T. kelanang, mud shrimp mitochondrial genomes exhibited rearrangements and novel patterns compared to the pancrustacean ground pattern. Each of the two Gebiidea species (A. edulis and U. major) and two Axiidea species (N. glyptocercus and N. thermophilus) share a unique gene order specific to their infraorders and analyses further suggest these two derived gene orders have evolved independently. Phylogenetic analyses based on the concatenated nucleotide and amino acid sequences of 13 protein-coding genes indicate the possible polyphyly of mud shrimps, supporting the division of the group into two infraorders. However, the infraordinal relationships among the Gebiidea and Axiidea, and other reptants are poorly resolved. The inclusion of mt genomes from more taxa, in particular the reptant infraorders Polychelida and Glypheidea, is required in further analysis.
Phylogenetic analyses on the mt genome sequences and the distinct gene orders provide further evidence for the divergence between the two mud shrimp infraorders, Gebiidea and Axiidea, corroborating previous molecular phylogeny and justifying their infraordinal status. Mitochondrial genome sequences appear to be promising markers for resolving phylogenetic issues concerning decapod crustaceans that warrant further investigations and our present study has also provided further information concerning the mt genome evolution of the Decapoda.
Evolution and phylogeny of the mud shrimps (Crustacea: Decapoda) revealed from complete mitochondrial genomes

PubMed Central

2012-01-01

Background The evolutionary history and relationships of the mud shrimps (Crustacea: Decapoda: Gebiidea and Axiidea) are contentious, with previous attempts revealing mixed results. The mud shrimps were once classified in the infraorder Thalassinidea. Recent molecular phylogenetic analyses, however, suggest separation of the group into two individual infraorders, Gebiidea and Axiidea. Mitochondrial (mt) genome sequence and structure can be especially powerful in resolving higher systematic relationships that may offer new insights into the phylogeny of the mud shrimps and the other decapod infraorders, and test the hypothesis of dividing the mud shrimps into two infraorders. Results We present the complete mitochondrial genome sequences of five mud shrimps, Austinogebia edulis, Upogebia major, Thalassina kelanang (Gebiidea), Nihonotrypaea thermophilus and Neaxius glyptocercus (Axiidea). All five genomes encode a standard set of 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes and a putative control region. Except for T. kelanang, mud shrimp mitochondrial genomes exhibited rearrangements and novel patterns compared to the pancrustacean ground pattern. Each of the two Gebiidea species (A. edulis and U. major) and two Axiidea species (N. glyptocercus and N. thermophilus) share a unique gene order specific to their infraorders and analyses further suggest these two derived gene orders have evolved independently. Phylogenetic analyses based on the concatenated nucleotide and amino acid sequences of 13 protein-coding genes indicate the possible polyphyly of mud shrimps, supporting the division of the group into two infraorders. However, the infraordinal relationships among the Gebiidea and Axiidea, and other reptants are poorly resolved. The inclusion of mt genomes from more taxa, in particular the reptant infraorders Polychelida and Glypheidea, is required in further analysis.
Conclusions Phylogenetic analyses on the mt genome sequences and the distinct gene orders provide further evidence for the divergence between the two mud shrimp infraorders, Gebiidea and Axiidea, corroborating previous molecular phylogeny and justifying their infraordinal status. Mitochondrial genome sequences appear to be promising markers for resolving phylogenetic issues concerning decapod crustaceans that warrant further investigations and our present study has also provided further information concerning the mt genome evolution of the Decapoda. PMID:23153176
Mitochondrial genomes of Meloidogyne chitwoodi and M. incognita (Nematoda: Tylenchina): comparative analysis, gene order and phylogenetic relationships with other nematodes.

PubMed

Humphreys-Pereira, Danny A; Elling, Axel A

2014-01-01

Root-knot nematodes (Meloidogyne spp.) are among the most important plant pathogens. In this study, the mitochondrial (mt) genomes of the root-knot nematodes, M. chitwoodi and M. incognita were sequenced. PCR analyses suggest that both mt genomes are circular, with an estimated size of 19.7 and 18.6-19.1 kb, respectively. The mt genomes each contain a large non-coding region with tandem repeats and the control region. The mt gene arrangement of M. chitwoodi and M. incognita is unlike that of other nematodes. Sequence alignments of the two Meloidogyne mt genomes showed three translocations; two in transfer RNAs and one in cox2. Compared with other nematode mt genomes, the gene arrangement of M. chitwoodi and M. incognita was most similar to Pratylenchus vulnus. Phylogenetic analyses (Maximum Likelihood and Bayesian inference) were conducted using 78 complete mt genomes of diverse nematode species. Analyses based on nucleotides and amino acids of the 12 protein-coding mt genes showed strong support for the monophyly of class Chromadorea, but only amino acid-based analyses supported the monophyly of class Enoplea. The suborder Spirurina was not monophyletic in any of the phylogenetic analyses, contradicting the Clade III model, which groups Ascaridomorpha, Spiruromorpha and Oxyuridomorpha based on the small subunit ribosomal RNA gene. Importantly, comparisons of mt gene arrangement and tree-based methods placed Meloidogyne as sister taxa of Pratylenchus, a migratory plant endoparasitic nematode, and not with the sedentary endoparasitic Heterodera. Thus, comparative analyses of mt genomes suggest that sedentary endoparasitism in Meloidogyne and Heterodera is based on convergent evolution. Copyright © 2014 Elsevier B.V. All rights reserved.
Characterization of the complete mitochondrial genomes of two whipworms Trichuris ovis and Trichuris discolor (Nematoda: Trichuridae).

PubMed

Liu, Guo-Hua; Wang, Yan; Xu, Min-Jun; Zhou, Dong-Hui; Ye, Yong-Gang; Li, Jia-Yuan; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan

2012-12-01

For many years, whipworms (Trichuris spp.) have been described with a relatively narrow range of both morphological and biometrical features. Moreover, there has been insufficient discrimination between congeners (or closely related species). In the present study, we determined the complete mitochondrial (mt) genomes of two whipworms Trichuris ovis and Trichuris discolor, compared them and then tested the hypothesis that T. ovis and T. discolor are distinct species by phylogenetic analyses (using Bayesian inference, maximum likelihood and maximum parsimony) based on the deduced amino acid sequences of the mt protein-coding genes. The complete mt genomes of T. ovis and T. discolor were 13,946 bp and 13,904 bp in size, respectively. Both mt genomes are circular, and consist of 37 genes, including 13 genes coding for proteins, 2 genes for rRNA, and 22 genes for tRNA. The gene content and arrangement are identical to that of human and pig whipworms Trichuris trichiura and Trichuris suis. Taken together, these analyses showed genetic distinctiveness and strongly supported the recent proposal that T. ovis and T. discolor are distinct species using nuclear ribosomal DNA and a portion of the mtDNA sequence dataset. The availability of the complete mtDNA sequences of T. ovis and T. discolor provides novel genetic markers for studying the population genetics, diagnostics and molecular epidemiology of T. ovis and T. discolor. Copyright © 2012 Elsevier B.V. All rights reserved.
Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

PubMed Central

Paterson, Andrew H.; Wang, Xuelin; Xu, Yiqing; Wu, Dongyang; Qu, Yanshu; Jiang, Anna; Ye, Qiaolin

2016-01-01

Cotton is one of the most important economic crops, the primary source of natural fiber, and an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome 676,078 bp in length and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants. PMID:27847816
The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion.

PubMed

Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe

2016-02-15

Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. Genetic and molecular data on it are scarce. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991 bp in length, including a large single copy (LSC) region of 81,240 bp, a small single copy (SSC) region of 17,085 bp and a pair of inverted repeats (IRs) of 25,333 bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in nonparasitic plants of the Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
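The region sizes reported above can be checked with simple arithmetic, since a quadripartite chloroplast genome consists of one LSC, one SSC and two copies of the IR. The short Python sketch below does that check for the G. straminea figures quoted in the abstract; the function name quadripartite_length is a hypothetical helper introduced only for illustration.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a chloroplast genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Figures quoted in the G. straminea abstract (all in bp).
total = quadripartite_length(lsc=81_240, ssc=17_085, ir=25_333)
matches_reported_length = total == 148_991

# Gene categories quoted in the abstract: protein-coding, tRNA, rRNA.
gene_total = 78 + 30 + 4
matches_reported_genes = gene_total == 112
```

Both checks agree with the abstract: the regions sum to 148,991 bp and the gene categories sum to 112.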
Integrated analyses using RNA-Seq data reveal viral genomes, single nucleotide variations, the phylogenetic relationship, and recombination for Apple stem grooving virus.

PubMed

Jo, Yeonhwa; Choi, Hoseong; Kim, Sang-Min; Kim, Sun-Lim; Lee, Bong Choon; Cho, Won Kyong

2016-08-09

Next-generation sequencing (NGS) provides many possibilities for plant virology research. In this study, we performed integrated analyses using plant transcriptome data for plant virus identification using Apple stem grooving virus (ASGV) as an exemplar virus. We used 15 publicly available transcriptome libraries from three different studies, two mRNA-Seq studies and a small RNA-Seq study. We de novo assembled nearly complete genomes of ASGV isolates Fuji and Cuiguan from apple and pear transcriptomes, respectively, and identified single nucleotide variations (SNVs) of ASGV within the transcriptomes. We demonstrated the application of NGS raw data to confirm viral infections in the plant transcriptomes. In addition, we compared the usability of two de novo assemblers, Trinity and Velvet, for virus identification and genome assembly. A phylogenetic tree revealed that ASGV and Citrus tatter leaf virus (CTLV) are the same virus, which was divided into two clades. Recombination analyses identified six recombination events from 21 viral genomes. Taken together, our in silico analyses using NGS data provide a successful application of plant transcriptomes to reveal extensive information associated with viral genome assembly, SNVs, phylogenetic relationships, and genetic recombination.
Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Raubeson, Linda A.; Peery, Rhiannon; Chumley, Timothy W.

2007-03-01

The number of completely sequenced plastid genomes available is growing rapidly. This new array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is most useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the new genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (from the basal group of eudicots). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.

The Complete Mitochondrial Genome of Galba pervia (Gastropoda: Mollusca), an Intermediate Host Snail of Fasciola spp

PubMed Central

Huang, Wei-Yi; Zhao, Guang-Hui; Wei, Shu-Jun; Song, Hui-Qun; Xu, Min-Jun; Lin, Rui-Qing; Zhou, Dong-Hui; Zhu, Xing-Quan

2012-01-01

Complete mitochondrial (mt) genomes and the gene rearrangements are increasingly used as molecular markers for investigating phylogenetic relationships. Contributing to the complete mt genomes of Gastropoda, especially Pulmonata, we determined the mt genome of the freshwater snail Galba pervia, which is an important intermediate host for Fasciola spp. in China. The complete mt genome of G. pervia is 13,768 bp in length. Its genome is circular, and consists of 37 genes, including 13 genes for proteins, 2 genes for rRNA, and 22 genes for tRNA. The mt gene order of G. pervia showed a novel arrangement (tRNA-His, tRNA-Gly and tRNA-Tyr change positions and directions) when compared with mt genomes of Pulmonata species sequenced to date, indicating divergence among different species within the Pulmonata. The 13 protein-coding genes were deduced to encode a total of 3655 amino acids. The most frequently used amino acid is Leu (15.05%), followed by Phe (11.24%), Ser (10.76%) and Ile (8.346%). Phylogenetic analyses using the concatenated amino acid sequences of the 13 protein-coding genes, with three different computational algorithms (maximum parsimony, maximum likelihood and Bayesian analysis), all revealed that the families Lymnaeidae and Planorbidae are two closely related snail families, consistent with previous classifications based on morphological and molecular studies. The complete mt genome sequence of G. pervia showed a novel gene arrangement and it represents the first sequenced high quality mt genome of the family Lymnaeidae. These novel mtDNA data provide additional genetic markers for studying the epidemiology, population genetics and phylogeographics of freshwater snails, as well as for understanding interplay between the intermediate snail hosts and the intra-mollusca stages of Fasciola spp. PMID:22844544
Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species.

PubMed

Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

2008-06-23

The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis, have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperm and the evolution of the cp genome.
Cucumis melo endornavirus: Genome organization, host range and codivergence with the host

USDA-ARS's Scientific Manuscript database

A high molecular weight dsRNA was isolated from a Cucumis melo plant (referred to as “CL01”) of an unknown cultivar and completely sequenced. Sequence analyses showed similarities with members of the Endornaviridae. The name Cucumis melo endornavirus (CmEV) is proposed. The genome of CmEV-CL01 consis...
The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.

PubMed

Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo

2018-02-01

The plant chloroplast (cp) genome is a highly conserved structure, which makes it useful for evolutionary and systematic research. Numerous complete cp genome sequences have now been reported thanks to high-throughput sequencing technology. However, no complete chloroplast genome has previously been reported for the genus Dodonaea. To better understand the molecular basis of the Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) region of 87,204 bp and a small single copy (SSC) region of 17,972 bp. The annotation analysis revealed a total of 115 unique genes, of which 81 were protein-coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The cp genome reported here provides a valuable genetic resource for further comprehensive studies of genetic variation, taxonomy and phylogenetic evolution in the Sapindaceae family. In addition, the SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.
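The region lengths quoted in this abstract can be cross-checked with simple arithmetic, since a quadripartite chloroplast genome consists of one LSC, one SSC and two IR copies. The following is a minimal sketch using only the numbers given above; the function name is hypothetical and is not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a cp genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Values quoted in the D. viscosa abstract (bp).
total = quadripartite_length(lsc=87_204, ssc=17_972, ir=27_099)
reported = 159_375

# Print the summed regions and the difference from the reported total.
print(total, reported - total)
```

With these inputs the summed regions come to 159,374 bp, 1 bp short of the reported total. The abstract does not explain the difference.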
Characterization and complete genome sequence of a panicovirus from Bermuda grass by high-throughput sequencing.

PubMed

Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre

2017-04-01

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.
Complete mitochondrial genome of the Tyto longimembris (Strigiformes: Tytonidae).

PubMed

Xu, Peng; Li, Yankuo; Miao, Lujun; Xie, Guangyong; Huang, Yan

2016-07-01

The complete mitochondrial genome of Tyto longimembris has been determined in this study. It is 18,466 bp in length and consists of 13 protein-coding genes, 22 transfer RNA (tRNA) genes, 2 ribosomal RNA (rRNA) genes and a non-coding control region (D-loop). The overall base composition of the heavy strand of the T. longimembris mitochondrial genome is A: 30.1%, T: 23.5%, C: 31.8% and G: 14.6%. The control region is characterized by a region containing tandem repeats, as two clearly separated clusters of tandem repeats were found. This study provides an important data set for phylogenetic and taxonomic analyses of Tyto species.
Characterization of the complete mitochondrial genome of Marshallagia marshalli and phylogenetic implications for the superfamily Trichostrongyloidea.

PubMed

Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua

2018-01-01

Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophylies of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.
Language continuity despite population replacement in Remote Oceania.

PubMed

Posth, Cosimo; Nägele, Kathrin; Colleran, Heidi; Valentin, Frédérique; Bedford, Stuart; Kami, Kaitip W; Shing, Richard; Buckley, Hallie; Kinaston, Rebecca; Walworth, Mary; Clark, Geoffrey R; Reepmeyer, Christian; Flexner, James; Maric, Tamara; Moser, Johannes; Gresky, Julia; Kiko, Lawrence; Robson, Kathryn J; Auckland, Kathryn; Oppenheimer, Stephen J; Hill, Adrian V S; Mentzer, Alexander J; Zech, Jana; Petchey, Fiona; Roberts, Patrick; Jeong, Choongwon; Gray, Russell D; Krause, Johannes; Powell, Adam

2018-04-01

Recent genomic analyses show that the earliest peoples reaching Remote Oceania, associated with Austronesian-speaking Lapita culture, were almost completely East Asian, without detectable Papuan ancestry. However, Papuan-related genetic ancestry is found across present-day Pacific populations, indicating that peoples from Near Oceania have played a significant, but largely unknown, ancestral role. Here, new genome-wide data from 19 ancient South Pacific individuals provide direct evidence of a so-far undescribed Papuan expansion into Remote Oceania starting ~2,500 yr BP, far earlier than previously estimated and supporting a model from historical linguistics. New genome-wide data from 27 contemporary ni-Vanuatu demonstrate a subsequent and almost complete replacement of Lapita-Austronesian by Near Oceanian ancestry. Despite this massive demographic change, incoming Papuan languages did not replace Austronesian languages. Population replacement with language continuity is extremely rare, if not unprecedented, in human history. Our analyses show that rather than one large-scale event, the process was incremental and complex, with repeated migrations and sex-biased admixture with peoples from the Bismarck Archipelago.
Complete mitochondrial genome of Korean yellow-throated marten, Martes flavigula (Carnivora, Mustelidae).

PubMed

Jang, Kuem Hee; Hwang, Ui Wook

2016-05-01

The complete mitogenome sequence of Martes flavigula, which is an endangered and endemic species in South Korea, was determined. The genome is 16,533 bp in length, and its gene arrangement pattern, gene content, and gene organization are identical to those of other martens. The control region is located between the tRNAPro and tRNAPhe genes and is 1087 bp in length. This mitogenome sequence data might play an important role in the preservation of genetic resources by allowing researchers to conduct phylogenetic and systematic analyses of Mustelidae.
Complete chloroplast genome sequence and comparative analysis of loblolly pine (Pinus taeda L.) with related species

PubMed Central

Khan, Abdul Latif; Khan, Muhammad Aaqil; Shahzad, Raheem; Lubna; Kang, Sang Mo; Al-Harrasi, Ahmed; Al-Rawahi, Ahmed; Lee, In-Jung

2018-01-01

Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, distinguished by a small single copy (42,258 bp) and large single copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provided valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species. PMID:29596414
Sublinear growth of information in DNA sequences.

PubMed

Menconi, Giulia

2005-07-01

We introduce a novel method to analyse complete genomes and recognise some distinctive features by means of an adaptive compression algorithm, which is not DNA-oriented, based on the Lempel-Ziv scheme. We study the Information Content as a function of the number of symbols encoded by the algorithm and we analyse the dictionary created by the algorithm. Preliminary results are shown concerning regions showing a sublinear type of information growth, which is strictly connected to the presence of highly repetitive subregions that might be supposed to have a regulatory function within the genome.
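The abstract does not say which Lempel-Ziv variant was used, so the following is only an illustrative sketch of the general idea, using LZ78-style phrase counting as a stand-in rather than the author's algorithm. It tracks how the number of dictionary phrases grows with the number of symbols encoded; the function names and both test sequences are hypothetical.

```python
import random


def lz78_phrase_count(seq: str) -> int:
    """Number of phrases in an LZ78-style incremental parsing of seq."""
    seen = set()
    phrase = ""
    count = 0
    for symbol in seq:
        phrase += symbol
        if phrase not in seen:
            # A new phrase ends here: store it and start the next one.
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:
        # Trailing symbols that repeat an already-known phrase.
        count += 1
    return count


def growth_curve(seq: str, step: int):
    """(prefix length, phrase count) pairs for increasing prefixes of seq."""
    return [(n, lz78_phrase_count(seq[:n])) for n in range(step, len(seq) + 1, step)]


rng = random.Random(0)
random_like = "".join(rng.choice("ACGT") for _ in range(2000))
repetitive = "ACGT" * 500  # toy stand-in for a highly repetitive region

for (n, c_rand), (_, c_rep) in zip(growth_curve(random_like, 500),
                                   growth_curve(repetitive, 500)):
    print(n, c_rand, c_rep)
```

In this toy setting the phrase count of the repetitive sequence grows much more slowly than that of the random-like one, which is the kind of contrast the abstract associates with highly repetitive subregions.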
The First Complete Genome Sequence of the Class Fimbriimonadia in the Phylum Armatimonadetes

PubMed Central

Im, Wan-Taek; Wang, Sheng-Yue; Zhao, Guo-Ping; Zheng, Hua-Jun; Quan, Zhe-Xue

2014-01-01

In this study, we present the complete genome of Fimbriimonas ginsengisoli Gsoil 348T, which belongs to the class Fimbriimonadia of the phylum Armatimonadetes, formerly called candidate phylum OP10. The complete genome contains a single circular chromosome of 5.23 Mb including a 45.5 kb prophage. Of the 4,820 open reading frames (ORFs), 3,000 (62.2%) could be classified into Clusters of Orthologous Groups (COG) families. Because its rRNA genes are split, strain Gsoil 348T has no typical 16S-23S-5S ribosomal RNA operon. A GC skew inversion of the kind usually observed in archaea was found in this genome. The predicted gene functions suggest that the organism lacks the ability to synthesize histidine and that its TCA cycle is incomplete. Phylogenetic analyses based on ribosomal proteins indicated that strain Gsoil 348T represents a deeply branching lineage that is sufficiently divergent from other phyla but is also strongly associated with the superphylum Terrabacteria. PMID:24967843
The complete mitochondrial genome of the desert darkling beetle Asbolus verrucosus (Coleoptera, Tenebrionidae).

PubMed

Rider, Stanley Dean

2016-07-01

The complete mitochondrial genome of the desert darkling beetle Asbolus verrucosus (LeConte, 1851) was sequenced using paired-end technology to an average depth of 42,111× and assembled using De Bruijn graph-based methods. The genome is 15,828 bp in length and conforms to the basal arthropod mitochondrial gene composition with the same gene orders and orientations as other darkling beetle mitochondria. This arrangement includes a control region, 22 tRNA genes, 2 rRNA genes and 13 protein-coding genes. The main coding strand is probably replicated as the lagging strand (GC skew of -0.36 and AT skew of +0.19). Phylogenomics analyses are consistent with taxonomic classifications and indicate that Tenebrio molitor is the closest relative that has a completely sequenced mitochondrial genome available for analysis. This is the first fully assembled mitogenome sequence for a darkling beetle in the subfamily Pimeliinae and will be useful for population studies on members of this ecologically important group of beetles.
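The abstract reports GC and AT skew values without giving formulas. Assuming the conventional definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), a strand's skews can be computed as in this minimal sketch. The function name and the toy sequence are hypothetical and are not data from the paper.

```python
from collections import Counter


def strand_skews(seq: str):
    """Return (AT skew, GC skew) for one strand, using the conventional definitions."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    # Note: raises ZeroDivisionError if the sequence has no A/T or no G/C.
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_skew, gc_skew


# Toy sequence: A=4, T=2, G=1, C=3.
at_skew, gc_skew = strand_skews("AAAATTGCCC")
print(round(at_skew, 3), round(gc_skew, 3))
```

A positive AT skew together with a negative GC skew, as in the toy example, has the same signs as the values quoted in the abstract.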
Genetic and phylogenetic analysis of a novel parvovirus isolated from chickens in Guangxi, China.

PubMed

Feng, Bin; Xie, Zhixun; Deng, Xianwen; Xie, Liji; Xie, Zhiqin; Huang, Li; Fan, Qin; Luo, Sisi; Huang, Jiaoling; Zhang, Yanfang; Zeng, Tingting; Wang, Sheng; Wang, Leyi

2016-11-01

A previously unidentified chicken parvovirus (ChPV) strain, associated with runting-stunting syndrome (RSS), is now endemic among chickens in China. To explore the genetic diversity of ChPV strains, we determined the first complete genome sequence of a novel ChPV isolate (GX-CH-PV-7) identified in chickens in Guangxi, China, which showed moderate genome sequence similarity to reference strains. Analysis showed that the viral genome sequence is 86.4%-93.9% identical to those of other ChPVs. Genetic and phylogenetic analyses showed that this newly emergent GX-CH-PV-7 is closely related to Gallus gallus enteric parvovirus isolate ChPV 798 from the USA, indicating that they may share a common ancestor. The complete DNA sequence is 4612 bp long with an A+T content of 56.66%. We determined the first complete genome sequence of a previously unidentified ChPV strain to elucidate its origin and evolutionary status.
Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

PubMed

Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

2016-12-01

In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Targeted sequencing for high-resolution evolutionary analyses following genome duplication in salmonid fish: Proof of concept for key components of the insulin-like growth factor axis.

PubMed

Lappin, Fiona M; Shaw, Rebecca L; Macqueen, Daniel J

2016-12-01

High-throughput sequencing has revolutionised comparative and evolutionary genome biology. It has now become relatively commonplace to generate multiple genomes and/or transcriptomes to characterize the evolution of large taxonomic groups of interest. Nevertheless, such efforts may be unsuited to some research questions or remain beyond the scope of some research groups. Here we show that targeted high-throughput sequencing offers a viable alternative to study genome evolution across a vertebrate family of great scientific interest. Specifically, we exploited sequence capture and Illumina sequencing to characterize the evolution of key components from the insulin-like growth factor (IGF) signalling axis of salmonid fish at unprecedented phylogenetic resolution. The IGF axis represents a central governor of vertebrate growth and its core components were expanded by whole genome duplication in the salmonid ancestor ~95 Ma. Using RNA baits synthesised to genes encoding the complete family of IGF binding proteins (IGFBP) and an IGF hormone (IGF2), we captured, sequenced and assembled orthologous and paralogous exons from species representing all ten salmonid genera. This approach generated 299 novel sequences, most as complete or near-complete protein-coding sequences. Phylogenetic analyses confirmed congruent evolutionary histories for all nineteen recognized salmonid IGFBP family members and identified novel salmonid-specific IGF2 paralogues. Moreover, we reconstructed the evolution of duplicated IGF axis paralogues across a replete salmonid phylogeny, revealing complex historic selection regimes - both ancestral to salmonids and lineage-restricted - that frequently involved asymmetric paralogue divergence under positive and/or relaxed purifying selection. Our findings add to an emerging literature highlighting diverse applications for targeted sequencing in comparative-evolutionary genomics. 
We also set out a viable approach to obtain large sets of nuclear genes for any member of the salmonid family, which should enable insights into the evolutionary role of whole genome duplication before additional nuclear genome sequences become available. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
New Insights into Asian Prunus Viruses in the Light of NGS-Based Full Genome Sequencing.

PubMed

Marais, Armelle; Faure, Chantal; Candresse, Thierry

2016-01-01

Double stranded RNAs were purified from five Prunus sources of Asian origin and submitted to 454 pyrosequencing after a random, whole genome amplification. Four complete genomes of Asian prunus virus 1 (APV1), APV2 and APV3 were reconstructed from the sequencing reads, as well as four additional, near-complete genome sequences. Phylogenetic analyses confirmed the close relationships of these three viruses and the taxonomical position previously proposed for APV1, the only APV so far completely sequenced. The genetic distances in the respective polymerase and coat protein genes as well as their gene products suggest that APV2 should be considered as a distinct viral species in the genus Foveavirus, even if the amino acid identity levels in the polymerase are very close to the species demarcation criteria for the family Betaflexiviridae. However, the situation is more complex for APV1 and APV3, for which opposite conclusions are obtained depending on the gene (polymerase or coat protein) analyzed. Phylogenetic and recombination analyses suggest that recombination events may have been involved in the evolution of APV. Moreover, genome comparisons show that the unusually long 3' non-coding region (3' NCR) is highly variable and a hot spot for indel polymorphisms. In particular, two APV3 variants differing only in their 3' NCR were identified in a single Prunus source, with 3' NCRs of 214-312 nt, a size similar to that observed in other foveaviruses, but 567-850 nt smaller than in other APV3 isolates. Overall, this study provides critical genome information of these viruses, frequently associated with Prunus materials, even though their precise role as pathogens remains to be elucidated.
New Insights into Asian Prunus Viruses in the Light of NGS-Based Full Genome Sequencing

PubMed Central

Marais, Armelle; Faure, Chantal; Candresse, Thierry

2016-01-01

Double stranded RNAs were purified from five Prunus sources of Asian origin and submitted to 454 pyrosequencing after a random, whole genome amplification. Four complete genomes of Asian prunus virus 1 (APV1), APV2 and APV3 were reconstructed from the sequencing reads, as well as four additional, near-complete genome sequences. Phylogenetic analyses confirmed the close relationships of these three viruses and the taxonomical position previously proposed for APV1, the only APV so far completely sequenced. The genetic distances in the respective polymerase and coat protein genes as well as their gene products suggest that APV2 should be considered as a distinct viral species in the genus Foveavirus, even if the amino acid identity levels in the polymerase are very close to the species demarcation criteria for the family Betaflexiviridae. However, the situation is more complex for APV1 and APV3, for which opposite conclusions are obtained depending on the gene (polymerase or coat protein) analyzed. Phylogenetic and recombination analyses suggest that recombination events may have been involved in the evolution of APV. Moreover, genome comparisons show that the unusually long 3’ non-coding region (3' NCR) is highly variable and a hot spot for indel polymorphisms. In particular, two APV3 variants differing only in their 3’ NCR were identified in a single Prunus source, with 3' NCRs of 214–312 nt, a size similar to that observed in other foveaviruses, but 567–850 nt smaller than in other APV3 isolates. Overall, this study provides critical genome information of these viruses, frequently associated with Prunus materials, even though their precise role as pathogens remains to be elucidated. PMID:26741704
Genome Reduction in Psychromonas Species within the Gut of an Amphipod from the Ocean's Deepest Point.

PubMed

Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J; Bajic, Vladimir B; Drazen, Jeffrey C; Bartlett, Douglas; Qian, Pei-Yuan

2018-01-01

Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth's oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the "piezolyte" function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere.
Genome Reduction in Psychromonas Species within the Gut of an Amphipod from the Ocean’s Deepest Point

PubMed Central

Zhang, Weipeng; Tian, Ren-Mao; Sun, Jin; Bougouffa, Salim; Ding, Wei; Cai, Lin; Lan, Yi; Tong, Haoya; Li, Yongxin; Jamieson, Alan J.; Bajic, Vladimir B.; Drazen, Jeffrey C.; Bartlett, Douglas

2018-01-01

Amphipods are the dominant scavenging metazoan species in the Mariana Trench, the deepest known point in Earth’s oceans. Here the gut microbiota of the amphipod Hirondellea gigas collected from the Challenger and Sirena Deeps of the Mariana Trench were investigated. The 11 amphipod individuals included for analyses were dominated by Psychromonas, of which a nearly complete genome was successfully recovered (designated CDP1). Compared with previously reported free-living Psychromonas strains, CDP1 has a highly reduced genome. Genome alignment showed deletion of the trimethylamine N-oxide (TMAO) reducing gene cluster in CDP1, suggesting that the “piezolyte” function of TMAO is more important than its function in respiration, which may lead to TMAO accumulation. In terms of nutrient utilization, the bacterium retains its central carbohydrate metabolism but lacks most of the extended carbohydrate utilization pathways, suggesting the confinement of Psychromonas to the host gut and sequestration from more variable environmental conditions. Moreover, CDP1 contains a complete formate hydrogenlyase complex, which might be involved in energy production. The genomic analyses imply that CDP1 may have developed adaptive strategies for a lifestyle within the gut of the hadal amphipod H. gigas. IMPORTANCE As a unique but poorly investigated habitat within marine ecosystems, hadal trenches have received interest in recent years. This study explores the gut microbial composition and function in hadal amphipods, which are among the dominant carrion feeders in hadal habitats. Further analyses of a dominant strain revealed genomic features that may contribute to its adaptation to the amphipod gut environment. Our findings provide new insights into animal-associated bacteria in the hadal biosphere. PMID:29657971

Complete Genome Sequence of a Multidrug-Resistant Salmonella enterica Serovar Typhimurium var. 5- Strain Isolated from Chicken Breast.

PubMed

Hoffmann, Maria; Muruvanda, Tim; Allard, Marc W; Korlach, Jonas; Roberts, Richard J; Timme, Ruth; Payne, Justin; McDermott, Patrick F; Evans, Peter; Meng, Jianghong; Brown, Eric W; Zhao, Shaohua

2013-12-19

Salmonella enterica subsp. enterica serovar Typhimurium is a leading cause of salmonellosis. Here, we report a closed genome sequence, including sequences of 3 plasmids, of Salmonella serovar Typhimurium var. 5- CFSAN001921 (National Antimicrobial Resistance Monitoring System [NARMS] strain ID N30688), which was isolated from chicken breast meat and shows resistance to 10 different antimicrobials. Whole-genome and plasmid sequence analyses of this isolate will help enhance our understanding of this pathogenic multidrug-resistant serovar.
Pan-genome analysis of human gastric pathogen H. pylori: comparative genomics and pathogenomics approaches to identify regions associated with pathogenicity and prediction of potential core therapeutic targets.

PubMed

Ali, Amjad; Naz, Anam; Soares, Siomar C; Bakhtiar, Marriam; Tiwari, Sandeep; Hassan, Syed S; Hanan, Fazal; Ramos, Rommel; Pereira, Ulisses; Barh, Debmalya; Figueiredo, Henrique César Pereira; Ussery, David W; Miyoshi, Anderson; Silva, Artur; Azevedo, Vasco

2015-01-01

Helicobacter pylori is a human gastric pathogen implicated as the major cause of peptic ulcer and the second leading cause of gastric cancer (~70%) around the world. At the same time, increasing resistance to antibiotics and obstacles to the development of vaccines against H. pylori have been observed. Pan-genome analyses of the global representative H. pylori isolates consisting of 39 complete genomes are presented in this paper. Phylogenetic analyses have revealed close relationships among geographically diverse strains of H. pylori. The conservation among these genomes was further analyzed by a pan-genome approach; the predicted conserved gene families (1,193) constitute ~77% of the average H. pylori genome and 45% of the global gene repertoire of the species. Reverse vaccinology strategies have been adopted to identify and narrow down the potential core-immunogenic candidates. A total of 28 nonhost homolog proteins were characterized as universal therapeutic targets against H. pylori based on their functional annotation and protein-protein interaction. Finally, pathogenomics and genome plasticity analysis revealed 3 highly conserved and 2 highly variable putative pathogenicity islands in all of the H. pylori genomes analyzed.
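The two percentages quoted for the conserved gene families (a share of the average genome and a share of the global repertoire) can be read as ratios over a core set and a pan set. The sketch below shows how such ratios are defined, assuming gene families have already been clustered. The abstract does not describe the clustering method, and the strain names, family labels and function name here are hypothetical toy data.

```python
def core_and_pan(genomes: dict):
    """Core = families present in every genome; pan = families present in any genome."""
    families = list(genomes.values())
    core = set.intersection(*families)
    pan = set.union(*families)
    return core, pan


# Toy gene-family content per strain (illustrative only).
toy = {
    "strain_A": {"fam1", "fam2", "fam3", "fam4"},
    "strain_B": {"fam1", "fam2", "fam3", "fam5"},
    "strain_C": {"fam1", "fam2", "fam6"},
}

core, pan = core_and_pan(toy)
avg_genome_size = sum(len(f) for f in toy.values()) / len(toy)

core_share_of_avg_genome = len(core) / avg_genome_size
core_share_of_pan = len(core) / len(pan)
print(sorted(core), len(pan), core_share_of_avg_genome, core_share_of_pan)
```

Under the same reading, the abstract's 1,193 conserved families would be the core, and the reported ~77% and 45% would correspond to the two ratios computed at the end of the sketch.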
Concomitant loss of NDH complex-related genes within chloroplast and nuclear genomes in some orchids.

PubMed

Lin, Choun-Sea; Chen, Jeremy J W; Chiu, Chi-Chou; Hsiao, Han C W; Yang, Chen-Jui; Jin, Xiao-Hua; Leebens-Mack, James; de Pamphilis, Claude W; Huang, Yao-Ting; Yang, Ling-Hung; Chang, Wan-Jung; Kui, Ling; Wong, Gane Ka-Shu; Hu, Jer-Ming; Wang, Wen; Shih, Ming-Che

2017-06-01

The chloroplast NAD(P)H dehydrogenase-like (NDH) complex consists of about 30 subunits from both the nuclear and chloroplast genomes and is ubiquitous across most land plants. In some orchids, such as Phalaenopsis equestris, Dendrobium officinale and Dendrobium catenatum, most of the 11 chloroplast genome-encoded ndh genes (cp-ndh) have been lost. Here we investigated whether functional cp-ndh genes have been completely lost in these orchids or whether they have been transferred and retained in the nuclear genome. Further, we assessed whether both cp-ndh genes and nucleus-encoded NDH-related genes can be lost, resulting in the absence of the NDH complex. Comparative analyses of the genome of Apostasia odorata, an orchid species with a complete complement of cp-ndh genes which represents the sister lineage to all other orchids, and three published orchid genome sequences for P. equestris, D. officinale and D. catenatum, which are all missing cp-ndh genes, indicated that copies of cp-ndh genes are not present in any of these four nuclear genomes. This observation suggests that the NDH complex is not necessary for some plants. Comparative genomic/transcriptomic analyses of currently available plastid genome sequences and nuclear transcriptome data showed that 47 out of 660 photoautotrophic plants and all the heterotrophic plants are missing plastid-encoded cp-ndh genes and exhibit no evidence for maintenance of a functional NDH complex. Our data indicate that the NDH complex can be lost in photoautotrophic plant species. Further, the loss of the NDH complex may increase the probability of transition from a photoautotrophic to a heterotrophic life history. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.
The complete nucleotide sequence of the barley yellow dwarf GPV isolate from China shows that it is a new member of the genus Polerovirus.

PubMed

Zhang, Wenwei; Cheng, Zhuomin; Xu, Lei; Wu, Maosen; Waterhouse, Peter; Zhou, Guanghe; Li, Shifang

2009-01-01

The complete nucleotide sequence of the ssRNA genome of a Chinese GPV isolate of barley yellow dwarf virus (BYDV) was determined. It comprised 5673 nucleotides, and the deduced genome organization resembled that of members of the genus Polerovirus. It was most closely related to cereal yellow dwarf virus-RPV (77% nt identity over the entire genome; coat protein amino acid identity 79%). The GPV isolate also differs in vector specificity from other BYDV strains. Biological properties, phylogenetic analyses and detailed sequence comparisons suggest that GPV should be considered a member of a new species within the genus, and the name Wheat yellow dwarf virus-GPV is proposed.
Complete genome sequence and phylogenetic analyses of an aquabirnavirus isolated from a diseased marbled eel culture in Taiwan.

PubMed

Wen, Chiu-Ming

2017-08-01

An aquabirnavirus was isolated from diseased marbled eels (Anguilla marmorata; MEIPNV1310) with gill haemorrhages and associated mortality. Its genome segment sequences were obtained through next-generation sequencing and compared with published aquabirnavirus sequences. The results indicated that the genome sequence of MEIPNV1310 contains segment A (3099 nucleotides) and segment B (2789 nucleotides). Phylogenetic analysis showed that MEIPNV1310 is closely related to the infectious pancreatic necrosis Ab strain within genogroup II. This genome sequence is beneficial for studying the geographic distribution and evolution of aquabirnaviruses.
Quality control and conduct of genome-wide association meta-analyses.

PubMed

Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth J F

2014-05-01

Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.
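The study-file-level QC described above can be illustrated with a minimal sketch. This is not EasyQC and does not reproduce the protocol's actual checks; the record fields (`beta`, `se`, `p`, `eaf`) and the range rules are assumptions chosen only to show what a per-row sanity check on aggregated association statistics might look like.

```python
import math
from dataclasses import dataclass


@dataclass
class AssocRow:
    """One SNP row from a hypothetical study results file."""
    snp: str
    beta: float   # effect estimate
    se: float     # standard error of the effect estimate
    p: float      # association p-value
    eaf: float    # effect allele frequency


def row_problems(row: AssocRow) -> list[str]:
    """Return a list of sanity-check failures for one row (empty if clean)."""
    problems = []
    if not math.isfinite(row.beta):
        problems.append("beta is not finite")
    if not (math.isfinite(row.se) and row.se > 0):
        problems.append("se must be finite and positive")
    if not (0.0 <= row.p <= 1.0):      # NaN also fails this comparison
        problems.append("p outside [0, 1]")
    if not (0.0 <= row.eaf <= 1.0):
        problems.append("eaf outside [0, 1]")
    return problems


if __name__ == "__main__":
    rows = [
        AssocRow("rs_example_1", 0.10, 0.02, 0.5, 0.30),
        AssocRow("rs_example_2", 0.05, -1.0, 1.2, 0.10),
    ]
    for row in rows:
        print(row.snp, row_problems(row))
```

Rows that return a non-empty list would be flagged for review or exclusion before meta-analysis; which action is appropriate depends on the consortium's own rules.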
Quality control and conduct of genome-wide association meta-analyses

PubMed Central

Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth JF

2014-01-01

Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for [1] organizational aspects of GWAMAs, and for [2] QC at the study file level, the meta-level across studies, and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for use of a powerful and flexible software package called EasyQC. For consortia of comparable size to the GIANT Consortium, the present protocol takes a minimum of about 10 months to complete. PMID:24762786
The complete sequences and gene organisation of the mitochondrial genomes of the heterodont bivalves Acanthocardia tuberculata and Hiatella arctica – and the first record for a putative Atpase subunit 8 gene in marine bivalves

PubMed Central

Dreyer, Hermann; Steiner, Gerhard

2006-01-01

Background: Mitochondrial (mt) gene arrangement is highly variable among molluscs and especially among bivalves. Of the 30 complete molluscan mt-genomes published to date, only one is of a heterodont bivalve, although this is the most diverse taxon in terms of species numbers. We determined the complete sequence of the mitochondrial genomes of Acanthocardia tuberculata and Hiatella arctica (Mollusca, Bivalvia, Heterodonta) and describe their gene contents and genome organisations to assess the variability of these features among the Bivalvia and their value for phylogenetic inference. Results: The size of the mt-genome in Acanthocardia tuberculata is 16,104 base pairs (bp), and in Hiatella arctica 18,244 bp. The Acanthocardia mt-genome contains 12 of the typical protein coding genes, lacking the Atpase subunit 8 (atp8) gene, as do all published marine bivalves. In contrast, a complete atp8 gene is present in Hiatella arctica. In addition, we found a putative truncated atp8 gene when re-annotating the mt-genome of Venerupis philippinarum. Both mt-genomes reported here encode all genes on the same strand and have an additional trnM. In Acanthocardia several large non-coding regions are present. One of these contains 3.5 nearly identical copies of a 167 bp motif. In Hiatella, the 3' end of the NADH dehydrogenase subunit (nad)6 gene is duplicated together with the adjacent non-coding region. The gene arrangement of Hiatella is markedly different from all other known molluscan mt-genomes; that of Acanthocardia shows few identities with that of Venerupis philippinarum. Phylogenetic analyses on amino acid and nucleotide levels robustly support the Heterodonta and the sister group relationship of Acanthocardia and Venerupis. Monophyletic Bivalvia are resolved only by a Bayesian inference of the nucleotide data set. In all other analyses the two unionid species, being the only ones with genes located on both strands, do not group with the remaining bivalves.
Conclusion: The two mt-genomes reported here add to and underline the high variability of gene order and presence of duplications in bivalve and molluscan taxa. Some genomic traits like the loss of the atp8 gene or the encoding of all genes on the same strand are homoplastic among the Bivalvia. These characters, gene order, and the nucleotide sequence data show considerable potential for resolving phylogenetic patterns at lower taxonomic levels. PMID:16948842
The (in)complete organelle genome: exploring the use and nonuse of available technologies for characterizing mitochondrial and plastid chromosomes.

PubMed

Sanitá Lima, Matheus; Woods, Laura C; Cartwright, Matthew W; Smith, David Roy

2016-11-01

Not long ago, scientists paid dearly in time, money and skill for every nucleotide that they sequenced. Today, DNA sequencing technologies epitomize the slogan 'faster, easier, cheaper and more', and in many ways, sequencing an entire genome has become routine, even for the smallest laboratory groups. This is especially true for mitochondrial and plastid genomes. Given their relatively small sizes and high copy numbers per cell, organelle DNAs are currently among the most highly sequenced kind of chromosome. But accurately characterizing an organelle genome and the information it encodes can require much more than DNA sequencing and bioinformatics analyses. Organelle genomes can be surprisingly complex and can exhibit convoluted and unconventional modes of gene expression. Unravelling this complexity can demand a wide assortment of experiments, from pulsed-field gel electrophoresis to Southern and Northern blots to RNA analyses. Here, we show that it is exactly these types of 'complementary' analyses that are often lacking from contemporary organelle genome papers, particularly short 'genome announcement' articles. Consequently, crucial and interesting features of organelle chromosomes are going undescribed, which could ultimately lead to a poor understanding and even a misrepresentation of these genomes and the genes they express. High-throughput sequencing and bioinformatics have made it easy to sequence and assemble entire chromosomes, but they should not be used as a substitute for or at the expense of other types of genomic characterization methods. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
Complete Chloroplast Genome Sequences of Important Oilseed Crop Sesamum indicum L

PubMed Central

Yi, Dong-Keun; Kim, Ki-Joong

2012-01-01

Sesamum indicum is an important oil-yielding crop plant species. The complete chloroplast (cp) genome of S. indicum (GenBank acc no. JN637766) is 153,324 bp in length, and has a pair of inverted repeat (IR) regions consisting of 25,141 bp each. The lengths of the large single copy (LSC) and the small single copy (SSC) regions are 85,170 bp and 17,872 bp, respectively. Comparative cp DNA sequence analyses of S. indicum with other cp genomes reveal that the genome structure, gene order, gene and intron contents, AT contents, codon usage, and transcription units are similar to those of typical angiosperm cp genomes. Nucleotide diversity of the IR region between Sesamum and three other cp genomes is much lower than that of the LSC and SSC regions in both the coding region and noncoding region. In summary, regional constraints strongly affect the sequence evolution of the cp genomes, while functional constraints affect it only weakly. Five short inversions associated with short palindromic sequences that form stem-loop structures were observed in the chloroplast genome of S. indicum. Twenty-eight different simple sequence repeat (SSR) loci have been detected in the chloroplast genome of S. indicum. Almost all of the SSR loci were composed of A or T, so this may also contribute to the A-T richness of the cp genome of S. indicum. Seven large repeated loci in the chloroplast genome of S. indicum were also identified, and these loci are useful for developing S. indicum-specific cp genome vectors. The complete cp DNA sequences of S. indicum reported in this paper are a prerequisite for modifying this important oilseed crop by cp genetic engineering techniques. PMID:22606240
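As a quick consistency check on the figures above, the reported total length should equal the LSC plus the SSC plus two copies of the IR, assuming the usual quadripartite layout. The following is a small sketch of that arithmetic; the function name is illustrative and not taken from the paper.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 85_170
SSC = 17_872
IR = 25_141              # each of the two inverted repeat copies
REPORTED_TOTAL = 153_324


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # prints: 153324 True
```

The same check is a cheap way to catch transcription errors when copying region lengths out of genome reports.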
From clinical sample to complete genome: Comparing methods for the extraction of HIV-1 RNA for high-throughput deep sequencing.

PubMed

Cornelissen, Marion; Gall, Astrid; Vink, Monique; Zorgdrager, Fokla; Binter, Špela; Edwards, Stephanie; Jurriaans, Suzanne; Bakker, Margreet; Ong, Swee Hoe; Gras, Luuk; van Sighem, Ard; Bezemer, Daniela; de Wolf, Frank; Reiss, Peter; Kellam, Paul; Berkhout, Ben; Fraser, Christophe; van der Kuyl, Antoinette C

2017-07-15

The BEEHIVE (Bridging the Evolution and Epidemiology of HIV in Europe) project aims to analyse nearly-complete viral genomes from >3000 HIV-1 infected Europeans using high-throughput deep sequencing techniques to investigate the virus genetic contribution to virulence. Following the development of a computational pipeline, including a new de novo assembler for RNA virus genomes, to generate larger contiguous sequences (contigs) from the abundance of short sequence reads that characterise the data, another area that determines genome sequencing success is the quality and quantity of the input RNA. A pilot experiment with 125 patient plasma samples was performed to investigate the optimal method for isolation of HIV-1 viral RNA for long amplicon genome sequencing. Manual isolation with the QIAamp Viral RNA Mini Kit (Qiagen) was superior to robotic extraction using either the QIAcube robotic system, the mSample Preparation Systems RNA kit with automated extraction by the m2000sp system (Abbott Molecular), or the MagNA Pure 96 System in combination with the MagNA Pure 96 Instrument (Roche Diagnostics). We scored amplification of a set of four HIV-1 amplicons of ∼1.9, 3.6, 3.0 and 3.5 kb, and subsequent recovery of near-complete viral genomes. Subsequently, 616 BEEHIVE patient samples were analysed to determine factors that influence successful amplification of the genome in four overlapping amplicons using the QIAamp Viral RNA Kit for viral RNA isolation. Both low plasma viral load and high sample age (stored before 1999) negatively influenced the amplification of viral amplicons >3 kb. A plasma viral load of >100,000 copies/ml resulted in successful amplification of all four amplicons for 86% of the samples; this value dropped to only 46% for samples with viral loads of <20,000 copies/ml. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Complete mitochondrial genome sequence from an endangered Indian snake, Python molurus molurus (Serpentes, Pythonidae).

PubMed

Dubey, Bhawna; Meganathan, P R; Haque, Ikramul

2012-07-01

This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17258 bp length comprising 37 genes, including the 13 protein coding genes, 22 tRNA genes, and 2 ribosomal RNA genes, along with duplicate control regions, is described herein. The P. molurus molurus mt genome is relatively similar to other snake mt genomes with respect to gene arrangement, composition, tRNA structures and skews of AT/GC bases. The nucleotide composition of the genome shows a higher percentage of A and C than of T and G on the positive strand, as revealed by positive AT and CG skews. Comparison of individual protein coding genes with other snake genomes suggests that ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference for NNC codons over NNG codons in the mt genome of P. molurus. Also, the ratio of non-synonymous to synonymous substitution rates (Ka/Ks) suggests that most of the protein coding genes are under purifying selection pressure. The phylogenetic analyses involving the concatenated 13 protein coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
The genome sequence of the emerging common midwife toad virus identifies an evolutionary intermediate within ranaviruses.

PubMed

Mavian, Carla; López-Bueno, Alberto; Balseiro, Ana; Casais, Rosa; Alcamí, Antonio; Alejo, Alí

2012-04-01

Worldwide amphibian population declines have been ascribed to global warming, increasing pollution levels, and other factors directly related to human activities. These factors may additionally be favoring the emergence of novel pathogens. In this report, we have determined the complete genome sequence of the emerging common midwife toad ranavirus (CMTV), which has caused fatal disease in several amphibian species across Europe. Phylogenetic and gene content analyses of the first complete genomic sequence from a ranavirus isolated in Europe show that CMTV is an amphibian-like ranavirus (ALRV). However, the CMTV genome structure is novel and represents an intermediate evolutionary stage between the two previously described ALRV groups. We find that CMTV clusters with several other ranaviruses isolated from different hosts and locations which might also be included in this novel ranavirus group. This work sheds light on the phylogenetic relationships within this complex group of emerging, disease-causing viruses.
Determination of the complete genomic sequence and analysis of the gene products of the virus of Spring Viremia of Carp, a fish rhabdovirus.

PubMed

Hoffmann, Bernd; Schütze, Heike; Mettenleiter, Thomas C

2002-03-20

The complete genome of spring viremia of carp virus (SVCV) was cloned and the sequence of 11019 nucleotides was determined. It contains five open reading frames (ORFs) encoding the nucleoprotein N; phosphoprotein P; matrix protein M; glycoprotein G; and the viral RNA-dependent RNA polymerase L. Genes are organised in the order typical for rhabdoviruses: 3'-N-P-M-G-L-5'. The short leader and trailer regions of SVCV exhibit inverse complementarity and are similar to the respective 3' and 5' ends of the genome of vesicular stomatitis virus. To verify the predicted open reading frames, proteins were expressed in bacteria and analysed with a polyclonal anti-SVCV serum. Furthermore, monospecific antisera against the distinct viral proteins were generated. Comparisons of genome and protein sequences confirm the assignment of SVCV to the genus Vesiculovirus.
Complete mitochondrial genome of the Asian paddle crab Charybdis japonica (Crustacea: Decapoda: Portunidae): gene rearrangement of the marine brachyurans and phylogenetic considerations of the decapods.

PubMed

Liu, Yuan; Cui, Zhaoxia

2010-06-01

Given the commercial and ecological importance of the Asian paddle crab, Charybdis japonica, there is a clear need for genetic and molecular research on this species. Here, we present the complete mitochondrial genome sequence of C. japonica, determined by the long-polymerase chain reaction and primer walking sequencing method. The entire genome is 15,738 bp in length, encoding a standard set of 13 protein-coding genes, two ribosomal RNA genes, and 22 transfer RNA genes, plus the putative control region, which is typical for metazoans. The total A+T content of the genome is 69.2%, lower than that of the other brachyuran crabs except for Callinectes sapidus. The gene order is identical to that of the published marine brachyurans and differs from the ancestral pancrustacean order only in the position of the tRNA(His) gene. Phylogenetic analyses using the concatenated nucleotide and amino acid sequences of 13 protein-coding genes strongly support the monophyly of Dendrobranchiata and Pleocyemata, which is consistent with the previous taxonomic classification. However, the systematic status of Charybdis within subfamily Thalamitinae of family Portunidae is not supported. C. japonica, as the first species of Charybdis with a complete mitochondrial genome available, will provide important information on both genomics and molecular ecology of the group.
The complete mitochondrial genome of rabbit pinworm Passalurus ambiguus: genome characterization and phylogenetic analysis.

PubMed

Liu, Guo-Hua; Li, Sheng; Zou, Feng-Cai; Wang, Chun-Ren; Zhu, Xing-Quan

2016-01-01

Passalurus ambiguus (Nematoda: Oxyuridae) is a common pinworm which parasitizes the caecum and colon of rabbits. Despite its significance as a pathogen, the epidemiology, genetics, systematics, and biology of this pinworm remain poorly understood. In the present study, we sequenced the complete mitochondrial (mt) genome of P. ambiguus. The circular mt genome is 14,023 bp in size and encodes 36 genes, including 12 protein-coding, two ribosomal RNA, and 22 transfer RNA genes. The mt gene order of P. ambiguus is the same as that of Wellcomia siamensis, but distinct from that of Enterobius vermicularis. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference (BI) showed that P. ambiguus was more closely related to W. siamensis than to E. vermicularis. This mt genome provides novel genetic markers for studying the molecular epidemiology, population genetics, and systematics of pinworms of animals and humans, and should have implications for the diagnosis, prevention, and control of passaluriasis in rabbits and other animals.
Reanalysis and revision of the complete mitochondrial genome of Rachycentron canadum (Teleostei, Perciformes, Rachycentridae).

PubMed

Musika, Jidapa; Khongchatee, Adison; Phinchongsakuldit, Jaros

2014-08-01

The complete mitochondrial genome of cobia, Rachycentron canadum, was reanalyzed and revised. The genome is 18,008 bp in length, containing 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a control region or displacement loop (D-loop). The gene arrangement is identical to that observed in most vertebrates. Base composition on the heavy strand is 30.14% A, 25.22% C, 15.80% G and 28.84% T. The D-loop region exhibits an A + T rich pattern, containing short tandem repeats of TATATACATGG, TATATGCACAA and TATATGCACGG. The mitochondrial genome studied differs from the previously published genome in two segments: the control region to 12S, and ND5 to tRNA(Glu). The 12S sequence also differs from those published in the databases. Phylogenetic analyses revealed that the differences could be due to errors in sequence assembly and/or sample misidentification in the previous studies.
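The heavy-strand base composition reported above is enough to derive two common summary statistics, the overall A+T content and the strand skews. The abstract itself reports neither; the sketch below computes them from the published percentages using the widely used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). The variable and function names are illustrative.

```python
# Heavy-strand base composition (%) as reported in the abstract.
composition = {"A": 30.14, "C": 25.22, "G": 15.80, "T": 28.84}


def skew(x: float, y: float) -> float:
    """Generic skew statistic (x - y) / (x + y)."""
    return (x - y) / (x + y)


at_content = composition["A"] + composition["T"]        # percent A+T
at_skew = skew(composition["A"], composition["T"])      # > 0 means more A than T
gc_skew = skew(composition["G"], composition["C"])      # < 0 means more C than G

print(f"A+T content: {at_content:.2f}%")
print(f"AT skew: {at_skew:+.4f}")
print(f"GC skew: {gc_skew:+.4f}")
```

With these inputs the AT skew comes out slightly positive and the GC skew clearly negative, that is, an excess of A over T and of C over G on the reported strand.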
Complete genome sequence of the Phaeobacter gallaeciensis type strain CIP 105210(T) (= DSM 26640(T) = BS107(T)).

PubMed

Frank, Oliver; Pradella, Silke; Rohde, Manfred; Scheuner, Carmen; Klenk, Hans-Peter; Göker, Markus; Petersen, Jörn

2014-06-15

Phaeobacter gallaeciensis CIP 105210(T) (= DSM 26640(T) = BS107(T)) is the type strain of the species Phaeobacter gallaeciensis. The genus Phaeobacter belongs to the marine Roseobacter group (Rhodobacteraceae, Alphaproteobacteria). Phaeobacter species are effective colonizers of marine surfaces, including frequent associations with eukaryotes. Strain BS107(T) was isolated from a rearing of the scallop Pecten maximus. Here we describe the features of this organism, together with the complete genome sequence, comprising eight circular replicons with a total of 4,448 genes. In addition to a high number of extrachromosomal replicons, the genome contains six genomic islands and three putative prophage regions, as well as a hybrid between a plasmid and a circular phage. Phylogenomic analyses confirm previous results, which indicated that the originally reported P. gallaeciensis type-strain deposit DSM 17395 belongs to P. inhibens and that CIP 105210(T) (= DSM 26640(T)) is the sole genome-sequenced representative of P. gallaeciensis.
Complete mitochondrial genome of the brown alga Sargassum fusiforme (Sargassaceae, Phaeophyceae): genome architecture and taxonomic consideration.

PubMed

Liu, Feng; Pang, Shaojun; Luo, Minbo

2016-01-01

Sargassum fusiforme (Harvey) Setchell (=Hizikia fusiformis (Harvey) Okamura) is one of the most economically important seaweeds for mariculture in China. In this study, we present the complete mitochondrial genome of S. fusiforme. The genome is 34,696 bp in length with circular organization, encoding the standard set of three ribosomal RNA genes (rRNA), 25 transfer RNA genes (tRNA), 35 protein-coding genes, and two conserved open reading frames (ORFs). Its total AT content is 62.47%, lower than that of other brown algae except Pylaiella littoralis. The mitogenome carries 1571 bp of intergenic region, constituting 4.53% of the genome, and 13 pairs of overlapping genes with overlap sizes from 1 to 90 bp. The phylogenetic analyses based on 35 protein-coding genes reveal that S. fusiforme has a closer evolutionary relationship with Sargassum muticum than with Sargassum horneri, indicating that Hizikia is not a distinct evolutionary entity and should be reduced to synonymy with Sargassum.
Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas

We present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.

Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

DOE PAGES

Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; ...

2017-08-08

Here, we present two standards developed by the Genomic Standards Consortium (GSC) for reporting bacterial and archaeal genome sequences. Both are extensions of the Minimum Information about Any (x) Sequence (MIxS). The standards are the Minimum Information about a Single Amplified Genome (MISAG) and the Minimum Information about a Metagenome-Assembled Genome (MIMAG), including, but not limited to, assembly quality, and estimates of genome completeness and contamination. These standards can be used in combination with other GSC checklists, including the Minimum Information about a Genome Sequence (MIGS), Minimum Information about a Metagenomic Sequence (MIMS), and Minimum Information about a Marker Gene Sequence (MIMARKS). Community-wide adoption of MISAG and MIMAG will facilitate more robust comparative genomic analyses of bacterial and archaeal diversity.
[Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

PubMed

Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

2009-08-01

Full-length genomes of two rabies viruses isolated from Chinese ferret-badgers were sequenced in order to analyze the genetic variation of rabies viruses at the molecular level, to obtain information on the prevalence and variation of rabies viruses in Zhejiang, and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and assembled into full-length genomes, which were used to analyze nucleotide and deduced protein similarities and to perform phylogenetic analyses of the N genes from Chinese ferret-badger, sika deer, vole, dog and vaccine strains. The two completely sequenced genomes had the same genetic structure of 11,923 nt, comprising a 58-nt leader, NP (1353 nt), PP (894 nt), MP (609 nt), GP (1575 nt), LP (6386 nt), intergenic regions (IGRs) of 2, 5 and 5 nt, a 423-nt pseudogene-like sequence (Psi) and a 70-nt trailer. BLAST and multiple sequence alignment showed that the two genomes were consistent with the properties of the genus Lyssavirus in the family Rhabdoviridae. Nucleotide and amino acid sequences were most similar among Chinese strains, especially among strains from animals of the same species. Between the two full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so most of the nucleotide mutations in these two genomes were probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no changes in length and no recombination, only a few point mutations, which suggests that the five proteins are stable. The sites and types of variation in the two ferret-badger genomes were similar to those of the reference vaccine and street strains. Multiple sequence alignment and phylogenetic analyses placed the two strains in genotype 1, with the distinct geographic characteristics of China. All the evidence suggests that these two ferret-badger rabies viruses are likely to be street viruses already circulating in wildlife.
Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.

2006-06-01

The magnoliids represent the largest basal angiosperm clade with four orders, 19 families and 8,500 species. Although several recent angiosperm molecular phylogenies have supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence has resulted in phylogenies supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. This is one of the most important remaining issues concerning relationships among basal angiosperms. We sequenced the chloroplast genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other completed angiosperm chloroplast genomes to assess phylogenetic relationships among magnoliids. The Drimys and Piper chloroplast genomes are nearly identical in size at 160,606 and 160,624 bp, respectively. The genomes include a pair of inverted repeats of 26,649 bp (Drimys) and 27,039 bp (Piper), separated by a small single copy region of 18,621 bp (Drimys) and 18,878 bp (Piper) and a large single copy region of 88,685 bp (Drimys) and 87,666 bp (Piper). The gene order of both taxa is nearly identical to many other unrearranged angiosperm chloroplast genomes, including Calycanthus, the other published magnoliid genome. Comparisons of angiosperm chloroplast genomes indicate that GC content is not uniformly distributed across the genome. Overall GC content ranges from 34-39%, and coding regions have a substantially higher GC content than non-coding regions (both intergenic spacers and introns). Among protein-coding genes, GC content varies by codon position with 1st codon > 2nd codon > 3rd codon, and it varies by functional group with photosynthetic genes having the highest percentage and NADH genes the lowest. Across the genome, GC content is highest in the inverted repeat due to the presence of rRNA genes and lowest in the small single copy region where most NADH genes are located. Phylogenetic analyses using maximum parsimony and maximum likelihood methods were performed on DNA sequences of 61 protein-coding genes. Trees from both analyses provided strong support for the monophyly of magnoliids and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. The phylogenies also provided moderate to strong support for the basal position of Amborella, and a sister relationship of magnoliids to a clade that includes monocots and eudicots. The complete sequences of three magnoliid chloroplast genomes provide new data from the largest basal angiosperm clade. Evolutionary comparisons of these new genome sequences, combined with other published angiosperm genomes, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.
The complete mitochondrial genome of the styloperlid stonefly species Styloperla spinicercia Wu (Insecta: Plecoptera) with family-level phylogenetic analyses of the Pteronarcyoidea.

PubMed

Wang, Ying; Cao, Jinjun; Li, Weihai

2017-03-13

We present the complete mitochondrial (mt) genome sequence of the stonefly, Styloperla spinicercia Wu, 1935 (Plecoptera: Styloperlidae), the type species of the genus Styloperla and the first complete mt genome for the family Styloperlidae. The genome is circular, 16,129 base pairs long, has an A+T content of 70.7%, and contains 37 genes including the large and small ribosomal RNA (rRNA) subunits, 13 protein coding genes (PCGs), 22 tRNA genes and a large non-coding region (CR). All of the PCGs use the standard initiation codon ATN except ND1 and ND5, which start with TTG and GTG. Twelve of the PCGs stop with the conventional termination codons TAA or TAG; the exception is ND5, which shows an incomplete termination signal T. All tRNAs have the classic clover-leaf structures with the dihydrouridine (DHU) arm of tRNASer(AGN) forming a simple loop. Secondary structures of the two ribosomal RNAs are presented with reference to previous models. The structural elements and the variable numbers of tandem repeats are described within the control region. Phylogenetic analyses using both Bayesian (BI) and Maximum Likelihood (ML) methods support the previous hypotheses regarding family level relationships within the Pteronarcyoidea. The genetic distance calculated based on 13 PCGs and two rRNAs between Styloperla sp. and S. spinicercia is provided and interspecific divergence is discussed.
Sequence Analysis of Leuconostoc mesenteroides Bacteriophage Φ1-A4 Isolated from an Industrial Vegetable Fermentation

PubMed Central

Lu, Z.; Altermann, E.; Breidt, F.; Kozyavkin, S.

2010-01-01

Vegetable fermentations rely on the proper succession of a variety of lactic acid bacteria (LAB). Leuconostoc mesenteroides initiates fermentation. As fermentation proceeds, L. mesenteroides dies off and other LAB complete the fermentation. Phages infecting L. mesenteroides may significantly influence the die-off of L. mesenteroides. However, no L. mesenteroides phages have been previously genetically characterized. Knowledge of more phage genome sequences may provide new insights into phage genomics, phage evolution, and phage-host interactions. We have determined the complete genome sequence of L. mesenteroides phage Φ1-A4, isolated from an industrial sauerkraut fermentation. The phage possesses a linear, double-stranded DNA genome consisting of 29,508 bp with a G+C content of 36%. Fifty open reading frames (ORFs) were predicted. Putative functions were assigned to 26 ORFs (52%), including 5 ORFs of structural proteins. The phage genome was modularly organized, containing DNA replication, DNA-packaging, head and tail morphogenesis, cell lysis, and DNA regulation/modification modules. In silico analyses showed that Φ1-A4 is a unique lytic phage with a large-scale genome inversion (∼30% of the genome). The genome inversion encompassed the lysis module, part of the structural protein module, and a cos site. The endolysin gene was flanked by two holin genes. The tail morphogenesis module was interspersed with cell lysis genes and other genes with unknown functions. The predicted amino acid sequences of the phage proteins showed little similarity to other phages, but functional analyses showed that Φ1-A4 clusters with several Lactococcus phages. To our knowledge, Φ1-A4 is the first genetically characterized L. mesenteroides phage. PMID:20118355
Phylogenetic analyses of complete mitochondrial genome sequences suggest a basal divergence of the enigmatic rodent Anomalurus

PubMed Central

Horner, David S; Lefkimmiatis, Konstantinos; Reyes, Aurelio; Gissi, Carmela; Saccone, Cecilia; Pesole, Graziano

2007-01-01

Background Phylogenetic relationships between Lagomorpha, Rodentia and Primates and their allies (Euarchontoglires) have long been debated. While it is now generally agreed that Rodentia constitutes a monophyletic sister-group of Lagomorpha and that this clade (Glires) is sister to Primates and Dermoptera, higher-level relationships within Rodentia remain contentious. Results We have sequenced and performed extensive evolutionary analyses on the mitochondrial genome of the scaly-tailed flying squirrel Anomalurus sp., an enigmatic rodent whose phylogenetic affinities have been obscure and extensively debated. Our phylogenetic analyses of the coding regions of available complete mitochondrial genome sequences from Euarchontoglires suggest that Anomalurus is a sister taxon to the Hystricognathi, and that this clade represents the most basal divergence among sampled Rodentia. Bayesian dating methods incorporating a relaxed molecular clock provide divergence-time estimates which are consistently in agreement with the fossil record and which indicate a rapid radiation within Glires around 60 million years ago. Conclusion Taken together, the data presented provide a working hypothesis as to the phylogenetic placement of Anomalurus, underline the utility of mitochondrial sequences in the resolution of even relatively deep divergences and go some way to explaining the difficulty of conclusively resolving higher-level relationships within Glires with available data and methodologies. PMID:17288612
Comparative Genome Analyses of Serratia marcescens FS14 Reveals Its High Antagonistic Potential

PubMed Central

Li, Pengpeng; Kwok, Amy H. Y.; Jiang, Jingwei; Ran, Tingting; Xu, Dongqing; Wang, Weiwu; Leung, Frederick C.

2015-01-01

S. marcescens FS14 was isolated from an Atractylodes macrocephala Koidz plant that was infected by Fusarium oxysporum and showed symptoms of root rot. With the completion of the genome sequence of FS14, the first comprehensive comparative-genomic analysis of the Serratia genus was performed. Pan-genome and COG analyses showed that the majority of the conserved core genes are involved in basic cellular functions, while genomic factors such as prophages contribute considerably to genome diversity. Additionally, a Type I restriction-modification system, a Type III secretion system and tellurium resistance genes are found in only some Serratia species. Comparative analysis further identified that S. marcescens FS14 possesses multiple mechanisms for antagonism against other microorganisms, including the production of prodigiosin, bacteriocins, and multi-antibiotic resistance determinants as well as chitinases. The presence of two evolutionarily distinct Type VI secretion systems (T6SSs) in FS14 may provide further competitive advantages for FS14 against other microbes. To our knowledge, this is the first report of comparative analysis on T6SSs in the genus, which identifies four types of T6SSs in Serratia spp. Competition bioassays of FS14 against the vital plant pathogenic bacterium Ralstonia solanacearum and the fungi Fusarium oxysporum and Sclerotinia sclerotiorum were performed to support our genomic analyses, in which FS14 demonstrated high antagonistic activities against both bacterial and fungal phytopathogens. PMID:25856195
Rickettsia asembonensis Characterization by Multilocus Sequence Typing of Complete Genes, Peru.

PubMed

Loyola, Steev; Flores-Mendoza, Carmen; Torre, Armando; Kocher, Claudine; Melendrez, Melanie; Luce-Fedrow, Alison; Maina, Alice N; Richards, Allen L; Leguia, Mariana

2018-05-01

While studying rickettsial infections in Peru, we detected Rickettsia asembonensis in fleas from domestic animals. We characterized 5 complete genomic regions (17kDa, gltA, ompA, ompB, and sca4) and conducted multilocus sequence typing and phylogenetic analyses. The molecular isolate from Peru is distinct from the original R. asembonensis strain from Kenya.
IMGD: an integrated platform supporting comparative genomics and phylogenetics of insect mitochondrial genomes

PubMed Central

Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan

2009-01-01

Background Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated to IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation and interactive interface for the identified SNPs/INDELs. Conclusion The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site . PMID:19351385
Pan-genome and phylogeny of Bacillus cereus sensu lato.

PubMed

Bazinet, Adam L

2017-08-02

Bacillus cereus sensu lato (s. l.) is an ecologically diverse bacterial group of medical and agricultural significance. In this study, I use publicly available genomes and novel bioinformatic workflows to characterize the B. cereus s. l. pan-genome and perform the largest phylogenetic and population genetic analyses of this group to date in terms of the number of genes and taxa included. With these fundamental data in hand, I identify genes associated with particular phenotypic traits (i.e., "pan-GWAS" analysis), and quantify the degree to which taxa sharing common attributes are phylogenetically clustered. A rapid k-mer based approach (Mash) was used to create reduced representations of selected Bacillus genomes, and a fast distance-based phylogenetic analysis of this data (FastME) was performed to determine which species should be included in B. cereus s. l. The complete genomes of eight B. cereus s. l. species were annotated de novo with Prokka, and these annotations were used by Roary to produce the B. cereus s. l. pan-genome. Scoary was used to associate gene presence and absence patterns with various phenotypes. The orthologous protein sequence clusters produced by Roary were filtered and used to build HaMStR databases of gene models that were used in turn to construct phylogenetic data matrices. Phylogenetic analyses used RAxML, DendroPy, ClonalFrameML, PAUP*, and SplitsTree. Bayesian model-based population genetic analysis assigned taxa to clusters using hierBAPS. The genealogical sorting index was used to quantify the phylogenetic clustering of taxa sharing common attributes. The B. cereus s. l. pan-genome currently consists of ≈60,000 genes, ≈600 of which are "core" (common to at least 99% of taxa sampled). Pan-GWAS analysis revealed genes associated with phenotypes such as isolation source, oxygen requirement, and ability to cause diseases such as anthrax or food poisoning. 
Extensive phylogenetic analyses using an unprecedented amount of data produced phylogenies that were largely concordant with each other and with previous studies. Phylogenetic support as measured by bootstrap probabilities increased markedly when all suitable pan-genome data was included in phylogenetic analyses, as opposed to when only core genes were used. Bayesian population genetic analysis recommended subdividing the three major clades of B. cereus s. l. into nine clusters. Taxa sharing common traits and species designations exhibited varying degrees of phylogenetic clustering. All phylogenetic analyses recapitulated two previously used classification systems, and taxa were consistently assigned to the same major clade and group. By including accessory genes from the pan-genome in the phylogenetic analyses, I produced an exceptionally well-supported phylogeny of 114 complete B. cereus s. l. genomes. The best-performing methods were used to produce a phylogeny of all 498 publicly available B. cereus s. l. genomes, which was in turn used to compare three different classification systems and to test the monophyly status of various B. cereus s. l. species. The majority of the methodology used in this study is generic and could be leveraged to produce pan-genome estimates and similarly robust phylogenetic hypotheses for other bacterial groups.
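The abstract defines "core" genes as those common to at least 99% of the taxa sampled. A minimal sketch of that thresholding step is shown here; it is not the Roary implementation, and the function name, gene names and genome labels are hypothetical.

```python
def split_pan_genome(
    presence: dict[str, set[str]],
    n_taxa: int,
    core_fraction: float = 0.99,
) -> tuple[list[str], list[str]]:
    """Split gene clusters into core and accessory lists.

    `presence` maps each gene cluster to the set of taxa carrying it.
    A cluster is "core" when it occurs in at least `core_fraction`
    of the `n_taxa` genomes; everything else is accessory.
    """
    if n_taxa <= 0:
        raise ValueError("n_taxa must be positive")
    core, accessory = [], []
    for gene, taxa in sorted(presence.items()):
        if len(taxa) / n_taxa >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Hypothetical presence/absence data for four genomes.
presence = {
    "gene_a": {"g1", "g2", "g3", "g4"},  # present everywhere
    "gene_b": {"g1", "g2", "g3"},        # missing from one genome
    "gene_c": {"g1"},                    # strain-specific
}
core, accessory = split_pan_genome(presence, n_taxa=4)
print("core:", core)
print("accessory:", accessory)
```

The pan-genome size is then simply the number of clusters in `presence`, and lowering `core_fraction` gives a more permissive "soft core" definition.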
Characterization of the complete mitochondrial genome of the cloacal tapeworm Cloacotaenia megalops (Cestoda: Hymenolepididae).

PubMed

Guo, Aijiang

2016-09-05

The cloacal tapeworm Cloacotaenia megalops (Hymenolepididae) is one of the most common cestode parasites of domestic and wild ducks worldwide. However, limited information is available regarding its epidemiology, biology, genetics and systematics. This study provides characterisation of the complete mitochondrial (mt) genome of C. megalops. The complete mt genome of C. megalops was obtained by long PCR, sequenced and annotated. The length of the entire mt genome of C. megalops is 13,887 bp; it contains 12 protein-coding, 2 ribosomal RNA and 22 transfer RNA genes, but lacks an atp8 gene. The mt gene arrangement of C. megalops is identical to that observed in Anoplocephala magna and A. perfoliata (Anoplocephalidae), Dipylidium caninum (Dipylidiidae) and Hymenolepis diminuta (Hymenolepididae), but differs from that reported in taeniids owing to the position shift between the tRNA (L1) and tRNA (S2) genes. The phylogenetic position of C. megalops was inferred using Maximum likelihood and Bayesian inference methods based on the concatenated amino acid data for 12 protein-coding genes. Phylogenetic trees showed that C. megalops is sister to Anoplocephala spp. (Anoplocephalidae) + Pseudanoplocephala crawfordi + Hymenolepis spp. (Hymenolepididae) indicating that the family Hymenolepididae is paraphyletic. The complete mt genome of C. megalops is sequenced. Phylogenetic analyses provided an insight into the phylogenetic relationships among the families Anoplocephalidae, Hymenolepididae, Dipylidiidae and Taeniidae. This novel genomic information also provides the opportunity to develop useful genetic markers for studying the molecular epidemiology, biology, genetics and systematics of C. megalops.
Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

PubMed Central

Logacheva, Maria D; Samigullin, Tahir H; Dhingra, Amit; Penin, Aleksey A

2008-01-01

Background Chloroplast genome sequences are extremely informative about species-interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. Results We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome are similar to those of Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationships of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on the first two codon positions Amborella united with Nymphaeales. Conclusion Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization as has been described for several other chloroplast genomes. However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. 
Phylogenetic analysis of the dataset, including this new sequence from non-core Caryophyllales, supports the sister relationship between Caryophyllales and asterids. PMID:18492277
Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

PubMed

Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

2016-01-04

The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete Genome Sequence of a Multidrug-Resistant Salmonella enterica Serovar Typhimurium var. 5− Strain Isolated from Chicken Breast

PubMed Central

Muruvanda, Tim; Allard, Marc W.; Korlach, Jonas; Roberts, Richard J.; Timme, Ruth; Payne, Justin; McDermott, Patrick F.; Evans, Peter; Meng, Jianghong; Brown, Eric W.; Zhao, Shaohua

2013-01-01

Salmonella enterica subsp. enterica serovar Typhimurium is a leading cause of salmonellosis. Here, we report a closed genome sequence, including sequences of 3 plasmids, of Salmonella serovar Typhimurium var. 5− CFSAN001921 (National Antimicrobial Resistance Monitoring System [NARMS] strain ID N30688), which was isolated from chicken breast meat and shows resistance to 10 different antimicrobials. Whole-genome and plasmid sequence analyses of this isolate will help enhance our understanding of this pathogenic multidrug-resistant serovar. PMID:24356834
The complete chloroplast DNA sequence of the green alga Nephroselmis olivacea: Insights into the architecture of ancestral chloroplast genomes

PubMed Central

Turmel, Monique; Otis, Christian; Lemieux, Claude

1999-01-01

Green plants seem to form two sister lineages: Chlorophyta, comprising the green algal classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae, and Chlorophyceae, and Streptophyta, comprising the Charophyceae and land plants. We have determined the complete chloroplast DNA (cpDNA) sequence (200,799 bp) of Nephroselmis olivacea, a member of the class (Prasinophyceae) thought to include descendants of the earliest-diverging green algae. The 127 genes identified in this genome represent the largest gene repertoire among the green algal and land plant cpDNAs completely sequenced to date. Of the Nephroselmis genes, 2 (ycf81 and ftsI, a gene involved in peptidoglycan synthesis) have not been identified in any previously investigated cpDNA; 5 genes [ftsW, rnE, ycf62, rnpB, and trnS(cga)] have been found only in cpDNAs of nongreen algae; and 10 others (ndh genes) have been described only in land plant cpDNAs. Nephroselmis and land plant cpDNAs share the same quadripartite structure—which is characterized by the presence of a large rRNA-encoding inverted repeat and two unequal single-copy regions—and very similar sets of genes in corresponding genomic regions. Given that our phylogenetic analyses place Nephroselmis within the Chlorophyta, these structural characteristics were most likely present in the cpDNA of the common ancestor of chlorophytes and streptophytes. Comparative analyses of chloroplast genomes indicate that the typical quadripartite architecture and gene-partitioning pattern of land plant cpDNAs are ancient features that may have been derived from the genome of the cyanobacterial progenitor of chloroplasts. Our phylogenetic data also offer insight into the chlorophyte ancestor of euglenophyte chloroplasts. PMID:10468594
Intraspecific and heteroplasmic variations, gene losses and inversions in the chloroplast genome of Astragalus membranaceus.

PubMed

Lei, Wanjun; Ni, Dapeng; Wang, Yujun; Shao, Junjie; Wang, Xincun; Yang, Dan; Wang, Jinsheng; Chen, Haimei; Liu, Chang

2016-02-22

Astragalus membranaceus is an important medicinal plant in Asia. Several of its varieties have been used interchangeably as raw materials for commercial production. High-resolution genetic markers are urgently needed to distinguish these varieties. Here, we sequenced and analyzed the chloroplast genome of A. membranaceus (Fisch.) Bunge var. mongholicus (Bunge) P.K. Hsiao using next-generation DNA sequencing technology. The genome was assembled using Abyss and then subjected to gene prediction using CPGAVAS and repeat analysis using MISA, Tandem Repeats Finder, and REPuter. Finally, the genome was subjected to phylogenetic and comparative genomic analyses. The complete genome is 123,582 bp long, containing only one copy of the inverted repeat. Gene prediction revealed 110 genes encoding 76 proteins, 30 tRNAs, and four rRNAs. Five intra-specific hypermutation loci were identified, three of which are heteroplasmic. Furthermore, three gene losses and two large inversions were identified. Comparative genomic analyses demonstrated the dynamic nature of the Papilionoideae chloroplast genomes, which showed occurrence of numerous hypermutation loci, frequent gene losses, and fragment inversions. Results obtained herein elucidate the complex evolutionary history of chloroplast genomes and have laid the foundation for the identification of genetic markers to distinguish A. membranaceus varieties.
The First Mitochondrial Genome for Caddisfly (Insecta: Trichoptera) with Phylogenetic Implications

PubMed Central

Wang, Yuyu; Liu, Xingyue; Yang, Ding

2014-01-01

The Trichoptera (caddisflies) is a holometabolous insect order with 14,300 described species forming the second most species-rich monophyletic group of animals in freshwater. Hitherto, no mitochondrial genome has been reported for this order. Herein, we describe the complete mitochondrial (mt) genome of a caddisfly species, Eubasilissa regina (McLachlan, 1871). A phylogenomic analysis was carried out based on the mt genomic sequences of 13 mt protein coding genes (PCGs) and two rRNA genes of 24 species belonging to eight holometabolous orders. Both maximum likelihood and Bayesian inference analyses highly support the sister relationship between Trichoptera and Lepidoptera. PMID:24391451
The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

2005-02-01

We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.
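In a quadripartite chloroplast genome, the total length is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below applies that sum to the region lengths quoted in this abstract and reports the difference from the quoted total. The function name is hypothetical, and figures in an abstract may be rounded or follow different boundary conventions, so a small nonzero difference is not by itself evidence of an error.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total length implied by LSC + SSC + two inverted-repeat copies (bp)."""
    return lsc + ssc + 2 * ir


# Region lengths as quoted in the abstract (bp).
huperzia_sum = quadripartite_total(lsc=104_088, ssc=19_671, ir=15_314)
reported_total = 154_373  # total genome length as quoted in the abstract
difference = huperzia_sum - reported_total

print(f"sum of regions: {huperzia_sum} bp")
print(f"reported total: {reported_total} bp")
print(f"difference:     {difference} bp")
```

The same check can be run on any annotated plastome as a quick sanity test of region boundaries before downstream comparisons.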
The complete chloroplast genome sequence of Cephalotaxus oliveri (Cephalotaxaceae): evolutionary comparison of Cephalotaxus chloroplast DNAs and insights into the loss of inverted repeat copies in gymnosperms.

PubMed

Yi, Xuan; Gao, Lei; Wang, Bo; Su, Ying-Juan; Wang, Ting

2013-01-01

We have determined the complete chloroplast (cp) genome sequence of Cephalotaxus oliveri. The genome is 134,337 bp in length, encodes 113 genes, and lacks inverted repeat (IR) regions. Genome-wide mutational dynamics have been investigated through comparative analysis of the cp genomes of C. oliveri and C. wilsoniana. Gene order transformation analyses indicate that when distinct isomers are considered as alternative structures for the ancestral cp genome of cupressophyte and Pinaceae lineages, it is not possible to distinguish between hypotheses favoring retention of the same IR region in cupressophyte and Pinaceae cp genomes from a hypothesis proposing independent loss of IRA and IRB. Furthermore, in cupressophyte cp genomes, the highly reduced IRs are replaced by short repeats that have the potential to mediate homologous recombination, analogous to the situation in Pinaceae. The importance of repeats in the mutational dynamics of cupressophyte cp genomes is also illustrated by the accD reading frame, which has undergone extreme length expansion in cupressophytes. This has been caused by a large insertion comprising multiple repeat sequences. Overall, we find that the distribution of repeats, indels, and substitutions is significantly correlated in Cephalotaxus cp genomes, consistent with a hypothesis that repeats play a role in inducing substitutions and indels in conifer cp genomes.

Estimation of main diversification time-points of hantaviruses using phylogenetic analyses of complete genomes.

PubMed

Castel, Guillaume; Tordo, Noël; Plyusnin, Alexander

2017-04-02

Because of the great variability of their reservoir hosts, hantaviruses are excellent models to evaluate the dynamics of virus-host co-evolution. Intriguing questions remain about the timescale of the diversification events that influenced this evolution. In this paper we attempted, for the first time, to estimate the timing of hantavirus diversification based on thirty-five available complete genomes representing five major groups of hantaviruses and the assumption of co-speciation of hantaviruses with their respective mammal hosts. Phylogenetic analyses were used to estimate the main diversification points during hantavirus evolution in mammals while host diversification was mostly estimated from independent calibrators taken from fossil records. Our results support an earlier developed hypothesis of co-speciation of known hantaviruses with their respective mammal hosts and hence a common ancestor for all hantaviruses carried by placental mammals. Copyright © 2017 Elsevier B.V. All rights reserved.
Genome Sequencing of the Phytoseiid Predatory Mite Metaseiulus occidentalis Reveals Completely Atomized Hox Genes and Superdynamic Intron Evolution

PubMed Central

Hoy, Marjorie A.; Waterhouse, Robert M.; Wu, Ke; Estep, Alden S.; Ioannidis, Panagiotis; Palmer, William J.; Pomerantz, Aaron F.; Simão, Felipe A.; Thomas, Jainy; Jiggins, Francis M.; Murphy, Terence D.; Pritham, Ellen J.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Gibbs, Richard A.; Richards, Stephen

2016-01-01

Metaseiulus occidentalis is an eyeless phytoseiid predatory mite employed for the biological control of agricultural pests including spider mites. Despite appearances, these predator and prey mites are separated by some 400 Myr of evolution and radically different lifestyles. We present a 152-Mb draft assembly of the M. occidentalis genome: Larger than that of its favored prey, Tetranychus urticae, but considerably smaller than those of many other chelicerates, enabling an extremely contiguous and complete assembly to be built—the best arachnid to date. Aided by transcriptome data, genome annotation cataloged 18,338 protein-coding genes and identified large numbers of Helitron transposable elements. Comparisons with other arthropods revealed a particularly dynamic and turbulent genomic evolutionary history. Its genes exhibit elevated molecular evolution, with strikingly high numbers of intron gains and losses, in stark contrast to the deer tick Ixodes scapularis. Uniquely among examined arthropods, this predatory mite’s Hox genes are completely atomized, dispersed across the genome, and it encodes five copies of the normally single-copy RNA processing Dicer-2 gene. Examining gene families linked to characteristic biological traits of this tiny predator provides initial insights into processes of sex determination, development, immune defense, and how it detects, disables, and digests its prey. As the first reference genome for the Phytoseiidae, and for any species with the rare sex determination system of parahaploidy, the genome of the western orchard predatory mite improves genomic sampling of chelicerates and provides invaluable new resources for functional genomic analyses of this family of agriculturally important mites. PMID:26951779
CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes

PubMed Central

Parks, Donovan H.; Imelfort, Michael; Skennerton, Connor T.; Hugenholtz, Philip; Tyson, Gene W.

2015-01-01

Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of “marker” genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities. PMID:25977477
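The idea of estimating completeness and contamination from marker genes can be shown with a deliberately simplified sketch. This is not CheckM's algorithm, which according to the abstract uses lineage-specific marker sets and gene collocation; here a single flat set of expected single-copy markers is assumed, and all names are hypothetical.

```python
from collections import Counter


def estimate_quality(
    marker_hits: list[str], expected_markers: set[str]
) -> tuple[float, float]:
    """Return (completeness %, contamination %) from single-copy marker hits.

    Completeness: share of expected markers seen at least once.
    Contamination: extra copies of markers, relative to the number
    of expected markers. Hits outside the expected set are ignored.
    """
    if not expected_markers:
        raise ValueError("expected_markers must not be empty")
    counts = Counter(hit for hit in marker_hits if hit in expected_markers)
    found = len(counts)
    extra_copies = sum(n - 1 for n in counts.values())
    completeness = 100.0 * found / len(expected_markers)
    contamination = 100.0 * extra_copies / len(expected_markers)
    return completeness, contamination


# Hypothetical draft genome: one marker missing, one marker duplicated.
expected = {"m1", "m2", "m3", "m4"}
hits = ["m1", "m2", "m2", "m3"]
completeness, contamination = estimate_quality(hits, expected)
print(f"completeness={completeness:.1f}% contamination={contamination:.1f}%")
```

A missing marker lowers completeness, while a duplicated single-copy marker suggests sequence from a second organism and raises contamination; real tools refine both estimates considerably.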
CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

PubMed

Parks, Donovan H; Imelfort, Michael; Skennerton, Connor T; Hugenholtz, Philip; Tyson, Gene W

2015-07-01

Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of "marker" genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities. © 2015 Parks et al.; Published by Cold Spring Harbor Laboratory Press.
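The completeness and contamination estimates described above can be illustrated with a deliberately simplified sketch. It assumes a flat list of single-copy marker genes and ignores the lineage-specific marker selection and gene collocation that the abstract attributes to CheckM, so it is not CheckM's actual algorithm. The function name and marker identifiers are hypothetical.

```python
from collections import Counter


def estimate_quality(marker_hits, marker_set):
    """Return (completeness %, contamination %) for one genome.

    marker_hits: iterable of marker IDs detected in the genome (repeats allowed).
    marker_set:  collection of expected single-copy marker IDs (hypothetical).
    Simplification: every marker is treated independently.
    """
    expected = set(marker_set)
    if not expected:
        raise ValueError("marker_set must not be empty")
    counts = Counter(m for m in marker_hits if m in expected)
    # Completeness: share of expected markers seen at least once.
    present = sum(1 for m in expected if counts[m] >= 1)
    # Contamination: extra copies of markers that should be single-copy.
    extra = sum(counts[m] - 1 for m in expected if counts[m] > 1)
    return 100.0 * present / len(expected), 100.0 * extra / len(expected)


completeness, contamination = estimate_quality(
    ["m1", "m2", "m2", "m3"], ["m1", "m2", "m3", "m4"]
)
print(completeness, contamination)  # 75.0 25.0
```

In this toy case three of four markers are found (75% complete) and one marker appears twice (25% contamination).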
Transcriptional and phylogenetic analysis of five complete ambystomatid salamander mitochondrial genomes.

PubMed

Samuels, Amy K; Weisrock, David W; Smith, Jeramiah J; France, Katherine J; Walker, John A; Putta, Srikrishna; Voss, S Randal

2005-04-11

We report on a study that extended mitochondrial transcript information from a recent EST project to obtain complete mitochondrial genome sequence for 5 tiger salamander complex species (Ambystoma mexicanum, A. t. tigrinum, A. andersoni, A. californiense, and A. dumerilii). We describe, for the first time, aspects of mitochondrial transcription in a representative amphibian, and then use complete mitochondrial sequence data to examine salamander phylogeny at both deep and shallow levels of evolutionary divergence. The available mitochondrial ESTs for A. mexicanum (N=2481) and A. t. tigrinum (N=1205) provided 92% and 87% coverage of the mitochondrial genome, respectively. Complete mitochondrial sequences for all species were rapidly obtained by using long distance PCR and DNA sequencing. A number of genome structural characteristics (base pair length, base composition, gene number, gene boundaries, codon usage) were highly similar among all species and to other distantly related salamanders. Overall, mitochondrial transcription in Ambystoma approximated the pattern observed in other vertebrates. We inferred from the mapping of ESTs onto mtDNA that transcription occurs from both heavy and light strand promoters and continues around the entire length of the mtDNA, followed by post-transcriptional processing. However, the observation of many short transcripts corresponding to rRNA genes indicates that transcription may often terminate prematurely to bias transcription of rRNA genes; indeed an rRNA transcription termination signal sequence was observed immediately following the 16S rRNA gene. Phylogenetic analyses of salamander family relationships consistently grouped Ambystomatidae in a clade containing Cryptobranchidae and Hynobiidae, to the exclusion of Salamandridae. This robust result suggests a novel alternative hypothesis because previous studies have consistently identified Ambystomatidae and Salamandridae as closely related taxa. 
Phylogenetic analyses of tiger salamander complex species also produced robustly supported trees. The D-loop, used in previous molecular phylogenetic studies of the complex, was found to contain a relatively low level of variation and we identified mitochondrial regions with higher rates of molecular evolution that are more useful in resolving relationships among species. Our results show the benefit of using complete genome mitochondrial information in studies of recently and rapidly diverged taxa.
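The EST coverage figures quoted above (92% and 87% of the mitochondrial genome) are the kind of value one obtains by merging mapped transcript intervals and dividing by genome length. The sketch below shows that calculation under stated assumptions: intervals are 0-based and half-open, the circular genome is treated as linear (no wrap-around), and the coordinates are invented for illustration rather than taken from the study.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of a genome covered by at least one mapped interval.

    intervals: iterable of (start, end) pairs, 0-based, half-open.
    genome_length: total genome length in bp.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end, 0)  # skip bases already counted
        end = min(end, genome_length)       # clip to the genome
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


# Hypothetical EST alignments on a 100-bp toy genome.
print(covered_fraction([(0, 50), (40, 80), (90, 100)], 100))  # 0.9
```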
Ecdysozoan mitogenomics: evidence for a common origin of the legged invertebrates, the Panarthropoda.

PubMed

Rota-Stabelli, Omar; Kayal, Ehsan; Gleeson, Dianne; Daub, Jennifer; Boore, Jeffrey L; Telford, Maximilian J; Pisani, Davide; Blaxter, Mark; Lavrov, Dennis V

2010-07-12

Ecdysozoa is the recently recognized clade of molting animals that comprises the vast majority of extant animal species and the most important invertebrate model organisms--the fruit fly and the nematode worm. Evolutionary relationships within the ecdysozoans remain, however, unresolved, impairing the correct interpretation of comparative genomic studies. In particular, the affinities of the three Panarthropoda phyla (Arthropoda, Onychophora, and Tardigrada) and the position of Myriapoda within Arthropoda (Mandibulata vs. Myriochelata hypothesis) are among the most contentious issues in animal phylogenetics. To elucidate these relationships, we have determined and analyzed complete or nearly complete mitochondrial genome sequences of two Tardigrada, Hypsibius dujardini and Thulinia sp. (the first genomes to date for this phylum); one Priapulida, Halicryptus spinulosus; and two Onychophora, Peripatoides sp. and Epiperipatus biolleyi; and a partial mitochondrial genome sequence of the Onychophora Euperipatoides kanagrensis. Tardigrada mitochondrial genomes resemble those of the arthropods in terms of gene order and strand asymmetry, whereas Onychophora genomes are characterized by numerous gene order rearrangements and strand asymmetry variations. In addition, Onychophora genomes are extremely enriched in A and T nucleotides, whereas Priapulida and Tardigrada are more balanced. Phylogenetic analyses based on concatenated amino acid coding sequences support a monophyletic origin of the Ecdysozoa and the position of Priapulida as the sister group of a monophyletic Panarthropoda (Tardigrada plus Onychophora plus Arthropoda). The position of Tardigrada is more problematic, most likely because of long branch attraction (LBA). However, experiments designed to reduce LBA suggest that the most likely placement of Tardigrada is as a sister group of Onychophora.
The same analyses also recover monophyly of traditionally recognized arthropod lineages such as Arachnida and of the highly debated clade Mandibulata.
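Nucleotide enrichment and strand asymmetry of the kind discussed above are commonly summarized with A+T content and skew statistics. The sketch below uses the widely used definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C); the abstract does not state which measures the authors applied, so this choice is an assumption, and the input sequence is a toy example.

```python
def composition(seq):
    """Return A+T content, AT skew and GC skew for one strand of a sequence.

    Ambiguous bases (anything other than A, C, G, T) are ignored.
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    total = a + t + g + c
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_content": (a + t) / total,
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }


stats = composition("AAATGC")
print(stats["at_skew"], stats["gc_skew"])  # 0.5 0.0
```

A strongly A+T-enriched genome would give an `at_content` close to 1, while skews near 0 indicate a balanced strand.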
Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the Panarthropoda

PubMed Central

Rota-Stabelli, Omar; Kayal, Ehsan; Gleeson, Dianne; Daub, Jennifer; Boore, Jeffrey L.; Telford, Maximilian J.; Pisani, Davide; Blaxter, Mark; Lavrov, Dennis V.

2010-01-01

Ecdysozoa is the recently recognized clade of molting animals that comprises the vast majority of extant animal species and the most important invertebrate model organisms: the fruit fly and the nematode worm. Evolutionary relationships within the ecdysozoans remain, however, unresolved, impairing the correct interpretation of comparative genomic studies. In particular, the affinities of the three Panarthropoda phyla (Arthropoda, Onychophora, and Tardigrada) and the position of Myriapoda within Arthropoda (Mandibulata vs. Myriochelata hypothesis) are among the most contentious issues in animal phylogenetics. To elucidate these relationships, we have determined and analyzed complete or nearly complete mitochondrial genome sequences of two Tardigrada, Hypsibius dujardini and Thulinia sp. (the first genomes to date for this phylum); one Priapulida, Halicryptus spinulosus; and two Onychophora, Peripatoides sp. and Epiperipatus biolleyi; and a partial mitochondrial genome sequence of the Onychophora Euperipatoides kanagrensis. Tardigrada mitochondrial genomes resemble those of the arthropods in terms of gene order and strand asymmetry, whereas Onychophora genomes are characterized by numerous gene order rearrangements and strand asymmetry variations. In addition, Onychophora genomes are extremely enriched in A and T nucleotides, whereas Priapulida and Tardigrada are more balanced. Phylogenetic analyses based on concatenated amino acid coding sequences support a monophyletic origin of the Ecdysozoa and the position of Priapulida as the sister group of a monophyletic Panarthropoda (Tardigrada plus Onychophora plus Arthropoda). The position of Tardigrada is more problematic, most likely because of long branch attraction (LBA). However, experiments designed to reduce LBA suggest that the most likely placement of Tardigrada is as a sister group of Onychophora.
The same analyses also recover monophyly of traditionally recognized arthropod lineages such as Arachnida and of the highly debated clade Mandibulata. PMID:20624745
Identification of RAN1 orthologue associated with sex determination through whole genome sequencing analysis in fig (Ficus carica L.).

PubMed

Mori, Kazuki; Shirasawa, Kenta; Nogata, Hitoshi; Hirata, Chiharu; Tashiro, Kosuke; Habu, Tsuyoshi; Kim, Sangwan; Himeno, Shuichi; Kuhara, Satoru; Ikegami, Hidetoshi

2017-01-25

With the aim of identifying sex determinants of fig, we generated the first draft genome sequence of fig and conducted subsequent analyses. Linkage analysis with a high-density genetic map established by a restriction-site associated sequencing technique, and a genome-wide association study followed by whole-genome resequencing analysis, identified two missense mutations in the RESPONSIVE-TO-ANTAGONIST1 (RAN1) orthologue, which encodes a copper-transporting ATPase, that were completely associated with the sex phenotypes of the investigated figs. This result suggests that RAN1 is a possible sex determinant candidate in the fig genome. The genomic resources and genetic findings obtained in this study can contribute to a general understanding of Ficus species and provide insight into the sex determination systems of fig and other plants.
The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation.

PubMed

McNeil, Leslie Klis; Reich, Claudia; Aziz, Ramy K; Bartels, Daniela; Cohoon, Matthew; Disz, Terry; Edwards, Robert A; Gerdes, Svetlana; Hwang, Kaitlyn; Kubal, Michael; Margaryan, Gohar Rem; Meyer, Folker; Mihalo, William; Olsen, Gary J; Olson, Robert; Osterman, Andrei; Paarmann, Daniel; Paczian, Tobias; Parrello, Bruce; Pusch, Gordon D; Rodionov, Dmitry A; Shi, Xinghua; Vassieva, Olga; Vonstein, Veronika; Zagnitko, Olga; Xia, Fangfang; Zinner, Jenifer; Overbeek, Ross; Stevens, Rick

2007-01-01

The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of approximately 50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic Domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes that two groups of organisms have in common or that differentiates them. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development.
Whole genomic DNA sequencing and comparative genomic analysis of Arthrospira platensis: high genome plasticity and genetic diversity

PubMed Central

Xu, Teng; Qin, Song; Hu, Yongwu; Song, Zhijian; Ying, Jianchao; Li, Peizhen; Dong, Wei; Zhao, Fangqing; Yang, Huanming; Bao, Qiyu

2016-01-01

Arthrospira platensis is a multi-cellular and filamentous non-N2-fixing cyanobacterium that is capable of performing oxygenic photosynthesis. In this study, we determined the nearly complete genome sequence of A. platensis YZ. The A. platensis YZ genome is a single, circular chromosome of 6.62 Mb in size. Phylogenetic and comparative genomic analyses revealed that A. platensis YZ was more closely related to A. platensis NIES-39 than to Arthrospira sp. PCC 8005 and A. platensis C1. Broad gene gains were identified between A. platensis YZ and three other Arthrospira species; some of the gained genes, such as those encoding restriction-modification systems, have previously been shown to be laterally transferable among different species. Moreover, unprecedented extensive chromosomal rearrangements among different strains were observed. The chromosomal rearrangements, particularly the chromosomal inversions, were analysed and estimated to be closely related to palindromes that involved long inverted repeat sequences and to the extensively distributed type IIR restriction enzyme in the Arthrospira genome. In addition, species from the genus Arthrospira consistently contained the highest rate of repetitive sequence compared with the other species of the order Oscillatoriales, suggesting that sequence duplication significantly contributed to Arthrospira genome phylogeny. These results provide in-depth views into the genomic phylogeny and structural variation of A. platensis, as well as a valuable resource for functional genomics studies. PMID:27330141
Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity.

PubMed

Edger, Patrick P; VanBuren, Robert; Colle, Marivi; Poorten, Thomas J; Wai, Ching Man; Niederhuth, Chad E; Alger, Elizabeth I; Ou, Shujun; Acharya, Charlotte B; Wang, Jie; Callow, Pete; McKain, Michael R; Shi, Jinghua; Collier, Chad; Xiong, Zhiyong; Mower, Jeffrey P; Slovin, Janet P; Hytönen, Timo; Jiang, Ning; Childs, Kevin L; Knapp, Steven J

2018-02-01

Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology. Here we utilized a robust, cost-effective approach to produce high-quality reference genomes. We report a near-complete genome of diploid woodland strawberry (Fragaria vesca) using single-molecule real-time sequencing from Pacific Biosciences (PacBio). This assembly has a contig N50 length of ∼7.9 million base pairs (Mb), representing a ∼300-fold improvement of the previous version. The vast majority (>99.8%) of the assembly was anchored to 7 pseudomolecules using 2 sets of optical maps from Bionano Genomics. We obtained ∼24.96 Mb of sequence not present in the previous version of the F. vesca genome and produced an improved annotation that includes 1496 new genes. Comparative syntenic analyses uncovered numerous, large-scale scaffolding errors present in each chromosome in the previously published version of the F. vesca genome. Our results highlight the need to improve existing short-read based reference genomes. Furthermore, we demonstrate how genome quality impacts commonly used analyses for addressing both fundamental and applied biological questions. © The Authors 2017. Published by Oxford University Press.
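The contig N50 reported above is a standard contiguity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. A minimal sketch of that calculation follows; the contig lengths are toy values, not data from the F. vesca assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            # Contigs this long or longer hold at least half of the assembly.
            return length


print(n50([2, 3, 4, 5, 6]))  # 5
```

Here the assembly totals 20 units, and the two longest contigs (6 and 5) already sum to 11, so the N50 is 5.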
The complete mitochondrial genomes of five Eimeria species infecting domestic rabbits.

PubMed

Liu, Guo-Hua; Tian, Si-Qin; Cui, Ping; Fang, Su-Fang; Wang, Chun-Ren; Zhu, Xing-Quan

2015-12-01

Rabbit coccidiosis caused by members of the genus Eimeria can have an enormous economic impact worldwide, but the genetics, epidemiology and biology of these parasites remain poorly understood. In the present study, we sequenced and annotated the complete mitochondrial (mt) genomes of five Eimeria species that commonly infect domestic rabbits. The complete mt genomes of Eimeria intestinalis, Eimeria flavescens, Eimeria media, Eimeria vejdovskyi and Eimeria irresidua were 6261 bp, 6258 bp, 6168 bp, 6254 bp and 6259 bp in length, respectively. All of the mt genomes consist of 3 protein-coding genes (cytb, cox1, and cox3), 14 gene fragments for the large subunit (LSU) rRNA and 11 gene fragments for the small subunit (SSU) rRNA, but no transfer RNA (tRNA) genes. The gene order of the mt genomes is similar to that of Plasmodium, but distinct from Haemosporida and Theileria. Bayesian phylogenetic analyses based on full nucleotide sequences revealed that the monophyly of the Eimeria of rabbits was strongly supported by Bayesian posterior probabilities. These data provide novel mtDNA markers for studying the population genetics and molecular epidemiology of the Eimeria species, and should have implications for the molecular diagnosis, prevention and control of coccidiosis in rabbits. Copyright © 2015 Elsevier Inc. All rights reserved.
Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

PubMed

Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

2017-01-01

The development of next-generation sequencing (NGS) technology makes it possible to sequence whole exomes or genomes. However, data analysis is still the biggest bottleneck to its wide implementation. Most laboratories still depend on manual procedures for data handling and analysis, which translates into delays and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analysis controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps of genetic analysis, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program, ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained using multi-step, manual pipelines.
The evolutionary history of bears is characterized by gene flow across species

PubMed Central

Kumar, Vikas; Lammers, Fritjof; Bidon, Tobias; Pfenninger, Markus; Kolter, Lydia; Nilsson, Maria A.; Janke, Axel

2017-01-01

Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow. PMID:28422140
The evolutionary history of bears is characterized by gene flow across species.

PubMed

Kumar, Vikas; Lammers, Fritjof; Bidon, Tobias; Pfenninger, Markus; Kolter, Lydia; Nilsson, Maria A; Janke, Axel

2017-04-19

Bears are iconic mammals with a complex evolutionary history. Natural bear hybrids and studies of few nuclear genes indicate that gene flow among bears may be more common than expected and not limited to polar and brown bears. Here we present a genome analysis of the bear family with representatives of all living species. Phylogenomic analyses of 869 mega base pairs divided into 18,621 genome fragments yielded a well-resolved coalescent species tree despite signals for extensive gene flow across species. However, genome analyses using different statistical methods show that gene flow is not limited to closely related species pairs. Strong ancestral gene flow between the Asiatic black bear and the ancestor to polar, brown and American black bear explains uncertainties in reconstructing the bear phylogeny. Gene flow across the bear clade may be mediated by intermediate species such as the geographically wide-spread brown bears leading to large amounts of phylogenetic conflict. Genome-scale analyses lead to a more complete understanding of complex evolutionary processes. Evidence for extensive inter-specific gene flow, found also in other animal species, necessitates shifting the attention from speciation processes achieving genome-wide reproductive isolation to the selective processes that maintain species divergence in the face of gene flow.
[Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

PubMed

Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

2010-01-01

By sequencing the full-length genomes of four rabies viruses isolated from Chinese ferret-badger and dog, we analyzed the genetic variation of rabies viruses at the molecular level, obtained information on rabies virus prevalence and variation in Zhejiang, and enriched the genome database of rabies virus street strains isolated in China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR, and full-length genomes were assembled; nucleotide and deduced protein similarities and phylogenetic relationships were then determined for strains from Chinese ferret-badger, dog, sika deer and vole and for the vaccine strain in use. The four full-length genomes were sequenced completely and shared the same genetic structure, with a length of 11,923 nt or 11,925 nt, including a 58-nt leader, NP (1353 nt), PP (894 nt), MP (609 nt), GP (1575 nt), LP (6386 nt), intergenic regions (IGRs) of 2, 5 and 5 nt, a 423-nt pseudogene-like sequence (psi), and a 70-nt trailer. BLAST and multiple sequence alignment showed that the four genomes were consistent with the properties of the genus Lyssavirus in the family Rhabdoviridae. Nucleotide and amino acid similarities were highest among the Chinese strains, especially among strains from the same animal species. Among the four full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, indicating that most of the nucleotide mutations in these genomes were synonymous. Compared with the reference rabies viruses, the lengths of the five protein-coding regions were unchanged and no recombination was found; only a few point mutations were present. The five proteins therefore appear to be stable. The variation sites and types of the four genomes were similar to those of the reference vaccine or street strains. Multiple sequence alignment and phylogenetic analyses placed the four strains in genotype 1, with characteristics distinctive of strains from China. Therefore, these four rabies viruses are likely to be street viruses already circulating in nature.
The complete mitochondrial genomes of two band-winged grasshoppers, Gastrimargus marmoratus and Oedaleus asiaticus

PubMed Central

Ma, Chuan; Liu, Chunxiang; Yang, Pengcheng; Kang, Le

2009-01-01

Background: The two closely related species of band-winged grasshoppers, Gastrimargus marmoratus and Oedaleus asiaticus, display significant differences in distribution, biological characteristics and habitat preferences. They are so similar to their respective congeneric species that it is difficult to differentiate them from other species within each genus. Hoppers of the two species have quite similar morphologies to that of Locusta migratoria, hence causing confusion in species identification. Thus we determined and compared the mitochondrial genomes of G. marmoratus and O. asiaticus to address these questions. Results: The complete mitochondrial genomes of G. marmoratus and O. asiaticus are 15,924 bp and 16,259 bp in size, respectively, with O. asiaticus being the largest among all known mitochondrial genomes in Orthoptera. Both mitochondrial genomes contain a standard set of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and an A+T-rich region in the same order as those of the other analysed caeliferan species, but different from those of the ensiferan species by the rearrangement of trnD and trnK. The putative initiation codon for the cox1 gene in the two species is ATC. The presence of different sized tandem repeats in the A+T-rich region leads to size variation between their mitochondrial genomes. Except for nad2, nad4L, and nad6, most of the caeliferan mtDNA genes exhibit low levels of divergence. In phylogenetic analyses, the species from the suborder Caelifera form a monophyletic group, as is the case for the Ensifera. Furthermore, the two suborders cluster as sister groups, supporting the monophyly of Orthoptera. Conclusion: The mitochondrial genomes of both G. marmoratus and O. asiaticus harbor the typical 37 genes and an A+T-rich region, exhibiting similar characters to those of other grasshopper species. Characterization of the two mitochondrial genomes has enriched our knowledge on mitochondrial genomes of Orthoptera.
PMID:19361334
The coffee genome hub: a resource for coffee genomes

PubMed Central

Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan

2015-01-01

The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. PMID:25392413
The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

PubMed Central

Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

2016-01-01

Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. Here we sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly. Combined with the previously reported cp genome sequence of E. koreanum, this is the first comprehensive cp genome analysis of Epimedium. The five Epimedium cp genomes exhibited a typical quadripartite, circular structure that was rather conserved in genomic structure and gene order synteny. However, these cp genomes showed obvious variation at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes but was not found in the other basal eudicotyledons. Rapidly evolving cp genome regions were detected among the five cp genomes, and differences in simple sequence repeats (SSRs) and repeat sequences were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes agreed on the whole with the updated system of the genus, but indicated that the evolutionary relationships and divisions of the genus need further investigation with more evidence. These cp genomes provide valuable genetic information for accurate species identification, taxonomy, phylogenetic resolution and evolution of Epimedium, and will assist in the exploration and utilization of Epimedium plants. PMID:27014326
A genome-wide screening of BEL-Pao like retrotransposons in Anopheles gambiae by the LTR_STRUC program.

PubMed

Marsano, Renè Massimiliano; Caizzi, Ruggiero

2005-09-12

The advanced status of assembly of the nematoceran Anopheles gambiae genomic sequence allowed us to perform a genome-wide analysis looking for the presence of Long Terminal Repeats (LTRs) in the range of 10 kb by means of the LTR_STRUC tool. More than three hundred sequences were retrieved, and 210 were treated as putative complete retrotransposons that were individually analysed with respect to known retrotransposons of A. gambiae and D. melanogaster. The results show that the vast majority of the retrotransposons analysed belong to the Ty3/gypsy class and only 8% to the Ty1/copia class. In addition, phylogenetic analysis allowed us to characterize in more detail the relationships within a large BEL-Pao lineage, in which a single family was shown to harbour an additional env gene.

Long-read whole genome sequencing and comparative analysis of six strains of the human pathogen Orientia tsutsugamushi.

PubMed

Batty, Elizabeth M; Chaemchuen, Suwittra; Blacksell, Stuart; Richards, Allen L; Paris, Daniel; Bowden, Rory; Chan, Caroline; Lachumanan, Ramkumar; Day, Nicholas; Donnelly, Peter; Chen, Swaine; Salje, Jeanne

2018-06-01

Orientia tsutsugamushi is a clinically important but neglected obligate intracellular bacterial pathogen of the Rickettsiaceae family that causes the potentially life-threatening human disease scrub typhus. In contrast to the genome reduction seen in many obligate intracellular bacteria, early genetic studies of Orientia have revealed one of the most repetitive bacterial genomes sequenced to date. The dramatic expansion of mobile elements has hampered efforts to generate complete genome sequences using short read sequencing methodologies, and consequently there have been few studies of the comparative genomics of this neglected species. We report new high-quality genomes of O. tsutsugamushi, generated using PacBio single molecule long read sequencing, for six strains: Karp, Kato, Gilliam, TA686, UT76 and UT176. In comparative genomics analyses of these strains together with existing reference genomes from Ikeda and Boryong strains, we identify a relatively small core genome of 657 genes, grouped into core gene islands and separated by repeat regions, and use the core genes to infer the first whole-genome phylogeny of Orientia. Complete assemblies of multiple Orientia genomes verify initial suggestions that these are remarkable organisms. They have larger genomes compared with most other Rickettsiaceae, with widespread amplification of repeat elements and massive chromosomal rearrangements between strains. At the gene level, Orientia has a relatively small set of universally conserved genes, similar to other obligate intracellular bacteria, and the relative expansion in genome size can be accounted for by gene duplication and repeat amplification. Our study demonstrates the utility of long read sequencing to investigate complex bacterial genomes and characterise genomic variation.
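A core genome such as the 657 genes reported above is, conceptually, the set of gene families present in every strain compared. The sketch below shows that set intersection under the assumption that genes have already been clustered into ortholog groups; the strain names come from the abstract, but the gene identifiers are invented and the study's actual clustering method is not described here.

```python
def core_genome(gene_sets):
    """Return the ortholog groups shared by every strain.

    gene_sets: dict mapping strain name -> iterable of ortholog-group IDs.
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical ortholog-group IDs for three of the sequenced strains.
strains = {
    "Karp": {"og1", "og2", "og3", "og7"},
    "Kato": {"og1", "og2", "og4"},
    "Gilliam": {"og1", "og2", "og3"},
}
print(sorted(core_genome(strains)))  # ['og1', 'og2']
```

Adding more strains can only keep the core the same size or shrink it, which is one reason a highly rearranged, repeat-rich genome can still have a small conserved core.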
Complete chloroplast genome of Tetragonia tetragonioides: Molecular phylogenetic relationships and evolution in Caryophyllales.

PubMed

Choi, Kyoung Su; Kwak, Myounghai; Lee, Byoungyoon; Park, SeonJoo

2018-01-01

The chloroplast genome of Tetragonia tetragonioides (Aizoaceae; Caryophyllales) was sequenced to provide information for studies on phylogeny and evolution within Caryophyllales. The chloroplast genome of Tetragonia tetragonioides is 149,506 bp in length and includes a pair of inverted repeats (IRs) of 24,769 bp that separate a large single copy (LSC) region of 82,780 bp and a small single copy (SSC) region of 17,188 bp. Comparative analysis of the chloroplast genome showed that Caryophyllales species have lost many genes. In particular, the rpl2 intron and infA gene were not found in T. tetragonioides, and core Caryophyllales lack the rpl2 intron. Phylogenetic analyses were conducted using 55 genes in 16 complete chloroplast genomes. Caryophyllales was found to divide into two clades: core Caryophyllales and noncore Caryophyllales. The genus Tetragonia is closely related to Mesembryanthemum. Comparisons of the synonymous (Ks), nonsynonymous (Ka), and Ka/Ks substitution rates revealed that nonsynonymous substitution rates were lower than synonymous substitution rates and that Ka/Ks ratios were less than 1. The findings of the present study suggest that most genes are under purifying selection.
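The region lengths quoted in this abstract can be cross-checked against the total genome length, because a quadripartite chloroplast genome is made up of the LSC, the SSC and two IR copies. A minimal sketch in Python, using only the figures given above; the function name is illustrative and does not come from the paper:

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) quoted in the abstract for Tetragonia tetragonioides.
total = quadripartite_length(lsc=82_780, ssc=17_188, ir=24_769)
print(total)  # 149506, which matches the reported genome length
```

The same check can be applied to any record in this list that reports LSC, SSC and IR lengths alongside a total.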
The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

PubMed

Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

2015-01-01

Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.
Molecular epidemiology of Plum pox virus in Japan.

PubMed

Maejima, Kensaku; Himeno, Misako; Komatsu, Ken; Takinami, Yusuke; Hashimoto, Masayoshi; Takahashi, Shuichiro; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou

2011-05-01

For a molecular epidemiological study based on complete genome sequences, 37 Plum pox virus (PPV) isolates were collected from the Kanto region in Japan. Pair-wise analyses revealed that all 37 Japanese isolates belong to the PPV-D strain, with low genetic diversity (less than 0.8%). In phylogenetic analysis of the PPV-D strain based on complete nucleotide sequences, the relationships of the PPV-D strain were reconstructed with high resolution: at the global level, the American, Canadian, and Japanese isolates formed their own distinct monophyletic clusters, suggesting that the routes of viral entry into these countries were independent; at the local level, the actual transmission histories of PPV were precisely reconstructed with high bootstrap support. This is the first description of the molecular epidemiology of PPV based on complete genome sequences.
Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses.

PubMed

Roux, Simon; Brum, Jennifer R; Dutilh, Bas E; Sunagawa, Shinichi; Duhaime, Melissa B; Loy, Alexander; Poulos, Bonnie T; Solonenko, Natalie; Lara, Elena; Poulain, Julie; Pesant, Stéphane; Kandels-Lewis, Stefanie; Dimier, Céline; Picheral, Marc; Searson, Sarah; Cruaud, Corinne; Alberti, Adriana; Duarte, Carlos M; Gasol, Josep M; Vaqué, Dolors; Bork, Peer; Acinas, Silvia G; Wincker, Patrick; Sullivan, Matthew B

2016-09-29

Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Owing to challenges with the sampling and cultivation of viruses, genome-level viral diversity remains poorly described and grossly understudied, with less than 1% of observed surface-ocean viruses known. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting 'global ocean virome' dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts. A total of 15,222 epipelagic and mesopelagic viral populations were identified, comprising 867 viral clusters (defined as approximately genus-level groups). This roughly triples the number of known ocean viral populations and doubles the number of candidate bacterial and archaeal virus genera, providing a near-complete sampling of epipelagic communities at both the population and viral-cluster level. We found that 38 of the 867 viral clusters were locally or globally abundant, together accounting for nearly half of the viral populations in any global ocean virome sample. While two-thirds of these clusters represent newly described viruses lacking any cultivated representative, most could be computationally linked to dominant, ecologically relevant microbial hosts. Moreover, we identified 243 viral-encoded auxiliary metabolic genes, of which only 95 were previously known. Deeper analyses of four of these auxiliary metabolic genes (dsrC, soxYZ, P-II (also known as glnB) and amoC) revealed that abundant viruses may directly manipulate sulfur and nitrogen cycling throughout the epipelagic ocean. 
This viral catalog and functional analyses provide a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.
Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses

NASA Astrophysics Data System (ADS)

2016-09-01

Ocean microbes drive biogeochemical cycling on a global scale. However, this cycling is constrained by viruses that affect community composition, metabolic activity, and evolutionary trajectories. Owing to challenges with the sampling and cultivation of viruses, genome-level viral diversity remains poorly described and grossly understudied, with less than 1% of observed surface-ocean viruses known. Here we assemble complete genomes and large genomic fragments from both surface- and deep-ocean viruses sampled during the Tara Oceans and Malaspina research expeditions, and analyse the resulting ‘global ocean virome’ dataset to present a global map of abundant, double-stranded DNA viruses complete with genomic and ecological contexts. A total of 15,222 epipelagic and mesopelagic viral populations were identified, comprising 867 viral clusters (defined as approximately genus-level groups). This roughly triples the number of known ocean viral populations and doubles the number of candidate bacterial and archaeal virus genera, providing a near-complete sampling of epipelagic communities at both the population and viral-cluster level. We found that 38 of the 867 viral clusters were locally or globally abundant, together accounting for nearly half of the viral populations in any global ocean virome sample. While two-thirds of these clusters represent newly described viruses lacking any cultivated representative, most could be computationally linked to dominant, ecologically relevant microbial hosts. Moreover, we identified 243 viral-encoded auxiliary metabolic genes, of which only 95 were previously known. Deeper analyses of four of these auxiliary metabolic genes (dsrC, soxYZ, P-II (also known as glnB) and amoC) revealed that abundant viruses may directly manipulate sulfur and nitrogen cycling throughout the epipelagic ocean. 
This viral catalog and functional analyses provide a necessary foundation for the meaningful integration of viruses into ecosystem models where they act as key players in nutrient cycling and trophic networks.
The complete mitochondrial genome of the scab mite Psoroptes cuniculi (Arthropoda: Arachnida) provides insights into Acari phylogeny

PubMed Central

2014-01-01

Background: Limited available sequence information has greatly impeded population genetics, phylogenetics and systematics studies in the subclass Acari (mites and ticks). Mitochondrial (mt) DNA is well known to provide genetic markers for investigations in these areas, but complete mt genomic data have been lacking for many Acari species. Herein, we present the complete mt genome of the scab mite Psoroptes cuniculi. Methods: P. cuniculi was collected from a naturally infected New Zealand white rabbit from China and identified by morphological criteria. The complete mt genome of P. cuniculi was amplified by PCR and then sequenced. The relationships of this scab mite with selected members of the Acari were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI), maximum likelihood (ML) and maximum parsimony (MP). Results: This mt genome (14,247 bp) is circular and consists of 37 genes, including 13 genes for proteins, 22 genes for tRNA and 2 genes for rRNA. The gene arrangement in the mt genome of P. cuniculi is the same as those of Dermatophagoides farinae (Pyroglyphidae) and Aleuroglyphus ovatus (Acaridae), but distinct from those of Steganacarus magnus (Steganacaridae) and Panonychus citri (Tetranychidae). Phylogenetic analyses using concatenated amino acid sequences of 12 protein-coding genes, with three different computational algorithms (BI, ML and MP), showed the division of subclass Acari into two superorders and supported the monophyly of both superorders (Parasitiformes and Acariformes) and of the three orders Ixodida, Mesostigmata and Astigmata, but rejected the monophyly of the order Prostigmata. Conclusions: The mt genome of P. cuniculi represents the first mt genome of any member of the family Psoroptidae. Analysis of mt genome sequences in the present study has provided new insights into the phylogenetic relationships among several major lineages of Acari species. PMID:25052180
Human genome and open source: balancing ethics and business.

PubMed

Marturano, Antonio

2011-01-01

The Human Genome Project was completed thanks to the massive use of computer techniques, as well as the adoption of the open-source business and research model by the scientists involved. This model won out over the proprietary model and allowed quick propagation of, and feedback on, research results among peers. In this paper, the author analyses some ethical and legal issues emerging from the use of such a computer model in Human Genome property rights. The author argues that Open Source is the best business model, as it is able to balance business and human rights perspectives.
Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

PubMed

Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

2014-01-01

Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide basis, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3' terminus of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed that these regions performed identically to the full-length genome in genotyping, dividing the HEV isolates involved into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.
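One way to picture the comparison described in this abstract is to correlate the pairwise evolutionary distances computed from a partial region with those computed from the full-length genome. The sketch below uses a plain Pearson correlation as a stand-in; the abstract does not name the statistic the authors used, and all distance values and variable names here are hypothetical.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length lists of distances."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical pairwise distances for the same five isolate pairs.
full_length = [0.10, 0.25, 0.27, 0.05, 0.22]
region_a = [0.11, 0.24, 0.29, 0.06, 0.20]  # tracks the full-length distances
region_b = [0.02, 0.03, 0.02, 0.03, 0.02]  # carries little genotype signal

r_a = pearson(full_length, region_a)
r_b = pearson(full_length, region_b)
print(f"region_a r = {r_a:.3f}, region_b r = {r_b:.3f}")
```

Under this reading, a region whose distances correlate strongly with the full-length distances is a better candidate for partial-sequence genotyping.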
A catalogue of 136 microbial draft genomes from Red Sea metagenomes

NASA Astrophysics Data System (ADS)

Haroon, Mohamed F.; Thompson, Luke R.; Parks, Donovan H.; Hugenholtz, Philip; Stingl, Ulrich

2016-07-01

Earth is expected to continue warming and the Red Sea is a model environment for understanding the effects of global warming on ocean microbiomes due to its unusually high temperature, salinity and solar irradiance. However, most microbial diversity analyses of the Red Sea have been limited to cultured representatives and single marker gene analyses, hence neglecting the substantial uncultured majority. Here, we report 136 microbial genomes (completion minus contamination is ≥50%) assembled from 45 metagenomes from eight stations spanning the Red Sea and taken from multiple depths between 10 and 500 m. Phylogenomic analysis showed that most of the retrieved genomes belong to seven different phyla of known marine microbes, with more than half representing currently uncultured species. The open-access data presented here is the largest number of Red Sea representative microbial genomes reported in a single study and will help facilitate future studies in understanding the physiology of these microorganisms and how they have adapted to the relatively harsh conditions of the Red Sea.
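The inclusion criterion quoted in this abstract (completion minus contamination of at least 50%) can be written as a simple filter. The following Python sketch assumes each draft genome carries a completeness and a contamination estimate in percent; the bin names and values are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class DraftGenome(NamedTuple):
    name: str
    completeness: float   # estimated completeness, percent
    contamination: float  # estimated contamination, percent


def passes_quality(genome: DraftGenome, threshold: float = 50.0) -> bool:
    """Keep a genome when completeness minus contamination >= threshold."""
    return genome.completeness - genome.contamination >= threshold


# Hypothetical draft genomes, for illustration only.
bins = [
    DraftGenome("bin_001", 92.5, 3.1),   # score 89.4, kept
    DraftGenome("bin_002", 61.0, 14.0),  # score 47.0, dropped
    DraftGenome("bin_003", 55.0, 2.0),   # score 53.0, kept
]

kept = [g.name for g in bins if passes_quality(g)]
print(kept)  # ['bin_001', 'bin_003']
```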
A catalogue of 136 microbial draft genomes from Red Sea metagenomes.

PubMed

Haroon, Mohamed F; Thompson, Luke R; Parks, Donovan H; Hugenholtz, Philip; Stingl, Ulrich

2016-07-05

Earth is expected to continue warming and the Red Sea is a model environment for understanding the effects of global warming on ocean microbiomes due to its unusually high temperature, salinity and solar irradiance. However, most microbial diversity analyses of the Red Sea have been limited to cultured representatives and single marker gene analyses, hence neglecting the substantial uncultured majority. Here, we report 136 microbial genomes (completion minus contamination is ≥50%) assembled from 45 metagenomes from eight stations spanning the Red Sea and taken from multiple depths between 10 and 500 m. Phylogenomic analysis showed that most of the retrieved genomes belong to seven different phyla of known marine microbes, with more than half representing currently uncultured species. The open-access data presented here is the largest number of Red Sea representative microbial genomes reported in a single study and will help facilitate future studies in understanding the physiology of these microorganisms and how they have adapted to the relatively harsh conditions of the Red Sea.
Analysis for complete genomic sequence of HLA-B and HLA-C alleles in the Chinese Han population.

PubMed

Zhu, F; He, Y; Zhang, W; He, J; He, J; Xu, X; Lv, H; Yan, L

2011-08-01

In the present study, we have determined the complete genomic sequence and analysed the intron polymorphism of partial HLA-B and HLA-C alleles in the Chinese Han population. DNA fragments of over 3.0 kb from the HLA-B and HLA-C loci were amplified by polymerase chain reaction from the partial 5' untranslated region to the 3' noncoding region, respectively, and the amplified products were then sequenced. Full-length nucleotide sequences of 14 HLA-B alleles and 10 HLA-C alleles were obtained and have been submitted to GenBank and the IMGT/HLA database. Two novel alleles, HLA-B*52:01:01:02 and HLA-B*59:01:01:02, were identified, and the complete genomic sequence of HLA-B*52:01:01:01 is reported for the first time. In total, 157 and 167 polymorphic positions were found in the full-length genomic sequences of the HLA-B and HLA-C loci, respectively. Our results suggest that many single nucleotide polymorphisms exist in the exon and intron regions, and the data can provide useful information for understanding the evolution of HLA-B and HLA-C alleles. © 2011 Blackwell Publishing Ltd.
The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat

PubMed Central

Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

2016-01-01

Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species, whose genome has also been sequenced but not yet fully analysed. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organization, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, are involved in their extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and is specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. PMID:26645327
Characterization of the complete mitochondrial genomes of Nematodirus oiratianus and Nematodirus spathiger of small ruminants

PubMed Central

2014-01-01

Background: Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. Methods: In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using long-range PCR and sequencing. Results: The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), maximum likelihood (ML) and maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. Conclusions: The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants. PMID:25015379
Characterization of the complete mitochondrial genomes of Nematodirus oiratianus and Nematodirus spathiger of small ruminants.

PubMed

Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua

2014-07-11

Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.
Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

PubMed Central

2009-01-01

Background: The availability of the complete chicken (Gallus gallus) genome sequence as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources facilitate comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results: We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion: Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extend previous analyses in chicken and turkey and support the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies. PMID:19656363
The Complete Plastid Genome Sequence of Madagascar Periwinkle Catharanthus roseus (L.) G. Don: Plastid Genome Evolution, Molecular Marker Identification, and Phylogenetic Implications in Asterids

PubMed Central

Ku, Chuan; Chung, Wan-Chia; Chen, Ling-Ling; Kuo, Chih-Horng

2013-01-01

The Madagascar periwinkle (Catharanthus roseus in the family Apocynaceae) is an important medicinal plant and is the source of several widely marketed chemotherapeutic drugs. It is also commonly grown for its ornamental values and, due to ease of infection and distinctiveness of symptoms, is often used as the host for studies on phytoplasmas, an important group of uncultivated plant pathogens. To gain insights into the characteristics of apocynaceous plastid genomes (plastomes), we used a reference-assisted approach to assemble the complete plastome of C. roseus, which could be applied to other C. roseus-related studies. The C. roseus plastome is the second completely sequenced plastome in the asterid order Gentianales. We performed comparative analyses with two other representative sequences in the same order, including the complete plastome of Coffea arabica (from the basal Gentianales family Rubiaceae) and the nearly complete plastome of Asclepias syriaca (Apocynaceae). The results demonstrated considerable variations in gene content and plastome organization within Apocynaceae, including the presence/absence of three essential genes (i.e., accD, clpP, and ycf1) and large size changes in non-coding regions (e.g., rps2-rpoC2 and IRb-ndhF). To find plastome markers of potential utility for Catharanthus breeding and phylogenetic analyses, we identified 41 C. roseus-specific simple sequence repeats. Furthermore, five intergenic regions with high divergence between C. roseus and three other euasterids I taxa were identified as candidate markers. To resolve the euasterids I interordinal relationships, 82 plastome genes were used for phylogenetic inference. With the addition of representatives from Apocynaceae and sampling of most other asterid orders, a sister relationship between Gentianales and Solanales is supported. PMID:23825699
A specific indel marker for the Philippines Schistosoma japonicum revealed by analysis of mitochondrial genome sequences.

PubMed

Li, Juan; Chen, Fen; Sugiyama, Hiromu; Blair, David; Lin, Rui-Qing; Zhu, Xing-Quan

2015-07-01

In the present study, near-complete mitochondrial (mt) genome sequences for Schistosoma japonicum from different regions in the Philippines and Japan were amplified and sequenced. Comparisons among S. japonicum from the Philippines, Japan, and China revealed a geographically based length difference in mt genomes, but the mt genomic organization and gene arrangement were the same. Sequence differences among samples from the Philippines and all samples from the three endemic areas were 0.57-2.12 and 0.76-3.85 %, respectively. The most variable part of the mt genome was the non-coding region. In the coding portion of the genome, protein-coding genes varied more than rRNA genes and tRNAs. The near-complete mt genome sequences for Philippine specimens were identical in length (14,091 bp) which was 4 bp longer than those of S. japonicum samples from Japan and China. This indel provides a unique genetic marker for S. japonicum samples from the Philippines. Phylogenetic analyses based on the concatenated amino acids of 12 protein-coding genes showed that samples of S. japonicum clustered according to their geographical origins. The identified mitochondrial indel marker will be useful for tracing the source of S. japonicum infection in humans and animals in Southeast Asia.
Phylogenomic relationship of feijoa (Acca sellowiana (O.Berg) Burret) with other Myrtaceae based on complete chloroplast genome sequences.

PubMed

Machado, Lilian de Oliveira; Vieira, Leila do Nascimento; Stefenon, Valdir Marcos; Oliveira Pedrosa, Fábio de; Souza, Emanuel Maltempi de; Guerra, Miguel Pedro; Nodari, Rubens Onofre

2017-04-01

Given their distribution, importance, and richness, Myrtaceae species comprise a model system for studying the evolution of tropical plant diversity. In addition, chloroplast (cp) genome sequencing is an efficient tool for phylogenetic relationship studies. Feijoa [Acca sellowiana (O. Berg) Burret; CN: pineapple-guava] is a Myrtaceae species that occurs naturally in southern Brazil and northern Uruguay. Feijoa is known for its exquisite perfume and flavorful fruits, pharmacological properties, ornamental value and increasing economic relevance. In the present work, we reported the complete cp genome of feijoa. The feijoa cp genome is a circular molecule of 159,370 bp with a quadripartite structure containing two single copy regions, a Large Single Copy region (LSC 88,028 bp) and a Small Single Copy region (SSC 18,598 bp) separated by Inverted Repeat regions (IRs 26,372 bp). The genome structure, gene order, GC content and codon usage are similar to those of typical angiosperm cp genomes. When compared to other cp genome sequences of Myrtaceae, feijoa showed closest relationship with pitanga (Eugenia uniflora L.). Furthermore, a comparison of pitanga synonymous (Ks) and nonsynonymous (Ka) substitution rates revealed extremely low values. Maximum Likelihood and Bayesian Inference analyses produced phylogenomic trees identical in topology. These trees supported monophyly of three Myrtoideae clades.
The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae

PubMed Central

Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong

2017-01-01

Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. 
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no less than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria. PMID:29038765
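The abstract's reading of Ka/Ks follows the usual convention: a ratio below 1 indicates purifying selection, a ratio near 1 indicates neutral evolution, and a ratio above 1 indicates positive selection. A minimal Python sketch of that rule; the function name, the tolerance and the example values are assumptions for illustration and do not come from the study.

```python
def selection_regime(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a gene by its Ka/Ks ratio using the conventional thresholds."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical substitution rates, for illustration only.
print(selection_regime(ka=0.002, ks=0.020))  # purifying
print(selection_regime(ka=0.030, ks=0.010))  # positive
```

In practice Ks can be zero for highly conserved genes, in which case the ratio is undefined; the sketch raises an error rather than guessing.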

The difference between trivial and scientific names: There were never any true cheetahs in North America.

PubMed

Faurby, S; Werdelin, L; Svenning, J C

2016-05-05

Dobrynin et al. (Genome Biol 16:277, 2015) recently published the complete genome of the cheetah (Acinonyx jubatus) and provided an exhaustive set of analyses supporting the famously low genetic variation in the species, known for several decades. Their genetic analyses represent the state of the art and we do not criticize them. However, their interpretation of the results is inconsistent with current knowledge of cheetah evolution. Dobrynin et al. suggest that the causes of the two inferred bottlenecks at ∼100,000 and 10,000 years ago were immigration by cheetahs from North America and end-Pleistocene megafauna extinction, respectively, but the first explanation is impossible and the second implausible.
The complete mitochondrial genome of Pauropus longiramus (Myriapoda: Pauropoda): implications on early diversification of the myriapods revealed from comparative analysis.

PubMed

Dong, Yan; Sun, Hongying; Guo, Hua; Pan, Da; Qian, Changyuan; Hao, Sijing; Zhou, Kaiya

2012-08-15

Myriapods are among the earliest arthropods and may have evolved to become part of the terrestrial biota more than 400 million years ago. A noticeable lack of mitochondrial genome data from Pauropoda hampers phylogenetic and evolutionary studies within the subphylum Myriapoda. We sequenced the first complete mitochondrial genome of a microscopic pauropod, Pauropus longiramus (Arthropoda: Myriapoda), and conducted comprehensive mitogenomic analyses across the Myriapoda. The pauropod mitochondrial genome is a circular molecule 14,487 bp long and contains the entire set of thirty-seven genes. Frequent intergenic overlaps occurred between adjacent tRNAs, and between tRNA and protein-coding genes. This is the first example of a mitochondrial genome with multiple intergenic overlaps and reveals a strategy for arthropods to effectively compact the mitochondrial genome by overlapping and truncating tRNA genes with neighbor genes, instead of only truncating tRNAs. Phylogenetic analyses based on protein-coding genes provide strong evidence that the sister group of Pauropoda is Symphyla. Additionally, approximately unbiased (AU) tests strongly support the Progoneata and confirm the basal position of Chilopoda in Myriapoda. This study provides an estimation of myriapod origins around 555 Ma (95% CI: 444-704 Ma) and this date is comparable with that of the Cambrian explosion and candidate myriapod-like fossils. A new time-scale suggests that deep radiations during early myriapod diversification occurred at least three times, not once as previously proposed. A Carboniferous origin of pauropods is congruent with the idea that these taxa are derived, rather than basal, progoneatans. Copyright © 2012 Elsevier B.V. All rights reserved.
IDEA: Interactive Display for Evolutionary Analyses.

PubMed

Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C

2008-12-08

The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.
IDEA: Interactive Display for Evolutionary Analyses

PubMed Central

Egan, Amy; Mahurkar, Anup; Crabtree, Jonathan; Badger, Jonathan H; Carlton, Jane M; Silva, Joana C

2008-01-01

Background The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency. Results We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly. Conclusion IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data. PMID:19061522
Complete genomic sequence of Powassan virus: evaluation of genetic elements in tick-borne versus mosquito-borne flaviviruses.

PubMed

Mandl, C W; Holzmann, H; Kunz, C; Heinz, F X

1993-05-01

The complete nucleotide sequence of the positive-stranded RNA genome of the tick-borne flavivirus Powassan (10,839 nucleotides) was elucidated and the amino acid sequence of all viral proteins was derived. Based on this sequence as well as serological data, Powassan virus represents the most divergent member of the tick-borne serocomplex within the genus flaviviruses, family Flaviviridae. The primary nucleotide sequence and potential RNA secondary structures of the Powassan virus genome as well as the protein sequences and the reactivities of the virion with a panel of monoclonal antibodies were compared to other tick-borne and mosquito-borne flaviviruses. These analyses corroborated significant differences between tick-borne and mosquito-borne flaviviruses, but also emphasized structural elements that are conserved among both vector groups. The comparisons among tick-borne flaviviruses revealed conserved sequence elements that might represent important determinants of the tick-borne flavivirus phenotype.
Complete mitochondrial genome of the invasive brown alga Sargassum muticum (Sargassaceae, Phaeophyceae).

PubMed

Liu, Feng; Pang, Shaojun

2016-01-01

Sargassum muticum (Yendo) Fensholt is an invasive canopy-forming brown alga, expanding its presence from Northeast Asia to North America and Europe. The complete mitochondrial genome of S. muticum is characterized as a circular molecule of 34,720 bp. The overall AT content of the S. muticum mitogenome is 63.41%. This mitogenome contains 65 genes typically found in brown algae, including 3 ribosomal RNA genes, 25 transfer RNA genes, 35 protein-coding genes, and 2 conserved open reading frames (ORFs). The gene order of the S. muticum mitogenome is identical to that of Sargassum horneri, Fucus vesiculosus and Desmarestia viridis. Phylogenetic analyses based on 35 protein-coding genes reveal that S. muticum has a close evolutionary relationship with S. horneri and a distant relationship with Dictyota dichotoma, supporting current taxonomic systems. The present investigation provides new molecular data for studies of S. muticum population diversity as well as comparative genomics in the Phaeophyceae.
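As a minimal sketch of how a figure such as the 63.41% AT content above is typically derived, the helper below counts A and T among the unambiguous bases of a sequence. The function name and the toy sequence are hypothetical; the real mitogenome sequence is not reproduced here. The gene tally simply re-adds the categories listed in the abstract.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T)."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for base in bases if base in "AT")
    return 100.0 * at_count / len(bases)


# Gene categories from the abstract: rRNA + tRNA + protein-coding + ORFs.
gene_total = 3 + 25 + 35 + 2
print(gene_total)  # 65

# Toy sequence for illustration only: 8 of its 10 bases are A or T.
toy_sequence = "ATTAGCATAT"
print(at_content(toy_sequence))  # 80.0
```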
Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

PubMed

Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

2015-01-01

Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.
Genome-wide phylogenetic analysis of the pathogenic potential of Vibrio furnissii

PubMed Central

Lux, Thomas M.; Lee, Rob; Love, John

2014-01-01

We recently reported the genome sequence of a free-living strain of Vibrio furnissii (NCTC 11218) harvested from an estuarine environment. V. furnissii is a widespread, free-living proteobacterium and emerging pathogen that can cause acute gastroenteritis in humans and lethal zoonoses in aquatic invertebrates, including farmed crustaceans and molluscs. Here we present the analyses to assess the potential pathogenic impact of V. furnissii. We compared the complete genome of V. furnissii with 8 other emerging and pathogenic Vibrio species. We selected and analyzed more deeply 10 genomic regions based upon unique or common features, and used 3 of these regions to construct a phylogenetic tree. Thus, we positioned V. furnissii more accurately than before and revealed a closer relationship between V. furnissii and V. cholerae than previously thought. However, V. furnissii lacks several important features normally associated with virulence in the human pathogens V. cholerae and V. vulnificus. A striking feature of the V. furnissii genome is the hugely increased Super Integron, compared to the other Vibrio. Analyses of predicted genomic islands resulted in the discovery of a protein sequence that is present only in Vibrio associated with diseases in aquatic animals. We also discovered evidence of high levels of horizontal gene transfer in V. furnissii. V. furnissii seems therefore to have a dynamic and fluid genome that could quickly adapt to environmental perturbation or increase its pathogenicity. Taken together, these analyses confirm the potential of V. furnissii as an emerging marine and possible human pathogen, especially in the developing, tropical, coastal regions that are most at risk from climate change. PMID:25191313
Genome-wide phylogenetic analysis of the pathogenic potential of Vibrio furnissii.

PubMed

Lux, Thomas M; Lee, Rob; Love, John

2014-01-01

We recently reported the genome sequence of a free-living strain of Vibrio furnissii (NCTC 11218) harvested from an estuarine environment. V. furnissii is a widespread, free-living proteobacterium and emerging pathogen that can cause acute gastroenteritis in humans and lethal zoonoses in aquatic invertebrates, including farmed crustaceans and molluscs. Here we present the analyses to assess the potential pathogenic impact of V. furnissii. We compared the complete genome of V. furnissii with 8 other emerging and pathogenic Vibrio species. We selected and analyzed more deeply 10 genomic regions based upon unique or common features, and used 3 of these regions to construct a phylogenetic tree. Thus, we positioned V. furnissii more accurately than before and revealed a closer relationship between V. furnissii and V. cholerae than previously thought. However, V. furnissii lacks several important features normally associated with virulence in the human pathogens V. cholerae and V. vulnificus. A striking feature of the V. furnissii genome is the hugely increased Super Integron, compared to the other Vibrio. Analyses of predicted genomic islands resulted in the discovery of a protein sequence that is present only in Vibrio associated with diseases in aquatic animals. We also discovered evidence of high levels of horizontal gene transfer in V. furnissii. V. furnissii seems therefore to have a dynamic and fluid genome that could quickly adapt to environmental perturbation or increase its pathogenicity. Taken together, these analyses confirm the potential of V. furnissii as an emerging marine and possible human pathogen, especially in the developing, tropical, coastal regions that are most at risk from climate change.
EDGAR: A software framework for the comparative analysis of prokaryotic genomes

PubMed Central

Blom, Jochen; Albaum, Stefan P; Doppmeier, Daniel; Pühler, Alfred; Vorhölter, Frank-Jörg; Zakrzewski, Martha; Goesmann, Alexander

2009-01-01

Background The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results To support these studies EDGAR – "Efficient Database framework for comparative Genome Analyses using BLAST score Ratios" – was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, where the precomputed data sets can be browsed. PMID:19457249
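The idea of a BLAST score ratio named in this abstract can be sketched briefly: the bit score of a gene's best hit in another genome is divided by the score of the gene aligned against itself, giving a value between 0 and 1 that is comparable across genes of different lengths. The code below is a minimal illustration under stated assumptions and is not EDGAR's implementation. The fixed cutoff of 0.3, the function names, the "accessory" label and the toy scores are all hypothetical.

```python
from typing import Dict


def blast_score_ratio(hit_score: float, self_score: float) -> float:
    """Normalise a best-hit bit score by the query's self-hit bit score."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return hit_score / self_score


def classify_gene(best_hits: Dict[str, float], self_score: float,
                  cutoff: float = 0.3) -> str:
    """Label a gene from its best-hit scores in every other genome.

    best_hits maps genome name -> best-hit bit score (0.0 when no hit).
    The cutoff is an assumed, illustrative value.
    """
    if not best_hits:
        raise ValueError("at least one comparison genome is required")
    conserved = [blast_score_ratio(score, self_score) >= cutoff
                 for score in best_hits.values()]
    if all(conserved):
        return "core"        # an ortholog candidate in every genome
    if not any(conserved):
        return "singleton"   # no ortholog candidate anywhere else
    return "accessory"       # present in some genomes only


# Toy scores for one query gene with a self-hit bit score of 200.
print(classify_gene({"genome_B": 180.0, "genome_C": 150.0}, 200.0))  # core
print(classify_gene({"genome_B": 20.0, "genome_C": 0.0}, 200.0))     # singleton
print(classify_gene({"genome_B": 180.0, "genome_C": 10.0}, 200.0))   # accessory
```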
Calcisponges have a ParaHox gene and dynamic expression of dispersed NK homeobox genes.

PubMed

Fortunato, Sofia A V; Adamski, Marcin; Ramos, Olivia Mendivil; Leininger, Sven; Liu, Jing; Ferrier, David E K; Adamska, Maja

2014-10-30

Sponges are simple animals with few cell types, but their genomes paradoxically contain a wide variety of developmental transcription factors, including homeobox genes belonging to the Antennapedia (ANTP) class, which in bilaterians encompass Hox, ParaHox and NK genes. In the genome of the demosponge Amphimedon queenslandica, no Hox or ParaHox genes are present, but NK genes are linked in a tight cluster similar to the NK clusters of bilaterians. It has been proposed that Hox and ParaHox genes originated from NK cluster genes after divergence of sponges from the lineage leading to cnidarians and bilaterians. On the other hand, synteny analysis lends support to the notion that the absence of Hox and ParaHox genes in Amphimedon is a result of secondary loss (the ghost locus hypothesis). Here we analysed complete suites of ANTP-class homeoboxes in two calcareous sponges, Sycon ciliatum and Leucosolenia complicata. Our phylogenetic analyses demonstrate that these calcisponges possess orthologues of bilaterian NK genes (Hex, Hmx and Msx), a varying number of additional NK genes and one ParaHox gene, Cdx. Despite the generation of scaffolds spanning multiple genes, we find no evidence of clustering of Sycon NK genes. All Sycon ANTP-class genes are developmentally expressed, with patterns suggesting their involvement in cell type specification in embryos and adults, metamorphosis and body plan patterning. These results demonstrate that ParaHox genes predate the origin of sponges, thus confirming the ghost locus hypothesis, and highlight the need to analyse the genomes of multiple sponge lineages to obtain a complete picture of the ancestral composition of the first animal genome.
Complete Genome Sequence of an Avian Paramyxovirus Type 4 Strain Isolated from Domestic Duck at a Live Bird Market in South Korea.

PubMed

Tseren-Ochir, Erdene-Ochir; Yuk, Seong-Su; Kwon, Jung-Hoon; Noh, Jin-Yong; Hong, Woo-Tack; Jeong, Jei-Hyun; Jeong, Sol; Kim, Yu-Jin; Kim, Kyu-Jik; Lee, Ji-Ho; Kim, Jun-Beom; Lee, Joong-Bok; Park, Seung-Yong; Choi, In-Soo; Lee, Sang-Won; Song, Chang-Seon

2017-05-18

We report here the first full-genome sequence of an avian paramyxovirus type 4 (APMV-4) strain isolated from a domestic mallard duck at a live bird market in South Korea. Phylogenetic analyses provide genetic information on a new genetic clade, APMV-4, isolated from a domestic duck and evidence of APMV-4 exchange between poultry and wild birds. Copyright © 2017 Tseren-Ochir et al.
Exploring Pandora's Box: Potential and Pitfalls of Low Coverage Genome Surveys for Evolutionary Biology

PubMed Central

Leese, Florian; Mayer, Christoph; Agrawal, Shobhit; Dambach, Johannes; Dietz, Lars; Doemel, Jana S.; Goodall-Copstake, William P.; Held, Christoph; Jackson, Jennifer A.; Lampert, Kathrin P.; Linse, Katrin; Macher, Jan N.; Nolzen, Jennifer; Raupach, Michael J.; Rivera, Nicole T.; Schubart, Christoph D.; Striewski, Sebastian; Tollrian, Ralph; Sands, Chester J.

2012-01-01

High throughput sequencing technologies are revolutionizing genetic research. With this “rise of the machines”, genomic sequences can be obtained even for unknown genomes within a short time and for reasonable costs. This has enabled evolutionary biologists studying genetically unexplored species to identify molecular markers or genomic regions of interest (e.g. micro- and minisatellites, mitochondrial and nuclear genes) by sequencing only a fraction of the genome. However, when using such datasets from non-model species, it is possible that DNA from non-target contaminant species such as bacteria, viruses, fungi, or other eukaryotic organisms may complicate the interpretation of the results. In this study we analysed 14 genomic pyrosequencing libraries of aquatic non-model taxa from four major evolutionary lineages. We quantified the amount of suitable micro- and minisatellites, mitochondrial genomes, known nuclear genes and transposable elements and searched for contamination from various sources using bioinformatic approaches. Our results show that in all sequence libraries with estimated coverage of about 0.02–25%, many appropriate micro- and minisatellites, mitochondrial gene sequences and nuclear genes from different KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways could be identified and characterized. These can serve as markers for phylogenetic and population genetic analyses. A central finding of our study is that several genomic libraries suffered from different biases owing to non-target DNA or mobile elements. In particular, viruses, bacteria or eukaryote endosymbionts contributed significantly (up to 10%) to some of the libraries analysed. If not identified as such, genetic markers developed from high-throughput sequencing data for non-model organisms may bias evolutionary studies or fail completely in experimental tests. In conclusion, our study demonstrates the enormous potential of low-coverage genome survey sequences and suggests bioinformatic analysis workflows. The results also advise a more sophisticated filtering for problematic sequences and non-target genome sequences prior to developing markers. PMID:23185309
A locally funded Puerto Rican parrot (Amazona vittata) genome sequencing project increases avian data and advances young researcher education

PubMed Central

2012-01-01

Background Amazona vittata is a critically endangered Puerto Rican endemic bird, the only surviving native parrot species in the United States territory, and the first parrot in the large Neotropical genus Amazona to be studied on a genomic scale. Findings In a unique community-based funded project, DNA from an A. vittata female was sequenced using a HiSeq Illumina platform, resulting in a total of ~42.5 billion nucleotide bases. This provided approximately 26.89x average coverage depth at the completion of this funding phase. Filtering followed by assembly resulted in 259,423 contigs (N50 = 6,983 bp, longest = 75,003 bp), which were further scaffolded into 148,255 fragments (N50 = 19,470 bp, longest = 206,462 bp). This provided ~76% coverage of the genome based on an estimated size of 1.58 Gb. The assembled scaffolds allowed basic genomic annotation and comparative analyses with other available avian whole-genome sequences. Conclusions The current data represents the first genomic information from and work carried out with a unique source of funding. This analysis further provides a means for directed training of young researchers in genetic and bioinformatics analyses and will facilitate progress towards a full assembly and annotation of the Puerto Rican parrot genome. It also adds extensive genomic data to a new branch of the avian tree, making it useful for comparative analyses with other avian species. Ultimately, the knowledge acquired from these data will contribute to an improved understanding of the overall population health of this species and aid in ongoing and future conservation efforts. PMID:23587420
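Two of the assembly statistics quoted in this abstract follow from simple definitions. The sketch below is not the authors' code. It assumes the usual definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly) and estimates coverage depth as total sequenced bases divided by estimated genome size. The contig lengths in the example are a toy list, not the parrot assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths supplied")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


def coverage_depth(total_bases: int, genome_size: int) -> float:
    """Average depth: sequenced bases divided by estimated genome size."""
    return total_bases / genome_size


# Toy contig lengths: total 20, so the half-way point is 10 (6 + 5 = 11).
print(n50([2, 3, 4, 5, 6]))  # 5

# Values from the abstract: ~42.5 billion bases, 1.58 Gb estimated genome.
depth = coverage_depth(42_500_000_000, 1_580_000_000)
print(round(depth, 1))  # 26.9
```

The computed depth of about 26.9x is consistent with the approximately 26.89x reported, given that the base count is itself quoted as approximate.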
Complete mitochondrial DNA sequence of the European flat oyster Ostrea edulis confirms Ostreidae classification.

PubMed

Danic-Tchaleu, Gwenaelle; Heurtebise, Serge; Morga, Benjamin; Lapègue, Sylvie

2011-10-12

Because of its typical architecture, inheritance and small size, mitochondrial (mt) DNA is widely used for phylogenetic studies. Gene order is generally conserved in most taxa although some groups show considerable variation. This is particularly true in the phylum Mollusca, especially in the Bivalvia. During the last few years, there have been significant increases in the number of complete mitochondrial sequences available. For bivalves, 35 complete mitochondrial genomes are now available in GenBank, a number that has more than doubled in the last three years, representing 6 families and 23 genera. In the current study, we determined the complete mtDNA sequence of O. edulis, the European flat oyster. We present an analysis of features of its gene content and genome organization in comparison with other Ostrea, Saccostrea and Crassostrea species. The Ostrea edulis mt genome is 16 320 bp in length and codes for 37 genes (12 protein-coding genes, 2 rRNAs and 23 tRNAs) on the same strand. As in other Ostreidae, O. edulis mt genome contains a split of the rrnL gene and a duplication of trnM. The tRNA gene set of O. edulis, Ostrea denselamellosa and Crassostrea virginica are identical in having 23 tRNA genes, in contrast to Asian oysters, which have 25 tRNA genes (except for C. ariakensis with 24). O. edulis and O. denselamellosa share the same gene order, but differ from other Ostreidae and are closer to Crassostrea than to Saccostrea. Phylogenetic analyses reinforce the taxonomic classification of the 3 families Ostreidae, Mytilidae and Pectinidae. Within the Ostreidae family the results also reveal a closer relationship between Ostrea and Saccostrea than between Ostrea and Crassostrea. Ostrea edulis mitogenomic analyses show a high level of conservation within the genus Ostrea, whereas they show a high level of variation within the Ostreidae family. These features provide useful information for further evolutionary analysis of oyster mitogenomes.
Complete mitochondrial DNA sequence of the European flat oyster Ostrea edulis confirms Ostreidae classification

PubMed Central

2011-01-01

Background Because of its typical architecture, inheritance and small size, mitochondrial (mt) DNA is widely used for phylogenetic studies. Gene order is generally conserved in most taxa although some groups show considerable variation. This is particularly true in the phylum Mollusca, especially in the Bivalvia. During the last few years, there have been significant increases in the number of complete mitochondrial sequences available. For bivalves, 35 complete mitochondrial genomes are now available in GenBank, a number that has more than doubled in the last three years, representing 6 families and 23 genera. In the current study, we determined the complete mtDNA sequence of O. edulis, the European flat oyster. We present an analysis of features of its gene content and genome organization in comparison with other Ostrea, Saccostrea and Crassostrea species. Results The Ostrea edulis mt genome is 16 320 bp in length and codes for 37 genes (12 protein-coding genes, 2 rRNAs and 23 tRNAs) on the same strand. As in other Ostreidae, O. edulis mt genome contains a split of the rrnL gene and a duplication of trnM. The tRNA gene set of O. edulis, Ostrea denselamellosa and Crassostrea virginica are identical in having 23 tRNA genes, in contrast to Asian oysters, which have 25 tRNA genes (except for C. ariakensis with 24). O. edulis and O. denselamellosa share the same gene order, but differ from other Ostreidae and are closer to Crassostrea than to Saccostrea. Phylogenetic analyses reinforce the taxonomic classification of the 3 families Ostreidae, Mytilidae and Pectinidae. Within the Ostreidae family the results also reveal a closer relationship between Ostrea and Saccostrea than between Ostrea and Crassostrea. Conclusions Ostrea edulis mitogenomic analyses show a high level of conservation within the genus Ostrea, whereas they show a high level of variation within the Ostreidae family. These features provide useful information for further evolutionary analysis of oyster mitogenomes. PMID:21989403
Genomic basis for natural product biosynthetic diversity in the actinomycetes

PubMed Central

Nett, Markus; Ikeda, Haruo; Moore, Bradley S.

2010-01-01

The phylum Actinobacteria hosts diverse high G + C, Gram-positive bacteria that have evolved a complex chemical language of natural product chemistry to help navigate their fascinatingly varied lifestyles. To date, 71 Actinobacteria genomes have been completed and annotated, with the vast majority representing the Actinomycetales, which are the source of numerous antibiotics and other drugs from genera such as Streptomyces, Saccharopolyspora and Salinispora. These genomic analyses have illuminated the secondary metabolic proficiency of these microbes – underappreciated for years based on conventional isolation programs – and have helped set the foundation for a new natural product discovery paradigm based on genome mining. Trends in the secondary metabolomes of natural product-rich actinomycetes are highlighted in this review article, which contains 199 references. PMID:19844637
The Capsaspora genome reveals a complex unicellular prehistory of animals.

PubMed

Suga, Hiroshi; Chen, Zehua; de Mendoza, Alex; Sebé-Pedrós, Arnau; Brown, Matthew W; Kramer, Eric; Carr, Martin; Kerner, Pierre; Vervoort, Michel; Sánchez-Pons, Núria; Torruella, Guifré; Derelle, Romain; Manning, Gerard; Lang, B Franz; Russ, Carsten; Haas, Brian J; Roger, Andrew J; Nusbaum, Chad; Ruiz-Trillo, Iñaki

2013-01-01

To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.
Sequence Search and Comparative Genomic Analysis of SUMO-Activating Enzymes Using CoGe.

PubMed

Carretero-Paulet, Lorenzo; Albert, Victor A

2016-01-01

The growing number of genome sequences completed during the last few years has made necessary the development of bioinformatics tools for the easy access and retrieval of sequence data, as well as for downstream comparative genomic analyses. Some of these are implemented as online platforms that integrate genomic data produced by different genome sequencing initiatives with data mining tools as well as various comparative genomic and evolutionary analysis possibilities. Here, we use the online comparative genomics platform CoGe (http://www.genomevolution.org/coge/) (Lyons and Freeling. Plant J 53:661-673, 2008; Tang and Lyons. Front Plant Sci 3:172, 2012) (1) to retrieve the entire complement of orthologous and paralogous genes belonging to the SUMO-Activating Enzymes 1 (SAE1) gene family from a set of species representative of the Brassicaceae plant eudicot family with genomes fully sequenced, and (2) to investigate the history, timing, and molecular mechanisms of the gene duplications driving the evolutionary expansion and functional diversification of the SAE1 family in Brassicaceae.
The ring of life provides evidence for a genome fusion origin of eukaryotes.

PubMed

Rivera, Maria C; Lake, James A

2004-09-09

Genomes hold within them the record of the evolution of life on Earth. But genome fusions and horizontal gene transfer seem to have obscured sufficiently the gene sequence record such that it is difficult to reconstruct the phylogenetic tree of life. Here we determine the general outline of the tree using complete genome data from representative prokaryotes and eukaryotes and a new genome analysis method that makes it possible to reconstruct ancient genome fusions and phylogenetic trees. Our analyses indicate that the eukaryotic genome resulted from a fusion of two diverse prokaryotic genomes, and therefore at the deepest levels linking prokaryotes and eukaryotes, the tree of life is actually a ring of life. One fusion partner branches from deep within an ancient photosynthetic clade, and the other is related to the archaeal prokaryotes. The eubacterial organism is either a proteobacterium, or a member of a larger photosynthetic clade that includes the Cyanobacteria and the Proteobacteria.

Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics

PubMed Central

Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron

2014-01-01

• Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629
The complete genome sequencing of Prevotella intermedia strain OMA14 and a subsequent fine-scale, intra-species genomic comparison reveal an unusual amplification of conjugative and mobile transposons and identify a novel Prevotella-lineage-specific repeat.

PubMed

Naito, Mariko; Ogura, Yoshitoshi; Itoh, Takehiko; Shoji, Mikio; Okamoto, Masaaki; Hayashi, Tetsuya; Nakayama, Koji

2016-02-01

Prevotella intermedia is a pathogenic bacterium involved in periodontal diseases. Here, we present the complete genome sequence of a clinical strain, OMA14, of this bacterium along with the results of comparative genome analysis with strain 17 of the same species, whose genome has also been sequenced but not yet fully analysed. The genomes of both strains consist of two circular chromosomes: the larger chromosomes are similar in size and exhibit a high overall linearity of gene organization, whereas the smaller chromosomes show a significant size variation and have undergone remarkable genome rearrangements. Unique features of the Pre. intermedia genomes are the presence of a remarkable number of essential genes on the second chromosomes and the abundance of conjugative and mobilizable transposons (CTns and MTns). The CTns/MTns are particularly abundant in the second chromosomes, are involved in their extensive genome rearrangement, and have introduced a number of strain-specific genes into each strain. We also found a novel 188-bp repeat sequence that has been highly amplified in Pre. intermedia and is specifically distributed among the Pre. intermedia-related species. These findings expand our understanding of the genetic features of Pre. intermedia and the roles of CTns and MTns in the evolution of bacteria. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Draft genome of the red harvester ant Pogonomyrmex barbatus.

PubMed

Smith, Chris R; Smith, Christopher D; Robertson, Hugh M; Helmkampf, Martin; Zimin, Aleksey; Yandell, Mark; Holt, Carson; Hu, Hao; Abouheif, Ehab; Benton, Richard; Cash, Elizabeth; Croset, Vincent; Currie, Cameron R; Elhaik, Eran; Elsik, Christine G; Favé, Marie-Julie; Fernandes, Vilaiwan; Gibson, Joshua D; Graur, Dan; Gronenberg, Wulfila; Grubbs, Kirk J; Hagen, Darren E; Viniegra, Ana Sofia Ibarraran; Johnson, Brian R; Johnson, Reed M; Khila, Abderrahman; Kim, Jay W; Mathis, Kaitlyn A; Munoz-Torres, Monica C; Murphy, Marguerite C; Mustard, Julie A; Nakamura, Rin; Niehuis, Oliver; Nigam, Surabhi; Overson, Rick P; Placek, Jennifer E; Rajakumar, Rajendhran; Reese, Justin T; Suen, Garret; Tao, Shu; Torres, Candice W; Tsutsui, Neil D; Viljakainen, Lumi; Wolschin, Florian; Gadau, Jürgen

2011-04-05

We report the draft genome sequence of the red harvester ant, Pogonomyrmex barbatus. The genome was sequenced using 454 pyrosequencing, and the current assembly and annotation were completed in less than 1 y. Analyses of conserved gene groups (more than 1,200 manually annotated genes to date) suggest a high-quality assembly and annotation comparable to recently sequenced insect genomes using Sanger sequencing. The red harvester ant is a model for studying reproductive division of labor, phenotypic plasticity, and sociogenomics. Although the genome of P. barbatus is similar to other sequenced hymenopterans (Apis mellifera and Nasonia vitripennis) in GC content and compositional organization, and possesses a complete CpG methylation toolkit, its predicted genomic CpG content differs markedly from the other hymenopterans. Gene networks involved in generating key differences between the queen and worker castes (e.g., wings and ovaries) show signatures of increased methylation and suggest that ants and bees may have independently co-opted the same gene regulatory mechanisms for reproductive division of labor. Gene family expansions (e.g., 344 functional odorant receptors) and pseudogene accumulation in chemoreception and P450 genes compared with A. mellifera and N. vitripennis are consistent with major life-history changes during the adaptive radiation of Pogonomyrmex spp., perhaps in parallel with the development of the North American deserts.
Comparative genome analysis of rice-pathogenic Burkholderia provides insight into capacity to adapt to different environments and hosts.

PubMed

Seo, Young-Su; Lim, Jae Yun; Park, Jungwook; Kim, Sunyoung; Lee, Hyun-Hee; Cheong, Hoon; Kim, Sang-Mok; Moon, Jae Sun; Hwang, Ingyu

2015-05-06

In addition to human and animal diseases, bacteria of the genus Burkholderia can cause plant diseases. The representative species of rice-pathogenic Burkholderia are Burkholderia glumae, B. gladioli, and B. plantarii, which primarily cause grain rot, sheath rot, and seedling blight, respectively, resulting in severe reductions in rice production. Though Burkholderia rice pathogens cause problems in rice-growing countries, comprehensive studies of these rice-pathogenic species aiming to control Burkholderia-mediated diseases are only in the early stages. We first sequenced the complete genome of B. plantarii ATCC 43733T. Second, we conducted comparative analysis of the newly sequenced B. plantarii ATCC 43733T genome with eleven complete or draft genomes of B. glumae and B. gladioli strains. Furthermore, we compared the genome of three rice Burkholderia pathogens with those of other Burkholderia species such as those found in environmental habitats and those known as animal/human pathogens. These B. glumae, B. gladioli, and B. plantarii strains have unique genes involved in toxoflavin or tropolone toxin production and the clustered regularly interspaced short palindromic repeats (CRISPR)-mediated bacterial immune system. Although the genome of B. plantarii ATCC 43733T has many common features with those of B. glumae and B. gladioli, this B. plantarii strain has several unique features, including quorum sensing and CRISPR/CRISPR-associated protein (Cas) systems. The complete genome sequence of B. plantarii ATCC 43733T and publicly available genomes of B. glumae BGR1 and B. gladioli BSR3 enabled comprehensive comparative genome analyses among three rice-pathogenic Burkholderia species responsible for tissue rotting and seedling blight. Our results suggest that B. glumae has evolved rapidly, or has undergone rapid genome rearrangements or deletions, in response to the hosts. 
It also clarifies the unique features of rice pathogenic Burkholderia species relative to other animal and human Burkholderia species.
Molecular and FISH analyses of a 53-kbp intact DNA fragment inserted by biolistics in wheat (Triticum aestivum L.) genome.

PubMed

Partier, A; Gay, G; Tassy, C; Beckert, M; Feuillet, C; Barret, P

2017-10-01

A large, 53-kbp, intact DNA fragment was inserted into the wheat (Triticum aestivum L.) genome. FISH analyses of individual transgenic events revealed multiple insertions of intact fragments. Transferring large intact DNA fragments containing clusters of resistance genes or complete metabolic pathways into the wheat genome remains a challenge. In a previous work, we showed that the use of dephosphorylated cassettes for wheat transformation enabled the production of simple integration patterns. Here, we used the same technology to produce a cassette containing a 44-kb Arabidopsis thaliana BAC, flanked by one selection gene and one reporter gene. This 53-kb linear cassette was integrated in the bread wheat (Triticum aestivum L.) genome by biolistic transformation. Our results showed that transgenic plants harboring the entire cassette were generated. The inheritability of the cassette was demonstrated in the T1 and T2 generation. Surprisingly, FISH analysis performed on T1 progeny of independent events identified double genomic insertions of intact fragments in non-homoeologous positions. Inheritability of these double insertions was demonstrated by FISH analysis of the T1 generation. Relative conclusions that can be drawn from molecular or FISH analysis are discussed along with future prospects of the engineering of large fragments for wheat transformation or genome editing.
Comparative Genomic Analyses of Clavibacter michiganensis subsp. insidiosus and Pathogenicity on Medicago truncatula.

PubMed

Lu, You; Ishimaru, Carol A; Glazebrook, Jane; Samac, Deborah A

2018-02-01

Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen, with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,917 genes, with 379 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.
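The core and subspecies-unique gene counts reported above come down to set operations over orthologous gene groups. The following is a minimal sketch of that idea, not the authors' pipeline: the strain labels, the ortholog-group IDs (og1, og2, ...) and the function name core_and_unique are all hypothetical, and "unique" is assumed here to mean "present in every focal strain and absent from every other strain", which may differ from the definition used in the study.

```python
def core_and_unique(gene_sets: dict[str, set[str]], focal: list[str]):
    """Return (core, unique) ortholog groups.

    core   : groups present in every strain in gene_sets
    unique : groups shared by all focal strains and absent from all others
             (an assumed definition, chosen for illustration)
    """
    # Core genome: intersection over all strains.
    core = set.intersection(*gene_sets.values())
    # Groups shared by every focal strain.
    focal_shared = set.intersection(*(gene_sets[s] for s in focal))
    # Everything seen in any non-focal strain.
    others = set().union(*(g for s, g in gene_sets.items() if s not in focal))
    return core, focal_shared - others


# Toy data with made-up ortholog-group IDs (not real C. michiganensis genes).
strains = {
    "insidiosus_A": {"og1", "og2", "og3", "og4"},
    "insidiosus_B": {"og1", "og2", "og3", "og5"},
    "other_subsp": {"og1", "og2", "og6"},
}
core, unique = core_and_unique(strains, focal=["insidiosus_A", "insidiosus_B"])
print(sorted(core), sorted(unique))
```

In practice the hard part is the upstream ortholog clustering that produces the group IDs; the set arithmetic itself is this simple.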
Complete mitochondrial genome of Porzana fusca and Porzana pusilla and phylogenetic relationship of 16 Rallidae species.

PubMed

Chen, Peng; Han, Yuqing; Zhu, Chaoying; Gao, Bin; Ruan, Luzhang

2017-12-01

The complete mitochondrial genome sequences of Porzana fusca and Porzana pusilla were determined. The two avian species share a high degree of homology in terms of mitochondrial genome organization and gene arrangement. Their corresponding mitochondrial genomes are 16,935 and 16,978 bp and consist of 37 genes and a control region. Their PCGs were both 11,365 bp long and have similar structure. Their tRNA gene sequences could be folded into the canonical cloverleaf secondary structure, except for tRNA Ser (AGY), which lost its "DHU" arm. Based on the concatenated nucleotide sequences of the complete mitochondrial DNA genes of 16 Rallidae species, reconstruction of phylogenetic trees and analysis of the molecular clock of P. fusca and P. pusilla indicated that these species form a sister group, which in turn is the sister group to Rallina eurizonoides. The genus Gallirallus is a sister group to genus Lewinia, and these groups in turn are sister groups to genus Porphyrio. Moreover, molecular clock analyses suggested that the basal divergence of Rallidae could be traced back to 40.47 (41.46‒39.45) million years ago (Mya), and the divergence of Porzana occurred approximately 5.80 (15.16‒0.79) Mya.
The first mitochondrial genome for the butterfly family Riodinidae (Abisara fylloides) and its systematic implications.

PubMed

Zhao, Fang; Huang, Dun-Yuan; Sun, Xiao-Yan; Shi, Qing-Hui; Hao, Jia-Sheng; Zhang, Lan-Lan; Yang, Qun

2013-10-01

The Riodinidae is one of the lepidopteran butterfly families. This study describes the complete mitochondrial genome of the butterfly species Abisara fylloides, the first mitochondrial genome of the Riodinidae family. The results show that the entire mitochondrial genome of A. fylloides is 15,301 bp in length, and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a 423 bp A+T-rich region. The gene content, orientation and order are identical to the majority of other lepidopteran insects. Phylogenetic reconstruction was conducted using the concatenated 13 protein-coding gene (PCG) sequences of 19 available butterfly species covering all the five butterfly families (Papilionidae, Nymphalidae, Pieridae, Lycaenidae and Riodinidae). Both maximum likelihood and Bayesian inference analyses highly supported the monophyly of Lycaenidae+Riodinidae, which stood as the sister of Nymphalidae. In addition, we propose that the riodinids be categorized into the family Lycaenidae as a subfamilial taxon.
2500 high-quality genomes reveal that the biogeochemical cycles of C, N, S and H are cross-linked by metabolic handoffs in the terrestrial subsurface

NASA Astrophysics Data System (ADS)

Anantharaman, K.; Brown, C. T.; Hug, L. A.; Sharon, I.; Castelle, C. J.; Shelton, A.; Bonet, B.; Probst, A. J.; Thomas, B. C.; Singh, A.; Wilkins, M.; Williams, K. H.; Tringe, S. G.; Beller, H. R.; Brodie, E.; Hubbard, S. S.; Banfield, J. F.

2015-12-01

Microorganisms drive the transformations of carbon compounds in the terrestrial subsurface, a key reservoir of carbon on earth, and impact other linked biogeochemical cycles. Our current knowledge of the microbial ecology in this environment is primarily based on 16S rRNA gene sequences that paint a biased picture of microbial community composition and provide no reliable information on microbial metabolism. Consequently, little is known about the identity and metabolic roles of the uncultivated microbial majority in the subsurface. In turn, this lack of understanding of the microbial processes that impact the turnover of carbon in the subsurface has restricted the scope and ability of biogeochemical models to capture key aspects of the carbon cycle. In this study, we used a culture-independent, genome-resolved metagenomic approach to decipher the metabolic capabilities of microorganisms in an aquifer adjacent to the Colorado River, near Rifle, CO, USA. We sequenced groundwater and sediment samples collected across fifteen different geochemical regimes. Sequence assembly, binning and manual curation resulted in the recovery of 2,542 high-quality genomes, 27 of which are complete. These genomes represent 1,300 non-redundant organisms comprising both abundant and rare community members. Phylogenetic analyses involving ribosomal proteins and 16S rRNA genes revealed the presence of up to 34 new phyla that were hitherto unknown. Less than 11% of all genomes belonged to the 4 most commonly represented phyla that constitute 93% of all currently available genomes. Genome-specific analyses of metabolic potential revealed the co-occurrence of important functional traits such as carbon fixation, nitrogen fixation and use of electron donors and electron acceptors. Finally, we predict that multiple organisms are often required to complete redox pathways through a complex network of metabolic handoffs that extensively cross-link subsurface biogeochemical cycles.
Resources and costs for microbial sequence analysis evaluated using virtual machines and cloud computing.

PubMed

Angiuoli, Samuel V; White, James R; Matalka, Malcolm; White, Owen; Fricke, W Florian

2011-01-01

The widespread popularity of genomic applications is threatened by the "bioinformatics bottleneck" resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers.
Resources and Costs for Microbial Sequence Analysis Evaluated Using Virtual Machines and Cloud Computing

PubMed Central

Angiuoli, Samuel V.; White, James R.; Matalka, Malcolm; White, Owen; Fricke, W. Florian

2011-01-01

Background The widespread popularity of genomic applications is threatened by the “bioinformatics bottleneck” resulting from uncertainty about the cost and infrastructure needed to meet increasing demands for next-generation sequence analysis. Cloud computing services have been discussed as potential new bioinformatics support systems but have not been evaluated thoroughly. Results We present benchmark costs and runtimes for common microbial genomics applications, including 16S rRNA analysis, microbial whole-genome shotgun (WGS) sequence assembly and annotation, WGS metagenomics and large-scale BLAST. Sequence dataset types and sizes were selected to correspond to outputs typically generated by small- to midsize facilities equipped with 454 and Illumina platforms, except for WGS metagenomics where sampling of Illumina data was used. Automated analysis pipelines, as implemented in the CloVR virtual machine, were used in order to guarantee transparency, reproducibility and portability across different operating systems, including the commercial Amazon Elastic Compute Cloud (EC2), which was used to attach real dollar costs to each analysis type. We found considerable differences in computational requirements, runtimes and costs associated with different microbial genomics applications. While all 16S analyses completed on a single-CPU desktop in under three hours, microbial genome and metagenome analyses utilized multi-CPU support of up to 120 CPUs on Amazon EC2, where each analysis completed in under 24 hours for less than $60. Representative datasets were used to estimate maximum data throughput on different cluster sizes and to compare costs between EC2 and comparable local grid servers. 
Conclusions Although bioinformatics requirements for microbial genomics depend on dataset characteristics and the analysis protocols applied, our results suggest that smaller sequencing facilities (up to three Roche/454 or one Illumina GAIIx sequencer) invested in 16S rRNA amplicon sequencing, microbial single-genome and metagenomics WGS projects can achieve cost-efficient bioinformatics support using CloVR in combination with Amazon EC2 as an alternative to local computing centers. PMID:22028928
Complete mitochondrial DNA sequence of oyster Crassostrea hongkongensis-a case of "Tandem duplication-random loss" for genome rearrangement in Crassostrea?

PubMed Central

Yu, Ziniu; Wei, Zhengpeng; Kong, Xiaoyu; Shi, Wei

2008-01-01

Background Mitochondrial DNA sequences are extensively used as genetic markers not only for studies of population or ecological genetics, but also for phylogenetic and evolutionary analyses. Complete mt-sequences can reveal information about gene order and its variation, as well as gene and genome evolution when sequences from multiple phyla are compared. Mitochondrial gene order is highly variable among mollusks, with bivalves exhibiting the most variability. Of the 41 complete mt genomes sequenced so far, 12 are from bivalves. We determined, in the current study, the complete mitochondrial DNA sequence of Crassostrea hongkongensis. We present here an analysis of features of its gene content and genome organization in comparison with two other Crassostrea species to assess the variation within bivalves and among main groups of mollusks. Results The complete mitochondrial genome of C. hongkongensis was determined using long PCR and a primer walking sequencing strategy with genus-specific primers. The genome is 16,475 bp in length and contains 12 protein-coding genes (the atp8 gene is missing, as in most bivalves), 22 transfer RNA genes (including a suppressor tRNA gene), and 2 ribosomal RNA genes, all of which appear to be transcribed from the same strand. A striking finding of this study is that a DNA segment containing four tRNA genes (trnK1, trnC, trnQ1 and trnN) and two duplicated or split rRNA genes (rrnL5' and rrnS) is absent from the genome, when compared with those of two other extant Crassostrea species, which is very likely a consequence of the loss of a single genomic region present in the ancestor of C. hongkongensis. This indicates that the region seems to be a "hot spot" of genomic rearrangements across the Crassostrea mt-genomes. The arrangement of protein-coding genes in C. hongkongensis is identical to that of Crassostrea gigas and Crassostrea virginica, but higher amino acid sequence identities are shared between C. hongkongensis and C. gigas than between other pairs. There exists significant codon bias, favoring codons ending in A or T and against those ending with C. Pair analysis of genome rearrangements showed that the rearrangement distance is great between C. gigas-C. hongkongensis and C. virginica, indicating a high degree of rearrangements within Crassostrea. The determination of the complete mt-genome of C. hongkongensis has yielded useful insight into features of gene order, variation, and evolution of Crassostrea and bivalve mt-genomes. Conclusion The mt-genome of C. hongkongensis shares some similarity with, and interesting differences to, other Crassostrea species and bivalves. The absence of trnC and trnN genes and duplicated or split rRNA genes from the C. hongkongensis genome is a completely novel feature not previously reported in Crassostrea species. The phenomenon is likely due to the loss of a segment that is present in other Crassostrea species and was present in the ancestor of C. hongkongensis, thus a case of "tandem duplication-random loss (TDRL)". The mt-genome and new feature presented here reveal and underline the high level of variation of gene order and gene content in Crassostrea and bivalves, inspiring more research to understand the mechanisms underlying gene and genome evolution in bivalves and mollusks. PMID:18847502
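The codon bias described above (a preference for codons ending in A or T) can be checked for any coding sequence by tallying third-position bases. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the input is assumed to be an in-frame coding sequence, and the toy sequence is invented for illustration and is not taken from the C. hongkongensis genome.

```python
from collections import Counter


def third_position_counts(cds: str) -> Counter:
    """Count the base found at the third position of each complete codon."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return Counter(seq[i + 2] for i in range(0, usable, 3))


def at3_fraction(cds: str) -> float:
    """Fraction of codons whose third base is A or T (ambiguous bases skipped)."""
    counts = third_position_counts(cds)
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("no complete codons with unambiguous third bases")
    return (counts["A"] + counts["T"]) / total


# Invented toy sequence: codons ATG TTA GCT GGT AAA TAA.
toy_cds = "ATGTTAGCTGGTAAATAA"
counts = third_position_counts(toy_cds)
print(dict(counts), round(at3_fraction(toy_cds), 3))
```

A third-position tally is only a first look; a fuller analysis would compare synonymous codons within each amino acid (for example with relative synonymous codon usage) so that amino acid composition does not confound the result.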
Complete genome sequences of two novel European clade bovine foamy viruses from Germany and Poland.

PubMed

Hechler, Torsten; Materniak, Magdalena; Kehl, Timo; Kuzmak, Jacek; Löchelt, Martin

2012-10-01

Bovine foamy virus (BFV), or bovine spumaretrovirus, is an infectious agent of cattle with no obvious disease association but high prevalence in its host. Here, we report two complete BFV sequences, BFV-Riems, isolated in 1978 in East Germany, and BFV100, isolated in 2005 in Poland. Both new BFV isolates share the overall genetic makeup of other foamy viruses (FV). Although isolated almost 25 years apart and propagated in either bovine (BFV-Riems) or nonbovine (BFV100) cells, both viruses are highly related, forming the European BFV clade. Despite clear differences, BFV-Riems and BFV100 are still very similar to BFV isolates from China and the United States, comprising the non-European BFV clade. The genomic sequences presented here confirm the concept of high sequence conservation across most of the FV genome. Analyses of cell culture-derived genomes reveal that proviral DNA may specifically lack introns in the env-bel coding region. The spacing of the splice sites in this region suggests that BFV has developed a novel mode to express a secretory but nonfunctional Env protein.
Complete Genome Sequences of Two Novel European Clade Bovine Foamy Viruses from Germany and Poland

PubMed Central

Hechler, Torsten; Materniak, Magdalena; Kehl, Timo; Kuzmak, Jacek

2012-01-01

Bovine foamy virus (BFV), or bovine spumaretrovirus, is an infectious agent of cattle with no obvious disease association but high prevalence in its host. Here, we report two complete BFV sequences, BFV-Riems, isolated in 1978 in East Germany, and BFV100, isolated in 2005 in Poland. Both new BFV isolates share the overall genetic makeup of other foamy viruses (FV). Although isolated almost 25 years apart and propagated in either bovine (BFV-Riems) or nonbovine (BFV100) cells, both viruses are highly related, forming the European BFV clade. Despite clear differences, BFV-Riems and BFV100 are still very similar to BFV isolates from China and the United States, comprising the non-European BFV clade. The genomic sequences presented here confirm the concept of high sequence conservation across most of the FV genome. Analyses of cell culture-derived genomes reveal that proviral DNA may specifically lack introns in the env-bel coding region. The spacing of the splice sites in this region suggests that BFV has developed a novel mode to express a secretory but nonfunctional Env protein. PMID:22966195
Using genic sequence capture in combination with a syntenic pseudo genome to map a deletion mutant in a wheat species.

PubMed

Gardiner, Laura-Jayne; Gawroński, Piotr; Olohan, Lisa; Schnurbusch, Thorsten; Hall, Neil; Hall, Anthony

2014-12-01

Mapping-by-sequencing analyses have largely required a complete reference sequence and employed whole genome re-sequencing. In species such as wheat, no finished genome reference sequence is available. Additionally, because of its large genome size (17 Gb), re-sequencing at sufficient depth of coverage is not practical. Here, we extend the utility of mapping by sequencing, developing a bespoke pipeline and algorithm to map an early-flowering locus in einkorn wheat (Triticum monococcum L.) that is closely related to the bread wheat genome A progenitor. We have developed a genomic enrichment approach using the gene-rich regions of hexaploid bread wheat to design a 110-Mbp NimbleGen SeqCap EZ in-solution capture probe set, representing the majority of genes in wheat. Here, we use the capture probe set to enrich and sequence an F2 mapping population of the mutant. The mutant locus was identified in T. monococcum, which lacks a complete genome reference sequence, by mapping the enriched data set onto pseudo-chromosomes derived from the capture probe target sequence, with a long-range order of genes based on synteny of wheat with Brachypodium distachyon. Using this approach we are able to map the region and identify a set of deleted genes within the interval. © 2014 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
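The pseudo-chromosome idea in this abstract, ordering capture-target sequences by the position of their syntenic counterparts in a related reference genome, can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' pipeline: the Target fields, the function name build_pseudo_chromosomes, the one-to-one target-to-orthologue mapping and the toy coordinates are all hypothetical, and targets are grouped directly by Brachypodium chromosome rather than by wheat chromosome.

```python
from typing import NamedTuple


class Target(NamedTuple):
    name: str          # capture-target identifier (hypothetical)
    length: int        # target sequence length in bp
    brachy_chrom: str  # chromosome of the assumed Brachypodium orthologue
    brachy_start: int  # start coordinate of that orthologue


def build_pseudo_chromosomes(targets):
    """Lay targets end to end in the order of their syntenic orthologues.

    Returns {chromosome: [(name, pseudo_start, pseudo_end), ...]} using
    0-based, half-open coordinates on each pseudo-chromosome.
    """
    ordered = sorted(targets, key=lambda t: (t.brachy_chrom, t.brachy_start))
    layout, next_free = {}, {}
    for t in ordered:
        start = next_free.get(t.brachy_chrom, 0)
        layout.setdefault(t.brachy_chrom, []).append(
            (t.name, start, start + t.length)
        )
        next_free[t.brachy_chrom] = start + t.length
    return layout


# Toy input with invented names and coordinates.
targets = [
    Target("geneC", 300, "Bd1", 9000),
    Target("geneA", 500, "Bd1", 1000),
    Target("geneB", 200, "Bd2", 50),
]
layout = build_pseudo_chromosomes(targets)
print(layout)
```

Reads mapped to individual targets can then be reported in pseudo-chromosome coordinates, so that allele-frequency signals from an F2 population appear as a contiguous interval instead of scattered hits.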
Genomic Characterization of the Genus Nairovirus (Family Bunyaviridae).

PubMed

Kuhn, Jens H; Wiley, Michael R; Rodriguez, Sergio E; Bào, Yīmíng; Prieto, Karla; Travassos da Rosa, Amelia P A; Guzman, Hilda; Savji, Nazir; Ladner, Jason T; Tesh, Robert B; Wada, Jiro; Jahrling, Peter B; Bente, Dennis A; Palacios, Gustavo

2016-06-10

Nairovirus, one of five bunyaviral genera, includes seven species. Genomic sequence information is limited for members of the Dera Ghazi Khan, Hughes, Qalyub, Sakhalin, and Thiafora nairovirus species. We used next-generation sequencing and historical virus-culture samples to determine 14 complete and nine coding-complete nairoviral genome sequences to further characterize these species. Previously unsequenced viruses include Abu Mina, Clo Mor, Great Saltee, Hughes, Raza, Sakhalin, Soldado, and Tillamook viruses. In addition, we present genomic sequence information on additional isolates of previously sequenced Avalon, Dugbe, Sapphire II, and Zirqa viruses. Finally, we identify Tunis virus, previously thought to be a phlebovirus, as an isolate of Abu Hammad virus. Phylogenetic analyses indicate the need for reassignment of Sapphire II virus to Dera Ghazi Khan nairovirus and reassignment of Hazara, Tofla, and Nairobi sheep disease viruses to novel species. We also propose new species for the Kasokero group (Kasokero, Leopards Hill, Yogue viruses), the Keterah group (Gossas, Issyk-kul, Keterah/soft tick viruses) and the Burana group (Wēnzhōu tick virus, Huángpí tick virus 1, Tǎchéng tick virus 1). Our analyses emphasize the sister relationship of nairoviruses and arenaviruses, and indicate that several nairo-like viruses (Shāyáng spider virus 1, Xīnzhōu spider virus, Sānxiá water strider virus 1, South Bay virus, Wǔhàn millipede virus 2) require establishment of novel genera in a larger nairovirus-arenavirus supergroup.
Discovery of a novel canine respiratory coronavirus support genetic recombination among betacoronavirus1.

PubMed

Lu, Shuai; Wang, Yanqun; Chen, Yingzhu; Wu, Bingjie; Qin, Kun; Zhao, Jincun; Lou, Yongliang; Tan, Wenjie

2017-06-02

Although canine respiratory coronavirus (CRCoV) is an important respiratory pathogen that is prevalent in many countries, only one complete genome sequence of CRCoV (South Korea strain K37) has been obtained to date. Genome-wide and recombination analyses have rarely been conducted, as small numbers of samples and limited genomic characterization have previously prevented further analyses. Herein, we report a unique CRCoV strain, denoted strain BJ232, derived from a CRCoV-positive dog with a mild respiratory infection. Phylogenetic analyses based on the complete genomes of all available coronaviruses consistently show that CRCoV BJ232 is most closely related to human coronavirus OC43 (HCoV-OC43) and bovine coronavirus (BCoV), forming a separate clade that split off early from other Betacoronavirus 1. Based on the phylogenetic and SimPlot analyses, we propose that CRCoV-K37 was derived from genetic recombination between CRCoV-BJ232 and BCoV. In detail, the spike (S) gene of CRCoV-K37 clustered with CRCoV-BJ232, whereas the orf1ab, membrane (M) and nucleocapsid (N) genes were more closely related to BCoV than to CRCoV-BJ232. Molecular epidemic analysis confirmed the prevalence of the CRCoV-BJ232 lineage around the world for a long time. Recombinant events among Betacoronavirus 1 may have implications for CRCoV transmissibility. All these findings provide further information regarding the origin of CRCoV. Copyright © 2017. Published by Elsevier B.V.
The complete mitochondrial genome of Koerneria sudhausi (Diplogasteromorpha: Nematoda) supports monophyly of Diplogasteromorpha within Rhabditomorpha.

PubMed

Kim, Taeho; Kim, Jiyeon; Nadler, Steven A; Park, Joong-Ki

2016-05-01

Testing hypotheses of monophyly for different nematode groups in the context of broad representation of nematode diversity is central to understanding the patterns and processes of nematode evolution. Herein sequence information from mitochondrial genomes is used to test the monophyly of diplogasterids, which include an important nematode model organism. The complete mitochondrial genome sequence of Koerneria sudhausi, a representative of Diplogasteromorpha, was determined and used for phylogenetic analyses along with 60 other nematode species. The mtDNA of K. sudhausi comprises 16,005 bp and includes 36 genes (12 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA genes) encoded in the same direction. Phylogenetic trees inferred from amino acid and nucleotide sequence data for the 12 protein-coding genes strongly supported the sister relationship of K. sudhausi with Pristionchus pacificus, supporting Diplogasteromorpha. The gene order of K. sudhausi is identical to that most commonly found in members of the Rhabditomorpha + Ascaridomorpha + Diplogasteromorpha clade, with the exception of some tRNA translocations. Both the gene order pattern and sequence-based phylogenetic analyses support a close relationship between the diplogasterid species and Rhabditomorpha. The nesting of the two diplogasteromorph species within Rhabditomorpha is consistent with most molecular phylogenies for the group, but inconsistent with certain morphology-based hypotheses that asserted phylogenetic affinity between diplogasteromorphs and tylenchomorphs. Phylogenetic analysis of mitochondrial genome sequences strongly supports monophyly of the Diplogasteromorpha.
Substitution rate and natural selection in parvovirus B19

PubMed Central

Stamenković, Gorana G.; Ćirković, Valentina S.; Šiljić, Marina M.; Blagojević, Jelena V.; Knežević, Aleksandra M.; Joksić, Ivana D.; Stanojević, Maja P.

2016-01-01

The aim of this study was to estimate the substitution rate and imprints of natural selection on parvovirus B19 genotype 1. Studied datasets included 137 near-complete coding B19 genomes (positions 665 to 4851) for phylogenetic and substitution rate analysis and 146 and 214 partial genomes for selection analyses in open reading frames ORF1 and ORF2, respectively, collected 1973–2012 and including 9 newly sequenced isolates from Serbia. Phylogenetic clustering assigned the majority of studied isolates to G1A. The nucleotide substitution rate for total coding DNA was 1.03 (0.6–1.27) × 10^-4 substitutions/site/year, with higher values for analyzed genome partitions. In spite of the highest evolutionary rate, VP2 codons were found to be under purifying selection with rare episodic positive selection, whereas codons under diversifying selection were found in the unique part of VP1, known to contain B19 immune epitopes important in persistent infection. Analyses of overlapping gene regions identified nucleotide positions under opposite selective pressure in different ORFs, suggesting complex evolutionary mechanisms of nucleotide changes in B19 viral genomes. PMID:27775080
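As a back-of-the-envelope illustration of the reported rate, the sketch below converts the per-site point estimate into an expected number of substitutions across the analyzed coding region (positions 665 to 4851) for one year and for the 1973 to 2012 sampling window. The function name and the assumption of a constant rate along a single lineage are illustrative simplifications, not part of the study.

```python
def expected_substitutions(rate_per_site_year: float, n_sites: int, years: float) -> float:
    """Expected substitutions along one lineage, assuming a constant rate (a simplification)."""
    return rate_per_site_year * n_sites * years

# Region length taken from the abstract: positions 665 to 4851, inclusive.
N_SITES = 4851 - 665 + 1

# Point estimate from the abstract, in substitutions/site/year.
RATE = 1.03e-4

per_year = expected_substitutions(RATE, N_SITES, 1)
over_window = expected_substitutions(RATE, N_SITES, 2012 - 1973)

print(f"sites analyzed: {N_SITES}")
print(f"expected substitutions per year: {per_year:.3f}")
print(f"expected substitutions over 39 years: {over_window:.1f}")
```

Under these assumptions the region accumulates well under one substitution per year, which is why sampling that spans decades is needed to estimate such a rate.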
Evolution of complete proteomes: guanine-cytosine pressure, phylogeny and environmental influences blend the proteomic architecture

PubMed Central

2013-01-01

Background Guanine-cytosine (GC) composition is an important feature of genomes. Likewise, amino acid composition is a distinct, but less valued, feature of proteomes. A major concern is that it is not clear what valuable information can be acquired from amino acid composition data. To address this concern, in-depth analyses of the amino acid composition of the complete proteomes from 63 archaea, 270 bacteria, and 128 eukaryotes were performed. Results Principal component analysis of the amino acid matrices showed that the main contributors to proteomic architecture were genomic GC variation, phylogeny, and environmental influences. GC pressure drove positive selection on Ala, Arg, Gly, Pro, Trp, and Val, and adverse selection on Asn, Lys, Ile, Phe, and Tyr. The physico-chemical framework of the complete proteomes withstood GC pressure by frequency complementation of GC-dependent amino acid pairs with similar physico-chemical properties. Gln, His, Ser, and Val were responsible for phylogeny and their constituted components could differentiate archaea, bacteria, and eukaryotes. Environmental niche was also a significant factor in determining proteomic architecture, especially for archaea for which the main amino acids were Cys, Leu, and Thr. In archaea, hyperthermophiles, acidophiles, mesophiles, psychrophiles, and halophiles gathered successively along the environment-based principal component. Concordance between proteomic architecture and the genetic code was also related closely to genomic GC content, phylogeny, and lifestyles. Conclusions Large-scale analyses of the complete proteomes of a wide range of organisms suggested that amino acid composition retained the trace of GC variation, phylogeny, and environmental influences during evolution. The findings from this study will help in the development of a global understanding of proteome evolution, and even biological evolution. PMID:24088322
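The amino acid matrices analyzed above start from per-proteome residue frequencies. A minimal sketch of that first step follows; the function name `aa_composition` and the toy peptides are hypothetical, and the study's actual pipeline (including the principal component analysis) is not reproduced here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_composition(proteins):
    """Return the relative frequency of each standard amino acid across all proteins.

    Non-standard symbols (e.g. X, *, gaps) are ignored.
    """
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acid residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

# Toy "proteome" of two made-up peptides, for illustration only.
composition = aa_composition(["MKVAAG", "MAAPW*"])
print({aa: round(freq, 3) for aa, freq in composition.items() if freq > 0})
```

One such 20-element vector per organism, stacked into a matrix, would be the kind of input a principal component analysis operates on.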

Patterns and processes of Mycobacterium bovis evolution revealed by phylogenomic analyses

USDA-ARS's Scientific Manuscript database

Mycobacterium bovis is an important animal pathogen worldwide that parasitizes wild and domesticated vertebrate livestock as well as humans. A comparison of the five M. bovis complete genomes from UK, South Korea, Brazil and USA revealed four novel large-scale structural variations of at least 2,000...
Independent assessment and improvement of wheat genome sequence assemblies using Fosill jumping libraries.

PubMed

Lu, Fu-Hao; McKenzie, Neil; Kettleborough, George; Heavens, Darren; Clark, Matthew D; Bevan, Michael W

2018-05-01

The accurate sequencing and assembly of very large, often polyploid, genomes remains a challenging task, limiting long-range sequence information and phased sequence variation for applications such as plant breeding. The 15-Gb hexaploid bread wheat (Triticum aestivum) genome has been particularly challenging to sequence, and several different approaches have recently generated long-range assemblies. Mapping and understanding the types of assembly errors are important for optimising future sequencing and assembly approaches and for comparative genomics. Here we use a Fosill 38-kb jumping library to assess medium and longer-range order of different publicly available wheat genome assemblies. Modifications to the Fosill protocol generated longer Illumina sequences and enabled comprehensive genome coverage. Analyses of two independent Bacterial Artificial Chromosome (BAC)-based chromosome-scale assemblies, two independent Illumina whole genome shotgun assemblies, and a hybrid Single Molecule Real Time (SMRT-PacBio) and short read (Illumina) assembly were carried out. We revealed a surprising scale and variety of discrepancies using Fosill mate-pair mapping and validated several of each class. In addition, Fosill mate-pairs were used to scaffold a whole genome Illumina assembly, leading to a 3-fold increase in N50 values. Our analyses, using an independent means to validate different wheat genome assemblies, show that whole genome shotgun assemblies based solely on Illumina sequences are significantly more accurate by all measures compared to BAC-based chromosome-scale assemblies and hybrid SMRT-Illumina approaches. Although current whole genome assemblies are reasonably accurate and useful, additional improvements will be needed to generate complete assemblies of wheat genomes using open-source, computationally efficient, and cost-effective methods.
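The N50 statistic mentioned above is the length L such that sequences of length L or longer contain at least half of the total assembled bases. The sketch below is a generic implementation of that definition, not the authors' code; the contig and scaffold lengths are made-up toy values chosen only to show how joining contigs raises N50.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and return the length at which
    the running total first reaches half of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Hypothetical toy assembly: six contigs of 10 kb each.
contigs = [10_000] * 6
# After scaffolding, suppose the contigs are joined into two 30 kb scaffolds.
scaffolds = [30_000, 30_000]

print(n50(contigs), n50(scaffolds))
```

Because N50 depends only on the length distribution, it measures contiguity rather than correctness, which is why the study validates assemblies by independent mate-pair mapping.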
Genome of Rhodnius prolixus, an insect vector of Chagas disease, reveals unique adaptations to hematophagy and parasite infection

PubMed Central

Mesquita, Rafael D.; Vionette-Amaral, Raquel J.; Lowenberger, Carl; Rivera-Pomar, Rolando; Monteiro, Fernando A.; Minx, Patrick; Spieth, John; Carvalho, A. Bernardo; Panzera, Francisco; Lawson, Daniel; Torres, André Q.; Ribeiro, Jose M. C.; Sorgine, Marcos H. F.; Waterhouse, Robert M.; Abad-Franch, Fernando; Alves-Bezerra, Michele; Amaral, Laurence R.; Araujo, Helena M.; Aravind, L.; Atella, Georgia C.; Azambuja, Patricia; Berni, Mateus; Bittencourt-Cunha, Paula R.; Braz, Gloria R. C.; Calderón-Fernández, Gustavo; Carareto, Claudia M. A.; Christensen, Mikkel B.; Costa, Igor R.; Costa, Samara G.; Dansa, Marilvia; Daumas-Filho, Carlos R. O.; De-Paula, Iron F.; Dias, Felipe A.; Dimopoulos, George; Emrich, Scott J.; Esponda-Behrens, Natalia; Fampa, Patricia; Fernandez-Medina, Rita D.; da Fonseca, Rodrigo N.; Fontenele, Marcio; Fronick, Catrina; Fulton, Lucinda A.; Gandara, Ana Caroline; Garcia, Eloi S.; Genta, Fernando A.; Giraldo-Calderón, Gloria I.; Gomes, Bruno; Gondim, Katia C.; Granzotto, Adriana; Guarneri, Alessandra A.; Guigó, Roderic; Harry, Myriam; Hughes, Daniel S. T.; Jablonka, Willy; Jacquin-Joly, Emmanuelle; Juárez, M. Patricia; Koerich, Leonardo B.; Lange, Angela B.; Latorre-Estivalis, José Manuel; Lavore, Andrés; Lawrence, Gena G.; Lazoski, Cristiano; Lazzari, Claudio R.; Lopes, Raphael R.; Lorenzo, Marcelo G.; Lugon, Magda D.; Marcet, Paula L.; Mariotti, Marco; Masuda, Hatisaburo; Megy, Karine; Missirlis, Fanis; Mota, Theo; Noriega, Fernando G.; Nouzova, Marcela; Nunes, Rodrigo D.; Oliveira, Raquel L. L.; Oliveira-Silveira, Gilbert; Ons, Sheila; Orchard, Ian; Pagola, Lucia; Paiva-Silva, Gabriela O.; Pascual, Agustina; Pavan, Marcio G.; Pedrini, Nicolás; Peixoto, Alexandre A.; Pereira, Marcos H.; Pike, Andrew; Polycarpo, Carla; Prosdocimi, Francisco; Ribeiro-Rodrigues, Rodrigo; Robertson, Hugh M.; Salerno, Ana Paula; Salmon, Didier; Santesmasses, Didac; Schama, Renata; Seabra-Junior, Eloy S.; Silva-Cardoso, Livia; Silva-Neto, Mario A. 
C.; Souza-Gomes, Matheus; Sterkel, Marcos; Taracena, Mabel L.; Tojo, Marta; Tu, Zhijian Jake; Tubio, Jose M. C.; Ursic-Bedoya, Raul; Venancio, Thiago M.; Walter-Nuno, Ana Beatriz; Wilson, Derek; Warren, Wesley C.; Wilson, Richard K.; Huebner, Erwin; Dotson, Ellen M.; Oliveira, Pedro L.

2015-01-01

Rhodnius prolixus not only has served as a model organism for the study of insect physiology, but also is a major vector of Chagas disease, an illness that affects approximately seven million people worldwide. We sequenced the genome of R. prolixus, generated assembled sequences covering 95% of the genome (∼702 Mb), including 15,456 putative protein-coding genes, and completed comprehensive genomic analyses of this obligate blood-feeding insect. Although immune-deficiency (IMD)-mediated immune responses were observed, R. prolixus putatively lacks key components of the IMD pathway, suggesting a reorganization of the canonical immune signaling network. Although both Toll and IMD effectors controlled intestinal microbiota, neither affected Trypanosoma cruzi, the causal agent of Chagas disease, implying the existence of evasion or tolerance mechanisms. R. prolixus has experienced an extensive loss of selenoprotein genes, with its repertoire reduced to only two proteins, one of which is a selenocysteine-based glutathione peroxidase, the first found in insects. The genome contained actively transcribed, horizontally transferred genes from Wolbachia sp., which showed evidence of codon use evolution toward the insect use pattern. Comparative protein analyses revealed many lineage-specific expansions and putative gene absences in R. prolixus, including tandem expansions of genes related to chemoreception, feeding, and digestion that possibly contributed to the evolution of a blood-feeding lifestyle. The genome assembly and these associated analyses provide critical information on the physiology and evolution of this important vector species and should be instrumental for the development of innovative disease control methods. PMID:26627243
Genome of Rhodnius prolixus, an insect vector of Chagas disease, reveals unique adaptations to hematophagy and parasite infection.

PubMed

Mesquita, Rafael D; Vionette-Amaral, Raquel J; Lowenberger, Carl; Rivera-Pomar, Rolando; Monteiro, Fernando A; Minx, Patrick; Spieth, John; Carvalho, A Bernardo; Panzera, Francisco; Lawson, Daniel; Torres, André Q; Ribeiro, Jose M C; Sorgine, Marcos H F; Waterhouse, Robert M; Montague, Michael J; Abad-Franch, Fernando; Alves-Bezerra, Michele; Amaral, Laurence R; Araujo, Helena M; Araujo, Ricardo N; Aravind, L; Atella, Georgia C; Azambuja, Patricia; Berni, Mateus; Bittencourt-Cunha, Paula R; Braz, Gloria R C; Calderón-Fernández, Gustavo; Carareto, Claudia M A; Christensen, Mikkel B; Costa, Igor R; Costa, Samara G; Dansa, Marilvia; Daumas-Filho, Carlos R O; De-Paula, Iron F; Dias, Felipe A; Dimopoulos, George; Emrich, Scott J; Esponda-Behrens, Natalia; Fampa, Patricia; Fernandez-Medina, Rita D; da Fonseca, Rodrigo N; Fontenele, Marcio; Fronick, Catrina; Fulton, Lucinda A; Gandara, Ana Caroline; Garcia, Eloi S; Genta, Fernando A; Giraldo-Calderón, Gloria I; Gomes, Bruno; Gondim, Katia C; Granzotto, Adriana; Guarneri, Alessandra A; Guigó, Roderic; Harry, Myriam; Hughes, Daniel S T; Jablonka, Willy; Jacquin-Joly, Emmanuelle; Juárez, M Patricia; Koerich, Leonardo B; Lange, Angela B; Latorre-Estivalis, José Manuel; Lavore, Andrés; Lawrence, Gena G; Lazoski, Cristiano; Lazzari, Claudio R; Lopes, Raphael R; Lorenzo, Marcelo G; Lugon, Magda D; Majerowicz, David; Marcet, Paula L; Mariotti, Marco; Masuda, Hatisaburo; Megy, Karine; Melo, Ana C A; Missirlis, Fanis; Mota, Theo; Noriega, Fernando G; Nouzova, Marcela; Nunes, Rodrigo D; Oliveira, Raquel L L; Oliveira-Silveira, Gilbert; Ons, Sheila; Orchard, Ian; Pagola, Lucia; Paiva-Silva, Gabriela O; Pascual, Agustina; Pavan, Marcio G; Pedrini, Nicolás; Peixoto, Alexandre A; Pereira, Marcos H; Pike, Andrew; Polycarpo, Carla; Prosdocimi, Francisco; Ribeiro-Rodrigues, Rodrigo; Robertson, Hugh M; Salerno, Ana Paula; Salmon, Didier; Santesmasses, Didac; Schama, Renata; Seabra-Junior, Eloy S; Silva-Cardoso, Livia; Silva-Neto, Mario A C; 
Souza-Gomes, Matheus; Sterkel, Marcos; Taracena, Mabel L; Tojo, Marta; Tu, Zhijian Jake; Tubio, Jose M C; Ursic-Bedoya, Raul; Venancio, Thiago M; Walter-Nuno, Ana Beatriz; Wilson, Derek; Warren, Wesley C; Wilson, Richard K; Huebner, Erwin; Dotson, Ellen M; Oliveira, Pedro L

2015-12-01

Rhodnius prolixus not only has served as a model organism for the study of insect physiology, but also is a major vector of Chagas disease, an illness that affects approximately seven million people worldwide. We sequenced the genome of R. prolixus, generated assembled sequences covering 95% of the genome (∼ 702 Mb), including 15,456 putative protein-coding genes, and completed comprehensive genomic analyses of this obligate blood-feeding insect. Although immune-deficiency (IMD)-mediated immune responses were observed, R. prolixus putatively lacks key components of the IMD pathway, suggesting a reorganization of the canonical immune signaling network. Although both Toll and IMD effectors controlled intestinal microbiota, neither affected Trypanosoma cruzi, the causal agent of Chagas disease, implying the existence of evasion or tolerance mechanisms. R. prolixus has experienced an extensive loss of selenoprotein genes, with its repertoire reduced to only two proteins, one of which is a selenocysteine-based glutathione peroxidase, the first found in insects. The genome contained actively transcribed, horizontally transferred genes from Wolbachia sp., which showed evidence of codon use evolution toward the insect use pattern. Comparative protein analyses revealed many lineage-specific expansions and putative gene absences in R. prolixus, including tandem expansions of genes related to chemoreception, feeding, and digestion that possibly contributed to the evolution of a blood-feeding lifestyle. The genome assembly and these associated analyses provide critical information on the physiology and evolution of this important vector species and should be instrumental for the development of innovative disease control methods.
A combined computational-experimental analyses of selected metabolic enzymes in Pseudomonas species.

PubMed

Perumal, Deepak; Lim, Chu Sing; Chow, Vincent T K; Sakharkar, Kishore R; Sakharkar, Meena K

2008-09-10

Comparative genomic analysis has revolutionized our ability to predict the metabolic subsystems that occur in newly sequenced genomes, and to explore the functional roles of the set of genes within each subsystem. These computational predictions can considerably reduce the volume of experimental studies required to assess basic metabolic properties of multiple bacterial species. However, experimental validations are still required to resolve the apparent inconsistencies in the predictions by multiple resources. Here, we present combined computational-experimental analyses on eight completely sequenced Pseudomonas species. Comparative pathway analyses reveal that several pathways within the Pseudomonas species show high plasticity and versatility. Potential bypasses in 11 metabolic pathways were identified. We further confirmed the presence of the enzyme O-acetyl homoserine (thiol) lyase (EC: 2.5.1.49) in P. syringae pv. tomato that revealed inconsistent annotations in KEGG and in the recently published SYSTOMONAS database. These analyses connect and integrate systematic data generation, computational data interpretation, and experimental validation and represent a synergistic and powerful means for conducting biological research.
Complete mitochondrial genome of Bugula neritina (Bryozoa, Gymnolaemata, Cheilostomata): phylogenetic position of Bryozoa and phylogeny of lophophorates within the Lophotrochozoa

PubMed Central

Jang, Kuem Hee; Hwang, Ui Wook

2009-01-01

Background The phylogenetic position of Bryozoa is one of the most controversial issues in metazoan phylogeny. In an attempt to address this issue, the first bryozoan mitochondrial genome from Flustrellidra hispida (Gymnolaemata, Ctenostomata) was recently sequenced and characterized. Unfortunately, it has extensive gene translocation and extremely reduced size. In addition, the phylogenies obtained from the result were conflicting, so they failed to assign a reliable phylogenetic position to Bryozoa or to clarify lophophorate phylogeny. Thus, it is necessary to characterize further mitochondrial genomes from slowly-evolving bryozoans to obtain a more credible lophophorate phylogeny. Results The complete mitochondrial genome (15,433 bp) of Bugula neritina (Bryozoa, Gymnolaemata, Cheilostomata), one of the most widely distributed cheilostome bryozoans, was sequenced. This second bryozoan mitochondrial genome contains the set of 37 components generally observed in other metazoans, differing from that of F. hispida (Bryozoa, Gymnolaemata, Ctenostomata), which has only 36 components with loss of tRNAser(ucn) genes. The B. neritina mitochondrial genome possesses 27 multiple noncoding regions. The gene order is more similar to those of the two remaining lophophorate phyla (Brachiopoda and Phoronida) and the chiton Katharina tunicata than to that of F. hispida. Phylogenetic analyses based on the nucleotide sequences or amino acid residues of 12 protein-coding genes showed consistently that, within the Lophotrochozoa, the monophyly of the bryozoan class Gymnolaemata (B. neritina and F. hispida) was strongly supported and the bryozoan clade was grouped with brachiopods. Echiura appeared as a subtaxon of Annelida, and Entoprocta as a sister taxon of Phoronida. The clade of Bryozoa + Brachiopoda was clustered with either the clade of Annelida-Echiura or that of Phoronida + Entoprocta. Conclusion This study presents the complete mitochondrial genome of a cheilostome bryozoan, B. neritina. The phylogenetic analyses suggest a close relationship between Bryozoa and Brachiopoda within the Lophotrochozoa. However, the sister group of Bryozoa + Brachiopoda is still ambiguous, although it shows some affinity to Annelida-Echiura or Phoronida + Entoprocta. If the latter is a true phylogeny, lophophorate monophyly including Entoprocta is supported. Consequently, the present results imply that Brachiozoa (= Brachiopoda + Phoronida) and the recently-resurrected Bryozoa concept comprising Ectoprocta and Entoprocta may be refuted. PMID:19379522
Genomic profiling of plastid DNA variation in the Mediterranean olive tree

PubMed Central

2011-01-01

Background Characterisation of plastid genome (or cpDNA) polymorphisms is commonly used for phylogeographic, population genetic and forensic analyses in plants, but detecting cpDNA variation is sometimes challenging, limiting the applications of such an approach. In the present study, we screened cpDNA polymorphism in the olive tree (Olea europaea L.) by sequencing the complete plastid genome of trees with a distinct cpDNA lineage. Our objective was to develop new markers for a rapid genomic profiling (by Multiplex PCRs) of cpDNA haplotypes in the Mediterranean olive tree. Results Eight complete cpDNA genomes of Olea were sequenced de novo. The nucleotide divergence between olive cpDNA lineages was low and did not exceed 0.07%. Based on these sequences, markers were developed for studying two single nucleotide substitutions and length polymorphism of 62 regions (with variable microsatellite motifs or other indels). They were then used to genotype the cpDNA variation in cultivated and wild Mediterranean olive trees (315 individuals). Forty polymorphic loci were detected in this sample, allowing the distinction of 22 haplotypes belonging to the three Mediterranean cpDNA lineages known as E1, E2 and E3. The discriminating power of cpDNA variation was particularly low for the cultivated olive tree with one predominating haplotype, but more diversity was detected in wild populations. Conclusions We propose a method for a rapid characterisation of the Mediterranean olive germplasm. The low variation in the cultivated olive tree indicated that the utility of cpDNA variation for forensic analyses is limited to rare haplotypes. In contrast, the high cpDNA variation in wild populations demonstrated that our markers may be useful for phylogeographic and population genetic studies in O. europaea. PMID:21569271
Low-pass sequencing for microbial comparative genomics

PubMed Central

Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

2004-01-01

Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genomes of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich genome of H. sp. NRC-1. Identification of multiple TBP and TFB homologs in these four halophiles is consistent with the hypothesis that different types of complex transcriptional regulation may occur through multiple TBP-TFB combinations in response to rapidly changing environmental conditions. Low-pass shotgun sequence analyses of genomes permit extensive and diverse analyses, and should be generally useful for comparative microbial genomics. PMID:14718067
The past, present, and future of Leishmania genomics and transcriptomics

PubMed Central

Cantacessi, Cinzia; Dantas-Torres, Filipe; Nolan, Matthew J.; Otranto, Domenico

2015-01-01

It has been nearly 10 years since the completion of the first entire genome sequence of a Leishmania parasite. Genomic and transcriptomic analyses have advanced our understanding of the biology of Leishmania, and shed new light on the complex interactions occurring within the parasite–host–vector triangle. Here, we review these advances and examine potential avenues for translation of these discoveries into treatment and control programs. In addition, we argue for a strong need to explore how disease in dogs relates to that in humans, and how an improved understanding in line with the ‘One Health’ concept may open new avenues for the control of these devastating diseases. PMID:25638444
Phylogeny and mitochondrial gene order variation in Lophotrochozoa in the light of new mitogenomic data from Nemertea

PubMed Central

Podsiadlowski, Lars; Braband, Anke; Struck, Torsten H; von Döhren, Jörn; Bartolomaeus, Thomas

2009-01-01

Background The new animal phylogeny established several taxa which were not identified by morphological analyses, most prominently the Ecdysozoa (arthropods, roundworms, priapulids and others) and Lophotrochozoa (molluscs, annelids, brachiopods and others). Lophotrochozoan interrelationships are under discussion, e.g. regarding the position of Nemertea (ribbon worms), which were discussed to be sister group to e.g. Mollusca, Brachiozoa or Platyhelminthes. Mitochondrial genomes contributed well with sequence data and gene order characters to the deep metazoan phylogeny debate. Results In this study we present the first complete mitochondrial genome record for a member of the Nemertea, Lineus viridis. Except for two genes, trnP and trnT, all genes are located on the same strand. While gene order is most similar to that of the brachiopod Terebratulina retusa, sequence-based analyses of mitochondrial genes place nemerteans close to molluscs, phoronids and entoprocts without clear preference for one of these taxa as sister group. Conclusion Almost all recent analyses with large datasets show good support for a taxon comprising Annelida, Mollusca, Brachiopoda, Phoronida and Nemertea, but the relationships among these taxa vary between different studies. The analysis of gene order differences gives evidence for a multiple independent occurrence of a large inversion in the mitochondrial genome of Lophotrochozoa and a re-inversion of the same part in gastropods. We hypothesize that some regions of the genome have a higher chance for intramolecular recombination than others and gene order data have to be analysed carefully to detect convergent rearrangement events. PMID:19660126
An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens

PubMed Central

Rensing, Stefan A; Ick, Julia; Fawcett, Jeffrey A; Lang, Daniel; Zimmer, Andreas; Van de Peer, Yves; Reski, Ralf

2007-01-01

Background: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all seed plants have undergone one or more genome duplications in their evolutionary past. Results: In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Conclusion: Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants. PMID:17683536
The Complete Moss Mitochondrial Genome in the Angiosperm Amborella Is a Chimera Derived from Two Moss Whole-Genome Transfers.

PubMed

Taylor, Z Nathan; Rice, Danny W; Palmer, Jeffrey D

2015-01-01

Sequencing of the 4-Mb mitochondrial genome of the angiosperm Amborella trichopoda has shown that it contains unprecedented amounts of foreign mitochondrial DNA, including four blocks of sequences that together correspond almost perfectly to one entire moss mitochondrial genome. This implies whole-genome transfer from a single moss donor but conflicts with phylogenetic results from an earlier, PCR-based study that suggested three different moss donors to Amborella. To resolve this conflict, we conducted an expanded set of phylogenetic analyses with respect to both moss lineages and mitochondrial loci. The moss DNA in Amborella was consistently placed in either of two positions, depending on the locus analyzed, as sister to the Ptychomniales or within the Hookeriales. This agrees with two of the three previously suggested donors, whereas the third is no longer supported. These results, combined with synteny analyses and other considerations, lead us to favor a model involving two successive moss-to-Amborella whole-genome transfers, followed by recombination that produced a single intact and chimeric moss mitochondrial genome integrated in the Amborella mitochondrial genome. Eight subsequent recombination events account for the state of fragmentation, rearrangement, duplication, and deletion of this chimeric moss mitochondrial genome as it currently exists in Amborella. Five of these events are associated with short-to-intermediate sized repeats. Two of the five probably occurred by reciprocal homologous recombination, whereas the other three probably occurred in a non-reciprocal manner via microhomology-mediated break-induced replication (MMBIR). These findings reinforce and extend recent evidence for an important role of MMBIR in plant mitochondrial DNA evolution.
First complete mitochondrial genome data from ancient South American camelids - The mystery of the chilihueques from Isla Mocha (Chile)

PubMed Central

Westbury, Michael; Prost, Stefan; Seelenfreund, Andrea; Ramírez, José-Miguel; Matisoo-Smith, Elizabeth A.; Knapp, Michael

2016-01-01

In South American societies, domesticated camelids were of great cultural importance and subject to trade and translocation. South American camelids were even found on remote and hard to reach islands, emphasizing their importance to historic and pre-historic South American populations. Isla Mocha, a volcanic island 35 km offshore of Central-South Chile, is an example of such an island. When Dutch and Spanish explorers reached the island in the early 17th century, they found that domesticated camelids called “chilihueque” played a major role in the island’s society. The origin and taxonomy of these enigmatic camelids are unclear and controversial. This study aims to resolve this controversy through genetic analyses of Isla Mocha camelid remains dating from pre-Columbian to early historic times. A recent archaeological excavation of site P21-3 on Isla Mocha yielded a number of camelid remains. Three complete mitochondrial genomes were successfully recovered and analysed. Phylogenetic analyses suggest that “chilihueque” was a local term for a domesticated guanaco. Results from phylogeographic analyses are consistent with Isla Mocha camelids being sourced from Southern Chilean guanaco populations. Our data highlight the capability of ancient DNA to answer questions about extinct populations, including species identity, potential translocation events and origins of founding individuals. PMID:27929050
The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications.

PubMed

Shen, Jinhui; Cong, Qian; Grishin, Nick V

2015-09-01

Due to the intriguing morphology, lifecycle, and diversity of butterflies and moths, Lepidoptera are emerging as model organisms for the study of genetics, evolution and speciation. The progress of these studies relies on decoding Lepidoptera genomes, both nuclear and mitochondrial. Here we describe a protocol to obtain mitogenomes from Next Generation Sequencing reads performed for whole-genome sequencing and report the complete mitogenome of Papilio (Pterourus) glaucus. The circular mitogenome is 15,306 bp in length and rich in A and T. It contains 13 protein-coding genes (PCGs), 22 transfer-RNA-coding genes (tRNA), and 2 ribosomal-RNA-coding genes (rRNA), with a gene order typical for mitogenomes of Lepidoptera. We performed phylogenetic analyses based on PCG and RNA-coding genes or protein sequences using Bayesian Inference and Maximum Likelihood methods. The phylogenetic trees consistently show that among species with available mitogenomes Papilio glaucus is the closest to Papilio (Agehana) maraho from Asia.
The complete mitochondrial genome of the green lizard Lacerta viridis viridis (Reptilia: Lacertidae) and its phylogenetic position within squamate reptiles.

PubMed

Böhme, M U; Fritzsch, G; Tippmann, A; Schlegel, M; Berendonk, T U

2007-06-01

For the first time the complete mitochondrial genome was sequenced for a member of Lacertidae. Lacerta viridis viridis was sequenced in order to compare the phylogenetic relationships of this family to other reptilian lineages. Using the long-polymerase chain reaction (long PCR) we characterized a mitochondrial genome 17,156 bp long, showing a typical vertebrate pattern with 13 protein coding genes, 22 transfer RNAs (tRNA), two ribosomal RNAs (rRNA) and one major noncoding region. The noncoding region of L. v. viridis was characterized by a conspicuous 35 bp tandem repeat at its 5' terminus. A phylogenetic study including all currently available squamate mitochondrial sequences demonstrates the position of Lacertidae within a monophyletic squamate group. We obtained a close relationship of Lacertidae to Scincidae, Iguanidae, Varanidae, Anguidae, and Cordylidae. Although the internal relationships within this group yielded only a weak resolution and low bootstrap support, the revealed relationships were more congruent with morphological studies than with recent molecular analyses.
Comparative Genomics and Phylogenomics of East Asian Tulips (Amana, Liliaceae)

PubMed Central

Li, Pan; Lu, Rui-Sen; Xu, Wu-Qin; Ohi-Toma, Tetsuo; Cai, Min-Qi; Qiu, Ying-Xiong; Cameron, Kenneth M.; Fu, Cheng-Xin

2017-01-01

The genus Amana Honda (Liliaceae), when it is treated as separate from Tulipa, comprises six perennial herbaceous species that are restricted to China, Japan and the Korean Peninsula. Although all six Amana species have important medicinal and horticultural uses, studies focused on species identification and molecular phylogenetics are few. Here we report the nucleotide sequences of six complete Amana chloroplast (cp) genomes. The cp genomes of Amana range from 150,613 bp to 151,136 bp in length, all including a pair of inverted repeats (25,629–25,859 bp) separated by the large single-copy (81,482–82,218 bp) and small single-copy (17,366–17,465 bp) regions. Each cp genome equivalently contains 112 unique genes consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 78 protein coding genes. Gene content, gene order, AT content, and IR/SC boundary structure are nearly identical among all Amana cp genomes. However, the relative contraction and expansion of the IR/SC borders among the six Amana cp genomes results in length variation among them. Simple sequence repeat (SSR) analyses of these Amana cp genomes indicate that the richest SSRs are A/T mononucleotides. The number of repeats among the six Amana species varies from 54 (A. anhuiensis) to 69 (Amana kuocangshanica) with palindromic (28–35) and forward repeats (23–30) as the most common types. Phylogenomic analyses based on these complete cp genomes and 74 common protein-coding genes strongly support the monophyly of the genus, and a sister relationship between Amana and Erythronium, rather than a shared common ancestor with Tulipa. Nine DNA markers (rps15–ycf1, accD–psaI, petA–psbJ, rpl32–trnL, atpH–atpI, petD–rpoA, trnS–trnG, psbM–trnD, and ycf4–cemA) with number of variable sites greater than 0.9% were identified, and these may be useful for future population genetic and phylogeographic studies of Amana species. PMID:28421090
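The region sizes quoted in the abstract above can be cross-checked against the reported total lengths, since a quadripartite chloroplast genome consists of one large single-copy region, one small single-copy region and two inverted-repeat copies. The following Python sketch is only an illustrative consistency check, not part of the cited study; the function and variable names are hypothetical, and only the numbers come from the abstract. Because the abstract gives ranges rather than per-species values, the check can only confirm that the reported totals fall inside the bounds implied by the region ranges.

```python
# Reported ranges (bp) for the six Amana chloroplast genomes, taken from the abstract.
LSC = (81_482, 82_218)             # large single-copy region
SSC = (17_366, 17_465)             # small single-copy region
IR = (25_629, 25_859)              # one inverted-repeat copy; each genome carries two
REPORTED_TOTAL = (150_613, 151_136)


def quadripartite_bounds(lsc, ssc, ir):
    """Return the (min, max) total length implied by LSC + SSC + 2 * IR."""
    low = lsc[0] + ssc[0] + 2 * ir[0]
    high = lsc[1] + ssc[1] + 2 * ir[1]
    return low, high


bounds = quadripartite_bounds(LSC, SSC, IR)

# The reported totals should lie within the bounds implied by the region ranges.
consistent = bounds[0] <= REPORTED_TOTAL[0] and REPORTED_TOTAL[1] <= bounds[1]

# Gene content: tRNA + rRNA + protein-coding genes should give the 112 unique genes.
gene_total = 30 + 4 + 78

print(bounds, consistent, gene_total)
```

The bounds are deliberately loose: the smallest LSC, SSC and IR need not occur in the same species, so the check can flag an inconsistency but cannot confirm any individual genome length.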
Company profile: Complete Genomics Inc.

PubMed

Reid, Clifford

2011-02-01

Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.
Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library.

PubMed

Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E

2009-11-25

To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. 
In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.
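The coverage and validation figures in the abstract above follow from simple ratios, which the short Python sketch below reproduces. It is an illustration only: the abstract does not state how coverage was computed, so dividing reads by fragment ends is an assumption, and all variable names are hypothetical.

```python
# Figures quoted in the abstract.
reads = 2_000_000
unique_fragments = 150_000
genotyped = 384   # putative SNPs selected for genotyping
validated = 183   # putative SNPs that were validated
mapped = 167      # validated markers placed on the linkage map

# Each restriction fragment has two ends that can be sequenced.
fragment_ends = 2 * unique_fragments

# Assumption: coverage is approximated as reads per fragment end.
reads_per_end = reads / fragment_ends

validation_rate = validated / genotyped
mapped_share = mapped / validated

print(f"fragment ends: {fragment_ends}")
print(f"reads per fragment end: {reads_per_end:.2f}")
print(f"validation rate: {validation_rate:.1%}")
print(f"validated markers placed on the map: {mapped_share:.1%}")
```

Under this assumption the raw ratio comes out slightly above the reported six-fold average, which would be expected if some reads were filtered out or did not map to unique fragment ends; the validation rate rounds to the 48% stated in the abstract.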
1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function

PubMed Central

Gorski, Mathias; van der Most, Peter J.; Teumer, Alexander; Chu, Audrey Y.; Li, Man; Mijatovic, Vladan; Nolte, Ilja M.; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F.; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P.; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C.; Curhan, Gary C.; d’Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H.; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J.; Harris, Tamara B.; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G.; Homuth, Georg; Hu, Frank B.; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K.; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J.; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J. F.; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A.; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J.; Olden, Matthias; WJH Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P.; Probst-Hensch, Nicole; Raitakari, Olli T.; Rettig, Rainer; Ridker, Paul M.; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E.; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J.; Sedaghat, Sanaz; Smith, Albert V.; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G.; Ulivi, Sheila; Viikari, Jorma S.; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I.; Tromp, Gerard; Snieder, Harold; Heid, Iris M.; Fox, Caroline S.; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A.; Fuchsberger, Christian

2017-01-01

HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10⁻⁸ previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples. PMID:28452372
Fungal genome sequencing: basic biology to biotechnology.

PubMed

Sharma, Krishna Kant

2016-08-01

The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of the genomes of the fission yeast Schizosaccharomyces pombe and the genetic model Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics has high potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.
An Exploration into Fern Genome Space.

PubMed

Wolf, Paul G; Sessa, Emily B; Marchant, Daniel Blaine; Li, Fay-Wei; Rothfels, Carl J; Sigel, Erin M; Gitzendanner, Matthew A; Visger, Clayton J; Banks, Jo Ann; Soltis, Douglas E; Soltis, Pamela S; Pryer, Kathleen M; Der, Joshua P

2015-08-26

Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genome analysis of medicinal Ganoderma spp. with plant-pathogenic and saprotrophic life-styles.

PubMed

Kües, Ursula; Nelson, David R; Liu, Chang; Yu, Guo-Jun; Zhang, Jianhui; Li, Jianqin; Wang, Xin-Cun; Sun, Hui

2015-06-01

Ganoderma is a fungal genus belonging to the Ganodermataceae family and Polyporales order. Plant-pathogenic species in this genus can cause severe diseases (stem, butt, and root rot) in economically important trees and perennial crops, especially in tropical countries. Ganoderma species are white rot fungi and have ecological importance in the breakdown of woody plants for nutrient mobilization. They possess effective machineries of lignocellulose-decomposing enzymes useful for bioenergy production and bioremediation. In addition, the genus contains many important species that produce pharmacologically active compounds used in health food and medicine. With the rapid adoption of next-generation DNA sequencing technologies, whole genome sequencing and systematic transcriptome analyses become affordable approaches to identify an organism's genes. In the last few years, numerous projects have been initiated to identify the genetic contents of several Ganoderma species, particularly in different strains of Ganoderma lucidum. In November 2013, eleven whole genome sequencing projects for Ganoderma species were registered in international databases, three of which were already completed with genomes being assembled to high quality. In addition to the nuclear genome, two mitochondrial genomes for Ganoderma species have also been reported. Complementing genome analysis, four transcriptome studies on various developmental stages of Ganoderma species have been performed. Information obtained from these studies has laid the foundation for the identification of genes involved in biological pathways that are critical for understanding the biology of Ganoderma, such as the mechanism of pathogenesis, the biosynthesis of active components, life cycle and cellular development, etc. 
With abundant genetic information becoming available, a few centralized resources have been established to disseminate the knowledge and integrate relevant data to support comparative genomic analyses of Ganoderma species. The current review carries out a detailed comparison of the nuclear genomes, mitochondrial genomes and transcriptomes from several Ganoderma species. Genes involved in biosynthetic pathways such as CYP450 genes and in cellular development such as matA and matB genes are characterized and compared in detail, as examples to demonstrate the usefulness of comparative genomic analyses for the identification of critical genes. Resources needed for future data integration and exploitation are also discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.
COGNATE: comparative gene annotation characterizer.

PubMed

Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

2017-07-17

The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. 
COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https://github.com/ZFMK/COGNATE ). The tool COGNATE allows comparing genome assemblies and structural elements on multiple levels (e.g., scaffold or contig sequence, gene). It clearly enhances comparability between analyses. Thus, COGNATE can provide the important standardization of both genome and gene structure parameter disclosure as well as data acquisition for future comparative analyses. With the establishment of comprehensive descriptive standards and the extensive availability of genomes, an encompassing database will become possible.
Complete Chloroplast Genome Sequences of Four Meliaceae Species and Comparative Analyses

PubMed Central

Mader, Malte; Pakull, Birte; Blanc-Jolivet, Céline; Paulini-Drewes, Maike; Bouda, Zoéwindé Henri-Noël; Degen, Bernd; Small, Ian

2018-01-01

The Meliaceae family mainly consists of trees and shrubs with a pantropical distribution. In this study, the complete chloroplast genomes of four Meliaceae species were sequenced and compared with each other and with the previously published Azadirachta indica plastome. The five plastomes are circular and exhibit a quadripartite structure with high conservation of gene content and order. They include 130 genes encoding 85 proteins, 37 tRNAs and 8 rRNAs. Inverted repeat expansion resulted in a duplication of rps19 in the five Meliaceae species, which is consistent with that in many other Sapindales, but different from many other rosids. Compared to Azadirachta indica, the four newly sequenced Meliaceae individuals share several large deletions, which mainly contribute to the decreased genome sizes. A whole-plastome phylogeny supports previous findings that the four species form a monophyletic sister clade to Azadirachta indica within the Meliaceae. SNPs and indels identified in all complete Meliaceae plastomes might be suitable targets for the future development of genetic markers at different taxonomic levels. The extended analysis of SNPs in the matK gene led to the identification of four potential Meliaceae-specific SNPs as a basis for future validation and marker development. PMID:29494509
Mitochondrial comparative genomics and phylogenetic signal assessment of mtDNA among arbuscular mycorrhizal fungi.

PubMed

Nadimi, Maryam; Daubois, Laurence; Hijri, Mohamed

2016-05-01

Mitochondrial (mt) genes, such as cytochrome C oxidase genes (cox), have been widely used for barcoding in many groups of organisms, although this approach has been less powerful in the fungal kingdom due to the rapid evolution of their mt genomes. The use of mt genes in phylogenetic studies of Dikarya has been met with success, while early diverging fungal lineages remain less studied, particularly the arbuscular mycorrhizal fungi (AMF). Advances in next-generation sequencing have substantially increased the number of publicly available mtDNA sequences for the Glomeromycota. As a result, comparison of mtDNA across key AMF taxa can now be applied to assess the phylogenetic signal of individual mt coding genes, as well as concatenated subsets of coding genes. Here we show comparative analyses of publicly available mt genomes of Glomeromycota, augmented with two mtDNA genomes that were newly sequenced for this study (Rhizophagus irregularis DAOM240159 and Glomus aggregatum DAOM240163), resulting in 16 complete mtDNA datasets. R. irregularis isolate DAOM240159 and G. aggregatum isolate DAOM240163 showed mt genomes measuring 72,293 bp and 69,505 bp with G+C contents of 37.1% and 37.3%, respectively. We assessed the phylogenies inferred from single mt genes and complete sets of coding genes, which are referred to as "supergenes" (16 concatenated coding genes), using Shimodaira-Hasegawa tests, in order to identify genes that best described AMF phylogeny. We found that rnl, nad5, cox1, and nad2 genes, as well as a concatenated subset of these genes, provided phylogenies that were similar to the supergene set. This mitochondrial genomic analysis was also combined with principal coordinate and partitioning analyses, which helped to unravel certain evolutionary relationships in the Rhizophagus genus and for G. aggregatum within the Glomeromycota. We showed evidence to support the position of G. aggregatum within the R. irregularis 'species complex'. 
Copyright © 2016 Elsevier Inc. All rights reserved.
The Complete Genome Sequence and Analysis of the Epsilonproteobacterium Arcobacter butzleri

PubMed Central

Miller, William G.; Parker, Craig T.; Rubenfield, Marc; Mendz, George L.; Wösten, Marc M. S. M.; Ussery, David W.; Stolz, John F.; Binnewies, Tim T.; Hallin, Peter F.; Wang, Guilin; Malek, Joel A.; Rogosin, Andrea; Stanker, Larry H.; Mandrell, Robert E.

2007-01-01

Background: Arcobacter butzleri is a member of the epsilon subdivision of the Proteobacteria and a close taxonomic relative of established pathogens, such as Campylobacter jejuni and Helicobacter pylori. Here we present the complete genome sequence of the human clinical isolate, A. butzleri strain RM4018. Methodology/Principal Findings: Arcobacter butzleri is a member of the Campylobacteraceae, but the majority of its proteome is most similar to those of Sulfuromonas denitrificans and Wolinella succinogenes, both members of the Helicobacteraceae, and those of the deep-sea vent Epsilonproteobacteria Sulfurovum and Nitratiruptor. In addition, many of the genes and pathways described here, e.g. those involved in signal transduction and sulfur metabolism, have been identified previously within the epsilon subdivision only in S. denitrificans, W. succinogenes, Sulfurovum, and/or Nitratiruptor, or are unique to the subdivision. The analyses also indicated that a substantial proportion of the A. butzleri genome is devoted to growth and survival under diverse environmental conditions, with a large number of respiration-associated proteins, signal transduction and chemotaxis proteins and proteins involved in DNA repair and adaptation. To investigate the genomic diversity of A. butzleri strains, we constructed an A. butzleri DNA microarray comprising 2238 genes from strain RM4018. Comparative genomic indexing analysis of 12 additional A. butzleri strains identified both the core genes of A. butzleri and intraspecies hypervariable regions, where <70% of the genes were present in at least two strains. Conclusion/Significance: The presence of pathways and loci often associated with non-host-associated organisms, as well as genes associated with virulence, suggests that A. butzleri is a free-living, water-borne organism that might rightfully be classified as an emerging pathogen. 
The genome sequence and analyses presented in this study are an important first step in understanding the physiology and genetics of this organism, which constitutes a bridge between the environment and mammalian hosts. PMID:18159241
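The comparative genomic indexing step described above amounts to asking which genes are detected in every strain. A minimal sketch of that idea follows, assuming the microarray results have already been reduced to a set of present genes per strain. The strain labels other than RM4018 and all gene identifiers are hypothetical, and the per-gene frequency is a simplification that does not reproduce the region-level definition of hypervariability used in the abstract.

```python
from typing import Dict, Set


def core_genes(presence: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes detected in every strain (the core set)."""
    gene_sets = list(presence.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def gene_frequency(presence: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, for each gene, the fraction of strains in which it was detected."""
    n_strains = len(presence)
    counts: Dict[str, int] = {}
    for genes in presence.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: count / n_strains for gene, count in counts.items()}


# Hypothetical presence calls; only the strain name RM4018 comes from the abstract.
presence = {
    "RM4018": {"g1", "g2", "g3", "g4"},
    "strainA": {"g1", "g2", "g4"},
    "strainB": {"g1", "g3", "g4"},
}
core = core_genes(presence)
frequency = gene_frequency(presence)
```

Genes with a frequency below 1.0 are the accessory candidates; grouping them by chromosomal position would be the next step toward locating variable regions.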
Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes.

PubMed

Gao, Lei; Yi, Xuan; Yang, Yong-Xia; Su, Ying-Juan; Wang, Ting

2009-06-11

Ferns have generally been neglected in studies of chloroplast genomics. Before this study, only one polypod and two basal ferns had their complete chloroplast (cp) genome reported. Tree ferns represent an ancient fern lineage that first occurred in the Late Triassic. In recent phylogenetic analyses, tree ferns were shown to be the sister group of polypods, the most diverse group of living ferns. Availability of cp genome sequence from a tree fern will facilitate interpretation of the evolutionary changes of fern cp genomes. Here we have sequenced the complete cp genome of a scaly tree fern Alsophila spinulosa (Cyatheaceae). The Alsophila cp genome is 156,661 base pairs (bp) in size, and has a typical quadripartite structure with the large (LSC, 86,308 bp) and small single copy (SSC, 21,623 bp) regions separated by two copies of an inverted repeat (IRs, 24,365 bp each). This genome contains 117 different genes encoding 85 proteins, 4 rRNAs and 28 tRNAs. Pseudogenes of ycf66 and trnT-UGU are also detected in this genome. A unique trnR-UCG gene (derived from trnR-CCG) is found between rbcL and accD. The Alsophila cp genome shares some unusual characteristics with the previously sequenced cp genome of the polypod fern Adiantum capillus-veneris, including the absence of 5 tRNA genes that exist in most other cp genomes. The genome shows a high degree of synteny with that of Adiantum, but differs considerably from two basal ferns (Angiopteris evecta and Psilotum nudum). At one endpoint of an ancient inversion we detected a highly repeated 565-bp-region that is absent from the Adiantum cp genome. An additional minor inversion of the trnD-GUC, which is possibly shared by all ferns, was identified by comparison between the fern and other land plant cp genomes. By comparing four fern cp genome sequences it was confirmed that two major rearrangements distinguish higher leptosporangiate ferns from basal fern lineages. 
The Alsophila cp genome is very similar to that of the polypod fern Adiantum in terms of gene content, gene order and GC content. However, there exist some striking differences between them: the trnR-UCG gene represents a putative molecular apomorphy of tree ferns; and the repeats observed at one inversion endpoint may be a vestige of some unknown rearrangement(s). This work provided fresh insights into the fern cp genome evolution as well as useful data for future phylogenetic studies.
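The quadripartite structure reported above can be checked arithmetically: the total length should equal the large single copy region plus the small single copy region plus two copies of the inverted repeat. The short sketch below does that check with the figures from the abstract; the function name is an illustrative choice, not part of any published tool.

```python
def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported for the Alsophila spinulosa cp genome.
alsophila_total = plastome_length(lsc=86_308, ssc=21_623, ir=24_365)

# Gene classes as reported: 85 protein-coding, 4 rRNA, 28 tRNA.
alsophila_genes = 85 + 4 + 28

print(alsophila_total)  # 156661, matching the reported genome size
print(alsophila_genes)  # 117, matching the reported number of different genes
```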
Sybil--efficient constraint-based modelling in R.

PubMed

Gelius-Dietrich, Gabriel; Desouki, Abdelmoneim Amer; Fritzemeier, Claus Jonathan; Lercher, Martin J

2013-11-13

Constraint-based analyses of metabolic networks are widely used to simulate the properties of genome-scale metabolic networks. Publicly available implementations tend to be slow, impeding large scale analyses such as the genome-wide computation of pairwise gene knock-outs, or the automated search for model improvements. Furthermore, available implementations cannot easily be extended or adapted by users. Here, we present sybil, an open source software library for constraint-based analyses in R; R is a free, platform-independent environment for statistical computing and graphics that is widely used in bioinformatics. Among other functions, sybil currently provides efficient methods for flux-balance analysis (FBA), MOMA, and ROOM that are about ten times faster than previous implementations when calculating the effect of whole-genome single gene deletions in silico on a complete E. coli metabolic model. Due to the object-oriented architecture of sybil, users can easily build analysis pipelines in R or even implement their own constraint-based algorithms. Based on its highly efficient communication with different mathematical optimisation programs, sybil facilitates the exploration of high-dimensional optimisation problems on small time scales. Sybil and all its dependencies are open source. Sybil and its documentation are available for download from the comprehensive R archive network (CRAN).
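To illustrate what an in silico single gene deletion screen computes, here is a deliberately tiny sketch in Python rather than R. It does not use sybil or its API. The network is hypothetical: one uptake reaction feeds metabolite A, two parallel reactions (each controlled by one gene) convert A to B, and B drains into biomass. For this specific topology the flux-balance optimum has a closed form, so no linear programming solver is needed; genome-scale models do require a solver.

```python
from typing import Dict, FrozenSet


def max_biomass(uptake_cap: float, route_caps: Dict[str, float],
                deleted: FrozenSet[str] = frozenset()) -> float:
    """Optimal biomass flux for the toy network at steady state.

    Steady state forces uptake = sum of route fluxes = biomass flux, so the
    optimum is the smaller of the uptake bound and the total capacity of the
    routes whose genes are still present.
    """
    usable = sum(cap for gene, cap in route_caps.items() if gene not in deleted)
    return min(uptake_cap, usable)


def single_gene_deletions(uptake_cap: float,
                          route_caps: Dict[str, float]) -> Dict[str, float]:
    """Optimal biomass flux after deleting each gene in turn."""
    return {gene: max_biomass(uptake_cap, route_caps, frozenset({gene}))
            for gene in route_caps}


# Hypothetical flux bounds (arbitrary units); gene names are placeholders.
routes = {"g1": 6, "g2": 3}
wild_type = max_biomass(10, routes)
knockouts = single_gene_deletions(10, routes)
```

In a real screen the same loop runs over every gene in the model, which is why per-optimisation speed matters for genome-wide and pairwise knock-out studies.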
Evolutionary and biotechnology implications of plastid genome variation in the inverted-repeat-lacking clade of legumes.

PubMed

Sabir, Jamal; Schwarz, Erika; Ellison, Nicholas; Zhang, Jin; Baeshen, Nabih A; Mutwakil, Muhammed; Jansen, Robert; Ruhlman, Tracey

2014-08-01

Land plant plastid genomes (plastomes) provide a tractable model for evolutionary study in that they are relatively compact and gene dense. Among the groups that display an appropriate level of variation for structural features, the inverted-repeat-lacking clade (IRLC) of papilionoid legumes presents the potential to advance general understanding of the mechanisms of genomic evolution. Here, six complete plastome sequences are presented from economically important species of the IRLC, a lineage previously represented by only five completed plastomes. A number of characters are compared across the IRLC, including gene retention and divergence, synteny, repeat structure and functional gene transfer to the nucleus. The loss of clpP intron 2 was identified in one newly sequenced member of the IRLC, Glycyrrhiza glabra. Deeply sequenced nuclear transcriptomes from two species helped clarify the nature of the functional transfer of accD to the nucleus in Trifolium, which likely occurred in the lineage leading to subgenus Trifolium. Legumes are second only to cereal crops in agricultural importance based on area harvested and total production. Genetic improvement via plastid transformation of IRLC crop species is an appealing proposition. Comparative analyses of intergenic spacer regions emphasize the need for complete genome sequences for developing transformation vectors for plastid genetic engineering of legume crops. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
Bluejay 1.0: genome browsing and comparison with rich customization provision and dynamic resource linking

PubMed Central

Soh, Jung; Gordon, Paul MK; Taschuk, Morgan L; Dong, Anguo; Ah-Seng, Andrew C; Turinsky, Andrei L; Sensen, Christoph W

2008-01-01

Background The Bluejay genome browser has been developed over several years to address the challenges posed by the ever increasing number of data types as well as the increasing volume of data in genome research. Beginning with a browser capable of rendering views of XML-based genomic information and providing scalable vector graphics output, we have now completed version 1.0 of the system with many additional features. Our development efforts were guided by our observation that biologists who use both gene expression profiling and comparative genomics gain functional insights above and beyond those provided by traditional per-gene analyses. Results Bluejay 1.0 is a genome viewer integrating genome annotation with: (i) gene expression information; and (ii) comparative analysis with an unlimited number of other genomes in the same view. This allows the biologist to see a gene not just in the context of its genome, but also its regulation and its evolution. Bluejay now has rich provision for personalization by users: (i) numerous display customization features; (ii) the availability of waypoints for marking multiple points of interest on a genome and subsequently utilizing them; and (iii) the ability to take user relevance feedback of annotated genes or textual items to offer personalized recommendations. Bluejay 1.0 also embeds the Seahawk browser for the Moby protocol, enabling users to seamlessly invoke hundreds of Web Services on genomic data of interest without any hard-coding. Conclusion Bluejay offers a unique set of customizable genome-browsing features, with the goal of allowing biologists to quickly focus on, analyze, compare, and retrieve related information on the parts of the genomic data they are most interested in. We expect these capabilities of Bluejay to benefit the many biologists who want to answer complex questions using the information available from completely sequenced genomes. PMID:18940007
Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

PubMed

Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

2010-12-24

We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
'Big data', Hadoop and cloud computing in genomics.

PubMed

O'Driscoll, Aisling; Daugelaite, Jurate; Sleator, Roy D

2013-10-01

Since the completion of the Human Genome Project at the turn of the century, there has been an unprecedented proliferation of genomic sequence data. A consequence of this is that the medical discoveries of the future will largely depend on our ability to process and analyse large genomic data sets, which continue to expand as the cost of sequencing decreases. Herein, we provide an overview of cloud computing and big data technologies, and discuss how such expertise can be used to deal with biology's big data sets. In particular, big data technologies such as the Apache Hadoop project, which provides distributed and parallelised data processing and analysis of petabyte (PB) scale data sets, will be discussed, together with an overview of the current usage of Hadoop within the bioinformatics community. Copyright © 2013 Elsevier Inc. All rights reserved.
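The processing model behind Hadoop is MapReduce: a map step emits key/value pairs from each input record independently, a shuffle groups the pairs by key, and a reduce step combines the values for each key. The sketch below imitates those three phases in a single Python process for k-mer counting, a common genomics task. It does not call any Hadoop API; the function names and the two reads are illustrative assumptions.

```python
from collections import defaultdict
from itertools import chain
from typing import Dict, Iterable, List, Tuple


def map_kmers(read: str, k: int) -> List[Tuple[str, int]]:
    """Map step: emit (k-mer, 1) for every k-mer in a single read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


# Two made-up reads; on a cluster each read could be mapped on a different node.
reads = ["ACGTAC", "GTACGT"]
emitted = chain.from_iterable(map_kmers(read, 3) for read in reads)
kmer_counts = reduce_counts(shuffle(emitted))
```

Because each read is mapped independently and each key is reduced independently, both phases can be spread across many machines, which is the property that makes the model suitable for very large sequence collections.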
Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

PubMed Central

Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

2012-01-01

Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. 
Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference genome is yet available to allow such detailed characterization. PMID:22984541
Comparative functional pan-genome analyses to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon metabolism in the genus Mycobacterium.

PubMed

Kweon, Ohgew; Kim, Seong-Jae; Blom, Jochen; Kim, Sung-Kwan; Kim, Bong-Soo; Baek, Dong-Heon; Park, Su Inn; Sutherland, John B; Cerniglia, Carl E

2015-02-14

The bacterial genus Mycobacterium is of great interest in the medical and biotechnological fields. Despite a flood of genome sequencing and functional genomics data, significant gaps in knowledge between genome and phenome seriously hinder efforts toward the treatment of mycobacterial diseases and practical biotechnological applications. In this study, we propose the use of systematic, comparative functional pan-genomic analysis to build connections between genomic dynamics and phenotypic evolution in polycyclic aromatic hydrocarbon (PAH) metabolism in the genus Mycobacterium. Phylogenetic, phenotypic, and genomic information for 27 completely genome-sequenced mycobacteria was systematically integrated to reconstruct a mycobacterial phenotype network (MPN) with a pan-genomic concept at a network level. In the MPN, mycobacterial phenotypes show typical scale-free relationships. PAH degradation is an isolated phenotype with the lowest connection degree, consistent with phylogenetic and environmental isolation of PAH degraders. A series of functional pan-genomic analyses provide conserved and unique types of genomic evidence for strong epistatic and pleiotropic impacts on evolutionary trajectories of the PAH-degrading phenotype. Under strong natural selection, the detailed gene gain/loss patterns from horizontal gene transfer (HGT)/deletion events hypothesize a plausible evolutionary path, an epistasis-based birth and pleiotropy-dependent death, for PAH metabolism in the genus Mycobacterium. This study generated a practical mycobacterial compendium of phenotypic and genomic changes, focusing on the PAH-degrading phenotype, with a pan-genomic perspective of the evolutionary events and the environmental challenges. 
Our findings suggest that when selection acts on PAH metabolism, only a small fraction of possible trajectories is likely to be observed, owing mainly to a combination of the ambiguous phenotypic effects of PAHs and the corresponding pleiotropy- and epistasis-dependent evolutionary adaptation. Evolutionary constraints on the selection of trajectories, like those seen in PAH-degrading phenotypes, are likely to apply to the evolution of other phenotypes in the genus Mycobacterium.
Complete mitochondrial genome of a Pleistocene jawbone unveils the origin of polar bear.

PubMed

Lindqvist, Charlotte; Schuster, Stephan C; Sun, Yazhou; Talbot, Sandra L; Qi, Ji; Ratan, Aakrosh; Tomsho, Lynn P; Kasson, Lindsay; Zeyl, Eve; Aars, Jon; Miller, Webb; Ingólfsson, Olafur; Bachmann, Lutz; Wiig, Oystein

2010-03-16

The polar bear has become the flagship species in the climate-change discussion. However, little is known about how past climate impacted its evolution and persistence, given an extremely poor fossil record. Although it is undisputed from analyses of mitochondrial (mt) DNA that polar bears constitute a lineage within the genetic diversity of brown bears, timing estimates of their divergence have differed considerably. Using next-generation sequencing technology, we have generated a complete, high-quality mt genome from a stratigraphically validated 130,000- to 110,000-year-old polar bear jawbone. In addition, six mt genomes were generated of extant polar bears from Alaska and brown bears from the Admiralty and Baranof islands of the Alexander Archipelago of southeastern Alaska and Kodiak Island. We show that the phylogenetic position of the ancient polar bear lies almost directly at the branching point between polar bears and brown bears, elucidating a unique morphologically and molecularly documented fossil link between living mammal species. Molecular dating and stable isotope analyses also show that by very early in their evolutionary history, polar bears were already inhabitants of the Arctic sea ice and had adapted very rapidly to their current and unique ecology at the top of the Arctic marine food chain. As such, polar bears provide an excellent example of evolutionary opportunism within a widespread mammalian lineage.
Complete mitochondrial genome of a Pleistocene jawbone unveils the origin of polar bear

PubMed Central

Lindqvist, Charlotte; Schuster, Stephan C.; Sun, Yazhou; Talbot, Sandra L.; Qi, Ji; Ratan, Aakrosh; Tomsho, Lynn P.; Kasson, Lindsay; Zeyl, Eve; Aars, Jon; Miller, Webb; Ingólfsson, Ólafur; Bachmann, Lutz; Wiig, Øystein

2010-01-01

The polar bear has become the flagship species in the climate-change discussion. However, little is known about how past climate impacted its evolution and persistence, given an extremely poor fossil record. Although it is undisputed from analyses of mitochondrial (mt) DNA that polar bears constitute a lineage within the genetic diversity of brown bears, timing estimates of their divergence have differed considerably. Using next-generation sequencing technology, we have generated a complete, high-quality mt genome from a stratigraphically validated 130,000- to 110,000-year-old polar bear jawbone. In addition, six mt genomes were generated of extant polar bears from Alaska and brown bears from the Admiralty and Baranof islands of the Alexander Archipelago of southeastern Alaska and Kodiak Island. We show that the phylogenetic position of the ancient polar bear lies almost directly at the branching point between polar bears and brown bears, elucidating a unique morphologically and molecularly documented fossil link between living mammal species. Molecular dating and stable isotope analyses also show that by very early in their evolutionary history, polar bears were already inhabitants of the Arctic sea ice and had adapted very rapidly to their current and unique ecology at the top of the Arctic marine food chain. As such, polar bears provide an excellent example of evolutionary opportunism within a widespread mammalian lineage. PMID:20194737
Complete mitochondrial genomes of eleven extinct or possibly extinct bird species.

PubMed

Anmarkrud, Jarl A; Lifjeld, Jan T

2017-03-01

Natural history museum collections represent a vast source of ancient and historical DNA samples from extinct taxa that can be utilized by high-throughput sequencing tools to reveal novel genetic and phylogenetic information about them. Here, we report on the successful sequencing of complete mitochondrial genome sequences (mitogenomes) from eleven extinct bird species, using de novo assembly of short sequences derived from toepad samples of degraded DNA from museum specimens. For two species (the Passenger Pigeon Ectopistes migratorius and the South Island Piopio Turnagra capensis), whole mitogenomes were already available from recent studies, whereas for five others (the Great Auk Pinguinus impennis, the Imperial Woodpecker Campephilus imperialis, the Huia Heteralocha acutirostris, the Kauai Oo Moho braccatus and the South Island Kokako Callaeas cinereus), there were partial mitochondrial sequences available for comparison. For all seven species, we found sequence similarities of >98%. For the remaining four species (the Kamao Myadestes myadestinus, the Paradise Parrot Psephotellus pulcherrimus, the Ou Psittirostra psittacea and the Lesser Akialoa Akialoa obscura), there was no sequence information available for comparison, so we conducted BLAST searches and phylogenetic analyses to determine their phylogenetic positions and identify their closest extant relatives. These mitogenomes will be valuable for future analyses of avian phylogenetics and illustrate the importance of museum collections as repositories for genomics resources. © 2016 John Wiley & Sons Ltd.
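A similarity figure such as ">98%" is typically a percent identity computed over an alignment of the new sequence against a published one. The sketch below shows one common convention, assuming the two sequences are already aligned to equal length with "-" marking gaps and that gap columns are excluded from the comparison. The abstract does not say which convention the authors used, and the example sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical bases over aligned columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == "-" or base_b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Made-up 10-base fragments differing at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```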
Complete mitochondrial genome of a Pleistocene jawbone unveils the origin of polar bear

USGS Publications Warehouse

Lindqvist, Charlotte; Schuster, Stephan C.; Sun, Yazhou; Talbot, Sandra L.; Qi, Ji; Ratan, Aakrosh; Tomsho, Lynn P.; Kasson, Lindsay; Zeyl, Eve; Aars, Jon; Miller, Webb; Ingólfsson, Ólafur; Bachmann, Lutz; Wiig, Øystein

2010-01-01

The polar bear has become the flagship species in the climate-change discussion. However, little is known about how past climate impacted its evolution and persistence, given an extremely poor fossil record. Although it is undisputed from analyses of mitochondrial (mt) DNA that polar bears constitute a lineage within the genetic diversity of brown bears, timing estimates of their divergence have differed considerably. Using next-generation sequencing technology, we have generated a complete, high-quality mt genome from a stratigraphically validated 130,000- to 110,000-year-old polar bear jawbone. In addition, six mt genomes were generated of extant polar bears from Alaska and brown bears from the Admiralty and Baranof islands of the Alexander Archipelago of southeastern Alaska and Kodiak Island. We show that the phylogenetic position of the ancient polar bear lies almost directly at the branching point between polar bears and brown bears, elucidating a unique morphologically and molecularly documented fossil link between living mammal species. Molecular dating and stable isotope analyses also show that by very early in their evolutionary history, polar bears were already inhabitants of the Arctic sea ice and had adapted very rapidly to their current and unique ecology at the top of the Arctic marine food chain. As such, polar bears provide an excellent example of evolutionary opportunism within a widespread mammalian lineage.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.

PubMed

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-03-01

Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
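For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the sketch below computes the optimal local alignment score by dynamic programming. It is a simplified illustration, not the ProteinWorldDB implementation: it uses a flat match/mismatch score and a linear gap penalty (all three values are assumptions), whereas protein comparisons normally use a substitution matrix and affine gap costs.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (score only, no traceback)."""
    previous = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                                # start a new local alignment
                previous[j - 1] + substitution,   # align a[i-1] with b[j-1]
                previous[j] + gap,                # gap in b
                current[j - 1] + gap,             # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


# The shared segment "ACG" gives three matches at +2 each.
score = smith_waterman_score("TTACGTTT", "GGACGGG")
```

The running time is proportional to the product of the two sequence lengths, which is why an exhaustive all-against-all comparison of millions of proteins needed grid-scale computing instead of the faster heuristic methods.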

The complete mitochondrial genome of Strongylus equinus (Chromadorea: Strongylidae): Comparison with other closely related species and phylogenetic analyses.

PubMed

Xu, Wen-Wen; Qiu, Jian-Hua; Liu, Guo-Hua; Zhang, Yan; Liu, Ze-Xuan; Duan, Hong; Yue, Dong-Mei; Chang, Qiao-Cheng; Wang, Chun-Ren; Zhao, Xing-Cun

2015-12-01

Roundworms of the genus Strongylus are common parasitic nematodes of the equine large intestine and cause significant economic losses to the livestock industries. Despite their importance, the genetics and epidemiology of these parasites are not entirely understood. In the present study, the complete S. equinus mitochondrial (mt) genome was determined. The S. equinus mt genome is 14,545 bp long and contains 36 genes, of which 12 code for proteins, 22 for transfer RNAs, and two for ribosomal RNAs; it lacks the atp8 gene. All 36 genes are encoded in the same direction, which is consistent with all other Chromadorea nematode mtDNAs published to date. Phylogenetic analysis based on concatenated amino acid sequence data of all 12 protein-coding genes showed that there were two large branches in the Strongyloidea nematodes, and that S. equinus is genetically closer to S. vulgaris than to Cylicocyclus insignis in Strongylidae. This new mt genome provides a source of genetic markers for the molecular phylogeny and population genetics of equine strongyles. Copyright © 2015 Elsevier Inc. All rights reserved.
Genomic and probiotic characterization of SJP-SNU strain of Pichia kudriavzevii.

PubMed

Hong, Seung-Min; Kwon, Hyuk-Joon; Park, Se-Joon; Seong, Won-Jin; Kim, Ilhwan; Kim, Jae-Hong

2018-05-17

The yeast strain SJP-SNU was investigated as a probiotic and was characterized with respect to growth temperature, bile salt resistance, hydrogen sulfide reducing activity, intestinal survival ability and chicken embryo pathogenicity. In addition, we determined the complete genomic and mitochondrial sequences of SJP-SNU and conducted comparative genomics analyses. SJP-SNU grew rapidly at 37 °C and formed colonies on MacConkey agar containing bile salt. SJP-SNU reduced hydrogen sulfide produced by Salmonella serotype Enteritidis and, after being fed to 4-week-old chickens, could be isolated from cecal feces. SJP-SNU did not cause mortality in 10-day-old chicken embryos. From 13 initial contigs, 11 were finally assembled and represented 10 chromosomal sequences and 1 mitochondrial DNA sequence. Comparative genomic analyses revealed that SJP-SNU was a strain of Pichia kudriavzevii. Although SJP-SNU possesses pathogenicity-related genes, they showed very low amino acid sequence identities to those of Candida albicans. Furthermore, SJP-SNU possessed useful genes, such as phytases and cellulase. Thus, SJP-SNU is a useful yeast possessing the basic traits of a probiotic, and further studies to demonstrate its efficacy as a probiotic in the future may be warranted.
Analysis of complete mitochondrial genomes from extinct and extant rhinoceroses reveals lack of phylogenetic resolution

PubMed Central

Willerslev, Eske; Gilbert, M Thomas P; Binladen, Jonas; Ho, Simon YW; Campos, Paula F; Ratan, Aakrosh; Tomsho, Lynn P; da Fonseca, Rute R; Sher, Andrei; Kuznetsova, Tatanya V; Nowak-Kemp, Malgosia; Roth, Terri L; Miller, Webb; Schuster, Stephan C

2009-01-01

Background The scientific literature contains many examples where DNA sequence analyses have been used to provide definitive answers to phylogenetic problems that traditional (non-DNA based) approaches alone have failed to resolve. One notable example concerns the rhinoceroses, a group for which several contradictory phylogenies were proposed on the basis of morphology, then apparently resolved using mitochondrial DNA fragments. Results In this study we report the first complete mitochondrial genome sequences of the extinct ice-age woolly rhinoceros (Coelodonta antiquitatis), and the threatened Javan (Rhinoceros sondaicus), Sumatran (Dicerorhinus sumatrensis), and black (Diceros bicornis) rhinoceroses. In combination with the previously published mitochondrial genomes of the white (Ceratotherium simum) and Indian (Rhinoceros unicornis) rhinoceroses, this data set putatively enables reconstruction of the rhinoceros phylogeny. While the six species cluster into three strongly supported sister-pairings: (i) The black/white, (ii) the woolly/Sumatran, and (iii) the Javan/Indian, resolution of the higher-level relationships has no statistical support. The phylogenetic signal from individual genes is highly diffuse, with mixed topological support from different genes. Furthermore, the choice of outgroup (horse vs tapir) has considerable effect on reconstruction of the phylogeny. The lack of resolution is suggestive of a hard polytomy at the base of crown-group Rhinocerotidae, and this is supported by an investigation of the relative branch lengths. Conclusion Satisfactory resolution of the rhinoceros phylogeny may not be achievable without additional analyses of substantial amounts of nuclear DNA. This study provides a compelling demonstration that, in spite of substantial sequence length, there are significant limitations with single-locus phylogenetics. 
We expect further examples of this to appear as next-generation, large-scale sequencing of complete mitochondrial genomes becomes commonplace in evolutionary studies. "The human factor in classification is nowhere more evident than in dealing with this superfamily (Rhinocerotoidea)." G. G. Simpson (1945) PMID:19432984
Drafting human ancestry: what does the Neanderthal genome tell us about hominid evolution? Commentary on Green et al. (2010).

PubMed

Hofreiter, Michael

2011-02-01

Ten years after the first draft versions of the human genome were announced, technical progress in both DNA sequencing and ancient DNA analyses has allowed a research team around Ed Green and Svante Pääbo to complete this task from infinitely more difficult hominid samples: a few pieces of bone originating from our closest, albeit extinct, relatives, the Neanderthals. Pulling the Neanderthal sequences out of a sea of contaminating environmental DNA impregnating the bones and at the same time avoiding the problems of contamination with modern human DNA is in itself a remarkable accomplishment. However, the crucial question in the long run is, what can we learn from such genomic data about hominid evolution?
The complete chloroplast genome sequence of Epipremnum aureum and its comparative analysis among eight Araceae species

PubMed Central

Han, Limin; Chen, Chen; Wang, Zhezhi

2018-01-01

Epipremnum aureum is an important foliage plant in the Araceae family. In this study, we have sequenced the complete chloroplast genome of E. aureum by using the Illumina HiSeq sequencing platform. This genome is a double-stranded circular DNA sequence of 164,831 bp that contains 35.8% GC. The two inverted repeats (IRa and IRb; 26,606 bp each) are separated by a small single-copy region (22,868 bp) and a large single-copy region (88,751 bp). The chloroplast genome has 131 (113 unique) functional genes, including 86 (79 unique) protein-coding genes, 37 (30 unique) tRNA genes, and eight (four unique) rRNA genes. Tandem repeats comprise the majority of the 43 long repetitive sequences. In addition, 111 simple sequence repeats are present, with mononucleotides being the most common type and di- and tetranucleotides being infrequent. Positive selection pressure on rps12 in the E. aureum chloroplast has been demonstrated via analyses of synonymous and nonsynonymous substitution rates and of sites under selection pressure. Ycf15 and infA are pseudogenes in this species. We constructed a Maximum Likelihood phylogenetic tree based on the complete chloroplast genomes of 38 species from 13 families. The results strongly indicated that E. aureum is positioned as the sister of Colocasia esculenta within the Araceae family. This work may provide information for further study of the molecular phylogenetic relationships within Araceae, as well as for developing molecular markers and breeding novel varieties by chloroplast genetic transformation, of E. aureum in particular. PMID:29529038
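As a quick consistency check on the figures in this abstract, the lengths of the large single-copy region, the small single-copy region and the two inverted repeats should add up to the reported genome length. The Python sketch below uses only the numbers quoted in the abstract; the function and variable names are illustrative and not taken from the study.

```python
# Region lengths in base pairs, as quoted in the abstract.
LSC_BP = 88_751          # large single-copy region
SSC_BP = 22_868          # small single-copy region
IR_BP = 26_606           # one inverted repeat (IRa or IRb)
REPORTED_TOTAL_BP = 164_831


def quadripartite_length(lsc_bp: int, ssc_bp: int, ir_bp: int) -> int:
    """Total length of a circular plastome with two inverted-repeat copies."""
    return lsc_bp + ssc_bp + 2 * ir_bp


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
ir_fraction = 2 * IR_BP / total_bp  # share of the genome occupied by both IR copies

print(f"Computed total: {total_bp:,} bp (reported: {REPORTED_TOTAL_BP:,} bp)")
print(f"Inverted repeats cover {ir_fraction:.1%} of the genome")
```

The computed total matches the reported 164,831 bp, which confirms that the 26,606 bp figure refers to each inverted repeat rather than to both combined.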
The complete mitochondrial genome of the Tibetan fox (Vulpes ferrilata) and implications for the phylogeny of Canidae.

PubMed

Zhao, Chao; Zhang, Honghai; Liu, Guangshuai; Yang, Xiufeng; Zhang, Jin

2016-02-01

Canidae is a family of carnivores comprising about 36 extant species that have been divided into three distinct monophyletic groups based on multi-gene data sets. The Tibetan fox (Vulpes ferrilata) is a member of the family Canidae that is endemic to the Tibetan Plateau and has seldom been the focus of phylogenetic analyses. To clarify the phylogenetic relationship between V. ferrilata and other canids, we sequenced its mitochondrial genome and made a first attempt to determine the relative phylogenetic position of V. ferrilata within canids using complete mitochondrial genome data. The mitochondrial genome of the Tibetan fox was 16,667 bp, including 37 genes (13 protein-coding genes, 2 rRNA, and 22 tRNA) and a control region. A comparative analysis of the sequenced canid data indicated that they shared a similar arrangement, codon usage, and other features. A phylogenetic analysis based on the nearly complete mtDNA genomes of canids agreed with three monophyletic clades, and the Tibetan fox was highly supported as a sister group of the corsac fox within Vulpes. The estimation of the divergence time suggested a recent split between the Tibetan fox and the corsac fox and rapid evolution in canids. There was no genetic evidence in mtDNA for positive selection related to high-altitude adaptation in the Tibetan fox, and future studies should pay more attention to detecting signals of positive selection in nuclear genes involved in energy and oxygen metabolism. Copyright © 2015 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Kullback Leibler divergence in complete bacterial and phage genomes

PubMed Central

Akhter, Sajia; Kashef, Mona T.; Ibrahim, Eslam S.; Bailey, Barbara

2017-01-01

The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback–Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses. PMID:29204318
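The metric described above can be illustrated with the standard definition of Kullback-Leibler divergence, D(P || Q) = sum over i of p_i * log(p_i / q_i), where P is one genome's amino acid frequency profile and Q is the mean profile. The Python sketch below is a minimal illustration under stated assumptions: the base-2 logarithm, the unweighted mean across genomes, and the toy protein sequences are all choices made here, not details taken from the study.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_frequencies(proteins):
    """Relative frequency of each standard amino acid across a set of proteins."""
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(profiles):
    """Unweighted mean of several frequency profiles (an assumption of this sketch)."""
    n = len(profiles)
    return {aa: sum(p[aa] for p in profiles) / n for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """D(P || Q) in bits; terms with p_i == 0 contribute nothing."""
    result = 0.0
    for aa in AMINO_ACIDS:
        if p[aa] > 0:
            if q[aa] == 0:
                raise ValueError("q must be non-zero wherever p is non-zero")
            result += p[aa] * math.log2(p[aa] / q[aa])
    return result


# Hypothetical toy "genomes": each is just a short list of protein sequences.
TOY_GENOMES = {
    "genome_a": ["MKKLLAAG", "MKELLK"],
    "genome_b": ["MGGSWWPG", "MPPGG"],
    "genome_c": ["MAVLIT", "MTTSVV"],
}

profiles = {name: aa_frequencies(seqs) for name, seqs in TOY_GENOMES.items()}
mean = mean_composition(list(profiles.values()))
divergences = {name: kl_divergence(p, mean) for name, p in profiles.items()}

for name, d in sorted(divergences.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {d:.3f} bits from the mean composition")
```

Because the mean profile includes every genome being compared, it is non-zero wherever any individual profile is non-zero, so the divergence is always finite in this setup. Genomes with the largest values are those with the most skewed amino acid usage.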
Kullback Leibler divergence in complete bacterial and phage genomes.

PubMed

Akhter, Sajia; Aziz, Ramy K; Kashef, Mona T; Ibrahim, Eslam S; Bailey, Barbara; Edwards, Robert A

2017-01-01

The amino acid content of the proteins encoded by a genome may predict the coding potential of that genome and may reflect lifestyle restrictions of the organism. Here, we calculated the Kullback-Leibler divergence from the mean amino acid content as a metric to compare the amino acid composition for a large set of bacterial and phage genome sequences. Using these data, we demonstrate that (i) there is a significant difference between amino acid utilization in different phylogenetic groups of bacteria and phages; (ii) many of the bacteria with the most skewed amino acid utilization profiles, or the bacteria that host phages with the most skewed profiles, are endosymbionts or parasites; (iii) the skews in the distribution are not restricted to certain metabolic processes but are common across all bacterial genomic subsystems; (iv) amino acid utilization profiles strongly correlate with GC content in bacterial genomes but very weakly correlate with the G+C percent in phage genomes. These findings might be exploited to distinguish coding from non-coding sequences in large data sets, such as metagenomic sequence libraries, to help in prioritizing subsequent analyses.
Simultaneous non-contiguous deletions using large synthetic DNA and site-specific recombinases

PubMed Central

Krishnakumar, Radha; Grose, Carissa; Haft, Daniel H.; Zaveri, Jayshree; Alperovich, Nina; Gibson, Daniel G.; Merryman, Chuck; Glass, John I.

2014-01-01

Toward achieving rapid and large-scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics-aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high efficiency of the Cre-lox system, this method can be used in any organism where this system is functional and can be adapted for use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analysis data, transcriptomics, synthetic biology and site-specific recombination. PMID:24914053
The Biofuel Feedstock Genomics Resource: a web-based portal and database to enable functional genomics of plant biofuel feedstock species.

PubMed

Childs, Kevin L; Konganti, Kranti; Buell, C Robin

2012-01-01

Major feedstock sources for future biofuel production are likely to be high biomass producing plant species such as poplar, pine, switchgrass, sorghum and maize. One active area of research in these species is genome-enabled improvement of lignocellulosic biofuel feedstock quality and yield. To facilitate genomic-based investigations in these species, we developed the Biofuel Feedstock Genomic Resource (BFGR), a database and web-portal that provides high-quality, uniform and integrated functional annotation of gene and transcript assembly sequences from species of interest to lignocellulosic biofuel feedstock researchers. The BFGR includes sequence data from 54 species and permits researchers to view, analyze and obtain annotation at the gene, transcript, protein and genome level. Annotation of biochemical pathways permits the identification of key genes and transcripts central to the improvement of lignocellulosic properties in these species. The integrated nature of the BFGR in terms of annotation methods, orthologous/paralogous relationships and linkage to seven species with complete genome sequences allows comparative analyses for biofuel feedstock species with limited sequence resources. Database URL: http://bfgr.plantbiology.msu.edu.
Comparative analyses of nonpathogenic, opportunistic, and totally pathogenic mycobacteria reveal genomic and biochemical variabilities and highlight the survival attributes of Mycobacterium tuberculosis.

PubMed

Rahman, Syed Asad; Singh, Yadvir; Kohli, Sakshi; Ahmad, Javeed; Ehtesham, Nasreen Z; Tyagi, Anil K; Hasnain, Seyed E

2014-11-04

Mycobacterial evolution involves various processes, such as genome reduction, gene cooption, and critical gene acquisition. Our comparative genome size analysis of 44 mycobacterial genomes revealed that the nonpathogenic (NP) genomes were bigger than those of opportunistic (OP) or totally pathogenic (TP) mycobacteria, with the TP genomes being smaller yet variable in size--their genomic plasticity reflected their ability to evolve and survive under various environmental conditions. From the 44 mycobacterial species, 13 species, representing TP, OP, and NP, were selected for genomic-relatedness analyses. Analysis of homologous protein-coding genes shared between Mycobacterium indicus pranii (NP), Mycobacterium intracellulare ATCC 13950 (OP), and Mycobacterium tuberculosis H37Rv (TP) revealed that 4,995 (i.e., ~95%) M. indicus pranii proteins have homology with M. intracellulare, whereas the homologies among M. indicus pranii, M. intracellulare ATCC 13950, and M. tuberculosis H37Rv were significantly lower. A total of 4,153 (~79%) M. indicus pranii proteins and 4,093 (~79%) M. intracellulare ATCC 13950 proteins exhibited homology with the M. tuberculosis H37Rv proteome, while 3,301 (~82%) and 3,295 (~82%) M. tuberculosis H37Rv proteins showed homology with M. indicus pranii and M. intracellulare ATCC 13950 proteomes, respectively. Comparative metabolic pathway analyses of TP/OP/NP mycobacteria showed enzymatic plasticity between M. indicus pranii (NP) and M. intracellulare ATCC 13950 (OP), Mycobacterium avium 104 (OP), and M. tuberculosis H37Rv (TP). Mycobacterium tuberculosis seems to have acquired novel alternate pathways with possible roles in metabolism, host-pathogen interactions, virulence, and intracellular survival, and by implication some of these could be potential drug targets.
The complete sequence analysis of Mycobacterium indicus pranii, a novel species of Mycobacterium shown earlier to have strong immunomodulatory properties and currently in use for the treatment of leprosy, places it evolutionarily at the point of transition to pathogenicity. With the purpose of establishing the importance of M. indicus pranii in providing insight into the virulence mechanism of tuberculous and nontuberculous mycobacteria, we carried out comparative genomic and proteomic analyses of 44 mycobacterial species representing nonpathogenic (NP), opportunistic (OP), and totally pathogenic (TP) mycobacteria. Our results clearly placed M. indicus pranii as an ancestor of the M. avium complex. Analyses of comparative metabolic pathways between M. indicus pranii (NP), M. tuberculosis (TP), and M. intracellulare (OP) pointed to the presence of novel alternative pathways in M. tuberculosis with implications for pathogenesis and survival in the human host and identification of new drug targets. Copyright © 2014 Rahman et al.
Genomic Features of the Damselfly Calopteryx splendens Representing a Sister Clade to Most Insect Orders

PubMed Central

Ioannidis, Panagiotis; Simao, Felipe A.; Waterhouse, Robert M.; Manni, Mosè; Seppey, Mathieu; Robertson, Hugh M.; Misof, Bernhard; Niehuis, Oliver

2017-01-01

Insects comprise the most diverse and successful animal group with over one million described species that are found in almost every terrestrial and limnic habitat, with many being used as important models in genetics, ecology, and evolutionary research. Genome sequencing projects have greatly expanded the sampling of species from many insect orders, but genomic resources for species of certain insect lineages have remained relatively limited to date. To address this paucity, we sequenced the genome of the banded demoiselle, Calopteryx splendens, a damselfly (Odonata: Zygoptera) belonging to Palaeoptera, the clade containing the first winged insects. The 1.6 Gbp C. splendens draft genome assembly is one of the largest insect genomes sequenced to date and encodes a predicted set of 22,523 protein-coding genes. Comparative genomic analyses with other sequenced insects identified a relatively small repertoire of C. splendens detoxification genes, which could explain its previously noted sensitivity to habitat pollution. Intriguingly, this repertoire includes a cytochrome P450 gene not previously described in any insect genome. The C. splendens immune gene repertoire appears relatively complete and features several genes encoding novel multi-domain peptidoglycan recognition proteins. Analysis of chemosensory genes revealed the presence of both gustatory and ionotropic receptors, as well as the insect odorant receptor coreceptor gene (OrCo) and at least four partner odorant receptors (ORs). This represents the oldest known instance of a complete OrCo/OR system in insects, and provides the molecular underpinning for odonate olfaction. The C. splendens genome improves the sampling of insect lineages that diverged before the radiation of Holometabola and offers new opportunities for molecular-level evolutionary, ecological, and behavioral studies. PMID:28137743
The complete genome of Burkholderia phenoliruptrix strain BR3459a, a symbiont of Mimosa flocculosa: highlighting the coexistence of symbiotic and pathogenic genes.

PubMed

Zuleta, Luiz Fernando Goda; Cunha, Claúdio de Oliveira; de Carvalho, Fabíola Marques; Ciapina, Luciane Prioli; Souza, Rangel Celso; Mercante, Fábio Martins; de Faria, Sergio Miana; Baldani, José Ivo; Straliotto, Rosangela; Hungria, Mariangela; de Vasconcelos, Ana Tereza Ribeiro

2014-06-28

Burkholderia species play an important ecological role related to xenobiosis, the promotion of plant growth, the biocontrol of agricultural diseases, and symbiotic and non-symbiotic biological nitrogen fixation. Here, we highlight our study as providing the first complete genome of a symbiotic strain of B. phenoliruptrix, BR3459a (=CLA1), which was originally isolated in Brazil from nodules of Mimosa flocculosa and is effective in fixing nitrogen in association with this leguminous species. Genomic comparisons with other pathogenic and non-pathogenic Burkholderia strains grouped B. phenoliruptrix BR3459a with plant-associated beneficial and environmental species, although it shares a high percentage of its gene repertoire with species of the B. cepacia complex (Bcc) and "pseudomallei" group. The genomic analyses showed that the bce genes involved in exopolysaccharide production are clustered together in the same genomic region, constituting part of the Group III cluster of non-pathogenic bacteria. Regarding environmental stresses, we highlight genes that might be relevant in responses to osmotic, heat, cold and general stresses. Furthermore, a number of particularly interesting genes involved in the machinery of the T1SS, T2SS, T3SS, T4ASS and T6SS secretion systems were identified. The xenobiotic properties of strain BR3459a were also investigated, and some enzymes involved in the degradation of styrene, nitrotoluene, dioxin, chlorocyclohexane, chlorobenzene and caprolactam were identified. The genomic analyses also revealed a large number of antibiotic-related genes, the most important of which were correlated with streptomycin and novobiocin. The symbiotic plasmid showed high sequence identity with the symbiotic plasmid of B. phymatum. Additionally, comparative analysis of 545 housekeeping genes among pathogenic and non-pathogenic Burkholderia species strongly supports the definition of a new genus for the second branch, which would include BR3459a. 
The analyses of B. phenoliruptrix BR3459a showed the key property of nitrogen fixation, which, together with genes for high tolerance to environmental stresses, might explain a successful strategy of symbiosis in the tropics. The strain also harbours interesting sets of genes with biotechnological potential. However, the resemblance of certain genes to those of pathogenic Burkholderia raises concerns about large-scale applications in agriculture or for bioremediation.
The western painted turtle genome, a model for the evolution of extreme physiological adaptations in a slowly evolving lineage

PubMed Central

2013-01-01

Background We describe the genome of the western painted turtle, Chrysemys picta bellii, one of the most widespread, abundant, and well-studied turtles. We place the genome into a comparative evolutionary context, and focus on genomic features associated with tooth loss, immune function, longevity, sex differentiation and determination, and the species' physiological capacities to withstand extreme anoxia and tissue freezing. Results Our phylogenetic analyses confirm that turtles are the sister group to living archosaurs, and demonstrate an extraordinarily slow rate of sequence evolution in the painted turtle. The ability of the painted turtle to withstand complete anoxia and partial freezing appears to be associated with common vertebrate gene networks, and we identify candidate genes for future functional analyses. Tooth loss shares a common pattern of pseudogenization and degradation of tooth-specific genes with birds, although the rate of accumulation of mutations is much slower in the painted turtle. Genes associated with sex differentiation generally reflect phylogeny rather than convergence in sex determination functionality. Among gene families that demonstrate exceptional expansions or show signatures of strong natural selection, immune function and musculoskeletal patterning genes are consistently over-represented. Conclusions Our comparative genomic analyses indicate that common vertebrate regulatory networks, some of which have analogs in human diseases, are often involved in the western painted turtle's extraordinary physiological capacities. As these regulatory pathways are analyzed at the functional level, the painted turtle may offer important insights into the management of a number of human health disorders. PMID:23537068
Genome and metagenome enabled analyses reveal new insight into the global biogeography and potential urea utilization in marine Thaumarchaeota.

NASA Astrophysics Data System (ADS)

Ahlgren, N.; Parada, A. E.; Fuhrman, J. A.

2016-02-01

Marine Thaumarchaea are an abundant, important group in marine microbial communities, as they fix carbon and oxidize ammonium and thus contribute to key N and C cycles in the oceans. From an enrichment culture, we have sequenced the complete genome of a new Thaumarchaeota strain, SPOT01. Analysis of this genome and other Thaumarchaeal genomes contributes new insight into its role in N cycling and clarifies the broader biogeography of marine Thaumarchaeal genera. Phylogenomics of Thaumarchaeota genomes reveals coherent separation into clusters roughly equivalent to the genus level, and SPOT01 represents a new genus of marine Thaumarchaea. Competitive fragment recruitment of globally distributed metagenomes from TARA, Ocean Sampling Day, and those generated from a station off California shows that the SPOT01 genus is often the most abundant genus, especially where total Thaumarchaea are most abundant in the overall community. The SPOT01 genome contains urease genes, allowing it to use an alternative form of N. Genomic and metagenomic analyses also reveal that, among planktonic genomes and populations, the urease genes are in general more frequently found in members of the SPOT01 genus and another genus dominant in deep waters; thus we predict that these two genera contribute most significantly to urea utilization among marine Thaumarchaea. Recruitment also revealed broader biogeographic and ecological patterns of the putative genera. The SPOT01 genus was most abundant at colder temperatures (<16 °C), reflective of its dominance at subpolar to polar latitudes (>45 degrees). The genus containing Nitrosopumilus maritimus had the highest temperature range, and the genus containing Candidatus Nitrosopelagicus brevis was typically most abundant at intermediate temperatures and intermediate latitudes (35-45 degrees). Together, these genome- and metagenome-enabled analyses provide significant new insight into the ecology and biogeochemical contributions of marine archaea.
The complete chloroplast genome sequence of Citrus sinensis (L.) Osbeck var 'Ridge Pineapple': organization and phylogenetic relationships to other angiosperms

PubMed Central

Bausher, Michael G; Singh, Nameirakpam D; Lee, Seung-Bum; Jansen, Robert K; Daniell, Henry

2006-01-01

Background The production of Citrus, the largest fruit crop of international economic value, has recently been imperiled due to the introduction of the bacterial disease Citrus canker. No significant improvements have been made to combat this disease by plant breeding and nuclear transgenic approaches. Chloroplast genetic engineering has a number of advantages over nuclear transformation; it not only increases transgene expression but also facilitates transgene containment, which is one of the major impediments for development of transgenic trees. We have sequenced the Citrus chloroplast genome to facilitate genetic improvement of this crop and to assess phylogenetic relationships among major lineages of angiosperms. Results The complete chloroplast genome sequence of Citrus sinensis is 160,129 bp in length, and contains 133 genes (89 protein-coding, 4 rRNAs and 30 distinct tRNAs). Genome organization is very similar to the inferred ancestral angiosperm chloroplast genome. However, in Citrus the infA gene is absent. The inverted repeat region has expanded to duplicate rps19 and the first 84 amino acids of rpl22. The rpl22 gene in the IRb region has a nonsense mutation resulting in 9 stop codons. This was confirmed by PCR amplification and sequencing using primers that flank the IR/LSC boundaries. Repeat analysis identified 29 direct and inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Comparison of protein-coding sequences with expressed sequence tags revealed six putative RNA edits, five of which resulted in non-synonymous modifications in petL, psbH, ycf2 and ndhA. Phylogenetic analyses using maximum parsimony (MP) and maximum likelihood (ML) methods of a dataset composed of 61 protein-coding genes for 30 taxa provide strong support for the monophyly of several major clades of angiosperms, including monocots, eudicots, rosids and asterids. 
The MP and ML trees are incongruent in three areas: the position of Amborella and Nymphaeales, relationship of the magnoliid genus Calycanthus, and the monophyly of the eurosid I clade. Both MP and ML trees provide strong support for the monophyly of eurosids II and for the placement of Citrus (Sapindales) sister to a clade including the Malvales/Brassicales. Conclusion This is the first complete chloroplast genome sequence for a member of the Rutaceae and Sapindales. Expansion of the inverted repeat region to include rps19 and part of rpl22 and presence of two truncated copies of rpl22 is unusual among sequenced chloroplast genomes. Availability of a complete Citrus chloroplast genome sequence provides valuable information on intergenic spacer regions and endogenous regulatory sequences for chloroplast genetic engineering. Phylogenetic analyses resolve relationships among several major clades of angiosperms and provide strong support for the monophyly of the eurosid II clade and the position of the Sapindales sister to the Brassicales/Malvales. PMID:17010212
Selfish DNA in protein-coding genes of Rickettsia.

PubMed

Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M

2000-10-13

Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.
Complete genome analysis of porcine kobuviruses from the feces of pigs in Japan.

PubMed

Akagami, Masataka; Ito, Mika; Niira, Kazutaka; Kuroda, Moegi; Masuda, Tsuneyuki; Haga, Kei; Tsuchiaka, Shinobu; Naoi, Yuki; Kishimoto, Mai; Sano, Kaori; Omatsu, Tsutomu; Aoki, Hiroshi; Katayama, Yukie; Oba, Mami; Oka, Tomoichiro; Ichimaru, Toru; Yamasato, Hiroshi; Ouchi, Yoshinao; Shirai, Junsuke; Katayama, Kazuhiko; Mizutani, Tetsuya; Nagai, Makoto

2017-08-01

Porcine kobuviruses (PoKoVs) are ubiquitously distributed in pig populations worldwide and are thought to be enteric viruses in swine. Although PoKoVs have been detected in pigs in Japan, no complete genome data for Japanese PoKoVs are available. In the present study, 24 nearly complete or complete sequences of the PoKoV genome obtained from 10 diarrheic feces and 14 non-diarrheic feces of Japanese pigs were analyzed using a metagenomics approach. Japanese PoKoVs shared 85.2-100% identity with the complete coding nucleotide (nt) sequences and the closest relationship of 85.1-98.3% with PoKoVs from other countries. Twenty of 24 Japanese PoKoVs carried a deletion of 90 nt in the 2B coding region. Phylogenetic tree analyses revealed that PoKoVs were not grouped according to their geographical region of origin and the phylogenetic trees of the L, P1, P2, and P3 genetic regions showed topologies different from each other. Similarity plot analysis using strains from a single farm revealed partially different similarity patterns among strains from identical farm origins, suggesting that recombination events had occurred. These results indicate that various PoKoV strains are prevalent and not restricted geographically on pig farms worldwide and the coexistence of multiple strains leads to recombination events of PoKoVs and contributes to the genetic diversity and evolution of PoKoVs.
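Identity ranges such as those quoted above are usually derived from pairwise comparisons of aligned sequences. The Python sketch below shows one common convention for that calculation. The gap handling (columns with a gap in either sequence are skipped) and the toy sequences are assumptions for illustration, not the method or data used in the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are excluded from the
    comparison; this is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: five ungapped columns, four of them identical.
seq_1 = "ACGT-A"
seq_2 = "ACGAAA"
print(f"{percent_identity(seq_1, seq_2):.1f}% identity")
```

Applying such a function to every pair of strains gives the minimum and maximum values that an identity range summarises.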
Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses

PubMed Central

2011-01-01

Background Pneumonia and myocarditis are the most commonly reported diseases due to Histophilus somni, an opportunistic pathogen of the reproductive and respiratory tracts of cattle. Thus far only a few genes involved in metabolic and virulence functions have been identified and characterized in H. somni using traditional methods. Analyses of the genome sequences of several Pasteurellaceae species have provided insights into their biology and evolution. In view of the economic and ecological importance of H. somni, the genome sequence of pneumonia strain 2336 has been determined and compared to that of commensal strain 129Pt and other members of the Pasteurellaceae. Results The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemagglutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases and some of these ORFs had characteristic simple sequence repeats within them.
Most of the genes/loci common to both the strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor. Conclusions Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability among the two H. somni strains. PMID:22111657
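The size difference and the similarity in gene density reported in this abstract can be reproduced from the four figures it gives. The Python sketch below uses only those chromosome lengths and protein-coding gene counts. Expressing density as protein-coding genes per kilobase is an assumption made here for illustration, and the names are hypothetical.

```python
# Figures quoted in the abstract: chromosome length (bp) and protein-coding genes.
STRAINS = {
    "2336": {"length_bp": 2_263_857, "protein_genes": 1_980},
    "129Pt": {"length_bp": 2_007_700, "protein_genes": 1_792},
}


def size_difference_bp(larger: str, smaller: str) -> int:
    """Difference in chromosome length between two strains, in base pairs."""
    return STRAINS[larger]["length_bp"] - STRAINS[smaller]["length_bp"]


def genes_per_kb(strain: str) -> float:
    """Protein-coding genes per kilobase of chromosome (illustrative density measure)."""
    record = STRAINS[strain]
    return record["protein_genes"] / (record["length_bp"] / 1000)


diff_bp = size_difference_bp("2336", "129Pt")
print(f"Size difference: {diff_bp:,} bp")
for name in STRAINS:
    print(f"Strain {name}: {genes_per_kb(name):.3f} protein-coding genes per kb")
```

The difference comes out at 256,157 bp, consistent with the ~256,000 bp stated in the conclusions, and the two densities differ by less than 0.02 genes per kb.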
Horizontal gene transfer in Histophilus somni and its role in the evolution of pathogenic strain 2336, as determined by comparative genomic analyses.

PubMed

Siddaramappa, Shivakumara; Challacombe, Jean F; Duncan, Alison J; Gillaspy, Allison F; Carson, Matthew; Gipson, Jenny; Orvis, Joshua; Zaitshik, Jeremy; Barnes, Gentry; Bruce, David; Chertkov, Olga; Detter, J Chris; Han, Cliff S; Tapia, Roxanne; Thompson, Linda S; Dyer, David W; Inzana, Thomas J

2011-11-23

Pneumonia and myocarditis are the most commonly reported diseases due to Histophilus somni, an opportunistic pathogen of the reproductive and respiratory tracts of cattle. Thus far only a few genes involved in metabolic and virulence functions have been identified and characterized in H. somni using traditional methods. Analyses of the genome sequences of several Pasteurellaceae species have provided insights into their biology and evolution. In view of the economic and ecological importance of H. somni, the genome sequence of pneumonia strain 2336 has been determined and compared to that of commensal strain 129Pt and other members of the Pasteurellaceae. The chromosome of strain 2336 (2,263,857 bp) contained 1,980 protein coding genes, whereas the chromosome of strain 129Pt (2,007,700 bp) contained only 1,792 protein coding genes. Although the chromosomes of the two strains differ in size, their average GC content, gene density (total number of genes predicted on the chromosome), and percentage of sequence (number of genes) that encodes proteins were similar. The chromosomes of these strains also contained a number of discrete prophage regions and genomic islands. One of the genomic islands in strain 2336 contained genes putatively involved in copper, zinc, and tetracycline resistance. Using the genome sequence data and comparative analyses with other members of the Pasteurellaceae, several H. somni genes that may encode proteins involved in virulence (e.g., filamentous haemagglutinins, adhesins, and polysaccharide biosynthesis/modification enzymes) were identified. The two strains contained a total of 17 ORFs that encode putative glycosyltransferases and some of these ORFs had characteristic simple sequence repeats within them. Most of the genes/loci common to both the strains were located in different regions of the two chromosomes and occurred in opposite orientations, indicating genome rearrangement since their divergence from a common ancestor.
Since the genome of strain 129Pt was ~256,000 bp smaller than that of strain 2336, these genomes provide yet another paradigm for studying evolutionary gene loss and/or gain in regard to virulence repertoire and pathogenic ability. Analyses of the complete genome sequences revealed that bacteriophage- and transposon-mediated horizontal gene transfer had occurred at several loci in the chromosomes of strains 2336 and 129Pt. It appears that these mobile genetic elements have played a major role in creating genomic diversity and phenotypic variability between the two H. somni strains.
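The size difference and gene densities quoted in this abstract can be checked with a little arithmetic. The sketch below is illustrative only: the chromosome lengths and protein-coding gene counts come from the abstract, while the function name `genes_per_kb` and the genes-per-kilobase definition of density are assumptions made here, not the authors' stated method.

```python
# Chromosome lengths and gene counts are taken from the abstract;
# the dictionary layout and helper name are illustrative assumptions.
CHROMOSOMES = {
    "2336": {"length_bp": 2_263_857, "protein_coding_genes": 1_980},
    "129Pt": {"length_bp": 2_007_700, "protein_coding_genes": 1_792},
}


def genes_per_kb(length_bp: int, gene_count: int) -> float:
    """Protein-coding genes per kilobase (an assumed definition of density)."""
    return gene_count / (length_bp / 1000)


# Difference in chromosome size between the two strains.
size_difference_bp = (
    CHROMOSOMES["2336"]["length_bp"] - CHROMOSOMES["129Pt"]["length_bp"]
)

# Gene density per strain under the assumed definition.
densities = {
    strain: genes_per_kb(c["length_bp"], c["protein_coding_genes"])
    for strain, c in CHROMOSOMES.items()
}

print(size_difference_bp)  # 256157, i.e. the ~256,000 bp quoted above
for strain, density in densities.items():
    print(strain, round(density, 3))
```

Under this definition both strains come out at roughly 0.9 protein-coding genes per kilobase, which is consistent with the statement that gene density is similar despite the size difference.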

Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

PubMed Central

2012-01-01

Background Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionally considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains, as previously reported. 
Conclusions Our findings along with other studies show that HA clonal lineages harbor specific genetic elements as well as sequence differences in the core genome which may confer selection advantages over the more heterogeneous CA E. faecium isolates. Which of these differences are important for the success of specific E. faecium lineages in the hospital environment remains to be determined. PMID:22769602
Isotachophoresis for fractionation and recovery of cytoplasmic RNA and nucleus from single cells.

PubMed

Kuriyama, Kentaro; Shintaku, Hirofumi; Santiago, Juan G

2015-07-01

There is a substantial need for simultaneous analyses of RNA and DNA from individual single cells. Such analysis provides unique evidence of cell-to-cell differences and the correlation between gene expression and genomic mutation in highly heterogeneous cell populations. We present a novel microfluidic system that leverages isotachophoresis to fractionate and isolate cytoplasmic RNA and genomic DNA (gDNA) from single cells. The system uniquely enables independent, sequence-specific analyses of these critical markers. Our system uses a microfluidic chip with a simple geometry and four end-channel electrodes, and completes the entire process in <5 min, including lysis, purification, fractionation, and delivery to DNA and RNA output reservoirs, each containing high quality and purity aliquots with no measurable cross-contamination of cytoplasmic RNA versus gDNA. We demonstrate our system with simultaneous, sequence-specific quantitation using off-chip RT-qPCR and qPCR for simultaneous cytoplasmic RNA and gDNA analyses, respectively. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Multi-Platform Draft de novo Genome Assembly and Comparative Analysis for the Scarlet Macaw (Ara macao)

PubMed Central

Seabury, Christopher M.; Dowd, Scot E.; Seabury, Paul M.; Raudsepp, Terje; Brightsmith, Donald J.; Liboriussen, Poul; Halley, Yvette; Fisher, Colleen A.; Owens, Elaine; Viswanathan, Ganesh; Tizard, Ian R.

2013-01-01

Data deposition to NCBI Genomes This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N’s). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. 
We also observed evidence for genes and noncoding loci that displayed extreme conservation across the three avian lineages, thereby reflecting their likely biological and developmental importance among birds. PMID:23667475
Comparative Genomic Analysis of a Clinical Isolate of Klebsiella quasipneumoniae subsp. similipneumoniae, a KPC-2 and OKP-B-6 Beta-Lactamases Producer Harboring Two Drug-Resistance Plasmids from Southeast Brazil

PubMed Central

Nicolás, Marisa F.; Ramos, Pablo Ivan Pereira; Marques de Carvalho, Fabíola; Camargo, Dhian R. A.; de Fátima Morais Alves, Carlene; Loss de Morais, Guilherme; Almeida, Luiz G. P.; Souza, Rangel C.; Ciapina, Luciane P.; Vicente, Ana C. P.; Coimbra, Roney S.; Ribeiro de Vasconcelos, Ana T.

2018-01-01

The aim of this study was to unravel the genetic determinants responsible for multidrug (including carbapenems) resistance and virulence in a clinical isolate of Klebsiella quasipneumoniae subsp. similipneumoniae by whole-genome sequencing and comparative analyses. Eighty-three clinical isolates initially identified as carbapenem-resistant K. pneumoniae were collected from nosocomial infections in southeast Brazil. After RAPD screening, the KPC-142 isolate, showing the most divergent DNA pattern, was selected for complete genome sequencing in an Illumina HiSeq 2500 instrument. Reads were assembled into scaffolds, gaps between scaffolds were resolved by in silico gap filling and extensive bioinformatics analyses were performed, using multiple comparative analysis tools and databases. Genome sequencing allowed the classification of the KPC-142 isolate to be corrected to K. quasipneumoniae subsp. similipneumoniae. To the best of our knowledge, this is the first complete genome reported to date of a clinical isolate of this subspecies harboring both class A beta-lactamases KPC-2 and OKP-B-6 from South America. KPC-142 has one 5.2 Mbp chromosome (57.8% G+C) and two plasmids: 190 Kbp pKQPS142a (50.7% G+C) and 11 Kbp pKQPS142b (57.3% G+C). The 3 Kbp region in pKQPS142b containing blaKPC-2 was found to be highly similar to that of pKp13d of K. pneumoniae Kp13 isolated in Southern Brazil in 2009, suggesting the horizontal transfer of this resistance gene between different species of Klebsiella. KPC-142 additionally harbors an integrative conjugative element ICEPm1 that could be involved in the mobilization of pKQPS142b and determinants of resistance to other classes of antimicrobials, including aminoglycosides and silver. We present the completely assembled genome sequence of a clinical isolate of K. quasipneumoniae subsp. 
similipneumoniae, a KPC-2 and OKP-B-6 beta-lactamases producer, and discuss the most relevant genomic features of this important resistant pathogen in comparison to several strains belonging to K. quasipneumoniae subsp. similipneumoniae (phylogroup II-B), K. quasipneumoniae subsp. quasipneumoniae (phylogroup II-A), K. pneumoniae (phylogroup I), and K. variicola (phylogroup III). Our study contributes to the description of the characteristics of a novel K. quasipneumoniae subsp. similipneumoniae strain circulating in South America that currently represents a serious potential risk for nosocomial settings. PMID:29503635
Complete mitochondrial genomes and nuclear ribosomal RNA operons of two species of Diplostomum (Platyhelminthes: Trematoda): a molecular resource for taxonomy and molecular epidemiology of important fish pathogens.

PubMed

Brabec, Jan; Kostadinova, Aneta; Scholz, Tomáš; Littlewood, D Timothy J

2015-06-19

The genus Diplostomum (Platyhelminthes: Trematoda: Diplostomidae) is a diverse group of freshwater parasites with complex life-cycles and global distribution. The larval stages are important pathogens causing eye fluke disease implicated in substantial impacts on natural fish populations and losses in aquaculture. However, the problematic species delimitation and difficulties in the identification of larval stages hamper the assessment of the distributional and host ranges of Diplostomum spp. and their transmission ecology. Total genomic DNA was isolated from adult worms and shotgun sequenced using Illumina MiSeq technology. Mitochondrial (mt) genomes and nuclear ribosomal RNA (rRNA) operons were assembled using established bioinformatic tools and fully annotated. Mt protein-coding genes and nuclear rRNA genes were subjected to phylogenetic analysis by maximum likelihood and the resulting topologies compared. We characterised novel complete mt genomes and nuclear rRNA operons of two closely related species, Diplostomum spathaceum and D. pseudospathaceum. Comparative mt genome assessment revealed that the cox1 gene and its 'barcode' region used for molecular identification are the most conserved regions; instead, nad4 and nad5 genes were identified as most promising molecular diagnostic markers. Using the novel data, we provide the first genome wide estimation of the phylogenetic relationships of the order Diplostomida, one of the two fundamental lineages of the Digenea. Analyses of the mitogenomic data invariably recovered the Diplostomidae as a sister lineage of the order Plagiorchiida rather than as a basal lineage of the Diplostomida as inferred in rDNA phylogenies; this was concordant with the mt gene order of Diplostomum spp. exhibiting closer match to the conserved gene order of the Plagiorchiida. 
Complete sequences of the mt genome and rRNA operon of two species of Diplostomum provide a valuable resource for novel genetic markers for species delineation and large-scale molecular epidemiology and disease ecology studies based on the most accessible life-cycle stages of eye flukes.
The Chloroplast Genome of Pellia endiviifolia: Gene Content, RNA-Editing Pattern, and the Origin of Chloroplast Editing

PubMed Central

Grosche, Christopher; Funk, Helena T.; Maier, Uwe G.; Zauner, Stefan

2012-01-01

RNA editing is a post-transcriptional process that can act upon transcripts from mitochondrial, nuclear, and chloroplast genomes. In chloroplasts, single-nucleotide conversions in mRNAs via RNA editing occur at different frequencies across the plant kingdom. These range from several hundred edited sites in some mosses and ferns to lower frequencies in seed plants and the complete lack of RNA editing in the liverwort Marchantia polymorpha. Here, we report the sequence and edited sites of the chloroplast genome from the liverwort Pellia endiviifolia. The type and frequency of chloroplast RNA editing display a pattern highly similar to that in seed plants. Analyses of the C to U conversions and the genomic context in which the editing sites are embedded provide evidence in favor of the hypothesis that chloroplast RNA editing evolved to compensate for mutations in the first land plants. PMID:23221608
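To make the idea of a C to U editing site concrete, the following minimal sketch compares a genomic sequence with the matching cDNA and reports positions where a genomic C appears as T (or U) in the transcript. It assumes the two sequences are already aligned and of equal length. The function name `find_c_to_u_sites` and the toy sequences are hypothetical and are not taken from the study.

```python
from typing import List


def find_c_to_u_sites(genomic: str, cdna: str) -> List[int]:
    """Return 1-based positions where genomic C reads as T/U in the cDNA.

    Assumes both sequences are pre-aligned and of equal length.
    """
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    genomic = genomic.upper()
    # cDNA is usually written with T; accept U as well for RNA notation.
    cdna = cdna.upper().replace("U", "T")
    return [
        index + 1
        for index, (g_base, c_base) in enumerate(zip(genomic, cdna))
        if g_base == "C" and c_base == "T"
    ]


# Hypothetical toy sequences, for illustration only.
genomic_seq = "ATGCCATCA"
cdna_seq = "ATGCTATTA"
sites = find_c_to_u_sites(genomic_seq, cdna_seq)
print(sites)  # [5, 8]
```

A real analysis would also need to rule out sequencing errors and genomic polymorphisms before calling a position an editing site.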
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.

PubMed

Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B

2017-08-01

To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.
Isolation of genomic DNA from defatted oil seed residue of rapeseed (Brassica napus).

PubMed

Sadia, M; Rabbani, M A; Hameed, S; Pearce, S R; Malik, S A

2011-02-08

A simple protocol for obtaining pure, restrictable and amplifiable megabase genomic DNA from oil-free seed residue of Brassica napus, an important oil seed plant, has been developed. Oil from the dry seeds was completely recovered in an organic solvent and quantified gravimetrically, followed by processing of the residual biomass (defatted seed residue) for genomic DNA isolation. The isolated DNA can be cut by a range of restriction enzymes. The method enables simultaneous isolation and recovery of lipids and genomic DNA from the same test sample, thus allowing two independent analyses from a single sample. Multiple micro-scale oil extractions from the commercial seeds gave approximately 39% oil, which is close to the usual oil recovery from standard oil seed. Most of the amplified fragments were scored in the range of 2.5 to 0.5 kb, best suited for scoring as molecular diagnostics.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

PubMed

Issac, Biju; Raghava, G P S

2002-09-01

Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequence alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.
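Submitting more than one sequence as a single query usually means sending a multi-record FASTA text. The sketch below shows one way such a query could be split into individual records before a search. It is not GWFASTA's own code or interface: the function name `parse_multi_fasta` and the example sequences are hypothetical.

```python
from typing import Dict, Iterable, Optional


def parse_multi_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Split multi-record FASTA text into {identifier: sequence}."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            header_fields = line[1:].split()
            if not header_fields:
                raise ValueError("FASTA header without an identifier")
            # The identifier is the first word of the header line.
            current_id = header_fields[0]
            if current_id in records:
                raise ValueError(f"duplicate identifier: {current_id}")
            records[current_id] = ""
        elif current_id is None:
            raise ValueError("sequence data found before the first header")
        else:
            # Sequence lines of one record are concatenated.
            records[current_id] += line
    return records


# Hypothetical two-sequence query, for illustration only.
query_text = """>seq1 hypothetical protein
MKTAYIAK
QRQISFVK
>seq2
MSLLTEVE
"""
parsed = parse_multi_fasta(query_text.splitlines())
print(list(parsed))  # ['seq1', 'seq2']
```

Each parsed record could then be searched separately and the hits grouped by organism or chromosome, in the spirit of the summaries the abstract describes.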
Hunting the Extinct Steppe Bison (Bison priscus) Mitochondrial Genome in the Trois-Frères Paleolithic Painted Cave

PubMed Central

Marsolier-Kergoat, Marie-Claude; Palacio, Pauline; Berthonaud, Véronique; Maksud, Frédéric; Stafford, Thomas; Bégouën, Robert; Elalouf, Jean-Marc

2015-01-01

Despite the abundance of fossil remains for the extinct steppe bison (Bison priscus), an animal that was painted and engraved in numerous European Paleolithic caves, a complete mitochondrial genome sequence has never been obtained for this species. In the present study we collected bone samples from a sector of the Trois-Frères Paleolithic cave (Ariège, France) that formerly functioned as a pitfall and was sealed before the end of the Pleistocene. Screening the DNA content of the samples collected from the ground surface revealed their contamination by Bos DNA. However, a 19,000-year-old rib collected on a rock apart from the pathway delineated for modern visitors was devoid of such contaminants and reproducibly yielded Bison priscus DNA. High-throughput shotgun sequencing combined with conventional PCR analysis of the rib DNA extract enabled the reconstruction of a complete mitochondrial genome sequence of 16,318 bp for the extinct steppe bison with 10.4-fold coverage. Phylogenetic analyses robustly established the position of the Bison priscus mitochondrial genome as basal to the clade delineated by the genomes of the modern American Bison bison. The extinct steppe bison sequence, which exhibits 93 specific polymorphisms as compared to the published Bison bison mitochondrial genomes, provides an additional resource for the study of Bovinae specimens. Moreover, this study of ancient DNA delineates a new research pathway for the analysis of the Magdalenian Trois-Frères cave. PMID:26083419
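As a rough guide to what 10.4-fold coverage of a 16,318 bp genome implies, mean fold coverage is commonly taken as the total number of aligned bases divided by the genome length. The abstract does not say how coverage was computed, so that definition is an assumption here, and the function names are illustrative.

```python
def mean_fold_coverage(total_aligned_bases: float, genome_length_bp: int) -> float:
    """Mean depth, assumed here to be aligned bases divided by genome length."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return total_aligned_bases / genome_length_bp


def aligned_bases_for(fold_coverage: float, genome_length_bp: int) -> float:
    """Aligned bases implied by a given mean fold coverage."""
    return fold_coverage * genome_length_bp


GENOME_LENGTH_BP = 16_318  # from the abstract
REPORTED_COVERAGE = 10.4   # from the abstract

implied_bases = aligned_bases_for(REPORTED_COVERAGE, GENOME_LENGTH_BP)
print(round(implied_bases))  # 169707
```

Under this assumption the reported figures correspond to about 170 kb of mitochondrial sequence mapped in total, a small amount by modern standards and typical of the low endogenous DNA yields of ancient samples.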
Molecular characterization of Banana streak virus isolate from Musa Acuminata in China.

PubMed

Zhuang, Jun; Wang, Jian-Hua; Zhang, Xin; Liu, Zhi-Xin

2011-12-01

Banana streak virus (BSV), a member of genus Badnavirus, is a causal agent of banana streak disease throughout the world. The genetic diversity of BSVs from different regions of banana plantations has previously been investigated, but there are relatively few reports of the genetic characteristics of episomal (non-integrated) BSV genomes isolated from China. Here, the complete genome, a total of 7722 bp (GenBank accession number DQ092436), of an isolate of Banana streak virus (BSV) on cultivar Cavendish (BSAcYNV) in Yunnan, China was determined. The genome is organised in the typical manner of badnaviruses. The intergenic region of genomic DNA contains a large stem-loop, which may contribute to the ribosome shift into the following open reading frames (ORFs). The coding region of BSAcYNV consists of three overlapping ORFs: ORF1, with a non-AUG start codon, and ORF2 encode two small proteins that are individually involved in viral movement, and ORF3 encodes a polyprotein. Besides the complete genome, a defective genome of 6525 bp, lacking the whole RNA leader region and the majority of ORF1, was also isolated and sequenced from this BSV DNA reservoir in infected banana plants. Sequence analyses showed that BSAcYNV has closest similarity in terms of genome organization and the coding assignments with a BSV isolate from Vietnam (BSAcVNV). The corresponding coding regions shared identities of 88% and -95% at nucleotide and amino acid levels, respectively. Phylogenetic analysis also indicated BSAcYNV shared the closest geographical evolutionary relationship to BSAcVNV among sequenced banana streak badnaviruses.
Tripal: a construction toolkit for online genome databases.

PubMed

Ficklin, Stephen P; Sanderson, Lacey-Anne; Cheng, Chun-Huai; Staton, Margaret E; Lee, Taein; Cho, Il-Hyung; Jung, Sook; Bett, Kirstin E; Main, Doreen

2011-01-01

As the availability, affordability and magnitude of genomics and genetics research increase, so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire of many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System, with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, and web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net.
Plant functional genomics

NASA Astrophysics Data System (ADS)

Holtorf, Hauke; Guitton, Marie-Christine; Reski, Ralf

2002-04-01

Functional genome analysis of plants has entered the high-throughput stage. The complete genome information from key species such as Arabidopsis thaliana and rice is now available and will further boost the application of a range of new technologies to functional plant gene analysis. To broadly assign functions to unknown genes, different fast and multiparallel approaches are currently used and developed. These new technologies are based on known methods but are adapted and improved to accommodate for comprehensive, large-scale gene analysis, i.e. such techniques are novel in the sense that their design allows researchers to analyse many genes at the same time and at an unprecedented pace. Such methods allow analysis of the different constituents of the cell that help to deduce gene function, namely the transcripts, proteins and metabolites. Similarly the phenotypic variations of entire mutant collections can now be analysed in a much faster and more efficient way than before. The different methodologies have developed to form their own fields within the functional genomics technological platform and are termed transcriptomics, proteomics, metabolomics and phenomics. Gene function, however, cannot solely be inferred by using only one such approach. Rather, it is only by bringing together all the information collected by different functional genomic tools that one will be able to unequivocally assign functions to unknown plant genes. This review focuses on current technical developments and their impact on the field of plant functional genomics. The lower plant Physcomitrella is introduced as a new model system for gene function analysis, owing to its high rate of homologous recombination.
TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages

PubMed Central

Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan

2016-01-01

Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and, by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks. PMID:28232861
TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages.

PubMed

Silva, Tiago C; Colaprico, Antonio; Olsen, Catharina; D'Angelo, Fulvio; Bontempi, Gianluca; Ceccarelli, Michele; Noushmehr, Houtan

2016-01-01

Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and, by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.
Tripal: a construction toolkit for online genome databases

PubMed Central

Sanderson, Lacey-Anne; Cheng, Chun-Huai; Staton, Margaret E.; Lee, Taein; Cho, Il-Hyung; Jung, Sook; Bett, Kirstin E.; Main, Doreen

2011-01-01

As the availability, affordability and magnitude of genomics and genetics research increase, so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire of many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System, with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, and web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net PMID:21959868
Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert

2011-10-02

Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.
Clarification of Taxonomic Status within the Pseudomonas syringae Species Group Based on a Phylogenomic Analysis.

PubMed

Gomila, Margarita; Busquets, Antonio; Mulet, Magdalena; García-Valdés, Elena; Lalucat, Jorge

2017-01-01

The Pseudomonas syringae phylogenetic group comprises 15 recognized bacterial species and more than 60 pathovars. The classification and identification of strains is relevant for practical reasons but also for understanding the epidemiology and ecology of this group of plant pathogenic bacteria. Genome-based taxonomic analyses have been introduced recently to clarify the taxonomy of the whole genus. A set of 139 draft and complete genome sequences of strains belonging to all species of the P. syringae group available in public databases were analyzed, together with the genomes of closely related species used as outgroups. Comparative genomics based on the genome sequences of the species type strains in the group allowed the delineation of phylogenomic species and demonstrated that a high proportion of strains included in the study are misclassified. Furthermore, representatives of at least 7 putative novel species were detected. It was also confirmed that P. ficuserectae, P. meliae, and P. savastanoi are later synonyms of P. amygdali and that "P. coronafaciens" should be revived as a nomenspecies.
The genome of the Lactobacillus sanfranciscensis temperate phage EV3

PubMed Central

2013-01-01

Background: Bacteriophage infection modulates microbial consortia, and transduction is one of the most important mechanisms involved in bacterial evolution. However, phage contamination brings food fermentations to a halt, causing economic setbacks. The number of phage genome sequences of lactic acid bacteria, especially of lactobacilli, is still limited. We analysed the genome of a temperate phage active on Lactobacillus sanfranciscensis, the predominant strain in type I sourdough fermentations. Results: Sequencing of the DNA of EV3 phage revealed a genome of 34,834 bp and a G + C content of 36.45%. Of the 43 open reading frames (ORFs) identified, all but eight shared homology with other phages of lactobacilli. A similar genomic organization and mosaic pattern of identities align EV3 with the closely related Lactobacillus vaginalis ATCC 49540 prophage. Four unknown ORFs that had no homologies in the databases or predicted functions were identified. Notably, EV3 encodes a putative dextranase. Conclusions: EV3 is the first L. sanfranciscensis phage that has been completely sequenced so far. PMID:24308641

Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berka, Randy M.; Grigoriev, Igor V.; Otillar, Robert

2011-05-16

Thermostable enzymes and thermophilic cell factories may afford economic advantages in the production of many chemicals and biomass-based fuels. Here we describe and compare the genomes of two thermophilic fungi, Myceliophthora thermophila and Thielavia terrestris. To our knowledge, these genomes are the first described for thermophilic eukaryotes and the first complete telomere-to-telomere genomes for filamentous fungi. Genome analyses and experimental data suggest that both thermophiles are capable of hydrolyzing all major polysaccharides found in biomass. Examination of transcriptome data and secreted proteins suggests that the two fungi use shared approaches in the hydrolysis of cellulose and xylan but distinct mechanisms in pectin degradation. Characterization of the biomass-hydrolyzing activity of recombinant enzymes suggests that these organisms are highly efficient in biomass decomposition at both moderate and high temperatures. Furthermore, we present evidence suggesting that aside from representing a potential reservoir of thermostable enzymes, thermophilic fungi are amenable to manipulation using classical and molecular genetics.
Rapid and recent diversification patterns in Anseriformes birds: Inferred from molecular phylogeny and diversification analyses.

PubMed

Sun, Zhonglou; Pan, Tao; Hu, Chaochao; Sun, Lu; Ding, Hengwu; Wang, Hui; Zhang, Chenling; Jin, Hong; Chang, Qing; Kan, Xianzhao; Zhang, Baowei

2017-01-01

The Anseriformes is a well-known and widely distributed bird order, with more than 150 species in the world. This paper aims to revise the classification and determine the phylogenetic relationships and diversification patterns in Anseriformes by exploring the Cyt b, ND2 and COI genes and the complete mitochondrial genomes (mito-genomes). Molecular phylogeny and genetic distance analyses suggest that the Dendrocygna species should be considered as an independent family, Dendrocygnidae, rather than a member of Anatidae. Molecular timescale analyses suggest that the ancestral diversification occurred during the Early Eocene Climatic Optimum (58 ~ 50 Ma). Furthermore, diversification analyses showed that, after a long period of constant diversification, the median initial speciation rate was accelerated three times, and finally increased to approximately 0.3 sp/My. In the present study, both the molecular phylogeny and the diversification analyses indicate that Anseriformes birds underwent rapid and recent diversification in their evolutionary history, especially in modern ducks, which show extreme diversification during the Plio-Pleistocene (~ 5.3 Ma). Therefore, our study supports the view that the Plio-Pleistocene climate fluctuations are likely to have played a significant role in promoting the recent diversification of Anseriformes.
The Banana Genome Hub

PubMed Central

Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

2013-01-01

Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and, finally, generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of interoperability for data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967
A Distance Measure for Genome Phylogenetic Analysis

NASA Astrophysics Data System (ADS)

Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirements. Another class of tree construction methods, namely distance-based methods, requires a measure of distance between any two genomes. Some measures, such as evolutionary edit distance of gene order and gene content, are computationally expensive or do not perform well when the gene content of the organisms is similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
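The idea of a compression-based distance can be illustrated with a small sketch. The code below is not the expert model described in the abstract; it swaps in the general-purpose zlib compressor and the widely used normalized compression distance (NCD) purely to show the principle that two sequences sharing information compress better together than apart. The function names and the toy random "genomes" are hypothetical and exist only for illustration.

```python
import random
import zlib


def compressed_size(seq: str) -> int:
    """Size in bytes of the zlib-compressed sequence (stand-in compressor)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    Values near 0 suggest the sequences share most of their information;
    values near 1 suggest they share very little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_genome(length: int, seed: int) -> str:
    """Hypothetical toy data: a uniformly random DNA string."""
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", k=length))
```

A matrix of such pairwise distances could then be passed to any distance-based tree builder, such as neighbour joining. A general-purpose compressor like zlib models DNA poorly compared with a dedicated biological compressor, so this sketch should be read as a demonstration of the principle only.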
Phylogenetic relationships among superfamilies of Neritimorpha (Mollusca: Gastropoda).

PubMed

Uribe, Juan E; Colgan, Don; Castro, Lyda R; Kano, Yasunori; Zardoya, Rafael

2016-11-01

Despite the extraordinary morphological and ecological diversity of Neritimorpha, few studies have focused on the phylogenetic relationships of this lineage of gastropods, which includes four extant superfamilies: Neritopsoidea, Hydrocenoidea, Helicinoidea, and Neritoidea. Here, the nucleotide sequences of the complete mitochondrial genomes of Georissa bangueyensis (Hydrocenoidea), Neritina usnea (Neritoidea), and Pleuropoma jana (Helicinoidea) and the nearly complete mt genomes of Titiscania sp. (Neritopsoidea) and Theodoxus fluviatilis (Neritoidea) were determined. Phylogenetic reconstructions using probabilistic methods were based on mitochondrial (13 protein coding genes and two ribosomal RNA genes), nuclear (partial 28S rRNA, 18S rRNA, actin, and histone H3 genes) and combined sequence data sets. All phylogenetic analyses except one converged on a single, highly supported tree in which Neritopsoidea was recovered as the sister group of a clade including Helicinoidea as the sister group of Hydrocenoidea and Neritoidea. This topology agrees with the fossil record and supports at least three independent invasions of land by neritimorph snails. The mitochondrial genomes of Titiscania sp., G. bangueyensis, N. usnea, and T. fluviatilis share the same gene organization previously described for Nerita mt genomes whereas that of P. jana has undergone major rearrangements. We sequenced about half of the mitochondrial genome of another species of Helicinoidea, Viana regina, and confirmed that this species shares the highly derived gene order of P. jana. Copyright © 2016 Elsevier Inc. All rights reserved.
Isolation and Whole-genome Sequence Analysis of the Imipenem Heteroresistant Acinetobacter baumannii Clinical Isolate HRAB-85.

PubMed

Li, Puyuan; Huang, Yong; Yu, Lan; Liu, Yannan; Niu, Wenkai; Zou, Dayang; Liu, Huiying; Zheng, Jing; Yin, Xiuyun; Yuan, Jing; Yuan, Xin; Bai, Changqing

2017-09-01

Heteroresistance is a phenomenon in which bacterial cells within the same population respond differently to antibiotics. Here, we isolated and characterised an imipenem heteroresistant Acinetobacter baumannii strain (HRAB-85). The genome of strain HRAB-85 was completely sequenced and analysed to understand its antibiotic resistance mechanisms. Population analysis and multilocus sequence typing were performed. Subpopulations grew in the presence of imipenem at concentrations of up to 64 μg/mL, and the strain was found to belong to ST208. The total genome length of strain HRAB-85 was 4,098,585 bp with a GC content of 39.98%. The genome harboured at least four insertion sequences: the common ISAba1, ISAba22, ISAba24, and newly reported ISAba26. Additionally, 19 antibiotic-resistance genes against eight classes of antimicrobial agents were found, and 11 genomic islands (GIs) were identified. Among them, GI3, GI10, and GI11 contained many ISs and antibiotic-resistance determinants. The existence of imipenem heteroresistant phenotypes in A. baumannii was substantiated in this hospital, and imipenem pressure, which could induce imipenem-heteroresistant subpopulations, may select for highly resistant strains. The complete genome sequencing and bioinformatics analysis of HRAB-85 could improve our understanding of the epidemiology and resistance mechanisms of carbapenem-heteroresistant A. baumannii. Copyright © 2017. Published by Elsevier Ltd.
The sea cucumber genome provides insights into morphological evolution and visceral regeneration

PubMed Central

Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B.; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng

2017-01-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs. PMID:29023486
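The contig and scaffold N50 values quoted above summarize assembly continuity. As a minimal sketch of the usual definition (the length of the sequence at which the cumulative length, counted from the longest sequence downwards, first reaches half of the total assembly size), the hypothetical helper below computes N50 from a list of contig or scaffold lengths. It is an illustration of the statistic, not the pipeline used by the authors.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the loop always returns")
```

For example, lengths 2 to 10 sum to 54; the three longest sequences (10, 9, 8) together reach 27, so the N50 is 8.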
The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

PubMed

Zhang, Xiaojun; Sun, Lina; Yuan, Jianbo; Sun, Yamin; Gao, Yi; Zhang, Libin; Li, Shihao; Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng; Xiang, Jianhai

2017-10-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
Molecular Characterization of the Complete Genome of Three Basal-BR Isolates of Turnip mosaic virus Infecting Raphanus sativus in China.

PubMed

Zhu, Fuxiang; Sun, Ying; Wang, Yan; Pan, Hongyu; Wang, Fengting; Zhang, Xianghui; Zhang, Yanhua; Liu, Jinliang

2016-06-04

Turnip mosaic virus (TuMV) infects crops of plant species in the family Brassicaceae worldwide. TuMV isolates cluster into five lineages corresponding to basal-B, basal-BR, Asian-BR, world-B and OMs. Here, we determined the complete genome sequences of three TuMV basal-BR isolates infecting radish from Shandong and Jilin Provinces in China. Their genomes were all composed of 9833 nucleotides, excluding the 3'-terminal poly(A) tail. They contained two open reading frames (ORFs), with the large one encoding a polyprotein of 3164 amino acids and the small overlapping ORF encoding a PIPO protein of 61 amino acids, which contained the typically conserved motifs found in members of the genus Potyvirus. In pairwise comparison with 30 other TuMV genome sequences, these three isolates shared their highest identities with isolates from Eurasian countries (Germany, Italy, Turkey and China). Recombination analysis showed that the three isolates in this study had no "clear" recombination. The analyses of conserved amino acids changed between groups showed that the codons in the TuMV outgroup (OGp) and OMs group were the same at three codon sites (852, 1006, 1548), and the other TuMV groups (basal-B, basal-BR, Asian-BR, world-B) were different. This pattern suggests that the codon in the OMs progenitor did not change but that in the other TuMV groups the progenitor sequence did change at divergence. Genetic diversity analyses indicate that the PIPO gene was under the highest selection pressure and the selection pressure on P3N-PIPO and P3 was almost the same. This suggests that most of the selection pressure on P3 was probably imposed through P3N-PIPO.
Comparative analyses of Xanthomonas and Xylella complete genomes.

PubMed

Moreira, Leandro M; De Souza, Robson F; Digiampietri, Luciano A; Da Silva, Ana C R; Setubal, João C

2005-01-01

Computational analyses of four bacterial genomes of the Xanthomonadaceae family reveal new unique genes that may be involved in adaptation, pathogenicity, and host specificity. The Xanthomonas genus presents 3636 unique genes distributed in 1470 families, while Xylella genus presents 1026 unique genes distributed in 375 families. Among Xanthomonas-specific genes, we highlight a large number of cell wall degrading enzymes, proteases, and iron receptors, a set of energy metabolism genes, second copy of the type II secretion system, type III secretion system, flagella and chemotactic machinery, and the xanthomonadin synthesis gene cluster. Important genes unique to the Xylella genus are an additional copy of a type IV pili gene cluster and the complete machinery of colicin V synthesis and secretion. Intersections of gene sets from both genera reveal a cluster of genes homologous to Salmonella's SPI-7 island in Xanthomonas axonopodis pv citri and Xylella fastidiosa 9a5c, which might be involved in host specificity. Each genome also presents important unique genes, such as an HMS cluster, the kdgT gene, and O-antigen in Xanthomonas axonopodis pv citri; a number of avrBS genes and a distinct O-antigen in Xanthomonas campestris pv campestris, a type I restriction-modification system and a nickase gene in Xylella fastidiosa 9a5c, and a type II restriction-modification system and two genes related to peptidoglycan biosynthesis in Xylella fastidiosa temecula 1. All these differences imply a considerable number of gene gains and losses during the divergence of the four lineages, and are associated with structural genome modifications that may have a direct relation with the mode of transmission, adaptation to specific environments and pathogenicity of each organism.
Bioinformatics analysis and genetic diversity of the poliovirus.

PubMed

Liu, Yanhan; Ma, Tengfei; Liu, Jianzhu; Zhao, Xiaona; Cheng, Ziqiang; Guo, Huijun; Wang, Shujing; Xu, Ruixue

2014-12-01

Poliomyelitis, a disease which can manifest as muscle paralysis, is caused by the poliovirus, which is a human enterovirus and member of the family Picornaviridae that usually transmits by the faecal-oral route. The viruses of the OPV (oral poliovirus attenuated-live vaccine) strains can mutate in the human intestine during replication and some of these mutations can lead to the recovery of serious neurovirulence. Informatics research of the poliovirus genome can be used to explain further the characteristics of this virus. In this study, sequences from 100 poliovirus isolates were acquired from GenBank. To determine the evolutionary relationship between the strains, we compared and analysed the sequences of the complete poliovirus genome and the VP1 region. The reconstructed phylogenetic trees for the complete sequences and the VP1 sequences were both divided into two branches, indicating that the genetic relationships of the whole poliovirus genome and the VP1 sequences are very similar. This branching indicates that the virulence and pathogenicity of poliomyelitis may be associated with the VP1 region. Sequence alignment of the VP1 region revealed numerous mutation sites in which mutation rates of >30 % were detected. In a group of strains recorded in the USA, mutation sites and mutation types were the same and this may be associated with their distribution in the evolutionary tree and their genetic relationship. In conclusion, the genetic evolutionary relationships of poliovirus isolate sequences are determined to a great extent by the VP1 protein, and poliovirus strains located on the same branch of the phylogenetic tree contain the same mutation spots and mutation types. Hence, the genetic characteristics of the VP1 region in the poliovirus genome should be analysed to identify the transmission route of poliovirus and provide the basis of viral immunity development. © 2014 The Authors.
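The per-site mutation rate mentioned above can be sketched as follows. The abstract does not say how the rate was calculated, so this example assumes one simple definition: for each alignment column, the fraction of non-gap residues that differ from the most common residue in that column. The function names and the 30% default threshold parameter are hypothetical and only mirror the figure quoted in the text.

```python
from collections import Counter


def site_mutation_rates(alignment: list[str]) -> list[float]:
    """Per-column fraction of residues differing from the column consensus.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    rates = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            rates.append(0.0)  # column contains only gaps
            continue
        consensus_count = Counter(residues).most_common(1)[0][1]
        rates.append(1 - consensus_count / len(residues))
    return rates


def variable_sites(alignment: list[str], threshold: float = 0.30) -> list[int]:
    """Zero-based column indices whose mutation rate exceeds the threshold."""
    rates = site_mutation_rates(alignment)
    return [i for i, rate in enumerate(rates) if rate > threshold]
```

With the toy alignment ["ACGT", "ACGA", "ACTA", "ACTT"], the first two columns are invariant and the last two each split evenly, so columns 2 and 3 are reported as variable.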
Seven new dolphin mitochondrial genomes and a time-calibrated phylogeny of whales

PubMed Central

Xiong, Ye; Brandley, Matthew C; Xu, Shixia; Zhou, Kaiya; Yang, Guang

2009-01-01

Background: The phylogeny of Cetacea (whales) is not fully resolved with substantial support. The ambiguous and conflicting results of multiple phylogenetic studies may be the result of the use of too little data, phylogenetic methods that do not adequately capture the complex nature of DNA evolution, or both. In addition, there is also evidence that the generic taxonomy of Delphinidae (dolphins) underestimates its diversity. To remedy these problems, we sequenced the complete mitochondrial genomes of seven dolphins and analyzed these data with partitioned Bayesian analyses. Moreover, we incorporate a newly-developed "relaxed" molecular clock to model heterogeneous rates of evolution among cetacean lineages. Results: The "deep" phylogenetic relationships are well supported including the monophyly of Cetacea and Odontoceti. However, there is ambiguity in the phylogenetic affinities of two of the river dolphin clades Platanistidae (Indian River dolphins) and Lipotidae (Yangtze River dolphins). The phylogenetic analyses support a sister relationship between Delphinidae and Monodontidae + Phocoenidae. Additionally, there is statistically significant support for the paraphyly of Tursiops (bottlenose dolphins) and Stenella (spotted dolphins). Conclusion: Our phylogenetic analysis of complete mitochondrial genomes using recently developed models of rate autocorrelation resolved the phylogenetic relationships of the major Cetacean lineages with a high degree of confidence. Our results indicate that a rapid radiation of lineages explains the lack of support for the placement of Platanistidae and Lipotidae. Moreover, our estimation of molecular divergence dates indicates that these radiations occurred in the Middle to Late Oligocene and Middle Miocene, respectively. Furthermore, by collecting and analyzing seven new mitochondrial genomes, we provide strong evidence that the delphinid genera Tursiops and Stenella are not monophyletic, and the current taxonomy masks potentially interesting patterns of morphological, physiological, behavioral, and ecological evolution. PMID:19166626
Multiplexed pyrosequencing of nine sea anemone (Cnidaria: Anthozoa: Hexacorallia: Actiniaria) mitochondrial genomes.

PubMed

Foox, Jonathan; Brugler, Mercer; Siddall, Mark Edward; Rodríguez, Estefanía

2016-07-01

Six complete and three partial actiniarian mitochondrial genomes were amplified in two semi-circles using long-range PCR and pyrosequenced in a single run on a 454 GS Junior, doubling the number of complete mitogenomes available within the order. Typical metazoan mtDNA features included circularity, 13 protein-coding genes, 2 ribosomal RNA genes, and length ranging from 17,498 to 19,727 bp. Several typical anthozoan mitochondrial genome features were also observed including the presence of only two transfer RNA genes, elevated A + T richness ranging from 54.9 to 62.4%, large intergenic regions, and group 1 introns interrupting NADH dehydrogenase subunit 5 and cytochrome c oxidase subunit I, the latter of which possesses a homing endonuclease gene. Within the sea anemone Alicia sansibarensis, we report the first mitochondrial gene order rearrangement within the Actiniaria, as well as putative novel non-canonical protein-coding genes. Phylogenetic analyses of all 13 protein-coding and 2 ribosomal genes largely corroborated current hypotheses of sea anemone interrelatedness, with a few lower-level differences.
The first complete mitochondrial genome of Dacus longicornis (Diptera: Tephritidae) using next-generation sequencing and mitochondrial genome phylogeny of Dacini tribe

PubMed Central

Jiang, Fan; Pan, Xubin; Li, Xuankun; Yu, Yanxue; Zhang, Junhua; Jiang, Hongshan; Dou, Liduo; Zhu, Shuifang

2016-01-01

The genus Dacus is one of the most economically important genera of tephritid fruit flies. The first complete mitochondrial genome (mitogenome) of a Dacus species, D. longicornis, was sequenced by next-generation sequencing in order to develop mitogenome data for this genus. The circular 16,253 bp mitogenome contains the typical set and arrangement of 37 genes present in the ancestral insect. The mitogenome data of D. longicornis were compared to all the published homologous sequences of other tephritid species. We discovered that the subgenera Bactrocera, Daculus and Tetradacus differed from the subgenus Zeugodacus and the genera Dacus, Ceratitis and Procecidochares in possessing a TA instead of a TAA stop codon for the COI gene. The TA stop codon in COI may be a synapomorphy of the Bactrocera group within the genus Bactrocera relative to other Tephritidae species. Phylogenetic analyses based on the mitogenome data from Tephritidae, inferred by Bayesian and maximum-likelihood methods, strongly supported the sister relationship between Zeugodacus and Dacus. PMID:27812024
Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Gary; Detter, John C; Bruce, David C

We present here the complete 2.4 Mb genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNA than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.
Complete genome of the cellulolytic thermophile Acidothermus cellulolyticus 11B provides insights into its ecophysiological and evolutionary adaptations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Gary; Detter, Chris; Bruce, David

We present here the complete 2.4 Mb genome of the actinobacterial thermophile, Acidothermus cellulolyticus 11B, that surprisingly reveals thermophilic amino acid usage in only the cytosolic subproteome rather than its whole proteome. Thermophilic amino acid usage in the partial proteome implies a recent, ongoing evolution of the A. cellulolyticus genome since its divergence about 200-250 million years ago from its closest phylogenetic neighbor Frankia, a mesophilic plant symbiont. Differential amino acid usage in the predicted subproteomes of A. cellulolyticus likely reflects a stepwise evolutionary process of modern thermophiles in general. An unusual occurrence of higher G+C in the non-coding DNA than in the transcribed genome reinforces a late evolution from a higher G+C common ancestor. Comparative analyses of the A. cellulolyticus genome with those of Frankia and other closely-related actinobacteria revealed that A. cellulolyticus genes exhibit reciprocal purine preferences at the first and third codon positions, perhaps reflecting a subtle preference for the dinucleotide AG in its mRNAs, a possible adaptation to a thermophilic environment. Other interesting features in the genome of this cellulolytic, hot-springs dwelling prokaryote reveal streamlining for adaptation to its specialized ecological niche. These include a low occurrence of pseudogenes or mobile genetic elements, a flagellar gene complement previously unknown in this organism, and presence of laterally-acquired genomic islands of likely ecophysiological value. New glycoside hydrolases relevant for lignocellulosic biomass deconstruction were identified in the genome, indicating a diverse biomass-degrading enzyme repertoire several-fold greater than previously characterized, and significantly elevating the industrial value of this organism.
Complete Cellulase System in the Marine Bacterium Saccharophagus degradans Strain 2-40T

PubMed Central

Taylor, Larry E.; Henrissat, Bernard; Coutinho, Pedro M.; Ekborg, Nathan A.; Hutcheson, Steven W.; Weiner, Ronald M.

2006-01-01

Saccharophagus degradans strain 2-40 is a representative of an emerging group of marine complex polysaccharide (CP)-degrading bacteria. It is unique in its metabolic versatility, being able to degrade at least 10 distinct CPs from diverse algal, plant and invertebrate sources. The S. degradans genome has been sequenced to completion, and more than 180 open reading frames have been identified that encode carbohydrases. Over half of these are likely to act on plant cell wall polymers. In fact, there appears to be a full array of enzymes that degrade and metabolize plant cell walls. Genomic and proteomic analyses reveal 13 cellulose depolymerases complemented by seven accessory enzymes, including two cellodextrinases, three cellobiases, a cellodextrin phosphorylase, and a cellobiose phosphorylase. Most of these enzymes exhibit modular architecture, and some contain novel combinations of catalytic and/or substrate binding modules. This is exemplified by endoglucanase Cel5A, which has three internal family 6 carbohydrate binding modules (CBM6) and two catalytic modules from family five of glycosyl hydrolases (GH5) and by Cel6A, a nonreducing-end cellobiohydrolase from family GH6 with tandem CBM2s. This is the first report of a complete and functional cellulase system in a marine bacterium with a sequenced genome. PMID:16707677
Complete sequence and analysis of the mitochondrial genome of Hemiselmis andersenii CCMP644 (Cryptophyceae).

PubMed

Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

2008-05-12

Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of an approximately 20 kbp long intergenic region comprised of numerous tandem and dispersed repeat units of 22-336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages.
Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol.
The new modern era of yeast genomics: community sequencing and the resulting annotation of multiple Saccharomyces cerevisiae strains at the Saccharomyces Genome Database

PubMed Central

Engel, Stacia R.; Cherry, J. Michael

2013-01-01

The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery. Database URL: http://www.yeastgenome.org/ PMID:23487186
Rapid and Recent Evolution of LTR Retrotransposons Drives Rice Genome Evolution During the Speciation of AA-Genome Oryza Species

PubMed Central

Zhang, Qun-Jie; Gao, Li-Zhi

2017-01-01

The dynamics of long terminal repeat (LTR) retrotransposons and their contribution to genome evolution during plant speciation have remained largely unanswered. Here, we perform a genome-wide comparison of all eight Oryza AA-genome species, and identify 3911 intact LTR retrotransposons classified into 790 families. The top 44 most abundant LTR retrotransposon families show patterns of rapid and distinct diversification since the species split over the last ∼4.8 MY (million years). Phylogenetic and read depth analyses of 11 representative retrotransposon families further provide a comprehensive evolutionary landscape of these changes. Compared with Ty1-copia, independent bursts of Ty3-gypsy retrotransposon expansions have occurred with the three largest showing signatures of lineage-specific evolution. The estimated insertion times of 2213 complete retrotransposons from the top 23 most abundant families reveal divergent life histories marked by speedy accumulation, decline, and extinction that differed radically between species. We hypothesize that this rapid evolution of LTR retrotransposons not only divergently shaped the architecture of rice genomes but also contributed to the process of speciation and diversification of rice. PMID:28413161

A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus

PubMed Central

Yao, Gang

2017-01-01

The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparatively higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58–62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species. PMID:29134154
A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus.

PubMed

Kong, Hanghui; Liu, Wanzhen; Yao, Gang; Gong, Wei

2017-01-01

The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparatively higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58-62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.
The complete mitochondrial genome of the tapeworm Cladotaenia vulturi (Cestoda: Paruterinidae): gene arrangement and phylogenetic relationships with other cestodes.

PubMed

Guo, Aijiang

2016-08-31

Tapeworms Cladotaenia spp. are among the most important wildlife pathogens in birds of prey. The genus Cladotaenia is placed in the family Paruterinidae based on morphological characteristics and hosts. However, limited molecular information is available for studying the phylogenetic position of this genus in relation to other cestodes. In this study, the complete mitochondrial (mt) genome of Cladotaenia vulturi was amplified using "Long-PCR" and then sequenced by primer walking. Sequence annotation and gene identification were performed by comparison with published flatworm mt genomes. The phylogenetic relationships of C. vulturi with other cestode species were established using the concatenated amino acid sequences of 12 protein-coding genes with Bayesian Inference and Maximum Likelihood methods. The complete mitochondrial genome of Cladotaenia vulturi is 13,411 bp in size and contains 36 genes. The gene arrangement of C. vulturi is identical to those in Anoplocephala spp. (Anoplocephalidae), Hymenolepis spp. (Hymenolepididae) and Dipylidium caninum (Dipylidiidae), but different from that in taeniids owing to the order shift between the tRNA (L1) and tRNA (S2) genes. Phylogenetic analyses based on the amino acid sequences of the concatenated 12 protein-coding genes showed that the species in the Taeniidae form a group and C. vulturi is a sister taxon to the species of the family Taeniidae. To our knowledge, the present study provides the first molecular data to support the early proposal from morphological evidence that the Taeniidae is a sister group to the family Paruterinidae. This novel mt genome sequence will be useful for further investigations into the population genetics, phylogenetics and systematics of the family Paruterinidae and inferring phylogenetic relationships among several lineages within the order Cyclophyllidea.
Non-Gaussian Distributions Affect Identification of Expression Patterns, Functional Annotation, and Prospective Classification in Human Cancer Genomes

PubMed Central

Marko, Nicholas F.; Weil, Robert J.

2012-01-01

Introduction: Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods: We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results: Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions: Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
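The central moments analysis mentioned in the abstract above can be illustrated with a minimal sketch. It computes sample skewness and excess kurtosis from population central moments; both are zero for an ideal normal distribution. The function names and the toy values are assumptions made for illustration and are not taken from the paper, which does not describe its implementation here.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    m2 = central_moment(values, 2)
    m3 = central_moment(values, 3)
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a normal distribution."""
    m2 = central_moment(values, 2)
    m4 = central_moment(values, 4)
    return m4 / m2 ** 2 - 3.0


# Hypothetical expression values, chosen only to show the behaviour.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

A single large value is enough to push the skewness of the second sample well above zero, which is the kind of departure from normality the abstract reports for real expression data.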
The genomic and biological characterization of Citrullus lanatus cryptic virus infecting watermelon in China.

PubMed

Xin, Min; Cao, Mengji; Liu, Wenwen; Ren, Yingdang; Lu, Chuantao; Wang, Xifeng

2017-03-15

A dsRNA virus was detected in watermelon (Citrullus lanatus) samples collected from Kaifeng, Henan province, China through the use of next generation sequencing of small RNAs. The complete genome of this virus is comprised of dsRNA-1 (1603 nt) and dsRNA-2 (1466 nt), each of which contains a single open reading frame; they potentially encode a 54.2 kDa RNA-dependent RNA polymerase (RdRp) and a 45.9 kDa coat protein (CP), respectively. The RdRp and CP share the highest amino acid identities, 85.3% and 75.4% respectively, with a previously reported Israeli strain of Citrullus lanatus cryptic virus (CiLCV). Genome comparisons indicate that this virus belongs to the same species as CiLCV, whereas the reported sequences of the Israeli strain of CiLCV are partial, and our newly identified sequences can represent the complete genome of CiLCV. Furthermore, phylogenetic tree analyses based on the RdRp sequences suggest that CiLCV is a member of the genus Deltapartitivirus, family Partitiviridae. In addition, field investigation and seed-borne bioassays show that CiLCV commonly occurs in many varieties and is transmitted through seeds at a very high rate. Copyright © 2017 Elsevier B.V. All rights reserved.
The first complete mitochondrial genome of Bactrocera tsuneonis (Miyake) (Diptera: Tephritidae) by next-generation sequencing and its phylogenetic implications.

PubMed

Zhang, Yue; Feng, Shiqian; Zeng, Yiying; Ning, Hong; Liu, Lijun; Zhao, Zihua; Jiang, Fan; Li, Zhihong

2018-06-23

Bactrocera tsuneonis (Miyake), generally known as the Japanese orange fly, is considered to be a major pest of commercial citrus crops. It has a limited distribution in China, Japan and Vietnam, but it has the potential to invade areas outside of Asia. More genetic information of B. tsuneonis should be obtained in order to develop effective methodologies for rapid and accurate molecular identification due to the difficulty of distinguishing it from Bactrocera minax based on morphological features. We report here the whole mitochondrial genome of B. tsuneonis sequenced by next-generation sequencing. This mitogenome sequence had a total length of 15,865 bp, a typical circular molecule comprising 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The structure and organization of the molecule were typical and similar compared with the published homologous sequences of other fruit flies in Tephritidae. The phylogenetic analyses based on the mitochondrial genome data presented a close genetic relationship between B. tsuneonis and B. minax. This is the first report of the complete mitochondrial genome of B. tsuneonis, and it can be used in further studies of species diagnosis, evolutionary biology, prevention and control. Copyright © 2018. Published by Elsevier B.V.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes

PubMed Central

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-01-01

Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
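The Smith–Waterman algorithm named in the abstract above is a dynamic programming method for local alignment. The following is a minimal sketch that returns only the best local alignment score. The scoring values (match, mismatch, linear gap penalty) are illustrative assumptions; a protein comparison such as the one described would normally use a substitution matrix and affine gap penalties, and the abstract does not state which parameters the project used.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses a simple match/mismatch score and a linear gap penalty
    (assumed values, for illustration only). Only two rows of the
    dynamic programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                            # start a new local alignment
                prev[j - 1] + substitution,   # align a[i-1] with b[j-1]
                prev[j] + gap,                # gap in b
                curr[j - 1] + gap,            # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Because every cell of the table is filled, the cost grows with the product of the two sequence lengths, which is why an all-against-all comparison of millions of sequences needed grid computing rather than a heuristic shortcut.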
First Complete Genome Sequence of Pepper vein yellows virus from Australia

PubMed Central

Maina, Solomon; Edwards, Owain R.

2016-01-01

We present here the first complete genomic RNA sequence of the polerovirus Pepper vein yellows virus (PeVYV) obtained from a pepper plant in Australia. We compare it with complete PeVYV genomes from Japan and China. The Australian genome was more closely related to the Japanese than the Chinese genome. PMID:27231375
Origins of the Xylella fastidiosa Prophage-Like Regions and Their Impact in Genome Differentiation

PubMed Central

de Mello Varani, Alessandro; Souza, Rangel Celso; Nakaya, Helder I.; de Lima, Wanessa Cristina; Paula de Almeida, Luiz Gonzaga; Kitajima, Elliot Watanabe; Chen, Jianchi; Civerolo, Edwin; Vasconcelos, Ana Tereza Ribeiro; Van Sluys, Marie-Anne

2008-01-01

Xylella fastidiosa is a Gram-negative plant pathogen causing many economically important diseases, and analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possibly phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions on two finished genomes (9a5c and Temecula1), and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrographs reveal the existence of putative viral particles with similar morphology to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions presents enhanced expression of phage anti-repressor genes, suggesting switches from lysogenic to lytic cycle of phages under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, thus suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal they group in five lineages, all possessing a tyrosine recombinase catalytic domain, and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity on the differentiation of Xylella genomes. PMID:19116666
Genome-wide investigation and expression analysis of AP2-ERF gene family in salt tolerant common bean

PubMed Central

Kavas, Musa; Kizildogan, Aslihan; Gökdemir, Gökhan; Baloglu, Mehmet Cengiz

2015-01-01

Apetala2-ethylene-responsive element binding factor (AP2-ERF) superfamily members, which share a common AP2 DNA-binding domain, have developmentally and physiologically important roles in plants. Since the common bean genome project has been completed recently, it is possible to identify all of the AP2-ERF genes in the common bean genome. In this study, a comprehensive genome-wide in silico analysis identified 180 AP2-ERF superfamily genes in common bean (Phaseolus vulgaris). Based on the amino acid alignment and phylogenetic analyses, superfamily members were classified into four subfamilies: DREB (54), ERF (95), AP2 (27) and RAV (3), as well as one soloist. The physical and chemical characteristics of amino acids, interaction between AP2-ERF proteins, cis elements of promoter region of AP2-ERF genes and phylogenetic trees were predicted and analyzed. Additionally, expression levels of AP2-ERF genes were evaluated by in silico and qRT-PCR analyses. In silico micro-RNA target transcript analyses identified nearly all PvAP2-ERF genes as targets of miRNAs from 44 different plant species. The most abundant target genes were PvAP2/ERF-20-25-62-78-113-173. miR156, miR172 and miR838 were the most important miRNAs found in targeting and BLAST analyses. Interactome analysis revealed that the transcription factor PvAP2-ERF78, an ortholog of Arabidopsis At2G28550, potentially interacted with at least 15 proteins, indicating that it was very important in transcriptional regulation. Here we present the first study to identify and characterize the AP2-ERF transcription factors in common bean using whole-genome analysis, and the findings may serve as a reference for future functional research on the transcription factors in common bean. PMID:27152109
Phenetic Comparison of Prokaryotic Genomes Using k-mers

PubMed Central

Déraspe, Maxime; Raymond, Frédéric; Boisvert, Sébastien; Culley, Alexander; Roy, Paul H.; Laviolette, François; Corbeil, Jacques

2017-01-01

Bacterial genomics studies are getting more extensive and complex, requiring new ways to envision analyses. Using the Ray Surveyor software, we demonstrate that comparison of genomes based on their k-mer content allows reconstruction of phenetic trees without the need of prior data curation, such as core genome alignment of a species. We validated the methodology using simulated genomes and previously published phylogenomic studies of Streptococcus pneumoniae and Pseudomonas aeruginosa. We also investigated the relationship of specific genetic determinants with bacterial population structures. By comparing clusters from the complete genomic content of a genome population with clusters from specific functional categories of genes, we can determine how the population structures are correlated. Indeed, the strain clustering based on a subset of k-mers allows determination of its similarity with the whole genome clusters. We also applied this methodology on 42 species of bacteria to determine the correlational significance of five important bacterial genomic characteristics. For example, intrinsic resistance is more important in P. aeruginosa than in S. pneumoniae, and the former has increased correlation of its population structure with antibiotic resistance genes. The global view of the pangenome of bacteria also demonstrated the taxa-dependent interaction of population structure with antibiotic resistance, bacteriophage, plasmid, and mobile element k-mer data sets. PMID:28957508
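The idea of comparing genomes by their k-mer content, as described in the abstract above, can be sketched as follows. This is not the Ray Surveyor implementation: the Jaccard distance, the function names, and the toy sequences are assumptions chosen for illustration, and the abstract does not specify which similarity measure the software uses. The resulting pairwise distances are the kind of input from which a phenetic tree could be built.

```python
def kmer_set(sequence, k):
    """All distinct substrings of length k in the sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """1 minus shared k-mers divided by total distinct k-mers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes, k):
    """Pairwise k-mer distances for a dict of name -> sequence."""
    names = sorted(genomes)
    profiles = {name: kmer_set(genomes[name], k) for name in names}
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x in names
        for y in names
    }


# Hypothetical toy "genomes"; real inputs would be whole assemblies.
toy_genomes = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTTTTT"}
distances = distance_matrix(toy_genomes, k=3)
print(distances[("a", "b")], distances[("a", "c")])
```

No alignment or gene annotation is needed at any step, which matches the abstract's point that k-mer comparison avoids prior data curation such as core genome alignment.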
Genomic and archaeological evidence suggest a dual origin of domestic dogs.

PubMed

Frantz, Laurent A F; Mullin, Victoria E; Pionnier-Capitan, Maud; Lebrasseur, Ophélie; Ollivier, Morgane; Perri, Angela; Linderholm, Anna; Mattiangeli, Valeria; Teasdale, Matthew D; Dimopoulos, Evangelos A; Tresset, Anne; Duffraisse, Marilyne; McCormick, Finbar; Bartosiewicz, László; Gál, Erika; Nyerges, Éva A; Sablin, Mikhail V; Bréhard, Stéphanie; Mashkour, Marjan; Bălăşescu, Adrian; Gillet, Benjamin; Hughes, Sandrine; Chassaing, Olivier; Hitte, Christophe; Vigne, Jean-Denis; Dobney, Keith; Hänni, Catherine; Bradley, Daniel G; Larson, Greger

2016-06-03

The geographic and temporal origins of dogs remain controversial. We generated genetic sequences from 59 ancient dogs and a complete (28x) genome of a late Neolithic dog (dated to ~4800 calendar years before the present) from Ireland. Our analyses revealed a deep split separating modern East Asian and Western Eurasian dogs. Surprisingly, the date of this divergence (~14,000 to 6400 years ago) occurs commensurate with, or several millennia after, the first appearance of dogs in Europe and East Asia. Additional analyses of ancient and modern mitochondrial DNA revealed a sharp discontinuity in haplotype frequencies in Europe. Combined, these results suggest that dogs may have been domesticated independently in Eastern and Western Eurasia from distinct wolf populations. East Eurasian dogs were then possibly transported to Europe with people, where they partially replaced European Paleolithic dogs. Copyright © 2016, American Association for the Advancement of Science.
Comparative genomic analysis by microbial COGs self-attraction rate.

PubMed

Santoni, Daniele; Romano-Spica, Vincenzo

2009-06-21

Whole genome analysis provides new perspectives to determine phylogenetic relationships among microorganisms. The availability of whole nucleotide sequences allows different levels of comparison among genomes by several approaches. In this work, self-attraction rates were considered for each cluster of orthologous groups of proteins (COGs) class in order to analyse gene aggregation levels in physical maps. Phylogenetic relationships among microorganisms were obtained by comparing self-attraction coefficients. Eighteen-dimensional vectors were computed for a set of 168 completely sequenced microbial genomes (19 archaea, 149 bacteria). The components of the vector represent the aggregation rate of the genes belonging to each of 18 COGs classes. Genes involved in nonessential functions or related to environmental conditions showed the highest aggregation rates. In contrast, genes involved in basic cellular tasks showed a more uniform distribution along the genome, except for translation genes. The self-attraction clustering approach allowed classification of Proteobacteria, Bacilli and other species belonging to Firmicutes. Rearrangement and Lateral Gene Transfer events may influence divergences from classical taxonomy. Each set of COG classes' aggregation values represents an intrinsic property of the microbial genome. This novel approach provides a new point of view for whole genome analysis and bacterial characterization.
Rat Genome and Model Resources.

PubMed

Shimoyama, Mary; Smith, Jennifer R; Bryda, Elizabeth; Kuramoto, Takashi; Saba, Laura; Dwinell, Melinda

2017-07-01

Rats remain a major model for studying disease mechanisms and discovery, validation, and testing of new compounds to improve human health. The rat's value continues to grow as indicated by the more than 1.4 million publications (second to human) at PubMed documenting important discoveries using this model. Advanced sequencing technologies, genome modification techniques, and the development of embryonic stem cell protocols ensure the rat remains an important mammalian model for disease studies. The 2004 release of the reference genome has been followed by the production of complete genomes for more than two dozen individual strains utilizing NextGen sequencing technologies; their analyses have identified over 80 million variants. This explosion in genomic data has been accompanied by the ability to selectively edit the rat genome, leading to hundreds of new strains through multiple technologies. A number of resources have been developed to provide investigators with access to precision rat models, comprehensive datasets, and sophisticated software tools necessary for their research. Those profiled here include the Rat Genome Database, PhenoGen, Gene Editing Rat Resource Center, Rat Resource and Research Center, and the National BioResource Project for the Rat in Japan. © The Author 2017. Published by Oxford University Press.
Genome sequence of the Thermotoga thermarum type strain (LA3(T)) from an African solfataric spring.

PubMed

Göker, Markus; Spring, Stefan; Scheuner, Carmen; Anderson, Iain; Zeytun, Ahmet; Nolan, Matt; Lucas, Susan; Tice, Hope; Del Rio, Tijana Glavina; Cheng, Jan-Fang; Han, Cliff; Tapia, Roxanne; Goodwin, Lynne A; Pitluck, Sam; Liolios, Konstantinos; Mavromatis, Konstantinos; Pagani, Ioanna; Ivanova, Natalia; Mikhailova, Natalia; Pati, Amrita; Chen, Amy; Palaniappan, Krishna; Land, Miriam; Hauser, Loren; Chang, Yun-Juan; Jeffries, Cynthia D; Rohde, Manfred; Detter, John C; Woyke, Tanja; Bristow, James; Eisen, Jonathan A; Markowitz, Victor; Hugenholtz, Philip; Kyrpides, Nikos C; Klenk, Hans-Peter; Lapidus, Alla

2014-06-15

Thermotoga thermarum Windberger et al. 1989 is a member of the genomically well-characterized genus Thermotoga in the phylum 'Thermotogae'. T. thermarum is of interest for its origin from a continental solfataric spring vs. predominantly marine oil reservoirs of other members of the genus. The genome of strain LA3(T) also provides fresh data for the phylogenomic positioning of the (hyper-)thermophilic bacteria. T. thermarum strain LA3(T) is the fourth sequenced genome of a type strain from the genus Thermotoga, and the sixth in the family Thermotogaceae to be formally described in a publication. Phylogenetic analyses do not reveal significant discrepancies between the current classification of the group, 16S rRNA gene data and whole-genome sequences. Nevertheless, T. thermarum significantly differs from other Thermotoga species regarding its iron-sulfur cluster synthesis, as it contains only a minimal set of the necessary proteins. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,039,943 bp long chromosome with its 2,015 protein-coding and 51 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
The mitochondrial genome of booklouse, Liposcelis sculptilis (Psocoptera: Liposcelididae) and the evolutionary timescale of Liposcelis

PubMed Central

Shi, Yan; Chu, Qing; Wei, Dan-Dan; Qiu, Yuan-Jian; Shang, Feng; Dou, Wei; Wang, Jin-Jun

2016-01-01

Bilateral animals are characterized by an extremely compact mitochondrial (mt) genome with 37 genes on a single circular chromosome. To date, the complete mt genome has only been determined for four species of Liposcelis, a genus with economic importance, including L. entomophila, L. decolor, L. bostrychophila, and L. paeta. They belong to the A, B, or D group of Liposcelis, respectively. Unlike most bilateral animals, L. bostrychophila, L. entomophila and L. paeta have a bipartite mt genome with genes on two chromosomes. However, the mt genome of L. decolor has the typical mt chromosome of bilateral animals. Here, we sequenced the mt genome of L. sculptilis, and identified 35 genes, which were on a single chromosome. The mt genome fragmentation is not shared by the D group of Liposcelis, and the single chromosome of L. sculptilis differed from those of known booklice in gene content and gene arrangement. We inferred that different evolutionary patterns and rates exist in Liposcelis. Further, we reconstructed the evolutionary history of 21 psocodean taxa with phylogenetic analyses, which suggested that Liposcelididae and Phthiraptera have evolved 134 Ma and the sucking lice diversified in the Late Cretaceous. PMID:27470659
Genome sequence of the Thermotoga thermarum type strain (LA3(T)) from an African solfataric spring

DOE PAGES

Göker, Markus; Spring, Stefan; Scheuner, Carmen; ...

2014-06-15

Thermotoga thermarum Windberger et al. 1989 is a member of the genomically well-characterized genus Thermotoga in the phylum 'Thermotogae'. T. thermarum is of interest for its origin from a continental solfataric spring vs. predominantly marine oil reservoirs of other members of the genus. The genome of strain LA3(T) also provides fresh data for the phylogenomic positioning of the (hyper-)thermophilic bacteria. T. thermarum strain LA3(T) is the fourth sequenced genome of a type strain from the genus Thermotoga, and the sixth in the family Thermotogaceae to be formally described in a publication. Phylogenetic analyses do not reveal significant discrepancies between the current classification of the group, 16S rRNA gene data and whole-genome sequences. Nevertheless, T. thermarum significantly differs from other Thermotoga species regarding its iron-sulfur cluster synthesis, as it contains only a minimal set of the necessary proteins. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,039,943 bp long chromosome with its 2,015 protein-coding and 51 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Whole genome sequence analysis of BT-474 using Complete Genomics' standard and long fragment read technologies.

PubMed

Ciotlos, Serban; Mao, Qing; Zhang, Rebecca Yu; Li, Zhenyu; Chin, Robert; Gulbahce, Natali; Liu, Sophie Jia; Drmanac, Radoje; Peters, Brock A

2016-01-01

The cell line BT-474 is a popular cell line for studying the biology of cancer and developing novel drugs. However, there is no complete, published genome sequence for this highly utilized scientific resource. In this study we sought to provide a comprehensive and useful data set for the scientific community by generating a whole genome sequence for BT-474. Five μg of genomic DNA, isolated from an early passage of the BT-474 cell line, was used to generate a whole genome sequence (114X coverage) using Complete Genomics' standard sequencing process. To provide additional variant phasing and structural variation data we also processed and analyzed two separate libraries of 5 and 6 individual cells to depths of 99X and 87X, respectively, using Complete Genomics' Long Fragment Read (LFR) technology. BT-474 is a highly aneuploid cell line with an extremely complex genome sequence. This ~300X total coverage genome sequence provides a more complete understanding of this highly utilized cell line at the genomic level.
Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation.

PubMed

Yu, Li; Li, Yi-Wei; Ryder, Oliver A; Zhang, Ya-Ping

2007-10-24

Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family represents a typical example of rapid evolutionary radiation. Previous analyses with a single mitochondrial (mt) gene or a small number of mt genes either provide weak support or a large unresolved polytomy for ursids. We revisit the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events. This mitochondrial genome-based phylogeny provides strong evidence that the spectacled bear diverged first, while within the genus Ursus, the sloth bear is the sister taxon of all the other five ursines. The latter group is divided into the brown bear/polar bear and the two black bears/sun bear assemblages. These findings resolve the previous conflicts between trees using partial mt genes. The ability of different categories of mt protein coding genes to recover the correct phylogeny is concordant with previous analyses for taxa with deep divergence times. This study provides a robust Ursidae phylogenetic framework for future validation by additional independent evidence, and also has significant implications for assisting in the resolution of other similarly difficult phylogenetic investigations. Identification of base composition bias and utilization of the combined data of whole mitochondrial genome sequences has allowed recovery of a strongly supported phylogeny that is upheld when using multiple alternative outgroups for the Ursidae, a mammalian family that underwent a rapid radiation since the mid- to late Pliocene. It remains to be seen if the reliability of mt genome analysis will hold up in studies of other difficult phylogenetic issues. 
Although the whole mitochondrial DNA sequence based phylogeny is robust, it remains in conflict with phylogenetic relationships suggested by analysis of limited nuclear-encoded data, a situation that will require gathering more nuclear DNA sequence information.

Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation

PubMed Central

Yu, Li; Li, Yi-Wei; Ryder, Oliver A; Zhang, Ya-Ping

2007-01-01

Background Despite the small number of ursid species, bear phylogeny has long been a focus of study due to their conservation value, as all bear genera have been classified as endangered at either the species or subspecies level. The Ursidae family represents a typical example of rapid evolutionary radiation. Previous analyses with a single mitochondrial (mt) gene or a small number of mt genes either provide weak support or a large unresolved polytomy for ursids. We revisit the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events. Results This mitochondrial genome-based phylogeny provides strong evidence that the spectacled bear diverged first, while within the genus Ursus, the sloth bear is the sister taxon of all the other five ursines. The latter group is divided into the brown bear/polar bear and the two black bears/sun bear assemblages. These findings resolve the previous conflicts between trees using partial mt genes. The ability of different categories of mt protein coding genes to recover the correct phylogeny is concordant with previous analyses for taxa with deep divergence times. This study provides a robust Ursidae phylogenetic framework for future validation by additional independent evidence, and also has significant implications for assisting in the resolution of other similarly difficult phylogenetic investigations. Conclusion Identification of base composition bias and utilization of the combined data of whole mitochondrial genome sequences has allowed recovery of a strongly supported phylogeny that is upheld when using multiple alternative outgroups for the Ursidae, a mammalian family that underwent a rapid radiation since the mid- to late Pliocene. 
It remains to be seen if the reliability of mt genome analysis will hold up in studies of other difficult phylogenetic issues. Although the whole mitochondrial DNA sequence based phylogeny is robust, it remains in conflict with phylogenetic relationships suggested by analysis of limited nuclear-encoded data, a situation that will require gathering more nuclear DNA sequence information. PMID:17956639
Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

PubMed Central

Kim, Eunsoo; Lane, Christopher E; Curtis, Bruce A; Kozera, Catherine; Bowman, Sharen; Archibald, John M

2008-01-01

Background Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages. 
Conclusion Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol. PMID:18474103
Complete Genome Sequence of Pigmentation Negative Yersinia pestis strain Cadman

DTIC Science & Technology

2016-10-27

Institute of Infectious Diseases, Fort Detrick, Frederick, Maryland, USA. Running head: Complete Genome Sequence of Y. pestis strain Cadman ... Complete Genome Sequence of Pigmentation Negative Yersinia pestis strain Cadman. Sean Lovett, Kitty Chase, Galina Koroleva, Gustavo ... we report the genome sequence of Yersinia pestis strain Cadman, an attenuated strain lacking the pgm locus. Y. pestis is the causative agent of
Complete mitochondrial genome of freshwater shark Wallago attu (Bloch & Schneider) from Indus River Sindh, Pakistan.

PubMed

Laghari, Muhammad Younis; Lashari, Punhal; Xu, Peng; Zhao, Zixia; Jiang, Li; Narejo, Naeem Tariq; Xin, Baoping; Sun, Xiaowen; Zhang, Yan

2016-01-01

The complete mitochondrial genome of the freshwater giant catfish, Wallago attu, was isolated by LA PCR (TaKaRa LA Taq, Dalian, China) and sequenced by Sanger's method. The complete mitogenome is 15,639 bp in length and contains 13 typical vertebrate protein-coding genes, 2 rRNA genes and 22 tRNA genes. The whole-genome base composition was estimated to be 31.17% A, 28.15% C, 15.55% G and 25.12% T. The complete mitochondrial genome of the catfish W. attu provides fundamental tools for genetic breeding.
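Base composition figures like those reported above are simple per-base percentages, and the four reported values sum to 99.99% because each is rounded. A minimal sketch of the calculation, assuming ambiguous bases such as N are excluded from the total (the abstract does not say how they were handled); the function name `base_composition` and the short sequence are made up for illustration and are not W. attu data:

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among the unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


# Made-up fragment with one ambiguous base, which is ignored (assumption).
toy_seq = "AACCGTTAN"
comp = base_composition(toy_seq)

# G+C content follows directly from the same percentages.
gc_content = comp["G"] + comp["C"]
```

For a real mitogenome the input would be the full sequence string read from a FASTA or GenBank file.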
Complete mitochondrial genomes of Taenia multiceps, T. hydatigena and T. pisiformis: additional molecular markers for a tapeworm genus of human and animal health significance.

PubMed

Jia, Wan-Zhong; Yan, Hong-Bin; Guo, Ai-Jiang; Zhu, Xing-Quan; Wang, Yu-Chao; Shi, Wan-Gui; Chen, Hao-Tai; Zhan, Fang; Zhang, Shao-Hua; Fu, Bao-Quan; Littlewood, D Timothy J; Cai, Xue-Peng

2010-07-22

Mitochondrial genomes provide a rich source of molecular variation of proven and widespread utility in molecular ecology, population genetics and evolutionary biology. The tapeworm genus Taenia includes a diversity of tapeworm parasites of significant human and veterinary importance. Here we add complete sequences of the mt genomes of T. multiceps, T. hydatigena and T. pisiformis, to a data set of 4 published mtDNAs in the same genus. Seven complete mt genomes of Taenia species are used to compare and contrast variation within and between genomes in the genus, to estimate a phylogeny for the genus, and to develop novel molecular markers as part of an extended mitochondrial toolkit. The complete circular mtDNAs of T. multiceps, T. hydatigena and T. pisiformis were 13,693, 13,492 and 13,387 bp in size respectively, comprising the usual complement of flatworm genes. Start and stop codons of protein coding genes included those found commonly amongst other platyhelminth mt genomes, but the much rarer initiation codon GTT was inferred for the gene atp6 in T. pisiformis. Phylogenetic analysis of mtDNAs offered novel estimates of the interrelationships of Taenia. Sliding window analyses showed nad6, nad5, atp6, nad3 and nad2 are amongst the most variable of genes per unit length, with the highest peaks in nucleotide diversity found in nad5. New primer pairs capable of amplifying fragments of variable DNA in nad1, rrnS and nad5 genes were designed in silico and tested as possible alternatives to existing mitochondrial markers for Taenia. With the availability of complete mtDNAs of 7 Taenia species, we have shown that analysis of amino acids provides a robust estimate of phylogeny for the genus that differs markedly from morphological estimates or those using partial genes; with implications for understanding the evolutionary radiation of important Taenia. 
Full alignment of the nucleotides of Taenia mtDNAs and sliding window analysis suggests numerous alternative gene regions are likely to capture greater nucleotide variation than those currently pursued as molecular markers. New PCR primers developed from a comparative mitogenomic analysis of Taenia species, extend the use of mitochondrial markers for molecular ecology, population genetics and diagnostics.
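The sliding window analysis of nucleotide diversity mentioned in this abstract can be illustrated roughly as follows. This is a sketch rather than the authors' procedure: the window size, step, gap handling and toy alignment are all assumptions, and the function names `window_pi` and `sliding_pi` are hypothetical. Diversity in each window is computed as the proportion of differing sites pooled over all sequence pairs, skipping any column where either sequence of a pair has a gap.

```python
from itertools import combinations


def window_pi(aligned, start, size):
    """Proportion of differing sites, pooled over all sequence pairs, in one window."""
    diffs = 0
    compared = 0
    for a, b in combinations(aligned, 2):
        for x, y in zip(a[start:start + size], b[start:start + size]):
            if x == "-" or y == "-":
                continue  # skip alignment gaps (assumption)
            compared += 1
            diffs += x != y
    return diffs / compared if compared else 0.0


def sliding_pi(aligned, size, step):
    """Return [(window_start, diversity), ...] along an alignment."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return [
        (start, window_pi(aligned, start, size))
        for start in range(0, length - size + 1, step)
    ]


# Toy alignment of three made-up sequences, not Taenia data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
profile = sliding_pi(toy_alignment, size=4, step=4)
```

Windows with the highest values in such a profile are the candidate regions for new markers, which is the reasoning the abstract applies to genes such as nad5.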
Complete mitochondrial genomes of Taenia multiceps, T. hydatigena and T. pisiformis: additional molecular markers for a tapeworm genus of human and animal health significance

PubMed Central

2010-01-01

Background Mitochondrial genomes provide a rich source of molecular variation of proven and widespread utility in molecular ecology, population genetics and evolutionary biology. The tapeworm genus Taenia includes a diversity of tapeworm parasites of significant human and veterinary importance. Here we add complete sequences of the mt genomes of T. multiceps, T. hydatigena and T. pisiformis, to a data set of 4 published mtDNAs in the same genus. Seven complete mt genomes of Taenia species are used to compare and contrast variation within and between genomes in the genus, to estimate a phylogeny for the genus, and to develop novel molecular markers as part of an extended mitochondrial toolkit. Results The complete circular mtDNAs of T. multiceps, T. hydatigena and T. pisiformis were 13,693, 13,492 and 13,387 bp in size respectively, comprising the usual complement of flatworm genes. Start and stop codons of protein coding genes included those found commonly amongst other platyhelminth mt genomes, but the much rarer initiation codon GTT was inferred for the gene atp6 in T. pisiformis. Phylogenetic analysis of mtDNAs offered novel estimates of the interrelationships of Taenia. Sliding window analyses showed nad6, nad5, atp6, nad3 and nad2 are amongst the most variable of genes per unit length, with the highest peaks in nucleotide diversity found in nad5. New primer pairs capable of amplifying fragments of variable DNA in nad1, rrnS and nad5 genes were designed in silico and tested as possible alternatives to existing mitochondrial markers for Taenia. Conclusions With the availability of complete mtDNAs of 7 Taenia species, we have shown that analysis of amino acids provides a robust estimate of phylogeny for the genus that differs markedly from morphological estimates or those using partial genes; with implications for understanding the evolutionary radiation of important Taenia. 
Full alignment of the nucleotides of Taenia mtDNAs and sliding window analysis suggests numerous alternative gene regions are likely to capture greater nucleotide variation than those currently pursued as molecular markers. New PCR primers developed from a comparative mitogenomic analysis of Taenia species, extend the use of mitochondrial markers for molecular ecology, population genetics and diagnostics. PMID:20649981
GAMOLA2, a Comprehensive Software Package for the Annotation and Curation of Draft and Complete Microbial Genomes

PubMed Central

Altermann, Eric; Lu, Jingli; McCulloch, Alan

2017-01-01

Expert-curated annotation remains one of the critical steps in achieving a reliable, biologically relevant annotation. Here we announce the release of GAMOLA2, a user-friendly and comprehensive software package to process, annotate and curate draft and complete bacterial, archaeal, and viral genomes. GAMOLA2 represents a wrapping tool to combine gene model determination, functional Blast, COG, Pfam, and TIGRfam analyses with structural predictions including detection of tRNAs, rRNA genes, non-coding RNAs, signal protein cleavage sites, transmembrane helices, CRISPR repeats and vector sequence contaminations. GAMOLA2 has already been validated in a wide range of bacterial and archaeal genomes, and its modular concept allows easy addition of further functionality in future releases. A modified and adapted version of the Artemis Genome Viewer (Sanger Institute) has been developed to leverage the additional features and underlying information provided by the GAMOLA2 analysis, and is part of the software distribution. In addition to genome annotations, GAMOLA2 features, among others, supplemental modules that assist in the creation of custom Blast databases, annotation transfers between genome versions, and the preparation of Genbank files for submission via the NCBI Sequin tool. GAMOLA2 is intended to be run under a Linux environment, whereas the subsequent visualization and manual curation in Artemis is mobile and platform independent. The development of GAMOLA2 is ongoing and community driven. New functionality can easily be added upon user requests, ensuring that GAMOLA2 provides information relevant to microbiologists. The software is available free of charge for academic use. PMID:28386247
GAMOLA2, a Comprehensive Software Package for the Annotation and Curation of Draft and Complete Microbial Genomes.

PubMed

Altermann, Eric; Lu, Jingli; McCulloch, Alan

2017-01-01

Expert-curated annotation remains one of the critical steps in achieving a reliable, biologically relevant annotation. Here we announce the release of GAMOLA2, a user-friendly and comprehensive software package to process, annotate and curate draft and complete bacterial, archaeal, and viral genomes. GAMOLA2 represents a wrapping tool to combine gene model determination, functional Blast, COG, Pfam, and TIGRfam analyses with structural predictions including detection of tRNAs, rRNA genes, non-coding RNAs, signal protein cleavage sites, transmembrane helices, CRISPR repeats and vector sequence contaminations. GAMOLA2 has already been validated in a wide range of bacterial and archaeal genomes, and its modular concept allows easy addition of further functionality in future releases. A modified and adapted version of the Artemis Genome Viewer (Sanger Institute) has been developed to leverage the additional features and underlying information provided by the GAMOLA2 analysis, and is part of the software distribution. In addition to genome annotations, GAMOLA2 features, among others, supplemental modules that assist in the creation of custom Blast databases, annotation transfers between genome versions, and the preparation of Genbank files for submission via the NCBI Sequin tool. GAMOLA2 is intended to be run under a Linux environment, whereas the subsequent visualization and manual curation in Artemis is mobile and platform independent. The development of GAMOLA2 is ongoing and community driven. New functionality can easily be added upon user requests, ensuring that GAMOLA2 provides information relevant to microbiologists. The software is available free of charge for academic use.
Bifidobacterium animalis subsp. lactis ATCC 27673 Is a Genomically Unique Strain within Its Conserved Subspecies

PubMed Central

Loquasto, Joseph R.; Barrangou, Rodolphe; Dudley, Edward G.; Stahl, Buffy; Chen, Chun

2013-01-01

Many strains of Bifidobacterium animalis subsp. lactis are considered health-promoting probiotic microorganisms and are commonly formulated into fermented dairy foods. Analyses of previously sequenced genomes of B. animalis subsp. lactis have revealed little genetic diversity, suggesting that it is a monomorphic subspecies. However, during a multilocus sequence typing survey of Bifidobacterium, it was revealed that B. animalis subsp. lactis ATCC 27673 gave a profile distinct from that of the other strains of the subspecies. As part of an ongoing study designed to understand the genetic diversity of this subspecies, the genome of this strain was sequenced and compared to other sequenced genomes of B. animalis subsp. lactis and B. animalis subsp. animalis. The complete genome of ATCC 27673 was 1,963,012 bp, contained 1,616 genes and 4 rRNA operons, and had a G+C content of 61.55%. Comparative analyses revealed that the genome of ATCC 27673 contained six distinct genomic islands encoding 83 open reading frames not found in other strains of the same subspecies. In four islands, either phage or mobile genetic elements were identified. In island 6, a novel clustered regularly interspaced short palindromic repeat (CRISPR) locus which contained 81 unique spacers was identified. This type I-E CRISPR-cas system differs from the type I-C systems previously identified in this subspecies, representing the first identification of a different system in B. animalis subsp. lactis. This study revealed that ATCC 27673 is a strain of B. animalis subsp. lactis with novel genetic content and suggests that the lack of genetic variability observed is likely due to the repeated sequencing of a limited number of widely distributed commercial strains. PMID:23995933
Two key arginine residues in the coat protein of Bamboo mosaic virus differentially affect the accumulation of viral genomic and subgenomic RNAs.

PubMed

Hung, Chien-Jen; Hu, Chung-Chi; Lin, Na-Sheng; Lee, Ya-Chien; Meng, Menghsiao; Tsai, Ching-Hsiu; Hsu, Yau-Heiu

2014-02-01

The interactions between viral RNAs and coat proteins (CPs) are critical for the efficient completion of infection cycles of RNA viruses. However, the specificity of the interactions between CPs and genomic or subgenomic RNAs remains poorly understood. In this study, Bamboo mosaic virus (BaMV) was used to analyse such interactions. Using reversible formaldehyde cross-linking and mass spectrometry, two regions in CP, each containing a basic amino acid (R99 and R227, respectively), were identified to bind directly to the 5' untranslated region of BaMV genomic RNA. Analyses of the alanine mutations of R99 and R227 revealed that the secondary structures of CP were not affected significantly, whereas the accumulation of BaMV genomic, but not subgenomic, RNA was severely decreased at 24 h post-inoculation in the inoculated protoplasts. In the absence of CP, the accumulation levels of genomic and subgenomic RNAs were decreased to 1.1%-1.5% and 33%-40% of that of the wild-type (wt), respectively, in inoculated leaves at 5 days post-inoculation (dpi). In contrast, in the presence of mutant CPs, the genomic RNAs remained about 1% of that of wt, whereas the subgenomic RNAs accumulated to at least 87%, suggesting that CP might increase the accumulation of subgenomic RNAs. The mutations also restricted viral movement and virion formation in Nicotiana benthamiana leaves at 5 dpi. These results demonstrate that R99 and R227 of CP play crucial roles in the accumulation, movement and virion formation of BaMV RNAs, and indicate that genomic and subgenomic RNAs interact differently with BaMV CP. © 2013 BSPP AND JOHN WILEY & SONS LTD.
Comparative Chloroplast Genomes of Photosynthetic Orchids: Insights into Evolution of the Orchidaceae and Development of Molecular Markers for Phylogenetic Applications

PubMed Central

Niu, Zhi-Tao; Liu, Wei; Xue, Qing-Yun; Ding, Xiao-Yu

2014-01-01

The orchid family Orchidaceae is one of the largest angiosperm families, including many species of important economic value. While chloroplast genomes are very informative for systematics and species identification, there is very limited information available on chloroplast genomes in the Orchidaceae. Here, we report the complete chloroplast genomes of the medicinal plant Dendrobium officinale and the ornamental orchid Cypripedium macranthos, demonstrating their gene content and order and potential RNA editing sites. The chloroplast genomes of the above two species and five known photosynthetic orchids showed similarities in structure as well as gene order and content, but differences in the organization of the inverted repeat/small single-copy junction and ndh genes. The organization of the inverted repeat/small single-copy junctions in the chloroplast genomes of these orchids was classified into four types; we propose that inverted repeats flanking the small single-copy region underwent expansion or contraction among Orchidaceae. The AT-rich regions of the ycf1 gene in orchids could be linked to the recombination of inverted repeat/small single-copy junctions. Relative species in orchids displayed similar patterns of variation in ndh gene contents. Furthermore, fifteen highly divergent protein-coding genes were identified, which are useful for phylogenetic analyses in orchids. To test the efficiency of these genes serving as markers in phylogenetic analyses, coding regions of four genes (accD, ccsA, matK, and ycf1) were used as a case study to construct phylogenetic trees in the subfamily Epidendroideae. High support was obtained for placement of previously unlocated subtribes Collabiinae and Dendrobiinae in the subfamily Epidendroideae. Our findings expand understanding of the diversity of orchid chloroplast genomes and provide a reference for study of the molecular systematics of this family. PMID:24911363
Comparative chloroplast genomes of photosynthetic orchids: insights into evolution of the Orchidaceae and development of molecular markers for phylogenetic applications.

PubMed

Luo, Jing; Hou, Bei-Wei; Niu, Zhi-Tao; Liu, Wei; Xue, Qing-Yun; Ding, Xiao-Yu

2014-01-01

The orchid family Orchidaceae is one of the largest angiosperm families, including many species of important economic value. While chloroplast genomes are very informative for systematics and species identification, there is very limited information available on chloroplast genomes in the Orchidaceae. Here, we report the complete chloroplast genomes of the medicinal plant Dendrobium officinale and the ornamental orchid Cypripedium macranthos, demonstrating their gene content and order and potential RNA editing sites. The chloroplast genomes of the above two species and five known photosynthetic orchids showed similarities in structure as well as gene order and content, but differences in the organization of the inverted repeat/small single-copy junction and ndh genes. The organization of the inverted repeat/small single-copy junctions in the chloroplast genomes of these orchids was classified into four types; we propose that inverted repeats flanking the small single-copy region underwent expansion or contraction among Orchidaceae. The AT-rich regions of the ycf1 gene in orchids could be linked to the recombination of inverted repeat/small single-copy junctions. Relative species in orchids displayed similar patterns of variation in ndh gene contents. Furthermore, fifteen highly divergent protein-coding genes were identified, which are useful for phylogenetic analyses in orchids. To test the efficiency of these genes serving as markers in phylogenetic analyses, coding regions of four genes (accD, ccsA, matK, and ycf1) were used as a case study to construct phylogenetic trees in the subfamily Epidendroideae. High support was obtained for placement of previously unlocated subtribes Collabiinae and Dendrobiinae in the subfamily Epidendroideae. Our findings expand understanding of the diversity of orchid chloroplast genomes and provide a reference for study of the molecular systematics of this family.
CoVaCS: a consensus variant calling system.

PubMed

Chiara, Matteo; Gioiosa, Silvia; Chillemi, Giovanni; D'Antonio, Mattia; Flati, Tiziano; Picardi, Ernesto; Zambelli, Federico; Horner, David Stephen; Pesole, Graziano; Castrignanò, Tiziana

2018-02-05

The advent and ongoing development of next generation sequencing technologies (NGS) have led to a rapid increase in the rate of human genome re-sequencing data, paving the way for personalized genomics and precision medicine. The body of genome resequencing data is progressively increasing, underlining the need for accurate and time-effective bioinformatics systems for genotyping, a crucial prerequisite for identification of candidate causal mutations in diagnostic screens. Here we present CoVaCS, a fully automated, highly accurate system with a web-based graphical interface for genotyping and variant annotation. Extensive tests on a gold standard benchmark data-set (the NA12878 Illumina platinum genome) confirm that call-sets based on our consensus strategy are completely in line with those attained by similar command-line-based approaches, and far more accurate than call-sets from any individual tool. Importantly, our system exhibits better sensitivity and higher specificity than equivalent commercial software. CoVaCS offers optimized pipelines integrating state-of-the-art tools for variant calling and annotation for whole-genome sequencing (WGS), whole-exome sequencing (WES) and target-gene sequencing (TGS) data. The system is currently hosted at Cineca, and offers the speed of an HPC facility, a crucial consideration when large numbers of samples must be analysed. Importantly, all the analyses are performed automatically, allowing high reproducibility of the results. As such, we believe that CoVaCS can be a valuable tool for the analysis of human genome resequencing studies. CoVaCS is available at: https://bioinformatics.cineca.it/covacs.
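The abstract above refers to a "consensus strategy" without giving its rules. As a rough illustration only, one common form of consensus is to keep a variant when at least a minimum number of independent callers report it. The sketch below assumes that majority-style rule; the caller names, the `min_support` parameter, and the variant tuples are hypothetical and are not taken from CoVaCS.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def consensus_calls(call_sets: Dict[str, Set[Variant]], min_support: int = 2) -> Set[Variant]:
    """Return variants reported by at least `min_support` callers (assumed rule)."""
    support: Counter = Counter()
    for variants in call_sets.values():
        # Each caller contributes at most one vote per variant because it is a set.
        support.update(variants)
    return {variant for variant, votes in support.items() if votes >= min_support}


# Hypothetical call-sets from three made-up callers.
calls = {
    "caller_a": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
    "caller_b": {("chr1", 100, "A", "G"), ("chr2", 75, "G", "A")},
    "caller_c": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
}
consensus = consensus_calls(calls, min_support=2)
```

With these toy inputs, the variant seen by only one caller (`chr2:75`) is dropped, while the two variants reported by two or more callers are kept. Raising `min_support` trades sensitivity for specificity.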
Plastid Phylogenomic Analyses Resolve Tofieldiaceae as the Root of the Early Diverging Monocot Order Alismatales

PubMed Central

Luo, Yang; Ma, Peng-Fei; Li, Hong-Tao; Yang, Jun-Bo; Wang, Hong; Li, De-Zhu

2016-01-01

The predominantly aquatic order Alismatales, which includes approximately 4,500 species within Araceae, Tofieldiaceae, and the core alismatid families, is a key group in investigating the origin and early diversification of monocots. Despite their importance, phylogenetic ambiguity regarding the root of the Alismatales tree precludes answering questions about the early evolution of the order. Here, we sequenced the first complete plastid genomes from three key families in this order: Potamogeton perfoliatus (Potamogetonaceae), Sagittaria lichuanensis (Alismataceae), and Tofieldia thibetica (Tofieldiaceae). Each family possesses the typical quadripartite structure, with plastid genome sizes of 156,226, 179,007, and 155,512 bp, respectively. Among them, the plastid genome of S. lichuanensis is the largest in monocots and the second largest in angiosperms. Like other sequenced Alismatales plastid genomes, all three families generally encode the same 113 genes with similar structure and arrangement. However, we detected 2.4 and 6 kb inversions in the plastid genomes of Sagittaria and Potamogeton, respectively. Further, we assembled a 79 plastid protein-coding gene sequence data matrix of 22 taxa that included the three newly generated plastid genomes plus 19 previously reported ones, which together represent all primary lineages of monocots and outgroups. In plastid phylogenomic analyses using maximum likelihood and Bayesian inference, we show both strong support for Acorales as sister to the remaining monocots and monophyly of Alismatales. More importantly, Tofieldiaceae was resolved as the most basal lineage within Alismatales. These results provide new insights into the evolution of Alismatales as well as the early-diverging monocots as a whole. PMID:26957030
Unprecedented genomic diversity of AhR1 and AhR2 genes in Atlantic salmon (Salmo salar L.).

PubMed

Hansson, Maria C; Wittzell, Håkan; Persson, Kerstin; von Schantz, Torbjörn

2004-06-24

Aryl hydrocarbon receptor (AhR) genes encode proteins involved in mediating the toxic responses induced by several environmental pollutants. Here, we describe the identification of the first two AhR1 (alpha and beta) genes and two additional AhR2 (alpha and beta) genes in the tetraploid species Atlantic salmon (Salmo salar L.) from a cosmid library screening. Cosmid clones containing genomic salmon AhR sequences were isolated using a cDNA clone containing the coding region of the Atlantic salmon AhR2gamma as a probe. Screening revealed 14 positive clones, from which four were chosen for further analyses. One of the cosmids contained genomic AhR sequences that were highly similar to the rainbow trout (Oncorhynchus mykiss) AhR2alpha and beta genes. SMART RACE amplified two complete, highly similar but not identical AhR type 2 sequences from salmon cDNA, which from phylogenetic analyses were determined as the rainbow trout AhR2alpha and beta orthologs. The salmon AhR2alpha and beta encode proteins of 1071 and 1058 residues, respectively, and encompass characteristic AhR sequence elements like a basic-helix-loop-helix (bHLH) and two PER-ARNT-SIM (PAS) domains. Both genes are transcribed in liver, spleen and muscle tissues of adult salmon. A second cosmid contained partial sequences, which were identical to the previously characterized AhR2gamma gene. The last two cosmids contained partial genomic AhR sequences, which were more similar to other AhR type 1 fish genes than the four characterized salmon AhR2 genes. However, attempts to amplify the corresponding complete cDNA sequences of the inserts proved very difficult, suggesting that these genes are non-functional or very weakly transcribed in the examined tissues. Phylogenetic analyses of the conserved regions did, however, clearly indicate that these two AhRs belong to the AhR type 1 clade and have been assigned as the Atlantic salmon AhR1alpha and AhR1beta genes. 
Taken together, these findings demonstrate that multiple AhR genes are present in the Atlantic salmon genome, which is likely a consequence of previous genome duplications in the evolutionary past of salmonids. Plausible explanations for the high incidence of AhR genes in fish and more specifically in salmonids, like rapid divergences in specialized functions, are discussed.
The complete mitochondrial genome of the citrus red mite Panonychus citri (Acari: Tetranychidae): high genome rearrangement and extremely truncated tRNAs

PubMed Central

2010-01-01

Background: The family Tetranychidae (Chelicerata: Acari) includes ~1200 species, many of which are of agronomic importance. To date, mitochondrial genomes of only two Tetranychidae species have been sequenced, and it has been found that these two mitochondrial genomes are characterized by many unusual features in genome organization and structure such as gene order and nucleotide frequency. The scarcity of available sequence data has greatly impeded evolutionary studies in Acari (mites and ticks). Information on Tetranychidae mitochondrial genomes is quite important for phylogenetic evaluation and population genetics, as well as the molecular evolution of functional genes such as acaricide-resistance genes. In this study, we sequenced the complete mitochondrial genome of Panonychus citri (Family Tetranychidae), a worldwide citrus pest, and provide a comparison to other Acari. Results: The mitochondrial genome of P. citri is a typical circular molecule of 13,077 bp, and contains the complete set of 37 genes that are usually found in metazoans. This is the smallest mitochondrial genome within all sequenced Acari and other Chelicerata, primarily due to the significant size reduction of protein coding genes (PCGs), a large rRNA gene, and the A + T-rich region. The mitochondrial gene order for P. citri is the same as those for P. ulmi and Tetranychus urticae, but distinctly different from other Acari by a series of gene translocations and/or inversions. The majority of the P. citri mitochondrial genome has a high A + T content (85.28%), which is also reflected by AT-rich codons being used more frequently, but exhibits a positive GC-skew (0.03). The Acari mitochondrial nad1 exhibits a faster amino acid substitution rate than other genes, and the variation of nucleotide substitution patterns of PCGs is significantly correlated with the G + C content. Most tRNA genes of P. citri are extremely truncated and atypical (44-65, 54.1 ± 4.1 bp), lacking either the T- or D-arm, as found in P. ulmi, T. urticae, and other Acariform mites. Conclusions: The P. citri mitochondrial gene order is markedly different from those of other chelicerates, but is conserved within the family Tetranychidae, indicating that high rearrangements have occurred after Tetranychidae diverged from other Acari. Comparative analyses suggest that the genome size, gene order, gene content, codon usage, and base composition are strongly variable among Acari mitochondrial genomes. While extremely small and unusual tRNA genes seem to be common for Acariform mites, further experimental evidence is needed. PMID:20969792
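The A + T content and GC-skew values quoted in the abstract above are simple functions of base counts. As a hedged sketch, the code below uses the conventional definitions, A + T content = (A + T) / (A + C + G + T) and GC-skew = (G − C) / (G + C). The function names and the toy sequence are illustrative assumptions, and the abstract does not say which strand or tool the authors used.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["A"] + counts["T"]) / total


def gc_skew(seq: str) -> float:
    """GC-skew = (G - C) / (G + C); positive when G outnumbers C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        raise ValueError("sequence contains no G or C bases")
    return (g - c) / (g + c)


# Toy sequence (not real data): A=4, T=3, G=2, C=1.
toy = "ATATATGGCA"
toy_at = at_content(toy)   # 70.0
toy_skew = gc_skew(toy)    # (2 - 1) / (2 + 1)
```

A positive skew, as reported for P. citri, means that G is more frequent than C on the strand being measured.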
Molecular Evolution and Intraclade Recombination of Enterovirus D68 during the 2014 Outbreak in the United States

PubMed Central

Tan, Yi; Hassan, Ferdaus; Schuster, Jennifer E.; Simenauer, Ari; Selvarangan, Rangaraj; Halpin, Rebecca A.; Lin, Xudong; Fedorova, Nadia; Stockwell, Timothy B.; Lam, Tommy Tsan-Yuk; Chappell, James D.; Hartert, Tina V.; Holmes, Edward C.

2015-01-01

ABSTRACT In August 2014, an outbreak of enterovirus D68 (EV-D68) occurred in North America, causing severe respiratory disease in children. Due to a lack of complete genome sequence data, there is only a limited understanding of the molecular evolution and epidemiology of EV-D68 during this outbreak, and it is uncertain whether the differing clinical manifestations of EV-D68 infection are associated with specific viral lineages. We developed a high-throughput complete genome sequencing pipeline for EV-D68 that produced a total of 59 complete genomes from respiratory samples with a 95% success rate, including 57 genomes from Kansas City, MO, collected during the 2014 outbreak. With these data in hand, we performed phylogenetic analyses of complete genome and VP1 capsid protein sequences. Notably, we observed considerable genetic diversity among EV-D68 isolates in Kansas City, manifest as phylogenetically distinct lineages, indicative of multiple introductions of this virus into the city. In addition, we identified an intersubclade recombination event within EV-D68, the first recombinant in this virus reported to date. Finally, we found no significant association between EV-D68 genetic variation, either lineages or individual mutations, and a variety of demographic and clinical variables, suggesting that host factors likely play a major role in determining disease severity. Overall, our study revealed the complex pattern of viral evolution within a single geographic locality during a single outbreak, which has implications for the design of effective intervention and prevention strategies. IMPORTANCE Until recently, EV-D68 was considered to be an uncommon human pathogen, associated with mild respiratory illness. However, in 2014 EV-D68 was responsible for more than 1,000 disease cases in North America, including severe respiratory illness in children and acute flaccid myelitis, raising concerns about its potential impact on public health. 
Despite the emergence of EV-D68, a lack of full-length genome sequences means that little is known about the molecular evolution of this virus within a single geographic locality during a single outbreak. Here, we doubled the number of publicly available complete genome sequences of EV-D68 by performing high-throughput next-generation sequencing, characterized the evolutionary history of this outbreak in detail, identified a recombination event, and investigated whether there was any correlation between the demographic and clinical characteristics of the patients and the viral variant that infected them. Overall, these results will help inform the design of intervention strategies for EV-D68. PMID:26656685
Preliminary Classification of Novel Hemorrhagic Fever-Causing Viruses Using Sequence-Based PAirwise Sequence Comparison (PASC) Analysis.

PubMed

Bào, Yīmíng; Kuhn, Jens H

2018-01-01

During the last decade, genome sequence-based classification of viruses has become increasingly prominent. Viruses can even be classified based on coding-complete genome sequence data alone. Nevertheless, classification remains arduous as experts are required to establish phylogenetic trees to depict the evolutionary relationships of such sequences for preliminary taxonomic placement. Pairwise sequence comparison (PASC) of genomes is one of several novel methods for establishing relationships among viruses. This method, provided by the US National Center for Biotechnology Information as an open-access tool, circumvents phylogenetics, and yet PASC results are often in agreement with those of phylogenetic analyses. Computationally inexpensive, PASC can be easily performed by non-taxonomists. Here we describe how to use the PASC tool for the preliminary classification of novel viral hemorrhagic fever-causing viruses.
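The core quantity behind pairwise sequence comparison is the percent identity between every pair of genomes. The sketch below is not the NCBI PASC implementation. It assumes the sequences have already been aligned to equal length with `-` as the gap character, and it skips gapped columns, which is only one of several possible conventions. All names and sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # assumed convention: ignore gapped columns
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def pairwise_identities(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Percent identity for every unordered pair of named sequences."""
    return {
        (m, n): percent_identity(aligned[m], aligned[n])
        for m, n in combinations(sorted(aligned), 2)
    }


# Toy pre-aligned sequences (not real viral genomes).
aligned = {
    "virus_a": "ATGCATGC",
    "virus_b": "ATGCATGA",
    "virus_c": "ATG-TTGA",
}
identities = pairwise_identities(aligned)
```

In a classification setting, the distribution of such pairwise values across many genomes is what suggests identity thresholds separating taxonomic ranks. The thresholds themselves are not something this sketch can supply.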
Complete genome sequence of the chromate-reducing bacterium Thermoanaerobacter thermohydrosulfuricus strain BSB-33

DOE PAGES

Bhattacharya, Pamela; Barnebey, Adam; Zemla, Marcin; ...

2015-10-05

Thermoanaerobacter thermohydrosulfuricus BSB-33 is a thermophilic Gram-positive obligate anaerobe isolated from a hot spring in West Bengal, India. Unlike other T. thermohydrosulfuricus strains, BSB-33 is able to anaerobically reduce Fe(III) and Cr(VI) optimally at 60 °C. BSB-33 is the first Cr(VI) reducing T. thermohydrosulfuricus genome sequenced and of particular interest for bioremediation of environmental chromium contaminations. Here we discuss features of T. thermohydrosulfuricus BSB-33 and the unique genetic elements that may account for the peculiar metal reducing properties of this organism. The T. thermohydrosulfuricus BSB-33 genome comprises 2597606 bp encoding 2581 protein genes, 12 rRNA, 193 pseudogenes and has a G + C content of 34.20%. Lastly, putative chromate reductases were identified by comparative analyses with other Thermoanaerobacter and chromate-reducing bacteria.
Hemipteran Mitochondrial Genomes: Features, Structures and Implications for Phylogeny

PubMed Central

Wang, Yuan; Chen, Jing; Jiang, Li-Yun; Qiao, Ge-Xia

2015-01-01

The study of Hemipteran mitochondrial genomes (mitogenomes) began with the Chagas disease vector, Triatoma dimidiata, in 2001. At present, 90 complete Hemipteran mitogenomes have been sequenced and annotated. This review examines the history of Hemipteran mitogenome research and summarizes their main features, including genome organization, nucleotide composition, protein-coding genes, tRNAs and rRNAs, and non-coding regions. Special attention is given to the comparative analysis of repeat regions. Gene rearrangements are an additional data type for a few families, and most mitogenomes are arranged in the same order as that of the proposed ancestral insect. We also discuss and provide insights on the phylogenetic analyses of a variety of taxonomic levels. This review is expected to further expand our understanding of research in this field and serve as a valuable reference resource. PMID:26039239

Complete genome sequence of bluetongue virus serotype 4 that emerged on the French island of Corsica in December 2016.

PubMed

Sailleau, C; Breard, E; Viarouge, C; Gorlier, A; Quenault, H; Hirchaud, E; Touzain, F; Blanchard, Y; Vitour, D; Zientara, S

2018-02-01

In November 2016, sheep located in the south of the island of Corsica exhibited clinical signs suggestive of bluetongue virus (BTV) infection. Laboratory analyses allowed the isolation and identification of a BTV strain of serotype 4. The analysis of the full viral genome showed that all 10 genomic segments were closely related to those of the BTV-4 present in Hungary in 2014 and involved in a large BT outbreak in the Balkan Peninsula. These results, together with epidemiological data, suggest that BTV-4 has been introduced to Corsica from Italy (Sardinia), where BTV-4 outbreaks were reported in autumn 2016. This is the first report of the introduction in Corsica of a BTV strain previously spreading in eastern Europe. © 2017 Blackwell Verlag GmbH.
Analysis of resistance genes of clinical Pannonibacter phragmitetus strain 31801 by complete genome sequencing.

PubMed

Ming, De-Song; Chen, Qing-Qing; Chen, Xiao-Tin

2018-05-14

To clarify the resistance mechanisms of Pannonibacter phragmitetus 31801, isolated from the blood of a liver abscess patient, at the genomic level, we performed whole genomic sequencing using a PacBio RS II single-molecule real-time long-read sequencer. Bioinformatic analysis of the resulting sequence was then carried out to identify any possible resistance genes. Analyses included Basic Local Alignment Search Tool searches against the Antibiotic Resistance Genes Database, ResFinder analysis of the genome sequence, and Resistance Gene Identifier analysis within the Comprehensive Antibiotic Resistance Database. Prophages, clustered regularly interspaced short palindromic repeats (CRISPR), and other putative virulence factors were also identified using PHAST, CRISPRfinder, and the Virulence Factors Database, respectively. The circular chromosome and single plasmid of P. phragmitetus 31801 contained multiple antibiotic resistance genes, including those coding for three different types of β-lactamase [NPS β-lactamase (EC 3.5.2.6), β-lactamase class C, and a metal-dependent hydrolase of β-lactamase superfamily I]. In addition, genes coding for subunits of several multidrug-resistance efflux pumps were identified, including those targeting macrolides (adeJ, cmeB), tetracycline (acrB, adeAB), fluoroquinolones (acrF, ceoB), and aminoglycosides (acrD, amrB, ceoB, mexY, smeB). However, apart from the tripartite macrolide efflux pump macAB-tolC, the genome did not appear to contain the complete complement of subunit genes required for production of most of the major multidrug-resistance efflux pumps.
Complete genome sequence of Acidihalobacter prosperus strain F5, an extremely acidophilic, iron- and sulfur-oxidizing halophile with potential industrial applicability in saline water bioleaching of chalcopyrite.

PubMed

Khaleque, Himel N; Corbett, Melissa K; Ramsay, Joshua P; Kaksonen, Anna H; Boxall, Naomi J; Watkin, Elizabeth L J

2017-11-20

Successful process development for the bioleaching of mineral ores, particularly the refractory copper sulfide ore chalcopyrite, remains a challenge in regions where freshwater is scarce and source water contains high concentrations of chloride ion. In this study, a pure isolate of Acidihalobacter prosperus strain F5 was characterized for its ability to leach base metals from sulfide ores (pyrite, chalcopyrite and pentlandite) at increasing chloride ion concentrations. F5 successfully released base metals from ores including pyrite and pentlandite at up to 30 g L⁻¹ chloride ion and chalcopyrite up to 18 g L⁻¹ chloride ion. In order to understand the genetic mechanisms of tolerance to high acid, saline and heavy metal stress, the genome of F5 was sequenced and analysed. As well as being the first strain of Ac. prosperus to be isolated from Australia, it is also the first complete genome of the Ac. prosperus species to be sequenced. The F5 genome contains genes involved in the biosynthesis of compatible solutes and genes encoding monovalent cation/proton antiporters and heavy metal transporters, which could explain its abilities to tolerate high salinity, acidity and heavy metal stress. Genome analysis also confirmed the presence of genes involved in copper tolerance. The study demonstrates the potential biotechnological applicability of Ac. prosperus strain F5 for saline water bioleaching of mineral ores. Copyright © 2017 Elsevier B.V. All rights reserved.
Analysis of the complete genome of subgroup A' hepatitis B virus isolates from South Africa.

PubMed

Kramvis, Anna; Weitzmann, Louise; Owiredu, William K B A; Kew, Michael C

2002-04-01

A phylogenetic analysis is presented of six complete and seven pre-S1/S2/S gene sequences of hepatitis B virus (HBV) isolates from South Africa. Five of the full-length sequences and all of the pre-S2/S sequences have been previously reported. Four of the six complete genomes and three of the five incomplete sequences clustered with subgroup A', a unique segment of genotype A of HBV previously identified in 60% of South African isolates using analysis of the pre-S2/S region alone. This separation was also evident when the polymerase open reading frame was analysed, but not on analysis of either the X or pre-core/core genes. Amino acids were identified in the pre-S1 and polymerase regions specific to subgroup A'. In common with genotype D, 10 of 11 genotype A South African isolates had an 11 amino acid deletion in the amino end of the pre-S1 region. This deletion is also found in hepadnaviruses from non-human primates.
Complete genome sequence of Clavibacter michiganensis subsp. insidiosus R1-1 using PacBio single-molecule real-time technology

USDA-ARS's Scientific Manuscript database

We report the complete genome sequence of Clavibacter michiganensis subsp. insidiosus R1-1 isolated in Minnesota, USA. The R1-1 genome, generated by de novo assembly of PacBio sequencing data, is the first complete genome sequence available for this subspecies....
Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera

PubMed Central

Kluge, M.; Franco, A. C.; Giongo, A.; Valdez, F. P.; Saddi, T. M.; Brito, W. M. E. D.; Roehe, P. M.

2016-01-01

A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. PMID:26823583
Characterization of the complete chloroplast genome of the endangered species Carya sinensis (Juglandaceae)

Treesearch

Yiheng Hu; Xi Chen; Xiaojia Feng; Keith E. Woeste; Peng Zhao

2016-01-01

Carya sinensis (Chinese Hickory, beaked walnut, or beaked hickory) is an endangered species that needs urgent conservation action. Here, we reported the complete chloroplast (cp) genome sequence and the genomic features of the C. sinensis cp, which is the first complete cp genome of any member of Carya. The...
Complete genome of the cotton bacterial blight pathogen Xanthomonas citri pv. malvacearum strain MSCT

USDA-ARS's Scientific Manuscript database

Xanthomonas citri pv. malvacearum (Xcm) is a major pathogen of Gossypium hirsutum. In this study we report the complete genome of the Xcm strain MSCT assembled from long read DNA sequencing technology. The MSCT genome is the first Xcm genome that has complete coding regions for Xcm transcriptional a...
Complete mitochondrial genome of the Freshwater Catfish Rita rita (Siluriformes, Bagridae).

PubMed

Lashari, Punhal; Laghari, Muhammad Younis; Xu, Peng; Zhao, Zixia; Jiang, Li; Narejo, Naeem Tariq; Deng, Yulin; Sun, Xiaowen; Zhang, Yan

2015-01-01

The complete mitochondrial genome of the catfish Rita rita, a Red Listed species classified as Critically Endangered, was amplified by LA PCR (TaKaRa LA Taq, Dalian, China) and sequenced by Sanger's method. The complete mitogenome was 16,449 bp in length and contains 13 typical vertebrate protein-coding genes, 2 rRNA and 22 tRNA genes. The whole genome base composition was estimated to be 33.40% A, 27.43% C, 14.26% G and 24.89% T. The complete mitochondrial genome of Rita rita provides the basis for genetic breeding and conservation studies.
Evolution of gastropod mitochondrial genome arrangements

PubMed Central

2008-01-01

Background: Gastropod mitochondrial genomes exhibit an unusually great variety of gene orders compared to other metazoan mitochondrial genomes, e.g., those of vertebrates. Hence, gastropod mitochondrial genomes constitute a good model system to study patterns, rates, and mechanisms of mitochondrial genome rearrangement. However, this kind of evolutionary comparative analysis requires a robust phylogenetic framework of the group under study, which has been elusive so far for gastropods in spite of the efforts carried out during the last two decades. Here, we report the complete nucleotide sequence of five mitochondrial genomes of gastropods (Pyramidella dolabrata, Ascobulla fragilis, Siphonaria pectinata, Onchidella celtica, and Myosotella myosotis), and we analyze them together with another ten complete mitochondrial genomes of gastropods currently available in molecular databases in order to reconstruct the phylogenetic relationships among the main lineages of gastropods. Results: Comparative analyses with other mollusk mitochondrial genomes allowed us to describe molecular features and general trends in the evolution of mitochondrial genome organization in gastropods. Phylogenetic reconstruction with commonly used methods of phylogenetic inference (ME, MP, ML, BI) arrived at a single topology, which was used to reconstruct the evolution of mitochondrial gene rearrangements in the group. Conclusion: Four main lineages were identified within gastropods: Caenogastropoda, Vetigastropoda, Patellogastropoda, and Heterobranchia. Caenogastropoda and Vetigastropoda are sister taxa, as are Patellogastropoda and Heterobranchia. This result rejects the validity of the derived clade Apogastropoda (Caenogastropoda + Heterobranchia). The position of Patellogastropoda remains unclear, likely due to long-branch attraction biases. Within Heterobranchia, the most heterogeneous group of gastropods, neither Euthyneura (because of the inclusion of P. dolabrata) nor Pulmonata (polyphyletic) nor Opisthobranchia (because of the inclusion of S. pectinata) were recovered as monophyletic groups. The gene order of the Vetigastropoda might represent the ancestral mitochondrial gene order for Gastropoda and we propose that at least three major rearrangements have taken place in the evolution of gastropods: one in the ancestor of Caenogastropoda, another in the ancestor of Patellogastropoda, and one more in the ancestor of Heterobranchia. PMID:18302768
Chloroplast phylogenomic analyses resolve deep-level relationships of an intractable bamboo tribe Arundinarieae (Poaceae).

PubMed

Ma, Peng-Fei; Zhang, Yu-Xiao; Zeng, Chun-Xia; Guo, Zhen-Hua; Li, De-Zhu

2014-11-01

The temperate woody bamboos constitute a distinct tribe Arundinarieae (Poaceae: Bambusoideae) with high species diversity. Estimating phylogenetic relationships among the 11 major lineages of Arundinarieae has been particularly difficult, owing to a possible rapid radiation and the extremely low rate of sequence divergence. Here, we explore the use of chloroplast genome sequencing for phylogenetic inference. We sampled 25 species (22 temperate bamboos and 3 outgroups) for the complete genome representing eight major lineages of Arundinarieae in an attempt to resolve backbone relationships. Phylogenetic analyses of coding versus noncoding sequences, and of different regions of the genome (large single copy and small single copy, and inverted repeat regions) yielded no well-supported contradicting topologies but potential incongruence was found between the coding and noncoding sequences. The use of various data partitioning schemes in analysis of the complete sequences resulted in nearly identical topologies and node support values, although the partitioning schemes were decisively different from each other as to the fit to the data. Our full genomic data set substantially increased resolution along the backbone and provided strong support for most relationships despite the very short internodes and long branches in the tree. The inferred relationships were also robust to potential confounding factors (e.g., long-branch attraction) and received support from independent indels in the genome. We then added taxa from the three Arundinarieae lineages that were not included in the full-genome data set; each of these were sampled for more than 50% genome sequences. The resulting trees not only corroborated the reconstructed deep-level relationships but also largely resolved the phylogenetic placements of these three additional lineages. 
Furthermore, adding 129 additional taxa sampled for only eight chloroplast loci to the combined data set yielded almost identical relationships, albeit with low support values. We believe that the inferred phylogeny is robust to taxon sampling. Having resolved the deep-level relationships of Arundinarieae, we illuminate how chloroplast phylogenomics can be used for elucidating difficult phylogeny at low taxonomic levels in intractable plant groups. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Complete genome analysis of highly pathogenic bovine ephemeral fever virus isolated in Turkey in 2012.

PubMed

Abayli, Hasan; Tonbak, Sukru; Azkur, Ahmet Kursat; Bulut, Hakan

2017-10-01

Relatively high prevalence and mortality rates of bovine ephemeral fever (BEF) have been reported in recent epidemics in some countries, including Turkey, when compared with previous outbreaks. A limited number of complete genome sequences of BEF virus (BEFV) are available in the GenBank Database. In this study, the complete genome of highly pathogenic BEFV isolated during an outbreak in Turkey in 2012 was analyzed for genetic characterization. The complete genome of the Turkish BEFV isolate was amplified by reverse transcription-polymerase chain reaction (RT-PCR) and sequenced. It was found that the complete genome of the Turkish BEFV isolate was 14,901 nt in length. The complete genome sequence obtained from the study showed 91-92% identity at nucleotide level to Australian (BB7721) and Chinese (Bovine/China/Henan1/2012) BEFV isolates. Phylogenetic analysis of the glycoprotein gene of the Turkish BEFV isolate also showed that Turkish isolates were closely related to Israeli isolates. Because of the limited number of complete BEFV genome sequences, the results from this study will be useful for understanding the global molecular epidemiology and geodynamics of BEF.
The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update.

PubMed

Afgan, Enis; Baker, Dannon; Batut, Bérénice; van den Beek, Marius; Bouvier, Dave; Cech, Martin; Chilton, John; Clements, Dave; Coraor, Nate; Grüning, Björn A; Guerler, Aysam; Hillman-Jackson, Jennifer; Hiltemann, Saskia; Jalili, Vahid; Rasche, Helena; Soranzo, Nicola; Goecks, Jeremy; Taylor, James; Nekrutenko, Anton; Blankenberg, Daniel

2018-05-22

Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.
Floral gene resources from basal angiosperms for comparative genomics research

PubMed Central

Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H

2005-01-01

Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777
Mitochondrial genome analysis of the predatory mite Phytoseiulus persimilis and a revisit of the Metaseiulus occidentalis mitochondrial genome.

PubMed

Dermauw, Wannes; Vanholme, Bartel; Tirry, Luc; Van Leeuwen, Thomas

2010-04-01

In this study we sequenced and analysed the complete mitochondrial (mt) genome of the Chilean predatory mite Phytoseiulus persimilis Athias-Henriot (Chelicerata: Acari: Mesostigmata: Phytoseiidae: Amblyseiinae). The 16 199 bp genome (79.8% AT) contains the standard set of 13 protein-coding and 24 RNA genes. Compared with the ancestral arthropod mtDNA pattern, the gene order is extremely reshuffled (35 genes changed position) and represents a novel arrangement within the arthropods. This is probably related to the presence of several large noncoding regions in the genome. In contrast with the mt genome of the closely related species Metaseiulus occidentalis (Phytoseiidae: Typhlodrominae) - which was reported to be unusually large (24 961 bp), to lack nad6 and nad3 protein-coding genes, and to contain 22 tRNAs without T-arms - the genome of P. persimilis has all the features of a standard metazoan mt genome. Consequently, we performed additional experiments on the M. occidentalis mt genome. Our preliminary restriction digests and Southern hybridization data revealed that this genome is smaller than previously reported. In addition, we cloned nad3 in M. occidentalis and positioned this gene between nad4L and 12S-rRNA on the mt genome. Finally, we report that at least 15 of the 22 tRNAs in the M. occidentalis mt genome can be folded into canonical cloverleaf structures similar to their counterparts in P. persimilis.
Clarification of Taxonomic Status within the Pseudomonas syringae Species Group Based on a Phylogenomic Analysis

PubMed Central

Gomila, Margarita; Busquets, Antonio; Mulet, Magdalena; García-Valdés, Elena; Lalucat, Jorge

2017-01-01

The Pseudomonas syringae phylogenetic group comprises 15 recognized bacterial species and more than 60 pathovars. The classification and identification of strains is relevant for practical reasons but also for understanding the epidemiology and ecology of this group of plant pathogenic bacteria. Genome-based taxonomic analyses have been introduced recently to clarify the taxonomy of the whole genus. A set of 139 draft and complete genome sequences of strains belonging to all species of the P. syringae group available in public databases were analyzed, together with the genomes of closely related species used as outgroups. Comparative genomics based on the genome sequences of the species type strains in the group allowed the delineation of phylogenomic species and demonstrated that a high proportion of strains included in the study are misclassified. Furthermore, representatives of at least 7 putative novel species were detected. It was also confirmed that P. ficuserectae, P. meliae, and P. savastanoi are later synonyms of P. amygdali and that “P. coronafaciens” should be revived as a nomenspecies. PMID:29270162
Genome-wide analysis of the DNA-binding with one zinc finger (Dof) transcription factor family in bananas.

PubMed

Dong, Chen; Hu, Huigang; Xie, Jianghui

2016-12-01

DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma

PubMed Central

Wrzeszczynski, Kazimierz O.; Frank, Mayu O.; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A.; Moore Vogel, Julia L.; Bruce, Jeffrey N.; Lassman, Andrew B.; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V.; Zody, Michael C.; Jobanputra, Vaidehi; Royyuru, Ajay K.

2017-01-01

Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. Results: More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. Conclusions: The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. ClinicalTrials.gov identifier: NCT02725684. PMID:28740869
Genomic analysis of the blood attributed to Louis XVI (1754-1793), king of France.

PubMed

Olalde, Iñigo; Sánchez-Quinto, Federico; Datta, Debayan; Marigorta, Urko M; Chiang, Charleston W K; Rodríguez, Juan Antonio; Fernández-Callejo, Marcos; González, Irene; Montfort, Magda; Matas-Lalueza, Laura; Civit, Sergi; Luiselli, Donata; Charlier, Philippe; Pettener, Davide; Ramírez, Oscar; Navarro, Arcadi; Himmelbauer, Heinz; Marquès-Bonet, Tomàs; Lalueza-Fox, Carles

2014-04-24

A pyrographically decorated gourd, dated to the French Revolution period, has been alleged to contain a handkerchief dipped into the blood of the French king Louis XVI (1754-1793) after his beheading, but recent analyses of living males from two Bourbon branches cast doubt on its authenticity. We sequenced the complete genome of the DNA contained in the gourd at low coverage (~2.5×) with coding sequences enriched at a higher ~7.3× coverage. We found that the ancestry of the gourd's genome does not seem compatible with Louis XVI's known ancestry. From a functional perspective, we did not find an excess of alleles contributing to height, even though Louis XVI was described as the tallest person at Court. In addition, the eye colour prediction supported brown eyes, while Louis XVI had blue eyes. This is the first draft genome generated from a person who lived in a recent historical period; however, our results suggest that this sample may not correspond to the alleged king.
Phylogenetic analysis of 47 chloroplast genomes clarifies the contribution of wild species to the domesticated apple maternal line.

PubMed

Nikiforova, Svetlana V; Cavalieri, Duccio; Velasco, Riccardo; Goremykin, Vadim

2013-08-01

Both the origin of domesticated apple and the overall phylogeny of the genus Malus are still not completely resolved. Having this as a target, we built a 134,553-position-long alignment including two previously published chloroplast DNAs (cpDNAs) and 45 de novo sequenced, fully colinear chloroplast genomes from cultivated apple varieties and wild apple species. The data produced are free from compositional heterogeneity and from substitutional saturation, which can adversely affect phylogeny reconstruction. Phylogenetic analyses based on this alignment recovered a branch, having the maximum bootstrap support, subtending a large group of the cultivated apple sorts together with all analyzed European wild apple (Malus sylvestris) accessions. One apple cultivar was embedded in a monophylum comprising wild M. sieversii accessions and other Asian apple species. The data demonstrate that M. sylvestris has contributed chloroplast genome to a substantial fraction of domesticated apple varieties, supporting the conclusion that different wild species should have contributed the organelle and nuclear genomes to the domesticated apple.

Complete Genomic Sequence and Comparative Analysis of the Genome Segments of Sweet Potato Chlorotic Stunt Virus in China

PubMed Central

Qin, Yanhong; Wang, Li; Zhang, Zhenchen; Qiao, Qi; Zhang, Desheng; Tian, Yuting; Wang, Shuang; Wang, Yongjiang; Yan, Zhaoling

2014-01-01

Background Sweet potato chlorotic stunt virus (family Closteroviridae, genus Crinivirus) features a large bipartite, single-stranded, positive-sense RNA genome. To date, only three complete genomic sequences of SPCSV can be accessed through GenBank. SPCSV was first detected in China in 2011, but only partial genomic sequences have been determined in the country. No report on the complete genomic sequence and genome structure of Chinese SPCSV isolates or the genetic relation between isolates from China and other countries is available. Methodology/Principal Findings The complete genomic sequences of five isolates from different areas in China were characterized. This study is the first to report the complete genome sequences of SPCSV from whitefly vectors. Genome structure analysis showed that isolates of WA and EA strains from China have the same coding protein as isolates Can181-9 and m2-47, respectively. Twenty cp genes and four RNA1 partial segments were sequenced and analyzed, and the nucleotide identities of complete genomic, cp, and RNA1 partial sequences were determined. Results indicated high conservation among strains and significant differences between WA and EA strains. Genetic analysis demonstrated that, except for isolates from Guangdong Province, SPCSVs from other areas belong to the WA strain. Genome organization analysis showed that the isolates in this study lack the p22 gene. Conclusions/Significance We presented the complete genome sequences of SPCSV in China. Comparison of nucleotide identities and genome structures between these isolates and previously reported isolates showed slight differences. The nucleotide identities of different SPCSV isolates showed high conservation among strains and significant differences between strains. All nine isolates in this study lacked the p22 gene. WA strains were more extensively distributed than EA strains in China.
These data provide important insights into the molecular variation and genomic structure of SPCSV in China as well as genetic relationships among isolates from China and other countries. PMID:25170926
Complete genome sequence of an attenuated Sparfloxacin-resistant Streptococcus agalactiae strain 138spar

USDA-ARS's Scientific Manuscript database

The complete genome of a sparfloxacin-resistant Streptococcus agalactiae vaccine strain 138spar is 1,838,126 bp in size. The genome has 1892 coding sequences and 82 RNAs. The annotation of the genome is added by the NCBI Prokaryotic Genome Annotation Pipeline. The publishing of this genome will allo...
Complete Genome Sequence of Porcine Parvovirus 2 Recovered from Swine Sera.

PubMed

Campos, F S; Kluge, M; Franco, A C; Giongo, A; Valdez, F P; Saddi, T M; Brito, W M E D; Roehe, P M

2016-01-28

A complete genomic sequence of porcine parvovirus 2 (PPV-2) was detected by viral metagenome analysis on swine sera. A phylogenetic analysis of this genome reveals that it is highly similar to previously reported North American PPV-2 genomes. The complete PPV-2 sequence is 5,426 nucleotides long. Copyright © 2016 Campos et al.
Complete Genome Sequence of Clavibacter michiganensis subsp. insidiosus R1-1 Using PacBio Single-Molecule Real-Time Technology

PubMed Central

Lu, You; Samac, Deborah A.; Glazebrook, Jane

2015-01-01

We report here the complete genome sequence of Clavibacter michiganensis subsp. insidiosus R1-1, isolated in Minnesota, USA. The R1-1 genome, generated by a de novo assembly of PacBio sequencing data, is the first complete genome sequence available for this subspecies. PMID:25953184
Deep Sequencing Reveals the Complete Genome Sequence of Sweet potato virus G from East Timor

PubMed Central

Maina, Solomon; Edwards, Owain R.; Barbetti, Martin J.; de Almeida, Luis; Ximenes, Abel

2016-01-01

We present the first complete Sweet potato virus G (SPVG) genome from sweet potato in East Timor and compare it with seven complete SPVG genomes from South Korea (three), Taiwan (two), Argentina (one), and the United States (one). It most resembles the genomes from the United States and South Korea. PMID:27609925
Integrating Metagenomics and NanoSIMS to Investigate the Evolution and Ecophysiology of Magnetotactic Bacteria

NASA Astrophysics Data System (ADS)

Lin, W.; Zhang, W.; He, M.; Pan, Y.

2017-12-01

Magnetotactic bacteria (MTB) synthesize intracellular nano-sized magnetite (Fe3O4) and/or greigite (Fe3S4) crystals, called magnetosomes, which impart a permanent magnetic dipole moment to the cell causing it to align along the geomagnetic field lines as it swims. MTB play essential roles in global cycling of Fe, S, N and C, and represent an excellent model system not just for the investigation of the mechanisms of microbial engines that drive Earth's biogeochemical cycles but also for magnetotaxis and microbial biomineralization. Most of the previous studies on MTB were based on 16S rRNA gene-targeting analyses, which are powerful approaches to characterize the diversity, ecology and biogeography of MTB in nature. However, these approaches are somewhat limited in the physiological detail they can provide. In the present study, we have combined the genome-resolved metagenomics and nanoscale secondary ion mass spectrometry (NanoSIMS) analyses to study the genomic information, biomineralization mechanism and metabolic potential of environmental MTB. Two nearly complete genomes from uncultivated MTB belonging to the Nitrospirae phylum were reconstructed and their proposed metabolisms were further investigated and confirmed through NanoSIMS analyses. These results improve our understanding about the ecophysiology and evolution of MTB and their environmental function. The development of metagenomics-NanoSIMS integrated approach will provide a powerful tool for the research of geomicrobiology and environmental microbiology.
Enrichment of Root Endophytic Bacteria from Populus deltoides and Single-Cell-Genomics Analysis

DOE PAGES

Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.; ...

2016-07-15

Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.
Complete mitochondrial genomes of living and extinct pigeons revise the timing of the columbiform radiation.

PubMed

Soares, André E R; Novak, Ben J; Haile, James; Heupink, Tim H; Fjeldså, Jon; Gilbert, M Thomas P; Poinar, Hendrik; Church, George M; Shapiro, Beth

2016-10-26

Pigeons and doves (Columbiformes) are one of the oldest and most diverse extant lineages of birds. However, the nature and timing of the group's evolutionary radiation remains poorly resolved, despite recent advances in DNA sequencing and assembly and the growing database of pigeon mitochondrial genomes. One challenge has been to generate comparative data from the large number of extinct pigeon lineages, some of which are morphologically unique and therefore difficult to place in a phylogenetic context. We used ancient DNA and next generation sequencing approaches to assemble complete mitochondrial genomes for eleven pigeons, including the extinct Ryukyu wood pigeon (Columba jouyi), the thick-billed ground dove (Alopecoenas salamonis), the spotted green pigeon (Caloenas maculata), the Rodrigues solitaire (Pezophaps solitaria), and the dodo (Raphus cucullatus). We used a Bayesian approach to infer the evolutionary relationships among 24 species of living and extinct pigeons and doves. Our analyses indicate that the earliest radiation of the Columbidae crown group most likely occurred during the Oligocene, with continued divergence of major clades into the Miocene, suggesting that diversification within the Columbidae occurred more recently than has been reported previously.
Molecular history of plague.

PubMed

Drancourt, M; Raoult, D

2016-11-01

Plague, a deadly zoonosis caused by the bacterium Yersinia pestis, has been firmly documented in 39 historical burial sites in Eurasia that date from the Bronze Age to two historical pandemics spanning the 6th to 18th centuries. Palaeomicrobiologic data, including gene and spacer sequences, whole genome sequences and protein data, confirmed that two historical pandemics swept over Europe from probable Asian sources and possible two-way-ticket journeys back from Europe to Asia. These investigations made it possible to address questions regarding the potential sources and routes of transmission by completing the standard rodent and rodent-flea transmission scheme. This suggested that plague was transmissible by human ectoparasites such as lice, and that Y. pestis was able to persist for months in the soil, which is a source of reinfection for burrowing mammals. The analyses of seven complete genome sequences from the Bronze Age indicated that Y. pestis was probably not an ectoparasite-borne pathogen in these populations. Further analyses of 14 genomes indicated that the Justinian pandemic strains may have formed a clade distinct from the one responsible for the second pandemic, spanning in Y. pestis branch 1, which also comprises the third pandemic strains. Further palaeomicrobiologic studies must tightly connect with historical and anthropologic studies to resolve questions regarding the actual sources of plague in ancient populations, alternative routes of transmission and resistance traits. Answering these questions will broaden our understanding of plague epidemiology so we may better face the actuality of this deadly infection in countries where it remains epidemic. Copyright © 2016. Published by Elsevier Ltd.
Comprehensive Analysis of Transport Proteins Encoded Within the Genome of Bdellovibrio bacteriovorus

PubMed Central

Barabote, Ravi D.; Rendulic, Snjezana; Schuster, Stephan C.; Saier, Milton H.

2012-01-01

Bdellovibrio bacteriovorus is a bacterial parasite with an unusual lifestyle. It grows and reproduces in the periplasm of a host prey bacterium. The complete genome sequence of B. bacteriovorus has recently been reported. We have reanalyzed the transport proteins encoded within the B. bacteriovorus genome according to the current content of the transporter classification database (TCDB). A comprehensive analysis is given on the types and numbers of transport systems that B. bacteriovorus has. In this regard, the potential protein secretory capabilities of at least 4 types of inner membrane secretion systems and 5 types for outer membrane secretion are described. Surprisingly, B. bacteriovorus has a disproportionate percentage of cytoplasmic membrane channels and outer membrane porins. It has far more TonB/ExbBD-type systems and MotAB-type systems for energizing outer membrane transport and motility than does E. coli. Analysis of probable substrate specificities of its transporters provides clues to its metabolic preferences. Interesting examples of gene fusions and of potentially overlapping genes were also noted. Our analyses provide a comprehensive, detailed appreciation of the transport capabilities of B. bacteriovorus. They should serve as a guide for functional experimental analyses. PMID:17706914
Complete genome sequence of 285P, a novel T7-like polyvalent E. coli bacteriophage.

PubMed

Xu, Bin; Ma, Xiangyu; Xiong, Hongyan; Li, Yafei

2014-06-01

Bacteriophages are considered potential biological agents for the control of infectious diseases and environmental disinfection. Here, we describe a novel T7-like polyvalent Escherichia coli bacteriophage, designated "285P," which can lyse several strains of E. coli. The genome, which consists of 39,270 base pairs with a G+C content of 48.73 %, was sequenced and annotated. Forty-three potential open reading frames were identified using bioinformatics tools. Based on whole-genome sequence comparison, phage 285P was identified as a novel strain of subgroup T7. It showed strongest sequence similarity to Kluyvera phage Kvp1. The phylogenetic analyses of both non-structural proteins (endonuclease gp3, amidase gp3.5, DNA primase/helicase gp4, DNA polymerase gp5, and exonuclease gp6) and structural protein (tail fiber protein gp17) led to the identification of 285P as T7-like phage. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometric analyses verified the annotation of the structural proteins (major capsid protein gp10a, tail protein gp12, and tail fiber protein gp17).
Morphologic and Genomic Analyses of New Isolates Reveal a Second Lineage of Cedratviruses.

PubMed

Rodrigues, Rodrigo Araújo Lima; Andreani, Julien; Andrade, Ana Cláudia Dos Santos Pereira; Machado, Talita Bastos; Abdi, Souhila; Levasseur, Anthony; Abrahão, Jônatas Santos; La Scola, Bernard

2018-07-01

Giant viruses have been isolated and characterized in different environments, expanding our knowledge about the biology of these unique microorganisms. In the last 2 years, a new group was discovered, the cedratviruses, currently composed of only two isolates and members of a putative new family, "Pithoviridae," along with previously known pithoviruses. Here we report the isolation and biological and genomic characterization of two novel cedratviruses isolated from samples collected in France and Brazil. Both viruses were isolated using Acanthamoeba castellanii as a host cell and exhibit ovoid particles with corks at either extremity of the particle. Curiously, the Brazilian cedratvirus is ∼20% smaller and presents a shorter genome of 460,038 bp, coding for fewer proteins than other cedratviruses. In addition, it has a completely asyntenic genome and presents a lower amino acid identity of orthologous genes (∼73%). Pangenome analysis comprising the four cedratviruses revealed an increase in the pangenome concomitant with a decrease in the core genome with the addition of the two novel viruses. Finally, phylogenetic analyses clustered the Brazilian virus in a separate branch within the group of cedratviruses, while the French isolate is closer to the previously reported Cedratvirus lausannensis. Taken together, we propose the existence of a second lineage of this emerging viral genus and provide new insights into the biodiversity and ubiquity of these giant viruses. IMPORTANCE Various giant viruses have been described in recent years, revealing a unique part of the virosphere. A new group among the giant viruses has recently been described, the cedratviruses, which is currently composed of only two isolates. In this paper, we describe two novel cedratviruses isolated from French and Brazilian samples.
Biological and genomic analyses showed viruses with different particle sizes, genome lengths, and architecture, revealing the existence of a second lineage of this new group of giant viruses. Our results provide new insights into the biodiversity of cedratviruses and highlight the importance of ongoing efforts to prospect for and characterize new giant viruses. Copyright © 2018 American Society for Microbiology.
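The cedratvirus record above notes that adding genomes enlarged the pangenome while shrinking the core genome. A minimal Python sketch of why this happens follows, assuming genes have already been grouped into ortholog clusters so that each genome reduces to a set of cluster identifiers. The function `pan_and_core`, the virus labels, and the cluster names are all hypothetical illustrations, not data or methods from the study.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pangenome, core genome) for a collection of genomes.

    Assumption: each genome is a set of ortholog-cluster identifiers.
    The pangenome is the union of all sets; the core genome is their
    intersection.
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Hypothetical toy data: two genomes, then a third, more divergent one.
genomes = {
    "virus_A": {"g1", "g2", "g3"},
    "virus_B": {"g1", "g2", "g4"},
}
pan_before, core_before = pan_and_core(genomes)

genomes["virus_C"] = {"g1", "g5"}
pan_after, core_after = pan_and_core(genomes)
```

Because a union can only grow and an intersection can only shrink as sets are added, a divergent new isolate tends to push the two numbers in opposite directions, which is the pattern the abstract describes.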
Public health in the genomic era: will Public Health Genomics contribute to major changes in the prevention of common diseases?

PubMed Central

2011-01-01

The completion of the Human Genome Project triggered a whole new field of genomic research which is likely to lead to new opportunities for the promotion of population health. As a result, the distinction between genetic and environmental diseases has faded. Presently, genomics and knowledge deriving from systems biology, epigenomics, integrative genomics or genome-environmental interactions give a better insight into the pathophysiology of common diseases. However, this knowledge is barely used in the prevention and management of diseases. Together with the growth in the number of genetic association studies, this calls for appropriate public health actions. The field of Public Health Genomics analyses how genome-based knowledge and technologies can responsibly and effectively be integrated into health services and public policy for the benefit of population health. Environmental exposures interact with the genome to produce health information which may help explain inter-individual differences in health, or disease risk. Today, however, prospects for concrete applications remain distant. In addition, this information has not been translated into health practice yet. Therefore, evidence-based recommendations are few. The lack of population-based research hampers the evaluation of the impact of genomic applications. Public Health Genomics also evaluates the benefits and risks on a larger scale, including normative, legal, economic and social issues. These new developments are likely to affect all domains of public health and require rethinking the role of genomics in every condition of public health interest. This article aims to provide an introduction to the field of Public Health Genomics and the ideas behind it. PMID:22958637
Evolutionary Patterns and Processes: Lessons from Ancient DNA.

PubMed

Leonardi, Michela; Librado, Pablo; Der Sarkissian, Clio; Schubert, Mikkel; Alfarhan, Ahmed H; Alquraishi, Saleh A; Al-Rasheid, Khaled A S; Gamba, Cristina; Willerslev, Eske; Orlando, Ludovic

2017-01-01

Ever since its emergence in 1984, the field of ancient DNA has struggled to overcome the challenges related to the decay of DNA molecules in the fossil record. With the recent development of high-throughput DNA sequencing technologies and molecular techniques tailored to ultra-damaged templates, it has now come of age, merging together approaches in phylogenomics, population genomics, epigenomics, and metagenomics. Leveraging on complete temporal sample series, ancient DNA provides direct access to the most important dimension in evolution—time, allowing a wealth of fundamental evolutionary processes to be addressed at unprecedented resolution. This review taps into the most recent findings in ancient DNA research to present analyses of ancient genomic and metagenomic data.
Evolutionary Patterns and Processes: Lessons from Ancient DNA

PubMed Central

Leonardi, Michela; Librado, Pablo; Der Sarkissian, Clio; Schubert, Mikkel; Alfarhan, Ahmed H.; Alquraishi, Saleh A.; Al-Rasheid, Khaled A. S.; Gamba, Cristina; Willerslev, Eske

2017-01-01

Ever since its emergence in 1984, the field of ancient DNA has struggled to overcome the challenges related to the decay of DNA molecules in the fossil record. With the recent development of high-throughput DNA sequencing technologies and molecular techniques tailored to ultra-damaged templates, it has now come of age, merging together approaches in phylogenomics, population genomics, epigenomics, and metagenomics. Leveraging on complete temporal sample series, ancient DNA provides direct access to the most important dimension in evolution—time, allowing a wealth of fundamental evolutionary processes to be addressed at unprecedented resolution. This review taps into the most recent findings in ancient DNA research to present analyses of ancient genomic and metagenomic data. PMID:28173586
Does History Repeat Itself? Wavelets and the Phylodynamics of Influenza A

PubMed Central

Tom, Jennifer A.; Sinsheimer, Janet S.; Suchard, Marc A.

2012-01-01

Unprecedented global surveillance of viruses will result in massive sequence data sets that require new statistical methods. These data sets press the limits of Bayesian phylogenetics as the high-dimensional parameters that comprise a phylogenetic tree increase the already sizable computational burden of these techniques. This burden often results in partitioning the data set, for example, by gene, and inferring the evolutionary dynamics of each partition independently, a compromise that results in stratified analyses that depend only on data within a given partition. However, parameter estimates inferred from these stratified models are likely strongly correlated, considering they rely on data from a single data set. To overcome this shortfall, we exploit the existing Monte Carlo realizations from stratified Bayesian analyses to efficiently estimate a nonparametric hierarchical wavelet-based model and learn about the time-varying parameters of effective population size that reflect levels of genetic diversity across all partitions simultaneously. Our methods are applied to complete genome influenza A sequences that span 13 years. We find that broad peaks and trends, as opposed to seasonal spikes, in the effective population size history distinguish individual segments from the complete genome. We also address hypotheses regarding intersegment dynamics within a formal statistical framework that accounts for correlation between segment-specific parameters. PMID:22160768
Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data.

PubMed

Nishito, Yukari; Osana, Yasunori; Hachiya, Tsuyoshi; Popendorf, Kris; Toyoda, Atsushi; Fujiyama, Asao; Itaya, Mitsuhiro; Sakakibara, Yasubumi

2010-04-16

Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and functions as a starter for the production of the traditional Japanese food "natto" made from soybeans. Although re-sequencing whole genomes of several laboratory domesticated B. subtilis 168 derivatives has already been attempted using short read sequencing data, the assembly of the whole genome sequence of a closely related strain, B. subtilis natto, from very short read data is more challenging, particularly with our aim to assemble one fully connected scaffold from short reads around 35 bp in length. We applied a comparative genome assembly method, which combines de novo assembly and reference guided assembly, to one of the B. subtilis natto strains. We successfully assembled 28 scaffolds and managed to avoid substantial fragmentation. Completion of the assembly through long PCR experiments resulted in one connected scaffold for B. subtilis natto. Based on the assembled genome sequence, our orthologous gene analysis between natto BEST195 and Marburg 168 revealed that 82.4% of 4375 predicted genes in BEST195 are one-to-one orthologous to genes in 168, with two genes in-paralog, 3.2% are deleted in 168, 14.3% are inserted in BEST195, and 5.9% of genes present in 168 are deleted in BEST195. The natto genome contains the same alleles in the promoter region of degQ and the coding region of swrAA as the wild strain, RO-FF-1. These are specific for gamma-PGA production ability, which is related to natto production. Further, the B. subtilis natto strain completely lacked a polyketide synthesis operon, disrupted the plipastatin production operon, and possesses previously unidentified transposases. The determination of the whole genome sequence of Bacillus subtilis natto provided detailed analyses of a set of genes related to natto production, demonstrating the number and locations of insertion sequences that B. subtilis natto harbors but B. subtilis 168 lacks. 
Multiple genome-level comparisons among five closely related Bacillus species were also carried out. The determined genome sequence of B. subtilis natto and gene annotations are available from the Natto genome browser http://natto-genome.org/.
A 454 sequencing approach to dipteran mitochondrial genome research

USDA-ARS's Scientific Manuscript database

The availability of complete mitochondrial genome data for Diptera, one of the largest Metazoan orders, in public databases is limited. Herein, we generated the complete or nearly complete mitochondrial genomes for Cochliomyia hominivorax, Haematobia irritans, Phormia regina and Sarcophaga crassipa...
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

PubMed

Hitz, Benjamin C; Rowe, Laurence D; Podduturi, Nikhil R; Glick, David I; Baymuradov, Ulugbek K; Malladi, Venkat S; Chan, Esther T; Davidson, Jean M; Gabdank, Idan; Narayana, Aditi K; Onate, Kathrina C; Hilton, Jason; Ho, Marcus C; Lee, Brian T; Miyasato, Stuart R; Dreszer, Timothy R; Sloan, Cricket A; Strattan, J Seth; Tanaka, Forrest Y; Hong, Eurie L; Cherry, J Michael

2017-01-01

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package.
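The abstract mentions an API for querying the metadata but gives no details. As a minimal sketch only, the snippet below builds a search URL of the kind such a query might use. The portal address, the `/search/` path, and the `type`, `format`, `limit` and `assay_title` parameters are all assumptions not stated in the abstract, so they should be checked against the SnoVault/encodeD documentation before use. No request is sent.

```python
from urllib.parse import urlencode

# Assumption: public ENCODE portal address; it is not given in the abstract.
ENCODE_BASE = "https://www.encodeproject.org"


def build_search_url(object_type: str, limit: int = 5, **filters: str) -> str:
    """Build a metadata search URL (hypothetical helper, for illustration).

    object_type, limit and the keyword filters are passed as query-string
    parameters; "format=json" asks for a JSON response instead of HTML.
    """
    params = {"type": object_type, "format": "json", "limit": limit, **filters}
    return f"{ENCODE_BASE}/search/?{urlencode(params)}"


# Example: a query for a handful of experiments of one assay type.
# "assay_title" is an assumed field name.
url = build_search_url("Experiment", assay_title="ChIP-seq")
print(url)
```

The resulting URL could then be fetched with any HTTP client; keeping URL construction separate from the request makes the query easy to inspect first.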
SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata

PubMed Central

Podduturi, Nikhil R.; Glick, David I.; Baymuradov, Ulugbek K.; Malladi, Venkat S.; Chan, Esther T.; Davidson, Jean M.; Gabdank, Idan; Narayana, Aditi K.; Onate, Kathrina C.; Hilton, Jason; Ho, Marcus C.; Lee, Brian T.; Miyasato, Stuart R.; Dreszer, Timothy R.; Sloan, Cricket A.; Strattan, J. Seth; Tanaka, Forrest Y.; Hong, Eurie L.; Cherry, J. Michael

2017-01-01

The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package. PMID:28403240

Complete mitochondrial genome and evolutionary analysis of Turritopsis dohrnii, the "immortal" jellyfish with a reversible life-cycle.

PubMed

Lisenkova, A A; Grigorenko, A P; Tyazhelova, T V; Andreeva, T V; Gusev, F E; Manakhov, A D; Goltsov, A Yu; Piraino, S; Miglietta, M P; Rogaev, E I

2017-02-01

Turritopsis dohrnii (Cnidaria, Hydrozoa, Hydroidolina, Anthoathecata) is the only known metazoan that is capable of reversing its life cycle via morph rejuvenation from the adult medusa stage to the juvenile polyp stage. Here, we present a complete mitochondrial (mt) genome sequence of T. dohrnii, which harbors genes for 13 proteins, two transfer RNAs, and two ribosomal RNAs. The T. dohrnii mt genome is characterized by typical features of species in the Hydroidolina subclass, such as a high A+T content (71.5%), reversed transcriptional orientation for the large rRNA subunit gene, and paucity of CGN codons. An incomplete complementary duplicate of the cox1 gene was found at the 5' end of the T. dohrnii mt chromosome, as were variable repeat regions flanking the chromosome. We identified species-specific variations (nad5, nad6, cob, and cox1 genes) and putative selective constraints (atp8, nad1, nad2, and nad5 genes) in the mt genes of T. dohrnii, and predicted alterations in tertiary structures of respiratory chain proteins (NADH4, NADH5, and COX1 proteins) of T. dohrnii. Based on comparative analyses of available hydrozoan mt genomes, we also determined the taxonomic relationships of T. dohrnii, recovering Filifera IV as a paraphyletic taxon, and assessed intraspecific diversity of various Hydrozoa species. Copyright © 2016 Elsevier Inc. All rights reserved.
Analysis of five complete genome sequences for members of the class Peribacteria in the recently recognized Peregrinibacteria bacterial phylum

DOE PAGES

Anantharaman, Karthik; Brown, Christopher T.; Burstein, David; ...

2016-01-28

Five closely related populations of bacteria from the Candidate Phylum (CP) Peregrinibacteria, part of the bacterial Candidate Phyla Radiation (CPR), were sampled from filtered groundwater obtained from an aquifer adjacent to the Colorado River near the town of Rifle, CO, USA. Here, we present the first complete genome sequences for organisms from this phylum. These bacteria have small genomes and, unlike most organisms from other lineages in the CPR, have the capacity for nucleotide synthesis. They invest significantly in biosynthesis of cell wall and cell envelope components, including peptidoglycan, isoprenoids via the mevalonate pathway, and a variety of amino sugars including perosamine and rhamnose. The genomes encode an intriguing set of large extracellular proteins, some of which are very cysteine-rich and may function in attachment, possibly to other cells. Strain variation in these proteins is an important source of genotypic variety. Overall, the cell envelope features, combined with the lack of biosynthesis capacities for many required cofactors, fatty acids, and most amino acids point to a symbiotic lifestyle. Furthermore, phylogenetic analyses indicate that these bacteria likely represent a new class within the Peregrinibacteria phylum, although they ultimately may be recognized as members of a separate phylum. In conclusion, we propose the provisional taxonomic assignment as ‘Candidatus Peribacter riflensis’, Genus Peribacter, Family Peribacteraceae, Order Peribacterales, Class Peribacteria in the phylum Peregrinibacteria.
Functional Genomics Analysis of Singapore Grouper Iridovirus: Complete Sequence Determination and Proteomic Analysis

PubMed Central

Song, Wen Jun; Qin, Qi Wei; Qiu, Jin; Huang, Can Hua; Wang, Fan; Hew, Choy Leong

2004-01-01

Here we report the complete genome sequence of Singapore grouper iridovirus (SGIV). Sequencing of the random shotgun and restriction endonuclease genomic libraries showed that the entire SGIV genome consists of 140,131 nucleotide bp. One hundred sixty-two open reading frames (ORFs) from the sense and antisense DNA strands, coding for lengths varying from 41 to 1,268 amino acids, were identified. Computer-assisted analyses of the deduced amino acid sequences revealed that 77 of the ORFs exhibited homologies to known virus genes, 23 of which matched functional iridovirus proteins. Forty-two putative conserved domains or signatures were detected in the National Center for Biotechnology Information CD-Search database and PROSITE database. An assortment of enzyme activities involved in DNA replication, transcription, nucleotide metabolism, cell signaling, etc., were identified. Viruses were cultured on a cell line derived from the embryonated egg of the grouper Epinephelus tauvina, isolated, and purified by sucrose gradient ultracentrifugation. The protein extract from the purified virions was analyzed by polyacrylamide gel electrophoresis followed by in-gel digestion of protein bands. Matrix-assisted laser desorption ionization-time of flight mass spectrometry and database searching led to identification of 26 proteins. Twenty of these represented novel or previously unidentified genes, which were further confirmed by reverse transcription-PCR (RT-PCR) and DNA sequencing of their respective RT-PCR products. PMID:15507645
Complete Genome Sequence of Clavibacter michiganensis subsp. insidiosus R1-1 Using PacBio Single-Molecule Real-Time Technology.

PubMed

Lu, You; Samac, Deborah A; Glazebrook, Jane; Ishimaru, Carol A

2015-05-07

We report here the complete genome sequence of Clavibacter michiganensis subsp. insidiosus R1-1, isolated in Minnesota, USA. The R1-1 genome, generated by a de novo assembly of PacBio sequencing data, is the first complete genome sequence available for this subspecies. Copyright © 2015 Lu et al.
Complete Genome Sequences of the Carlavirus Sweet potato chlorotic fleck virus from East Timor and Australia

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2016-01-01

We present here the first complete genome sequences of Sweet potato chlorotic fleck virus (SPCFV) from sweet potato in Australia and East Timor, and we compare these with four complete SPCFV genomes from South Korea and one from Uganda. The Australian, East Timorese, South Korean, and Ugandan genomes differed considerably from each other. PMID:27231359
Genome-Centric Analysis of a Thermophilic and Cellulolytic Bacterial Consortium Derived from Composting

PubMed Central

Lemos, Leandro N.; Pereira, Roberta V.; Quaggio, Ronaldo B.; Martins, Layla F.; Moura, Livia M. S.; da Silva, Amanda R.; Antunes, Luciana P.; da Silva, Aline M.; Setubal, João C.

2017-01-01

Microbial consortia selected from complex lignocellulolytic microbial communities are promising alternatives to deconstruct plant waste, since synergistic action of different enzymes is required for full degradation of plant biomass in biorefining applications. Culture enrichment also facilitates the study of interactions among consortium members, and can be a good source of novel microbial species. Here, we used a sample from a plant waste composting operation in the São Paulo Zoo (Brazil) as inoculum to obtain a thermophilic aerobic consortium enriched through multiple passages at 60°C in carboxymethylcellulose as sole carbon source. The microbial community composition of this consortium was investigated by shotgun metagenomics and genome-centric analysis. Six near-complete (over 90%) genomes were reconstructed. Similarity and phylogenetic analyses show that four of these six genomes are novel, with the following hypothesized identifications: a new Thermobacillus species; the first Bacillus thermozeamaize genome (for which currently only 16S sequences are available) or else the first representative of a new family in the Bacillales order; the first representative of a new genus in the Paenibacillaceae family; and the first representative of a new deep-branching family in the Clostridia class. The reconstructed genomes from known species were identified as Geobacillus thermoglucosidasius and Caldibacillus debilis. The metabolic potential of these recovered genomes, based on COG and CAZy analyses, shows that these genomes encode several glycoside hydrolases (GHs) as well as other genes related to lignocellulose breakdown. The new Thermobacillus species stands out for being the richest in diversity and abundance of GHs, possessing the greatest potential for biomass degradation among the six recovered genomes. 
We also investigated the presence and activity of the organisms corresponding to these genomes in the composting operation from which the consortium was built, using compost metagenome and metatranscriptome datasets generated in a previous study. We obtained strong evidence that five of the six recovered genomes are indeed present and active in that composting process. We have thus discovered three (perhaps four) new thermophilic bacterial species that add to the increasing repertoire of known lignocellulose degraders, whose biotechnological potential can now be investigated in further studies. PMID:28469608
Genomic and phylogenetic analyses of an adenovirus isolated from a corn snake (Elaphe guttata) imply a common origin with members of the proposed new genus Atadenovirus.

PubMed

Farkas, Szilvia L; Benko, Mária; Elo, Péter; Ursu, Krisztina; Dán, Adám; Ahne, Winfried; Harrach, Balázs

2002-10-01

Approximately 60% of the genome of an adenovirus isolated from a corn snake (Elaphe guttata) was cloned and sequenced. The results of homology searches showed that the genes of the corn snake adenovirus (SnAdV-1) were closest to their counterparts in members of the recently proposed new genus Atadenovirus. In phylogenetic analyses of the complete hexon and protease genes, SnAdV-1 indeed clustered together with the atadenoviruses. The characteristic features in the genome organization of SnAdV-1 included the presence of a gene homologous to that for protein p32K, the lack of structural proteins V and IX and the absence of homologues of the E1A and E3 regions. These characteristics are in accordance with the genus-defining markers of atadenoviruses. Comparison of the cleavage sites of the viral protease in core protein pVII also confirmed SnAdV-1 as a candidate member of the genus Atadenovirus. Thus, the hypothesis on the possible reptilian origin of atadenoviruses (Harrach, Acta Veterinaria Hungarica 48, 484-490, 2000) seems to be supported. However, the base composition of DNA sequence (>18 kb) determined from the SnAdV-1 genome showed an equilibrated GC content of 51%, which is unusual for an atadenovirus.
Dominant ectosymbiotic bacteria of cellulolytic protists in the termite gut also have the potential to digest lignocellulose.

PubMed

Yuki, Masahiro; Kuwahara, Hirokazu; Shintani, Masaki; Izawa, Kazuki; Sato, Tomoyuki; Starns, David; Hongoh, Yuichi; Ohkuma, Moriya

2015-12-01

Wood-feeding lower termites harbour symbiotic gut protists that support the termite nutritionally by degrading recalcitrant lignocellulose. These protists themselves host specific endo- and ectosymbiotic bacteria, the functions of which remain largely unknown. Here, we present draft genomes of a dominant, uncultured ectosymbiont belonging to the order Bacteroidales, 'Candidatus Symbiothrix dinenymphae', which colonizes the cell surface of the cellulolytic gut protists Dinenympha spp. We analysed four single-cell genomes of Ca. S. dinenymphae; the highest genome completeness was estimated to be 81.6-82.3% with a predicted genome size of 4.28-4.31 Mb. The genome retains genes encoding large parts of the amino acid, cofactor and nucleotide biosynthetic pathways. In addition, the genome contains genes encoding various glycoside hydrolases such as endoglucanases and hemicellulases. The genome indicates that Ca. S. dinenymphae ferments lignocellulose-derived monosaccharides to acetate, a major carbon and energy source of the host termite. We suggest that the ectosymbiont digests lignocellulose and provides nutrients to the host termites, and hypothesize that the hydrolytic activity might also function as a pretreatment for the host protist to effectively decompose the crystalline cellulose components. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.
Advances in Maize Genomics and Their Value for Enhancing Genetic Gains from Breeding

PubMed Central

Xu, Yunbi; Skinner, Debra J.; Wu, Huixia; Palacios-Rojas, Natalia; Araus, Jose Luis; Yan, Jianbing; Gao, Shibin; Warburton, Marilyn L.; Crouch, Jonathan H.

2009-01-01

Maize is an important crop for food, feed, forage, and fuel across tropical and temperate areas of the world. Diversity studies at genetic, molecular, and functional levels have revealed that, tropical maize germplasm, landraces, and wild relatives harbor a significantly wider range of genetic variation. Among all types of markers, SNP markers are increasingly the marker-of-choice for all genomics applications in maize breeding. Genetic mapping has been developed through conventional linkage mapping and more recently through linkage disequilibrium-based association analyses. Maize genome sequencing, initially focused on gene-rich regions, now aims for the availability of complete genome sequence. Conventional insertion mutation-based cloning has been complemented recently by EST- and map-based cloning. Transgenics and nutritional genomics are rapidly advancing fields targeting important agronomic traits including pest resistance and grain quality. Substantial advances have been made in methodologies for genomics-assisted breeding, enhancing progress in yield as well as abiotic and biotic stress resistances. Various genomic databases and informatics tools have been developed, among which MaizeGDB is the most developed and widely used by the maize research community. In the future, more emphasis should be given to the development of tools and strategic germplasm resources for more effective molecular breeding of tropical maize products. PMID:19688107
The mitochondrial genome of Frankliniella intonsa: insights into the evolution of mitochondrial genomes at lower taxonomic levels in Thysanoptera.

PubMed

Yan, Dankan; Tang, Yunxia; Hu, Min; Liu, Fengquan; Zhang, Dongfang; Fan, Jiaqin

2014-10-01

Thrips are an ideal group for studying the evolution of mitochondrial (mt) genomes at the genus and family levels because of independent rearrangements within this order. The complete mitochondrial DNA (mtDNA) sequence of the flower thrips Frankliniella intonsa was determined and annotated in this study. The circular genome is 15,215 bp in length with an A+T content of 75.9%, contains the typical 37 genes, and has three putative control regions (CRs). Nucleotide composition is A+T biased, and the majority of the protein-coding genes present opposite CG skew, which is reflected in the nucleotide composition and in codon and amino acid usage. Although the known thrips have massive gene rearrangements, it showed no reversal of strand asymmetry. Gene rearrangements have been found in the lower taxonomic levels of thrips. Three tRNA genes were translocated in the genus Frankliniella and eight tRNA genes in the family Thripidae. Although the gene arrangements of mt genomes of all three thrips species differ massively from the ancestral insect, they are all very similar to each other, indicating that there was a large rearrangement somewhere before the most recent common ancestor of these three species and very little genomic evolution or rearrangement after that. The extremely similar sequences among the CRs suggest that they are undergoing concerted evolution. Analyses of the upstream and downstream sequences of the CRs reveal that CR2 is actually the ancestral CR. The three CRs are in the same position in each of the three thrips mt genomes, which have identical inverted genes. These characteristics might have been inherited from the most recent common ancestor of these three thrips. The above observations suggest that the mt genomes of the three thrips retain a single massive rearrangement from the common ancestor and have low evolutionary rates among them. Copyright © 2014 Elsevier Inc. All rights reserved.
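The A+T content and strand skew quoted in this record are simple base-count statistics. The sketch below shows how they are commonly computed. The function name and the toy sequence are made up for illustration. The skew is taken here as (G - C) / (G + C), which is one common sign convention; the abstract does not say which convention the authors used.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return A+T fraction and GC skew for a DNA string.

    Only A, C, G and T are counted; ambiguous bases such as N are ignored.
    GC skew is (G - C) / (G + C), an assumed sign convention.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc = g + c
    return {
        "at_content": (a + t) / total,
        "gc_skew": (g - c) / gc if gc else 0.0,
    }


# Toy sequence, invented for illustration (not from the F. intonsa genome).
stats = base_composition("AATTGGGC")
print(f"A+T = {stats['at_content']:.1%}, GC skew = {stats['gc_skew']:+.2f}")
```

Applied per gene or per strand, the same function would expose the kind of opposite skew among protein-coding genes that the abstract describes.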
DOE Office of Scientific and Technical Information (OSTI.GOV)

Castelle, Cindy; Wrighton, Kelly C.; Thomas, Brian C.

Domain Archaea is currently represented by one phylum (Euryarchaeota) and two superphyla (TACK and DPANN). However, gene surveys indicate the existence of a vast diversity of uncultivated archaea for which metabolic information is lacking. We sequenced DNA from complex sediment- and groundwater-associated microbial communities sampled prior to and during an acetate biostimulation field experiment to investigate the diversity and physiology of uncultivated subsurface archaea. We sampled 15 genomes that improve resolution of a new phylum within the TACK superphylum and 119 DPANN genomes that highlight a major subdivision within the archaeal domain that separates DPANN from TACK/Euryarchaeota lineages. Within the DPANN superphylum, which lacks any isolated representatives, we defined two new phyla using sequences from 100 newly sampled genomes. The first new phylum, for which we propose the name Woesearchaeota, was defined using 54 new sequences. We reconstructed a complete (finished) genome for an archaeon from this phylum that is only 0.8 Mb in length and lacks almost all core biosynthetic pathways, but has genes encoding enzymes predicted to interact with bacterial cell walls, consistent with a symbiotic lifestyle. The second new phylum, for which we propose the name Pacearchaeota, was defined based on 46 newly sampled archaeal genomes. This phylum includes the first non-methanogen with an intermediate Type II/III RuBisCO. We also reconstructed a complete (1.24 Mb) genome for another DPANN archaeon, a member of the Diapherotrites phylum. Metabolic prediction and transcriptomic data indicate that this organism has a fermentation-based lifestyle. In fact, genomic analyses consistently indicate lack of recognizable pathways for sulfur, nitrogen, methane, oxygen, and metal cycling, and suggest that symbiotic and fermentation-based lifestyles are widespread across the DPANN superphylum. 
Thus, as for a recently identified superphylum of bacteria with small genomes and no cultivated representatives, the biogeochemical impacts of this major radiation of archaea are primarily through anaerobic carbon and hydrogen cycling.
Common position of indels that cause deviations from canonical genome organization in different measles virus strains.

PubMed

Ivancic-Jelecki, Jelena; Slovic, Anamarija; Šantak, Maja; Tešović, Goran; Forcic, Dubravko

2016-07-29

The canonical genome organization of measles virus (MV) is characterized by total size of 15 894 nucleotides (nts) and defined length of every genomic region, both coding and non-coding. Only rarely have reports of strains possessing non-canonical genomic properties (possessing indels, with or without the change of total genome length) been published. The observed mutations are mutually compensatory in a sense that the total genome length remains polyhexameric. Although programmed and highly precise pseudo-templated nucleotide additions during transcription are inherent to polymerases of all viruses belonging to family Paramyxoviridae, a similar mechanism that would serve to non-randomly correct genome length, if an indel has occurred during replication, has so far not been described in the context of a complete virus genome. We compiled all complete MV genomic sequences (64 in total) available in open access sequence databases. Multiple sequence comparisons and phylogenetic analyses were performed with the aim of exploring whether non-recombinant and non-evolutionary linked measles strains that show deviations from canonical genome organization possess a common genetic characteristic. In 11 MV sequences we detected deviations from canonical genome organization due to short indels located within homopolymeric stretches or next to them. In nine out of 11 identified non-canonical MV sequences, a common feature was observed: one mutation, either an insertion or a deletion, was located in a 28 nts long region in F gene 5' untranslated region (positions 5051-5078 in genomic cDNA of canonical strains). This segment is composed of five tandemly linked homopolymeric stretches, its consensus sequence is G6-7C7-8A6-7G1-3C5-6. Although none of the mononucleotide repeats within this segment has fixed length, the total number of nts in canonical strains is always 28. 
These nine non-canonical strains, as well as the tenth (not mutated in the 5051-5078 segment), can be grouped in three clusters, based on their passage histories/epidemiological data/genetic similarities. There are no indications that the three clusters are evolutionarily linked, other than the fact that they all belong to clade D. A common narrow genomic region was found to be mutated in different, non-related, wild-type strains, suggesting that this region might have a function in non-random genome length corrections occurring during MV replication.
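Two numerical constraints in this abstract can be checked mechanically: the total genome length must remain a multiple of six ("polyhexameric"), and the F gene 5' UTR segment follows the consensus G6-7C7-8A6-7G1-3C5-6 with a fixed total of 28 nt in canonical strains. The sketch below encodes both checks. The function names and the example segment are illustrative assumptions, not the authors' code or data.

```python
import re

CANONICAL_MV_LENGTH = 15894  # nucleotides, as stated in the abstract

# Consensus from the abstract: G(6-7) C(7-8) A(6-7) G(1-3) C(5-6).
SEGMENT_PATTERN = re.compile(r"G{6,7}C{7,8}A{6,7}G{1,3}C{5,6}")


def is_polyhexameric(length: int) -> bool:
    """True if the genome length is a whole multiple of six nucleotides."""
    return length % 6 == 0


def matches_canonical_segment(segment: str) -> bool:
    """True if the segment fits the consensus and is exactly 28 nt long."""
    return len(segment) == 28 and SEGMENT_PATTERN.fullmatch(segment) is not None


# The canonical length is a multiple of six; a lone 1-nt insertion is not,
# while a compensating +1/-1 pair of indels restores the original length.
print(is_polyhexameric(CANONICAL_MV_LENGTH))          # True
print(is_polyhexameric(CANONICAL_MV_LENGTH + 1))      # False
print(is_polyhexameric(CANONICAL_MV_LENGTH + 1 - 1))  # True

# Hypothetical segment built from the consensus: 6 + 8 + 6 + 3 + 5 = 28 nt.
example = "G" * 6 + "C" * 8 + "A" * 6 + "G" * 3 + "C" * 5
print(matches_canonical_segment(example))             # True
```

A one-nucleotide insertion in any of the five homopolymeric stretches makes the segment 29 nt long and fails the second check, which is the kind of deviation the abstract reports in nine of the eleven non-canonical sequences.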
Tunicate mitogenomics and phylogenetics: peculiarities of the Herdmania momus mitochondrial genome and support for the new chordate phylogeny

PubMed Central

2009-01-01

Background: Tunicates represent a key metazoan group as the sister-group of vertebrates within chordates. The six complete mitochondrial genomes available so far for tunicates have revealed distinctive features. Extensive gene rearrangements and particularly high evolutionary rates have been evidenced with regard to other chordates. This peculiar evolutionary dynamics has hampered the reconstruction of tunicate phylogenetic relationships within chordates based on mitogenomic data. Results: In order to further understand the atypical evolutionary dynamics of the mitochondrial genome of tunicates, we determined the complete sequence of the solitary ascidian Herdmania momus. This genome from a stolidobranch ascidian presents the typical tunicate gene content with 13 protein-coding genes, 2 rRNAs and 24 tRNAs which are all encoded on the same strand. However, it also presents a novel gene arrangement, highlighting the extreme plasticity of gene order observed in tunicate mitochondrial genomes. Probabilistic phylogenetic inferences were conducted on the concatenation of the 13 mitochondrial protein-coding genes from representatives of major metazoan phyla. We show that whereas standard homogeneous amino acid models support an artefactual sister position of tunicates relative to all other bilaterians, the CAT and CAT+BP site- and time-heterogeneous mixture models place tunicates as the sister-group of vertebrates within monophyletic chordates. Moreover, the reference phylogeny indicates that tunicate mitochondrial genomes have experienced a drastic acceleration in their evolutionary rate that equally affects protein-coding and ribosomal-RNA genes. Conclusion: This is the first mitogenomic study supporting the new chordate phylogeny revealed by recent phylogenomic analyses. It illustrates the beneficial effects of an increased taxon sampling coupled with the use of more realistic amino acid substitution models for the reconstruction of animal phylogeny. PMID:19922605
Tunicate mitogenomics and phylogenetics: peculiarities of the Herdmania momus mitochondrial genome and support for the new chordate phylogeny.

PubMed

Singh, Tiratha Raj; Tsagkogeorga, Georgia; Delsuc, Frédéric; Blanquart, Samuel; Shenkar, Noa; Loya, Yossi; Douzery, Emmanuel Jp; Huchon, Dorothée

2009-11-17

Tunicates represent a key metazoan group as the sister-group of vertebrates within chordates. The six complete mitochondrial genomes available so far for tunicates have revealed distinctive features. Extensive gene rearrangements and particularly high evolutionary rates have been evidenced with regard to other chordates. This peculiar evolutionary dynamics has hampered the reconstruction of tunicate phylogenetic relationships within chordates based on mitogenomic data. In order to further understand the atypical evolutionary dynamics of the mitochondrial genome of tunicates, we determined the complete sequence of the solitary ascidian Herdmania momus. This genome from a stolidobranch ascidian presents the typical tunicate gene content with 13 protein-coding genes, 2 rRNAs and 24 tRNAs which are all encoded on the same strand. However, it also presents a novel gene arrangement, highlighting the extreme plasticity of gene order observed in tunicate mitochondrial genomes. Probabilistic phylogenetic inferences were conducted on the concatenation of the 13 mitochondrial protein-coding genes from representatives of major metazoan phyla. We show that whereas standard homogeneous amino acid models support an artefactual sister position of tunicates relative to all other bilaterians, the CAT and CAT+BP site- and time-heterogeneous mixture models place tunicates as the sister-group of vertebrates within monophyletic chordates. Moreover, the reference phylogeny indicates that tunicate mitochondrial genomes have experienced a drastic acceleration in their evolutionary rate that equally affects protein-coding and ribosomal-RNA genes. This is the first mitogenomic study supporting the new chordate phylogeny revealed by recent phylogenomic analyses. It illustrates the beneficial effects of an increased taxon sampling coupled with the use of more realistic amino acid substitution models for the reconstruction of animal phylogeny.
Complete genomic sequences of Propionibacterium freudenreichii phages from Swiss cheese reveal greater diversity than Cutibacterium (formerly Propionibacterium) acnes phages.

PubMed

Cheng, Lucy; Marinelli, Laura J; Grosset, Noël; Fitz-Gibbon, Sorel T; Bowman, Charles A; Dang, Brian Q; Russell, Daniel A; Jacobs-Sera, Deborah; Shi, Baochen; Pellegrini, Matteo; Miller, Jeff F; Gautier, Michel; Hatfull, Graham F; Modlin, Robert L

2018-03-01

A remarkable exception to the large genetic diversity often observed for bacteriophages infecting a specific bacterial host was found for the Cutibacterium acnes (formerly Propionibacterium acnes) phages, which are highly homogeneous. Phages infecting the related species, which is also a member of the Propionibacteriaceae family, Propionibacterium freudenreichii, a bacterium used in production of Swiss-type cheeses, have also been described and are common contaminants of the cheese manufacturing process. However, little is known about their genetic composition and diversity. We obtained seven independently isolated bacteriophages that infect P. freudenreichii from Swiss-type cheese samples, and determined their complete genome sequences. These data revealed that all seven phage isolates are of similar genomic length and GC% content, but their genomes are highly diverse, including genes encoding the capsid, tape measure, and tail proteins. In contrast to C. acnes phages, all P. freudenreichii phage genomes encode a putative integrase protein, suggesting they are capable of lysogenic growth. This is supported by the finding of related prophages in some P. freudenreichii strains. The seven phages could further be distinguished as belonging to two distinct genomic types, or 'clusters', based on nucleotide sequences, and host range analyses conducted on a collection of P. freudenreichii strains show a higher degree of host specificity than is observed for the C. acnes phages. Overall, our data demonstrate P. freudenreichii bacteriophages are distinct from C. acnes phages, as evidenced by their higher genetic diversity, potential for lysogenic growth, and more restricted host ranges. This suggests substantial differences in the evolution of these related species from the Propionibacteriaceae family and their phages, which is potentially related to their distinct environmental niches.
Comparative Genomics of the Dual-Obligate Symbionts from the Treehopper, Entylia carinata (Hemiptera: Membracidae), Provide Insight into the Origins and Evolution of an Ancient Symbiosis.

PubMed

Mao, Meng; Yang, Xiushuai; Poff, Kirsten; Bennett, Gordon

2017-06-01

Insect species in the Auchenorrhyncha suborder (Hemiptera) maintain ancient obligate symbioses with bacteria that provide essential amino acids (EAAs) deficient in their plant-sap diets. Molecular studies have revealed that two complementary symbiont lineages, "Candidatus Sulcia muelleri" and a betaproteobacterium ("Ca. Zinderia insecticola" in spittlebugs [Cercopoidea] and "Ca. Nasuia deltocephalinicola" in leafhoppers [Cicadellidae]) may have persisted in the suborder since its origin ∼300 Ma. However, investigation of how this pair has co-evolved on a genomic level is limited to only a few host lineages. We sequenced the complete genomes of Sulcia and a betaproteobacterium from the treehopper, Entylia carinata (Membracidae: ENCA), as the first representative from this species-rich group. It also offers the opportunity to compare symbiont evolution across a major insect group, the Membracoidea (leafhoppers + treehoppers). Genomic analyses show that the betaproteobacteria in ENCA is a member of the Nasuia lineage. Both symbionts have larger genomes (Sulcia = 218 kb and Nasuia = 144 kb) than related lineages in Deltocephalinae leafhoppers, retaining genes involved in basic cellular functions and information processing. Nasuia-ENCA further exhibits few unique gene losses, suggesting that its parent lineage in the common ancestor to the Membracoidea was already highly reduced. Sulcia-ENCA has lost the abilities to synthesize menaquinone cofactor and to complete the synthesis of the branched-chain EAAs. Both capabilities are conserved in other Sulcia lineages sequenced from across the Auchenorrhyncha. Finally, metagenomic sequencing recovered the partial genome of an Arsenophonus symbiont, although it infects only 20% of individuals indicating a facultative role. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comparative Genomics of the Dual-Obligate Symbionts from the Treehopper, Entylia carinata (Hemiptera: Membracidae), Provide Insight into the Origins and Evolution of an Ancient Symbiosis

PubMed Central

Yang, Xiushuai; Poff, Kirsten; Bennett, Gordon

2017-01-01

Insect species in the Auchenorrhyncha suborder (Hemiptera) maintain ancient obligate symbioses with bacteria that provide essential amino acids (EAAs) deficient in their plant-sap diets. Molecular studies have revealed that two complementary symbiont lineages, “Candidatus Sulcia muelleri” and a betaproteobacterium (“Ca. Zinderia insecticola” in spittlebugs [Cercopoidea] and “Ca. Nasuia deltocephalinicola” in leafhoppers [Cicadellidae]) may have persisted in the suborder since its origin ∼300 Ma. However, investigation of how this pair has co-evolved on a genomic level is limited to only a few host lineages. We sequenced the complete genomes of Sulcia and a betaproteobacterium from the treehopper, Entylia carinata (Membracidae: ENCA), as the first representative from this species-rich group. It also offers the opportunity to compare symbiont evolution across a major insect group, the Membracoidea (leafhoppers + treehoppers). Genomic analyses show that the betaproteobacteria in ENCA is a member of the Nasuia lineage. Both symbionts have larger genomes (Sulcia = 218 kb and Nasuia = 144 kb) than related lineages in Deltocephalinae leafhoppers, retaining genes involved in basic cellular functions and information processing. Nasuia-ENCA further exhibits few unique gene losses, suggesting that its parent lineage in the common ancestor to the Membracoidea was already highly reduced. Sulcia-ENCA has lost the abilities to synthesize menaquinone cofactor and to complete the synthesis of the branched-chain EAAs. Both capabilities are conserved in other Sulcia lineages sequenced from across the Auchenorrhyncha. Finally, metagenomic sequencing recovered the partial genome of an Arsenophonus symbiont, although it infects only 20% of individuals indicating a facultative role. PMID:28854637
Community genomic analyses constrain the distribution of metabolic traits across the Chloroflexi phylum and indicate roles in sediment carbon cycling

PubMed Central

2013-01-01

Background Sediments are massive reservoirs of carbon compounds and host a large fraction of microbial life. Microorganisms within terrestrial aquifer sediments control buried organic carbon turnover, degrade organic contaminants, and impact drinking water quality. Recent 16S rRNA gene profiling indicates that members of the bacterial phylum Chloroflexi are common in sediment. Only the role of the class Dehalococcoidia, which degrade halogenated solvents, is well understood. Genomic sampling is available for only six of the approximately 30 Chloroflexi classes, so little is known about the phylogenetic distribution of reductive dehalogenation or about the broader metabolic characteristics of Chloroflexi in sediment. Results We used metagenomics to directly evaluate the metabolic potential and diversity of Chloroflexi in aquifer sediments. We sampled genomic sequence from 86 Chloroflexi representing 15 distinct lineages, including members of eight classes previously characterized only by 16S rRNA sequences. Unlike in the Dehalococcoidia, genes for organohalide respiration are rare within the Chloroflexi genomes sampled here. Near-complete genomes were reconstructed for three Chloroflexi. One, a member of an unsequenced lineage in the Anaerolinea, is an aerobe with the potential for respiring diverse carbon compounds. The others represent two genomically unsampled classes sibling to the Dehalococcoidia, and are anaerobes likely involved in sugar and plant-derived-compound degradation to acetate. Both fix CO2 via the Wood-Ljungdahl pathway, a pathway not previously documented in Chloroflexi. The genomes each encode unique traits apparently acquired from Archaea, including mechanisms of motility and ATP synthesis. Conclusions Chloroflexi in the aquifer sediments are abundant and highly diverse. Genomic analyses provide new evolutionary boundaries for obligate organohalide respiration. We expand the potential roles of Chloroflexi in sediment carbon cycling beyond organohalide respiration to include respiration of sugars, fermentation, CO2 fixation, and acetogenesis with ATP formation by substrate-level phosphorylation. PMID:24450983
Complete Coding Genome Sequence for Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil

DTIC Science & Technology

2017-05-04

and capable of infecting a wide range of animal hosts (1–5). Here, we report the complete coding genome sequence (i.e., only missing portions of...segmented nature of the genome was not understood. Therefore, only the two genome segments with detectable sequence homologies to flaviviruses were...originally reported (2). We revisited the data set of Maruyama et al. (2) and assembled the complete coding sequences for all four genome segments. We
Complete genome sequence of Campylobacter concisus ATCC 33237T and draft genome sequences for an additional eight well-characterized C. concisus strains

USDA-ARS's Scientific Manuscript database

This report includes the complete genome of the Campylobacter concisus type strain ATCC 33237T and the draft genomes of eight additional well characterized C. concisus genomes. C. concisus has been shown to be a genetically heterogeneous species and these nine genomes provide valuable information re...

Tibrogargan and Coastal Plains rhabdoviruses: genomic characterization, evolution of novel genes and seroprevalence in Australian livestock.

PubMed

Gubala, Aneta; Davis, Steven; Weir, Richard; Melville, Lorna; Cowled, Chris; Boyle, David

2011-09-01

Tibrogargan virus (TIBV) and Coastal Plains virus (CPV) were isolated from cattle in Australia and TIBV has also been isolated from the biting midge Culicoides brevitarsis. Complete genomic sequencing revealed that the viruses share a novel genome structure within the family Rhabdoviridae, each virus containing two additional putative genes between the matrix protein (M) and glycoprotein (G) genes and one between the G and viral RNA polymerase (L) genes. The predicted novel protein products are highly diverged at the sequence level but demonstrate clear conservation of secondary structure elements, suggesting conservation of biological functions. Phylogenetic analyses showed that TIBV and CPV form an independent group within the 'dimarhabdovirus supergroup'. Although no disease has been observed in association with these viruses, antibodies were detected at high prevalence in cattle and buffalo in northern Australia, indicating the need for disease monitoring and further study of this distinctive group of viruses.
Plastid Phylogenomic Analyses Resolve Tofieldiaceae as the Root of the Early Diverging Monocot Order Alismatales.

PubMed

Luo, Yang; Ma, Peng-Fei; Li, Hong-Tao; Yang, Jun-Bo; Wang, Hong; Li, De-Zhu

2016-04-06

The predominantly aquatic order Alismatales, which includes approximately 4,500 species within Araceae, Tofieldiaceae, and the core alismatid families, is a key group in investigating the origin and early diversification of monocots. Despite their importance, phylogenetic ambiguity regarding the root of the Alismatales tree precludes answering questions about the early evolution of the order. Here, we sequenced the first complete plastid genomes from three key families in this order: Potamogeton perfoliatus (Potamogetonaceae), Sagittaria lichuanensis (Alismataceae), and Tofieldia thibetica (Tofieldiaceae). Each family possesses the typical quadripartite structure, with plastid genome sizes of 156,226, 179,007, and 155,512 bp, respectively. Among them, the plastid genome of S. lichuanensis is the largest in monocots and the second largest in angiosperms. Like other sequenced Alismatales plastid genomes, all three families generally encode the same 113 genes with similar structure and arrangement. However, we detected 2.4 and 6 kb inversions in the plastid genomes of Sagittaria and Potamogeton, respectively. Further, we assembled a 79 plastid protein-coding gene sequence data matrix of 22 taxa that included the three newly generated plastid genomes plus 19 previously reported ones, which together represent all primary lineages of monocots and outgroups. In plastid phylogenomic analyses using maximum likelihood and Bayesian inference, we show both strong support for Acorales as sister to the remaining monocots and monophyly of Alismatales. More importantly, Tofieldiaceae was resolved as the most basal lineage within Alismatales. These results provide new insights into the evolution of Alismatales as well as the early-diverging monocots as a whole. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genomic features separating ten strains of Neorhizobium galegae with different symbiotic phenotypes.

PubMed

Österman, Janina; Mousavi, Seyed Abdollah; Koskinen, Patrik; Paulin, Lars; Lindström, Kristina

2015-05-02

The symbiotic phenotype of Neorhizobium galegae, with strains specifically fixing nitrogen with either Galega orientalis or G. officinalis, has made it a target in research on determinants of host specificity in nitrogen fixation. The genomic differences between representative strains of the two symbiovars are, however, relatively small. This introduced a need for a dataset representing a larger bacterial population in order to make better conclusions on characteristics typical for a subset of the species. In this study, we produced draft genomes of eight strains of N. galegae having different symbiotic phenotypes, both with regard to host specificity and nitrogen fixation efficiency. These genomes were analysed together with the previously published complete genomes of N. galegae strains HAMBI 540T and HAMBI 1141. The results showed that the presence of an additional rpoN sigma factor gene in the symbiosis gene region is a characteristic specific to symbiovar orientalis, required for nitrogen fixation. Also the nifQ gene was shown to be crucial for functional symbiosis in both symbiovars. Genome-wide analyses identified additional genes characteristic of strains of the same symbiovar and of strains having similar plant growth promoting properties on Galega orientalis. Many of these genes are involved in transcriptional regulation or in metabolic functions. The results of this study confirm that the only symbiosis-related gene that is present in one symbiovar of N. galegae but not in the other is an rpoN gene. The specific function of this gene remains to be determined, however. New genes that were identified as specific for strains of one symbiovar may be involved in determining host specificity, while others are defined as potential determinant genes for differences in efficiency of nitrogen fixation.
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection.

PubMed

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A

2016-01-15

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated "CiHHV-6A/B". These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections.
'Candidatus Phytoplasma phoenicium' associated with almond witches'-broom disease: from draft genome to genetic diversity among strain populations.

PubMed

Quaglino, Fabio; Kube, Michael; Jawhari, Maan; Abou-Jawdah, Yusuf; Siewert, Christin; Choueiri, Elia; Sobh, Hana; Casati, Paola; Tedeschi, Rosemarie; Lova, Marina Molino; Alma, Alberto; Bianco, Piero Attilio

2015-07-30

Almond witches'-broom (AlmWB), a devastating disease of almond, peach and nectarine in Lebanon, is associated with 'Candidatus Phytoplasma phoenicium'. In the present study, we generated a draft genome sequence of 'Ca. P. phoenicium' strain SA213, representative of phytoplasma strain populations from different host plants, and determined the genetic diversity among phytoplasma strain populations by phylogenetic analyses of 16S rRNA, groEL, tufB and inmp gene sequences. Sequence-based typing and phylogenetic analysis of the gene inmp, coding an integral membrane protein, distinguished AlmWB-associated phytoplasma strains originating from diverse host plants, whereas their 16S rRNA, tufB and groEL genes shared 100 % sequence identity. Moreover, dN/dS analysis indicated positive selection acting on inmp gene. Additionally, the analysis of 'Ca. P. phoenicium' draft genome revealed the presence of integral membrane proteins and effector-like proteins and potential candidates for interaction with hosts. One of the integral membrane proteins was predicted as BI-1, an inhibitor of apoptosis-promoting Bax factor. Bioinformatics analyses revealed the presence of putative BI-1 in draft and complete genomes of other 'Ca. Phytoplasma' species. The genetic diversity within 'Ca. P. phoenicium' strain populations in Lebanon suggested that AlmWB disease could be associated with phytoplasma strains derived from the adaptation of an original strain to diverse hosts. Moreover, the identification of a putative inhibitor of apoptosis-promoting Bax factor (BI-1) in 'Ca. P. phoenicium' draft genome and within genomes of other 'Ca. Phytoplasma' species suggested its potential role as a phytoplasma fitness-increasing factor by modification of the host-defense response.
Complete Genome Sequence of Germline Chromosomally Integrated Human Herpesvirus 6A and Analyses Integration Sites Define a New Human Endogenous Virus with Potential to Reactivate as an Emerging Infection

PubMed Central

Tweedy, Joshua; Spyrou, Maria Alexandra; Pearson, Max; Lassner, Dirk; Kuhl, Uwe; Gompels, Ursula A.

2016-01-01

Human herpesvirus-6A and B (HHV-6A, HHV-6B) have recently defined endogenous genomes, resulting from integration into the germline: chromosomally-integrated “CiHHV-6A/B”. These affect approximately 1.0% of human populations, giving potential for virus gene expression in every cell. We previously showed that CiHHV-6A was more divergent than CiHHV-6B by examining four genes in 44 European CiHHV-6A/B cardiac/haematology patients. There was evidence for gene expression/reactivation, implying functional non-defective genomes. To further define the relationship between HHV-6A and CiHHV-6A we used next-generation sequencing to characterize genomes from three CiHHV-6A cardiac patients. Comparisons to known exogenous HHV-6A showed CiHHV-6A genomes formed a separate clade; including all 85 non-interrupted genes and necessary cis-acting signals for reactivation as infectious virus. Greater single nucleotide polymorphism (SNP) density was defined in 16 genes and the direct repeats (DR) terminal regions. Using these SNPs, deep sequencing analyses demonstrated superinfection with exogenous HHV-6A in two of the CiHHV-6A patients with recurrent cardiac disease. Characterisation of the integration sites in twelve patients identified the human chromosome 17p subtelomere as a prevalent site, which had specific repeat structures and phylogenetically related CiHHV-6A coding sequences indicating common ancestral origins. Overall CiHHV-6A genomes were similar, but distinct from known exogenous HHV-6A virus, and have the capacity to reactivate as emerging virus infections. PMID:26784220
Whole genome duplication events in plant evolution reconstructed and predicted using myosin motor proteins

PubMed Central

2013-01-01

Background The evolution of land plants is characterized by whole genome duplications (WGD), which drove species diversification and evolutionary novelties. Detecting these events is especially difficult if they date back to the origin of the plant kingdom. Established methods for reconstructing WGDs include intra- and inter-genome comparisons, KS age distribution analyses, and phylogenetic tree constructions. Results By analysing 67 completely sequenced plant genomes 775 myosins were identified and manually assembled. Phylogenetic trees of the myosin motor domains revealed orthologous and paralogous relationships and were consistent with recent species trees. Based on the myosin inventories and the phylogenetic trees, we have identified duplications of the entire myosin motor protein family at timings consistent with 23 WGDs, that had been reported before. We also predict 6 WGDs based on further protein family duplications. Notably, the myosin data support the two recently reported WGDs in the common ancestor of all extant angiosperms. We predict single WGDs in the Manihot esculenta and Nicotiana benthamiana lineages, two WGDs for Linum usitatissimum and Phoenix dactylifera, and a triplication or two WGDs for Gossypium raimondii. Our data show another myosin duplication in the ancestor of the angiosperms that could be either the result of a single gene duplication or a remnant of a WGD. Conclusions We have shown that the myosin inventories in angiosperms retain evidence of numerous WGDs that happened throughout plant evolution. In contrast to other protein families, many myosins are still present in extant species. They are closely related and have similar domain architectures, and their phylogenetic grouping follows the genome duplications. Because of its broad taxonomic sampling the dataset provides the basis for reliable future identification of further whole genome duplications. PMID:24053117
Whole genome duplication events in plant evolution reconstructed and predicted using myosin motor proteins.

PubMed

Mühlhausen, Stefanie; Kollmar, Martin

2013-09-22

The evolution of land plants is characterized by whole genome duplications (WGD), which drove species diversification and evolutionary novelties. Detecting these events is especially difficult if they date back to the origin of the plant kingdom. Established methods for reconstructing WGDs include intra- and inter-genome comparisons, KS age distribution analyses, and phylogenetic tree constructions. By analysing 67 completely sequenced plant genomes 775 myosins were identified and manually assembled. Phylogenetic trees of the myosin motor domains revealed orthologous and paralogous relationships and were consistent with recent species trees. Based on the myosin inventories and the phylogenetic trees, we have identified duplications of the entire myosin motor protein family at timings consistent with 23 WGDs, that had been reported before. We also predict 6 WGDs based on further protein family duplications. Notably, the myosin data support the two recently reported WGDs in the common ancestor of all extant angiosperms. We predict single WGDs in the Manihot esculenta and Nicotiana benthamiana lineages, two WGDs for Linum usitatissimum and Phoenix dactylifera, and a triplication or two WGDs for Gossypium raimondii. Our data show another myosin duplication in the ancestor of the angiosperms that could be either the result of a single gene duplication or a remnant of a WGD. We have shown that the myosin inventories in angiosperms retain evidence of numerous WGDs that happened throughout plant evolution. In contrast to other protein families, many myosins are still present in extant species. They are closely related and have similar domain architectures, and their phylogenetic grouping follows the genome duplications. Because of its broad taxonomic sampling the dataset provides the basis for reliable future identification of further whole genome duplications.
Ancient Duplications and Expression Divergence in the Globin Gene Superfamily of Vertebrates: Insights from the Elephant Shark Genome and Transcriptome

PubMed Central

Opazo, Juan C.; Toloza-Villalobos, Jessica; Burmester, Thorsten; Venkatesh, Byrappa; Storz, Jay F.

2015-01-01

Comparative analyses of vertebrate genomes continue to uncover a surprising diversity of genes in the globin gene superfamily, some of which have very restricted phyletic distributions despite their antiquity. Genomic analysis of the globin gene repertoire of cartilaginous fish (Chondrichthyes) should be especially informative about the duplicative origins and ancestral functions of vertebrate globins, as divergence between Chondrichthyes and bony vertebrates represents the most basal split within the jawed vertebrates. Here, we report a comparative genomic analysis of the vertebrate globin gene family that includes the complete globin gene repertoire of the elephant shark (Callorhinchus milii). Using genomic sequence data from representatives of all major vertebrate classes, integrated analyses of conserved synteny and phylogenetic relationships revealed that the last common ancestor of vertebrates possessed a repertoire of at least seven globin genes: single copies of androglobin and neuroglobin, four paralogous copies of globin X, and the single-copy progenitor of the entire set of vertebrate-specific globins. Combined with expression data, the genomic inventory of elephant shark globins yielded four especially surprising findings: 1) there is no trace of the neuroglobin gene (a highly conserved gene that is present in all other jawed vertebrates that have been examined to date), 2) myoglobin is highly expressed in heart, but not in skeletal muscle (reflecting a possible ancestral condition in vertebrates with single-circuit circulatory systems), 3) elephant shark possesses two highly divergent globin X paralogs, one of which is preferentially expressed in gonads, and 4) elephant shark possesses two structurally distinct α-globin paralogs, one of which is preferentially expressed in the brain. Expression profiles of elephant shark globin genes reveal distinct specializations of function relative to orthologs in bony vertebrates and suggest hypotheses about ancestral functions of vertebrate globins. PMID:25743544
First Complete Genome Sequence of Bean common mosaic necrosis virus from East Timor

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2016-01-01

We present here the first complete Bean common mosaic necrosis virus (BCMNV) genomic sequence isolated from virus-infected common bean (Phaseolus vulgaris) in East Timor, and compare it with six complete BCMNV genomes from the Netherlands, and one each from the United States, Tanzania, and an unspecified country. It most resembled the Netherlands strain NL-8 genome. PMID:27688343
Genome alignment with graph data structures: a comparison

PubMed Central

2014-01-01

Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884
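The graph construction described in the abstract above can be illustrated with a small sketch. It is not the method of any specific tool compared in that paper: it assumes a simplified model in which each local alignment pairs two sequence blocks, each block is a vertex written as a hypothetical (genome, start, end) tuple, and each alignment is an edge. Connected components then group blocks that are transitively aligned, and a component holding more than one block from the same genome is flagged as a candidate substructure needing resolution (for example a duplication or a spurious alignment). All function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def build_alignment_graph(local_alignments):
    """Build an undirected graph: vertices are blocks, edges are local alignments.

    Each block is a hypothetical (genome, start, end) tuple.
    """
    graph = defaultdict(set)
    for block_a, block_b in local_alignments:
        graph[block_a].add(block_b)
        graph[block_b].add(block_a)
    return graph


def connected_components(graph):
    """Return the connected components of the graph as a list of vertex sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:  # iterative depth-first search
            vertex = stack.pop()
            component.add(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def conflicting_components(components):
    """Flag components in which one genome contributes more than one block."""
    flagged = []
    for component in components:
        genomes = [genome for genome, _, _ in component]
        if len(genomes) != len(set(genomes)):
            flagged.append(component)
    return flagged


# Toy input: the last two alignments map two different blocks of genome A
# onto the same block of genome B.
toy_alignments = [
    (("A", 0, 100), ("B", 10, 110)),
    (("B", 10, 110), ("C", 5, 105)),
    (("A", 500, 600), ("B", 700, 800)),
    (("A", 900, 1000), ("B", 700, 800)),
]

graph = build_alignment_graph(toy_alignments)
components = connected_components(graph)
flagged = conflicting_components(components)
```

In this toy input the first two alignments form a consistent three-genome component, while the last two place two blocks of genome A in one component, which the sketch flags. Real graph-based aligners use richer vertex and edge labels (strand, orientation, position ranges), which the abstract notes are needed to represent inversions and duplications.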
COMPUTATIONAL RESOURCES FOR BIOFUEL FEEDSTOCK SPECIES

DOE Office of Scientific and Technical Information (OSTI.GOV)

Buell, Carol Robin; Childs, Kevin L

2013-05-07

While current production of ethanol as a biofuel relies on starch and sugar inputs, it is anticipated that sustainable production of ethanol for biofuel use will utilize lignocellulosic feedstocks. Candidate plant species to be used for lignocellulosic ethanol production include a large number of species within the Grass, Pine and Birch plant families. For these biofuel feedstock species, there are variable amounts of genome sequence resources available, ranging from complete genome sequences (e.g. sorghum, poplar) to transcriptome data sets (e.g. switchgrass, pine). These data sets are not only dispersed in location but also disparate in content. It will be essential to leverage and improve these genomic data sets for the improvement of biofuel feedstock production. The objectives of this project were to provide computational tools and resources for data-mining genome sequence/annotation and large-scale functional genomic datasets available for biofuel feedstock species. We have created a Bioenergy Feedstock Genomics Resource that provides a web-based portal or clearing house for genomic data for plant species relevant to biofuel feedstock production. Sequence data from a total of 54 plant species are included in the Bioenergy Feedstock Genomics Resource including model plant species that permit leveraging of knowledge across taxa to biofuel feedstock species. We have generated additional computational analyses of these data, including uniform annotation, to facilitate genomic approaches to improved biofuel feedstock production. These data have been centralized in the publicly available Bioenergy Feedstock Genomics Resource (http://bfgr.plantbiology.msu.edu/).
The complete mitochondrial genomes for three Toxocara species of human and animal health significance.

PubMed

Li, Ming-Wei; Lin, Rui-Qing; Song, Hui-Qun; Wu, Xiang-Yun; Zhu, Xing-Quan

2008-05-16

Studying mitochondrial (mt) genomics has important implications for various fundamental areas, including mt biochemistry, physiology and molecular biology. In addition, mt genome sequences have provided useful markers for investigating population genetic structures, systematics and phylogenetics of organisms. Toxocara canis, Toxocara cati and Toxocara malaysiensis cause significant health problems in animals and humans. Although they are of importance in human and animal health, no information on the mt genomes for any Toxocara species is available. The entire mt genomes are 14,322 bp for T. canis, 14,029 bp for T. cati and 14,266 bp for T. malaysiensis. These circular genomes are amongst the largest reported to date for all secernentean nematodes. Their relatively large sizes relate mainly to an increased length in the AT-rich region. The mt genomes of the three Toxocara species all encode 12 proteins, two ribosomal RNAs and 22 transfer RNA genes, but lack the ATP synthetase subunit 8 gene, which is consistent with all other nematode species studied to date, with the exception of Trichinella spiralis. All genes are transcribed in the same direction and have a nucleotide composition high in A and T, but low in G and C. The A+T contents of the complete genomes are 68.57% for T. canis, 69.95% for T. cati and 68.86% for T. malaysiensis, of which the A+T content of T. canis is the lowest among all nematodes studied to date. The AT bias had a significant effect on both the codon usage pattern and amino acid composition of proteins. The mt genome structures for the three Toxocara species, including genes and non-coding regions, are in the same order as for Ascaris suum and Anisakis simplex, but differ from Ancylostoma duodenale, Necator americanus and Caenorhabditis elegans only in the location of the AT-rich region, whereas there are substantial differences when compared with Onchocerca volvulus, Dirofilaria immitis and Strongyloides stercoralis. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes revealed that the newly described species T. malaysiensis was more closely related to T. cati than to T. canis, consistent with results of a previous study using sequences of nuclear internal transcribed spacers as genetic markers. The present study determined the complete mt genome sequences for three roundworms of human and animal health significance, which provides mtDNA evidence for the validity of T. malaysiensis and also provides a foundation for studying the systematics, population genetics and ecology of these and other nematodes of socio-economic importance.
Analysis of the complete nucleotide sequence and functional organization of the genome of Streptococcus pneumoniae bacteriophage Cp-1.

PubMed

Martín, A C; López, R; García, P

1996-06-01

Cp-1, a bacteriophage infecting Streptococcus pneumoniae, has a linear double-stranded DNA genome, with a terminal protein covalently linked to its 5' ends, that replicates by the protein-priming mechanism. We describe here the complete DNA sequence and transcriptional map of the Cp-1 genome. These analyses have led to the firm assignment of 10 genes and the localization of 19 additional open reading frames in the 19,345-bp Cp-1 DNA. Striking similarities and differences between some of these proteins and those of the Bacillus subtilis phage phi 29, a system that also replicates its DNA by the protein-priming mechanism, have been revealed. The genes coding for structural proteins and assembly factors are located in the central part of the Cp-1 genome. Several proteins corresponding to the predicted gene products were identified by in vitro and in vivo expression of the cloned genes. Mature major head protein from the virion particles results from hydrolysis of the primary gene product at the His-49 residue, whereas the phage gene is expressed in Escherichia coli without modification. We have also identified two open reading frames coding for proteins that show high degrees of similarity to the N- and C-terminal regions, respectively, of the single tail protein identified in phi 29. Sequencing and primer extension analysis suggest transcription of a small RNA showing a secondary structure similar to that of the prohead RNA required for the ATP-dependent packaging of phi 29 DNA. On the basis of its temporal expression, transcription of the Cp-1 genome takes place in two stages, early and late. Combined Northern (RNA) blot and primer extension experiments allowed us to map the 5' initiation sites of the transcripts, and we found that only three genes were transcribed from right to left. These analyses reveal that there are also noticeable differences between Cp-1 and phi 29 in transcriptional organization. Considered together, the observations reported here provide new tangible evidence on phylogenetic relationships between B. subtilis and S. pneumoniae.
Genomic Analysis of the DNA Replication Timing Program during Mitotic S Phase in Maize (Zea mays) Root Tips

PubMed Central

LeBlanc, Chantal; Lee, Tae-Jin; Mulvaney, Patrick; Allen, George C.; Martienssen, Robert A.; Thompson, William F.

2017-01-01

All plants and animals must replicate their DNA, using a regulated process to ensure that their genomes are completely and accurately replicated. DNA replication timing programs have been extensively studied in yeast and animal systems, but much less is known about the replication programs of plants. We report a novel adaptation of the “Repli-seq” assay for use in intact root tips of maize (Zea mays) that includes several different cell lineages and present whole-genome replication timing profiles from cells in early, mid, and late S phase of the mitotic cell cycle. Maize root tips have a complex replication timing program, including regions of distinct early, mid, and late S replication that each constitute between 20 and 24% of the genome, as well as other loci corresponding to ∼32% of the genome that exhibit replication activity in two different time windows. Analyses of genomic, transcriptional, and chromatin features of the euchromatic portion of the maize genome provide evidence for a gradient of early replicating, open chromatin that transitions gradually to less open and less transcriptionally active chromatin replicating in mid S phase. Our genomic level analysis also demonstrated that the centromere core replicates in mid S, before heavily compacted classical heterochromatin, including pericentromeres and knobs, which replicate during late S phase. PMID:28842533
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

DOE PAGES

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; ...

2016-10-24

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

PubMed Central

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; Sharon, Itai; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Singh, Andrea; Wilkins, Michael J.; Karaoz, Ulas; Brodie, Eoin L.; Williams, Kenneth H.; Hubbard, Susan S.; Banfield, Jillian F.

2016-01-01

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles. PMID:27774985
Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system

NASA Astrophysics Data System (ADS)

Anantharaman, Karthik; Brown, Christopher T.; Hug, Laura A.; Sharon, Itai; Castelle, Cindy J.; Probst, Alexander J.; Thomas, Brian C.; Singh, Andrea; Wilkins, Michael J.; Karaoz, Ulas; Brodie, Eoin L.; Williams, Kenneth H.; Hubbard, Susan S.; Banfield, Jillian F.

2016-10-01

The subterranean world hosts up to one-fifth of all biomass, including microbial communities that drive transformations central to Earth's biogeochemical cycles. However, little is known about how complex microbial communities in such environments are structured, and how inter-organism interactions shape ecosystem function. Here we apply terabase-scale cultivation-independent metagenomics to aquifer sediments and groundwater, and reconstruct 2,540 draft-quality, near-complete and complete strain-resolved genomes that represent the majority of known bacterial phyla as well as 47 newly discovered phylum-level lineages. Metabolic analyses spanning this vast phylogenetic diversity and representing up to 36% of organisms detected in the system are used to document the distribution of pathways in coexisting organisms. Consistent with prior findings indicating metabolic handoffs in simple consortia, we find that few organisms within the community can conduct multiple sequential redox transformations. As environmental conditions change, different assemblages of organisms are selected for, altering linkages among the major biogeochemical cycles.
Complete sequence of the first chimera genome constructed by cloning the whole genome of Synechocystis strain PCC6803 into the Bacillus subtilis 168 genome.

PubMed

Watanabe, Satoru; Shiwa, Yuh; Itaya, Mitsuhiro; Yoshikawa, Hirofumi

2012-12-01

Genome synthesis of existing or designed genomes is made feasible by the first successful cloning of a cyanobacterium, Synechocystis PCC6803, in Gram-positive, endospore-forming Bacillus subtilis. Whole-genome sequence analysis of the isolate and parental B. subtilis strains provides clues for identifying single nucleotide polymorphisms (SNPs) in the 2 complete bacterial genomes in one cell.

The Complete Mitochondrial Genomes of Two Octopods Cistopus chinensis and Cistopus taiwanicus: Revealing the Phylogenetic Position of the Genus Cistopus within the Order Octopoda

PubMed Central

Cheng, Rubin; Zheng, Xiaodong; Ma, Yuanyuan; Li, Qi

2013-01-01

In the present study, we determined the complete mitochondrial DNA (mtDNA) sequences of two species of Cistopus, namely C. chinensis and C. taiwanicus, and conducted a comparative mt genome analysis across the class Cephalopoda. The mtDNA lengths of C. chinensis and C. taiwanicus are 15706 and 15793 nucleotides, with AT contents of 76.21% and 76.5%, respectively. The sequence identity of mtDNA between C. chinensis and C. taiwanicus was 88%, suggesting a close relationship. Compared with C. taiwanicus and other octopods, C. chinensis encoded two additional tRNA genes, showing a novel gene arrangement. In addition, an unusual 23 poly (A) signal structure is found in the ATP8 coding region of C. chinensis. The entire genome and each protein coding gene of the two Cistopus species displayed notable levels of AT and GC skews. Based on sliding window analysis among Octopodiformes, ND1 and ND5 were considered to be more reliable molecular beacons. Phylogenetic analyses based on the 13 protein-coding genes revealed that C. chinensis and C. taiwanicus form a monophyletic group with high statistical support, consistent with previous studies based on morphological characteristics. Our results also indicated that the phylogenetic position of the genus Cistopus is closer to Octopus than to Amphioctopus and Callistoctopus. The complete mtDNA sequences of C. chinensis and C. taiwanicus represent the first whole mt genomes in the genus Cistopus. These novel mtDNA data will be important in refining the phylogenetic relationships within Octopodiformes and enriching the resource of markers for systematic, population genetic and evolutionary biological studies of Cephalopoda. PMID:24358345
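The AT content and AT/GC skew values quoted in this abstract are simple base-composition statistics. A minimal Python sketch of how such values are commonly computed is given below; it assumes the widely used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), which the abstract itself does not spell out, and the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def composition_stats(seq):
    """Return (A+T content in %, AT skew, GC skew) for a DNA sequence.

    Assumed definitions (not stated in the abstract):
      AT skew = (A - T) / (A + T)
      GC skew = (G - C) / (G + C)
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    at_content = 100.0 * (a + t) / total
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_content, at_skew, gc_skew


# Toy sequence: 3 A, 2 T, 3 G, 2 C
at_content, at_skew, gc_skew = composition_stats("AAATTGGGCC")
```

Applied to a full mitochondrial genome read from a FASTA file, the same function would give whole-genome values; applying it per gene would give the per-gene skews the abstract refers to.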
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

PubMed Central

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Ehrhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Ehrhartoideae and Pooideae. PMID:17534593
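The SSR screen mentioned in this abstract can be illustrated with a small regular-expression scan. The sketch below is a generic approach, not the authors' procedure: the abstract does not state which tool or thresholds were used, so the function name, the minimum repeat counts (10 for mononucleotide, 5 for dinucleotide, 4 for trinucleotide units) and the toy sequence are all assumptions.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect simple sequence repeats in a DNA string.

    min_repeats maps repeat-unit length to the minimum number of tandem
    copies required. The defaults are illustrative assumptions only.
    Returns a sorted list of (start, unit, copies) tuples (0-based start).
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 5, 3: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured unit followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if unit_len > 1 and len(set(unit)) == 1:
                continue  # e.g. "AA" repeated is just a mononucleotide run
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: a 12-base A run and six copies of AT
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GGC"
ssrs = find_ssrs(toy)
```

Whether a detected SSR is polymorphic cannot be read from one sequence; that would require comparing the repeat length at the same locus across genomes.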
Single-cell and metagenomic analyses indicate a fermentative and saccharolytic lifestyle for members of the OP9 lineage

PubMed Central

Dodsworth, Jeremy A.; Blainey, Paul C.; Murugapiran, Senthil K.; Swingley, Wesley D.; Ross, Christian A.; Tringe, Susannah G.; Chain, Patrick S. G.; Scholz, Matthew B.; Lo, Chien-Chi; Raymond, Jason; Quake, Stephen R.; Hedlund, Brian P.

2013-01-01

OP9 is a yet-uncultivated bacterial lineage found in geothermal systems, petroleum reservoirs, anaerobic digesters, and wastewater treatment facilities. Here we use single-cell and metagenome sequencing to obtain two distinct, nearly-complete OP9 genomes, one constructed from single cells sorted from hot spring sediments and the other derived from binned metagenomic contigs from an in situ-enriched cellulolytic, thermophilic community. Phylogenomic analyses support the designation of OP9 as a candidate phylum for which we propose the name ‘Atribacteria’. Although a plurality of predicted proteins is most similar to those from Firmicutes, the presence of key genes suggests a diderm cell envelope. Metabolic reconstruction from the core genome suggests an anaerobic lifestyle based on sugar fermentation by Embden-Meyerhof glycolysis with production of hydrogen, acetate, and ethanol. Putative glycohydrolases and an endoglucanase may enable catabolism of (hemi)cellulose in thermal environments. This study lays a foundation for understanding the physiology and ecological role of the ‘Atribacteria’. PMID:23673639
Genomic insights into strategies used by Xanthomonas albilineans with its reduced artillery to spread within sugarcane xylem vessels.

PubMed

Pieretti, Isabelle; Royer, Monique; Barbe, Valérie; Carrere, Sébastien; Koebnik, Ralf; Couloux, Arnaud; Darrasse, Armelle; Gouzy, Jérôme; Jacques, Marie-Agnès; Lauber, Emmanuelle; Manceau, Charles; Mangenot, Sophie; Poussier, Stéphane; Segurens, Béatrice; Szurek, Boris; Verdier, Valérie; Arlat, Matthieu; Gabriel, Dean W; Rott, Philippe; Cociancich, Stéphane

2012-11-21

Xanthomonas albilineans causes leaf scald, a lethal disease of sugarcane. X. albilineans exhibits distinctive pathogenic mechanisms, ecology and taxonomy compared to other species of Xanthomonas. For example, this species produces a potent DNA gyrase inhibitor called albicidin that is largely responsible for inducing disease symptoms; its habitat is limited to xylem; and the species exhibits large variability. A first manuscript on the complete genome sequence of the highly pathogenic X. albilineans strain GPE PC73 focused exclusively on distinctive genomic features shared with Xylella fastidiosa, another xylem-limited member of the Xanthomonadaceae. The present manuscript on the same genome sequence aims to describe all other pathogenicity-related genomic features of X. albilineans, and to compare, using suppression subtractive hybridization (SSH), genomic features of two strains differing in pathogenicity. Comparative genomic analyses showed that most of the known pathogenicity factors from other Xanthomonas species are conserved in X. albilineans, with the notable absence of two major determinants of the "artillery" of other plant pathogenic species of Xanthomonas: the xanthan gum biosynthesis gene cluster, and the type III secretion system Hrp (hypersensitive response and pathogenicity). Genomic features specific to X. albilineans that may contribute to specific adaptation of this pathogen to sugarcane xylem vessels were also revealed. SSH experiments led to the identification of 20 genes common to three highly pathogenic strains but missing in a less pathogenic strain. These 20 genes, which include four ABC transporter genes, a methyl-accepting chemotaxis protein gene and an oxidoreductase gene, could play a key role in pathogenicity. With the exception of hypothetical proteins revealed by our comparative genomic analyses and SSH experiments, no genes potentially involved in any offensive or counter-defensive mechanism specific to X. albilineans were identified, suggesting that X. albilineans has a reduced artillery compared to other pathogenic Xanthomonas species. Particular attention has therefore been given to genomic features specific to X. albilineans making it more capable of evading sugarcane surveillance systems or resisting sugarcane defense systems. This study confirms that X. albilineans is a highly distinctive species within the genus Xanthomonas, and opens new perspectives towards a greater understanding of the pathogenicity of this destructive sugarcane pathogen.
Complete Genome Sequences of the Potyvirus Sweet potato virus 2 from East Timor and Australia

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2016-01-01

We present here the first complete genome sequences of Sweet potato virus 2 (SPV2) from sweet potato in Australia and East Timor, and compare these with five complete SPV2 genome sequences from South Korea and one each from Spain and the United States. Both were closely related to SPV2 genomes from South Korea, Spain, and the United States. PMID:27257208
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from, or found at different locations in, the Jeju mitochondrial genome were combined. The incorporation of repeats and the overlap of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although a large portion of sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.
The complete genome sequence, occurrence and host range of Tomato mottle mosaic virus Chinese isolate.

PubMed

Li, Yueyue; Wang, Yang; Hu, John; Xiao, Long; Tan, Guanlin; Lan, Pingxiu; Liu, Yong; Li, Fan

2017-01-31

Tomato mottle mosaic virus (ToMMV) is a recently identified species in the genus Tobamovirus and was first reported from a greenhouse tomato sample collected in Mexico in 2013. In August 2013, ToMMV was detected on peppers (Capsicum spp.) in China. However, little is known about the molecular and biological characteristics of ToMMV. Reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE) were carried out to obtain the complete genomic sequences of ToMMV. Sap transmission was used to test the host range and pathogenicity of ToMMV. The full-length genomes of two ToMMV isolates infecting peppers in Yunnan Province and Tibet Autonomous Region of China were determined and analyzed. The complete genomic sequences of both ToMMV isolates consisted of 6399 nucleotides and contained four open reading frames (ORFs) encoding 126, 183, 30 and 18 kDa proteins from the 5' to 3' end, respectively. Overall similarities of the ToMMV genome sequence to those of the other tobamoviruses available in GenBank ranged from 49.6% to 84.3%. Phylogenetic analyses of the full-genome nucleotide sequence and the amino acid sequences of its four proteins confirmed that ToMMV was most closely related to Tomato mosaic virus (ToMV). According to the genetic structure, host of origin and phylogenetic relationships, the 32 available tobamoviruses could be divided into at least eight subgroups based on the host plant family they infect: Solanaceae-, Brassicaceae-, Cactaceae-, Apocynaceae-, Cucurbitaceae-, Malvaceae-, Leguminosae-, and Passifloraceae-infecting subgroups. Testing of solanaceous, cucurbitaceous, brassicaceous and leguminous plants in Yunnan Province and a few other parts of China showed that ToMMV has so far occurred only on peppers. However, the host range test results showed ToMMV could infect most of the tested solanaceous and cruciferous plants, and had a high affinity for the solanaceous plants. The complete nucleotide sequences of two Chinese ToMMV isolates from naturally infected peppers were verified. The tobamoviruses were divided into at least eight subgroups, with ToMMV belonging to the subgroup that infects plants in the Solanaceae. In China, ToMMV has so far occurred only on peppers in the field. ToMMV could infect plants in the families Solanaceae and Cucurbitaceae by sap transmission.
Meta-analysis of human genome-microbiome association studies: the MiBioGen consortium initiative.

PubMed

Wang, Jun; Kurilshikov, Alexander; Radjabzadeh, Djawad; Turpin, Williams; Croitoru, Kenneth; Bonder, Marc Jan; Jackson, Matthew A; Medina-Gomez, Carolina; Frost, Fabian; Homuth, Georg; Rühlemann, Malte; Hughes, David; Kim, Han-Na; Spector, Tim D; Bell, Jordana T; Steves, Claire J; Timpson, Nicolas; Franke, Andre; Wijmenga, Cisca; Meyer, Katie; Kacprowski, Tim; Franke, Lude; Paterson, Andrew D; Raes, Jeroen; Kraaij, Robert; Zhernakova, Alexandra

2018-06-08

In recent years, human microbiota, especially gut microbiota, have emerged as an important yet complex trait influencing human metabolism, immunology, and diseases. Many studies are investigating the forces underlying the observed variation, including the human genetic variants that shape human microbiota. Several preliminary genome-wide association studies (GWAS) have been completed, but more are necessary to achieve a fuller picture. Here, we announce the MiBioGen consortium initiative, which has assembled 18 population-level cohorts and some 19,000 participants. Its aim is to generate new knowledge for the rapidly developing field of microbiota research. Each cohort has surveyed the gut microbiome via 16S rRNA sequencing and genotyped their participants with full-genome SNP arrays. We have standardized the analytical pipelines for both the microbiota phenotypes and genotypes, and all the data have been processed using identical approaches. Our analysis of microbiome composition shows that we can reduce the potential artifacts introduced by technical differences in generating microbiota data. We are now in the process of benchmarking the association tests and performing meta-analyses of genome-wide associations. All pipeline and summary statistics results will be shared using public data repositories. We present the largest consortium to date devoted to microbiota-GWAS. We have adapted our analytical pipelines to suit multi-cohort analyses and expect to gain insight into host-microbiota cross-talk at the genome-wide level. And, as an open consortium, we invite more cohorts to join us (by contacting one of the corresponding authors) and to follow the analytical pipeline we have developed.
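The abstract does not say which statistical model the consortium uses for its meta-analyses. As an illustration only, the Python sketch below shows one common way of combining per-cohort association estimates for a single variant and trait: an inverse-variance weighted fixed-effect meta-analysis. The function name and the example effect sizes and standard errors are hypothetical and are not taken from MiBioGen.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas: per-cohort effect estimates for one variant and one trait
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z score, two-sided normal p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p


# Hypothetical estimates from two cohorts with equal precision
beta, se, z, p = fixed_effect_meta([0.10, 0.20], [0.1, 0.1])
```

A fixed-effect model assumes the same true effect in every cohort; with heterogeneous cohorts, a random-effects model or a sample-size weighted z-score approach may be preferred.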
New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration

PubMed Central

Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

2016-01-01

Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent. PMID:26779141
New Insights into the Classification and Integration Specificity of Streptococcus Integrative Conjugative Elements through Extensive Genome Exploration.

PubMed

Ambroset, Chloé; Coluzzi, Charles; Guédon, Gérard; Devignes, Marie-Dominique; Loux, Valentin; Lacroix, Thomas; Payot, Sophie; Leblond-Bourget, Nathalie

2015-01-01

Recent genome analyses suggest that integrative and conjugative elements (ICEs) are widespread in bacterial genomes and therefore play an essential role in horizontal transfer. However, only a few of these elements are precisely characterized and correctly delineated within sequenced bacterial genomes. Even though previous analysis showed the presence of ICEs in some species of Streptococci, the global prevalence and diversity of ICEs was not analyzed in this genus. In this study, we searched for ICEs in the completely sequenced genomes of 124 strains belonging to 27 streptococcal species. These exhaustive analyses revealed 105 putative ICEs and 26 slightly decayed elements whose limits were assessed and whose insertion site was identified. These ICEs were grouped in seven distinct unrelated or distantly related families, according to their conjugation modules. Integration of these streptococcal ICEs is catalyzed either by a site-specific tyrosine integrase, a low-specificity tyrosine integrase, a site-specific single serine integrase, a triplet of site-specific serine integrases or a DDE transposase. Analysis of their integration site led to the detection of 18 target-genes for streptococcal ICE insertion including eight that had not been identified previously (ftsK, guaA, lysS, mutT, rpmG, rpsI, traG, and ebfC). It also suggests that all specificities have evolved to minimize the impact of the insertion on the host. This overall analysis of streptococcal ICEs emphasizes their prevalence and diversity and demonstrates that exchanges or acquisitions of conjugation and recombination modules are frequent.
Transcontinental Phylogeography of the Daphnia pulex Species Complex

PubMed Central

Costanzo, Katie S.; Taylor, Derek J.

2012-01-01

Daphnia pulex is quickly becoming an attractive model species in the field of ecological genomics due to the recent release of its complete genome sequence, a wide variety of new genomic resources, and a rich history of ecological data. Sequences of the mitochondrial NADH dehydrogenase subunit 5 and cytochrome c oxidase subunit 1 genes were used to assess the global phylogeography of this species, and to further elucidate its phylogenetic relationship to other members of the Daphnia pulex species complex. Using both newly acquired and previously published data, we analyzed 398 individuals from collections spanning five continents. Eleven strongly supported lineages were found within the D. pulex complex, and one lineage in particular, panarctic D. pulex, has very little phylogeographical structure and a near worldwide distribution. Mismatch distribution, haplotype network, and population genetic analyses are compatible with a North American origin for this lineage and subsequent spatial expansion in the Late Pleistocene. In addition, our analyses suggest that dispersal between North and South America of this and other species in the D. pulex complex has occurred multiple times, and is predominantly from north to south. Our results provide additional support for the evolutionary relationships of the eleven main mitochondrial lineages of the D. pulex complex. We found that the well-studied panarctic D. pulex is present on every continent except Australia and Antarctica. Despite being geographically very widespread, there is a lack of strong regionalism in the mitochondrial genomes of panarctic D. pulex – a pattern that differs from that of most studied cladocerans. Moreover, our analyses suggest recent expansion of the panarctic D. pulex lineage, with some continents sharing haplotypes. The hypothesis that hybrid asexuality has contributed to the recent and unusual geographic success of the panarctic D. pulex lineage warrants further study. PMID:23056371
Transcontinental phylogeography of the Daphnia pulex species complex.

PubMed

Crease, Teresa J; Omilian, Angela R; Costanzo, Katie S; Taylor, Derek J

2012-01-01

Daphnia pulex is quickly becoming an attractive model species in the field of ecological genomics due to the recent release of its complete genome sequence, a wide variety of new genomic resources, and a rich history of ecological data. Sequences of the mitochondrial NADH dehydrogenase subunit 5 and cytochrome c oxidase subunit 1 genes were used to assess the global phylogeography of this species, and to further elucidate its phylogenetic relationship to other members of the Daphnia pulex species complex. Using both newly acquired and previously published data, we analyzed 398 individuals from collections spanning five continents. Eleven strongly supported lineages were found within the D. pulex complex, and one lineage in particular, panarctic D. pulex, has very little phylogeographical structure and a near worldwide distribution. Mismatch distribution, haplotype network, and population genetic analyses are compatible with a North American origin for this lineage and subsequent spatial expansion in the Late Pleistocene. In addition, our analyses suggest that dispersal between North and South America of this and other species in the D. pulex complex has occurred multiple times, and is predominantly from north to south. Our results provide additional support for the evolutionary relationships of the eleven main mitochondrial lineages of the D. pulex complex. We found that the well-studied panarctic D. pulex is present on every continent except Australia and Antarctica. Despite being geographically very widespread, there is a lack of strong regionalism in the mitochondrial genomes of panarctic D. pulex--a pattern that differs from that of most studied cladocerans. Moreover, our analyses suggest recent expansion of the panarctic D. pulex lineage, with some continents sharing haplotypes. The hypothesis that hybrid asexuality has contributed to the recent and unusual geographic success of the panarctic D. pulex lineage warrants further study.
Complete genome sequence of Coriobacterium glomerans type strain (PW2T) from the midgut of Pyrrhocoris apterus L. (red soldier bug)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stackebrandt, Erko; Zeytun, Ahmet; Lapidus, Alla L.

2013-01-01

Coriobacterium glomerans Haas and König 1988 is the only species of the genus Coriobacterium, family Coriobacteriaceae, order Coriobacteriales, phylum Actinobacteria. The bacterium thrives as an endosymbiont of pyrrhocorid bugs, i.e. the red fire bug Pyrrhocoris apterus L. The rationale for sequencing the genome of strain PW2T is its endosymbiotic life style, which is rare among members of Actinobacteria. Here we describe the features of this symbiont, together with the complete genome sequence and its annotation. This is the first complete genome sequence of a member of the genus Coriobacterium and the sixth member of the order Coriobacteriales for which complete genome sequences are now available. The 2,115,681 bp long single replicon genome with its 1,804 protein-coding and 54 RNA genes is part of the Genomic Encyclopedia of Bacteria and Archaea project.
Complete genomic sequence of a Tobacco rattle virus isolate from Michigan-grown potatoes.

PubMed

Crosslin, James M; Hamm, Philip B; Kirk, William W; Hammond, Rosemarie W

2010-04-01

Tobacco rattle virus (TRV) causes stem mottle on potato leaves and necrotic arcs and rings in potato tubers, known as corky ringspot disease. Recently, TRV was reported in Michigan potato tubers cv. FL1879 exhibiting corky ringspot disease. Sequence analysis of the RNA-1-encoded 16-kDa gene of the Michigan isolate, designated MI-1, revealed homology to TRV isolates from Florida and Washington. Here, we report the complete genomic sequence of RNA-1 (6,791 nt) and RNA-2 (3,685 nt) of TRV MI-1. RNA-1 is predicted to contain four open reading frames, and the genome structure and phylogenetic analyses of the RNA-1 nucleotide sequence revealed significant homologies to the known sequences of other TRV-1 isolates. The relationships based on the full-length nucleotide sequence were different from those based on the 16-kDa gene encoded on genomic RNA-1 and reflect sequence variation within a 20-25-aa residue region of the 16-kDa protein. MI-1 RNA-2 is predicted to contain three ORFs, encoding the coat protein (CP), a 37.6-kDa protein (ORF 2b), and a 33.6-kDa protein (ORF 2c). In addition, it contains a region of similarity to the 3' terminus of RNA-1, including a truncated portion of the 16-kDa cistron. Phylogenetic analysis of RNA-2, based on a comparison of nucleotide sequences with other members of the genus Tobravirus, indicates that TRV MI-1 and other North American isolates cluster as a distinct group. TRV MI-1 is only the second North American isolate for which there is a complete sequence of the genome, and it is distinct from the North American isolate TRV ORY. The relationship of the TRV MI-1 isolate to other tobravirus isolates is discussed.
Phenotypic and genomic comparison of Mycobacterium aurum and surrogate model species to Mycobacterium tuberculosis: implications for drug discovery.

PubMed

Namouchi, Amine; Cimino, Mena; Favre-Rochex, Sandrine; Charles, Patricia; Gicquel, Brigitte

2017-07-13

Tuberculosis (TB) is caused by Mycobacterium tuberculosis and represents one of the major challenges facing drug discovery initiatives worldwide. The considerable rise in bacterial drug resistance in recent years has led to the need for new drugs and drug regimens. Model systems are regularly used to speed up the drug discovery process and circumvent biosafety issues associated with manipulating M. tuberculosis. These include the use of strains such as Mycobacterium smegmatis and Mycobacterium marinum that can be handled in biosafety level 2 facilities, making high-throughput screening feasible. However, each of these model species has its own limitations. We report and describe the first complete genome sequence of Mycobacterium aurum ATCC23366, an environmental mycobacterium that can also grow in the gut of humans and animals as part of the microbiota. This species shows a comparable resistance profile to that of M. tuberculosis for several anti-TB drugs. The aims of this study were to (i) determine the drug resistance profile of a recently proposed model species, Mycobacterium aurum, strain ATCC23366, for anti-TB drug discovery as well as Mycobacterium smegmatis and Mycobacterium marinum; (ii) sequence and annotate the complete genome of this species using Pacific Biosciences technology; (iii) perform comparative genomics analyses of the various surrogate strains with M. tuberculosis; and (iv) discuss how the choice of the surrogate model used for drug screening can affect the drug discovery process. We describe the complete genome sequence of M. aurum, a surrogate model for anti-tuberculosis drug discovery. Most of the genes already reported to be associated with drug resistance are shared between all the surrogate strains and M. tuberculosis. We consider that M. aurum might be used in high-throughput screening for tuberculosis drug discovery. We also highly recommend the use of different model species during the drug discovery screening process.
Evolution of Modern Birds Revealed by Mitogenomics: Timing the Radiation and Origin of Major Orders

PubMed Central

Pacheco, M. Andreína; Battistuzzi, Fabia U.; Lentino, Miguel; Aguilar, Roberto F.; Kumar, Sudhir; Escalante, Ananias A.

2011-01-01

Mitochondrial (mt) genes and genomes are among the major sources of data for evolutionary studies in birds. This places mitogenomic studies in birds at the core of intense debates in avian evolutionary biology. Indeed, complete mt genomes are actively being used to unveil the phylogenetic relationships among major orders, whereas single genes (e.g., cytochrome c oxidase I [COX1]) are considered standard for species identification and defining species boundaries (DNA barcoding). In this investigation, we study the time of origin and evolutionary relationships among Neoaves orders using complete mt genomes. First, we were able to solve polytomies previously observed at the deep nodes of the Neoaves phylogeny by analyzing 80 mt genomes, including 17 new sequences reported in this investigation. As an example, we found evidence indicating that columbiforms and charadriforms are sister groups. Overall, our analyses indicate that by improving the taxonomic sampling, complete mt genomes can solve the evolutionary relationships among major bird groups. Second, we used our phylogenetic hypotheses to estimate the time of origin of major avian orders as a way to test if their diversification took place prior to the Cretaceous/Tertiary (K/T) boundary. Such timetrees were estimated using several molecular dating approaches and conservative calibration points. Whereas we found time estimates slightly younger than those reported by others, most of the major orders originated prior to the K/T boundary. Finally, we used our timetrees to estimate the rate of evolution of each mt gene. We found great variation in the mutation rates among mt genes and within different bird groups. COX1 was the gene with the least variation among Neoaves orders and the one with the least amount of rate heterogeneity across lineages. Such findings support the choice of COX1 among mt genes as target for developing DNA barcoding approaches in birds. PMID:21242529
GenePattern | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

GenePattern is a genomic analysis platform that provides access to hundreds of tools for the analysis and visualization of multiple data types. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research. A new GenePattern Notebook environment allows users to combine GenePattern analyses with text, graphics, and code to create complete reproducible research narratives.
Complete genome sequence of Parvibaculum lavamentivorans type strain (DS-1(T)).

PubMed

Schleheck, David; Weiss, Michael; Pitluck, Sam; Bruce, David; Land, Miriam L; Han, Shunsheng; Saunders, Elizabeth; Tapia, Roxanne; Detter, Chris; Brettin, Thomas; Han, James; Woyke, Tanja; Goodwin, Lynne; Pennacchio, Len; Nolan, Matt; Cook, Alasdair M; Kjelleberg, Staffan; Thomas, Torsten

2011-12-31

Parvibaculum lavamentivorans DS-1(T) is the type species of the novel genus Parvibaculum in the novel family Rhodobiaceae (formerly Phyllobacteriaceae) of the order Rhizobiales of Alphaproteobacteria. Strain DS-1(T) is a non-pigmented, aerobic, heterotrophic bacterium and represents the first tier member of environmentally important bacterial communities that catalyze the complete degradation of synthetic laundry surfactants. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 3,914,745 bp long genome with its predicted 3,654 protein coding genes is the first completed genome sequence of the genus Parvibaculum, and the first genome sequence of a representative of the family Rhodobiaceae.
Using Partial Genomic Fosmid Libraries for Sequencing Complete Organellar Genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.

2005-08-26

Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

PubMed Central

Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

2014-01-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599

High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

PubMed

Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

2014-07-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
Complete genome sequence of ‘Candidatus Liberibacter africanus’

USDA-ARS's Scientific Manuscript database

The complete genome sequence of ‘Candidatus Liberibacter africanus’ (Laf), strain ptsapsy, was obtained by an Illumina HiSeq 2000. The Laf genome comprises 1,192,232 nucleotides, 34.5% GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S and 5S) ...
Complete genome sequence of Salmonella enterica subsp. enterica Serovar Thompson Strain RM6836

USDA-ARS's Scientific Manuscript database

Salmonella enterica subsp. enterica serovar Thompson (S. Thompson) strain RM6836 was isolated from lettuce in 2002. We report the complete sequence and annotation of the genome of S. Thompson strain RM6836. This is the first reported complete genome sequence for S. Thompson and will provide a point ...
Complete genome sequence of the clinical Campylobacter coli isolate 15-537360

USDA-ARS's Scientific Manuscript database

Campylobacter coli strain 15-537360 was originally isolated from a 42 year-old patient with gastroenteritis. Here we report its complete genome sequence, which comprises a 1.7 Mbp chromosome and a 29 kbp conjugative cryptic plasmid. This is the first complete genome sequence of a clinical isolate of...
DOE Office of Scientific and Technical Information (OSTI.GOV)

Utturkar, Sagar M.; Cude, W. Nathan; Robeson, Jr., Michael S.

Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. In this paper, we present a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Finally, comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.
An automated system for evaluation of the potential functionome: MAPLE version 2.1.0

PubMed Central

Takami, Hideto; Taniguchi, Takeaki; Arai, Wataru; Takemoto, Kazuhiro; Moriya, Yuki; Goto, Susumu

2016-01-01

Metabolic and physiological potential evaluator (MAPLE) is an automatic system that can perform a series of steps used in the evaluation of potential comprehensive functions (functionome) harboured in the genome and metagenome. MAPLE first assigns KEGG Orthology (KO) to the query gene, maps the KO-assigned genes to the Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules, and then calculates the module completion ratio (MCR) of each functional module to characterize the potential functionome in the user’s own genomic and metagenomic data. In this study, we added two more useful functions to calculate module abundance and Q-value, which indicate the functional abundance and statistical significance of the MCR results, respectively, to the new version of MAPLE for more detailed comparative genomic and metagenomic analyses. Consequently, MAPLE version 2.1.0 reported significant differences in the potential functionome, functional abundance, and diversity of contributors to each function among four metagenomic datasets generated by the global ocean sampling expedition, one of the most popular environmental samples to use with this system. MAPLE version 2.1.0 is now available through the web interface (http://www.genome.jp/tools/maple/; date last accessed 17 June 2016). PMID:27374611
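The module completion ratio described in the record above can be illustrated with a much-simplified sketch. It assumes that a module is a flat list of steps, each fillable by any one of several alternative KOs, and that the ratio is the percentage of steps filled; the real KEGG module definitions are nested logical expressions, and the KO labels and the function name below are hypothetical, not MAPLE's actual implementation.

```python
from typing import Iterable, Sequence, Set


def module_completion_ratio(steps: Sequence[Set[str]], genome_kos: Iterable[str]) -> float:
    """Return the percentage of module steps filled by at least one KO.

    Simplifying assumption: each step is a set of alternative KOs and a
    step counts as filled if any one of them was assigned to the genome.
    """
    if not steps:
        raise ValueError("module has no steps")
    present = set(genome_kos)
    filled = sum(1 for alternatives in steps if alternatives & present)
    return 100.0 * filled / len(steps)


# Hypothetical module with four steps; the second step has two alternative KOs.
module = [{"KO_A"}, {"KO_B", "KO_C"}, {"KO_D"}, {"KO_E"}]
# Hypothetical KO assignments for one genome (KO_D is missing).
genome = ["KO_A", "KO_C", "KO_E", "KO_X"]

mcr = module_completion_ratio(module, genome)
print(f"MCR = {mcr:.1f}%")
```

Under these assumptions three of the four steps are filled, so the module is reported as 75% complete; the module abundance and Q-value mentioned in the abstract are not modelled here.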
Next Generation Sequencing of Actinobacteria for the Discovery of Novel Natural Products

PubMed Central

Gomez-Escribano, Juan Pablo; Alt, Silke; Bibb, Mervyn J.

2016-01-01

Like many fields of the biosciences, actinomycete natural products research has been revolutionised by next-generation DNA sequencing (NGS). Hundreds of new genome sequences from actinobacteria are made public every year, many of them as a result of projects aimed at identifying new natural products and their biosynthetic pathways through genome mining. Advances in these technologies in the last five years have meant not only a reduction in the cost of whole genome sequencing, but also a substantial increase in the quality of the data, having moved from obtaining a draft genome sequence comprised of several hundred short contigs, sometimes of doubtful reliability, to the possibility of obtaining an almost complete and accurate chromosome sequence in a single contig, allowing a detailed study of gene clusters and the design of strategies for refactoring and full gene cluster synthesis. The impact that these technologies are having in the discovery and study of natural products from actinobacteria, including those from the marine environment, is only starting to be realised. In this review we provide a historical perspective of the field, analyse the strengths and limitations of the most relevant technologies, and share the insights acquired during our genome mining projects. PMID:27089350
Plastomes of the green algae Hydrodictyon reticulatum and Pediastrum duplex (Sphaeropleales, Chlorophyceae).

PubMed

McManus, Hilary A; Sanchez, Daniel J; Karol, Kenneth G

2017-01-01

Comparative studies of chloroplast genomes (plastomes) across the Chlorophyceae are revealing dynamic patterns of size variation, gene content, and genome rearrangements. Phylogenomic analyses are improving resolution of relationships, and uncovering novel lineages as new plastomes continue to be characterized. To gain further insight into the evolution of the chlorophyte plastome and increase the number of representative plastomes for the Sphaeropleales, this study presents two fully sequenced plastomes from the green algal family Hydrodictyaceae (Sphaeropleales, Chlorophyceae), one from Hydrodictyon reticulatum and the other from Pediastrum duplex. Genomic DNA from Hydrodictyon reticulatum and Pediastrum duplex was subjected to Illumina paired-end sequencing and the complete plastomes were assembled for each. Plastome size and gene content were characterized and compared with other plastomes from the Sphaeropleales. Homology searches using BLASTX were used to characterize introns and open reading frames (orfs) ≥ 300 bp. A phylogenetic analysis of gene order across the Sphaeropleales was performed. The plastome of Hydrodictyon reticulatum is 225,641 bp and that of Pediastrum duplex is 232,554 bp. The plastome structure and gene order of H. reticulatum and P. duplex are more similar to each other than to other members of the Sphaeropleales. Numerous unique open reading frames are found in both plastomes and the plastome of P. duplex contains putative viral protein genes, not found in other Sphaeropleales plastomes. Gene order analyses support the monophyly of the Hydrodictyaceae and their sister relationship to the Neochloridaceae. The complete plastomes of Hydrodictyon reticulatum and Pediastrum duplex, representing the largest of the Sphaeropleales sequenced thus far, once again highlight the variability in size, architecture, gene order and content across the Chlorophyceae. Novel intron insertion sites and unique orfs indicate recent, independent invasions into each plastome, a hypothesis testable with an expanded plastome investigation within the Hydrodictyaceae.
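The ≥ 300 bp open reading frame cutoff mentioned in the record above can be illustrated with a minimal sketch. The abstract does not say how ORFs were called, so the definition used here (ATG to the first in-frame TAA, TAG or TGA, forward strand only, length counted including the stop codon) and the function name are assumptions for illustration; the BLASTX homology searches themselves are not reproduced.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_len: int = 300) -> list[tuple[int, int]]:
    """Return (start, end) positions of forward-strand ORFs of at least min_len bp.

    Assumed definition: an ORF runs from an ATG to the first in-frame stop
    codon; positions are 0-based and end-exclusive, stop codon included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a new ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_len:
                    orfs.append((start, end))
                start = None  # close the ORF and look for the next start
    return orfs


# Synthetic sequence: one 306 bp ORF followed by a 21 bp ORF that is filtered out.
example = "CC" + "ATG" + "GCT" * 100 + "TAA" + "ATG" + "GCT" * 5 + "TAG"
print(find_orfs(example))
```

A fuller version would also scan the reverse complement and use the genetic code appropriate to the organelle; both are left out to keep the sketch short.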
Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

NASA Astrophysics Data System (ADS)

Kikuchi, Shoshi

2009-02-01

Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
Complete genome sequence of the hippuricase-positive Campylobacter avium type strain LMG 24591

USDA-ARS's Scientific Manuscript database

Campylobacter avium is a hippurate-positive, thermotolerant campylobacter that has been isolated from poultry. Here we present the genome sequences of two C. avium strains isolated from broiler chickens: strains LMG 24591T (complete genome) and LMG 24592 (draft genome). The C. avium type strain geno...
The complete mitochondrial genome of a stonefly species, Togoperla sp. (Plecoptera: Perlidae).

PubMed

Wang, Kai; Wang, Yuyu; Yang, Ding

2016-05-01

The complete mitochondrial (mt) genome of a stonefly species, Togoperla sp. (Plecoptera: Perlidae), was sequenced. The 15,723 bp long genome has the standard metazoan complement of 37 genes and an A+T-rich region, which is the same as the insect ancestral genome arrangement.
Molecular Phylogenetic and Expression Analysis of the Complete WRKY Transcription Factor Family in Maize

PubMed Central

Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

2012-01-01

The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance. PMID:22279089
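The stability criterion in the abstract above (a coefficient of variation below 15%) can be illustrated with a minimal sketch. The function names, the example expression values, and the use of the population standard deviation are assumptions made for illustration; the abstract does not say how the CV was computed.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the CV of a list of expression values, as a percentage.

    Assumption: population standard deviation divided by the mean.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / m * 100.0


def is_stably_expressed(values, threshold=15.0):
    """True when the CV falls below the threshold (15% in the abstract)."""
    return coefficient_of_variation(values) < threshold


# Hypothetical expression values across tissues (not data from the study).
flat_profile = [100.0, 105.0, 95.0, 102.0]
variable_profile = [1.0, 10.0, 100.0]

print(is_stably_expressed(flat_profile))
print(is_stably_expressed(variable_profile))
```

Transcripts that fail the check would be the candidates for the organ- or tissue-specific follow-up that the abstract mentions.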
Molecular phylogenetic and expression analysis of the complete WRKY transcription factor family in maize.

PubMed

Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin

2012-04-01

The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of <15%. The remaining 29 transcripts produced by 25 WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance.
Complete genome of Cobetia marina JCM 21022T and phylogenomic analysis of the family Halomonadaceae

NASA Astrophysics Data System (ADS)

Tang, Xianghai; Xu, Kuipeng; Han, Xiaojuan; Mo, Zhaolan; Mao, Yunxiang

2018-03-01

Cobetia marina is a model proteobacterium in research on marine biofouling. Its taxonomic nomenclature has been revised many times over the past few decades. To better understand the role of the surface-associated lifestyle of C. marina and the phylogeny of the family Halomonadaceae, we sequenced the entire genome of C. marina JCM 21022T using single molecule real-time sequencing technology (SMRT) and performed comparative genomics and phylogenomics analyses. The circular chromosome was 4 176 300 bp with an average GC content of 62.44% and contained 3 611 predicted coding sequences, 72 tRNA genes, and 21 rRNA genes. The C. marina JCM 21022T genome contained a set of crucial genes involved in surface colonization processes. The comparative genome analysis indicated that the significant differences between C. marina JCM 21022T and Cobetia amphilecti KMM 296 (formerly named C. marina KMM 296) resulted from sequence insertions or deletions and chromosomal recombination. Despite these differences, pan and core genome analysis showed similar gene functions between the two strains. The phylogenomic study of the family Halomonadaceae is reported here for the first time. We found that the relationships were well resolved among all genera tested, including Chromohalobacter, Halomonas, Cobetia, Kushneria, Zymobacter, and Halotalea.
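The average GC content reported above is the percentage of G and C bases among the called bases of the sequence. A minimal sketch, assuming a plain nucleotide string as input (the function name and the toy sequences are hypothetical and not taken from the study):

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N and gap characters are ignored.
    """
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in called if base in "GC")
    return 100.0 * gc / len(called)


# Toy sequence for illustration only.
print(round(gc_content("GGCCAT"), 2))
```

For a whole chromosome, the same calculation would be applied to the full assembled sequence read from a FASTA file.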
Genome-centric resolution of microbial diversity, metabolism and interactions in anaerobic digestion.

PubMed

Vanwonterghem, Inka; Jensen, Paul D; Rabaey, Korneel; Tyson, Gene W

2016-09-01

Our understanding of the complex interconnected processes performed by microbial communities is hindered by our inability to culture the vast majority of microorganisms. Metagenomics provides a way to bypass this cultivation bottleneck and recent advances in this field now allow us to recover a growing number of genomes representing previously uncultured populations from increasingly complex environments. In this study, a temporal genome-centric metagenomic analysis was performed of lab-scale anaerobic digesters that host complex microbial communities fulfilling a series of interlinked metabolic processes to enable the conversion of cellulose to methane. In total, 101 population genomes that were moderate to near-complete were recovered based primarily on differential coverage binning. These populations span 19 phyla, represent mostly novel species and expand the genomic coverage of several rare phyla. Classification into functional guilds based on their metabolic potential revealed metabolic networks with a high level of functional redundancy as well as niche specialization, and allowed us to identify potential roles such as hydrolytic specialists for several rare, uncultured populations. Genome-centric analyses of complex microbial communities across diverse environments provide the key to understanding the phylogenetic and metabolic diversity of these interactive communities. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Complete genome of Cobetia marina JCM 21022T and phylogenomic analysis of the family Halomonadaceae

NASA Astrophysics Data System (ADS)

Tang, Xianghai; Xu, Kuipeng; Han, Xiaojuan; Mo, Zhaolan; Mao, Yunxiang

2016-09-01

Cobetia marina is a model proteobacterium in research on marine biofouling. Its taxonomic nomenclature has been revised many times over the past few decades. To better understand the role of the surface-associated lifestyle of C. marina and the phylogeny of the family Halomonadaceae, we sequenced the entire genome of C. marina JCM 21022T using single molecule real-time sequencing technology (SMRT) and performed comparative genomics and phylogenomics analyses. The circular chromosome was 4 176 300 bp with an average GC content of 62.44% and contained 3 611 predicted coding sequences, 72 tRNA genes, and 21 rRNA genes. The C. marina JCM 21022T genome contained a set of crucial genes involved in surface colonization processes. The comparative genome analysis indicated that the significant differences between C. marina JCM 21022T and Cobetia amphilecti KMM 296 (formerly named C. marina KMM 296) resulted from sequence insertions or deletions and chromosomal recombination. Despite these differences, pan and core genome analysis showed similar gene functions between the two strains. The phylogenomic study of the family Halomonadaceae is reported here for the first time. We found that the relationships were well resolved among all genera tested, including Chromohalobacter, Halomonas, Cobetia, Kushneria, Zymobacter, and Halotalea.
The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution.

PubMed

Baniaga, Anthony E; Arrigo, Nils; Barker, Michael S

2016-06-03

The haploid nuclear genome size (1C DNA) of vascular land plants varies over several orders of magnitude. Much of this observed diversity in genome size is due to the proliferation and deletion of transposable elements. To date, all vascular land plant lineages with extremely small nuclear genomes represent recently derived states, having ancestors with much larger genome sizes. The Selaginellaceae represent an ancient lineage with extremely small genomes. It is unclear how small nuclear genomes evolved in Selaginella. We compared the rates of nuclear genome size evolution in Selaginella and major vascular plant clades in a comparative phylogenetic framework. For the analyses, we collected 29 new flow cytometry estimates of haploid genome size in Selaginella to augment publicly available data. Selaginella possess some of the smallest known haploid nuclear genome sizes, as well as the lowest rate of genome size evolution observed across all vascular land plants included in our analyses. Additionally, our analyses provide strong support for a history of haploid nuclear genome size stasis in Selaginella. Our results indicate that Selaginella, similar to other early diverging lineages of vascular land plants, has relatively low rates of genome size evolution. Further, our analyses highlight that a rapid transition to a small genome size is only one route to an extremely small genome. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genomic and Systems Biology Analyses of Social Behavior or Evolutionary Genomic Analyses of Insect Society: Eat, Drink, and Be Scary (2011 JGI User Meeting)

ScienceCinema

Robinson, Gene

2018-02-05

The U.S. Department of Energy Joint Genome Institute (JGI) invited scientists interested in the application of genomics to bioenergy and environmental issues, as well as all current and prospective users and collaborators, to attend the annual DOE JGI Genomics of Energy & Environment Meeting held March 22-24, 2011 in Walnut Creek, CA. The emphasis of this meeting was on the genomics of renewable energy strategies, carbon cycling, environmental gene discovery, and engineering of fuel-producing organisms. The meeting features presentations by leading scientists advancing these topics. Gene Robinson of the University of Illinois speaks on Genomic and Systems Biology Analyses of Social Behavior at the 6th Annual Genomics of Energy & Environment Meeting on March 23, 2011.
Revisiting the taxonomical classification of Porcine Circovirus type 2 (PCV2): still a real challenge.

PubMed

Franzo, Giovanni; Cortey, Martí; Olvera, Alex; Novosel, Dinko; Castro, Alessandra Marnie Martins Gomes De; Biagini, Philippe; Segalés, Joaquim; Drigo, Michele

2015-08-28

PCV2 has emerged as one of the most devastating viral infections of swine farming, causing a relevant economic impact due to direct losses and control strategies expenses. Epidemiological and experimental studies have evidenced that genetic diversity is potentially affecting the virulence of PCV2. The growing number of PCV2 complete genomes and partial sequences available at GenBank questioned the accepted PCV2 classification. Nine hundred seventy five PCV2 complete genomes and 1,270 ORF2 sequences available from GenBank were subjected to recombination, PASC and phylogenetic analyses and results were used for comparison with previous classification scheme. The outcome of these analyses favors the recognition of four genotypes on the basis of ORF2 sequences, namely PCV2a, PCV2b, PCV2c and PCV2d-mPCV2b. To deal with the difficulty of establishing an unambiguous classification and accounting for the impossibility of defining a p-distance cut-off, a set of reference sequences that could be used in further phylogenetic studies for PCV2 genotyping was established. Being aware that extensive phylogenetic analyses are time-consuming and often impracticable during routine diagnostic activity, ORF2 nucleotide positions adequately conserved in the reference sequences were identified and reported to allow a quick genotype differentiation. Globally, the present work provides an updated scenario of PCV2 genotypes distribution and, based on the limits of the previous classification criteria, proposes new rapid and effective schemes for differentiating the four defined PCV2 genotypes.
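The p-distance mentioned above is the proportion of aligned nucleotide sites at which two sequences differ. A minimal sketch, assuming two pre-aligned sequences of equal length (the function name, the handling of gaps and ambiguous bases, and the toy sequences are assumptions for illustration, not the authors' pipeline):

```python
def p_distance(seq_a, seq_b):
    """Return the uncorrected p-distance between two aligned sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment for illustration only: one mismatch in eight sites.
print(p_distance("ACGTACGT", "ACGTACGA"))
```

A genotype cut-off would be a fixed value of this quantity separating within-genotype from between-genotype pairs. The abstract reports that no such value could be defined for PCV2, which is why reference sequences are proposed instead.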
Genome size analyses of Pucciniales reveal the largest fungal genomes.

PubMed

Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

2014-01-01

Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

Genome size analyses of Pucciniales reveal the largest fungal genomes

PubMed Central

Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G.; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T.; Loureiro, João; Talhinhas, Pedro

2014-01-01

Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research. PMID:25206357
The Complete Sequence of a Human Parainfluenzavirus 4 Genome

PubMed Central

Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

2009-01-01

Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536
Analysis of an RNA-seq Strand-Specific Library from an East Timorese Cucumber Sample Reveals a Complete Cucurbit aphid-borne yellows virus Genome

PubMed Central

Maina, Solomon; Edwards, Owain R.; de Almeida, Luis; Ximenes, Abel

2017-01-01

Analysis of an RNA-seq library from cucumber leaf RNA extracted from a fast technology for analysis of nucleic acids (FTA) card revealed the first complete genome of Cucurbit aphid-borne yellows virus (CABYV) from East Timor. We compare it with 35 complete CABYV genomes from other world regions. It most resembled the genome of the South Korean isolate HD118. PMID:28495776
The Genome Sequence of the Rumen Methanogen Methanobrevibacter ruminantium Reveals New Possibilities for Controlling Ruminant Methane Emissions

PubMed Central

Leahy, Sinead C.; Kelly, William J.; Altermann, Eric; Ronimus, Ron S.; Yeoman, Carl J.; Pacheco, Diana M.; Li, Dong; Kong, Zhanhao; McTavish, Sharla; Sang, Carrie; Lambie, Suzanne C.; Janssen, Peter H.; Dey, Debjit; Attwood, Graeme T.

2010-01-01

Background: Methane (CH4) is a potent greenhouse gas (GHG), having a global warming potential 21 times that of carbon dioxide (CO2). Methane emissions from agriculture represent around 40% of the emissions produced by human-related activities, the single largest source being enteric fermentation, mainly in ruminant livestock. Technologies to reduce these emissions are lacking. Ruminant methane is formed by the action of methanogenic archaea typified by Methanobrevibacter ruminantium, which is present in ruminants fed a wide variety of diets worldwide. To gain more insight into the lifestyle of a rumen methanogen, and to identify genes and proteins that can be targeted to reduce methane production, we have sequenced the 2.93 Mb genome of M. ruminantium M1, the first rumen methanogen genome to be completed. Methodology/Principal Findings: The M1 genome was sequenced, annotated and subjected to comparative genomic and metabolic pathway analyses. Conserved and methanogen-specific gene sets suitable as targets for vaccine development or chemogenomic-based inhibition of rumen methanogens were identified. The feasibility of using a synthetic peptide-directed vaccinology approach to target epitopes of methanogen surface proteins was demonstrated. A prophage genome was described and its lytic enzyme, endoisopeptidase PeiR, was shown to lyse M1 cells in pure culture. A predicted stimulation of M1 growth by alcohols was demonstrated and microarray analyses indicated up-regulation of methanogenesis genes during co-culture with a hydrogen (H2) producing rumen bacterium. We also report the discovery of non-ribosomal peptide synthetases in M. ruminantium M1, the first reported in archaeal species. Conclusions/Significance: The M1 genome sequence provides new insights into the lifestyle and cellular processes of this important rumen methanogen. It also defines vaccine and chemogenomic targets for broad inhibition of rumen methanogens and represents a significant contribution to worldwide efforts to mitigate ruminant methane emissions and reduce production of anthropogenic greenhouse gases. PMID:20126622
A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

PubMed

Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

2008-08-01

In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus), with particular emphasis on the control region (CR), and compare it with other mitochondrial genomes to examine molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding genes, two rRNA genes, and 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR were different among the bear species, and their copy numbers were also variable according to populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for the discrimination of the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discrimination.
Do cryptic species exist in Hoplobatrachus rugulosus? An examination using four nuclear genes, the cyt b gene and the complete MT genome.

PubMed

Yu, Danna; Zhang, Jiayong; Li, Peng; Zheng, Rongquan; Shao, Chen

2015-01-01

The Chinese tiger frog Hoplobatrachus rugulosus is widely distributed in southern China, Malaysia, Myanmar, Thailand, and Vietnam. It is listed in Appendix II of CITES as the only Class II nationally-protected frog in China. The bred tiger frog, known as the Thailand tiger frog, is also identified as H. rugulosus. Our analysis of the Cyt b gene showed high genetic divergence (13.8%) between wild and bred samples of tiger frog. Unexpected genetic divergence of the complete mt genome (14.0%) was also observed between wild and bred samples of tiger frog. Yet, the nuclear genes (NCX1, Rag1, Rhod, Tyr) showed little divergence between them. Despite this and their very similar morphology, the features of the mitochondrial genome including genetic divergence of other genes, different three-dimensional structures of ND5 proteins, and gene rearrangements indicate that H. rugulosus may be a cryptic species complex. Using Bayesian inference, maximum likelihood, and maximum parsimony analyses, Hoplobatrachus was resolved as a sister clade to Euphlyctis, and H. rugulosus (BT) as a sister clade to H. rugulosus (WT). We suggest that we should prevent Thailand tiger frogs (bred type) from escaping into wild environments lest they produce hybrids with Chinese tiger frogs (wild type).
Do Cryptic Species Exist in Hoplobatrachus rugulosus? An Examination Using Four Nuclear Genes, the Cyt b Gene and the Complete MT Genome

PubMed Central

Li, Peng; Zheng, Rongquan; Shao, Chen

2015-01-01

The Chinese tiger frog Hoplobatrachus rugulosus is widely distributed in southern China, Malaysia, Myanmar, Thailand, and Vietnam. It is listed in Appendix II of CITES as the only Class II nationally-protected frog in China. The bred tiger frog, known as the Thailand tiger frog, is also identified as H. rugulosus. Our analysis of the Cyt b gene showed high genetic divergence (13.8%) between wild and bred samples of tiger frog. Unexpected genetic divergence of the complete mt genome (14.0%) was also observed between wild and bred samples of tiger frog. Yet, the nuclear genes (NCX1, Rag1, Rhod, Tyr) showed little divergence between them. Despite this and their very similar morphology, the features of the mitochondrial genome including genetic divergence of other genes, different three-dimensional structures of ND5 proteins, and gene rearrangements indicate that H. rugulosus may be a cryptic species complex. Using Bayesian inference, maximum likelihood, and maximum parsimony analyses, Hoplobatrachus was resolved as a sister clade to Euphlyctis, and H. rugulosus (BT) as a sister clade to H. rugulosus (WT). We suggest that we should prevent Thailand tiger frogs (bred type) from escaping into wild environments lest they produce hybrids with Chinese tiger frogs (wild type). PMID:25875761
DNA repair in Chromobacterium violaceum.

PubMed

Duarte, Fábio Teixeira; Carvalho, Fabíola Marques de; Bezerra e Silva, Uaska; Scortecci, Kátia Castanho; Blaha, Carlos Alfredo Galindo; Agnez-Lima, Lucymara Fassarella; Batistuzzo de Medeiros, Silvia Regina

2004-03-31

Chromobacterium violaceum is a Gram-negative beta-proteobacterium that inhabits a variety of ecosystems in tropical and subtropical regions, including the water and banks of the Negro River in the Brazilian Amazon. This bacterium has been the subject of extensive study over the last three decades, due to its biotechnological properties, including the characteristic violacein pigment, which has antimicrobial and anti-tumoral activities. C. violaceum promotes the solubilization of gold in a mercury-free process, and has been used in the synthesis of homopolyesters suitable for the production of biodegradable polymers. The genome sequence of this organism has been completed by the Brazilian National Genome Project Consortium. The aim of our group was to study the DNA repair genes in this organism, due to their importance in the maintenance of genomic integrity. We identified DNA repair genes involved in different pathways in C. violaceum through a similarity search against known sequences deposited in databases. The phylogenetic analyses were done using programs of the PHYLIP package. This analysis revealed various metabolic pathways, including photoreactivation, base excision repair, nucleotide excision repair, mismatch repair, recombinational repair, and the SOS system. The similarity between the C. violaceum sequences and those of Neisseria meningitidis and Ralstonia solanacearum was greater than that between the C. violaceum and Escherichia coli sequences. The peculiarities found in the C. violaceum genome were the absence of LexA, some horizontal transfer events and a large number of repair genes involved with alkyl and oxidative DNA damage.
Gene characteristics of the complete mitochondrial genomes of Paratoxodera polyacantha and Toxodera hauseri (Mantodea: Toxoderidae)

PubMed Central

Zhang, Le-Ping; Cai, Yin-Yin; Yu, Dan-Na; Storey, Kenneth B.

2018-01-01

The family Toxoderidae (Mantodea) contains an ecologically diverse group of praying mantis species that have in common greatly elongated bodies. In this study, we sequenced and compared the complete mitochondrial genomes of two Toxoderidae species, Paratoxodera polyacantha and Toxodera hauseri, and compared their mitochondrial genome characteristics with another member of the Toxoderidae, Stenotoxodera porioni (KY689118). The lengths of the mitogenomes of T. hauseri and P. polyacantha were 15,616 bp and 15,999 bp, respectively, which is similar to that of S. porioni (15,846 bp). The size of each gene as well as the A+T-rich region and the A+T content of the whole genome were also very similar among the three species as were the protein-coding genes, the A+T content and the codon usages. The mitogenome of T. hauseri had the typical 22 tRNAs, whereas that of P. polyacantha had 26 tRNAs including an extra two copies of trnA-trnR. Intergenic regions of 67 bp and 76 bp were found in T. hauseri and P. polyacantha, respectively, between COX2 and trnK; these can be explained as residues of a tandem duplication/random loss of trnK and trnD. This non-coding region may be synapomorphic for Toxoderidae. In BI and ML analyses, the monophyly of Toxoderidae was supported and P. polyacantha was the sister clade to T. hauseri and S. porioni. PMID:29686943
Whole genome characterization of a novel porcine reproductive and respiratory syndrome virus 1 isolate: Genetic evidence for recombination between Amervac vaccine and circulating strains in mainland China.

PubMed

Chen, Nanhua; Liu, Qiaorong; Qiao, Mingming; Deng, Xiaoyu; Chen, Xizhao; Sun, Ming

2017-10-01

Genotype 1 porcine reproductive and respiratory syndrome virus (PRRSV 1) has been continuously isolated in China in recent years. Complete genome sequences of these isolates are important to investigate the prevalence and evolution of Chinese PRRSV 1. Herein, we describe the isolation of a novel PRRSV 1 isolate, denominated HLJB1, in the Heilongjiang province of China. Complete genome sequencing of HLJB1 showed that it shares 90.66% and 58.21% nucleotide identities with PRRSV 1 and 2 prototypic strains Lelystad virus and ATCC VR-2332, respectively. HLJB1 has a unique 5-amino-acid insertion in nsp2, which has never been described in other PRRSV 1 isolates. Whole genome-based phylogenetic analysis revealed that all Chinese PRRSV 1 isolates are clustered in pan-European subtype 1 and can be divided into four subgroups. HLJB1 resides in the subgroup of BJEU06-1-like isolates but is also closely related to the Amervac-like isolates. Additionally, recombination analyses suggested that HLJB1 is a recombinant from the Amervac vaccine and the BJEU06-1 isolate. To the best of our knowledge, our results provide the first genetic evidence for recombination between Amervac vaccine and circulating strains. These findings are also beneficial for studying the origin and evolution of PRRSV 1 in China. Copyright © 2017. Published by Elsevier B.V.
Complete genome sequence of Lactobacillus heilongjiangensis DSM 28069(T): Insight into its probiotic potential.

PubMed

Zheng, Beiwen; Jiang, Xiawei; Cheng, Hong; Xu, Zemin; Li, Ang; Hu, Xinjun; Xiao, Yonghong

2015-12-20

Lactobacillus heilongjiangensis DSM 28069(T) is a potential probiotic isolated from traditional Chinese pickle. Here we report the complete genome sequence of this strain. The complete genome is 2,790,548 bp with a GC content of 37.5% and is devoid of plasmids. Sets of genes involved in the biosynthesis of riboflavin and folate were identified in the genome, which revealed its potential application in the biotechnology industry. The genome sequence of L. heilongjiangensis DSM 28069(T) now provides fundamental information for future studies. Copyright © 2015 Elsevier B.V. All rights reserved.
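GC content, as reported in the abstract above, is the percentage of G and C bases among the bases of a sequence. A minimal sketch of that calculation follows; the function name gc_content is hypothetical, and the choice to count only unambiguous A, C, G and T bases in the denominator (ignoring N and other ambiguity codes) is an assumption, not something stated in the source.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Assumption (not from the source): ambiguity codes such as N are
    excluded from both numerator and denominator.
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


if __name__ == "__main__":
    # Toy sequence for illustration only; not taken from the genome above.
    print(gc_content("ATGC"))
```

Applied to a complete genome, the same function would be called on the concatenated sequence read from a FASTA file.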
Complete genome sequence of Streptosporangium roseum type strain (NI 9100T)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nolan, Matt; Sikorski, Johannes; Jando, Marlen

2010-01-01

Streptosporangium roseum Crauch 1955 is the type strain of the species which is the type species of the genus Streptosporangium. The pinkish coiled Streptomyces-like organism with a spore case was isolated from vegetable garden soil in 1955. Here we describe the features of this organism, together with the complete genome sequence and annotation. This is the first completed genome sequence of a member of the family Streptosporangiaceae, and the second largest microbial genome sequence ever deciphered. The 10,369,518 bp long genome with its 9421 protein-coding and 80 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
Analysis of bacterial genomes from an evolution experiment with horizontal gene transfer shows that recombination can sometimes overwhelm selection

PubMed Central

2018-01-01

Few experimental studies have examined the role that sexual recombination plays in bacterial evolution, including the effects of horizontal gene transfer on genome structure. To address this limitation, we analyzed genomes from an experiment in which Escherichia coli K-12 Hfr (high frequency recombination) donors were periodically introduced into 12 evolving populations of E. coli B and allowed to conjugate repeatedly over the course of 1000 generations. Previous analyses of the evolved strains from this experiment showed that recombination did not accelerate adaptation, despite increasing genetic variation relative to asexual controls. However, the resolution in that previous work was limited to only a few genetic markers. We sought to clarify and understand these puzzling results by sequencing complete genomes from each population. The effects of recombination were highly variable: one lineage was mostly derived from the donors, while another acquired almost no donor DNA. In most lineages, some regions showed repeated introgression and others almost none. Regions with high introgression tended to be near the donors’ origin of transfer sites. To determine whether introgressed alleles imposed a genetic load, we extended the experiment for 200 generations without recombination and sequenced whole-population samples. Beneficial alleles in the recipient populations were occasionally driven extinct by maladaptive donor-derived alleles. On balance, our analyses indicate that the plasmid-mediated recombination was sufficiently frequent to drive donor alleles to fixation without providing much, if any, selective advantage. PMID:29385126
Complete genome sequence of Chinese strain of ‘Candidatus Liberibacter asiaticus’

USDA-ARS's Scientific Manuscript database

The complete genome sequence of ‘Candidatus Liberibacter asiaticus’ strain (Las) Guangxi-1(GX-1) was obtained by an Illumina HiSeq 2000. The GX-1 genome comprises 1,268,237 nucleotides, 36.5 % GC content, 1,141 predicted coding sequences, 44 tRNAs, 3 complete copies of ribosomal RNA genes (16S, 23S ...
Complete genome sequence of a novel aquareovirus that infects the endangered fountain darter, Etheostoma fonticola

USGS Publications Warehouse

Iwanowicz, Luke R.; Iwanowicz, Deborah; Adams, Cynthia; Lewis, Teresa D.; Brandt, Thomas M.; Cornman, Robert S.; Sanders, Lakyn R.

2016-01-01

Here, we report the complete genome of a novel aquareovirus isolated from clinically normal fountain darters, Etheostoma fonticola, inhabiting the San Marcos River, Texas, USA. The complete genome comprises 23,958 bp in 11 segments that range from 783 bp (S11) to 3,866 bp (S1).
Complete Genome Sequence of a Novel Aquareovirus That Infects the Endangered Fountain Darter, Etheostoma fonticola

PubMed Central

Adams, Cynthia R.; Lewis, Teresa D.; Brandt, Thomas M.; Sanders, Lakyn

2016-01-01

Here, we report the complete genome of a novel aquareovirus isolated from clinically normal fountain darters, Etheostoma fonticola, inhabiting the San Marcos River, Texas, USA. The complete genome comprises 23,958 bp in 11 segments that range from 783 bp (S11) to 3,866 bp (S1). PMID:28007856
Genomic Diversity and Evolution of the Lyssaviruses

PubMed Central

Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

2008-01-01

Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239
Draft Nuclear Genome, Complete Chloroplast Genome, and Complete Mitochondrial Genome for the Biofuel/Bioproduct Feedstock Species Scenedesmus obliquus Strain DOE0152z

DOE Office of Scientific and Technical Information (OSTI.GOV)

Starkenburg, S. R.; Polle, J. E. W.; Hovde, B.

The green alga Scenedesmus obliquus is an emerging platform species for the industrial production of biofuels. Here, we report the draft assembly and annotation for the nuclear, plastid, and mitochondrial genomes of S. obliquus strain DOE0152z.
Molecular characterization of the complete genome of falconid herpesvirus strain S-18

USDA-ARS's Scientific Manuscript database

Falconid herpesvirus type 1 (FHV-1) is the causative agent of falcon inclusion body disease, an acute, highly contagious disease of raptors. The complete nucleotide sequence of the genome of FHV-1 has been determined. The genome is arranged as a D-type genome with large inverted repeats flanking a ...
Draft Nuclear Genome, Complete Chloroplast Genome, and Complete Mitochondrial Genome for the Biofuel/Bioproduct Feedstock Species Scenedesmus obliquus Strain DOE0152z

DOE PAGES

Starkenburg, S. R.; Polle, J. E. W.; Hovde, B.; ...

2017-08-10

The green alga Scenedesmus obliquus is an emerging platform species for the industrial production of biofuels. Here, we report the draft assembly and annotation for the nuclear, plastid, and mitochondrial genomes of S. obliquus strain DOE0152z.
