unique genome structure: Topics by Science.gov

Sample records for unique genome structure

Molecular Innovation in Ciliates with Complex Genome Rearrangements

NASA Astrophysics Data System (ADS)

Neme, R.; Landweber, L. F.

2017-07-01

We study molecular innovation in several ciliate species with unique massive genome rearrangements to understand how a radically distinct genome architecture can shape the process of acquiring new functions, genes and structures.
Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

PubMed Central

Bushley, Kathryn E.; Ohm, Robin A.; Otillar, Robert; Martin, Joel; Schackwitz, Wendy; Grimwood, Jane; MohdZainudin, NurAinIzzati; Xue, Chunsheng; Wang, Rui; Manning, Viola A.; Dhillon, Braham; Tu, Zheng Jin; Steffenson, Brian J.; Salamov, Asaf; Sun, Hui; Lowry, Steve; LaButti, Kurt; Han, James; Copeland, Alex; Lindquist, Erika; Barry, Kerrie; Schmutz, Jeremy; Baker, Scott E.; Ciuffetti, Lynda M.; Grigoriev, Igor V.; Zhong, Shaobin; Turgeon, B. Gillian

2013-01-01

The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence. PMID:23357949
Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens

DOE Office of Scientific and Technical Information (OSTI.GOV)

Condon, Bradford J.; Leng, Yueqiang; Wu, Dongliang

The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveal a strong correlation with a role in virulence.
The Divided Bacterial Genome: Structure, Function, and Evolution.

PubMed

diCenzo, George C; Finan, Turlough M

2017-09-01

Approximately 10% of bacterial genomes are split between two or more large DNA fragments, a genome architecture referred to as a multipartite genome. This multipartite organization is found in many important organisms, including plant symbionts, such as the nitrogen-fixing rhizobia, and plant, animal, and human pathogens, including the genera Brucella, Vibrio, and Burkholderia. The availability of many complete bacterial genome sequences means that we can now examine on a broad scale the characteristics of the different types of DNA molecules in a genome. Recent work has begun to shed light on the unique properties of each class of replicon, the unique functional role of chromosomal and nonchromosomal DNA molecules, and how the exploitation of novel niches may have driven the evolution of the multipartite genome. The aims of this review are to (i) outline the literature regarding bacterial genomes that are divided into multiple fragments, (ii) provide a meta-analysis of completed bacterial genomes from 1,708 species as a way of reviewing the abundant information present in these genome sequences, and (iii) provide an encompassing model to explain the evolution and function of the multipartite genome structure. This review covers, among other topics, salient genome terminology; mechanisms of multipartite genome formation; the phylogenetic distribution of multipartite genomes; how each part of a genome differs with respect to genomic signatures, genetic variability, and gene functional annotation; how each DNA molecule may interact; as well as the costs and benefits of this genome structure. Copyright © 2017 American Society for Microbiology.
Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dash, Paban Kumar; Sharma, Shashi; Soni, Manisha

Highlights: • Complete genome of Indian DENV-2 was deciphered for the first time in this study. • The recent Indian DENV-2 revealed presence of many unique amino acid residues. • Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. • Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960), from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2 respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection at a few amino acid sites of the genes encoding structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in the 1970s characterized the evolution of DENV-2 in India. The present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a unique clade in South Asia.
Reconstitution of wild type viral DNA in simian cells transfected with early and late SV40 defective genomes.

PubMed

O'Neill, F J; Gao, Y; Xu, X

1993-11-01

The DNAs of polyomaviruses ordinarily exist as a single circular molecule of approximately 5000 base pairs. Variants of SV40, BKV and JCV have been described which contain two complementing defective DNA molecules. These defectives, which form a bipartite genome structure, contain either the viral early region or the late region. The defectives have the unique property of being able to tolerate variable sized reiterations of regulatory and terminus region sequences, and portions of the coding region. They can also exchange coding region sequences with other polyomaviruses. It has been suggested that the bipartite genome structure might be a stage in the evolution of polyomaviruses which can uniquely sustain genome and sequence diversity. However, it is not known if the regulatory and terminus region sequences are highly mutable. Also, it is not known if the bipartite genome structure is reversible and what the conditions might be which would favor restoration of the monomolecular genome structure. We addressed the first question by sequencing the reiterated regulatory and terminus regions of E- and L-SV40 DNAs. This revealed a large number of mutations in the regulatory regions of the defective genomes, including deletions, insertions, rearrangements and base substitutions. We also detected insertions and base substitutions in the T-antigen gene. We addressed the second question by introducing into permissive simian cells, E- and L-SV40 genomes which had been engineered to contain only a single regulatory region. Analysis of viral DNA from transfected cells demonstrated recombined genomes containing a wild type monomolecular DNA structure. However, the complete defectives, containing reiterated regulatory regions, could often compete away the wild type genomes. The recombinant monomolecular genomes were isolated, cloned and found to be infectious. 
All of the DNA alterations identified in one of the regulatory regions of E-SV40 DNA were present in the recombinant monomolecular genomes. These and other findings indicate that the bipartite genome state can sustain many mutations which wtSV40 cannot directly sustain. However, the mutations can later be introduced into the wild type genomes when the E- and L-SV40 DNAs recombine to generate a new monomolecular genome structure.
Salmonella Strains Isolated from Galápagos Iguanas Show Spatial Structuring of Serovar and Genomic Diversity

PubMed Central

Lankau, Emily W.; Cruz Bedon, Lenin; Mackie, Roderick I.

2012-01-01

It is thought that dispersal limitation primarily structures host-associated bacterial populations because host distributions inherently limit transmission opportunities. However, enteric bacteria may disperse great distances during food-borne outbreaks. It is unclear if such rapid long-distance dispersal events happen regularly in natural systems or if these events represent an anthropogenic exception. We characterized Salmonella enterica isolates from the feces of free-living Galápagos land and marine iguanas from five sites on four islands using serotyping and genomic fingerprinting. Each site hosted unique and nearly exclusive serovar assemblages. Genomic fingerprint analysis offered a more complex model of S. enterica biogeography, with evidence of both unique strain pools and of spatial population structuring along a geographic gradient. These findings suggest that even relatively generalist enteric bacteria may be strongly dispersal limited in a natural system with strong barriers, such as oceanic divides. Yet, the differing results seen with the two typing methods also suggest that genomic variation is less dispersal limited, allowing for different ecological processes to shape biogeographical patterns of the core and flexible portions of this bacterial species' genome. PMID:22615968
Statistical Significance of Optical Map Alignments

PubMed Central

Sarkar, Deepayan; Goldstein, Steve; Schwartz, David C.

2012-01-01

The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities. PMID:22506568
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping.

PubMed

Dreger, Dayna L; Rimbault, Maud; Davis, Brian W; Bhatnagar, Adrienne; Parker, Heidi G; Ostrander, Elaine A

2016-12-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. © 2016. Published by The Company of Biologists Ltd.
Whole-genome sequence, SNP chips and pedigree structure: building demographic profiles in domestic dog breeds to optimize genetic-trait mapping

PubMed Central

Dreger, Dayna L.; Rimbault, Maud; Davis, Brian W.; Bhatnagar, Adrienne; Parker, Heidi G.

2016-01-01

In the decade following publication of the draft genome sequence of the domestic dog, extraordinary advances with application to several fields have been credited to the canine genetic system. Taking advantage of closed breeding populations and the subsequent selection for aesthetic and behavioral characteristics, researchers have leveraged the dog as an effective natural model for the study of complex traits, such as disease susceptibility, behavior and morphology, generating unique contributions to human health and biology. When designing genetic studies using purebred dogs, it is essential to consider the unique demography of each population, including estimation of effective population size and timing of population bottlenecks. The analytical design approach for genome-wide association studies (GWAS) and analysis of whole-genome sequence (WGS) experiments are inextricable from demographic data. We have performed a comprehensive study of genomic homozygosity, using high-depth WGS data for 90 individuals, and Illumina HD SNP data from 800 individuals representing 80 breeds. These data were coupled with extensive pedigree data analyses for 11 breeds that, together, allowed us to compute breed structure, demography, and molecular measures of genome diversity. Our comparative analyses characterize the extent, formation and implication of breed-specific diversity as it relates to population structure. These data demonstrate the relationship between breed-specific genome dynamics and population architecture, and provide important considerations influencing the technological and cohort design of association and other genomic studies. PMID:27874836
Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

PubMed

Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

2009-06-01

Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.
Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

PubMed

Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

2013-01-01

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).
Three-dimensional reconstruction of single-cell chromosome structure using recurrence plots.

PubMed

Hirata, Yoshito; Oda, Arisa; Ohta, Kunihiro; Aihara, Kazuyuki

2016-10-11

Single-cell analysis of the three-dimensional (3D) chromosome structure can reveal cell-to-cell variability in genome activities. Here, we propose to apply recurrence plots, a mathematical method of nonlinear time series analysis, to reconstruct the 3D chromosome structure of a single cell based on information of chromosomal contacts from genome-wide chromosome conformation capture (Hi-C) data. This recurrence plot-based reconstruction (RPR) method enables rapid reconstruction of a unique structure in single cells, even from incomplete Hi-C information.
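As a rough illustration of the idea this abstract builds on, the sketch below shows only the forward definition of a recurrence plot: a binary matrix marking which pairs of points lie within a distance threshold, which is the same form as a thresholded contact map. It is not the authors' RPR reconstruction, which goes in the opposite direction (from contacts back to a structure). The function name `recurrence_matrix`, the threshold `eps`, and the toy coordinates are all hypothetical and chosen for illustration.

```python
import math

def recurrence_matrix(points, eps):
    """Return a binary matrix R with R[i][j] = 1 when points i and j
    lie within Euclidean distance eps of each other, else 0."""
    n = len(points)
    return [
        [1 if math.dist(points[i], points[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy coordinates for five loci on a folded chain
# (illustrative only; not data from the paper).
toy_points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (2.0, 0.0, 0.0),
    (2.0, 1.0, 0.0),
    (1.0, 1.0, 0.0),
]

contacts = recurrence_matrix(toy_points, eps=1.0)
for row in contacts:
    print(row)
```

In this toy case, loci 1 and 4 are not neighbors along the chain but are marked as in contact because the chain folds back on itself, which is the kind of signal a Hi-C contact map carries.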
Three-dimensional reconstruction of single-cell chromosome structure using recurrence plots

NASA Astrophysics Data System (ADS)

Hirata, Yoshito; Oda, Arisa; Ohta, Kunihiro; Aihara, Kazuyuki

2016-10-01

Single-cell analysis of the three-dimensional (3D) chromosome structure can reveal cell-to-cell variability in genome activities. Here, we propose to apply recurrence plots, a mathematical method of nonlinear time series analysis, to reconstruct the 3D chromosome structure of a single cell based on information of chromosomal contacts from genome-wide chromosome conformation capture (Hi-C) data. This recurrence plot-based reconstruction (RPR) method enables rapid reconstruction of a unique structure in single cells, even from incomplete Hi-C information.
Exploiting genotyping by sequencing to characterize the genomic structure of the American cranberry through high-density linkage mapping.

PubMed

Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan

2016-06-13

The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population wherein 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high density linkage map in cranberry of which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome. SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such as segregation distortion, recombination, linkage disequilibrium, and synteny analyses. In the future, GBS can be used to accelerate cranberry molecular breeding through QTL mapping and genome-wide association studies (GWAS).
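The map statistics quoted in this abstract can be turned into a few summary figures. The sketch below is a back-of-the-envelope calculation using only the numbers stated above (4849 mapped markers, 1112 cM, ~13 Mb anchored, 470 Mb estimated genome size). The variable names are hypothetical, and the mean spacing assumes markers are spread evenly, which the abstract does not claim; it reports only 1676 unique marker positions, so many markers share a position.

```python
# Figures taken from the abstract; the Mb values are approximate there.
markers_mapped = 4849
map_length_cm = 1112
anchored_mb = 13
genome_mb = 470

# Average marker density and the spacing it would imply under even spread.
markers_per_cm = markers_mapped / map_length_cm
mean_spacing_cm = map_length_cm / markers_mapped

# Share of the estimated genome covered by anchored scaffolds.
anchored_pct = 100 * anchored_mb / genome_mb

print(f"{markers_per_cm:.2f} markers per cM")
print(f"{mean_spacing_cm:.2f} cM mean spacing (even-spread assumption)")
print(f"{anchored_pct:.1f}% of the estimated genome anchored")
```

This works out to roughly 4.4 markers per cM, with the anchored scaffolds covering a little under 3% of the estimated genome.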
Effector diversification within compartments of the Leptosphaeria maculans genome affected by repeat induced point mutations

USDA-ARS's Scientific Manuscript database

The genome sequence of the phytopathogenic fungus Leptosphaeria maculans has been determined. It has a unique bipartite structure, divided between distinct GC-equilibrated and AT-rich regions (isochores), reminiscent of some plants and animals but not previously observed in fungi. The GC-equilibrate...
Structural analysis of a set of proteins resulting from a bacterial genomics project.

PubMed

Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

2005-09-01

The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.
Structure, proteome and genome of Sinorhizobium meliloti phage ΦM5: A virus with LUZ24-like morphology and a highly mosaic genome.

PubMed

Johnson, Matthew C; Sena-Velez, Marta; Washburn, Brian K; Platt, Georgia N; Lu, Stephen; Brewer, Tess E; Lynn, Jason S; Stroupe, M Elizabeth; Jones, Kathryn M

2017-12-01

Bacteriophages of nitrogen-fixing rhizobial bacteria are revealing a wealth of novel structures, diverse enzyme combinations and genomic features. Here we report the cryo-EM structure of the phage capsid at 4.9-5.7Å-resolution, the phage particle proteome, and the genome of the Sinorhizobium meliloti-infecting Podovirus ΦM5. This is the first structure of a phage with a capsid and capsid-associated structural proteins related to those of the LUZ24-like viruses that infect Pseudomonas aeruginosa. Like many other Podoviruses, ΦM5 is a T=7 icosahedron with a smooth capsid and short, relatively featureless tail. Nonetheless, this group is phylogenetically quite distinct from Podoviruses of the well-characterized T7, P22, and epsilon 15 supergroups. Structurally, a distinct bridge of density that appears unique to ΦM5 reaches down the body of the coat protein to the extended loop that interacts with the next monomer in a hexamer, perhaps stabilizing the mature capsid. Further, the predicted tail fibers of ΦM5 are quite different from those of enteric bacteria phages, but have domains in common with other rhizophages. Genomically, ΦM5 is highly mosaic. The ΦM5 genome is 44,005bp with 357bp direct terminal repeats (DTRs) and 58 unique ORFs. Surprisingly, the capsid structural module, the tail module, the DNA-packaging terminase, the DNA replication module and the integrase each appear to be from a different lineage. One of the most unusual features of ΦM5 is its terminase whose large subunit is quite different from previously-described short-DTR-generating packaging machines and does not fit into any of the established phylogenetic groups. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Genome Structure of the Legume, Lotus japonicus

PubMed Central

Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi

2008-01-01

The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435
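The coverage figures in this abstract can be checked with simple arithmetic. The sketch below uses only the values stated above (315.1 Mb sequenced, 472 Mb genome, 130 Mb anchored, 10 951 complete and 19 848 partial gene structures). The variable names are hypothetical, and the anchored fraction and total gene-structure count are derived here rather than reported in the abstract.

```python
# Values as stated in the abstract.
sequenced_mb = 315.1
genome_mb = 472.0
anchored_mb = 130.0
complete_genes = 10951
partial_genes = 19848

# Fraction of the genome represented by the determined sequences.
coverage_pct = 100 * sequenced_mb / genome_mb

# Derived figures (not stated in the abstract).
anchored_pct = 100 * anchored_mb / genome_mb
total_gene_structures = complete_genes + partial_genes

print(f"sequenced: {coverage_pct:.1f}% of the genome")
print(f"anchored to linkage groups: {anchored_pct:.1f}% of the genome")
print(f"gene structures assigned: {total_gene_structures}")
```

The first figure rounds to the 67% quoted in the abstract.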
Transposable elements in Drosophila.

PubMed

McCullers, Tabitha J; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster.

Transposable elements in Drosophila

PubMed Central

McCullers, Tabitha J.; Steiniger, Mindy

2017-01-01

Transposable elements (TEs) are mobile genetic elements that can mobilize within host genomes. As TEs comprise more than 40% of the human genome and are linked to numerous diseases, understanding their mechanisms of mobilization and regulation is important. Drosophila melanogaster is an ideal model organism for the study of eukaryotic TEs as its genome contains a diverse array of active TEs. TEs universally impact host genome size via transposition and deletion events, but may also adopt unique functional roles in host organisms. There are 2 main classes of TEs: DNA transposons and retrotransposons. These classes are further divided into subgroups of TEs with unique structural and functional characteristics, demonstrating the significant variability among these elements. Despite this variability, D. melanogaster and other eukaryotic organisms utilize conserved mechanisms to regulate TEs. This review focuses on the transposition mechanisms and regulatory pathways of TEs, and their functional roles in D. melanogaster. PMID:28580197
Plant Ion Channels: Gene Families, Physiology, and Functional Genomics Analyses

PubMed Central

Ward, John M.; Mäser, Pascal; Schroeder, Julian I.

2016-01-01

Distinct potassium, anion, and calcium channels in the plasma membrane and vacuolar membrane of plant cells have been identified and characterized by patch clamping. Primarily owing to advances in Arabidopsis genetics and genomics, and yeast functional complementation, many of the corresponding genes have been identified. Recent advances in our understanding of ion channel genes that mediate signal transduction and ion transport are discussed here. Some plant ion channels, for example, ALMT and SLAC anion channel subunits, are unique. The majority of plant ion channel families exhibit homology to animal genes; such families include both hyperpolarization- and depolarization-activated Shaker-type potassium channels, CLC chloride transporters/channels, cyclic nucleotide–gated channels, and ionotropic glutamate receptor homologs. These plant ion channels offer unique opportunities to analyze the structural mechanisms and functions of ion channels. Here we review gene families of selected plant ion channel classes and discuss unique structure-function aspects and their physiological roles in plant cell signaling and transport. PMID:18842100
Plant ion channels: gene families, physiology, and functional genomics analyses.

PubMed

Ward, John M; Mäser, Pascal; Schroeder, Julian I

2009-01-01

Distinct potassium, anion, and calcium channels in the plasma membrane and vacuolar membrane of plant cells have been identified and characterized by patch clamping. Primarily owing to advances in Arabidopsis genetics and genomics, and yeast functional complementation, many of the corresponding genes have been identified. Recent advances in our understanding of ion channel genes that mediate signal transduction and ion transport are discussed here. Some plant ion channels, for example, ALMT and SLAC anion channel subunits, are unique. The majority of plant ion channel families exhibit homology to animal genes; such families include both hyperpolarization- and depolarization-activated Shaker-type potassium channels, CLC chloride transporters/channels, cyclic nucleotide-gated channels, and ionotropic glutamate receptor homologs. These plant ion channels offer unique opportunities to analyze the structural mechanisms and functions of ion channels. Here we review gene families of selected plant ion channel classes and discuss unique structure-function aspects and their physiological roles in plant cell signaling and transport.
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
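The inverted terminal repeats mentioned in this abstract can be illustrated with a short check: an element carries terminal inverted repeats when its first bases are the reverse complement of its last bases. The sketch below is only an illustration of that definition, not the authors' method; the function names and the toy sequence are hypothetical, and only perfect (mismatch-free) repeats are counted.

```python
# Hypothetical helper names; a toy illustration, not the published pipeline.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def perfect_tir_length(element: str) -> int:
    """Length of the longest perfect terminal inverted repeat of `element`.

    Walks inward from both ends and stops at the first position where the
    5' base is not the complement of the mirrored 3' base.
    """
    n = 0
    for i in range(len(element) // 2):
        if element[i] != element[-1 - i].translate(_COMPLEMENT):
            break
        n += 1
    return n


# Toy element: 5 bp repeat "CAGTG", a 3 bp spacer, then its reverse complement.
toy_element = "CAGTG" + "AAC" + reverse_complement("CAGTG")
tir = perfect_tir_length(toy_element)
```

For an element like the one described above, one would expect such a check to report a value near 25 if the repeats were perfect; real repeats often contain mismatches, which this sketch does not tolerate.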
Identification of a unique library of complex, but ordered, arrays of repetitive elements in the human genome and implication of their potential involvement in pathobiology.

PubMed

Lee, Kang-Hoon; Lee, Young-Kwan; Kwon, Deug-Nam; Chiu, Sophia; Chew, Victoria; Rah, Hyungchul; Kujawski, Gregory; Melhem, Ramzi; Hsu, Karen; Chung, Cecilia; Greenhalgh, David G; Cho, Kiho

2011-06-01

Approximately 2% of the human genome is reported to be occupied by genes. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the rest of the genomes of human and other species. In conjunction with a comprehensive annotation of genes, information regarding components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs, is found in human genome databases. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not been fully characterized yet. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Due to the limitation in query size within the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) utilized for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5M nucleotides. A number of RE arrangements with varying complexities and patterns were identified throughout the genome. Each chromosome had unique profiles of RE arrangements and density, and high levels of RE density were measured near the centromere regions. Subsequently, 175 complex RE arrangements, which were selected throughout the genome, were subjected to a comparison analysis using five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as a functional genome unit. Copyright © 2011 Elsevier Inc. All rights reserved.
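The fragmentation step described above, in which a genome is cut into fixed-size units so that each fits the query limit of an alignment tool, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say whether units were cut per chromosome or how the final partial unit was handled, so both choices here are guesses, and the function name and toy chromosome lengths are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def fragment_coordinates(
    chrom_lengths: Dict[str, int], unit: int = 500_000
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chromosome, start, end) half-open windows of at most `unit` bases.

    Assumption: each chromosome is fragmented independently and the last
    window of a chromosome may be shorter than `unit`.
    """
    for name, length in chrom_lengths.items():
        for start in range(0, length, unit):
            yield name, start, min(start + unit, length)


# Toy lengths, not real chromosome sizes.
toy_lengths = {"chrA": 1_200_000, "chrB": 500_000}
units = list(fragment_coordinates(toy_lengths))
```

With the toy lengths above, `units` holds three windows for `chrA` (the last one 200,000 bases long) and one for `chrB`.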
Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

Treesearch

Diego Martinez; Jean Challacombe; Ingo Morgenstern; David Hibbett; Monika Schmoll; Christian P. Kubicek; Patricia Ferreira; Francisco J. Ruiz-Duenas; Angel T. Martinez; Philip J. Kersten; Kenneth E. Hammel; Jill A. Gaskell; Daniel Cullen

2009-01-01

Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome,...
From the Cover: Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features

NASA Astrophysics Data System (ADS)

Derelle, Evelyne; Ferraz, Conchita; Rombauts, Stephane; Rouzé, Pierre; Worden, Alexandra Z.; Robbens, Steven; Partensky, Frédéric; Degroeve, Sven; Echeynié, Sophie; Cooke, Richard; Saeys, Yvan; Wuyts, Jan; Jabbari, Kamel; Bowler, Chris; Panaud, Olivier; Piégu, Benoît; Ball, Steven G.; Ral, Jean-Philippe; Bouget, François-Yves; Piganeau, Gwenael; de Baets, Bernard; Picard, André; Delseny, Michel; Demaille, Jacques; van de Peer, Yves; Moreau, Hervé

2006-08-01

The green lineage is reportedly 1,500 million years old, evolving shortly after the endosymbiosis event that gave rise to early photosynthetic eukaryotes. In this study, we unveil the complete genome sequence of an ancient member of this lineage, the unicellular green alga Ostreococcus tauri (Prasinophyceae). This cosmopolitan marine primary producer is the world's smallest free-living eukaryote known to date. Features likely reflecting optimization of environmentally relevant pathways, including resource acquisition, unusual photosynthesis apparatus, and genes potentially involved in C4 photosynthesis, were observed, as was downsizing of many gene families. Overall, the 12.56-Mb nuclear genome has an extremely high gene density, in part because of extensive reduction of intergenic regions and other forms of compaction such as gene fusion. However, the genome is structurally complex. It exhibits previously unobserved levels of heterogeneity for a eukaryote. Two chromosomes differ structurally from the other eighteen. Both have a significantly biased G+C content, and, remarkably, they contain the majority of transposable elements. Many chromosome 2 genes also have unique codon usage and splicing, but phylogenetic analysis and composition do not support alien gene origin. In contrast, most chromosome 19 genes show no similarity to green lineage genes and a large number of them are specialized in cell surface processes. Taken together, the complete genome sequence, unusual features, and downsized gene families make O. tauri an ideal model system for research on eukaryotic genome evolution, including chromosome specialization and green lineage ancestry. Keywords: genome heterogeneity | genome sequence | green alga | Prasinophyceae | gene prediction
Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

PubMed

Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

2015-08-29

The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

PubMed Central

Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

2016-01-01

Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes coupled with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome using the Frequency Counter program. This step is iterated until the extended motif occurs exactly once in the genome. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs that bind to an exclusive site in the genome with, in theory, no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
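The extend-until-unique idea described for USP can be sketched in a few lines. This is not the Onco-Regulon code: the function names are hypothetical, only the forward strand of a toy string is searched, and the "both" mode is assumed to add one base on each side per step, which the abstract does not specify.

```python
from typing import Optional


def count_occurrences(genome: str, motif: str) -> int:
    """Count overlapping occurrences of `motif` in `genome` (forward strand only)."""
    count = 0
    pos = genome.find(motif)
    while pos != -1:
        count += 1
        pos = genome.find(motif, pos + 1)
    return count


def extend_until_unique(
    genome: str, start: int, length: int, direction: str = "both"
) -> Optional[str]:
    """Grow the window genome[start:start+length] until it occurs exactly once.

    direction is "left" (upstream), "right" (downstream) or "both".
    Returns the unique sequence, or None if the window cannot grow further.
    """
    left, right = start, start + length
    while count_occurrences(genome, genome[left:right]) > 1:
        grew = False
        if direction in ("left", "both") and left > 0:
            left -= 1
            grew = True
        if direction in ("right", "both") and right < len(genome):
            right += 1
            grew = True
        if not grew:
            return None  # ran out of sequence before reaching uniqueness
    return genome[left:right]


# Toy genome in which the motif "ACGT" occurs at positions 0 and 8.
toy_genome = "ACGTTGCAACGTAGCT"
unique_right = extend_until_unique(toy_genome, start=8, length=4, direction="right")
unique_left = extend_until_unique(toy_genome, start=8, length=4, direction="left")
unique_both = extend_until_unique(toy_genome, start=8, length=4, direction="both")
```

Running the three modes on one motif position gives three candidate unique sequences, which matches the abstract's remark that each motif yields three possibilities. Recounting the whole genome at every step is quadratic and only suitable for toy inputs; an indexed search would be needed at genome scale.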
The complete mitochondrial genome sequence of the maned wolf (Chrysocyon brachyurus).

PubMed

Zhao, Chao; Yang, Xiufeng; Zhang, Honghai; Zhang, Jin; Chen, Lei; Sha, Weilai; Liu, Guangshuai

2016-01-01

In this study, the complete mitochondrial genome of the maned wolf (Chrysocyon brachyurus), the only species in the genus Chrysocyon, was sequenced and reported for the first time using blood samples obtained from a female individual at Shanghai Zoo, China. Sequence analysis showed that the genome structure was consistent with that of other Canidae species and contained a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region.
Crossed wires: 3D genome misfolding in human disease.

PubMed

Norton, Heidi K; Phillips-Cremins, Jennifer E

2017-11-06

Mammalian genomes are folded into unique topological structures that undergo precise spatiotemporal restructuring during healthy development. Here, we highlight recent advances in our understanding of how the genome folds inside the 3D nucleus and how these folding patterns are miswired during the onset and progression of mammalian disease states. We discuss potential mechanisms underlying the link among genome misfolding, genome dysregulation, and aberrant cellular phenotypes. We also discuss cases in which the endogenous 3D genome configurations in healthy cells might be particularly susceptible to mutation or translocation. Together, these data support an emerging model in which genome folding and misfolding is critically linked to the onset and progression of a broad range of human diseases. © 2017 Norton and Phillips-Cremins.
Transformation-associated recombination (TAR) cloning for genomics studies and synthetic biology

PubMed Central

Kouprina, Natalay; Larionov, Vladimir

2016-01-01

Transformation-associated recombination (TAR) cloning represents a unique tool for isolation and manipulation of large DNA molecules. The technique exploits a high level of homologous recombination in the yeast Saccharomyces cerevisiae. So far, TAR cloning is the only method available to selectively recover chromosomal segments up to 300 kb in length from complex and simple genomes. In addition, TAR cloning allows the assembly and cloning of entire microbial genomes up to several Mb as well as engineering of large metabolic pathways. In this review, we summarize applications of TAR cloning for functional/structural genomics and synthetic biology. PMID:27116033
Complete nucleotide sequence and genome structure of a Japanese isolate of hibiscus latent Fort Pierce virus, a unique tobamovirus that contains an internal poly(A) region in its 3' end.

PubMed

Yoshida, Tetsuya; Kitazawa, Yugo; Komatsu, Ken; Neriya, Yutaro; Ishikawa, Kazuya; Fujita, Naoko; Hashimoto, Masayoshi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

2014-11-01

In this study, we detected a Japanese isolate of hibiscus latent Fort Pierce virus (HLFPV-J), a member of the genus Tobamovirus, in a hibiscus plant in Japan and determined the complete sequence and organization of its genome. HLFPV-J has four open reading frames (ORFs), each of which shares more than 98 % nucleotide sequence identity with those of other HLFPV isolates. Moreover, HLFPV-J contains a unique internal poly(A) region of variable length, ranging from 44 to 78 nucleotides, in its 3'-untranslated region (UTR), as is the case with hibiscus latent Singapore virus (HLSV), another hibiscus-infecting tobamovirus. The length of the HLFPV-J genome was 6431 nucleotides, including the shortest internal poly(A) region. The sequence identities of ORFs 1, 2, 3 and 4 of HLFPV-J to other tobamoviruses were 46.6-68.7, 49.9-70.8, 31.0-70.8 and 39.4-70.1 %, respectively, at the nucleotide level and 39.8-75.0, 43.6-77.8, 19.2-70.4 and 31.2-74.2 %, respectively, at the amino acid level. The 5'- and 3'-UTRs of HLFPV-J showed 24.3-58.6 and 13.0-79.8 % identity, respectively, to other tobamoviruses. In particular, when compared to other tobamoviruses, each ORF and UTR of HLFPV-J showed the highest sequence identity to those of HLSV. Phylogenetic analysis showed that HLFPV-J, other HLFPV isolates and HLSV constitute a malvaceous-plant-infecting tobamovirus cluster. These results indicate that the genomic structure of HLFPV-J has unique features similar to those of HLSV. To our knowledge, this is the first report of the complete genome sequence of HLFPV.
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
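The screening described above, flagging exact k-mers that recur several times, can be sketched with a dictionary of positions. This is a minimal illustration rather than the authors' pipeline: the function name is hypothetical, only the forward strand of a linear sequence is scanned (a circular mtDNA would also need windows spanning the origin), and "more than three times" is taken to mean a count of at least four.

```python
from collections import defaultdict
from typing import Dict, List


def repeated_kmers(seq: str, k: int = 17, min_count: int = 4) -> Dict[str, List[int]]:
    """Map each exact k-mer seen at least `min_count` times to its start positions.

    Defaults follow the abstract's focus on 17-mers occurring more than
    three times; overlapping occurrences are counted.
    """
    positions: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


# Toy sequence with a short k so the behaviour is easy to follow by hand.
toy_seq = "ACGT" * 4
flagged = repeated_kmers(toy_seq, k=4, min_count=4)
```

In the toy sequence only `ACGT` reaches four occurrences (positions 0, 4, 8 and 12); the shifted k-mers such as `CGTA` occur three times and are not flagged.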
Primary structure of the Aequorea victoria green-fluorescent protein.

PubMed

Prasher, D C; Eckenrode, V K; Ward, W W; Prendergast, F G; Cormier, M J

1992-02-15

Many cnidarians utilize green-fluorescent proteins (GFPs) as energy-transfer acceptors in bioluminescence. GFPs fluoresce in vivo upon receiving energy from either a luciferase-oxyluciferin excited-state complex or a Ca(2+)-activated phosphoprotein. These highly fluorescent proteins are unique due to the chemical nature of their chromophore, which is comprised of modified amino acid (aa) residues within the polypeptide. This report describes the cloning and sequencing of both cDNA and genomic clones of GFP from the cnidarian, Aequorea victoria. The gfp10 cDNA encodes a 238-aa-residue polypeptide with a calculated Mr of 26,888. Comparison of A. victoria GFP genomic clones shows three different restriction enzyme patterns which suggests that at least three different genes are present in the A. victoria population at Friday Harbor, Washington. The gfp gene encoded by the lambda GFP2 genomic clone is comprised of at least three exons spread over 2.6 kb. The nucleotide sequences of the cDNA and the gene will aid in the elucidation of structure-function relationships in this unique class of proteins.
Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

PubMed Central

Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

1996-01-01

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146
Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

PubMed Central

Logacheva, Maria D; Samigullin, Tahir H; Dhingra, Amit; Penin, Aleksey A

2008-01-01

Background: Chloroplast genome sequences are extremely informative about species interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. Results: We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome are similar to those of Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationship of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on the first two codon positions, Amborella united with Nymphaeales. Conclusion: Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization described for several other chloroplast genomes. However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. Phylogenetic analysis of the dataset, including this new sequence from non-core Caryophyllales, supports the sister relationship between Caryophyllales and asterids. PMID:18492277
Chompy: an infestation of MITE-like repetitive elements in the crocodilian genome.

PubMed

Ray, David A; Hedges, Dale J; Herke, Scott W; Fowlkes, Justin D; Barnes, Erin W; LaVie, Daniel K; Goodwin, Lindsey M; Densmore, Llewellyn D; Batzer, Mark A

2005-12-05

Interspersed repeats are a major component of most eukaryotic genomes and have an impact on genome size and stability, but the repetitive element landscape of crocodilian genomes has not yet been fully investigated. In this report, we provide the first detailed characterization of an interspersed repeat element in any crocodilian genome. Chompy is a putative miniature inverted-repeat transposable element (MITE) family initially recovered from the genome of Alligator mississippiensis (American alligator) but also present in the genomes of Crocodylus moreletii (Morelet's crocodile) and Gavialis gangeticus (Indian gharial). The element has all of the hallmarks of MITEs including terminal inverted repeats, possible target site duplications, and a tendency to form secondary structures. We estimate the copy number in the alligator genome to be approximately 46,000 copies. As a result of their size and unique properties, Chompy elements may provide a useful source of genomic variation for crocodilian comparative genomics.
Segmental duplications: evolution and impact among the current Lepidoptera genomes.

PubMed

Zhao, Qian; Ma, Dongna; Vasseur, Liette; You, Minsheng

2017-07-06

Structural variation among genomes is now viewed to be as important as single nucleotide polymorphisms in influencing the phenotype and evolution of a species. Segmental duplication (SD) is defined as segments of DNA with homologous sequence. Here, we performed a systematic analysis of segmental duplications (SDs) among five lepidopteran reference genomes (Plutella xylostella, Danaus plexippus, Bombyx mori, Manduca sexta and Heliconius melpomene) to understand their potential impact on the evolution of these species. We find that the SD content differed substantially among species, ranging from 1.2% of the genome in B. mori to 15.2% in H. melpomene. Most SDs formed very high identity (similarity higher than 90%) blocks but had very few large blocks. Comparative analysis showed that most of the SDs arose after the divergence of each lineage and we found that P. xylostella and H. melpomene showed more duplications than other species, suggesting they might be able to tolerate extensive levels of variation in their genomes. Conserved ancestral and species-specific SD events were assessed, revealing multiple examples of the gain, loss or maintenance of SDs over time. SD content analysis showed that most of the genes embedded in SD regions belonged to species-specific SDs ("Unique" SDs). Functional analysis of these genes suggested their potential roles in lineage-specific evolution. SDs and flanking regions often contained transposable elements (TEs) and this association suggested some involvement in SD formation. Further comparison of gene expression levels between SDs and non-SDs showed that the expression level of genes embedded in SDs was significantly lower, suggesting that structural changes in the genomes are involved in gene expression differences in species. The results showed that most of the SDs were "unique SDs", which originated after species formation. Functional analysis suggested that SDs might play different roles in different species. Our results provide a valuable resource beyond the genetic mutation to explore the genome structure for future Lepidoptera research.
Genomic characterization and phylogenetic analysis of Zika virus circulating in the Americas.

PubMed

Ye, Qing; Liu, Zhong-Yu; Han, Jian-Feng; Jiang, Tao; Li, Xiao-Feng; Qin, Cheng-Feng

2016-09-01

The rapid spread and potential link with birth defects have made Zika virus (ZIKV) a global public health problem. The virus was discovered 70 years ago, yet its genomic structure and the genetic variations associated with the current explosive ZIKV epidemics are still not fully understood. In this review, the genome organization, especially conserved terminal structures of the ZIKV genome, was characterized and compared with other mosquito-borne flaviviruses. It is suggested that major viral proteins of ZIKV share high structural and functional similarity with other known flaviviruses as shown by sequence comparison and prediction of functional motifs in viral proteins. Phylogenetic analysis demonstrated that all ZIKV strains circulating in the Americas form a unique clade within the Asian lineage. Furthermore, we identified a series of conserved amino acid residues that differentiate the Asian strains, including the currently circulating American strains, from the ancient African strains. Overall, our findings provide an overview of ZIKV genome characterization and evolutionary dynamics in the Americas and point out critical clues for future virological and epidemiological studies. Copyright © 2016 Elsevier B.V. All rights reserved.

Archaeal Genome Guardians Give Insights into Eukaryotic DNA Replication and Damage Response Proteins

PubMed Central

Shin, David S.; Pratt, Ashley J.; Tainer, John A.

2014-01-01

As the third domain of life, archaea, like the eukarya and bacteria, must have robust DNA replication and repair complexes to ensure genome fidelity. Archaea moreover display a breadth of unique habitats and characteristics, and structural biologists increasingly appreciate these features. As archaea include extremophiles that can withstand diverse environmental stresses, they provide fundamental systems for understanding enzymes and pathways critical to genome integrity and stress responses. Such archaeal extremophiles provide critical data on the periodic table for life as well as on the biochemical, geochemical, and physical limitations to adaptive strategies allowing organisms to thrive under environmental stress relevant to determining the boundaries for life as we know it. Specifically, archaeal enzyme structures have informed the architecture and mechanisms of key DNA repair proteins and complexes. With added abilities to temperature-trap flexible complexes and reveal core domains of transient and dynamic complexes, these structures provide insights into mechanisms of maintaining genome integrity despite extreme environmental stress. The DNA damage response protein structures noted in this review therefore inform the basis for genome integrity in the face of environmental stress, with implications for all domains of life as well as for biomanufacturing, astrobiology, and medicine. PMID:24701133
12-Chemokine Gene Signature Identifies Lymph Node-like Structures in Melanoma: Potential for Patient Selection for Immunotherapy?

NASA Astrophysics Data System (ADS)

Messina, Jane L.; Fenstermacher, David A.; Eschrich, Steven; Qu, Xiaotao; Berglund, Anders E.; Lloyd, Mark C.; Schell, Michael J.; Sondak, Vernon K.; Weber, Jeffrey S.; Mulé, James J.

2012-10-01

We have interrogated a 12-chemokine gene expression signature (GES) on genomic arrays of 14,492 distinct solid tumors and show broad distribution across different histologies. We hypothesized that this 12-chemokine GES might accurately predict a unique intratumoral immune reaction in stage IV (non-locoregional) melanoma metastases. The 12-chemokine GES predicted the presence of unique, lymph node-like structures, containing CD20+ B cell follicles with prominent areas of CD3+ T cells (both CD4+ and CD8+ subsets). CD86+, but not FoxP3+, cells were present within these unique structures as well. The direct correlation between the 12-chemokine GES score and the presence of unique, lymph nodal structures was also associated with better overall survival of the subset of melanoma patients. The use of this novel 12-chemokine GES may reveal basic information on in situ mechanisms of the anti-tumor immune response, potentially leading to improvements in the identification and selection of melanoma patients most suitable for immunotherapy.
SARS-unique fold in the Rousettus bat coronavirus HKU9.

PubMed

Hammond, Robert G; Tan, Xuan; Johnson, Margaret A

2017-09-01

The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists viral polyprotein cleavage, host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.
Structure of faustovirus, a large dsDNA virus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Klose, Thomas; Reteno, Dorine G.; Benamar, Samia

Many viruses protect their genome with a combination of a protein shell with or without a membrane layer. In this paper, we describe the structure of faustovirus, the first DNA virus (to our knowledge) that has been found to use two protein shells to encapsidate and protect its genome. The crystal structure of the major capsid protein, in combination with cryo-electron microscopy structures of two different maturation stages of the virus, shows that the outer virus shell is composed of a double jelly-roll protein that can be found in many double-stranded DNA viruses. The structure of the repeating hexameric unit of the inner shell is different from all other known capsid proteins. In addition to the unique architecture, the region of the genome that encodes the major capsid protein stretches over 17,000 bp and contains a large number of introns and exons. Finally, this complexity might help the virus to rapidly adapt to new environments or hosts.
Structure of faustovirus, a large dsDNA virus

DOE PAGES

Klose, Thomas; Reteno, Dorine G.; Benamar, Samia; ...

2016-05-16

Many viruses protect their genome with a combination of a protein shell with or without a membrane layer. In this paper, we describe the structure of faustovirus, the first DNA virus (to our knowledge) that has been found to use two protein shells to encapsidate and protect its genome. The crystal structure of the major capsid protein, in combination with cryo-electron microscopy structures of two different maturation stages of the virus, shows that the outer virus shell is composed of a double jelly-roll protein that can be found in many double-stranded DNA viruses. The structure of the repeating hexameric unit of the inner shell is different from all other known capsid proteins. In addition to the unique architecture, the region of the genome that encodes the major capsid protein stretches over 17,000 bp and contains a large number of introns and exons. Finally, this complexity might help the virus to rapidly adapt to new environments or hosts.
Minireview: DNA Replication in Plant Mitochondria

PubMed Central

Cupp, John D.; Nielsen, Brent L.

2014-01-01

Higher plant mitochondrial genomes exhibit much greater structural complexity as compared to most other organisms. Unlike well-characterized metazoan mitochondrial DNA (mtDNA) replication, an understanding of the mechanism(s) and proteins involved in plant mtDNA replication remains unclear. Several plant mtDNA replication proteins, including DNA polymerases, DNA primase/helicase, and accessory proteins have been identified. Mitochondrial dynamics, genome structure, and the complexity of dual-targeted and dual-function proteins that provide at least partial redundancy suggest that plants have a unique model for maintaining and replicating mtDNA when compared to the replication mechanism utilized by most metazoan organisms. PMID:24681310
Comparative analyses of putative toxin gene homologs from an Old World viper, Daboia russelii

PubMed Central

Krishnan, Neeraja M.

2017-01-01

Availability of snake genome sequences has opened up exciting areas of research on comparative genomics and gene diversity. One of the challenges in studying snake genomes is the acquisition of biological material from live animals, especially from the venomous ones, making the process cumbersome and time-consuming. Here, we report comparative sequence analyses of putative toxin gene homologs from Russell’s viper (Daboia russelii) using whole-genome sequencing data obtained from shed skin. When compared with the major venom proteins in Russell’s viper studied previously, we found 45–100% sequence similarity between the venom proteins and their putative homologs in the skin. Additionally, comparative analyses of 20 putative toxin gene family homologs provided evidence of unique sequence motifs in nerve growth factor (NGF), platelet derived growth factor (PDGF), Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz BPTI), cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) and cysteine-rich secretory protein (CRISP). In those derived proteins, we identified V11 and T35 in the NGF domain; F23 and A29 in the PDGF domain; N69, K2 and A5 in the CAP domain; and Q17 in the CRISP domain to be responsible for differences in the largest pockets across the protein domain structures in crotalines, viperines and elapids from the in silico structure-based analysis. Similarly, residues F10, Y11 and E20 appear to play an important role in the protein structures across the Kunitz protein domain of viperids and elapids. Our study highlights the usefulness of shed skin in obtaining good quality high-molecular weight DNA for comparative genomic studies, and provides evidence towards the unique features and evolution of putative venom gene homologs in vipers. PMID:29230357
RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”

PubMed Central

Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu

2012-01-01

Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
RNA-seq based transcriptional map of bovine respiratory disease pathogen "Histophilus somni 2336".

PubMed

Kumar, Ranjit; Lawrence, Mark L; Watt, James; Cooksey, Amanda M; Burgess, Shane C; Nanduri, Bindu

2012-01-01

Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify "novel" genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations.
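The "disproportionate number" of sRNAs reported in the abstract above can be made concrete with a short calculation. The following is only an illustrative sketch, not the authors' analysis: it uses the rounded figures quoted in the abstract (94 sRNAs, 82 of them new, about 30% located in the roughly 18% of the genome unique to strain 2336), and the function name `fold_enrichment` is a hypothetical helper introduced here.

```python
def fold_enrichment(feature_share: float, genome_share: float) -> float:
    """Observed share of features in a region divided by the share
    expected if features were spread evenly over the genome."""
    if not 0 < genome_share <= 1:
        raise ValueError("genome_share must be in (0, 1]")
    return feature_share / genome_share


# Rounded values taken from the abstract; the abstract gives them as approximate.
total_srnas = 94
novel_srnas = 82
srna_share_in_unique_region = 0.30
genome_share_of_unique_region = 0.18

enrichment = fold_enrichment(srna_share_in_unique_region,
                             genome_share_of_unique_region)
previously_reported = total_srnas - novel_srnas
approx_srnas_in_unique_region = round(total_srnas * srna_share_in_unique_region)

print(f"fold enrichment: {enrichment:.2f}")
print(f"sRNAs reported before this study: {previously_reported}")
print(f"approximate sRNAs in the unique region: {approx_srnas_in_unique_region}")
```

Under these rounded inputs the unique region holds somewhat more than one and a half times the sRNAs expected from its size alone; because the percentages are approximate, the ratio should be read as a rough indication only.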
Rapid isolation of microsatellite DNAs and identification of polymorphic mitochondrial DNA regions in the fish rotan (Perccottus glenii) invading European Russia

USGS Publications Warehouse

King, Timothy L.; Eackles, Michael S.; Reshetnikov, Andrey N.

2015-01-01

Human-mediated translocations and subsequent large-scale colonization by the invasive fish rotan (Perccottus glenii Dybowski, 1877; Perciformes, Odontobutidae), also known as Amur or Chinese sleeper, have resulted in dramatic transformations of small lentic ecosystems. However, no detailed genetic information exists on population structure, levels of effective movement, or relatedness among geographic populations of P. glenii within the European part of the range. We used massively parallel genomic DNA shotgun sequencing on the semiconductor-based Ion Torrent Personal Genome Machine (PGM) sequencing platform to identify nuclear microsatellite and mitochondrial DNA sequences in P. glenii from European Russia. Here we describe the characterization of nine nuclear microsatellite loci, ascertain levels of allelic diversity, heterozygosity, and demographic status of P. glenii collected from Ilev, Russia, one of several initial introduction points in European Russia. In addition, we mapped sequence reads to the complete P. glenii mitochondrial DNA sequence to identify polymorphic regions. Nuclear microsatellite markers developed for P. glenii yielded sufficient genetic diversity to: (1) produce unique multilocus genotypes; (2) elucidate structure among geographic populations; and (3) provide unique perspectives for analysis of population sizes and historical demographics. Among 4.9 million filtered P. glenii Ion Torrent PGM sequence reads, 11,304 mapped to the mitochondrial genome (NC_020350). This resulted in 100% coverage of this genome to a mean coverage depth of 102X. A total of 130 variable sites were observed between the publicly available genome from China and the studied composite mitochondrial genome. Among these, 82 were diagnostic and monomorphic between the mitochondrial genomes and distributed among 15 genome regions. The polymorphic sites (N = 48) were distributed among 11 mitochondrial genome regions. Our results also indicate that sequence reads generated from two three-hour runs on the Ion Torrent PGM can generate a sufficient number of nuclear and mitochondrial markers to improve understanding of the evolutionary and ecological dynamics of non-model and in particular, invasive species.
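The counts quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below is only an illustration built from the numbers in the abstract (4.9 million filtered reads, 11,304 mitochondrial reads, 130 variable sites split into 82 diagnostic and 48 polymorphic); the variable names are hypothetical and nothing here reproduces the authors' pipeline.

```python
# Figures quoted in the abstract; "4.9 million" is a rounded value.
filtered_reads = 4_900_000
mito_mapped_reads = 11_304

variable_sites_total = 130
diagnostic_sites = 82    # fixed differences between the two mitochondrial genomes
polymorphic_sites = 48   # sites variable within the studied sample

# The two site classes should account for every variable site.
sites_accounted_for = diagnostic_sites + polymorphic_sites

# Share of all filtered reads that mapped to the mitochondrial genome.
mito_read_fraction = mito_mapped_reads / filtered_reads

print(f"variable sites accounted for: {sites_accounted_for} of {variable_sites_total}")
print(f"mitochondrial share of reads: {mito_read_fraction:.2%}")
```

The two site classes add up to the reported total, and only a small fraction of a percent of the shotgun reads is mitochondrial, which is consistent with the abstract's point that the same runs also supply nuclear markers.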
Comparative analyses of Xanthomonas and Xylella complete genomes.

PubMed

Moreira, Leandro M; De Souza, Robson F; Digiampietri, Luciano A; Da Silva, Ana C R; Setubal, João C

2005-01-01

Computational analyses of four bacterial genomes of the Xanthomonadaceae family reveal new unique genes that may be involved in adaptation, pathogenicity, and host specificity. The Xanthomonas genus presents 3636 unique genes distributed in 1470 families, while Xylella genus presents 1026 unique genes distributed in 375 families. Among Xanthomonas-specific genes, we highlight a large number of cell wall degrading enzymes, proteases, and iron receptors, a set of energy metabolism genes, second copy of the type II secretion system, type III secretion system, flagella and chemotactic machinery, and the xanthomonadin synthesis gene cluster. Important genes unique to the Xylella genus are an additional copy of a type IV pili gene cluster and the complete machinery of colicin V synthesis and secretion. Intersections of gene sets from both genera reveal a cluster of genes homologous to Salmonella's SPI-7 island in Xanthomonas axonopodis pv citri and Xylella fastidiosa 9a5c, which might be involved in host specificity. Each genome also presents important unique genes, such as an HMS cluster, the kdgT gene, and O-antigen in Xanthomonas axonopodis pv citri; a number of avrBS genes and a distinct O-antigen in Xanthomonas campestris pv campestris, a type I restriction-modification system and a nickase gene in Xylella fastidiosa 9a5c, and a type II restriction-modification system and two genes related to peptidoglycan biosynthesis in Xylella fastidiosa temecula 1. All these differences imply a considerable number of gene gains and losses during the divergence of the four lineages, and are associated with structural genome modifications that may have a direct relation with the mode of transmission, adaptation to specific environments and pathogenicity of each organism.
The Complete Plastome Sequence of an Antarctic Bryophyte Sanionia uncinata (Hedw.) Loeske

PubMed Central

Park, Mira; Park, Hyun; Lee, Hyoungseok; Lee, Byeong-ha

2018-01-01

Organellar genomes of bryophytes are poorly represented with chloroplast genomes of only four mosses, four liverworts and two hornworts having been sequenced and annotated. Moreover, while Antarctic vegetation is dominated by the bryophytes, there are few reports on the plastid genomes for the Antarctic bryophytes. Sanionia uncinata (Hedw.) Loeske is one of the most dominant moss species in the maritime Antarctic. It has been researched as an important marker for ecological studies and as an extremophile plant for studies on stress tolerance. Here, we report the complete plastome sequence of S. uncinata, which can be exploited in comparative studies to identify the lineage-specific divergence across different species. The complete plastome of S. uncinata is 124,374 bp in length with a typical quadripartite structure of 114 unique genes including 82 unique protein-coding genes, 37 tRNA genes and four rRNA genes. However, two genes encoding the α subunit of RNA polymerase (rpoA) and encoding the cytochrome b6/f complex subunit VIII (petN) were absent. We could identify nuclear genes homologous to those genes, which suggests that rpoA and petN might have been relocated from the chloroplast genome to the nuclear genome. PMID:29494552
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences

PubMed Central

Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

2016-01-01

Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096
Complete nucleotide sequence of the Cryptomeria japonica D. Don. chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species.

PubMed

Hirao, Tomonori; Watanabe, Atsushi; Kurita, Manabu; Kondo, Teiji; Takata, Katsuhiko

2008-06-23

The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress of plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of their gene content and genomic structure that illustrates the unique genomic features of gymnosperms. The C. japonica cp genome is 131,810 bp in length, with 112 single copy genes and two duplicated (trnI-CAU, trnQ-UUG) genes that give a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp has lost one of the relevant large inverted repeats (IRs) found in angiosperms, fern, liverwort, and gymnosperms, such as Cycas and Ginkgo, and additionally has completely lost its trnR-CCG, partially lost its trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperm and the evolution of the cp genome.
Africa: continent of genome contrasts with implications for biomedical research and health.

PubMed

Ramsay, Michèle

2012-08-31

The genomic architecture of African populations is poorly understood and there is considerable variation between ethno-linguistic groups. Genome-wide approaches have been extensively applied to search for genetic associations to complex traits in Europeans, but rarely in Africans. This is largely attributed to lower levels of funding, poor infrastructure and public health systems, and to the small pool of trained scientists. High levels of genetic variation and underlying population structure in Africans present significant challenges, but lower levels of linkage disequilibrium provide an opportunity for more effective localisation of causal variants. High throughput technologies, including dense genotyping arrays, genome sequencing and epigenome studies, together with plummeting costs, are making research more affordable, even for African scientists. Understanding the interactions between genome structure and environmental influences is essential to interpreting their contributions to the increase in infectious diseases and non-communicable diseases, exacerbated by adverse environments and lifestyle choices. The unique genome dynamics in African populations have an important role to play in understanding human health and susceptibility to disease. Copyright © 2012. Published by Elsevier B.V.
Structural Genomics of Bacterial Virulence Factors

DTIC Science & Technology

2005-05-01

is deficient to mammals and unique to bacteria, the enzymes involved in the pathway may be useful for antibiotic design. Recent genome sequence...the SARS S1 spike protein with a high affinity antibody (80R) (Sui et al., 2004). Both the S1 protein and antibody have been expressed and purified in... Streptococcus group are now in preparation. Key Research Accomplishments * Development of the VirFact database of virulence
Highly distinct chromosomal structures in cowpea (Vigna unguiculata), as revealed by molecular cytogenetic analysis.

PubMed

Iwata-Otsubo, Aiko; Lin, Jer-Young; Gill, Navdeep; Jackson, Scott A

2016-05-01

Cowpea (Vigna unguiculata (L.) Walp) is an important legume, particularly in developing countries. However, little is known about its genome or chromosome structure. We used molecular cytogenetics to characterize the structure of pachytene chromosomes to advance our knowledge of chromosome and genome organization of cowpea. Our data showed that cowpea has highly distinct chromosomal structures that are cytologically visible as brightly DAPI-stained heterochromatic regions. Analysis of the repetitive fraction of the cowpea genome present at centromeric and pericentromeric regions confirmed that two retrotransposons are major components of pericentromeric regions and that a 455-bp tandem repeat is found at seven out of 11 centromere pairs in cowpea. These repeats likely evolved after the divergence of cowpea from common bean and form chromosomal structure unique to cowpea. The integration of cowpea genetic and physical chromosome maps reveals potential regions of suppressed recombination due to condensed heterochromatin and a lack of pairing in a few chromosomal termini. This study provides fundamental knowledge on cowpea chromosome structure and molecular cytogenetics tools for further chromosome studies.
Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

PubMed Central

Jackson, Christopher J; Norman, John E; Schnare, Murray N; Gray, Michael W; Keeling, Patrick J; Waller, Ross F

2007-01-01

Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements within the genome, RNA editing, loss of stop codons, and use of trans-splicing. PMID:17897476
Intrahaplotypic Variants Differentiate Complex Linkage Disequilibrium within Human MHC Haplotypes

PubMed Central

Lam, Tze Hau; Tay, Matthew Zirui; Wang, Bei; Xiao, Ziwei; Ren, Ee Chee

2015-01-01

Distinct regions of long-range genetic fixation in the human MHC region, known as conserved extended haplotypes (CEHs), possess unique genomic characteristics and are strongly associated with numerous diseases. While CEHs appear to be homogeneous by SNP analysis, the nature of fine variations within their genomic structure is unknown. Using multiple, MHC-homozygous cell lines, we demonstrate extensive sequence conservation in two common Asian MHC haplotypes: A33-B58-DR3 and A2-B46-DR9. However, characterization of phase-resolved MHC haplotypes revealed unique intra-CEH patterns of variation and uncovered 127 single nucleotide variants (SNVs) which are missing from public databases. We further show that the strong linkage disequilibrium structure within the human MHC that typically confounds precise identification of genetic features can be resolved using intra-CEH variants, as evidenced by rs3129063 and rs448489, which affect expression of ZFP57, a gene important in methylation and epigenetic regulation. This study demonstrates an improved strategy that can be used towards genetic dissection of diseases. PMID:26593880
Apophysomyces variabilis: draft genome sequence and comparison of predictive virulence determinants with other medically important Mucorales.

PubMed

Prakash, Hariprasath; Rudramurthy, Shivaprakash Mandya; Gandham, Prasad S; Ghosh, Anup Kumar; Kumar, Milner M; Badapanda, Chandan; Chakrabarti, Arunaloke

2017-09-18

Apophysomyces species are prevalent in tropical countries and A. variabilis is the second most frequent agent causing mucormycosis in India. Among Apophysomyces species, A. elegans, A. trapeziformis and A. variabilis are commonly incriminated in human infections. The genome sequences of A. elegans and A. trapeziformis are available in public databases, but that of A. variabilis is not. We therefore sequenced the whole genome of A. variabilis to explore its genomic structure and possible genes determining the virulence of the organism. The whole genome of A. variabilis NCCPF 102052 was sequenced and the genomic structure of A. variabilis was compared with already available genome structures of A. elegans, A. trapeziformis and other medically important Mucorales. The total size of the genome assembly of A. variabilis was 39.38 Mb with 12,764 protein-coding genes. Transposable elements (TEs) were scarce in the Apophysomyces genome and the retrotransposon Ty3-gypsy was the most common TE. Phylogenetically, Apophysomyces species were grouped closely with Phycomyces blakesleeanus. OrthoMCL analysis revealed 3025 orthologous proteins, which were common to the three pathogenic Apophysomyces species. Expansion of multiple gene families/duplication was observed in Apophysomyces genomes. Approximately 6% of Apophysomyces genes were predicted to be associated with virulence on PHIbase analysis. The virulence determinants included the protein families of CotH proteins (invasins), proteases, iron utilisation pathways, siderophores and signal transduction pathways. Serine proteases were the major group of proteases found in all Apophysomyces genomes. The carbohydrate active enzymes (CAZymes) constitute the majority of the secretory proteins. The present study is the first attempt to sequence and analyze the genomic structure of A. variabilis. Together with the available genome sequences of A. elegans and A. trapeziformis, the study helped to indicate the possible virulence determinants of pathogenic Apophysomyces species. The presence of unique CAZymes in the cell wall might be exploited in future for antifungal drug development.

The Arab genome: Health and wealth.

PubMed

Zayed, Hatem

2016-11-05

The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.
CTCF-Mediated Human 3D Genome Architecture Reveals Chromatin Topology for Transcription.

PubMed

Tang, Zhonghui; Luo, Oscar Junhong; Li, Xingwang; Zheng, Meizhen; Zhu, Jacqueline Jufen; Szalaj, Przemyslaw; Trzaskoma, Pawel; Magalska, Adriana; Wlodarczyk, Jakub; Ruszczycki, Blazej; Michalski, Paul; Piecuch, Emaly; Wang, Ping; Wang, Danjuan; Tian, Simon Zhongyuan; Penrad-Mobayed, May; Sachs, Laurent M; Ruan, Xiaoan; Wei, Chia-Lin; Liu, Edison T; Wilczynski, Grzegorz M; Plewczynski, Dariusz; Li, Guoliang; Ruan, Yijun

2015-12-17

Spatial genome organization and its effect on transcription remains a fundamental question. We applied an advanced chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) strategy to comprehensively map higher-order chromosome folding and specific chromatin interactions mediated by CCCTC-binding factor (CTCF) and RNA polymerase II (RNAPII) with haplotype specificity and nucleotide resolution in different human cell lineages. We find that CTCF/cohesin-mediated interaction anchors serve as structural foci for spatial organization of constitutive genes concordant with CTCF-motif orientation, whereas RNAPII interacts within these structures by selectively drawing cell-type-specific genes toward CTCF foci for coordinated transcription. Furthermore, we show that haplotype variants and allelic interactions have differential effects on chromosome configuration, influencing gene expression, and may provide mechanistic insights into functions associated with disease susceptibility. 3D genome simulation suggests a model of chromatin folding around chromosomal axes, where CTCF is involved in defining the interface between condensed and open compartments for structural regulation. Our 3D genome strategy thus provides unique insights in the topological mechanism of human variations and diseases. Copyright © 2015 Elsevier Inc. All rights reserved.
Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lykidis, Athanasios

2006-12-01

Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage-specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage-specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.
CMS: A Web-Based System for Visualization and Analysis of Genome-Wide Methylation Data of Human Cancers

PubMed Central

Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong

2013-01-01

Background: DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, genome-wide methylation profiles in cancer cells can now be determined efficiently. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings: Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance: CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576
CMS: a web-based system for visualization and analysis of genome-wide methylation data of human cancers.

PubMed

Gu, Fei; Doderer, Mark S; Huang, Yi-Wen; Roa, Juan C; Goodfellow, Paul J; Kizer, E Lynette; Huang, Tim H M; Chen, Yidong

2013-01-01

DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, genome-wide methylation profiles in cancer cells can now be determined efficiently. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/.
The genome of the vervet (Chlorocebus aethiops sabaeus)

PubMed Central

Warren, Wesley C.; Jasinska, Anna J.; García-Pérez, Raquel; Svardal, Hannes; Tomlinson, Chad; Rocchi, Mariano; Archidiacono, Nicoletta; Capozzi, Oronzo; Minx, Patrick; Montague, Michael J.; Kyung, Kim; Hillier, LaDeana W.; Kremitzki, Milinn; Graves, Tina; Chiang, Colby; Hughes, Jennifer; Tran, Nam; Huang, Yu; Ramensky, Vasily; Choi, Oi-wa; Jung, Yoon J.; Schmitt, Christopher A.; Juretic, Nikoleta; Wasserscheid, Jessica; Turner, Trudy R.; Wiseman, Roger W.; Tuscher, Jennifer J.; Karl, Julie A.; Schmitz, Jörn E.; Zahn, Roland; O'Connor, David H.; Redmond, Eugene; Nisbett, Alex; Jacquelin, Béatrice; Müller-Trutwin, Michaela C.; Brenchley, Jason M.; Dione, Michel; Antonio, Martin; Schroth, Gary P.; Kaplan, Jay R.; Jorgensen, Matthew J.; Thomas, Gregg W.C.; Hahn, Matthew W.; Raney, Brian J.; Aken, Bronwen; Nag, Rishi; Schmitz, Juergen; Churakov, Gennady; Noll, Angela; Stanyon, Roscoe; Webb, David; Thibaud-Nissen, Francoise; Nordborg, Magnus; Marques-Bonet, Tomas; Dewar, Ken; Weinstock, George M.; Wilson, Richard K.; Freimer, Nelson B.

2015-01-01

We describe a genome reference of the African green monkey or vervet (Chlorocebus aethiops). This member of the Old World monkey (OWM) superfamily is uniquely valuable for genetic investigations of simian immunodeficiency virus (SIV), for which it is the most abundant natural host species, and of a wide range of health-related phenotypes assessed in Caribbean vervets (C. a. sabaeus), whose numbers have expanded dramatically since Europeans introduced small numbers of their ancestors from West Africa during the colonial era. We use the reference to characterize the genomic relationship between vervets and other primates, the intra-generic phylogeny of vervet subspecies, and genome-wide structural variations of a pedigreed C. a. sabaeus population. Through comparative analyses with human and rhesus macaque, we characterize at high resolution the unique chromosomal fission events that differentiate the vervets and their close relatives from most other catarrhine primates, in whom karyotype is highly conserved. We also provide a summary of transposable elements and contrast these with the rhesus macaque and human. Analysis of sequenced genomes representing each of the main vervet subspecies supports previously hypothesized relationships between these populations, which range across most of sub-Saharan Africa, while uncovering high levels of genetic diversity within each. Sequence-based analyses of major histocompatibility complex (MHC) polymorphisms reveal extremely low diversity in Caribbean C. a. sabaeus vervets, compared to vervets from putatively ancestral West African regions. In the C. a. sabaeus research population, we discover the first structural variations that are, in some cases, predicted to have a deleterious effect; future studies will determine the phenotypic impact of these variations. PMID:26377836
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

PubMed Central

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-01-01

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191
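The AT-richness figure quoted in this record (about 64.3%) is a simple base-composition statistic. As a minimal sketch of how such a value is typically computed, and not the authors' actual pipeline, the hypothetical helper `at_fraction` below counts A and T among the unambiguous bases of a sequence; the function name and the choice to skip ambiguous symbols such as N are assumptions made for illustration.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T among unambiguous bases (A, C, G, T).

    Hypothetical helper for illustration; ambiguous symbols such as N
    are ignored rather than counted toward either class.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for base in counted if base in "AT")
    return at_count / len(counted)


# Toy sequence, not real Symbiodinium data: 4 of 6 bases are A or T.
toy = "AATTGC"
print(f"AT content: {at_fraction(toy):.1%}")
```

Applied to a full assembled genome, the same ratio multiplied by 100 gives the percentage reported in abstracts of this kind.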
Origin and evolution of SINEs in eukaryotic genomes.

PubMed

Kramerov, D A; Vassetzky, N S

2011-12-01

Short interspersed elements (SINEs) are one of the two most prolific mobile genomic elements in most of the higher eukaryotes. Although their biology is still not thoroughly understood, the unusual life cycle of these simple elements, amplified as genomic parasites, makes their evolution unique in many ways. In contrast to most genetic elements, including other transposons, SINEs emerged de novo many times in evolution from available molecules (for example, tRNA). The involvement of reverse transcription in their amplification cycle, the huge number of genomic copies and their modular structure allow variation mechanisms in SINEs that are uncommon or rare in other genetic elements (module exchange between SINE families, dimerization, and so on). Overall, SINE evolution includes their emergence, progressive optimization and counteraction to the cell's defense against mobile genetic elements.
The whole chloroplast genome of wild rice (Oryza australiensis).

PubMed

Wu, Zhiqiang; Ge, Song

2016-01-01

The whole chloroplast genome of wild rice (Oryza australiensis) is characterized in this study. The genome size is 135,224 bp, exhibiting a typical circular structure including a pair of 25,776 bp inverted repeats (IRa,b) separated by a large single-copy region (LSC) of 82,212 bp and a small single-copy region (SSC) of 12,470 bp. The overall GC content of the genome is 38.95%. 110 unique genes were annotated, including 76 protein-coding genes, 4 ribosomal RNA genes, and 30 tRNA genes. Among these, 18 are duplicated in the inverted repeat regions, 13 genes contain one intron, and 2 genes (rps12 and ycf3) have two introns.
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution.

PubMed

Yap, Jia-Yee S; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y H; Wilton, Alan; Wilkins, Marc R; Rossetto, Maurizio; Delaney, Sven K

2015-01-01

The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine.
Full genome sequence of Rocio virus reveals substantial variations from the prototype Rocio virus SPH 34675 sequence.

PubMed

Setoh, Yin Xiang; Amarilla, Alberto A; Peng, Nias Y; Slonchak, Andrii; Periasamy, Parthiban; Figueiredo, Luiz T M; Aquino, Victor H; Khromykh, Alexander A

2018-01-01

Rocio virus (ROCV) is an arbovirus belonging to the genus Flavivirus, family Flaviviridae. We present an updated sequence of ROCV strain SPH 34675 (GenBank: AY632542.4), the only available full genome sequence prior to this study. Using next-generation sequencing of the entire genome, we reveal substantial sequence variation from the prototype sequence, with 30 nucleotide differences amounting to 14 amino acid changes, as well as significant changes to predicted 3'UTR RNA structures. Our results present an updated and corrected sequence of a potential emerging human-virulent flavivirus uniquely indigenous to Brazil (GenBank: MF461639).
A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida.

PubMed

Armstrong, M R; Blok, V C; Phillips, M S

2000-01-01

The mitochondrial genome (mtDNA) of the plant parasitic nematode Globodera pallida exists as a population of small, circular DNAs that, taken individually, are of insufficient length to encode the typical metazoan mitochondrial gene complement. As far as we are aware, this unusual structural organization is unique among higher metazoans, although interesting comparisons can be made with the multipartite mitochondrial genome organizations of plants and fungi. The variation in frequency between populations displayed by some components of the mtDNA is likely to have major implications for the way in which mtDNA can be used in population and evolutionary genetic studies of G. pallida.
Bacterial CRISPR Regions: General Features and their Potential for Epidemiological Molecular Typing Studies.

PubMed

Karimi, Zahra; Ahmadi, Ali; Najafi, Ali; Ranjbar, Reza

2018-01-01

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, as novel and applicable regions in prokaryotic genomes, have attracted great attention in the post-genomics era. These unique regions are diverse in number and sequence composition in different pathogenic bacteria and can thereby be suitable candidates for molecular epidemiology and genotyping studies. Furthermore, the arrayed structure of CRISPR loci (several unique repeats spaced with variable sequences) and the associated cas genes act as an active prokaryotic immune system against viral replication and conjugative elements. This property can be used as a tool for RNA editing in bioengineering studies. The aim of this review was to survey some details about the history, nature, and potential applications of CRISPR arrays in both genetic engineering and bacterial genotyping studies.
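The arrayed layout described in this record, repeats separated by variable sequences, is what makes CRISPR loci usable for typing: the ordered list of spacers acts as a strain fingerprint. The following is a minimal sketch of that idea under simplifying assumptions, not a method taken from the review. The helper `extract_spacers`, the toy repeat and the toy array are all hypothetical, and the repeat is assumed to be known in advance and to recur exactly, which real arrays do not guarantee.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the spacers lying between consecutive exact copies of `repeat`.

    Hypothetical helper: assumes the repeat is known and occurs without
    mismatches. The flanking sequence before the first repeat and after
    the last repeat is discarded.
    """
    parts = array_seq.split(repeat)
    # parts[0] and parts[-1] are flanks; everything in between is a spacer.
    return [part for part in parts[1:-1] if part]


# Toy data for illustration only (not a real CRISPR locus).
REPEAT = "GTTTTAGAGC"
toy_array = "AA" + REPEAT + "ACGTACGTAC" + REPEAT + "TTGACCTGAA" + REPEAT + "GG"

spacers = extract_spacers(toy_array, REPEAT)
print(spacers)
```

Two isolates could then be compared by the order and content of their spacer lists; dedicated CRISPR-finding tools handle degenerate repeats, which this sketch deliberately ignores.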
Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

PubMed

Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

2016-07-12

Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The diploid genome sequence of an Asian individual

PubMed Central

Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

2009-01-01

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735
Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

PubMed Central

Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

2010-01-01

A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057
A tick-borne segmented RNA virus contains genome segments derived from unsegmented viral ancestors

PubMed Central

Qin, Xin-Cheng; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Gao, Dong-Ya; He, Jin-Rong; Wang, Jian-Bo; Li, Ci-Xiu; Kang, Yan-Jun; Yu, Bin; Zhou, Dun-Jin; Xu, Jianguo; Plyusnin, Alexander; Holmes, Edward C.; Zhang, Yong-Zhen

2014-01-01

Although segmented and unsegmented RNA viruses are commonplace, the evolutionary links between these two very different forms of genome organization are unclear. We report the discovery and characterization of a tick-borne virus, Jingmen tick virus (JMTV), that reveals an unexpected connection between segmented and unsegmented RNA viruses. The JMTV genome comprises four segments, two of which are related to the nonstructural protein genes of the genus Flavivirus (family Flaviviridae), whereas the remaining segments are unique to this virus, have no known homologs, and contain a number of features indicative of structural protein genes. Remarkably, homology searching revealed that sequences related to JMTV were present in the cDNA library from Toxocara canis (dog roundworm; Nematoda), and that these shared strong sequence and structural resemblances. Epidemiological studies showed that JMTV is distributed in tick populations across China, especially Rhipicephalus and Haemaphysalis spp., and experiences frequent host-switching and genomic reassortment. To our knowledge, JMTV is the first example of a segmented RNA virus with a genome derived in part from unsegmented viral ancestors. PMID:24753611
Functional interactions of archaea, bacteria and viruses in a hypersaline endolithic community.

PubMed

Crits-Christoph, Alexander; Gelsinger, Diego R; Ma, Bing; Wierzchos, Jacek; Ravel, Jacques; Davila, Alfonso; Casero, M Cristina; DiRuggiero, Jocelyne

2016-06-01

Halite endoliths in the Atacama Desert represent one of the most extreme ecosystems on Earth. Cultivation-independent methods were used to examine the functional adaptations of the microbial consortia inhabiting halite nodules. The community was dominated by haloarchaea and functional analysis attributed most of the autotrophic CO2 fixation to one unique cyanobacterium. The assembled 1.1 Mbp genome of a novel nanohaloarchaeon, Candidatus Nanopetramus SG9, revealed a photoheterotrophic life style and a low median isoelectric point (pI) for all predicted proteins, suggesting a 'salt-in' strategy for osmotic balance. Predicted proteins of the algae identified in the community also had pI distributions similar to 'salt-in' strategists. The Nanopetramus genome contained a unique CRISPR/Cas system with a spacer that matched a partial viral genome from the metagenome. A combination of reference-independent methods identified over 30 complete or near complete viral or proviral genomes with diverse genome structure, genome size, gene content and hosts. Putative hosts included Halobacteriaceae, Nanohaloarchaea and Cyanobacteria. Despite the dependence of the halite community on deliquescence for liquid water availability, this study exposed an ecosystem spanning three phylogenetic domains, containing a large diversity of viruses and predominance of a 'salt-in' strategy to balance the high osmotic pressure of the environment. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Three Infectious Viral Species Lying in Wait in the Banana Genome

PubMed Central

Chabannes, Matthieu; Baurens, Franc-Christophe; Duroy, Pierre-Olivier; Bocs, Stéphanie; Vernerey, Marie-Stéphanie; Rodier-Goud, Marguerite; Barbe, Valérie; Gayral, Philippe

2013-01-01

Plant pararetroviruses integrate serendipitously into their host genomes. The banana genome harbors integrated copies of banana streak virus (BSV) named endogenous BSV (eBSV) that are able to release infectious pararetrovirus. In this investigation, we characterized integrants of three BSV species—Goldfinger (eBSGFV), Imove (eBSImV), and Obino l'Ewai (eBSOLV)—in the seedy Musa balbisiana Pisang klutuk wulung (PKW) by studying their molecular structure, genomic organization, genomic landscape, and infectious capacity. All eBSVs exhibit extensive viral genome duplications and rearrangements. eBSV segregation analysis on an F1 population of PKW combined with fluorescent in situ hybridization analysis showed that eBSImV, eBSOLV, and eBSGFV are each present at a single locus. eBSOLV and eBSGFV contain two distinct alleles, whereas eBSImV has two structurally identical alleles. Genotyping of both eBSV and viral particles expressed in the progeny demonstrated that only one allele for each species is infectious. The infectious allele of eBSImV could not be identified since the two alleles are identical. Finally, we demonstrate that eBSGFV and eBSOLV are located on chromosome 1 and eBSImV is located on chromosome 2 of the reference Musa genome published recently. The structure and evolution of eBSVs suggest sequential integration into the plant genome, and haplotype divergence analysis confirms that the three loci display differential evolution. Based on our data, we propose a model for BSV integration and eBSV evolution in the Musa balbisiana genome. The mutual benefits of this unique host-pathogen association are also discussed. PMID:23720724
A novel lineage of myoviruses infecting cyanobacteria is widespread in the oceans.

PubMed

Sabehi, Gazalah; Shaulov, Lihi; Silver, David H; Yanai, Itai; Harel, Amnon; Lindell, Debbie

2012-02-07

Viruses infecting bacteria (phages) are thought to greatly impact microbial population dynamics as well as the genome diversity and evolution of their hosts. Here we report on the discovery of a novel lineage of tailed dsDNA phages belonging to the family Myoviridae and describe its first representative, S-TIM5, that infects the ubiquitous marine cyanobacterium, Synechococcus. The genome of this phage encodes an entirely unique set of structural proteins not found in any currently known phage, indicating that it uses lineage-specific genes for virion morphogenesis and represents a previously unknown lineage of myoviruses. Furthermore, among its distinctive collection of replication and DNA metabolism genes, it carries a mitochondrial-like DNA polymerase gene, providing strong evidence for the bacteriophage origin of the mitochondrial DNA polymerase. S-TIM5 also encodes an array of bacterial-like metabolism genes commonly found in phages infecting cyanobacteria including photosynthesis, carbon metabolism and phosphorus acquisition genes. This suggests a common gene pool and gene swapping of cyanophage-specific genes among different phage lineages despite distinct sets of structural and replication genes. All cytosines following purine nucleotides are methylated in the S-TIM5 genome, constituting a unique methylation pattern that likely protects the genome from nuclease degradation. This phage is abundant in the Red Sea and S-TIM5 gene homologs are widespread in the oceans. This unusual phage type is thus likely to be an important player in the oceans, impacting the population dynamics and evolution of their primary producing cyanobacterial hosts.
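The methylation pattern reported in this record, every cytosine that follows a purine, amounts to a simple positional rule on the sequence. As an illustrative sketch only, and not the analysis used by the authors, the hypothetical function `methylated_cytosine_positions` below lists the cytosines on a single strand that are immediately preceded by A or G. Reading only one strand in the 5' to 3' direction is an assumption made to keep the example short.

```python
PURINES = {"A", "G"}


def methylated_cytosine_positions(seq: str) -> list[int]:
    """Return 0-based indices of each C immediately preceded by a purine.

    Hypothetical helper for illustration; it scans one strand only and
    treats the input as written 5' to 3'.
    """
    seq = seq.upper()
    return [
        i
        for i in range(1, len(seq))
        if seq[i] == "C" and seq[i - 1] in PURINES
    ]


# Toy sequence, not taken from the S-TIM5 genome.
toy = "ACGCTCCAC"
print(methylated_cytosine_positions(toy))  # [1, 3, 8]
```

In the toy sequence, the cytosines at positions 5 and 6 follow pyrimidines (T and C) and so fall outside the rule, while those at positions 1, 3 and 8 follow A or G.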

Revolting Developments in Our Understanding of the Organization of the Eukaryotic Genome.

ERIC Educational Resources Information Center

Krider, Hallie M.

1984-01-01

Various types of DNA are discussed. Areas considered include highly repetitive and satellite sequences, genes encoding ribosomal RNA, histone protein genes, and dispersed repeated genes that jump. Regulated genetic misbehavior, structure and use of unique genes, and higher order complexities of chromosomes are also discussed. (JN)
42 CFR 423.578 - Exceptions process.

Code of Federal Regulations, 2010 CFR

2010-10-01

... cost-sharing structure. Each Part D plan sponsor that provides prescription drug benefits for Part D... sponsor required to cover a non-preferred drug at the generic drug cost-sharing level if the plan... tier in which it places very high cost and unique items, such as genomic and biotech products, the...
Primary structural variation in anaplasma marginale Msp2 efficiently generates immune escape variants

USDA-ARS's Scientific Manuscript database

Antigenic variation allows microbial pathogens to evade immune clearance and establish persistent infection. Anaplasma marginale utilizes gene conversion of a repertoire of silent msp2 alleles into a single active expression site to encode unique Msp2 variants. As the genomic complement of msp2 alle...
Distinct DNA exit and packaging portals in the virus Acanthamoeba polyphaga mimivirus.

PubMed

Zauberman, Nathan; Mutsafi, Yael; Halevy, Daniel Ben; Shimoni, Eyal; Klein, Eugenia; Xiao, Chuan; Sun, Siyang; Minsky, Abraham

2008-05-13

Icosahedral double-stranded DNA viruses use a single portal for genome delivery and packaging. The extensive structural similarity revealed by such portals in diverse viruses, as well as their invariable positioning at a unique icosahedral vertex, led to the consensus that a particular, highly conserved vertex-portal architecture is essential for viral DNA translocations. Here we present an exception to this paradigm by demonstrating that genome delivery and packaging in the virus Acanthamoeba polyphaga mimivirus occur through two distinct portals. By using high-resolution techniques, including electron tomography and cryo-scanning electron microscopy, we show that Mimivirus genome delivery entails a large-scale conformational change of the capsid, whereby five icosahedral faces open up. This opening, which occurs at a unique vertex of the capsid that we coined the "stargate", allows for the formation of a massive membrane conduit through which the viral DNA is released. A transient aperture centered at an icosahedral face distal to the DNA delivery site acts as a non-vertex DNA packaging portal. In conjunction with comparative genomic studies, our observations imply a viral packaging pathway akin to bacterial DNA segregation, which might be shared by diverse internal membrane-containing viruses.
Distinct DNA Exit and Packaging Portals in the Virus Acanthamoeba polyphaga mimivirus

PubMed Central

Zauberman, Nathan; Mutsafi, Yael; Halevy, Daniel Ben; Shimoni, Eyal; Klein, Eugenia; Xiao, Chuan; Sun, Siyang; Minsky, Abraham

2008-01-01

Icosahedral double-stranded DNA viruses use a single portal for genome delivery and packaging. The extensive structural similarity revealed by such portals in diverse viruses, as well as their invariable positioning at a unique icosahedral vertex, led to the consensus that a particular, highly conserved vertex-portal architecture is essential for viral DNA translocations. Here we present an exception to this paradigm by demonstrating that genome delivery and packaging in the virus Acanthamoeba polyphaga mimivirus occur through two distinct portals. By using high-resolution techniques, including electron tomography and cryo-scanning electron microscopy, we show that Mimivirus genome delivery entails a large-scale conformational change of the capsid, whereby five icosahedral faces open up. This opening, which occurs at a unique vertex of the capsid that we coined the “stargate”, allows for the formation of a massive membrane conduit through which the viral DNA is released. A transient aperture centered at an icosahedral face distal to the DNA delivery site acts as a non-vertex DNA packaging portal. In conjunction with comparative genomic studies, our observations imply a viral packaging pathway akin to bacterial DNA segregation, which might be shared by diverse internal membrane–containing viruses. PMID:18479185
Identifying structural variation in haploid microbial genomes from short-read resequencing data using breseq.

PubMed

Barrick, Jeffrey E; Colburn, Geoffrey; Deatherage, Daniel E; Traverse, Charles C; Strand, Matthew D; Borges, Jordan J; Knoester, David B; Reba, Aaron; Meyer, Austin G

2014-11-29

Mutations that alter chromosomal structure play critical roles in evolution and disease, including in the origin of new lifestyles and pathogenic traits in microbes. Large-scale rearrangements in genomes are often mediated by recombination events involving new or existing copies of mobile genetic elements, recently duplicated genes, or other repetitive sequences. Most current software programs for predicting structural variation from short-read DNA resequencing data are intended primarily for use on human genomes. They typically disregard information in reads mapping to repeat sequences, and significant post-processing and manual examination of their output is often required to rule out false-positive predictions and precisely describe mutational events. We have implemented an algorithm for identifying structural variation from DNA resequencing data as part of the breseq computational pipeline for predicting mutations in haploid microbial genomes. Our method evaluates the support for new sequence junctions present in a clonal sample from split-read alignments to a reference genome, including matches to repeat sequences. Then, it uses a statistical model of read coverage evenness to accept or reject these predictions. Finally, breseq combines predictions of new junctions and deleted chromosomal regions to output biologically relevant descriptions of mutations and their effects on genes. We demonstrate the performance of breseq on simulated Escherichia coli genomes with deletions generating unique breakpoint sequences, new insertions of mobile genetic elements, and deletions mediated by mobile elements. Then, we reanalyze data from an E. coli K-12 mutation accumulation evolution experiment in which structural variation was not previously identified. Transposon insertions and large-scale chromosomal changes detected by breseq account for ~25% of spontaneous mutations in this strain. 
In all cases, we find that breseq is able to reliably predict structural variation with modest read-depth coverage of the reference genome (>40-fold). Using breseq to predict structural variation should be useful for studies of microbial epidemiology, experimental evolution, synthetic biology, and genetics when a reference genome for a closely related strain is available. In these cases, breseq can discover mutations that may be responsible for important or unintended changes in genomes that might otherwise go undetected.
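One ingredient mentioned in this abstract, detecting deleted chromosomal regions from read coverage, can be sketched in a very simplified form. The code below is not breseq's algorithm and does not use its statistical coverage model; it merely reports runs of zero read depth, and the function name `zero_coverage_runs`, the `min_len` parameter, and the depth values are assumptions made for illustration.

```python
def zero_coverage_runs(coverage, min_len=1):
    """Return (start, end) intervals, 0-based and end-exclusive, where depth is 0.

    Toy stand-in for missing-coverage detection; runs shorter than min_len
    are ignored. Not the breseq method.
    """
    runs = []
    start = None
    for i, depth in enumerate(coverage):
        if depth == 0:
            if start is None:
                start = i  # a new zero-depth run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the reference.
    if start is not None and len(coverage) - start >= min_len:
        runs.append((start, len(coverage)))
    return runs
```

With per-base depths `[40, 41, 0, 0, 0, 39, 0, 42, 0, 0]` and `min_len=2`, the sketch returns the intervals `(2, 5)` and `(8, 10)` and ignores the isolated zero at position 6.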
Global MLST of Salmonella Typhi Revisited in Post-genomic Era: Genetic Conservation, Population Structure, and Comparative Genomics of Rare Sequence Types.

PubMed

Yap, Kien-Pong; Ho, Wing S; Gan, Han M; Chai, Lay C; Thong, Kwai L

2016-01-01

Typhoid fever, caused by Salmonella enterica serovar Typhi, remains an important public health burden in Southeast Asia and other endemic countries. Various genotyping methods have been applied to study the genetic variations of this human-restricted pathogen. Multilocus sequence typing (MLST) is one of the widely accepted methods, and recently, there is a growing interest in the re-application of MLST in the post-genomic era. In this study, we provide the global MLST distribution of S. Typhi utilizing both publicly available 1,826 S. Typhi genome sequences in addition to performing conventional MLST on S. Typhi strains isolated from various endemic regions spanning over a century. Our global MLST analysis confirms the predominance of two sequence types (ST1 and ST2) co-existing in the endemic regions. Interestingly, S. Typhi strains with ST8 are currently confined within the African continent. Comparative genomic analyses of ST8 and other rare STs with genomes of ST1/ST2 revealed unique mutations in important virulence genes such as flhB, sipC, and tviD that may explain the variations that differentiate between seemingly successful (widespread) and unsuccessful (poor dissemination) S. Typhi populations. Large scale whole-genome phylogeny demonstrated evidence of phylogeographical structuring and showed that ST8 may have diverged from the earlier ancestral population of ST1 and ST2, which later lost some of its fitness advantages, leading to poor worldwide dissemination. In response to the unprecedented increase in genomic data, this study demonstrates and highlights the utility of large-scale genome-based MLST as a quick and effective approach to narrow the scope of in-depth comparative genomic analysis and consequently provide new insights into the fine scale of pathogen evolution and population structure.
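The core lookup step of genome-based MLST, mapping an allele profile to a sequence type, can be sketched as follows. The locus names follow the seven-gene scheme commonly used for Salmonella; the allele numbers and the labels `ST-A` and `ST-B` are placeholders invented for this example, and the function name `assign_st` is hypothetical. A real analysis would take its profiles from a curated MLST database.

```python
# Seven housekeeping loci of the commonly used Salmonella MLST scheme.
LOCI = ("aroC", "dnaN", "hemD", "hisD", "purE", "sucA", "thrA")

# Placeholder profile table: allele numbers and ST labels are made up.
PROFILE_TO_ST = {
    (1, 2, 3, 4, 5, 6, 7): "ST-A",
    (1, 2, 9, 4, 5, 6, 7): "ST-B",
}


def assign_st(alleles):
    """Map a dict of locus -> allele number to an ST label.

    Returns None when the profile is not in the table (a possible novel ST).
    """
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILE_TO_ST.get(profile)
```

Because the two placeholder profiles differ at a single locus (`hemD`), the sketch also shows why one allele change is enough to produce a different sequence type.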
The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

PubMed

Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

2015-01-01

A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
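The polypurine definition of a TTS given in this abstract lends itself to a small illustration. The sketch below scans one strand for runs of purines (A or G) of a minimum length; the function name `find_polypurine_tracts`, the 15-nt default, and the test sequence are assumptions for illustration only. TTSMI itself applies further stability and specificity criteria that are not modelled here.

```python
import re


def find_polypurine_tracts(seq, min_len=15):
    """Return (start, end, tract) for each maximal A/G run of at least min_len.

    Coordinates are 0-based and end-exclusive; only the given strand is scanned.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [
        (m.start(), m.end(), m.group())
        for m in pattern.finditer(seq.upper())
    ]
```

For example, in `"CT" + "AG" * 8 + "TC" + "AAGG"` the 16-nt purine run is reported, while the trailing 4-nt run falls below the threshold.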
Weird Animals, Sex, and Genome Evolution.

PubMed

Graves, Jennifer A Marshall

2018-02-15

Making my career in Australia exposed me to the tyranny of distance, but it gave me opportunities to study our unique native fauna. Distantly related animal species present genetic variation that we can use to explore the most fundamental biological structures and processes. I have compared chromosomes and genomes of kangaroos and platypus, tiger snakes and emus, devils (Tasmanian) and dragons (lizards). I particularly love the challenges posed by sex chromosomes, which, apart from determining sex, provide stunning examples of epigenetic control and break all the evolutionary rules that we currently understand. Here I describe some of those amazing animals and the insights on genome structure, function, and evolution they have afforded us. I also describe my sometimes-random walk in science and the factors and people who influenced my direction. Being a woman in science is still not easy, and I hope others will find encouragement and empathy in my story.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

PubMed

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-07-20

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Origin and evolution of SINEs in eukaryotic genomes

PubMed Central

Kramerov, D A; Vassetzky, N S

2011-01-01

Short interspersed elements (SINEs) are one of the two most prolific mobile genomic elements in most of the higher eukaryotes. Although their biology is still not thoroughly understood, the unusual life cycle of these simple elements, amplified as genomic parasites, makes their evolution unique in many ways. In contrast to most genetic elements including other transposons, SINEs emerged de novo many times in evolution from available molecules (for example, tRNA). The involvement of reverse transcription in their amplification cycle, the huge number of genomic copies and their modular structure allow variation mechanisms in SINEs that are uncommon or rare in other genetic elements (module exchange between SINE families, dimerization, and so on). Overall, SINE evolution includes their emergence, progressive optimization and counteraction to the cell's defense against mobile genetic elements. PMID:21673742
Virulence factors encoded by Legionella longbeachae identified on the basis of the genome sequence analysis of clinical isolate D-4968.

PubMed

Kozak, Natalia A; Buss, Meghan; Lucas, Claressa E; Frace, Michael; Govil, Dhwani; Travis, Tatiana; Olsen-Rasmussen, Melissa; Benson, Robert F; Fields, Barry S

2010-02-01

Legionella longbeachae causes most cases of legionellosis in Australia and may be underreported worldwide due to the lack of L. longbeachae-specific diagnostic tests. L. longbeachae displays distinctive differences in intracellular trafficking, caspase 1 activation, and infection in mouse models compared to Legionella pneumophila, yet these two species have indistinguishable clinical presentations in humans. Unlike other legionellae, which inhabit freshwater systems, L. longbeachae is found predominantly in moist soil. In this study, we sequenced and annotated the genome of an L. longbeachae clinical isolate from Oregon, isolate D-4968, and compared it to the previously published genomes of L. pneumophila. The results revealed that the D-4968 genome is larger than the L. pneumophila genome and has a gene order that is different from that of the L. pneumophila genome. Genes encoding structural components of type II, type IV Lvh, and type IV Icm/Dot secretion systems are conserved. In contrast, only 42/140 homologs of genes encoding L. pneumophila Icm/Dot substrates have been found in the D-4968 genome. L. longbeachae encodes numerous proteins with eukaryotic motifs and eukaryote-like proteins unique to this species, including 16 ankyrin repeat-containing proteins and a novel U-box protein. We predict that these proteins are secreted by the L. longbeachae Icm/Dot secretion system. In contrast to the L. pneumophila genome, the L. longbeachae D-4968 genome does not contain flagellar biosynthesis genes, yet it contains a chemotaxis operon. The lack of a flagellum explains the failure of L. longbeachae to activate caspase 1 and trigger pyroptosis in murine macrophages. These unique features of L. longbeachae may reflect adaptation of this species to life in soil.
Pharmacogenomics in the preclinical development of vaccines: evaluation of efficacy and systemic toxicity in the mouse using array technology.

PubMed

Regnström, Karin J

2008-01-01

The development of vaccines, conventional protein based as well as nucleic acid based vaccines, and their delivery systems has been largely empirical and ineffective. This is partly due to a lack of methodology, since traditionally only a few markers are studied. By introducing gene expression analysis and bioinformatics into the design of vaccines and their delivery systems, vaccine development can be improved and accelerated considerably. Each vaccine antigen and delivery system combination is characterized by a unique genomic profile, a "fingerprint" that will give information of not only immunological and toxicological responses but also other related cellular responses e.g. cell cycle, apoptosis and carcinogenic effects. The resulting unique genomic fingerprint facilitates the establishment of molecular structure--pharmacological activity relationships and therefore leads to optimization of vaccine development.
Bacterial CRISPR Regions: General Features and their Potential for Epidemiological Molecular Typing Studies

PubMed Central

Karimi, Zahra; Ahmadi, Ali; Najafi, Ali; Ranjbar, Reza

2018-01-01

Introduction: CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, as novel and applicable regions in prokaryotic genomes, have attracted great attention in the post-genomics era. Methods: These unique regions are diverse in number and sequence composition in different pathogenic bacteria and thereby can be a suitable candidate for molecular epidemiology and genotyping studies. Results: Furthermore, the arrayed structure of CRISPR loci (several unique repeats spaced with the variable sequence) and associated cas genes act as an active prokaryotic immune system against viral replication and conjugative elements. This property can be used as a tool for RNA editing in bioengineering studies. Conclusion: The aim of this review was to survey some details about the history, nature, and potential applications of CRISPR arrays in both genetic engineering and bacterial genotyping studies. PMID:29755603
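The arrayed repeat-spacer layout described in this abstract is what makes CRISPR loci usable for typing: the spacers between repeats differ between strains. A minimal sketch of pulling spacers out of an array is shown below. It assumes the repeat sequence is already known and matches exactly, which real arrays often violate; the function name `extract_spacers` and the sequences are invented for illustration.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacer sequences lying between consecutive exact repeats.

    Assumes exact, non-degenerate repeats (a simplification). Sequence before
    the first repeat and after the last repeat is treated as flanking DNA.
    """
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] is the leader side, parts[-1] the trailer side.
    return parts[1:-1] if len(parts) >= 3 else []
```

Two isolates could then be compared simply by comparing their spacer lists, for instance by checking which spacers are shared and in what order.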
Complete Chloroplast Genome of the Wollemi Pine (Wollemia nobilis): Structure and Evolution

PubMed Central

Yap, Jia-Yee S.; Rohner, Thore; Greenfield, Abigail; Van Der Merwe, Marlien; McPherson, Hannah; Glenn, Wendy; Kornfeld, Geoff; Marendy, Elessa; Pan, Annie Y. H.; Wilkins, Marc R.; Rossetto, Maurizio; Delaney, Sven K.

2015-01-01

The Wollemi pine (Wollemia nobilis) is a rare Southern conifer with striking morphological similarity to fossil pines. A small population of W. nobilis was discovered in 1994 in a remote canyon system in the Wollemi National Park (near Sydney, Australia). This population contains fewer than 100 individuals and is critically endangered. Previous genetic studies of the Wollemi pine have investigated its evolutionary relationship with other pines in the family Araucariaceae, and have suggested that the Wollemi pine genome contains little or no variation. However, these studies were performed prior to the widespread use of genome sequencing, and their conclusions were based on a limited fraction of the Wollemi pine genome. In this study, we address this problem by determining the entire sequence of the W. nobilis chloroplast genome. A detailed analysis of the structure of the genome is presented, and the evolution of the genome is inferred by comparison with the chloroplast sequences of other members of the Araucariaceae and the related family Podocarpaceae. Pairwise alignments of whole genome sequences, and the presence of unique pseudogenes, gene duplications and insertions in W. nobilis and Araucariaceae, indicate that the W. nobilis chloroplast genome is most similar to that of its sister taxon Agathis. However, the W. nobilis genome contains an unusually high number of repetitive sequences, and these could be used in future studies to investigate and conserve any remnant genetic diversity in the Wollemi pine. PMID:26061691
Herbicide targets and detoxification proteins in sugarcane: from gene assembly to structure modelling.

PubMed

Lloyd Evans, Dyfed; Joshi, Shailesh Vinay

2017-07-01

In a genome context, sugarcane is a classic orphan crop, in that no genome and only very few genes have been assembled. We have devised a novel exome assembly methodology that has allowed us to assemble and characterize 49 genes that serve as herbicide targets, safener interacting proteins, and members of herbicide detoxification pathways within the sugarcane genome. We have structurally modelled the products of each of these genes, as well as determining allelic, genomic, and RNA-Seq based polymorphisms for each gene. This study provides the largest collection of sugarcane structures modelled to date. We demonstrate that sugarcane genes are highly polymorphic, revealing that each genotype is evolving both uniquely and independently. In addition, we present an exome assembly system for orphan crops that can be executed on commodity infrastructure, making exome assembly practical for any group. In terms of knowledge about herbicide modes of action and detoxification, we have advanced sugarcane from a crop where no information about any herbicide-associated gene was available to the situation where sugarcane is now a species with the single largest collection of known and annotated herbicide-associated genes.
Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

PubMed

Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

2018-02-15

Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.
Mapping the yeast genome by melting in nanofluidic devices

NASA Astrophysics Data System (ADS)

Welch, Robert L.; Czolkos, Ilja; Sladek, Rob; Reisner, Walter

2012-02-01

Optical mapping of DNA provides large-scale genomic information that can be used to assemble contigs from next-generation sequencing, and to detect re-arrangements between single cells. A recent optical mapping technique called denaturation mapping has the unique advantage of using physical principles rather than the action of enzymes to probe genomic structure. The absence of reagents or reaction steps makes denaturation mapping simpler than other protocols. Denaturation mapping uses fluorescence microscopy to image the pattern of partial melting along a DNA molecule extended in a channel of cross-section ~100 nm at the heart of a nanofluidic device. We successfully aligned melting maps from single DNA molecules to a theoretical map of the yeast genome (11.6 Mbp) to identify their location. By aligning hundreds of molecules we assembled a consensus melting map of the yeast genome with 95% coverage.
Proteomic analysis of skeletal organic matrix from the stony coral Stylophora pistillata

PubMed Central

Drake, Jeana L.; Mass, Tali; Haramaty, Liti; Zelzion, Ehud; Bhattacharya, Debashish; Falkowski, Paul G.

2013-01-01

It has long been recognized that a suite of proteins exists in coral skeletons that is critical for the oriented precipitation of calcium carbonate crystals, yet these proteins remain poorly characterized. Using liquid chromatography-tandem mass spectrometry analysis of proteins extracted from the cell-free skeleton of the hermatypic coral, Stylophora pistillata, combined with a draft genome assembly from the cnidarian host cells of the same species, we identified 36 coral skeletal organic matrix proteins. The proteome of the coral skeleton contains an assemblage of adhesion and structural proteins as well as two highly acidic proteins that may constitute a unique coral skeletal organic matrix protein subfamily. We compared the 36 skeletal organic matrix protein sequences to genome and transcriptome data from three other corals, three additional invertebrates, one vertebrate, and three single-celled organisms. This work represents a unique extensive proteomic analysis of biomineralization-related proteins in corals from which we identify a biomineralization “toolkit,” an organic scaffold upon which aragonite crystals can be deposited in specific orientations to form a phenotypically identifiable structure. PMID:23431140
Structure and variation of the mitochondrial genome of fishes.

PubMed

Satoh, Takashi P; Miya, Masaki; Mabuchi, Kohji; Nishida, Mutsumi

2016-09-07

The mitochondrial (mt) genome has been used as an effective tool for phylogenetic and population genetic analyses in vertebrates. However, the structure and variability of the vertebrate mt genome are not well understood. A potential strategy for improving our understanding is to conduct a comprehensive comparative study of large mt genome data. The aim of this study was to characterize the structure and variability of the fish mt genome through comparative analysis of large datasets. An analysis of the secondary structure of proteins for 250 fish species (248 ray-finned and 2 cartilaginous fishes) illustrated that cytochrome c oxidase subunits (COI, COII, and COIII) and a cytochrome bc1 complex subunit (Cyt b) had substantial amino acid conservation. Among the four proteins, COI was the most conserved, as more than half of all amino acid sites were invariable among the 250 species. Our models identified 43 and 58 stems within 12S rRNA and 16S rRNA, respectively, with larger numbers than proposed previously for vertebrates. The models also identified 149 and 319 invariable sites in 12S rRNA and 16S rRNA, respectively, in all fishes. In particular, the present result verified that a region corresponding to the peptidyl transferase center in prokaryotic 23S rRNA, which is homologous to mt 16S rRNA, is also conserved in fish mt 16S rRNA. Concerning the gene order, we found 35 variations (in 32 families) that deviated from the common gene order in vertebrates. These gene rearrangements were mostly observed in the area spanning the ND5 gene to the control region as well as two tRNA gene cluster regions (IQM and WANCY regions). Although many of such gene rearrangements were unique to a specific taxon, some were shared polyphyletically between distantly related species. Through a large-scale comparative analysis of 250 fish species mt genomes, we elucidated various structural aspects of the fish mt genome and the encoded genes. 
The present results will be important for understanding functions of the mt genome and developing programs for nucleotide sequence analysis. This study demonstrated the significance of extensive comparisons for understanding the structure of the mt genome.
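Counting invariable sites, as reported above for COI and the rRNA genes, reduces to checking each alignment column for a single shared residue. The sketch below shows that idea on a toy alignment; the function name `count_invariable_sites`, the choice to exclude gap-only columns, and the example sequences are assumptions, not the authors' procedure.

```python
def count_invariable_sites(aligned_seqs):
    """Count alignment columns in which every sequence has the same non-gap residue."""
    if not aligned_seqs:
        return 0
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    count = 0
    for column in zip(*aligned_seqs):
        residues = set(column)
        # One distinct residue across all sequences, and it is not a gap.
        if len(residues) == 1 and "-" not in residues:
            count += 1
    return count
```

For the toy alignment `["ACGT-A", "ACGA-A", "ACCT-A"]`, three columns are invariable; the all-gap column is not counted.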

Structural, functional and evolutionary relationships between homing endonucleases and proteins from their host organisms

PubMed Central

Taylor, Gregory K.; Stoddard, Barry L.

2012-01-01

Homing endonucleases (HEs) are highly specific DNA-cleaving enzymes that are encoded by invasive DNA elements (usually mobile introns or inteins) within the genomes of phage, bacteria, archaea, protista and eukaryotic organelles. Six unique structural HE families, which collectively span four distinct nuclease catalytic motifs, have been characterized to date. Members of each family display structural homology and functional relationships to a wide variety of proteins from various organisms. The biological functions of those proteins are highly disparate and include non-specific DNA-degradation enzymes, restriction endonucleases, DNA-repair enzymes, resolvases, intron splicing factors and transcription factors. These relationships suggest that modern day HEs share common ancestors with proteins involved in genome fidelity, maintenance and gene expression. This review summarizes the results of structural studies of HEs and corresponding proteins from host organisms that have illustrated the manner in which these factors are related. PMID:22406833
NordicDB: a Nordic pool and portal for genome-wide control data.

PubMed

Leu, Monica; Humphreys, Keith; Surakka, Ida; Rehnberg, Emil; Muilu, Juha; Rosenström, Päivi; Almgren, Peter; Jääskeläinen, Juha; Lifton, Richard P; Kyvik, Kirsten Ohm; Kaprio, Jaakko; Pedersen, Nancy L; Palotie, Aarno; Hall, Per; Grönberg, Henrik; Groop, Leif; Peltonen, Leena; Palmgren, Juni; Ripatti, Samuli

2010-12-01

A cost-efficient way to increase power in a genetic association study is to pool controls from different sources. The genotyping effort can then be directed to large case series. The Nordic Control database, NordicDB, has been set up as a unique resource in the Nordic area and the data are available for authorized users through the web portal (http://www.nordicdb.org). The current version of NordicDB pools together high-density genome-wide SNP information from ∼5000 controls originating from Finnish, Swedish and Danish studies and shows country-specific allele frequencies for SNP markers. The genetic homogeneity of the samples was investigated using multidimensional scaling (MDS) analysis and pairwise allele frequency differences between the studies. The plot of the first two MDS components showed excellent resemblance to the geographical placement of the samples, with a clear NW-SE gradient. We advise researchers to assess the impact of population structure when incorporating NordicDB controls in association studies. This harmonized Nordic database presents a unique genome-wide resource for future genetic association studies in the Nordic countries.
NordicDB: a Nordic pool and portal for genome-wide control data

PubMed Central

Leu, Monica; Humphreys, Keith; Surakka, Ida; Rehnberg, Emil; Muilu, Juha; Rosenström, Päivi; Almgren, Peter; Jääskeläinen, Juha; Lifton, Richard P; Kyvik, Kirsten Ohm; Kaprio, Jaakko; Pedersen, Nancy L; Palotie, Aarno; Hall, Per; Grönberg, Henrik; Groop, Leif; Peltonen, Leena; Palmgren, Juni; Ripatti, Samuli

2010-01-01

A cost-efficient way to increase power in a genetic association study is to pool controls from different sources. The genotyping effort can then be directed to large case series. The Nordic Control database, NordicDB, has been set up as a unique resource in the Nordic area and the data are available for authorized users through the web portal (http://www.nordicdb.org). The current version of NordicDB pools together high-density genome-wide SNP information from ∼5000 controls originating from Finnish, Swedish and Danish studies and shows country-specific allele frequencies for SNP markers. The genetic homogeneity of the samples was investigated using multidimensional scaling (MDS) analysis and pairwise allele frequency differences between the studies. The plot of the first two MDS components showed excellent resemblance to the geographical placement of the samples, with a clear NW–SE gradient. We advise researchers to assess the impact of population structure when incorporating NordicDB controls in association studies. This harmonized Nordic database presents a unique genome-wide resource for future genetic association studies in the Nordic countries. PMID:20664631
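The pairwise allele frequency comparison mentioned in the abstract can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not the NordicDB implementation: the study names, the genotype counts, and the function `allele_a_frequency` are all hypothetical, and a single biallelic SNP stands in for genome-wide data.

```python
from itertools import combinations

# Hypothetical genotype counts (AA, AB, BB) for one SNP in three control sets.
# These numbers are invented for illustration and do not come from NordicDB.
genotype_counts = {
    "study_A": (360, 480, 160),
    "study_B": (250, 500, 250),
    "study_C": (490, 420, 90),
}


def allele_a_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Frequency of allele A: each AA genotype carries two copies, each AB one."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    return (2 * n_aa + n_ab) / total_alleles


# Per-study allele frequency, then absolute difference for every pair of studies.
freqs = {name: allele_a_frequency(*counts) for name, counts in genotype_counts.items()}
diffs = {(a, b): abs(freqs[a] - freqs[b]) for a, b in combinations(freqs, 2)}

for pair, diff in diffs.items():
    print(pair, round(diff, 3))
```

Large differences at many markers between two control sets would be one sign of the population structure that the authors advise researchers to assess before pooling.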
Genomic Diversity in the Endosymbiotic Bacterium Rhizobium leguminosarum.

PubMed

Sánchez-Cañizares, Carmen; Jorrín, Beatriz; Durán, David; Nadendla, Suvarna; Albareda, Marta; Rubio-Sanz, Laura; Lanza, Mónica; González-Guerrero, Manuel; Prieto, Rosa Isabel; Brito, Belén; Giglio, Michelle G; Rey, Luis; Ruiz-Argüeso, Tomás; Palacios, José M; Imperial, Juan

2018-01-24

Rhizobium leguminosarum bv. viciae is a soil α-proteobacterium that establishes a diazotrophic symbiosis with different legumes of the Fabeae tribe. The number of genome sequences from rhizobial strains available in public databases is constantly increasing, although complete, fully annotated genome structures from rhizobial genomes are scarce. In this work, we report and analyse the complete genome of R. leguminosarum bv. viciae UPM791. Whole genome sequencing can provide new insights into the genetic features contributing to symbiotically relevant processes such as bacterial adaptation to the rhizosphere, mechanisms for efficient competition with other bacteria, and the ability to establish a complex signalling dialogue with legumes, to enter the root without triggering plant defenses, and, ultimately, to fix nitrogen within the host. Comparison of the complete genome sequences of two strains of R. leguminosarum bv. viciae, 3841 and UPM791, highlights the existence of different symbiotic plasmids and a common core chromosome. Specific genomic traits, such as plasmid content or a distinctive regulation, define differential physiological capabilities of these endosymbionts. Among them, strain UPM791 presents unique adaptations for recycling the hydrogen generated in the nitrogen fixation process.
Cryo-electron Microscopy Study of the Genome Release of the Dicistrovirus Israeli Acute Bee Paralysis Virus.

PubMed

Mullapudi, Edukondalu; Füzik, Tibor; Přidal, Antonín; Plevka, Pavel

2017-02-15

Viruses of the family Dicistroviridae can cause substantial economic damage by infecting agriculturally important insects. Israeli acute bee paralysis virus (IAPV) causes honeybee colony collapse disorder in the United States. High-resolution molecular details of the genome delivery mechanism of dicistroviruses are unknown. Here we present a cryo-electron microscopy analysis of IAPV virions induced to release their genomes in vitro. We determined structures of full IAPV virions primed to release their genomes to a resolution of 3.3 Å and of empty capsids to a resolution of 3.9 Å. We show that IAPV does not form expanded A particles before genome release as in the case of related enteroviruses of the family Picornaviridae. The structural changes observed in the empty IAPV particles include detachment of the VP4 minor capsid proteins from the inner face of the capsid and partial loss of the structure of the N-terminal arms of the VP2 capsid proteins. Unlike the case for many picornaviruses, the empty particles of IAPV are not expanded relative to the native virions and do not contain pores in their capsids that might serve as channels for genome release. Therefore, rearrangement of a unique region of the capsid is probably required for IAPV genome release. Honeybee populations in Europe and North America are declining due to pressure from pathogens, including viruses. Israeli acute bee paralysis virus (IAPV), a member of the family Dicistroviridae, causes honeybee colony collapse disorder in the United States. The delivery of virus genomes into host cells is necessary for the initiation of infection. Here we present a structural cryo-electron microscopy analysis of IAPV particles induced to release their genomes. We show that genome release is not preceded by an expansion of IAPV virions as in the case of related picornaviruses that infect vertebrates. Furthermore, minor capsid proteins detach from the capsid upon genome release. The genome leaves behind empty particles that have compact protein shells. Copyright © 2017 Mullapudi et al.
Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

PubMed

Civáň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

2014-04-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.
Analyses of Charophyte Chloroplast Genomes Help Characterize the Ancestral Chloroplast Genome of Land Plants

PubMed Central

Civáň, Peter; Foster, Peter G.; Embley, Martin T.; Séneca, Ana; Cox, Cymon J.

2014-01-01

Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes. PMID:24682153
Draft genome sequence of marine-derived Streptomyces sp. TP-A0598, a producer of anti-MRSA antibiotic lydicamycins.

PubMed

Komaki, Hisayuki; Ichikawa, Natsuko; Hosoyama, Akira; Fujita, Nobuyuki; Igarashi, Yasuhiro

2015-01-01

Streptomyces sp. TP-A0598, isolated from seawater, produces lydicamycin, a structurally unique type I polyketide bearing two nitrogen-containing five-membered rings, and four congeners, TPU-0037-A, -B, -C, and -D. We herein report the 8 Mb draft genome sequence of this strain, together with classification and features of the organism and generation, annotation and analysis of the genome sequence. The genome encodes 7,240 putative ORFs, of which 4,450 ORFs were assigned to COG categories. Also, 66 tRNA genes and one rRNA operon were identified. The genome contains eight gene clusters involved in the production of polyketides and nonribosomal peptides. Among them, a PKS/NRPS gene cluster was assigned as responsible for lydicamycin biosynthesis, and a plausible biosynthetic pathway was proposed on the basis of gene function prediction. These genome sequence data will facilitate probing the potential of secondary metabolism in marine-derived Streptomyces.
In vitro resolution of the dimer bridge of the minute virus of mice (MVM) genome supports the modified rolling hairpin model for MVM replication.

PubMed

Liu, Q; Yong, C B; Astell, C R

1994-06-01

Previous characterization of the terminal sequences of the minute virus of mice (MVM) genome demonstrated that the right hand palindrome contains two sequences, each the inverted complement of the other. However, the left hand palindrome was shown to exist as a unique sequence [Astell et al., J. Virol. 54: 179-185 (1985)]. The modified rolling hairpin (MRH) model for MVM replication provided an explanation of how the right hand palindrome could undergo hairpin transfer to generate two sequences, while the left end palindrome within the dimer bridge could undergo asymmetric resolution and retain the unique left end sequence. This report describes in vitro resolution of the wild-type dimer bridge sequence of MVM using recombinant (baculovirus) expressed NS-1 and a replication extract from LA9 cells. The resolution products are consistent with those predicted by the MRH model, providing support for this replication mechanism. In addition, mutant dimer bridge clones were constructed and used in the resolution assay. The mutant structures included removal of the asymmetry in the hairpin stem, inversion of the sequence at the initiating nick site, and a 2-bp deletion within one stem of the dimer bridge. In all cases, the mutant dimer bridge structures are resolved; however, the resolution pattern observed with the mutant dimer bridge compared with the wild-type dimer bridge is shifted toward symmetrical resolution. These results suggest that sequences within the left hand hairpin (and hence dimer bridge sequence) are responsible for asymmetric resolution and conservation of the unique sequence within the left hand palindrome of the MVM genome.
Structure-Function, Stability, and Chemical Modification of the Cyanobacterial Cytochrome b6f Complex from Nostoc sp. PCC 7120

PubMed Central

Baniulis, Danas; Yamashita, Eiki; Whitelegge, Julian P.; Zatsman, Anna I.; Hendrich, Michael P.; Hasan, S. Saif; Ryan, Christopher M.; Cramer, William A.

2009-01-01

The crystal structure of the cyanobacterial cytochrome b6f complex has previously been solved to 3.0-Å resolution using the thermophilic Mastigocladus laminosus whose genome has not been sequenced. Several unicellular cyanobacteria, whose genomes have been sequenced and are tractable for mutagenesis, do not yield b6f complex in an intact dimeric state with significant electron transport activity. The genome of Nostoc sp. PCC 7120 has been sequenced and is closer phylogenetically to M. laminosus than are unicellular cyanobacteria. The amino acid sequences of the large core subunits and four small peripheral subunits of Nostoc are 88 and 80% identical to those in the M. laminosus b6f complex. Purified b6f complex from Nostoc has a stable dimeric structure, eight subunits with masses similar to those of M. laminosus, and comparable electron transport activity. The crystal structure of the native b6f complex, determined to a resolution of 3.0Å (PDB id: 2ZT9), is almost identical to that of M. laminosus. Two unique aspects of the Nostoc complex are: (i) a dominant conformation of heme bp that is rotated 180° about the α- and γ-meso carbon axis relative to the orientation in the M. laminosus complex and (ii) acetylation of the Rieske iron-sulfur protein (PetC) at the N terminus, a post-translational modification unprecedented in cyanobacterial membrane and electron transport proteins, and in polypeptides of cytochrome bc complexes from any source. The high spin electronic character of the unique heme cn is similar to that previously found in the b6f complex from other sources. PMID:19189962
Structure-Function, Stability, and Chemical Modification of the Cyanobacterial Cytochrome b6f Complex from Nostoc sp. PCC 7120

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baniulis, Danas; Yamashita, Eiki; Whitelegge, Julian P.

2009-06-08

The crystal structure of the cyanobacterial cytochrome b6f complex has previously been solved to 3.0-Å resolution using the thermophilic Mastigocladus laminosus whose genome has not been sequenced. Several unicellular cyanobacteria, whose genomes have been sequenced and are tractable for mutagenesis, do not yield b6f complex in an intact dimeric state with significant electron transport activity. The genome of Nostoc sp. PCC 7120 has been sequenced and is closer phylogenetically to M. laminosus than are unicellular cyanobacteria. The amino acid sequences of the large core subunits and four small peripheral subunits of Nostoc are 88 and 80% identical to those in the M. laminosus b6f complex. Purified b6f complex from Nostoc has a stable dimeric structure, eight subunits with masses similar to those of M. laminosus, and comparable electron transport activity. The crystal structure of the native b6f complex, determined to a resolution of 3.0 Å (PDB id: 2ZT9), is almost identical to that of M. laminosus. Two unique aspects of the Nostoc complex are: (i) a dominant conformation of heme bp that is rotated 180° about the α- and γ-meso carbon axis relative to the orientation in the M. laminosus complex and (ii) acetylation of the Rieske iron-sulfur protein (PetC) at the N terminus, a post-translational modification unprecedented in cyanobacterial membrane and electron transport proteins, and in polypeptides of cytochrome bc complexes from any source. The high spin electronic character of the unique heme cn is similar to that previously found in the b6f complex from other sources.
Genomic analyses of modern dog breeds.

PubMed

Parker, Heidi G

2012-02-01

A rose may be a rose by any other name, but when you call a dog a poodle it becomes a very different animal than if you call it a bulldog. Both the poodle and the bulldog are examples of dog breeds of which there are >400 recognized worldwide. Breed creation has played a significant role in shaping the modern dog from the length of his leg to the cadence of his bark. The selection and line-breeding required to maintain a breed has also reshaped the genome of the dog, resulting in a unique genetic pattern for each breed. The breed-based population structure combined with extensive morphologic variation and shared human environments have made the dog a popular model for mapping both simple and complex traits and diseases. In order to obtain the most benefit from the dog as a genetic system, it is necessary to understand the effect structured breeding has had on the genome of the species. That is best achieved by looking at genomic analyses of the breeds, their histories, and their relationships to each other.
Genomic Analyses of Modern Dog Breeds

PubMed Central

Parker, Heidi G.

2013-01-01

A rose may be a rose by any other name, but when you call a dog a poodle it becomes a very different animal than if you call it a bulldog. Both the poodle and the bulldog are examples of dog breeds of which there are >400 recognized world-wide. Breed creation has played a significant role in shaping the modern dog from the length of his leg to the cadence of his bark. The selection and line-breeding required to maintain a breed has also reshaped the genome of the dog resulting in a unique genetic pattern for each breed. The breed-based population structure combined with extensive morphologic variation and shared human environments have made the dog a popular model for mapping both simple and complex traits and diseases. In order to obtain the most benefit from the dog as a genetic system, it is necessary to understand the effect structured breeding has had on the genome of the species. That is best achieved by looking at genomic analyses of the breeds, their histories, and their relationships to each other. PMID:22231497
Advancements in zebrafish applications for 21st century toxicology.

PubMed

Garcia, Gloria R; Noyes, Pamela D; Tanguay, Robert L

2016-05-01

The zebrafish model is the only available high-throughput vertebrate assessment system, and it is uniquely suited for studies of in vivo cell biology. A sequenced and annotated genome has revealed a large degree of evolutionary conservation in comparison to the human genome. Due to our shared evolutionary history, the anatomical and physiological features of fish are highly homologous to humans, which facilitates studies relevant to human health. In addition, zebrafish provide a unique vertebrate data stream that allows researchers to anchor hypotheses at the biochemical, genetic, and cellular levels to observations at the structural, functional, and behavioral level in a high-throughput format. In this review, we will draw heavily from toxicological studies to highlight advances in zebrafish high-throughput systems. Breakthroughs in transgenic/reporter lines and methods for genetic manipulation, such as the CRISPR-Cas9 system, will be drawn from reports across diverse disciplines. Copyright © 2016 Elsevier Inc. All rights reserved.
Advancements in zebrafish applications for 21st century toxicology

PubMed Central

Garcia, Gloria R.; Noyes, Pamela D.; Tanguay, Robert L.

2016-01-01

The zebrafish model is the only available high-throughput vertebrate assessment system, and it is uniquely suited for studies of in vivo cell biology. A sequenced and annotated genome has revealed a large degree of evolutionary conservation in comparison to the human genome. Due to our shared evolutionary history, the anatomical and physiological features of fish are highly homologous to humans, which facilitates studies relevant to human health. In addition, zebrafish provide a unique vertebrate data stream that allows researchers to anchor hypotheses at the biochemical, genetic, and cellular levels to observations at the structural, functional, and behavioral level in a high-throughput format. In this review, we will draw heavily from toxicological studies to highlight advances in zebrafish high-throughput systems. Breakthroughs in transgenic/reporter lines and methods for genetic manipulation, such as the CRISPR-Cas9 system, will be drawn from reports across diverse disciplines. PMID:27016469
Genome packaging in EL and Lin68, two giant phiKZ-like bacteriophages of P. aeruginosa.

PubMed

Sokolova, O S; Shaburova, O V; Pechnikova, E V; Shaytan, A K; Krylov, S V; Kiselev, N A; Krylov, V N

2014-11-01

A unique feature of the Pseudomonas aeruginosa giant phage phiKZ is its way of genome packaging onto a spool-like protein structure, the inner body. Until recently, no similar structures have been detected in other phages. We have studied DNA packaging in P. aeruginosa phages EL and Lin68 using cryo-electron microscopy and revealed the presence of inner bodies. The shape and positioning of the inner body and the density of the DNA packaging in EL are different from those found in phiKZ and Lin68. This internal organization explains how the shorter EL genome is packed into a large EL capsid, which has the same external dimensions as the capsids of phiKZ and Lin68. The similarity in the structural organization in EL and other phiKZ-like phages indicates that EL is phylogenetically related to other phiKZ-like phages, and, despite the lack of detectable DNA homology, EL, phiKZ, and Lin68 descend from a common ancestor. Copyright © 2014 Elsevier Inc. All rights reserved.
Neocentromeres: A Place for Everything and Everything in Its Place

PubMed Central

Scott, Kristin C.; Sullivan, Beth A.

2014-01-01

Centromeres are essential for chromosome inheritance and genome stability. Centromeric proteins, including the centromeric histone CENP-A, define the site of centromeric chromatin and kinetochore assembly. In many organisms, centromeres are located in or near regions of repetitive DNA. However, some atypical centromeres spontaneously form on unique sequences. These neocentromeres, or new centromeres, were first identified in humans, but have since been described in other organisms. Neocentromeres are functionally and structurally similar to endogenous centromeres, but lack the added complication of underlying repetitive sequences. Here, we discuss recent studies in chicken and fungal systems where genomic engineering can promote neocentromere formation. These studies reveal key genomic and epigenetic factors that support de novo centromere formation in eukaryotes. PMID:24342629
Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

PubMed

Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

2012-05-10

The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data become available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, which share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
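The notion of unique core genes described in this abstract can be illustrated with set operations. The following is a minimal sketch, not the authors' pipeline: it assumes that ortholog clustering (for example with orthoMCL) has already produced, for each isolate, the set of gene-cluster identifiers it carries. The isolate names, cluster IDs, and the function `unique_core_genes` are hypothetical.

```python
from typing import Dict, Set


def unique_core_genes(presence: Dict[str, Set[str]], group: Set[str]) -> Set[str]:
    """Return gene clusters found in every isolate of `group` and in no isolate outside it."""
    if not group or not group <= set(presence):
        raise ValueError("group must be a non-empty subset of the known isolates")
    # Core: clusters shared by all members of the group.
    core = set.intersection(*(presence[name] for name in group))
    # Clusters carried by at least one isolate outside the group.
    outside = set().union(*(presence[name] for name in set(presence) - group))
    return core - outside


# Hypothetical toy data: isolate -> gene-cluster IDs (not taken from the study).
presence = {
    "isolate_A": {"c1", "c2", "c3"},
    "isolate_B": {"c1", "c2", "c4"},
    "isolate_C": {"c1", "c5"},
}

result = unique_core_genes(presence, {"isolate_A", "isolate_B"})
print(result)
```

In this toy case, cluster `c1` is shared by the group but also occurs in `isolate_C`, so only `c2` qualifies. Whether a group is monophyletic or genophyletic depends on the phylogeny and is outside the scope of this sketch.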
The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.

PubMed

Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin

2013-01-01

Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contributes to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
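The quadripartite figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the stated IR length (25,539 bp) refers to one copy of the inverted repeat and that the genome consists of exactly one LSC, one SSC, and two IR copies. The variable names are illustrative only.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 82_695
SSC = 17_555
IR = 25_539  # assumed to be the length of ONE inverted-repeat copy
REPORTED_TOTAL = 151_328

# A quadripartite chloroplast genome has one LSC, one SSC and two IR copies.
computed_total = LSC + SSC + 2 * IR
print(computed_total, computed_total == REPORTED_TOTAL)

# Gene categories reported in the abstract should add up to the 114 unique genes.
gene_counts = {"protein_coding": 80, "tRNA": 30, "rRNA": 4}
total_genes = sum(gene_counts.values())
print(total_genes)
```

Under these assumptions the region lengths add up to the reported genome size, and the three gene categories add up to the reported 114 unique genes.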
GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

PubMed

Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

2015-01-01

Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference and can be used as a variant calling format file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU public license (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species

PubMed Central

Park, Inkyu; Kim, Wook-jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin

2017-01-01

Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined with other family cp genomes in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC–trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species. PMID:28863163
The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species.

PubMed

Park, Inkyu; Kim, Wook-Jin; Yang, Sungyu; Yeo, Sang-Min; Li, Hulin; Moon, Byeong Cheol

2017-01-01

Aconitum species (belonging to the Ranunculaceae) are well known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined with other family cp genomes in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species.
The complete chloroplast genome sequence of Dodonaea viscosa: comparative and phylogenetic analyses.

PubMed

Saina, Josphat K; Gichira, Andrew W; Li, Zhi-Zhong; Hu, Guang-Wan; Wang, Qing-Feng; Liao, Kuo

2018-02-01

The plant chloroplast (cp) genome is a highly conserved structure which is beneficial for evolution and systematic research. Currently, numerous complete cp genome sequences have been reported due to high throughput sequencing technology. However, there is no complete chloroplast genome of genus Dodonaea that has been reported before. To better understand the molecular basis of Dodonaea viscosa chloroplast, we used Illumina sequencing technology to sequence its complete genome. The whole length of the cp genome is 159,375 base pairs (bp), with a pair of inverted repeats (IRs) of 27,099 bp separated by a large single copy (LSC) 87,204 bp, and small single copy (SSC) 17,972 bp. The annotation analysis revealed a total of 115 unique genes of which 81 were protein coding, 30 tRNA, and four ribosomal RNA genes. Comparative genome analysis with other closely related Sapindaceae members showed conserved gene order in the inverted and single copy regions. Phylogenetic analysis clustered D. viscosa with other species of Sapindaceae with strong bootstrap support. Finally, a total of 249 SSRs were detected. Moreover, a comparison of the synonymous (Ks) and nonsynonymous (Ka) substitution rates in D. viscosa showed very low values. The availability of cp genome reported here provides a valuable genetic resource for comprehensive further studies in genetic variation, taxonomy and phylogenetic evolution of Sapindaceae family. In addition, SSR markers detected will be used in further phylogeographic and population structure studies of the species in this genus.
The genomes and comparative genomics of Lactobacillus delbrueckii phages.

PubMed

Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

2011-07-01

Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.
Genome packaging in EL and Lin68, two giant phiKZ-like bacteriophages of P. aeruginosa

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sokolova, O.S.; Shaburova, O.V.

A unique feature of the Pseudomonas aeruginosa giant phage phiKZ is its way of genome packaging onto a spool-like protein structure, the inner body. Until recently, no similar structures have been detected in other phages. We have studied DNA packaging in P. aeruginosa phages EL and Lin68 using cryo-electron microscopy and revealed the presence of inner bodies. The shape and positioning of the inner body and the density of the DNA packaging in EL are different from those found in phiKZ and Lin68. This internal organization explains how the shorter EL genome is packed into a large EL capsid, which has the same external dimensions as the capsids of phiKZ and Lin68. The similarity in the structural organization in EL and other phiKZ-like phages indicates that EL is phylogenetically related to other phiKZ-like phages, and, despite the lack of detectable DNA homology, EL, phiKZ, and Lin68 descend from a common ancestor. Highlights: • We performed a comparative structural study of giant P. aeruginosa phages: EL, Lin68 and phiKZ. • We revealed that the inner body is a common feature in giant phages. • The phage genome size correlates with the overall dimensions of the inner body.
Somatic cell nuclear transfer: infinite reproduction of a unique diploid genome.

PubMed

Kishigami, Satoshi; Wakayama, Sayaka; Hosoi, Yoshihiko; Iritani, Akira; Wakayama, Teruhiko

2008-06-10

In mammals, a diploid genome of an individual following fertilization of an egg and a spermatozoon is unique and irreproducible. This implies that the generated unique diploid genome is doomed with the individual ending. Even as cultured cells from the individual, they cannot normally proliferate in perpetuity because of the "Hayflick limit". However, Dolly, the sheep cloned from an adult mammary gland cell, changes this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to "passage" a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown and, animal cloning has been inefficient as a result. More momentously, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Delightfully, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the "passage" of a unique diploid genome. Further, new somatic cloning technique using histone-deacetylase inhibitors has been developed which can significantly increase the previous cloning rates two to six times. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
novPTMenzy: a database for enzymes involved in novel post-translational modifications

PubMed Central

Khater, Shradha; Mohanty, Debasisa

2015-01-01

With the recent discoveries of novel post-translational modifications (PTMs) which play important roles in signaling and biosynthetic pathways, identification of such PTM catalyzing enzymes by genome mining has been an area of major interest. Unlike well-known PTMs like phosphorylation, glycosylation, SUMOylation, no bioinformatics resources are available for enzymes associated with novel and unusual PTMs. Therefore, we have developed the novPTMenzy database which catalogs information on the sequence, structure, active site and genomic neighborhood of experimentally characterized enzymes involved in five novel PTMs, namely AMPylation, Eliminylation, Sulfation, Hydroxylation and Deamidation. Based on a comprehensive analysis of the sequence and structural features of these known PTM catalyzing enzymes, we have created Hidden Markov Model profiles for the identification of similar PTM catalyzing enzymatic domains in genomic sequences. We have also created predictive rules for grouping them into functional subfamilies and deciphering their mechanistic details by structure-based analysis of their active site pockets. These analytical modules have been made available as user friendly search interfaces of novPTMenzy database. It also has a specialized analysis interface for some PTMs like AMPylation and Eliminylation. The novPTMenzy database is a unique resource that can aid in discovery of unusual PTM catalyzing enzymes in newly sequenced genomes. Database URL: http://www.nii.ac.in/novptmenzy.html PMID:25931459
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species.

PubMed

Nepal, Madhav P; Andersen, Ethan J; Neupane, Surendra; Benson, Benjamin V

2017-09-30

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence.
Comparative Genomics of Non-TNL Disease Resistance Genes from Six Plant Species

PubMed Central

Andersen, Ethan J.; Neupane, Surendra; Benson, Benjamin V.

2017-01-01

Disease resistance genes (R genes), as part of the plant defense system, have coevolved with corresponding pathogen molecules. The main objectives of this project were to identify non-Toll interleukin receptor, nucleotide-binding site, leucine-rich repeat (nTNL) genes and elucidate their evolutionary divergence across six plant genomes. Using reference sequences from Arabidopsis, we investigated nTNL orthologs in the genomes of common bean, Medicago, soybean, poplar, and rice. We used Hidden Markov Models for sequence identification, performed model-based phylogenetic analyses, visualized chromosomal positioning, inferred gene clustering, and assessed gene expression profiles. We analyzed 908 nTNL R genes in the genomes of the six plant species, and classified them into 12 subgroups based on the presence of coiled-coil (CC), nucleotide binding site (NBS), leucine rich repeat (LRR), resistance to Powdery mildew 8 (RPW8), and BED type zinc finger domains. Traditionally classified CC-NBS-LRR (CNL) genes were nested into four clades (CNL A-D) often with abundant, well-supported homogeneous subclades of Type-II R genes. CNL-D members were absent in rice, indicating a unique R gene retention pattern in the rice genome. Genomes from Arabidopsis, common bean, poplar and soybean had one chromosome without any CNL R genes. Medicago and Arabidopsis had the highest and lowest number of gene clusters, respectively. Gene expression analyses suggested unique patterns of expression for each of the CNL clades. Differential gene expression patterns of the nTNL genes were often found to correlate with number of introns and GC content, suggesting structural and functional divergence. PMID:28973974
Insights into bilaterian evolution from three spiralian genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Simakov, Oleg; Marletaz, Ferdinand; Cho, Sung-Jin

2012-01-07

Current genomic perspectives on animal diversity neglect two prominent phyla, the molluscs and annelids, that together account for nearly one-third of known marine species and are important both ecologically and as experimental systems in classical embryology. Here we describe the draft genomes of the owl limpet (Lottia gigantea), a marine polychaete (Capitella teleta) and a freshwater leech (Helobdella robusta), and compare them with other animal genomes to investigate the origin and diversification of bilaterians from a genomic perspective. We find that the genome organization, gene structure and functional content of these species are more similar to those of some invertebrate deuterostome genomes (for example, amphioxus and sea urchin) than those of other protostomes that have been sequenced to date (flies, nematodes and flatworms). The conservation of these genomic features enables us to expand the inventory of genes present in the last common bilaterian ancestor, establish the tripartite diversification of bilaterians using multiple genomic characteristics and identify ancient conserved long- and short-range genetic linkages across metazoans. Superimposed on this broadly conserved pan-bilaterian background we find examples of lineage-specific genome evolution, including varying rates of rearrangement, intron gain and loss, expansions and contractions of gene families, and the evolution of clade-specific genes that produce the unique content of each genome.
Systems biology approach in plant abiotic stresses.

PubMed

Mohanta, Tapan Kumar; Bashir, Tufail; Hashem, Abeer; Abd Allah, Elsayed Fathi

2017-12-01

Plant abiotic stresses are the major constraint on plant growth and development, causing enormous crop losses across the world. Plants have unique features to defend themselves against these challenging adverse stress conditions. They modulate their phenotypes upon changes in physiological, biochemical, molecular and genetic information, thus making them tolerant against abiotic stresses. It is of paramount importance to determine the stress-tolerant traits of a diverse range of genotypes of plant species and integrate those traits for crop improvement. Stress-tolerant traits can be identified by conducting genome-wide analysis of stress-tolerant genotypes through the highly advanced structural and functional genomics approach. Specifically, whole-genome sequencing, development of molecular markers, genome-wide association studies and comparative analysis of interaction networks between tolerant and susceptible crop varieties grown under stress conditions can greatly facilitate discovery of novel agronomic traits that protect plants against abiotic stresses. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Chloroplast genome expansion by intron multiplication in the basal psychrophilic euglenoid Eutreptiella pomquetensis

PubMed Central

Bennett, Matthew S.; Triemer, Richard E.; Preisfeld, Angelika

2017-01-01

Background: Over the last few years multiple studies have been published showing a great diversity in size of chloroplast genomes (cpGenomes), and in the arrangement of gene clusters, in the Euglenales. However, while these genomes provided important insights into the evolution of cpGenomes across the Euglenales and within their genera, only two genomes were analyzed in regard to genomic variability between and within Euglenales and Eutreptiales. To better understand the dynamics of chloroplast genome evolution in early evolving Eutreptiales, this study focused on the cpGenome of Eutreptiella pomquetensis, and the spread and peculiarities of introns. Methods: The Etl. pomquetensis cpGenome was sequenced, annotated and afterwards examined in structure, size, gene order and intron content. These features were compared with other euglenoid cpGenomes as well as those of prasinophyte green algae, including Pyramimonas parkeae. Results and Discussion: With about 130,561 bp the chloroplast genome of Etl. pomquetensis, a basal taxon in the phototrophic euglenoids, was considerably larger than the two other Eutreptiales cpGenomes sequenced so far. Although the detected quadripartite structure resembled most green algae and plant chloroplast genomes, the gene content of the single copy regions in Etl. pomquetensis was completely different from those observed in green algae and plants. The gene composition of Etl. pomquetensis was extensively changed and turned out to be almost identical to other Eutreptiales and Euglenales, and not to P. parkeae. Furthermore, the cpGenome of Etl. pomquetensis was unexpectedly permeated by a high number of introns, which led to a substantially larger genome. The 51 identified introns of Etl. pomquetensis showed two major unique features: (i) more than half of the introns displayed a high level of pairwise identities; (ii) no group III introns could be identified in the protein coding genes. These findings support the hypothesis that group III introns are degenerated group II introns and evolved later. PMID:28852596
Genome evolution and speciation genetics of clawed frogs (Xenopus and Silurana).

PubMed

Evans, Ben J

2008-05-01

Speciation of clawed frogs occurred through bifurcation and reticulation of evolutionary lineages, and resulted in extant species with different ploidy levels. Duplicate gene evolution and expression in these animals provides a unique perspective into the earliest genomic transformations after vertebrate whole genome duplication (WGD) and suggests that functional constraints are relaxed compared to before duplication but still consistently strong for millions of years following WGD. Additionally, extensive quantitative expression divergence between duplicate genes occurred after WGD. Diversification of clawed frogs was potentially catalyzed by transposition and divergent resolution--processes that occur through different genetic mechanisms but that have analogous implications for genome structure. How sex determination is maintained after genome duplication is fundamental to our understanding of why allopolyploidization is so prevalent in this group, and why clawed frogs violate Haldane's Rule for hybrid sterility. Future studies of expression subfunctionalization in polyploids will shed light on the role and purviews of cis- and trans-regulatory elements in gene regulation.
Plastid genome sequence of an ornamental and editable fruit tree of Rosaceae, Prunus mume.

PubMed

Wang, Shuo; Gao, Cheng-Wen; Gao, Li-Zhi

2016-11-01

Here we assembled and analyzed the complete chloroplast genome of Prunus mume, a popular ornamental and edible fruit tree of Rosaceae. The cp genome exhibited a circular DNA molecule of 157 712 bp with a typical quadripartite structure consisting of two inverted repeat regions (IRa and IRb) of 26 394 bp separated by large (LSC) and small (SSC) single-copy regions of 85 861 and 19 063 bp, respectively. It encoded 112 unique genes, 19 of which were duplicated in the IR regions, giving a total of 131 genes. Eighteen of these genes harbored one or two introns. GC content was 38.9%, and coding regions accounted for 51.3% of the genome. Phylogenetic analysis showed that P. mume clustered with P. persica and P. kansuensis in the genus Prunus. This newly determined chloroplast genome will enhance modern breeding programs for the purpose of genetic improvement of this valuable plant.
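The region lengths quoted in this abstract can be cross-checked with a little arithmetic: in a quadripartite chloroplast genome the total length is the LSC plus the SSC plus two copies of the inverted repeat. The Python sketch below is only an illustration of that check using the figures from the abstract; the class and variable names (Quadripartite, prunus_mume, and so on) are hypothetical and do not come from the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quadripartite:
    """Region lengths (bp) of a quadripartite chloroplast genome (illustrative type)."""
    lsc: int  # large single-copy region
    ssc: int  # small single-copy region
    ir: int   # length of ONE inverted-repeat copy (IRa and IRb are equal)

    def total_length(self) -> int:
        # The circular genome holds one LSC, one SSC and two IR copies.
        return self.lsc + self.ssc + 2 * self.ir


# Figures as quoted in the Prunus mume abstract above.
prunus_mume = Quadripartite(lsc=85_861, ssc=19_063, ir=26_394)
reported_total_bp = 157_712

# Gene tally as quoted: 112 unique genes, 19 of them duplicated in the IRs.
unique_genes = 112
duplicated_in_ir = 19
total_genes = unique_genes + duplicated_in_ir

print(prunus_mume.total_length() == reported_total_bp)
print(total_genes)
```

With these inputs the region sum agrees with the reported genome length, and the gene tally reproduces the stated total of 131 genes. The same check could be applied to other chloroplast records in this listing, under the assumption that the quoted IR length refers to a single copy.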
Genomic insights into the Ixodes scapularis tick vector of Lyme disease

PubMed Central

Gulia-Nuss, Monika; Nuss, Andrew B.; Meyer, Jason M.; Sonenshine, Daniel E.; Roe, R. Michael; Waterhouse, Robert M.; Sattelle, David B.; de la Fuente, José; Ribeiro, Jose M.; Megy, Karine; Thimmapuram, Jyothi; Miller, Jason R.; Walenz, Brian P.; Koren, Sergey; Hostetler, Jessica B.; Thiagarajan, Mathangi; Joardar, Vinita S.; Hannick, Linda I.; Bidwell, Shelby; Hammond, Martin P.; Young, Sarah; Zeng, Qiandong; Abrudan, Jenica L.; Almeida, Francisca C.; Ayllón, Nieves; Bhide, Ketaki; Bissinger, Brooke W.; Bonzon-Kulichenko, Elena; Buckingham, Steven D.; Caffrey, Daniel R.; Caimano, Melissa J.; Croset, Vincent; Driscoll, Timothy; Gilbert, Don; Gillespie, Joseph J.; Giraldo-Calderón, Gloria I.; Grabowski, Jeffrey M.; Jiang, David; Khalil, Sayed M. S.; Kim, Donghun; Kocan, Katherine M.; Koči, Juraj; Kuhn, Richard J.; Kurtti, Timothy J.; Lees, Kristin; Lang, Emma G.; Kennedy, Ryan C.; Kwon, Hyeogsun; Perera, Rushika; Qi, Yumin; Radolf, Justin D.; Sakamoto, Joyce M.; Sánchez-Gracia, Alejandro; Severo, Maiara S.; Silverman, Neal; Šimo, Ladislav; Tojo, Marta; Tornador, Cristian; Van Zee, Janice P.; Vázquez, Jesús; Vieira, Filipe G.; Villar, Margarita; Wespiser, Adam R.; Yang, Yunlong; Zhu, Jiwei; Arensburger, Peter; Pietrantonio, Patricia V.; Barker, Stephen C.; Shao, Renfu; Zdobnov, Evgeny M.; Hauser, Frank; Grimmelikhuijzen, Cornelis J. P.; Park, Yoonseong; Rozas, Julio; Benton, Richard; Pedra, Joao H. F.; Nelson, David R.; Unger, Maria F.; Tubio, Jose M. C.; Tu, Zhijian; Robertson, Hugh M.; Shumway, Martin; Sutton, Granger; Wortman, Jennifer R.; Lawson, Daniel; Wikel, Stephen K.; Nene, Vishvanath M.; Fraser, Claire M.; Collins, Frank H.; Birren, Bruce; Nelson, Karen E.; Caler, Elisabet; Hill, Catherine A.

2016-01-01

Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick–host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host ‘questing’, prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent. PMID:26856261
Genomic insights into the Ixodes scapularis tick vector of Lyme disease.

PubMed

Gulia-Nuss, Monika; Nuss, Andrew B; Meyer, Jason M; Sonenshine, Daniel E; Roe, R Michael; Waterhouse, Robert M; Sattelle, David B; de la Fuente, José; Ribeiro, Jose M; Megy, Karine; Thimmapuram, Jyothi; Miller, Jason R; Walenz, Brian P; Koren, Sergey; Hostetler, Jessica B; Thiagarajan, Mathangi; Joardar, Vinita S; Hannick, Linda I; Bidwell, Shelby; Hammond, Martin P; Young, Sarah; Zeng, Qiandong; Abrudan, Jenica L; Almeida, Francisca C; Ayllón, Nieves; Bhide, Ketaki; Bissinger, Brooke W; Bonzon-Kulichenko, Elena; Buckingham, Steven D; Caffrey, Daniel R; Caimano, Melissa J; Croset, Vincent; Driscoll, Timothy; Gilbert, Don; Gillespie, Joseph J; Giraldo-Calderón, Gloria I; Grabowski, Jeffrey M; Jiang, David; Khalil, Sayed M S; Kim, Donghun; Kocan, Katherine M; Koči, Juraj; Kuhn, Richard J; Kurtti, Timothy J; Lees, Kristin; Lang, Emma G; Kennedy, Ryan C; Kwon, Hyeogsun; Perera, Rushika; Qi, Yumin; Radolf, Justin D; Sakamoto, Joyce M; Sánchez-Gracia, Alejandro; Severo, Maiara S; Silverman, Neal; Šimo, Ladislav; Tojo, Marta; Tornador, Cristian; Van Zee, Janice P; Vázquez, Jesús; Vieira, Filipe G; Villar, Margarita; Wespiser, Adam R; Yang, Yunlong; Zhu, Jiwei; Arensburger, Peter; Pietrantonio, Patricia V; Barker, Stephen C; Shao, Renfu; Zdobnov, Evgeny M; Hauser, Frank; Grimmelikhuijzen, Cornelis J P; Park, Yoonseong; Rozas, Julio; Benton, Richard; Pedra, Joao H F; Nelson, David R; Unger, Maria F; Tubio, Jose M C; Tu, Zhijian; Robertson, Hugh M; Shumway, Martin; Sutton, Granger; Wortman, Jennifer R; Lawson, Daniel; Wikel, Stephen K; Nene, Vishvanath M; Fraser, Claire M; Collins, Frank H; Birren, Bruce; Nelson, Karen E; Caler, Elisabet; Hill, Catherine A

2016-02-09

Ticks transmit more pathogens to humans and animals than any other arthropod. We describe the 2.1 Gbp nuclear genome of the tick, Ixodes scapularis (Say), which vectors pathogens that cause Lyme disease, human granulocytic anaplasmosis, babesiosis and other diseases. The large genome reflects accumulation of repetitive DNA, new lineages of retro-transposons, and gene architecture patterns resembling ancient metazoans rather than pancrustaceans. Annotation of scaffolds representing ∼57% of the genome reveals 20,486 protein-coding genes and expansions of gene families associated with tick-host interactions. We report insights from genome analyses into parasitic processes unique to ticks, including host 'questing', prolonged feeding, cuticle synthesis, blood meal concentration, novel methods of haemoglobin digestion, haem detoxification, vitellogenesis and prolonged off-host survival. We identify proteins associated with the agent of human granulocytic anaplasmosis, an emerging disease, and the encephalitis-causing Langat virus, and a population structure correlated to life-history traits and transmission of the Lyme disease agent.
A Unique Chromosomal Rearrangement in the Cryptococcus neoformans var. grubii Type Strain Enhances Key Phenotypes Associated with Virulence

PubMed Central

Morrow, Carl A.; Lee, I. Russel; Chow, Eve W. L.; Ormerod, Kate L.; Goldinger, Anita; Byrnes, Edmond J.; Nielsen, Kirsten; Heitman, Joseph; Schirra, Horst Joachim; Fraser, James A.

2012-01-01

ABSTRACT The accumulation of genomic structural variation between closely related populations over time can lead to reproductive isolation and speciation. The fungal pathogen Cryptococcus is thought to have recently diversified, forming a species complex containing members with distinct morphologies, distributions, and pathologies of infection. We have investigated structural changes in genomic architecture such as inversions and translocations that distinguish the most pathogenic variety, Cryptococcus neoformans var. grubii, from the less clinically prevalent Cryptococcus neoformans var. neoformans and Cryptococcus gattii. Synteny analysis between the genomes of the three Cryptococcus species/varieties (strains H99, JEC21, and R265) reveals that C. neoformans var. grubii possesses surprisingly few unique genomic rearrangements. All but one are relatively small and are shared by all molecular subtypes of C. neoformans var. grubii. In contrast, the large translocation peculiar to the C. neoformans var. grubii type strain is found in all tested subcultures from multiple laboratories, suggesting that it has possessed this rearrangement since its isolation from a human clinical sample. Furthermore, we find that the translocation directly disrupts two genes. The first of these encodes a novel protein involved in metabolism of glucose at human body temperature and affects intracellular levels of trehalose. The second encodes a homeodomain-containing transcription factor that modulates melanin production. Both mutations would be predicted to increase pathogenicity; however, when recreated in an alternate genetic background, these mutations do not affect virulence in animal models. The type strain of C. neoformans var. grubii in which the majority of molecular studies have been performed is therefore atypical for carbon metabolism and key virulence attributes. PMID:22375073
Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martinez, Diego; Challacombe, Jean; Morgenstern, Ingo

2009-02-04

Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome, and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in media containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Also upregulated under cellulolytic culture conditions were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. In particular, comparisons between P. placenta and the closely related white-rot fungus, Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which efficient depolymerization of lignin was lost.
Genome, transcriptome, and secretome analysis of wood decay fungus postia placenta supports unique mechanisms of lignocellulose conversion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martinez, Diego; Challacombe, Jean F; Misra, Monica

2008-01-01

Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in medium containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1-4 endoglucanase were expressed at high levels relative to glucose grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Also upregulated during growth on cellulose medium were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. Comparisons to the closely related white-rot fungus Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which the capacity for efficient depolymerization of lignin was lost.
Genome, transcriptome, and secretome analysis of wood decay fungus Postia placenta supports unique mechanisms of lignocellulose conversion

PubMed Central

Martinez, Diego; Challacombe, Jean; Morgenstern, Ingo; Hibbett, David; Schmoll, Monika; Kubicek, Christian P.; Ferreira, Patricia; Ruiz-Duenas, Francisco J.; Martinez, Angel T.; Kersten, Phil; Hammel, Kenneth E.; Vanden Wymelenberg, Amber; Gaskell, Jill; Lindquist, Erika; Sabat, Grzegorz; Splinter BonDurant, Sandra; Larrondo, Luis F.; Canessa, Paulo; Vicuna, Rafael; Yadav, Jagjit; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Pisabarro, Antonio G.; Lavín, José L.; Oguiza, José A.; Master, Emma; Henrissat, Bernard; Coutinho, Pedro M.; Harris, Paul; Magnuson, Jon Karl; Baker, Scott E.; Bruno, Kenneth; Kenealy, William; Hoegger, Patrik J.; Kües, Ursula; Ramaiya, Preethi; Lucas, Susan; Salamov, Asaf; Shapiro, Harris; Tu, Hank; Chee, Christine L.; Misra, Monica; Xie, Gary; Teter, Sarah; Yaver, Debbie; James, Tim; Mokrejs, Martin; Pospisek, Martin; Grigoriev, Igor V.; Brettin, Thomas; Rokhsar, Dan; Berka, Randy; Cullen, Dan

2009-01-01

Brown-rot fungi such as Postia placenta are common inhabitants of forest ecosystems and are also largely responsible for the destructive decay of wooden structures. Rapid depolymerization of cellulose is a distinguishing feature of brown-rot, but the biochemical mechanisms and underlying genetics are poorly understood. Systematic examination of the P. placenta genome, transcriptome, and secretome revealed unique extracellular enzyme systems, including an unusual repertoire of extracellular glycoside hydrolases. Genes encoding exocellobiohydrolases and cellulose-binding domains, typical of cellulolytic microbes, are absent in this efficient cellulose-degrading fungus. When P. placenta was grown in medium containing cellulose as sole carbon source, transcripts corresponding to many hemicellulases and to a single putative β-1–4 endoglucanase were expressed at high levels relative to glucose-grown cultures. These transcript profiles were confirmed by direct identification of peptides by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Also up-regulated during growth on cellulose medium were putative iron reductases, quinone reductase, and structurally divergent oxidases potentially involved in extracellular generation of Fe(II) and H2O2. These observations are consistent with a biodegradative role for Fenton chemistry in which Fe(II) and H2O2 react to form hydroxyl radicals, highly reactive oxidants capable of depolymerizing cellulose. The P. placenta genome resources provide unparalleled opportunities for investigating such unusual mechanisms of cellulose conversion. More broadly, the genome offers insight into the diversification of lignocellulose degrading mechanisms in fungi. Comparisons with the closely related white-rot fungus Phanerochaete chrysosporium support an evolutionary shift from white-rot to brown-rot during which the capacity for efficient depolymerization of lignin was lost. PMID:19193860

Genome Reduction Uncovers a Large Dispensable Genome and Adaptive Role for Copy Number Variation in Asexually Propagated Solanum tuberosum

PubMed Central

Hardigan, Michael A.; Crisovan, Emily; Hamilton, John P.; Laimbeer, Parker; Leisner, Courtney P.; Manrique-Carpintero, Norma C.; Newton, Linsey; Pham, Gina M.; Vaillancourt, Brieanne; Zeng, Zixian; Jiang, Jiming

2016-01-01

Clonally reproducing plants have the potential to bear a significantly greater mutational load than sexually reproducing species. To investigate this possibility, we examined the breadth of genome-wide structural variation in a panel of monoploid/doubled monoploid clones generated from native populations of diploid potato (Solanum tuberosum), a highly heterozygous asexually propagated plant. As rare instances of purely homozygous clones, they provided an ideal set for determining the degree of structural variation tolerated by this species and deriving its minimal gene complement. Extensive copy number variation (CNV) was uncovered, impacting 219.8 Mb (30.2%) of the potato genome with nearly 30% of genes subject to at least partial duplication or deletion, revealing the highly heterogeneous nature of the potato genome. Dispensable genes (>7000) were associated with limited transcription and/or a recent evolutionary history, with lower deletion frequency observed in genes conserved across angiosperms. Association of CNV with plant adaptation was highlighted by enrichment in gene clusters encoding functions for environmental stress response, with gene duplication playing a part in species-specific expansions of stress-related gene families. This study revealed unique impacts of CNV in a species with asexual reproductive habits and how CNV may drive adaption through evolution of key stress pathways. PMID:26772996
The nuclear genome of Rhazya stricta and the evolution of alkaloid diversity in a medically relevant clade of Apocynaceae

PubMed Central

Sabir, Jamal S. M.; Jansen, Robert K.; Arasappan, Dhivya; Calderon, Virginie; Noutahi, Emmanuel; Zheng, Chunfang; Park, Seongjun; Sabir, Meshaal J.; Baeshen, Mohammed N.; Hajrah, Nahid H.; Khiyami, Mohammad A.; Baeshen, Nabih A.; Obaid, Abdullah Y.; Al-Malki, Abdulrahman L.; Sankoff, David; El-Mabrouk, Nadia; Ruhlman, Tracey A.

2016-01-01

Alkaloid accumulation in plants is activated in response to stress, is limited in distribution and specific alkaloid repertoires are variable across taxa. Rauvolfioideae (Apocynaceae, Gentianales) represents a major center of structural expansion in the monoterpenoid indole alkaloids (MIAs) yielding thousands of unique molecules including highly valuable chemotherapeutics. The paucity of genome-level data for Apocynaceae precludes a deeper understanding of MIA pathway evolution hindering the elucidation of remaining pathway enzymes and the improvement of MIA availability in planta or in vitro. We sequenced the nuclear genome of Rhazya stricta (Apocynaceae, Rauvolfioideae) and present this high quality assembly in comparison with that of coffee (Rubiaceae, Coffea canephora, Gentianales) and others to investigate the evolution of genome-scale features. The annotated Rhazya genome was used to develop the community resource, RhaCyc, a metabolic pathway database. Gene family trees were constructed to identify homologs of MIA pathway genes and to examine their evolutionary history. We found that, unlike Coffea, the Rhazya lineage has experienced many structural rearrangements. Gene tree analyses suggest recent, lineage-specific expansion and diversification among homologs encoding MIA pathway genes in Gentianales and provide candidate sequences with the potential to close gaps in characterized pathways and support prospecting for new MIA production avenues. PMID:27653669
The nuclear genome of Rhazya stricta and the evolution of alkaloid diversity in a medically relevant clade of Apocynaceae.

PubMed

Sabir, Jamal S M; Jansen, Robert K; Arasappan, Dhivya; Calderon, Virginie; Noutahi, Emmanuel; Zheng, Chunfang; Park, Seongjun; Sabir, Meshaal J; Baeshen, Mohammed N; Hajrah, Nahid H; Khiyami, Mohammad A; Baeshen, Nabih A; Obaid, Abdullah Y; Al-Malki, Abdulrahman L; Sankoff, David; El-Mabrouk, Nadia; Ruhlman, Tracey A

2016-09-22

Alkaloid accumulation in plants is activated in response to stress, is limited in distribution and specific alkaloid repertoires are variable across taxa. Rauvolfioideae (Apocynaceae, Gentianales) represents a major center of structural expansion in the monoterpenoid indole alkaloids (MIAs) yielding thousands of unique molecules including highly valuable chemotherapeutics. The paucity of genome-level data for Apocynaceae precludes a deeper understanding of MIA pathway evolution hindering the elucidation of remaining pathway enzymes and the improvement of MIA availability in planta or in vitro. We sequenced the nuclear genome of Rhazya stricta (Apocynaceae, Rauvolfioideae) and present this high quality assembly in comparison with that of coffee (Rubiaceae, Coffea canephora, Gentianales) and others to investigate the evolution of genome-scale features. The annotated Rhazya genome was used to develop the community resource, RhaCyc, a metabolic pathway database. Gene family trees were constructed to identify homologs of MIA pathway genes and to examine their evolutionary history. We found that, unlike Coffea, the Rhazya lineage has experienced many structural rearrangements. Gene tree analyses suggest recent, lineage-specific expansion and diversification among homologs encoding MIA pathway genes in Gentianales and provide candidate sequences with the potential to close gaps in characterized pathways and support prospecting for new MIA production avenues.
Whole-genome sequencing in patients with ciliopathies uncovers a novel recurrent tandem duplication in IFT140.

PubMed

Geoffroy, Véronique; Stoetzel, Corinne; Scheidecker, Sophie; Schaefer, Elise; Perrault, Isabelle; Bär, Séverine; Kröll, Ariane; Delbarre, Marion; Antin, Manuela; Leuvrey, Anne-Sophie; Henry, Charline; Blanché, Hélène; Decker, Eva; Kloth, Katja; Klaus, Günter; Mache, Christoph; Martin-Coignard, Dominique; McGinn, Steven; Boland, Anne; Deleuze, Jean-François; Friant, Sylvie; Saunier, Sophie; Rozet, Jean-Michel; Bergmann, Carsten; Dollfus, Hélène; Muller, Jean

2018-04-24

Ciliopathies represent a wide spectrum of rare diseases with overlapping phenotypes and a high genetic heterogeneity. Among those, IFT140 is implicated in a variety of phenotypes ranging from isolated retinitis pigmentosa to more syndromic cases. Using whole-genome sequencing in patients with uncharacterized ciliopathies, we identified a novel recurrent tandem duplication of exons 27-30 (6.7 kb) in IFT140, c.3454-488_4182+2588dup p.(Tyr1152_Thr1394dup), missed by whole-exome sequencing. Pathogenicity of the mutation was assessed on the patients' skin fibroblasts. Several hundred patients with a ciliopathy phenotype were screened and biallelic mutations were identified in 11 families representing 12 pathogenic variants of which seven are novel. Among those unrelated families especially with a Mainzer-Saldino syndrome, eight carried the same tandem duplication (two at the homozygous state and six at the heterozygous state). In conclusion, we demonstrated the implication of structural variations in IFT140-related diseases expanding its mutation spectrum. We also provide evidence for a unique genomic event mediated by an Alu-Alu recombination occurring on a shared haplotype. We confirm that whole-genome sequencing can be instrumental in the ability to detect structural variants for genomic disorders. © 2018 Wiley Periodicals, Inc.
Starch Catabolism by a Prominent Human Gut Symbiont Is Directed by the Recognition of Amylose Helices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koropatkin, Nicole M.; Martens, Eric C.; Gordon, Jeffrey I.

2009-01-12

The human gut microbiota performs functions that are not encoded in our Homo sapiens genome, including the processing of otherwise undigestible dietary polysaccharides. Defining the structures of proteins involved in the import and degradation of specific glycans by saccharolytic bacteria complements genomic analysis of the nutrient-processing capabilities of gut communities. Here, we describe the atomic structure of one such protein, SusD, required for starch binding and utilization by Bacteroides thetaiotaomicron, a prominent adaptive forager of glycans in the distal human gut microbiota. The binding pocket of this unique α-helical protein contains an arc of aromatic residues that complements the natural helical structure of starch and imposes this conformation on bound maltoheptaose. Furthermore, SusD binds cyclic oligosaccharides with higher affinity than linear forms. The structures of several SusD/oligosaccharide complexes reveal an inherent ligand recognition plasticity dominated by the three-dimensional conformation of the oligosaccharides rather than specific interactions with the composite sugars.
Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

PubMed Central

Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

2015-01-01

Non-model plants i.e., the species which have one or all of the characters such as long life cycle, difficulty to grow in the laboratory or poor fecundity, have been schemed out of sequencing projects earlier, due to high running cost of Sanger sequencing. Consequently, the information about their genomics and key biological processes are inadequate. However, the advent of fast and cost effective next generation sequencing (NGS) platforms in the recent past has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight about mechanisms underlying processes of gene expression and secondary metabolism as well as facilitated development of genomic resources for diversity characterization, evolutionary analysis and marker assisted breeding even without prior availability of genomic sequence information. In this review we explore how different Next Gen Sequencing platforms, as well as recent advances in NGS based high throughput genotyping technologies are rewarding efforts on de-novo whole genome/transcriptome sequencing, development of genome wide sequence based markers resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016
A Novel Type of Polyhedral Viruses Infecting Hyperthermophilic Archaea.

PubMed

Liu, Ying; Ishino, Sonoko; Ishino, Yoshizumi; Pehau-Arnaudet, Gérard; Krupovic, Mart; Prangishvili, David

2017-07-01

Encapsidation of genetic material into polyhedral particles is one of the most common structural solutions employed by viruses infecting hosts in all three domains of life. Here, we describe a new virus of hyperthermophilic archaea, Sulfolobus polyhedral virus 1 (SPV1), which condenses its circular double-stranded DNA genome in a manner not previously observed for other known viruses. The genome complexed with virion proteins is wound up sinusoidally into a spherical coil which is surrounded by an envelope and further encased by an outer polyhedral capsid apparently composed of the 20-kDa virion protein. Lipids selectively acquired from the pool of host lipids are integral constituents of the virion. None of the major virion proteins of SPV1 show similarity to structural proteins of known viruses. However, minor structural proteins, which are predicted to mediate host recognition, are shared with other hyperthermophilic archaeal viruses infecting members of the order Sulfolobales. The SPV1 genome consists of 20,222 bp and contains 45 open reading frames, only one-fifth of which could be functionally annotated. IMPORTANCE Viruses infecting hyperthermophilic archaea display a remarkable morphological diversity, often presenting architectural solutions not employed by known viruses of bacteria and eukaryotes. Here we present the isolation and characterization of Sulfolobus polyhedral virus 1, which condenses its genome into a unique spherical coil. Due to the original genomic and architectural features of SPV1, the virus should be considered a representative of a new viral family, "Portogloboviridae." Copyright © 2017 American Society for Microbiology.
Detecting the Population Structure and Scanning for Signatures of Selection in Horses (Equus caballus) From Whole-Genome Sequencing Data

PubMed Central

Zhang, Cheng; Ni, Pan; Ahmad, Hafiz Ishfaq; Gemingguli, M; Baizilaitibei, A; Gulibaheti, D; Fang, Yaping; Wang, Haiyang; Asif, Akhtar Rasool; Xiao, Changyi; Chen, Jianhai; Ma, Yunlong; Liu, Xiangdong; Du, Xiaoyong; Zhao, Shuhong

2018-01-01

Animal domestication gives rise to gradual changes at the genomic level through selection in populations. Selective sweeps have been traced in the genomes of many animal species, including humans, cattle, and dogs. However, little is known regarding positional candidate genes and genomic regions that exhibit signatures of selection in domestic horses. In addition, an understanding of the genetic processes underlying horse domestication, especially the origin of Chinese native populations, is still lacking. In our study, we generated whole genome sequences from 4 Chinese native horses and combined them with 48 publicly available full genome sequences, from which 15 341 213 high-quality unique single-nucleotide polymorphism variants were identified. Kazakh and Lichuan horses are 2 typical Asian native breeds that were formed in Kazakh or Northwest China and South China, respectively. We detected 1390 loss-of-function (LoF) variants in protein-coding genes, and gene ontology (GO) enrichment analysis revealed that some LoF-affected genes were overrepresented in GO terms related to the immune response. Bayesian clustering, distance analysis, and principal component analysis demonstrated that the population structure of these breeds largely reflected weak geographic patterns. Kazakh and Lichuan horses were assigned to the same lineage with other Asian native breeds, in agreement with previous studies on the genetic origin of Chinese domestic horses. We applied the composite likelihood ratio method to scan for genomic regions showing signals of recent selection in the horse genome. A total of 1052 genomic windows of 10 kb, corresponding to 933 distinct core regions, significantly exceeded neutral simulations. The GO enrichment analysis revealed that the genes under selective sweeps were overrepresented with GO terms, including “negative regulation of canonical Wnt signaling pathway,” “muscle contraction,” and “axon guidance.” Frequent exercise training in domestic horses may have resulted in changes in the expression of genes related to metabolism, muscle structure, and the nervous system.
Sequencing Needs for Viral Diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, S N; Lam, M; Mulakken, N J

2004-01-26

We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
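The selection criterion described in this abstract (conserved among target strains, unique relative to other species) can be illustrated with a small k-mer sketch. This is not the authors' pipeline, whose details are not given here; the function names `kmers` and `candidate_signatures`, the use of exact k-mer matching, and the toy sequences are all assumptions made for illustration. Reverse complements and approximate matches are ignored.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_signatures(targets, near_neighbors, k):
    """Hypothetical helper: k-mers present in every target genome
    (conservation) and absent from every near-neighbor genome (uniqueness)."""
    if not targets:
        raise ValueError("at least one target genome is required")
    # Conserved: shared by all target strains.
    conserved = set.intersection(*(kmers(t, k) for t in targets))
    # Background: anything seen in any near neighbor.
    background = set().union(*(kmers(n, k) for n in near_neighbors))
    return sorted(conserved - background)


# Toy sequences, invented for illustration only.
targets = ["ACGTACGGA", "TTACGGACC"]
print(candidate_signatures(targets, ["ACGTAC"], k=4))
print(candidate_signatures(targets, ["GTACGG"], k=4))
```

Adding target genomes can only shrink the conserved set, and adding near neighbors can only shrink the unique set, which is the trade-off the simulations in the abstract quantify.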
Langevin Dynamics Simulations of Genome Packing in Bacteriophage

PubMed Central

Forrey, Christopher; Muthukumar, M.

2006-01-01

We use Langevin dynamics simulations to study the process by which a coarse-grained DNA chain is packaged within an icosahedral container. We focus our inquiry on three areas of interest in viral packing: the evolving structure of the packaged DNA condensate; the packing velocity; and the internal buildup of energy and resultant forces. Each of these areas has been studied experimentally, and we find that we can qualitatively reproduce experimental results. However, our findings also suggest that the phage genome packing process is fundamentally different than that suggested by the inverse spool model. We suggest that packing in general does not proceed in the deterministic fashion of the inverse-spool model, but rather is stochastic in character. As the chain configuration becomes compressed within the capsid, the structure, energy, and packing velocity all become dependent upon polymer dynamics. That many observed features of the packing process are rooted in condensed-phase polymer dynamics suggests that statistical mechanics, rather than mechanics, should serve as the proper theoretical basis for genome packing. Finally we suggest that, as a result of an internal protein unique to bacteriophage T7, the T7 genome may be significantly more ordered than is true for bacteriophage in general. PMID:16617089
Langevin dynamics simulations of genome packing in bacteriophage.

PubMed

Forrey, Christopher; Muthukumar, M

2006-07-01

We use Langevin dynamics simulations to study the process by which a coarse-grained DNA chain is packaged within an icosahedral container. We focus our inquiry on three areas of interest in viral packing: the evolving structure of the packaged DNA condensate; the packing velocity; and the internal buildup of energy and resultant forces. Each of these areas has been studied experimentally, and we find that we can qualitatively reproduce experimental results. However, our findings also suggest that the phage genome packing process is fundamentally different than that suggested by the inverse spool model. We suggest that packing in general does not proceed in the deterministic fashion of the inverse-spool model, but rather is stochastic in character. As the chain configuration becomes compressed within the capsid, the structure, energy, and packing velocity all become dependent upon polymer dynamics. That many observed features of the packing process are rooted in condensed-phase polymer dynamics suggests that statistical mechanics, rather than mechanics, should serve as the proper theoretical basis for genome packing. Finally we suggest that, as a result of an internal protein unique to bacteriophage T7, the T7 genome may be significantly more ordered than is true for bacteriophage in general.
Genomic Organization and Molecular Analysis of Virulent Bacteriophage 2972 Infecting an Exopolysaccharide-Producing Streptococcus thermophilus Strain

PubMed Central

Lévesque, Céline; Duplessis, Martin; Labonté, Jessica; Labrie, Steve; Fremaux, Christophe; Tremblay, Denise; Moineau, Sylvain

2005-01-01

The Streptococcus thermophilus virulent pac-type phage 2972 was isolated from a yogurt made in France in 1999. It is a representative of several phages that have emerged with the industrial use of the exopolysaccharide-producing S. thermophilus strain RD534. The genome of phage 2972 has 34,704 bp with an overall G+C content of 40.15%, making it the shortest S. thermophilus phage genome analyzed so far. Forty-four open reading frames (ORFs) encoding putative proteins of 40 or more amino acids were identified, and bioinformatic analyses led to the assignment of putative functions to 23 ORFs. Comparative genomic analysis of phage 2972 with the six other sequenced S. thermophilus phage genomes confirmed that the replication module is conserved and that cos- and pac-type phages have distinct structural and packaging genes. Two group I introns were identified in the genome of 2972. They interrupted the genes coding for the putative endolysin and the terminase large subunit. Phage mRNA splicing was demonstrated for both introns, and the secondary structures were predicted. Eight structural proteins were also identified by N-terminal sequencing and/or matrix-assisted laser desorption ionization—time-of-flight mass spectrometry. Detailed analysis of the putative minor tail proteins ORF19 and ORF21 as well as the putative receptor-binding protein ORF20 showed the following interesting features: (i) ORF19 is a hybrid protein, because it displays significant identity with both pac- and cos-type phages; (ii) ORF20 is unique; and (iii) a protein similar to ORF21 of 2972 was also found in the structure of the cos-type phage DT1, indicating that this structural protein is present in both S. thermophilus phage groups. The implications of these findings for phage classification are discussed. PMID:16000821
Meraculous2

DOE Office of Scientific and Technical Information (OSTI.GOV)

2014-06-01

Meraculous2 is a whole genome shotgun assembler for short-reads that is capable of assembling large, polymorphic genomes with modest computational requirements. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (deBruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. Additional features include (1) handling of allelic variation using "bubble" structures within the deBruijn graph, (2) gap closing of repetitive and low quality regions using localized assemblies, and (3) an improved scaffolding algorithm that produces more complete assemblies without compromising on scaffolding accuracy.
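The idea of traversing only k-mers with unique, well-supported extensions can be sketched as follows. This is a simplified illustration, not Meraculous code: it assumes that "high quality" can be approximated by a minimum observation count, it considers only forward extensions on one strand, and the names `forward_extensions`, `unique_extension`, `extend_contig`, and `min_count` are hypothetical.

```python
from collections import Counter, defaultdict


def forward_extensions(reads, k):
    """Count, for every k-mer, which base follows it in the reads."""
    ext = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k):
            ext[read[i:i + k]][read[i + k]] += 1
    return ext


def unique_extension(counts, min_count):
    """Return the single base seen at least min_count times, else None."""
    supported = [base for base, n in counts.items() if n >= min_count]
    return supported[0] if len(supported) == 1 else None


def extend_contig(seed, ext, k, min_count=2):
    """Extend a seed k-mer while the next base is unambiguous.
    Stops at forks, dead ends, and cycles."""
    contig, seen = seed, {seed}
    while True:
        kmer = contig[-k:]
        base = unique_extension(ext.get(kmer, Counter()), min_count)
        if base is None:
            break
        nxt = kmer[1:] + base
        if nxt in seen:  # avoid looping around a repeat
            break
        seen.add(nxt)
        contig += base
    return contig


# Toy reads, invented for illustration only.
reads = ["ACGTTGCA", "ACGTTGCA", "CGTTGCAT", "CGTTGCAT"]
ext = forward_extensions(reads, k=4)
print(extend_contig("ACGT", ext, k=4))
```

In this sketch a base seen fewer than `min_count` times is simply ignored, so a likely sequencing error does not create a fork; that stands in for the abstract's point about avoiding a separate error correction step.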
Improving the annotation of the Heterorhabditis bacteriophora genome.

PubMed

McLean, Florence; Berger, Duncan; Laetsch, Dominik R; Schwartz, Hillel T; Blaxter, Mark

2018-04-01

Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to control insect pests in horticulture. The genome sequence for this species was reported to encode an unusually high proportion of unique proteins and a paucity of secreted proteins compared to other related nematodes. We revisited the H. bacteriophora genome assembly and gene predictions to determine whether these unusual characteristics were biological or methodological in origin. We mapped an independent resequencing dataset to the genome and used the blobtools pipeline to identify potential contaminants. While present (0.2% of the genome span, 0.4% of predicted proteins), assembly contamination was not significant. Re-prediction of the gene set using BRAKER1 and published transcriptome data generated a predicted proteome that was very different from the published one. The new gene set had a much reduced complement of unique proteins, better completeness values that were in line with other related species' genomes, and an increased number of proteins predicted to be secreted. It is thus likely that methodological issues drove the apparent uniqueness of the initial H. bacteriophora genome annotation and that similar contamination and misannotation issues affect other published genome assemblies.
The role of internal duplication in the evolution of multi-domain proteins.

PubMed

Nacher, J C; Hayashida, M; Akutsu, T

2010-08-01

Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.
Gramene 2013: comparative plant genomics resources.

PubMed

Monaco, Marcela K; Stein, Joshua; Naithani, Sushma; Wei, Sharon; Dharmawardhana, Palitha; Kumari, Sunita; Amarasinghe, Vindhya; Youens-Clark, Ken; Thomason, James; Preece, Justin; Pasternak, Shiran; Olson, Andrew; Jiao, Yinping; Lu, Zhenyuan; Bolser, Dan; Kerhornou, Arnaud; Staines, Dan; Walts, Brandon; Wu, Guanming; D'Eustachio, Peter; Haw, Robin; Croft, David; Kersey, Paul J; Stein, Lincoln; Jaiswal, Pankaj; Ware, Doreen

2014-01-01

Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology.
Translational Genomics: Practical Applications of the Genomic Revolution in Breast Cancer.

PubMed

Yates, Lucy R; Desmedt, Christine

2017-06-01

The genomic revolution has fundamentally changed our perception of breast cancer. It is now apparent from DNA-based massively parallel sequencing data that at the genomic level, every breast cancer is unique and shaped by the mutational processes to which it was exposed during its lifetime. More than 90 breast cancer driver genes have been identified as recurrently mutated, and many occur at low frequency across the breast cancer population. Certain cancer genes are associated with traditionally defined histologic subtypes, but genomic intertumoral heterogeneity exists even between cancers that appear the same under the microscope. Most breast cancers contain subclonal populations, many of which harbor driver alterations, and subclonal structure is typically remodeled over time, across metastasis and as a consequence of treatment interventions. Genomics is deepening our understanding of breast cancer biology, contributing to an accelerated phase of targeted drug development and providing insights into resistance mechanisms. Genomics is also providing tools necessary to deliver personalized cancer medicine, but a number of challenges must still be addressed. Clin Cancer Res; 23(11); 2630-9. ©2017 American Association for Cancer Research.
Complete Genome Sequence of the Broad-Host-Range Vibriophage KVP40: Comparative Genomics of a T4-Related Bacteriophage

PubMed Central

Miller, Eric S.; Heidelberg, John F.; Eisen, Jonathan A.; Nelson, William C.; Durkin, A. Scott; Ciecko, Ann; Feldblyum, Tamara V.; White, Owen; Paulsen, Ian T.; Nierman, William C.; Lee, Jong; Szczypinski, Bridget; Fraser, Claire M.

2003-01-01

The complete genome sequence of the T4-like, broad-host-range vibriophage KVP40 has been determined. The genome sequence is 244,835 bp, with an overall G+C content of 42.6%. It encodes 386 putative protein-encoding open reading frames (CDSs), 30 tRNAs, 33 T4-like late promoters, and 57 potential rho-independent terminators. Overall, 92.1% of the KVP40 genome is coding, with an average CDS size of 587 bp. While 65% of the CDSs were unique to KVP40 and had no known function, the genome sequence and organization show specific regions of extensive conservation with phage T4. At least 99 KVP40 CDSs have homologs in the T4 genome (Blast alignments of 45 to 68% amino acid similarity). The shared CDSs represent 36% of all T4 CDSs but only 26% of those from KVP40. There is extensive representation of the DNA replication, recombination, and repair enzymes as well as the viral capsid and tail structural genes. KVP40 lacks several T4 enzymes involved in host DNA degradation, appears not to synthesize the modified cytosine (hydroxymethyl glucose) present in T-even phages, and lacks group I introns. KVP40 likely utilizes the T4-type sigma-55 late transcription apparatus, but features of early- or middle-mode transcription were not identified. There are 26 CDSs that have no viral homolog, and many did not necessarily originate from Vibrio spp., suggesting an even broader host range for KVP40. From these latter CDSs, an NAD salvage pathway was inferred that appears to be unique among bacteriophages. Features of the KVP40 genome that distinguish it from T4 are presented, as well as those, such as the replication and virion gene clusters, that are substantially conserved. PMID:12923095
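As a rough consistency check on the figures quoted in the KVP40 abstract above, the following Python sketch recomputes the coding fraction and the shared-CDS percentages. The function and variable names are illustrative only, and the coding-fraction estimate assumes non-overlapping CDSs, which the abstract does not state.

```python
# Figures quoted in the KVP40 abstract.
GENOME_BP = 244_835           # genome length in bp
N_CDS = 386                   # putative protein-coding CDSs
MEAN_CDS_BP = 587             # average CDS size in bp
SHARED_CDS = 99               # "at least 99" CDSs with T4 homologs
SHARED_FRACTION_OF_T4 = 0.36  # shared CDSs as a fraction of all T4 CDSs


def coding_fraction(n_cds, mean_cds_bp, genome_bp):
    """Approximate coding fraction, assuming CDSs do not overlap."""
    return n_cds * mean_cds_bp / genome_bp


def implied_total(shared, fraction):
    """Total gene count implied by a shared count and its fraction."""
    return round(shared / fraction)


kvp40_coding = coding_fraction(N_CDS, MEAN_CDS_BP, GENOME_BP)
kvp40_shared = SHARED_CDS / N_CDS
t4_cds_estimate = implied_total(SHARED_CDS, SHARED_FRACTION_OF_T4)

print(f"Estimated coding fraction: {kvp40_coding:.1%}")
print(f"Shared CDSs as a share of KVP40 CDSs: {kvp40_shared:.0%}")
print(f"T4 CDS count implied by the 36% figure: {t4_cds_estimate}")
```

The estimate lands near, but not exactly on, the reported 92.1%. The gap may come from rounding of the mean CDS size or from overlapping features. The 26% figure for KVP40 follows directly from 99 of 386 CDSs.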
Expression of virus-encoded proteinases: functional and structural similarities with cellular enzymes.

PubMed Central

Dougherty, W G; Semler, B L

1993-01-01

Many viruses express their genome, or part of their genome, initially as a polyprotein precursor that undergoes proteolytic processing. Molecular genetic analyses of viral gene expression have revealed that many of these processing events are mediated by virus-encoded proteinases. Biochemical activity studies and structural analyses of these viral enzymes reveal that they have remarkable similarities to cellular proteinases. However, the viral proteinases have evolved unique features that permit them to function in a cellular environment. In this article, the current status of plant and animal virus proteinases is described along with their role in the viral replication cycle. The reactions catalyzed by viral proteinases are not simple enzyme-substrate interactions; rather, the processing steps are highly regulated, are coordinated with other viral processes, and frequently involve the participation of other factors. PMID:8302216
Comparative chloroplast genomes of eleven Schima (Theaceae) species: Insights into DNA barcoding and phylogeny.

PubMed

Yu, Xiang-Qin; Drew, Bryan T; Yang, Jun-Bo; Gao, Lian-Ming; Li, De-Zhu

2017-01-01

Schima is an ecologically and economically important woody genus in the tea family (Theaceae). Unresolved species delimitations and phylogenetic relationships within Schima limit our understanding of the genus and hinder utilization of the genus for economic purposes. In the present study, we conducted comparative analysis among the complete chloroplast (cp) genomes of 11 Schima species. Our results indicate that Schima cp genomes possess a typical quadripartite structure, with conserved genomic structure and gene order. The size of the Schima cp genome is about 157 kilobase pairs (kb). They consistently encode 114 unique genes, including 80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with 17 duplicated in the inverted repeat (IR). These cp genomes are highly conserved and do not show obvious expansion or contraction of the IR region. The percent variability of the 68 coding and 93 noncoding (>150 bp) fragments is consistently less than 3%. The seven most widely touted DNA barcode regions as well as one promising barcode candidate showed low sequence divergence. Eight mutational hotspots were identified from the 11 cp genomes. These hotspots may potentially be useful as specific DNA barcodes for species identification of Schima. The 58 cpSSR loci reported here are complementary to the microsatellite markers identified from the nuclear genome, and will be leveraged for further population-level studies. Phylogenetic relationships among the 11 Schima species were resolved with strong support based on the cp genome data set, which corresponds well with the species distribution pattern. The data presented here will serve as a foundation to facilitate species identification, DNA barcoding and phylogenetic reconstructions for future exploration of Schima.

Genomic Expression Patterns in Menstrually-Related Migraine in Adolescents

PubMed Central

Hershey, Andrew; Horn, Paul; Kabbouche, Marielle; O'Brien, Hope; Powers, Scott

2011-01-01

Background: Exacerbation of migraine with menses is common in adolescent girls and women with migraine, occurring in up to 60% of females with migraine. These migraines are oftentimes longer and more disabling and may be related to estrogen levels and hormonal fluctuations. Objective: This study identifies the unique genomic expression pattern of menstrually-related migraine (MRM) in comparison to migraine occurring outside the menstrual period and headache free controls. Methods: Whole blood samples were obtained from female subjects having an acute migraine during their menstrual period (MRM) or outside of their menstrual period (nonMRM) and controls (C) – females having a menstrual period without any history of headache. The mRNA was isolated from these samples and the genomic profile was assessed. Affymetrix Human Exon ST 1.0 arrays were used to examine the genomic expression pattern differences between these three groups. Results: Blood genomic expression patterns were obtained on 56 subjects (MRM = 18, nonMRM = 18 and C = 20). Unique genomic expression patterns were observed for both MRM and nonMRM. For MRM, 77 genes were identified that were unique to MRM, while 61 genes were commonly expressed for MRM and nonMRM and 127 genes appeared to have a unique expression pattern for nonMRM. In addition, there were 279 genes that were differentially expressed for MRM compared to nonMRM that were not differentially expressed for nonMRM. Gene ontology of these samples indicated many of these groups of genes were functionally related and included categories of immunomodulation/inflammation, mitochondrial function and DNA homeostasis. Conclusions: Blood genomic patterns can accurately differentiate MRM from nonMRM. These results indicate that MRM involves a unique molecular biology pathway that can be identified with a specific biomarker and suggest that individuals with MRM have a different underlying genetic etiology. PMID:22220971
Genome-wide characterization of the Pectate Lyase-like (PLL) genes in Brassica rapa.

PubMed

Jiang, Jingjing; Yao, Lina; Miao, Ying; Cao, Jiashu

2013-11-01

Pectate lyases (PL) depolymerize demethylated pectin (pectate, EC 4.2.2.2) by catalyzing the eliminative cleavage of α-1,4-glycosidic linked galacturonan. Pectate Lyase-like (PLL) genes are one of the largest and most complex families in plants. However, studies on the phylogeny, gene structure, and expression of PLL genes are limited. To understand the potential functions of PLL genes in plants, we characterized their intron-exon structure, phylogenetic relationships, and protein structures, and measured their expression patterns in various tissues, specifically the reproductive tissues in Brassica rapa. Sequence alignments revealed two characteristic motifs in PLL genes. The chromosome location analysis indicated that 18 of the 46 PLL genes were located in the least fractionated sub-genome (LF) of B. rapa, while 16 were located in the medium fractionated sub-genome (MF1) and 12 in the more fractionated sub-genome (MF2). Quantitative RT-PCR analysis showed that BrPLL genes were expressed in various tissues, with most of them being expressed in flowers. Detailed qRT-PCR analysis identified 11 pollen specific PLL genes and several other genes with unique spatial expression patterns. In addition, some duplicated genes showed similar expression patterns. The phylogenetic analysis identified three PLL gene subfamilies in plants, among which subfamily II might have evolved from gene neofunctionalization or subfunctionalization. Therefore, this study opens the possibility for exploring the roles of PLL genes during plant development.
Historical changes in population structure during rice breeding programs in the northern limits of rice cultivation.

PubMed

Shinada, Hiroshi; Yamamoto, Toshio; Yamamoto, Eiji; Hori, Kiyosumi; Yonemaru, Junichi; Matsuba, Shuichi; Fujino, Kenji

2014-04-01

The local rice population was clearly differentiated into six groups over the 100-year history of rice breeding programs at the northern limit of rice cultivation in the world. Genetic improvements in plant breeding programs in local regions have led to the development of new cultivars with specific agronomic traits under environmental conditions and generated the unique genetic structures of local populations. Understanding historical changes in genome structures and phenotypic characteristics within local populations may be useful for identifying profitable genes and/or genetic resources and the creation of new gene combinations in plant breeding programs. In the present study, historical changes were elucidated in genome structures and phenotypic characteristics during 100-year rice breeding programs in Hokkaido, the northern limit of rice cultivation in the world. We selected 63 rice cultivars to represent the historical diversity of this local population from landraces to the current breeding lines. The results of the phylogenetic analysis demonstrated that these cultivars clearly differentiated into six groups over the history of rice breeding programs. Significant differences among these groups were detected in five of the seven traits, indicating that the differentiation of the Hokkaido rice population into these groups was correlated with these phenotypic changes. These results demonstrated that breeding practices in Hokkaido have created new genetic structures for adaptability to specific environmental conditions and breeding objectives. They also provide a new strategy for rice breeding programs in which such unique genes in local populations around the world can be used to explore the genetic potential of those populations.
The Laccaria and Tuber Genomes Reveal Unique Signatures of Mycorrhizal Symbiosis Evolution (2010 JGI User Meeting)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martin, Francis

Francis Martin from the French National Institute for Agricultural Research (INRA) talks on how "The Laccaria and Tuber genomes reveal unique signatures of mycorrhizal symbiosis evolution" on March 24, 2010 at the 5th Annual DOE JGI User Meeting.
Genome Sequence of a Canadian Vibrio parahaemolyticus Isolate with Unique Mobilizing Capacity.

PubMed

Bioteau, Audrey; Huguet, Kévin; Burrus, Vincent; Banerjee, Swapan

2018-06-14

Vibrio parahaemolyticus is a clinically significant marine bacterium implicated in gastroenteritis among consumers of raw or undercooked seafood. This report presents the whole-genome sequence of a unique strain of V. parahaemolyticus isolated from oysters harvested in Canada. © Crown copyright 2018.
Structural Mechanisms of Plant Glucan Phosphatases in Starch Metabolism

PubMed Central

Meekins, David A.; Vander Kooi, Craig W.; Gentry, Matthew S.

2016-01-01

Glucan phosphatases are a recently discovered class of enzymes that dephosphorylate starch and glycogen, thereby regulating energy metabolism. Plant genomes encode for two glucan phosphatases called Starch EXcess4 (SEX4) and Like Sex Four2 (LSF2) that regulate starch metabolism by selectively dephosphorylating glucose moieties within starch glucan chains. Recently, the structures of both SEX4 and LSF2 were determined, with and without phosphoglucan products bound, revealing the mechanism for their unique activities. This review explores the structural and enzymatic features of the plant glucan phosphatases and outlines how they are uniquely adapted for carrying out their cellular functions. We outline the physical mechanisms employed by SEX4 and LSF2 to interact with starch glucans: SEX4 binds glucan chains via a continuous glucan binding platform comprised of its Dual Specificity Phosphatase (DSP) domain and Carbohydrate Binding Module (CBM) while LSF2 utilizes Surface Binding Sites (SBSs). SEX4 and LSF2 both contain a unique network of aromatic residues in their catalytic DSP domains that serve as glucan engagement platforms and are unique to the glucan phosphatases. We also discuss the phosphoglucan substrate specificities inherent to SEX4 and LSF2 and outline structural features within the active site that govern glucan orientation. This review defines the structural mechanism of the plant glucan phosphatases with respect to phosphatases, starch metabolism, and protein-glucan interaction; thereby providing a framework for their applications in both agricultural and industrial settings. PMID:26934589
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Yanfeng; Zheng, Yi; Qin, Ling

Beta-hydroxyacid dehydrogenase (β-HAD) genes have been identified in all sequenced genomes of eukaryotes and prokaryotes. Their gene products catalyze the NAD+- or NADP+-dependent oxidation of various β-hydroxy acid substrates into their corresponding semialdehyde. In many fungal and bacterial genomes, multiple β-HAD genes are observed, leading to the hypothesis that these gene products may have unique, uncharacterized metabolic roles specific to their species. The genomes of Geobacter sulfurreducens and Geobacter metallireducens each contain two potential β-HAD genes. The protein sequences of one pair of these genes, Gs-βHAD (Q74DE4) and Gm-βHAD (Q39R98), have 65% sequence identity and 77% sequence similarity with each other. Both proteins reduce succinic semialdehyde, a metabolite of the GABA shunt. To further explore the structural and functional characteristics of these two β-HADs with a potentially unique substrate specificity, crystal structures for Gs-βHAD and Gm-βHAD in complex with NADP+ were determined to a resolution of 1.89 Å and 2.07 Å, respectively. The structures of both proteins are similar, composed of 14 α-helices and nine β-strands organized into two domains. Domain One (1-165) adopts a typical Rossmann fold composed of two α/β units: a six-strand parallel β-sheet surrounded by six α-helices (α1 – α6) followed by a mixed three-strand β-sheet surrounded by two α-helices (α7 and α8). Domain Two (166-287) is composed of a bundle of seven α-helices (α9 – α14). Four functional regions conserved in all β-HADs are spatially located near each other at the interdomain cleft in both Gs-βHAD and Gm-βHAD with a buried molecule of NADP+. The structural features of Gs-βHAD and Gm-βHAD are described in relation to the four conserved consensus sequences characteristic of β-HADs and the potential biochemical importance of these enzymes as an alternative pathway for the degradation of succinic semialdehyde.
Chromosomal structures and repetitive sequences divergence in Cucumis species revealed by comparative cytogenetic mapping.

PubMed

Zhang, Yunxia; Cheng, Chunyan; Li, Ji; Yang, Shuqiong; Wang, Yunzhu; Li, Ziang; Chen, Jinfeng; Lou, Qunfeng

2015-09-25

Differentiation and copy number of repetitive sequences directly affect chromosome structure, which contributes to reproductive isolation and speciation. Comparative cytogenetic mapping has been verified as an efficient tool to elucidate the differentiation and distribution of repetitive sequences in genomes. In the present study, the distinct chromosomal structures of five Cucumis species were revealed through the genomic in situ hybridization (GISH) technique and comparative cytogenetic mapping of major satellite repeats. Chromosome structures of five Cucumis species were investigated using GISH and comparative mapping of specific satellites. Southern hybridization was employed to study the proliferation of satellites, whose structural characteristics were helpful for analyzing chromosome evolution. Preferential distribution of repetitive DNAs at the subtelomeric regions was found in C. sativus, C. hystrix and C. metuliferus, while the majority was positioned at the pericentromeric heterochromatin regions in C. melo and C. anguria. Further, comparative GISH (cGISH) using genomic DNA of other species as probes revealed high homology of repeats between C. sativus and C. hystrix. Specific satellites including 45S rDNA, Type I/II, Type III, Type IV, CentM and telomeric repeat were then comparatively mapped in these species. Type I/II and Type IV produced bright signals at the subtelomeric regions of C. sativus and C. hystrix simultaneously, which might explain the significance of their amplification in the divergence of Cucumis subgenus from the ancient ancestor. Unique positioning of Type III and CentM only at the centromeric domains of C. sativus and C. melo, respectively, combined with unique Southern bands, revealed rapid evolutionary patterns of centromeric DNA in Cucumis. Obvious interstitial telomeric repeats were observed in chromosomes 1 and 2 of C. sativus, which might provide evidence of the fusion hypothesis of chromosome evolution from x = 12 to x = 7 in Cucumis species. In addition, a significant correlation was found between gene density along the chromosome and GISH band intensity in C. sativus and C. melo. In summary, comparative cytogenetic mapping of major satellites and GISH revealed the distinct differentiation of chromosome structure during species formation. The evolution of repetitive sequences was the main force for the divergence of Cucumis species from a common ancestor.
Evolution of short inverted repeat in cupressophytes, transfer of accD to nucleus in Sciadopitys verticillata and phylogenetic position of Sciadopityaceae.

PubMed

Li, Jia; Gao, Lei; Chen, Shanshan; Tao, Ke; Su, Yingjuan; Wang, Ting

2016-02-11

Sciadopitys verticillata is an evergreen conifer and an economically valuable tree used in construction, and it is the only member of the family Sciadopityaceae. Acquisition of the S. verticillata chloroplast (cp) genome will be useful for understanding the evolutionary mechanism of conifers and phylogenetic relationships among gymnosperms. In this study, we report for the first time the complete chloroplast genome of S. verticillata. The total genome is 138,284 bp in length, consisting of 118 unique genes. The S. verticillata cp genome has lost one copy of the canonical inverted repeats and shows a distinctive genomic structure compared with other cupressophytes. Fifty-three simple sequence repeat loci and 18 forward tandem repeats were identified in the S. verticillata cp genome. Based on the rearrangement of cupressophyte cp genomes, we propose one mechanism for the formation of inverted repeats: a tandem repeat occurred first, then rearrangement divided the tandem repeat into inverted repeats located at different regions. Phylogenetic estimates inferred from 59-gene sequences and cpDNA organizations have both shown that S. verticillata was sister to the clade consisting of Cupressaceae, Taxaceae, and Cephalotaxaceae. Moreover, the accD gene was found to be lost in the S. verticillata cp genome, and a nuclear copy was identified from two transcriptome data sets.
Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

PubMed Central

Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M.; Biswas, Bandana; Batley, Jacqueline

2015-01-01

Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found in every 208 kb (AS) and 202 kb (AM) with a bias mutation of G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167
Characterization and Comparative Analysis of the Complete Chloroplast Genome of the Critically Endangered Species Streptocarpus teitensis (Gesneriaceae).

PubMed

Kyalo, Cornelius M; Gichira, Andrew W; Li, Zhi-Zhong; Saina, Josphat K; Malombe, Itambo; Hu, Guang-Wan; Wang, Qing-Feng

2018-01-01

Streptocarpus teitensis (Gesneriaceae) is an endemic species listed as critically endangered in the International Union for Conservation of Nature (IUCN) red list of threatened species. However, the sequence and genome information of this species remains limited. In this article, we present the complete chloroplast genome structure of Streptocarpus teitensis and its evolution inferred through comparative studies with other related species. S. teitensis displayed a chloroplast genome size of 153,207 bp, containing a pair of inverted repeats (IR) of 25,402 bp each, separated by small and large single-copy (SSC and LSC) regions of 18,300 and 84,103 bp, respectively. The chloroplast genome was observed to contain 116 unique genes, of which 80 are protein-coding, 32 are transfer RNAs, and four are ribosomal RNAs. In addition, a total of 196 SSR markers were detected in the chloroplast genome of Streptocarpus teitensis with mononucleotides (57.1%) being the majority, followed by trinucleotides (33.2%) and dinucleotides and tetranucleotides (both 4.1%), and pentanucleotides being the least (1.5%). Genome alignment indicated that this genome was comparable to other sequenced members of order Lamiales. The phylogenetic analysis suggested that Streptocarpus teitensis is closely related to Lysionotus pauciflorus and Dorcoceras hygrometricum.
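The region sizes, gene counts and SSR percentages reported for Streptocarpus teitensis can be cross-checked with simple arithmetic. The Python sketch below is one way to do so; the names are illustrative, and the SSR counts are back-calculated from the rounded percentages, so they are estimates and not values stated in the abstract.

```python
# Values quoted in the Streptocarpus teitensis abstract.
TOTAL_BP = 153_207
IR_BP = 25_402    # each of the two inverted repeats
SSC_BP = 18_300   # small single-copy region
LSC_BP = 84_103   # large single-copy region

GENE_COUNTS = {"protein_coding": 80, "tRNA": 32, "rRNA": 4}

SSR_TOTAL = 196
SSR_PERCENT = {"mono": 57.1, "tri": 33.2, "di": 4.1, "tetra": 4.1, "penta": 1.5}


def quadripartite_length(lsc, ssc, ir):
    """Length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def percent_to_counts(total, percents):
    """Back-calculate integer counts from rounded percentages (estimate)."""
    return {kind: round(total * pct / 100) for kind, pct in percents.items()}


assembled = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
unique_genes = sum(GENE_COUNTS.values())
ssr_counts = percent_to_counts(SSR_TOTAL, SSR_PERCENT)

print(assembled == TOTAL_BP)  # region sizes add up to the genome size
print(unique_genes)           # total of the three gene classes
print(ssr_counts)             # estimated SSR count per repeat class
```

The four regions sum exactly to the reported genome size, and the three gene classes sum to the 116 unique genes given in the abstract.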
Comparative Genomics of the Balsaminaceae Sister Genera Hydrocera triflora and Impatiens pinfanensis

PubMed Central

Li, Zhi-Zhong; Saina, Josphat K.; Gichira, Andrew W.; Kyalo, Cornelius M.; Wang, Qing-Feng

2018-01-01

The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a reported or published complete chloroplast genome sequence. Therefore, chloroplast genome sequences of the two sister genera are significant to give insight into the phylogenetic position and understanding the evolution of the Balsaminaceae family among the Ericales. In this study, complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were characterized and assembled using a high-throughput sequencing method. The complete cp genomes were found to possess the typical quadripartite structure of land plants chloroplast genomes with double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in both genomes, of which 80 are protein-coding genes, 31 are distinct transfer RNA (tRNA) and four distinct ribosomal RNA (rRNA). Thirty codons, of which 29 had A/T ending codons, revealed relative synonymous codon usage values of >1, whereas those with G/C ending codons displayed values of <1. The simple sequence repeats comprise mostly the mononucleotide repeats A/T in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales. PMID:29360746
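For readers unfamiliar with relative synonymous codon usage (RSCU), the value for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below illustrates the calculation on a made-up sequence; the toy CDS and the three-amino-acid codon table are assumptions chosen for brevity, not data from the study.

```python
from collections import Counter

# Partial synonymous-codon table from the standard genetic code.
# Only three amino acids are listed to keep the sketch short.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}


def split_codons(seq):
    """Split an in-frame sequence into codons, ignoring a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rscu(seq):
    """Return RSCU per codon: observed count / mean count of its synonymous family."""
    counts = Counter(split_codons(seq))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent; RSCU undefined
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Hypothetical toy coding sequence, for illustration only.
TOY_CDS = "".join(["TTT", "TTT", "TTC", "AAA", "AAA", "AAG", "TTA", "TTA", "CTG"])
toy_rscu = rscu(TOY_CDS)

for codon, value in sorted(toy_rscu.items()):
    print(codon, round(value, 2))
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often, which is the sense in which the abstract reports A/T-ending codons with RSCU >1.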
Proteomic strategy for the identification of critical actors in reorganization of the post-meiotic male genome.

PubMed

Govin, Jerome; Gaucher, Jonathan; Ferro, Myriam; Debernardi, Alexandra; Garin, Jerome; Khochbin, Saadi; Rousseaux, Sophie

2012-01-01

After meiosis, during the final stages of spermatogenesis, the haploid male genome undergoes major structural changes, resulting in a shift from a nucleosome-based genome organization to the sperm-specific, highly compacted nucleoprotamine structure. Recent data support the idea that region-specific programming of the haploid male genome is of high importance for the post-fertilization events and for successful embryo development. Although these events constitute a unique and essential step in reproduction, the mechanisms by which they occur have remained completely obscure and the factors involved have mostly remained uncharacterized. Here, we sought a strategy to significantly increase our understanding of proteins controlling the haploid male genome reprogramming, based on the identification of proteins in two specific pools: those with the potential to bind nucleic acids (basic proteins) and proteins capable of binding basic proteins (acidic proteins). For the identification of acidic proteins, we developed an approach involving a transition-protein (TP)-based chromatography, which has the advantage of retaining not only acidic proteins due to the charge interactions, but also potential TP-interacting factors. A second strategy, based on an in-depth bioinformatic analysis of the identified proteins, was then applied to pinpoint within the lists obtained, male germ cells expressed factors relevant to the post-meiotic genome organization. This approach reveals a functional network of DNA-packaging proteins and their putative chaperones and sheds a new light on the way the critical transitions in genome organizations could take place. This work also points to a new area of research in male infertility and sperm quality assessments.
Identification of novel non-coding small RNAs from Streptococcus pneumoniae TIGR4 using high-resolution genome tiling arrays

PubMed Central

2010-01-01

Background: The identification of non-coding transcripts in human, mouse, and Escherichia coli has revealed their widespread occurrence and functional importance in both eukaryotic and prokaryotic life. In prokaryotes, studies have shown that non-coding transcripts participate in a broad range of cellular functions like gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Streptococcus pneumoniae (pneumococcus), an obligate human respiratory pathogen responsible for significant worldwide morbidity and mortality. Tiling microarrays enable genome-wide mRNA profiling as well as identification of novel transcripts at high resolution. Results: Here, we describe a high-resolution transcription map of the S. pneumoniae clinical isolate TIGR4 using genomic tiling arrays. Our results indicate that approximately 66% of the genome is expressed under our experimental conditions. We identified a total of 50 non-coding small RNAs (sRNAs) from the intergenic regions, of which 36 had no predicted function. Half of the identified sRNA sequences were found to be unique to the S. pneumoniae genome. We identified eight overrepresented sequence motifs among sRNA sequences that correspond to sRNAs in different functional categories. Tiling arrays also identified approximately 202 operon structures in the genome. Conclusions: In summary, the pneumococcal operon structures and novel sRNAs identified in this study enhance our understanding of the complexity and extent of the pneumococcal 'expressed' genome. Furthermore, the results of this study open up new avenues of research for understanding the complex RNA regulatory network governing S. pneumoniae physiology and virulence. PMID:20525227
Genomic characterisation of Arachis porphyrocalyx (Valls & C.E. Simpson, 2005) (Leguminosae): multiple origin of Arachis species with x = 9.

PubMed

Celeste, Silvestri María; Ortiz, Alejandra Marcela; Robledo, Germán Ariel; Valls, José Francisco Montenegro; Lavia, Graciela Inés

2017-01-01

The genus Arachis Linnaeus, 1753 comprises four species with x = 9, three belong to the section Arachis: Arachis praecox (Krapov. W.C. Greg. & Valls, 1994), Arachis palustris (Krapov. W.C. Greg. & Valls, 1994) and Arachis decora (Krapov. W.C. Greg. & Valls, 1994) and only one belongs to the section Erectoides: Arachis porphyrocalyx (Valls & C.E. Simpson, 2005). Recently, the x = 9 species of section Arachis have been assigned to G genome, the latest described so far. The genomic relationship of Arachis porphyrocalyx with these species is controversial. In the present work, we carried out a karyotypic characterisation of Arachis porphyrocalyx to evaluate its genomic structure and analyse the origin of all x = 9 Arachis species. Arachis porphyrocalyx showed a karyotype formula of 14m+4st, one pair of A chromosomes, satellited chromosomes type 8, one pair of 45S rDNA sites in the SAT chromosomes, one pair of 5S rDNA sites and pericentromeric C-DAPI+ bands in all chromosomes. Karyotype structure indicates that Arachis porphyrocalyx does not share the same genome type with the other three x = 9 species and neither with the remaining Erectoides species. Taking into account the geographic distribution, morphological and cytogenetic features, the origin of species with x = 9 of the genus Arachis cannot be unique; instead, they originated at least twice in the evolutionary history of the genus.
Insights into Platypus Population Structure and History from Whole-Genome Sequencing.

PubMed

Martin, Hilary C; Batty, Elizabeth M; Hussin, Julie; Westall, Portia; Daish, Tasman; Kolomyjec, Stephen; Piazza, Paolo; Bowden, Rory; Hawkins, Margaret; Grant, Tom; Moritz, Craig; Grutzner, Frank; Gongora, Jaime; Donnelly, Peter

2018-05-01

The platypus is an egg-laying mammal which, alongside the echidna, occupies a unique place in the mammalian phylogenetic tree. Despite widespread interest in its unusual biology, little is known about its population structure or recent evolutionary history. To provide new insights into the dispersal and demographic history of this iconic species, we sequenced the genomes of 57 platypuses from across the whole species range in eastern mainland Australia and Tasmania. Using a highly improved reference genome, we called over 6.7 M SNPs, providing an informative genetic data set for population analyses. Our results show very strong population structure in the platypus, with our sampling locations corresponding to discrete groupings between which there is no evidence for recent gene flow. Genome-wide data allowed us to establish that 28 of the 57 sampled individuals had at least a third-degree relative among other samples from the same river, often taken at different times. Taking advantage of a sampled family quartet, we estimated the de novo mutation rate in the platypus at 7.0 × 10^-9/bp/generation (95% CI 4.1 × 10^-9 to 1.2 × 10^-8/bp/generation). We estimated effective population sizes of ancestral populations and haplotype sharing between current groupings, and found evidence for bottlenecks and long-term population decline in multiple regions, and early divergence between populations in different regions. This study demonstrates the power of whole-genome sequencing for studying natural populations of an evolutionarily important species.
Insights into Platypus Population Structure and History from Whole-Genome Sequencing

PubMed Central

Martin, Hilary C; Hussin, Julie; Westall, Portia; Daish, Tasman; Kolomyjec, Stephen; Piazza, Paolo; Bowden, Rory; Hawkins, Margaret; Grant, Tom; Moritz, Craig; Grutzner, Frank; Gongora, Jaime; Donnelly, Peter

2018-01-01

The platypus is an egg-laying mammal which, alongside the echidna, occupies a unique place in the mammalian phylogenetic tree. Despite widespread interest in its unusual biology, little is known about its population structure or recent evolutionary history. To provide new insights into the dispersal and demographic history of this iconic species, we sequenced the genomes of 57 platypuses from across the whole species range in eastern mainland Australia and Tasmania. Using a highly improved reference genome, we called over 6.7 M SNPs, providing an informative genetic data set for population analyses. Our results show very strong population structure in the platypus, with our sampling locations corresponding to discrete groupings between which there is no evidence for recent gene flow. Genome-wide data allowed us to establish that 28 of the 57 sampled individuals had at least a third-degree relative among other samples from the same river, often taken at different times. Taking advantage of a sampled family quartet, we estimated the de novo mutation rate in the platypus at 7.0 × 10⁻⁹/bp/generation (95% CI 4.1 × 10⁻⁹–1.2 × 10⁻⁸/bp/generation). We estimated effective population sizes of ancestral populations and haplotype sharing between current groupings, and found evidence for bottlenecks and long-term population decline in multiple regions, and early divergence between populations in different regions. This study demonstrates the power of whole-genome sequencing for studying natural populations of an evolutionarily important species. PMID:29688544
Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D.

PubMed

Matsuzaki, Motomichi; Misumi, Osami; Shin-I, Tadasu; Maruyama, Shinichiro; Takahara, Manabu; Miyagishima, Shin-Ya; Mori, Toshiyuki; Nishida, Keiji; Yagisawa, Fumi; Nishida, Keishin; Yoshida, Yamato; Nishimura, Yoshiki; Nakao, Shunsuke; Kobayashi, Tamaki; Momoyama, Yu; Higashiyama, Tetsuya; Minoda, Ayumi; Sano, Masako; Nomoto, Hisayo; Oishi, Kazuko; Hayashi, Hiroko; Ohta, Fumiko; Nishizaka, Satoko; Haga, Shinobu; Miura, Sachiko; Morishita, Tomomi; Kabeya, Yukihiro; Terasawa, Kimihiro; Suzuki, Yutaka; Ishii, Yasuyuki; Asakawa, Shuichi; Takano, Hiroyoshi; Ohta, Niji; Kuroiwa, Haruko; Tanaka, Kan; Shimizu, Nobuyoshi; Sugano, Sumio; Sato, Naoki; Nozaki, Hisayoshi; Ogasawara, Naotake; Kohara, Yuji; Kuroiwa, Tsuneyoshi

2004-04-08

Small, compact genomes of ultrasmall unicellular algae provide information on the basic and essential genes that support the lives of photosynthetic eukaryotes, including higher plants. Here we report the 16,520,305-base-pair sequence of the 20 chromosomes of the unicellular red alga Cyanidioschyzon merolae 10D as the first complete algal genome. We identified 5,331 genes in total, of which at least 86.3% were expressed. Unique characteristics of this genomic structure include: a lack of introns in all but 26 genes; only three copies of ribosomal DNA units that maintain the nucleolus; and two dynamin genes that are involved only in the division of mitochondria and plastids. The conserved mosaic origin of Calvin cycle enzymes in this red alga and in green plants supports the hypothesis of the existence of single primary plastid endosymbiosis. The lack of a myosin gene, in addition to the unexpressed actin gene, suggests a simpler system of cytokinesis. These results indicate that the C. merolae genome provides a model system with a simple gene composition for studying the origin, evolution and fundamental mechanisms of eukaryotic cells.
Towards decoding the conifer giga-genome.

PubMed

Mackay, John; Dean, Jeffrey F D; Plomion, Christophe; Peterson, Daniel G; Cánovas, Francisco M; Pavy, Nathalie; Ingvarsson, Pär K; Savolainen, Outi; Guevara, M Ángeles; Fluch, Silvia; Vinceti, Barbara; Abarca, Dolores; Díaz-Sala, Carmen; Cervera, María-Teresa

2012-12-01

Several new initiatives have been launched recently to sequence conifer genomes including pines, spruces and Douglas-fir. Owing to the very large genome sizes ranging from 18 to 35 gigabases, sequencing even a single conifer genome had been considered unattainable until the recent throughput increases and cost reductions afforded by next generation sequencers. The purpose of this review is to describe the context for these new initiatives. A knowledge foundation has been acquired in several conifers of commercial and ecological interest through large-scale cDNA analyses, construction of genetic maps and gene mapping studies aiming to link phenotype and genotype. Exploratory sequencing in pines and spruces has pointed out some of the unique properties of these giga-genomes and suggested strategies that may be needed to extract value from their sequencing. The hope is that recent and pending developments in sequencing technology will contribute to rapidly filling the knowledge vacuum surrounding their structure, contents and evolution. Researchers are also making plans to use comparative analyses that will help to turn the data into a valuable resource for enhancing and protecting the world's conifer forests.
Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell

PubMed Central

Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A.; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

2016-01-01

The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. PMID:26601937

Natural Product Biosynthetic Diversity and Comparative Genomics of the Cyanobacteria.

PubMed

Dittmann, Elke; Gugger, Muriel; Sivonen, Kaarina; Fewer, David P

2015-10-01

Cyanobacteria are an ancient lineage of slow-growing photosynthetic bacteria and a prolific source of natural products with intricate chemical structures and potent biological activities. The bulk of these natural products are known from just a handful of genera. Recent efforts have elucidated the mechanisms underpinning the biosynthesis of a diverse array of natural products from cyanobacteria. Many of the biosynthetic mechanisms are unique to cyanobacteria or rarely described from other organisms. Advances in genome sequence technology have precipitated a deluge of genome sequences for cyanobacteria. This makes it possible to link known natural products to biosynthetic gene clusters but also accelerates the discovery of new natural products through genome mining. These studies demonstrate that cyanobacteria encode a huge variety of cryptic gene clusters for the production of natural products, and the known chemical diversity is likely to be just a fraction of the true biosynthetic capabilities of this fascinating and ancient group of organisms.
An Organismal CNV Mutator Phenotype Restricted to Early Human Development.

PubMed

Liu, Pengfei; Yuan, Bo; Carvalho, Claudia M B; Wuster, Arthur; Walter, Klaudia; Zhang, Ling; Gambin, Tomasz; Chong, Zechen; Campbell, Ian M; Coban Akdemir, Zeynep; Gelowani, Violet; Writzl, Karin; Bacino, Carlos A; Lindsay, Sarah J; Withers, Marjorie; Gonzaga-Jauregui, Claudia; Wiszniewska, Joanna; Scull, Jennifer; Stankiewicz, Paweł; Jhangiani, Shalini N; Muzny, Donna M; Zhang, Feng; Chen, Ken; Gibbs, Richard A; Rautenstrauss, Bernd; Cheung, Sau Wai; Smith, Janice; Breman, Amy; Shaw, Chad A; Patel, Ankita; Hurles, Matthew E; Lupski, James R

2017-02-23

De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.
Sequence Analysis of Leuconostoc mesenteroides Bacteriophage Φ1-A4 Isolated from an Industrial Vegetable Fermentation

PubMed Central

Lu, Z.; Altermann, E.; Breidt, F.; Kozyavkin, S.

2010-01-01

Vegetable fermentations rely on the proper succession of a variety of lactic acid bacteria (LAB). Leuconostoc mesenteroides initiates fermentation. As fermentation proceeds, L. mesenteroides dies off and other LAB complete the fermentation. Phages infecting L. mesenteroides may significantly influence the die-off of L. mesenteroides. However, no L. mesenteroides phages have been previously genetically characterized. Knowledge of more phage genome sequences may provide new insights into phage genomics, phage evolution, and phage-host interactions. We have determined the complete genome sequence of L. mesenteroides phage Φ1-A4, isolated from an industrial sauerkraut fermentation. The phage possesses a linear, double-stranded DNA genome consisting of 29,508 bp with a G+C content of 36%. Fifty open reading frames (ORFs) were predicted. Putative functions were assigned to 26 ORFs (52%), including 5 ORFs of structural proteins. The phage genome was modularly organized, containing DNA replication, DNA-packaging, head and tail morphogenesis, cell lysis, and DNA regulation/modification modules. In silico analyses showed that Φ1-A4 is a unique lytic phage with a large-scale genome inversion (∼30% of the genome). The genome inversion encompassed the lysis module, part of the structural protein module, and a cos site. The endolysin gene was flanked by two holin genes. The tail morphogenesis module was interspersed with cell lysis genes and other genes with unknown functions. The predicted amino acid sequences of the phage proteins showed little similarity to other phages, but functional analyses showed that Φ1-A4 clusters with several Lactococcus phages. To our knowledge, Φ1-A4 is the first genetically characterized L. mesenteroides phage. PMID:20118355
Complete chloroplast genome sequences of Praxelis (Eupatorium catarium Veldkamp), an important invasive species.

PubMed

Zhang, Ying; Li, Lei; Yan, Ting Liang; Liu, Qiang

2014-10-01

Praxelis (Eupatorium catarium Veldkamp) is a new hazardous invasive plant species that has caused serious economic losses and environmental damage in the Northern hemisphere tropical and subtropical regions. Although previous studies focused on detecting the biological characteristics of this plant to prevent its expansion, little effort has been made to understand the impact of Praxelis on the ecosystem from an evolutionary perspective. The genetic information of Praxelis is required for further phylogenetic identification and evolutionary studies. Here, we report the complete Praxelis chloroplast (cp) genome sequence. The Praxelis chloroplast genome is 151,410 bp in length including a small single-copy region (18,547 bp) and a large single-copy region (85,311 bp) separated by a pair of inverted repeats (IRs; 23,776 bp). The genome contains 85 unique genes and 18 genes duplicated in the IR region. The gene content and organization are similar to other Asteraceae tribe cp genomes. We also analyzed the whole cp genome sequence, repeat structure, codon usage, contraction of the IR and gene structure/organization features between native and invasive Asteraceae plants, in order to understand the evolution of organelle genomes between native and invasive Asteraceae. Comparative analysis identified 14 markers containing greater than 2% parsimony-informative characters, indicating that they are potentially informative markers for barcoding and phylogenetic analysis. Moreover, a sister relationship between Praxelis and seven other species in Asteraceae was found based on phylogenetic analysis of 28 protein-coding sequences. Complete cp genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.
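The region sizes quoted in the Praxelis record can be cross-checked against the stated total, on the assumption that the 23,776 bp figure is the length of each of the two inverted-repeat copies. The following is only a minimal sketch of that arithmetic; the function name `quadripartite_length` is hypothetical and is not taken from the cited study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Length implied by one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


# Figures quoted in the Praxelis record (bp); the IR value is assumed to be per copy.
praxelis = {"lsc": 85_311, "ssc": 18_547, "ir": 23_776}
reported_total = 151_410

implied_total = quadripartite_length(**praxelis)
print(implied_total, implied_total == reported_total)
```

Under that assumption the implied length agrees with the reported 151,410 bp. The same check can be applied to any other chloroplast record that reports LSC, SSC and IR sizes.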
Chloroplast Genome Sequence of Pigeonpea (Cajanus cajan (L.) Millspaugh) and Cajanus scarabaeoides (L.) Thouars: Genome Organization and Comparison with Other Legumes

PubMed Central

Kaila, Tanvi; Chaduvla, Pavan K.; Saxena, Swati; Bahadur, Kaushlendra; Gahukar, Santosh J.; Chaudhury, Ashok; Sharma, T. R.; Singh, N. K.; Gaikwad, Kishor

2016-01-01

Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein, especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides (L.) Thouars were generated. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used by different groups for developing a promising CMS system. A male sterile genotype harboring the C. scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of C. cajan is 152,242 bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of C. scarabaeoides is 152,201 bp long, having a quadripartite structure in which IRs of 25,402 bp separate 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of pigeonpea cp genome, consistent with other legumes. Comparison of cp genome with other legumes revealed the contraction of IR boundaries due to the absence of rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in C. scarabaeoides and C. cajan respectively. RNA editing was observed at 37 sites in both C. scarabaeoides and C. cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid plant systematics studies among major grain legumes. PMID:28018385
Chloroplast Genome Sequence of Pigeonpea (Cajanus cajan (L.) Millspaugh) and Cajanus scarabaeoides (L.) Thouars: Genome Organization and Comparison with Other Legumes.

PubMed

Kaila, Tanvi; Chaduvla, Pavan K; Saxena, Swati; Bahadur, Kaushlendra; Gahukar, Santosh J; Chaudhury, Ashok; Sharma, T R; Singh, N K; Gaikwad, Kishor

2016-01-01

Pigeonpea (Cajanus cajan (L.) Millspaugh), a diploid (2n = 22) legume crop with a genome size of 852 Mbp, serves as an important source of human dietary protein, especially in South East Asian and African regions. In this study, the draft chloroplast genomes of Cajanus cajan and Cajanus scarabaeoides (L.) Thouars were generated. Cajanus scarabaeoides is an important species of the Cajanus gene pool and has also been used by different groups for developing a promising CMS system. A male sterile genotype harboring the C. scarabaeoides cytoplasm was used for sequencing the plastid genome. The cp genome of C. cajan is 152,242 bp long, having a quadripartite structure with LSC of 83,455 bp and SSC of 17,871 bp separated by IRs of 25,398 bp. Similarly, the cp genome of C. scarabaeoides is 152,201 bp long, having a quadripartite structure in which IRs of 25,402 bp separate 83,423 bp of LSC and 17,854 bp of SSC. The pigeonpea cp genome contains 116 unique genes, including 30 tRNA, 4 rRNA, 78 predicted protein coding genes and 5 pseudogenes. A 50 kb inversion was observed in the LSC region of pigeonpea cp genome, consistent with other legumes. Comparison of cp genome with other legumes revealed the contraction of IR boundaries due to the absence of rps19 gene in the IR region. Chloroplast SSRs were mined and a total of 280 and 292 cpSSRs were identified in C. scarabaeoides and C. cajan respectively. RNA editing was observed at 37 sites in both C. scarabaeoides and C. cajan, with maximum occurrence in the ndh genes. The pigeonpea cp genome sequence would be beneficial in providing informative molecular markers which can be utilized for genetic diversity analysis and aid plant systematics studies among major grain legumes.
The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. sequence evaluation and plastome evolution.

PubMed

Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G

2008-04-01

The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome-genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I-III in one clade, while plastome IV appears to be closest to the common ancestor.
The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. Sequence evaluation and plastome evolution

PubMed Central

Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V.; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G.

2008-01-01

The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome–genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I–III in one clade, while plastome IV appears to be closest to the common ancestor. PMID:18299283
Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

PubMed

Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

2017-04-28

Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.
Mosaic Graphs and Comparative Genomics in Phage Communities

PubMed Central

Belcaid, Mahdi; Bergeron, Anne

2010-01-01

Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breadth and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications for phage metagenomics assembly, for the horizontal gene transfer paradigm, and more generally for the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413
The complete chloroplast genome sequence of Actinidia arguta using the PacBio RS II platform

PubMed Central

Lin, Miaomiao; Qi, Xiujuan; Chen, Jinyong; Sun, Leiming; Zhong, Yunpeng; Fang, Jinbao; Hu, Chungen

2018-01-01

Actinidia arguta is the most basal species in a phylogenetically and economically important genus in the family Actinidiaceae. To better understand the molecular basis of the Actinidia arguta chloroplast (cp), we sequenced the complete cp genome from A. arguta using Illumina and PacBio RS II sequencing technologies. The cp genome from A. arguta was 157,611 bp in length and composed of a pair of 24,232 bp inverted repeats (IRs) separated by a 20,463 bp small single copy region (SSC) and an 88,684 bp large single copy region (LSC). Overall, the cp genome contained 113 unique genes. The cp genomes from A. arguta and three other Actinidia species from GenBank were subjected to a comparative analysis. Indel mutation events and high frequencies of base substitution were identified, and the accD and ycf2 genes showed a high degree of variation within Actinidia. Forty-seven simple sequence repeats (SSRs) and 155 repetitive structures were identified, further demonstrating the rapid evolution in Actinidia. The cp genome analysis and the identification of variable loci provide vital information for understanding the evolution and function of the chloroplast and for characterizing Actinidia population genetics. PMID:29795601
Does genome organization matter in spermatozoa? A refined hypothesis to awaken the silent vessel.

PubMed

Ioannou, Dimitrios; Tempest, Helen G

2018-01-02

The spermatozoon is considered by many to be a silent vessel whose only function is to safely deliver the paternal genome to the maternal oocyte. As a result, the paternal contribution to fertilization and embryogenesis is frequently overlooked. However, the spermatozoon is a highly elaborate and specialized cell that is formed through the process of spermatogenesis. Spermatogenesis is a complex cellular program of differentiation that produces mature spermatozoa, which are essential for reproduction, fertilization, and normal embryonic development. The sperm cell is unique in morphology, chromatin structure, and function. Increasing evidence demonstrates that perturbations in chromatin integrity and organization could have a significant clinical impact on fertilization and embryogenesis. In this article we will review the evidence that demonstrates the paternal genome to be highly packaged and uniquely organized. We will postulate how the integrity and organization of the paternal genome likely has functional consequences that are critical for the establishment and maintenance of a viable pregnancy. In doing so, we hope to dispel the common myth that the sperm cell is a silent vessel; instead we will demonstrate the sperm cell to be a highly segmentally organized, epigenetically primed cell. Abbreviations: 2D: two-dimension; 3C: chromosome conformation capture; 3D: three-dimension; 4D: four-dimension; CTs: chromosome territories; FISH: fluorescence in situ hybridization; IMSI: intra cytoplasmic morphologically selected sperm injection; ICSI: intracytoplasmic sperm injection; IVF: in-vitro fertilization; mESCs: mouse embryonic stem cells; NORs: nuclear organizing regions; TADs: topologically associated domain.
GENOMIC ORGANIZATION OF THE SP22 GENE AND A UNIQUE PATTERN OF EXPRESSION IN SPERMATOGENIC CELLS

EPA Science Inventory

JE Welch*, RR Barbee*, JD Suarez*, NL Roberts*, and GR Klinefelter. Reproductive Toxicology Division, NHEERL, U.S. EPA, Research Triangle Park, NC, USA.
Our laboratory has rep...
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits.

PubMed

Larsson, John; Nylander, Johan Aa; Bergman, Birgitta

2011-06-30

Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities.
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets.
Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits

PubMed Central

2011-01-01

Background: Cyanobacteria belong to an ancient group of photosynthetic prokaryotes with pronounced variations in their cellular differentiation strategies, physiological capacities and choice of habitat. Sequencing efforts have shown that genomes within this phylum are equally diverse in terms of size and protein-coding capacity. To increase our understanding of genomic changes in the lineage, the genomes of 58 contemporary cyanobacteria were analysed for shared and unique orthologs. Results: A total of 404 protein families, present in all cyanobacterial genomes, were identified. Two of these are unique to the phylum, corresponding to an AbrB family transcriptional regulator and a gene that escapes functional annotation although its genomic neighbourhood is conserved among the organisms examined. The evolution of cyanobacterial genome sizes involves a mix of gains and losses in the clade encompassing complex cyanobacteria, while a single event of reduction is evident in a clade dominated by unicellular cyanobacteria. Genome sizes and gene family copy numbers evolve at a higher rate in the former clade, and multi-copy genes were predominant in large genomes. Orthologs unique to cyanobacteria exhibiting specific characteristics, such as filament formation, heterocyst differentiation, diazotrophy and symbiotic competence, were also identified. An ancestral character reconstruction suggests that the most recent common ancestor of cyanobacteria had a genome size of approx. 4.5 Mbp and 1678 to 3291 protein-coding genes, 4%-6% of which are unique to cyanobacteria today. Conclusions: The different rates of genome-size evolution and multi-copy gene abundance suggest two routes of genome development in the history of cyanobacteria. The expansion strategy is driven by gene-family enlargement and generates a broad adaptive potential, while the genome streamlining strategy imposes adaptations to highly specific niches, also reflected in their different functional capacities.
A few genomes display extreme proliferation of non-coding nucleotides which is likely to be the result of initial expansion of genomes/gene copy number to gain adaptive potential, followed by a shift to a life-style in a highly specific niche (e.g. symbiosis). This transition results in redundancy of genes and gene families, leading to an increase in junk DNA and eventually to gene loss. A few orthologs can be correlated with specific phenotypes in cyanobacteria, such as filament formation and symbiotic competence; these constitute exciting exploratory targets. PMID:21718514
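The distinction drawn in this abstract between protein families shared by all genomes and families unique to one genome can be illustrated with plain set operations. The following is a minimal sketch, not the authors' ortholog pipeline; the function name `core_and_unique` and the toy family labels are hypothetical, and it assumes ortholog clustering has already assigned each gene to a family.

```python
from typing import Dict, Set, Tuple


def core_and_unique(
    families_by_genome: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split protein families into a shared core and per-genome unique sets.

    families_by_genome maps a genome name to the set of family IDs found in it.
    Returns (core, unique): core holds families present in every genome,
    unique maps each genome to the families found in no other genome.
    """
    genomes = list(families_by_genome)
    # Core: intersection of the family sets of all genomes.
    core = set.intersection(*(families_by_genome[g] for g in genomes))
    unique = {}
    for g in genomes:
        # Union of the families seen in every other genome.
        others = set().union(*(families_by_genome[o] for o in genomes if o != g))
        unique[g] = families_by_genome[g] - others
    return core, unique


# Hypothetical toy input: three genomes and five families.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam5"},
}
core, unique = core_and_unique(toy)
print(sorted(core), {g: sorted(f) for g, f in unique.items()})
```

In this toy input only `fam1` is in the core, and each genome has exactly one unique family; `fam2` is neither core nor unique because it is shared by two of the three genomes.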
Mitochondrial DNA as a non-invasive biomarker: Accurate quantification using real time quantitative PCR without co-amplification of pseudogenes and dilution bias

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malik, Afshan N., E-mail: afshan.malik@kcl.ac.uk; Shahni, Rojeen; Rodriguez-de-Ledesma, Ana

2011-08-19

Highlights: Mitochondrial dysfunction is central to many diseases of oxidative stress. 95% of the mitochondrial genome is duplicated in the nuclear genome. Dilution of untreated genomic DNA leads to dilution bias. Unique primers and template pretreatment are needed to accurately measure mitochondrial DNA content.
Abstract: Circulating mitochondrial DNA (MtDNA) is a potential non-invasive biomarker of cellular mitochondrial dysfunction, the latter known to be central to a wide range of human diseases. Changes in MtDNA are usually determined by quantification of MtDNA relative to nuclear DNA (Mt/N) using real time quantitative PCR. We propose that the methodology for measuring Mt/N needs to be improved and we have identified that current methods have at least one of the following three problems: (1) As much of the mitochondrial genome is duplicated in the nuclear genome, many commonly used MtDNA primers co-amplify homologous pseudogenes found in the nuclear genome; (2) use of regions from genes such as β-actin and 18S rRNA, which are repetitive and/or highly variable, for qPCR of the nuclear genome leads to errors; and (3) the size difference between mitochondrial and nuclear genomes causes a 'dilution bias' when template DNA is diluted. We describe a PCR-based method that uses unique regions in the human mitochondrial genome not duplicated in the nuclear genome, a unique single-copy region in the nuclear genome, and template treatment to remove dilution bias, to accurately quantify MtDNA from human samples.
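The Mt/N ratio mentioned in this abstract is commonly derived from the difference between the quantification cycles (Ct) of a mitochondrial and a nuclear target. The sketch below shows that generic calculation only; it is not taken from the paper. The function name and the Ct values are hypothetical, and it assumes both assays amplify with the same efficiency (2.0 means perfect doubling per cycle).

```python
def mt_to_nuclear_ratio(ct_mito: float, ct_nuclear: float,
                        efficiency: float = 2.0) -> float:
    """Estimate mitochondrial copies per nuclear target copy from qPCR Ct values.

    Assumes (hypothetically) equal amplification efficiency for both assays.
    A lower Ct means more starting template, so the ratio grows as
    ct_nuclear - ct_mito grows.
    """
    return efficiency ** (ct_nuclear - ct_mito)


# Hypothetical Ct values for one sample.
ratio = mt_to_nuclear_ratio(ct_mito=18.0, ct_nuclear=26.0)
print(ratio)  # mitochondrial target copies per nuclear target copy
```

With a single-copy nuclear target in a diploid genome, multiplying the ratio by 2 gives an estimate of mitochondrial genome copies per cell, again under the equal-efficiency assumption.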
The genome of Eucalyptus grandis.

PubMed

Myburg, Alexander A; Grattapaglia, Dario; Tuskan, Gerald A; Hellsten, Uffe; Hayes, Richard D; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R K; Hussey, Steven G; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B; Togawa, Roberto C; Pappas, Marilia R; Faria, Danielle A; Sansaloni, Carolina P; Petroli, Cesar D; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A; Bornberg-Bauer, Erich; Kersting, Anna R; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E; Liston, Aaron; Spatafora, Joseph W; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C; Steane, Dorothy A; Vaillancourt, René E; Potts, Brad M; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J; Strauss, Steven H; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S; Schmutz, Jeremy

2014-06-19

Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
Massive gene acquisitions in Mycobacterium indicus pranii provide a perspective on mycobacterial evolution

PubMed Central

Saini, Vikram; Raghuvanshi, Saurabh; Khurana, Jitendra P.; Ahmed, Niyaz; Hasnain, Seyed E.; Tyagi, Akhilesh K.; Tyagi, Anil K.

2012-01-01

Understanding the evolutionary and genomic mechanisms responsible for turning the soil-derived saprophytic mycobacteria into lethal intracellular pathogens is a critical step towards the development of strategies for the control of mycobacterial diseases. In this context, Mycobacterium indicus pranii (MIP) is of specific interest because of its unique immunological and evolutionary significance. Evolutionarily, it is the progenitor of opportunistic pathogens belonging to M. avium complex and is endowed with features that place it between saprophytic and pathogenic species. Herein, we have sequenced the complete MIP genome to understand its unique life style, basis of immunomodulation and habitat diversification in mycobacteria. As a case of massive gene acquisitions, 50.5% of MIP open reading frames (ORFs) are laterally acquired. We show, for the first time for Mycobacterium, that MIP genome has mosaic architecture. These gene acquisitions have led to the enrichment of selected gene families critical to MIP physiology. Comparative genomic analysis indicates a higher antigenic potential of MIP imparting it a unique ability for immunomodulation. Besides, it also suggests an important role of genomic fluidity in habitat diversification within mycobacteria and provides a unique view of evolutionary divergence and putative bottlenecks that might have eventually led to intracellular survival and pathogenic attributes in mycobacteria. PMID:22965120
The genome sequence of Dyella jiangningensis FCAV SCS01 from a lignocellulose-decomposing microbial consortium metagenome reveals potential for biotechnological applications.

PubMed

Desiderato, Joana G; Alvarenga, Danillo O; Constancio, Milena T L; Alves, Lucia M C; Varani, Alessandro M

2018-05-14

Cellulose and its associated polymers are structural components of the plant cell wall, constituting one of the major sources of carbon and energy in nature. The carbon cycle is dependent on cellulose- and lignin-decomposing microbial communities and their enzymatic systems acting as consortia. These microbial consortia are under constant exploration for their potential biotechnological use. Herein, we describe the characterization of the genome of Dyella jiangningensis FCAV SCS01, recovered from the metagenome of a lignocellulose-degrading microbial consortium, which was isolated from a sugarcane crop soil under mechanical harvesting and covered by decomposing straw. The 4.7 Mbp genome encodes 4,194 proteins, including 36 glycoside hydrolases (GH), supporting the hypothesis that this bacterium may contribute to lignocellulose decomposition. Comparative analysis among fully sequenced Dyella species indicates that genome synteny is not conserved, and that D. jiangningensis FCAV SCS01 carries 372 unique genes, including genes coding for an alpha-glucosidase and a maltodextrin glucosidase, and other genes potentially related to biomass degradation. Additional genomic features, such as prophage-like regions, genomic islands and putative new biosynthetic clusters, were also uncovered. Overall, D. jiangningensis FCAV SCS01 represents the first South American Dyella genome sequenced and shows an exclusive feature among its genus, related to biomass degradation.
Mutational landscape of yeast mutator strains.

PubMed

Serero, Alexandre; Jubin, Claire; Loeillet, Sophie; Legoix-Né, Patricia; Nicolas, Alain G

2014-02-04

The acquisition of mutations is relevant to every aspect of genetics, including cancer and evolution of species on Darwinian selection. Genome variations arise from rare stochastic imperfections of cellular metabolism and deficiencies in maintenance genes. Here, we established the genome-wide spectrum of mutations that accumulate in a WT and in nine Saccharomyces cerevisiae mutator strains deficient for distinct genome maintenance processes: pol32Δ and rad27Δ (replication), msh2Δ (mismatch repair), tsa1Δ (oxidative stress), mre11Δ (recombination), mec1Δ tel1Δ (DNA damage/S-phase checkpoints), pif1Δ (maintenance of mitochondrial genome and telomere length), cac1Δ cac3Δ (nucleosome deposition), and clb5Δ (cell cycle progression). This study reveals the diversity, complexity, and ultimate unique nature of each mutational spectrum, composed of punctual mutations, chromosomal structural variations, and/or aneuploidies. The mutations produced in clb5Δ/CCNB1, mec1Δ/ATR, tel1Δ/ATM, and rad27Δ/FEN1 strains extensively reshape the genome, following a trajectory dependent on previous events. It comprises the transmission of unstable genomes that lead to colony mosaicisms. This comprehensive analytical approach of mutator defects provides a model to understand how genome variations might accumulate during clonal evolution of somatic cell populations, including tumor cells.

Explaining human uniqueness: genome interactions with environment, behaviour and culture.

PubMed

Varki, Ajit; Geschwind, Daniel H; Eichler, Evan E

2008-10-01

What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture - perhaps relaxing allowable thresholds for large-scale genomic diversity.
Optimizing Restriction Site Placement for Synthetic Genomes

NASA Astrophysics Data System (ADS)

Montes, Pablo; Memelli, Heraldo; Ward, Charles; Kim, Joondong; Mitchell, Joseph S. B.; Skiena, Steven

Restriction enzymes are the workhorses of molecular biology. We introduce a new problem that arises in the course of our project to design virus variants to serve as potential vaccines: we wish to modify virus-length genomes to introduce large numbers of unique restriction enzyme recognition sites while preserving wild-type function by substitution of synonymous codons. We show that the resulting problem is NP-Complete, give an exponential-time algorithm, and propose effective heuristics, which we show give excellent results for five sample viral genomes. Our resulting modified genomes have several times more unique restriction sites and reduce the maximum gap between adjacent sites by three to nine-fold.
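The two quantities reported at the end of this abstract, the number of unique restriction sites and the maximum gap between adjacent sites, can be computed for any sequence with a short scan. This is a minimal sketch of the bookkeeping only, not the authors' optimization algorithm or heuristics. The recognition sequences for EcoRI, BamHI and HindIII are standard; the function names, the toy sequence, and the choice to treat the genome as linear with its ends as gap boundaries are assumptions made here. Because these three sites are palindromic, scanning one strand is sufficient.

```python
from typing import Dict, List

# Standard recognition sequences for three common restriction enzymes.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_positions(seq: str, site: str) -> List[int]:
    """Return the 0-based start index of every (possibly overlapping) match."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def unique_sites(seq: str, sites: Dict[str, str] = SITES) -> Dict[str, int]:
    """Map each enzyme that cuts exactly once to the position of its site."""
    seq = seq.upper()
    result = {}
    for name, site in sites.items():
        positions = find_positions(seq, site)
        if len(positions) == 1:
            result[name] = positions[0]
    return result


def max_gap(seq_len: int, positions: List[int]) -> int:
    """Largest distance between adjacent sites on a linear sequence,
    counting the two sequence ends as boundaries (an assumption of this sketch)."""
    points = [0] + sorted(positions) + [seq_len]
    return max(b - a for a, b in zip(points, points[1:]))


# Hypothetical toy sequence: one EcoRI site, two BamHI sites, no HindIII site.
toy_seq = "ATGGAATTCCCGGATCCAAAGGATCCTTT"
found = unique_sites(toy_seq)
gap = max_gap(len(toy_seq), list(found.values()))
print(found, gap)
```

Here BamHI is excluded because it occurs twice, which is the property the abstract's redesign aims to change by substituting synonymous codons; checking that a substitution preserves the encoded protein is outside the scope of this sketch.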
Explaining human uniqueness: genome interactions with environment, behaviour and culture

PubMed Central

Varki, Ajit; Geschwind, Daniel H.; Eichler, Evan E.

2009-01-01

What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, ‘anthropogeny’ (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any ‘genes versus environment’ dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture — perhaps relaxing allowable thresholds for large-scale genomic diversity. PMID:18802414
Hepatitis A Virus Genome Organization and Replication Strategy.

PubMed

McKnight, Kevin L; Lemon, Stanley M

2018-04-02

Hepatitis A virus (HAV) is a positive-strand RNA virus classified in the genus Hepatovirus of the family Picornaviridae. It is an ancient virus with a long evolutionary history and multiple features of its capsid structure, genome organization, and replication cycle that distinguish it from other mammalian picornaviruses. HAV proteins are produced by cap-independent translation of a single, long open reading frame under direction of an inefficient, upstream internal ribosome entry site (IRES). Genome replication occurs slowly and is noncytopathic, with transcription likely primed by a uridylated protein primer as in other picornaviruses. Newly produced quasi-enveloped virions (eHAV) are released from cells in a nonlytic fashion in a unique process mediated by interactions of capsid proteins with components of the host cell endosomal sorting complexes required for transport (ESCRT) system. Copyright © 2018 Cold Spring Harbor Laboratory Press; all rights reserved.
Complete genome sequence of Paenibacillus sp. strain JDR-2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chow, Virginia; Nong, Guang; St. John, Franz J.

2012-01-01

Paenibacillus sp. strain JDR-2, an aggressively xylanolytic bacterium isolated from sweetgum (Liquidambar styraciflua) wood, is able to efficiently depolymerize, assimilate and metabolize 4-O-methylglucuronoxylan, the predominant structural component of hardwood hemicelluloses. A basis for this capability was first supported by the identification of genes and characterization of encoded enzymes and has been further defined by the sequencing and annotation of the complete genome, which we describe. In addition to genes implicated in the utilization of β-1,4-xylan, genes have also been identified for the utilization of other hemicellulosic polysaccharides. The genome of Paenibacillus sp. JDR-2 contains 7,184,930 bp in a single replicon with 6,288 protein-coding and 122 RNA genes. Uniquely prominent are 874 genes encoding proteins involved in carbohydrate transport and metabolism. The prevalence and organization of these genes support a metabolic potential for bioprocessing of hemicellulose fractions derived from lignocellulosic resources.
Genome-wide genetic diversity, population structure and admixture analysis in African and Asian cattle breeds.

PubMed

Edea, Z; Bhuiyan, M S A; Dessie, T; Rothschild, M F; Dadi, H; Kim, K S

2015-02-01

Knowledge about genetic diversity and population structure is useful for designing effective strategies to improve the production, management and conservation of farm animal genetic resources. Here, we present a comprehensive genome-wide analysis of genetic diversity, population structure and admixture based on 244 animals sampled from 10 cattle populations in Asia and Africa and genotyped for 69,903 autosomal single-nucleotide polymorphisms (SNPs) mainly derived from the indicine breed. Principal component analysis, STRUCTURE and distance analysis from high-density SNP data clearly revealed that the largest genetic difference occurred between the two domestic lineages (taurine and indicine), whereas Ethiopian cattle populations represent a mosaic of the humped zebu and taurine. Estimation of the genetic influence of zebu and taurine revealed that Ethiopian cattle were characterized by considerable levels of introgression from South Asian zebu, whereas Bangladeshi populations shared very low taurine ancestry. The relationships among Ethiopian cattle populations reflect their history of origin and admixture rather than phenotype-based distinctions. The high within-individual genetic variability observed in Ethiopian cattle represents an untapped opportunity for adaptation to changing environments and for implementation of within-breed genetic improvement schemes. Our results provide a basis for future applications of genome-wide SNP data to exploit the unique genetic makeup of indigenous cattle breeds and to facilitate their improvement and conservation.
Secondary structural entropy in RNA switch (Riboswitch) identification.

PubMed

Manzourolajdad, Amirhossein; Arnold, Jonathan

2015-04-28

RNA regulatory elements play a significant role in gene regulation. Riboswitches, a widespread group of regulatory RNAs, are vital components of many bacterial genomes. These regulatory elements generally function by forming a ligand-induced alternative fold that controls access to ribosome binding sites or other regulatory sites in RNA. Riboswitch-mediated mechanisms are ubiquitous across bacterial genomes. A typical class of riboswitch has its own unique structural and biological complexity, making de novo riboswitch identification a formidable task. Traditionally, riboswitches have been identified through comparative genomics based on sequence and structural homology. The limitations of structural-homology-based approaches, coupled with the assumption that there is a great diversity of undiscovered riboswitches, suggests the need for alternative methods for riboswitch identification, possibly based on features intrinsic to their structure. As of yet, no such reliable method has been proposed. We used structural entropy of riboswitch sequences as a measure of their secondary structural dynamics. Entropy values of a diverse set of riboswitches were compared to that of their mutants, their dinucleotide shuffles, and their reverse complement sequences under different stochastic context-free grammar folding models. Significance of our results was evaluated by comparison to other approaches, such as the base-pairing entropy and energy landscapes dynamics. Classifiers based on structural entropy optimized via sequence and structural features were devised as riboswitch identifiers and tested on Bacillus subtilis, Escherichia coli, and Synechococcus elongatus as an exploration of structural entropy based approaches. The unusually long untranslated region of the cotH in Bacillus subtilis, as well as upstream regions of certain genes, such as the sucC genes were associated with significant structural entropy values in genome-wide examinations. 
Various tests show that there is in fact a relationship between higher structural entropy and the potential for the RNA sequence to have alternative structures, within the limitations of our methodology. This relationship, though modest, is consistent across various tests. Understanding the behavior of structural entropy as a fairly new feature for RNA conformational dynamics, however, may require extensive exploratory investigation both across RNA sequences and folding models.
Comparative molecular dynamics studies of heterozygous open reading frames of DNA polymerase eta (η) in pathogenic yeast Candida albicans

NASA Astrophysics Data System (ADS)

Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman

2017-01-01

Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to the stability of any genome. Although the Candida Genome Database predicts the presence of orthologs of S. cerevisiae DNA polymerases, the functional and structural characterization of Candida DNA polymerases remains unexplored. DNA polymerase eta (Polη) is unique as it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous in carrying two Polη genes, and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamics simulations were performed to characterize these orthologs of DNA Polη. The overall structures of both ORFs remain conserved except for subtle differences in the palm and PAD domains. The complementation analysis showed that both ORFs equally suppressed the UV sensitivity of a yeast rad30 deletion strain. Our study has predicted two novel molecular interactions, a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from thumb to PAD. This study suggests that these ORFs are the homologues of yeast Polη and that, owing to their heterogeneity in C. albicans, they may play a significant role in pathogenicity.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kishigami, Satoshi; Kinki University, 930 Nishimitani, Kinokawa 599-5993; Wakayama, Sayaka

In mammals, the diploid genome of an individual, formed by fertilization of an egg by a spermatozoon, is unique and irreproducible. This implies that the unique diploid genome is doomed to end with the individual. Even cultured cells from the individual cannot normally proliferate in perpetuity because of the 'Hayflick limit'. However, Dolly, the sheep cloned from an adult mammary gland cell, changed this scenario. Somatic cell nuclear transfer (SCNT) enables us to produce offspring without germ cells, that is, to 'passage' a unique diploid genome. Animal cloning has also proven to be a powerful research tool for reprogramming in many mammals, notably mouse and cow. The mechanism underlying reprogramming, however, remains largely unknown, and animal cloning has been inefficient as a result. More momentously, in addition to abortion and fetal mortality, some cloned animals display possible premature aging phenotypes including early death and short telomere lengths. Under these inauspicious conditions, is it really possible for SCNT to preserve a diploid genome? Delightfully, in mouse and recently in primate, using SCNT we can produce nuclear transfer ES cells (ntES) more efficiently, which can preserve the eternal lifespan for the 'passage' of a unique diploid genome. Further, a new somatic cloning technique using histone-deacetylase inhibitors has been developed which can increase the previous cloning rates two to six times. Here, we introduce SCNT and its value as a preservation tool for a diploid genome while reviewing aging of cloned animals on cellular and individual levels.
Complete genome sequence of Brachyspira intermedia reveals unique genomic features in Brachyspira species and phage-mediated horizontal gene transfer

PubMed Central

2011-01-01

Background Brachyspira spp. colonize the intestines of some mammalian and avian species and show different degrees of enteropathogenicity. Brachyspira intermedia can cause production losses in chickens and strain PWS/AT now becomes the fourth genome to be completed in the genus Brachyspira. Results 15 classes of unique and shared genes were analyzed in B. intermedia, B. murdochii, B. hyodysenteriae and B. pilosicoli. The largest number of unique genes was found in B. intermedia and B. murdochii. This indicates the presence of larger pan-genomes. In general, hypothetical protein annotations are overrepresented among the unique genes. A 3.2 kb plasmid was found in B. intermedia strain PWS/AT. The plasmid was also present in the B. murdochii strain but not in nine other Brachyspira isolates. Within the Brachyspira genomes, genes had been translocated and also frequently switched between leading and lagging strands, a process that can be followed by different AT-skews in the third positions of synonymous codons. We also found evidence that bacteriophages were being remodeled and genes incorporated into them. Conclusions The accessory gene pool shapes species-specific traits. It is also influenced by reductive genome evolution and horizontal gene transfer. Gene-transfer events can cross both species and genus boundaries and bacteriophages appear to play an important role in this process. A mechanism for horizontal gene transfer appears to be gene translocations leading to remodeling of bacteriophages in combination with broad tropism. PMID:21816042
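The AT-skew used in this abstract to follow strand switches is conventionally defined as (A - T) / (A + T). The following is a minimal sketch of that calculation restricted to third codon positions. It is a simplification of the analysis described: it does not filter for synonymous codons, and the function name and toy coding sequence are hypothetical. It assumes the input is an in-frame coding sequence read from its first codon.

```python
def at_skew_third_positions(cds: str) -> float:
    """Compute AT-skew, (A - T) / (A + T), over third codon positions.

    cds is assumed to be an in-frame coding sequence starting at codon 1.
    Returns 0.0 when no A or T occurs at a third position.
    """
    # Every third base starting at index 2 is a third codon position.
    third = cds.upper()[2::3]
    a_count = third.count("A")
    t_count = third.count("T")
    if a_count + t_count == 0:
        return 0.0
    return (a_count - t_count) / (a_count + t_count)


# Hypothetical toy CDS: codons GCA GCA GCT GGA, third positions A, A, T, A.
skew = at_skew_third_positions("GCAGCAGCTGGA")
print(skew)
```

Comparing this value for genes on the leading and lagging strands is one way to look for the strand-dependent difference the abstract refers to; a gene that recently switched strands may still carry the skew of its former strand.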
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

PubMed

Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

2008-09-01

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. 2008 Wiley-Liss, Inc.
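The area under the receiver-operating characteristic curve quoted in this abstract has a simple rank interpretation: it is the probability that a randomly chosen true pair scores higher than a randomly chosen false pair, with ties counted as one half. The sketch below computes it that way for a small list of scores. It illustrates the metric only, not the authors' scoring function or benchmark; the function name and the scores are hypothetical.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    scores higher; ties contribute 0.5. Quadratic in input size, which is
    acceptable for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical alignment scores: same-family pairs vs different-family pairs.
same_family = [0.9, 0.8, 0.4]
different_family = [0.7, 0.3, 0.2]
auc = roc_auc(same_family, different_family)
print(round(auc, 3))
```

A value of 0.5 would correspond to a score that cannot separate the two groups, and 1.0 to perfect separation.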
Draft Genome Sequence of the Spore-Forming Probiotic Strain Bacillus coagulans Unique IS-2

PubMed Central

Upadrasta, Aditya; Pitta, Swetha

2016-01-01

Bacillus coagulans Unique IS-2 is a potential spore-forming probiotic that is commercially available on the market. The draft genome sequence presented here provides deep insight into the beneficial features of this strain for its safe use as a probiotic for various human and animal health applications. PMID:27103709
MODBASE, a database of annotated comparative protein structure models

PubMed Central

Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej

2002-01-01

MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
Evolutionary insights from Erwinia amylovora genomics.

PubMed

Smits, Theo H M; Rezzonico, Fabio; Duffy, Brion

2011-08-20

Evolutionary genomics is coming into focus with the recent availability of complete sequences for many bacterial species. A hypothesis on the evolution of virulence factors in the plant pathogen Erwinia amylovora, the causative agent of fire blight, was generated using comparative genomics with the genomes of E. amylovora, Erwinia pyrifoliae and Erwinia tasmaniensis. Putative virulence factors were mapped to the proposed genealogy of the genus Erwinia that is based on phylogenetic and genomic data. Ancestral origin of several virulence factors was identified, including levan biosynthesis, sorbitol metabolism, three T3SS and two T6SS. Other factors appeared to have been acquired after divergence of pathogenic species, including a second flagellar gene and two glycosyltransferases involved in amylovoran biosynthesis. E. amylovora singletons include 3 unique T3SS effectors that may explain differential virulence/host ranges. E. amylovora also has a unique T1SS export system, and a unique third T6SS gene cluster. Genetic analysis revealed signatures of foreign DNA suggesting that horizontal gene transfer is responsible for some of these differential features between the three species. Copyright © 2010 Elsevier B.V. All rights reserved.
A new family of β-helix proteins with similarities to the polysaccharide lyases

DOE PAGES

Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.

2014-09-27

Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
Complete genome analysis of three Acinetobacter baumannii clinical isolates in China for insight into the diversification of drug resistance elements.

PubMed

Zhu, Lingxiang; Yan, Zhongqiang; Zhang, Zhaojun; Zhou, Qiming; Zhou, Jinchun; Wakeland, Edward K; Fang, Xiangdong; Xuan, Zhenyu; Shen, Dingxia; Li, Quan-Zhen

2013-01-01

The emergence and rapid spread of multidrug-resistant Acinetobacter baumannii strains has become a major health threat worldwide. To better understand the genetic recombination related to the acquisition of drug-resistance elements during bacterial infection, we performed complete genome analysis on three newly isolated multidrug-resistant A. baumannii strains from Beijing using next-generation sequencing technology. Whole-genome comparison revealed that all 3 strains share some common drug-resistance elements, including the carbapenem-resistance blaOXA-23 and tetracycline (tet) resistance islands, but the genome structures are diversified among strains. Various genomic islands are interspersed on the genome with transposons and insertions, reflecting the recombination flexibility during the acquisition of the resistance elements. The blood-isolated BJAB07104 and ascites-isolated BJAB0868 exhibit high similarity in their genome structure to most of the global clone II strains, suggesting these two strains belong to the dominant outbreak strains prevalent worldwide. A large resistance island (RI) of about 121 kb, carrying a cluster of resistance-related genes, was inserted into the ATPase gene on the BJAB07104 and BJAB0868 genomes. A 78-kb insertion element carrying the tra locus and the blaOXA-23 island can be either inserted into one of the tniB genes in the 121-kb RI on the chromosome, or transformed into a conjugative plasmid in the two BJAB strains. The third strain in this study, BJAB0715, which was isolated from spinal fluid, exhibits much more divergence than the above two strains. It harbors multiple drug-resistance elements, including a truncated AbaR-22-like RI on its genome. One of the unique features of this strain is that it carries both blaOXA-23 and blaOXA-58 genes on its genome. In addition, an Acinetobacter lwoffii adeABC efflux element was found inserted into the ATPase position in BJAB0715. Our comparative analysis of currently completed Acinetobacter baumannii genomes revealed extensive and dynamic genome organizations, which may help the bacteria acquire drug-resistance elements into their genomes.
Genome Analysis of the Domestic Dog (Korean Jindo) by Massively Parallel Sequencing

PubMed Central

Kim, Ryong Nam; Kim, Dae-Soo; Choi, Sang-Haeng; Yoon, Byoung-Ha; Kang, Aram; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Jong-Joo; Ha, Ji-Hong; Toyoda, Atsushi; Fujiyama, Asao; Kim, Aeri; Kim, Min-Young; Park, Kun-Hyang; Lee, Kang Seon; Park, Hong-Seog

2012-01-01

Although pioneering sequencing projects have shed light on the boxer and poodle genomes, a number of challenges need to be met before the sequencing and annotation of the dog genome can be considered complete. Here, we present the DNA sequence of the Jindo dog genome, sequenced to 45-fold average coverage using Illumina massively parallel sequencing technology. A comparison of the sequence to the reference boxer genome led to the identification of 4 675 437 single nucleotide polymorphisms (SNPs, including 3 346 058 novel SNPs), 71 642 indels and 8131 structural variations. Of these, 339 non-synonymous SNPs and 3 indels are located within coding sequences (CDS). In particular, 3 non-synonymous SNPs and a 26-bp deletion occur in the TCOF1 locus, implying that the difference observed in cranial facial morphology between Jindo and boxer dogs might be influenced by those variations. Through the annotation of the Jindo olfactory receptor gene family, we found 2 unique olfactory receptor genes and 236 olfactory receptor genes harbouring non-synonymous homozygous SNPs that are likely to affect smelling capability. In addition, we determined the DNA sequence of the Jindo dog mitochondrial genome and identified Jindo dog-specific mtDNA genotypes. These Jindo genome data improve our understanding of dog genomic architecture and will be a very valuable resource for investigating not only dog genetics and genomics but also human and dog disease genetics and comparative genomics. PMID:22474061
DNA methylation at hepatitis B viral integrants is associated with methylation at flanking human genomic sequences

PubMed Central

Watanabe, Yoshiyuki; Yamamoto, Hiroyuki; Oikawa, Ritsuko; Toyota, Minoru; Yamamoto, Masakazu; Kokudo, Norihiro; Tanaka, Shinji; Arii, Shigeki; Yotsuyanagi, Hiroshi; Koike, Kazuhiko; Itoh, Fumio

2015-01-01

Integration of DNA viruses into the human genome plays an important role in various types of tumors, including hepatitis B virus (HBV)–related hepatocellular carcinoma. However, the molecular details and clinical impact of HBV integration on either human or HBV epigenomes are unknown. Here, we show that methylation of the integrated HBV DNA is related to the methylation status of the flanking human genome. We developed a next-generation sequencing-based method for structural methylation analysis of integrated viral genomes (denoted G-NaVI). This method is a novel approach that enables enrichment of viral fragments for sequencing using unique baits based on the sequence of the HBV genome. We detected integrated HBV sequences in the genome of the PLC/PRF/5 cell line and found variable levels of methylation within the integrated HBV genomes. Allele-specific methylation analysis revealed that the HBV genome often became significantly methylated when integrated into highly methylated host sites. After integration into unmethylated human genome regions such as promoters, however, the HBV DNA remains unmethylated and may eventually play an important role in tumorigenesis. The observed dynamic changes in DNA methylation of the host and viral genomes may functionally affect the biological behavior of HBV. These findings may impact public health given that millions of people worldwide are carriers of HBV. We also believe our assay will be a powerful tool to increase our understanding of the various types of DNA virus-associated tumorigenesis. PMID:25653310
Genomic characterisation of Wongabel virus reveals novel genes within the Rhabdoviridae.

PubMed

Gubala, Aneta J; Proll, David F; Barnard, Ross T; Cowled, Chris J; Crameri, Sandra G; Hyatt, Alex D; Boyle, David B

2008-06-20

Viruses belonging to the family Rhabdoviridae infect a variety of different hosts, including insects, vertebrates and plants. Currently, there are approximately 200 ICTV-recognised rhabdoviruses isolated around the world. However, the majority remain poorly characterised and only a fraction have been definitively assigned to genera. The genomic and transcriptional complexity displayed by several of the characterised rhabdoviruses indicates large diversity and complexity within this family. To enable an improved taxonomic understanding of this family, it is necessary to gain further information about the poorly characterised members of this family. Here we present the complete genome sequence and predicted transcription strategy of Wongabel virus (WONV), a previously uncharacterised rhabdovirus isolated from biting midges (Culicoides austropalpalis) collected in northern Queensland, Australia. The 13,196 nucleotide genome of WONV encodes five typical rhabdovirus genes N, P, M, G and L. In addition, the WONV genome contains three genes located between the P and M genes (U1, U2, U3) and two open reading frames overlapping with the N and G genes (U4, U5). These five additional genes and their putative protein products appear to be novel, and their functions are unknown. Predictive analysis of the U5 gene product revealed characteristics typical of viroporins, and indicated structural similarities with the alpha-1 protein (putative viroporin) of viruses in the genus Ephemerovirus. Phylogenetic analyses of the N and G proteins of WONV indicated closest similarity with the avian-associated Flanders virus; however, the genomes of these two viruses are significantly diverged. WONV displays a novel and unique genome structure that has not previously been described for any animal rhabdovirus.

Population genomics of Fusarium graminearum reveals signatures of divergent evolution within a major cereal pathogen

PubMed Central

2018-01-01

The cereal pathogen Fusarium graminearum is the primary cause of Fusarium head blight (FHB) and a significant threat to food safety and crop production. To elucidate population structure and identify genomic targets of selection within major FHB pathogen populations in North America we sequenced the genomes of 60 diverse F. graminearum isolates. We also assembled the first pan-genome for F. graminearum to clarify population-level differences in gene content potentially contributing to pathogen diversity. Bayesian and phylogenomic analyses revealed genetic structure associated with isolates that produce the novel NX-2 mycotoxin, suggesting a North American population that has remained genetically distinct from other endemic and introduced cereal-infecting populations. Genome scans uncovered distinct signatures of selection within populations, focused in high diversity, frequently recombining regions. These patterns suggested selection for genomic divergence at the trichothecene toxin gene cluster and thirteen additional regions containing genes potentially involved in pathogen specialization. Gene content differences further distinguished populations, in that 121 genes showed population-specific patterns of conservation. Genes that differentiated populations had predicted functions related to pathogenesis, secondary metabolism and antagonistic interactions, though a subset had unique roles in temperature and light sensitivity. Our results indicated that F. graminearum populations are distinguished by dozens of genes with signatures of selection and an array of dispensable accessory genes, suggesting that FHB pathogen populations may be equipped with different traits to exploit the agroecosystem. These findings provide insights into the evolutionary processes and genomic features contributing to population divergence in plant pathogens, and highlight candidate genes for future functional studies of pathogen specialization across evolutionarily and ecologically diverse fungi. PMID:29584736
Comparative analysis of grapevine whole-genome gene predictions, functional annotation, categorization and integration of the predicted gene sequences

PubMed Central

2012-01-01

Background: The first draft assembly and gene prediction of the grapevine genome (8X base coverage) was made available to the scientific community in 2007, and functional annotation was developed on this gene prediction. Since then, additional Sanger sequences were added to the 8X sequence pool and a new version of the genomic sequence with superior base coverage (12X) was produced. Results: To annotate the function of the genes predicted in the new assembly more efficiently, it is important to build on as much of the previous work as possible by transferring the 8X annotation of the genome to the 12X version. The 8X and 12X assemblies and gene predictions of the grapevine genome were compared to answer the question, “Can we uniquely map 8X predicted genes to 12X predicted genes?” The results show that while the assemblies and gene structure predictions are too different to make a complete mapping between them, most genes (18,725) showed a one-to-one relationship between 8X predicted genes and the latest version of 12X predicted genes. In addition, reshuffled genomic sequence structures appeared. These highlight regions of the genome where the gene predictions need to be taken with caution. Based on the new grapevine gene functional annotation and in-depth functional categorization, twenty-eight new molecular networks have been created for VitisNet while the existing networks were updated. Conclusions: The outcomes of this study provide a functional annotation of the 12X genes, an update of VitisNet, the system of the grapevine molecular networks, and a new functional categorization of genes. Data are available at the VitisNet website (http://www.sdstate.edu/ps/research/vitis/pathways.cfm). PMID:22554261
Comparative genomics of 9 novel Paenibacillus larvae bacteriophages

PubMed Central

Stamereilers, Casey; LeBlanc, Lucy; Yost, Diane; Amy, Penny S.; Tsourkas, Philippos K.

2016-01-01

American Foulbrood Disease, caused by the bacterium Paenibacillus larvae, is one of the most destructive diseases of the honeybee, Apis mellifera. Our group recently published the sequences of 9 new phages with the ability to infect and lyse P. larvae. Here, we characterize the genomes of these P. larvae phages, compare them to each other and to other sequenced P. larvae phages, and putatively identify protein function. The phage genomes are 38–45 kb in size and contain 68–86 genes, most of which appear to be unique to P. larvae phages. We classify P. larvae phages into 2 main clusters and one singleton based on nucleotide sequence identity. Three of the new phages show sequence similarity to other sequenced P. larvae phages, while the remaining 6 do not. We identified functions for roughly half of the P. larvae phage proteins, including structural, assembly, host lysis, DNA replication/metabolism, regulatory, and host-related functions. Structural and assembly proteins are highly conserved among our phages and are located at the start of the genome. DNA replication/metabolism, regulatory, and host-related proteins are located in the middle and end of the genome, and are not conserved, with many of these genes found in some of our phages but not others. All nine phages code for a conserved N-acetylmuramoyl-L-alanine amidase. Comparative analysis showed the phages use the “cohesive ends with 3′ overhang” DNA packaging strategy. This work is the first in-depth study of P. larvae phage genomics, and serves as a marker for future work in this area. PMID:27738559
Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample

PubMed Central

Gilks, William P.; Pennell, Tanya M.; Flis, Ilona; Webster, Matthew T.; Morrow, Edward H.

2016-01-01

As part of a study into the molecular genetics of sexually dimorphic complex traits, we used high-throughput sequencing to obtain data on genomic variation in an outbred laboratory-adapted fruit fly (Drosophila melanogaster) population. We successfully resequenced the whole genome of 220 hemiclonal females that were heterozygous for the same Berkeley reference line genome (BDGP6/dm6), and a unique haplotype from the outbred base population (LHM). The use of a static and known genetic background enabled us to obtain sequences from whole-genome phased haplotypes. We used a BWA-Picard-GATK pipeline for mapping sequence reads to the dm6 reference genome assembly, at a median depth of coverage of 31X, and have made the resulting data publicly available in the NCBI Short Read Archive (Accession number SRP058502). We used Haplotype Caller to discover and genotype 1,726,931 small genomic variants (SNPs and indels, <200 bp). Additionally, we detected and genotyped 167 large structural variants (1-100 kb in size) using GenomeStrip/2.0. Sequence and genotype data are publicly available at the corresponding NCBI databases: Short Read Archive, dbSNP and dbVar (BioProject PRJNA282591). We have also released the unfiltered genotype data, and the code and logs for data processing and summary statistics (https://zenodo.org/communities/sussex_drosophila_sequencing/). PMID:27928499
Genome mining of Streptomyces scabrisporus NF3 reveals symbiotic features including genes related to plant interactions.

PubMed

Ceapă, Corina Diana; Vázquez-Hernández, Melissa; Rodríguez-Luna, Stefany Daniela; Cruz Vázquez, Angélica Patricia; Jiménez Suárez, Verónica; Rodríguez-Sanoja, Romina; Alvarez-Buylla, Elena R; Sánchez, Sergio

2018-01-01

Endophytic bacteria are widespread and associated with plant physiological benefits, yet their genomes and secondary metabolites remain largely unidentified. In this study, we explored the genome of the endophyte Streptomyces scabrisporus NF3 for discovery of potential novel molecules as well as genes and metabolites involved in host interactions. The complete genomes of seven Streptomyces and three other more distantly related bacteria were used to define the functional landscape of this unique microbe. The S. scabrisporus NF3 genome is larger than the average Streptomyces genome and not structured for an obligate endosymbiotic lifestyle; this and the fact that it can grow in R2YE media imply that its life cycle could include a soil-living stage. The genome displays an enrichment of genes associated with amino acid production, protein secretion, secondary metabolite and antioxidant production and xenobiotic degradation, indicating that S. scabrisporus NF3 could contribute to the metabolic enrichment of soil microbial communities and of its hosts. Importantly, besides its metabolic advantages, the genome showed evidence for differential functional specificity and diversification of plant interaction molecules, including genes for the production of plant hormones, stress resistance molecules, chitinases, antibiotics and siderophores. Given the diversity of S. scabrisporus mechanisms for host upkeep, we propose that these strategies were necessary for its adaptation to plant hosts and to face changes in environmental conditions. PMID:29447216
Predicting protein crystallization propensity from protein sequence

PubMed Central

2011-01-01

The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein’s propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties were computed for over 1,300 proteins that expressed well but were insoluble, and for ~720 unique proteins that resulted in X-ray structures. The correlation of the protein’s iso-electric point and grand average hydropathy (GRAVY) with crystallizability was analyzed for full length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses we have identified a set of the attributes correlating with a protein’s propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor. PMID:20177794
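The GRAVY descriptor mentioned above can be computed directly from a primary sequence. The following is a minimal sketch, assuming the standard Kyte-Doolittle hydropathy scale (the abstract does not name the scale it used); the function name `gravy` is illustrative and is not taken from the authors' tools.

```python
# Kyte-Doolittle hydropathy values per residue (assumed scale; the
# abstract does not state which scale was used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy value over all residues."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(KYTE_DOOLITTLE))
    if unknown:
        # Hypothetical policy: reject non-standard residues instead of skipping them.
        raise ValueError(f"unsupported residues: {unknown}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

Positive values indicate a more hydrophobic sequence overall and negative values a more hydrophilic one; a classifier such as the SVM described above would take this value as one feature among several.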
A tutorial of diverse genome analysis tools found in the CoGe web-platform using Plasmodium spp. as a model

PubMed Central

Castillo, Andreina I; Nelson, Andrew D L; Haug-Baltzell, Asher K; Lyons, Eric

2018-01-01

Integrated platforms for storage, management, analysis and sharing of large quantities of omics data have become fundamental to comparative genomics. CoGe (https://genomevolution.org/coge/) is an online platform designed to manage and study genomic data, enabling both data- and hypothesis-driven comparative genomics. CoGe’s tools and resources can be used to organize and analyse both publicly available and private genomic data from any species. Here, we demonstrate the capabilities of CoGe through three example workflows using 17 Plasmodium genomes as a model. Plasmodium genomes present unique challenges for comparative genomics due to their rapidly evolving and highly variable genomic AT/GC content. These example workflows are intended to serve as templates to help guide researchers who would like to use CoGe to examine diverse aspects of genome evolution. In the first workflow, trends in genome composition and amino acid usage are explored. In the second, changes in genome structure and the distribution of synonymous (Ks) and non-synonymous (Kn) substitution values are evaluated across species with different levels of evolutionary relatedness. In the third workflow, microsyntenic analyses of multigene families’ genomic organization are conducted using two Plasmodium-specific gene families—serine repeat antigen, and cytoadherence-linked asexual gene—as models. In general, these example workflows show how to achieve quick, reproducible and shareable results using the CoGe platform. We were able to replicate previously published results, as well as leverage CoGe’s tools and resources to gain additional insight into various aspects of Plasmodium genome evolution. Our results highlight the usefulness of the CoGe platform, particularly in understanding complex features of genome evolution. Database URL: https://genomevolution.org/coge/
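The genome-composition trend explored in the first workflow rests on a simple quantity, the GC fraction of a sequence. The sketch below shows one way to compute it; it is an illustration only and is not CoGe's implementation. The function name and the choice to ignore ambiguous bases such as N are assumptions.

```python
def gc_content(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    # Assumption: ambiguous symbols (N, gaps, IUPAC codes) are excluded
    # from both the numerator and the denominator.
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    gc = sum(1 for b in bases if b in "GC")
    return gc / len(bases)
```

Applied per genome, or in sliding windows along a chromosome, this value allows AT-rich genomes to be compared with more GC-balanced relatives.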
The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis.

PubMed

Poczai, Péter; Hyvönen, Jaakko

2017-01-01

Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC–rps14 region and 6-kb in the trnG-UCC–psbD, followed by a third <1kb inversion in the trnT sequence. PMID:29095905
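The quadripartite structure reported above implies a total plastome length, which can be checked with a short calculation. This sketch assumes that the 26,803 bp figure applies to each of the two inverted regions (IRa and IRb) separately, as the abstract's wording suggests; the variable and function names are illustrative.

```python
# Region sizes in bp, as reported in the abstract.
LSC_BP = 87_439
IR_BP = 26_803   # assumed to be the length of each IR copy (IRa and IRb)
SSC_BP = 18_612


def plastome_length(lsc: int, ir: int, ssc: int) -> int:
    """Total length of a quadripartite plastome: LSC + 2 x IR + SSC."""
    return lsc + 2 * ir + ssc


total_bp = plastome_length(LSC_BP, IR_BP, SSC_BP)   # 159,657 bp
ir_fraction = 2 * IR_BP / total_bp                  # share of the genome in the two IR copies
```

Under this assumption the two IR copies account for roughly a third of the plastome.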
Discrimination of candidate subgenome-specific loci by linkage map construction with an S1 population of octoploid strawberry (Fragaria × ananassa).

PubMed

Nagano, Soichiro; Shirasawa, Kenta; Hirakawa, Hideki; Maeda, Fumi; Ishikawa, Masami; Isobe, Sachiko N

2017-05-12

The strawberry, Fragaria × ananassa, is an allo-octoploid (2n = 8x = 56) and outcrossing species. Although it is the most widely consumed berry crop in the world, its complex genome structure has hindered its genetic and genomic analysis, and thus discrimination of subgenome-specific loci among the homoeologous chromosomes is needed. In the present study, we identified candidate subgenome-specific single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) loci, and constructed a linkage map using an S1 mapping population of the cultivar 'Reikou' with an IStraw90 Axiom® SNP array and previously published SSR markers. The 'Reikou' linkage map consisted of 11,574 loci (11,002 SNPs and 572 SSR loci) spanning 2816.5 cM of 31 linkage groups. The 11,574 loci were located on 4738 unique positions (bins) on the linkage map. Of the mapped loci, 8999 (8588 SNPs and 411 SSR loci) showed a 1:2:1 segregation ratio of AA:AB:BB allele, which suggested the possibility of deriving loci from candidate subgenome-specific sequences. In addition, 2575 loci (2414 SNPs and 161 SSR loci) showed a 3:1 segregation of AB:BB allele, indicating they were derived from homoeologous genomic sequences. Comparative analysis of the homoeologous linkage groups revealed differences in genome structure among the subgenomes. Our results suggest that candidate subgenome-specific loci are randomly located across the genomes, and that there are small- to large-scale structural variations among the subgenomes. The mapped SNPs and SSR loci on the linkage map are expected to be seed points for the construction of pseudomolecules in the octoploid strawberry.
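Sorting markers into 1:2:1 and 3:1 segregation classes, as described above, is commonly done with a chi-square goodness-of-fit test. The abstract does not say which test the authors used, so the following is only a sketch of that common approach; the function names, the 0.05 significance level and the genotype counts in the usage note are hypothetical.

```python
# Chi-square critical values at alpha = 0.05 for 1 and 2 degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991}


def chi_square(observed, ratio):
    """Goodness-of-fit statistic of observed genotype counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio):
    """True if the counts do not deviate significantly from the ratio at alpha = 0.05."""
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must have the same number of classes")
    df = len(observed) - 1
    return chi_square(observed, ratio) <= CRITICAL_0_05[df]
```

For example, hypothetical counts of `[25, 50, 25]` for AA:AB:BB fit a 1:2:1 ratio, while `[50, 0, 50]` do not; a two-class marker scored as `[75, 25]` fits 3:1.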
The draft genomes of soft-shell turtle and green sea turtle yield insights into the development and evolution of the turtle-specific body plan.

PubMed

Wang, Zhuo; Pascual-Anaya, Juan; Zadissa, Amonida; Li, Wenqi; Niimura, Yoshihito; Huang, Zhiyong; Li, Chunyi; White, Simon; Xiong, Zhiqiang; Fang, Dongming; Wang, Bo; Ming, Yao; Chen, Yan; Zheng, Yuan; Kuraku, Shigehiro; Pignatelli, Miguel; Herrero, Javier; Beal, Kathryn; Nozawa, Masafumi; Li, Qiye; Wang, Juan; Zhang, Hongyan; Yu, Lili; Shigenobu, Shuji; Wang, Junyi; Liu, Jiannan; Flicek, Paul; Searle, Steve; Wang, Jun; Kuratani, Shigeru; Yin, Ye; Aken, Bronwen; Zhang, Guojie; Irie, Naoki

2013-06-01

The unique anatomical features of turtles have raised unanswered questions about the origin of their unique body plan. We generated and analyzed draft genomes of the soft-shell turtle (Pelodiscus sinensis) and the green sea turtle (Chelonia mydas); our results indicated the close relationship of the turtles to the bird-crocodilian lineage, from which they split ∼267.9-248.3 million years ago (Upper Permian to Triassic). We also found extensive expansion of olfactory receptor genes in these turtles. Embryonic gene expression analysis identified an hourglass-like divergence of turtle and chicken embryogenesis, with maximal conservation around the vertebrate phylotypic period, rather than at later stages that show the amniote-common pattern. Wnt5a expression was found in the growth zone of the dorsal shell, supporting the possible co-option of limb-associated Wnt signaling in the acquisition of this turtle-specific novelty. Our results suggest that turtle evolution was accompanied by an unexpectedly conservative vertebrate phylotypic period, followed by turtle-specific repatterning of development to yield the novel structure of the shell.
Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy

PubMed Central

Cartault, François; Munier, Patrick; Benko, Edgar; Desguerre, Isabelle; Hanein, Sylvain; Boddaert, Nathalie; Bandiera, Simonetta; Vellayoudom, Jeanine; Krejbich-Trotot, Pascale; Bintner, Marc; Hoarau, Jean-Jacques; Girard, Muriel; Génin, Emmanuelle; de Lonlay, Pascale; Fourmaintraux, Alain; Naville, Magali; Rodriguez, Diana; Feingold, Josué; Renouil, Michel; Munnich, Arnold; Westhof, Eric; Fähling, Michael; Lyonnet, Stanislas; Henrion-Caude, Alexandra

2012-01-01

The human genome is densely populated with transposons and transposon-like repetitive elements. Although the impact of these transposons and elements on human genome evolution is recognized, the significance of subtle variations in their sequence remains mostly unexplored. Here we report homozygosity mapping of an infantile neurodegenerative disease locus in a genetic isolate. Complete DNA sequencing of the 400-kb linkage locus revealed a point mutation in a primate-specific retrotransposon that was transcribed as part of a unique noncoding RNA, which was expressed in the brain. In vitro knockdown of this RNA increased neuronal apoptosis, consistent with the inappropriate dosage of this RNA in vivo and with the phenotype. Moreover, structural analysis of the sequence revealed a small RNA-like hairpin that was consistent with the putative gain of a functional site when mutated. We show here that a mutation in a unique transposable element-containing RNA is associated with lethal encephalopathy, and we suggest that RNAs that harbor evolutionarily recent repetitive elements may play important roles in human brain development. PMID:22411793
Genomic characterisation of Arachis porphyrocalyx (Valls & C.E. Simpson, 2005) (Leguminosae): multiple origin of Arachis species with x = 9

PubMed Central

Celeste, Silvestri María; Ortiz, Alejandra Marcela; Robledo, Germán Ariel; Valls, José Francisco Montenegro; Lavia, Graciela Inés

2017-01-01

The genus Arachis Linnaeus, 1753 comprises four species with x = 9, three belong to the section Arachis: Arachis praecox (Krapov. W.C. Greg. & Valls, 1994), Arachis palustris (Krapov. W.C. Greg. & Valls, 1994) and Arachis decora (Krapov. W.C. Greg. & Valls, 1994) and only one belongs to the section Erectoides: Arachis porphyrocalyx (Valls & C.E. Simpson, 2005). Recently, the x = 9 species of section Arachis have been assigned to G genome, the latest described so far. The genomic relationship of Arachis porphyrocalyx with these species is controversial. In the present work, we carried out a karyotypic characterisation of Arachis porphyrocalyx to evaluate its genomic structure and analyse the origin of all x = 9 Arachis species. Arachis porphyrocalyx showed a karyotype formula of 14m+4st, one pair of A chromosomes, satellited chromosomes type 8, one pair of 45S rDNA sites in the SAT chromosomes, one pair of 5S rDNA sites and pericentromeric C-DAPI+ bands in all chromosomes. Karyotype structure indicates that Arachis porphyrocalyx does not share the same genome type with the other three x = 9 species and neither with the remaining Erectoides species. Taking into account the geographic distribution, morphological and cytogenetic features, the origin of species with x = 9 of the genus Arachis cannot be unique; instead, they originated at least twice in the evolutionary history of the genus. PMID:28919947
A Reevaluation of Rice Mitochondrial Evolution Based on the Complete Sequence of Male-Fertile and Male-Sterile Mitochondrial Genomes

PubMed Central

Bentolila, Stéphane; Stefanov, Stefan

2012-01-01

Plant mitochondrial genomes have features that distinguish them radically from their animal counterparts: a high rate of rearrangement, of uptake and loss of DNA sequences, and an extremely low point mutation rate. Perhaps the most unique structural feature of plant mitochondrial DNAs is the presence of large repeated sequences involved in intramolecular and intermolecular recombination. In addition, rare recombination events can occur across shorter repeats, creating rearrangements that result in aberrant phenotypes, including pollen abortion, which is known as cytoplasmic male sterility (CMS). Using next-generation sequencing, we pyrosequenced two rice (Oryza sativa) mitochondrial genomes that belong to the indica subspecies. One genome is normal, while the other carries the wild abortive-CMS. We find that numerous rearrangements in the rice mitochondrial genome occur even between close cytotypes during rice evolution. Unlike maize (Zea mays), a closely related species also belonging to the grass family, integration of plastid sequences did not play a role in the sequence divergence between rice cytotypes. This study also uncovered an excellent candidate for the wild abortive-CMS-encoding gene; like most of the CMS-associated open reading frames that are known in other species, this candidate was created via a rearrangement, is chimeric in structure, possesses predicted transmembrane domains, and coopted the promoter of a genuine mitochondrial gene. Our data give new insights into rice mitochondrial evolution, correcting previous reports. PMID:22128137
The Naegleria genome: a free-living microbial eukaryote lends unique insights into core eukaryotic cell biology

PubMed Central

Fritz-Laylin, Lillian K.; Ginger, Michael L.; Walsh, Charles; Dawson, Scott C.; Fulton, Chandler

2016-01-01

Naegleria gruberi, a free-living protist, has long been treasured as a model for basal body and flagellar assembly due to its ability to differentiate from crawling amoebae into swimming flagellates. The full genome sequence of Naegleria gruberi has recently been used to estimate gene families ancestral to all eukaryotes and to identify novel aspects of Naegleria biology, including likely facultative anaerobic metabolism, extensive signaling cascades, and evidence for sexuality. Distinctive features of the Naegleria genome and nuclear biology provide unique perspectives for comparative cell biology, including cell division, RNA processing and nucleolar assembly. We highlight here exciting new and novel aspects of Naegleria biology identified through genomic analysis. PMID:21392573
De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies

PubMed Central

Karamitros, Timokratis; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

2016-01-01

Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
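The N(G)50 and N(G)75 statistics cited in this abstract summarize assembly contiguity. The following is a minimal sketch of how such values are commonly computed, not the authors' pipeline; the function name `nx` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
def nx(lengths, fraction=0.5, genome_size=None):
    """Return the Nx (or NGx) contig length for an assembly.

    Nx:  the length L such that contigs of length >= L together cover at
         least `fraction` of the total assembly length.
    NGx: same idea, but the target is `fraction` of an expected genome size
         (pass `genome_size`) instead of the assembly total.
    Returns None if the assembly is too short to reach the NGx target.
    """
    if not lengths:
        raise ValueError("no contigs supplied")
    base = genome_size if genome_size is not None else sum(lengths)
    target = fraction * base
    running = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


# Hypothetical contig lengths (arbitrary units), illustration only.
contigs = [80, 70, 50, 40, 30, 20]
n50 = nx(contigs)                         # half of 290 is reached at the 70 contig
n75 = nx(contigs, fraction=0.75)          # 217.5 is reached at the 40 contig
ng50 = nx(contigs, genome_size=400)       # 200 is reached at the 50 contig
```

A higher value indicates that more of the assembly sits in long contigs, which is why an increase in N(G)50 and N(G)75 after hybrid assembly is read as an improvement.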
De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

PubMed

Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

2016-01-01

Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
Comparative and demographic analysis of orangutan genomes

PubMed Central

Locke, Devin P.; Hillier, LaDeana W.; Warren, Wesley C.; Worley, Kim C.; Nazareth, Lynne V.; Muzny, Donna M.; Yang, Shiaw-Pyng; Wang, Zhengyuan; Chinwalla, Asif T.; Minx, Pat; Mitreva, Makedonka; Cook, Lisa; Delehaunty, Kim D.; Fronick, Catrina; Schmidt, Heather; Fulton, Lucinda A.; Fulton, Robert S.; Nelson, Joanne O.; Magrini, Vincent; Pohl, Craig; Graves, Tina A.; Markovic, Chris; Cree, Andy; Dinh, Huyen H.; Hume, Jennifer; Kovar, Christie L.; Fowler, Gerald R.; Lunter, Gerton; Meader, Stephen; Heger, Andreas; Ponting, Chris P.; Marques-Bonet, Tomas; Alkan, Can; Chen, Lin; Cheng, Ze; Kidd, Jeffrey M.; Eichler, Evan E.; White, Simon; Searle, Stephen; Vilella, Albert J.; Chen, Yuan; Flicek, Paul; Ma, Jian; Raney, Brian; Suh, Bernard; Burhans, Richard; Herrero, Javier; Haussler, David; Faria, Rui; Fernando, Olga; Darré, Fleur; Farré, Domènec; Gazave, Elodie; Oliva, Meritxell; Navarro, Arcadi; Roberto, Roberta; Capozzi, Oronzo; Archidiacono, Nicoletta; Valle, Giuliano Della; Purgato, Stefania; Rocchi, Mariano; Konkel, Miriam K.; Walker, Jerilyn A.; Ullmer, Brygg; Batzer, Mark A.; Smit, Arian F. A.; Hubley, Robert; Casola, Claudio; Schrider, Daniel R.; Hahn, Matthew W.; Quesada, Victor; Puente, Xose S.; Ordoñez, Gonzalo R.; López-Otín, Carlos; Vinar, Tomas; Brejova, Brona; Ratan, Aakrosh; Harris, Robert S.; Miller, Webb; Kosiol, Carolin; Lawson, Heather A.; Taliwal, Vikas; Martins, André L.; Siepel, Adam; RoyChoudhury, Arindam; Ma, Xin; Degenhardt, Jeremiah; Bustamante, Carlos D.; Gutenkunst, Ryan N.; Mailund, Thomas; Dutheil, Julien Y.; Hobolth, Asger; Schierup, Mikkel H.; Chemnick, Leona; Ryder, Oliver A.; Yoshinaga, Yuko; de Jong, Pieter J.; Weinstock, George M.; Rogers, Jeffrey; Mardis, Elaine R.; Gibbs, Richard A.; Wilson, Richard K.

2011-01-01

“Orangutan” is derived from the Malay term “man of the forest” and aptly describes the Southeast Asian great apes native to Sumatra and Borneo. The orangutan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orangutan draft genome assembly and short read sequence data from five Sumatran and five Bornean orangutan genomes. Our analyses reveal that, compared to other primates, the orangutan genome has many unique features. Structural evolution of the orangutan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe the first primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orangutan genome structure. Orangutans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400k years ago (ya), is more recent than most previous studies and underscores the complexity of the orangutan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (Ne) expanded exponentially relative to the ancestral Ne after the split, while Bornean Ne declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts. PMID:21270892
Comparative Genomics Identifies Epidermal Proteins Associated with the Evolution of the Turtle Shell.

PubMed

Holthaus, Karin Brigit; Strasser, Bettina; Sipos, Wolfgang; Schmidt, Heiko A; Mlitz, Veronika; Sukseree, Supawadee; Weissenbacher, Anton; Tschachler, Erwin; Alibardi, Lorenzo; Eckhart, Leopold

2016-03-01

The evolution of reptiles, birds, and mammals was associated with the origin of unique integumentary structures. Studies on lizards, chicken, and humans have suggested that the evolution of major structural proteins of the outermost, cornified layers of the epidermis was driven by the diversification of a gene cluster called Epidermal Differentiation Complex (EDC). Turtles have evolved unique defense mechanisms that depend on mechanically resilient modifications of the epidermis. To investigate whether the evolution of the integument in these reptiles was associated with specific adaptations of the sequences and expression patterns of EDC-related genes, we utilized newly available genome sequences to determine the epidermal differentiation gene complement of turtles. The EDC of the western painted turtle (Chrysemys picta bellii) comprises more than 100 genes, including at least 48 genes that encode proteins referred to as beta-keratins or corneous beta-proteins. Several EDC proteins have evolved cysteine/proline contents beyond 50% of total amino acid residues. Comparative genomics suggests that distinct subfamilies of EDC genes have been expanded and partly translocated to loci outside of the EDC in turtles. Gene expression analysis in the European pond turtle (Emys orbicularis) showed that EDC genes are differentially expressed in the skin of the various body sites and that a subset of beta-keratin genes within the EDC as well as those located outside of the EDC are expressed predominantly in the shell. Our findings give strong support to the hypothesis that the evolutionary innovation of the turtle shell involved specific molecular adaptations of epidermal differentiation. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Brachypodium distachyon as a Genetic Model System.

PubMed

Kellogg, Elizabeth A

2015-01-01

Brachypodium distachyon has emerged as a powerful model system for studying the genetics of flowering plants. Originally chosen for its phylogenetic proximity to the large-genome cereal crops wheat and barley, it is proving to be useful for more than simply providing markers for comparative mapping. Studies in B. distachyon have provided new insight into the structure and physiology of plant cell walls, the development and chemical composition of endosperm, and the genetic basis for cold tolerance. Recent work on auxin transport has uncovered mechanisms that apply to all angiosperms other than Arabidopsis. In addition to the areas in which it is currently used, B. distachyon is uniquely suited for studies of floral development, vein patterning, the controls of the perennial versus annual habit, and genome organization.
Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle.

PubMed

Kirkness, Ewen F; Haas, Brian J; Sun, Weilin; Braig, Henk R; Perotti, M Alejandra; Clark, John M; Lee, Si Hyeock; Robertson, Hugh M; Kennedy, Ryan C; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V; Elsik, Christine G; Graur, Dan; Hill, Catherine A; Veenstra, Jan A; Walenz, Brian; Tubío, José Manuel C; Ribeiro, José M C; Rozas, Julio; Johnston, J Spencer; Reese, Justin T; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L; Tomoyasu, Yoshinori; Kraus, Emily; Krause, Emily; Mittapalli, Omprakash; Margam, Venu M; Li, Hong-Mei; Meyer, Jason M; Johnson, Reed M; Romero-Severson, Jeanne; Vanzee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M; Yoon, Kyong S; Strycharz, Joseph P; Unger, Maria F; Christley, Scott; Lobo, Neil F; Seufferheld, Manfredo J; Wang, Naikuan; Dasch, Gregory A; Struchiner, Claudio J; Madey, Greg; Hannick, Linda I; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C; Cameron, Stephen; Bruggner, Robert V; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R; Sutton, Granger G; Lawson, Daniel; Waterhouse, Robert M; Venter, J Craig; Strausberg, Robert L; Berenbaum, May R; Collins, Frank H; Zdobnov, Evgeny M; Pittendrigh, Barry R

2010-07-06

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.
Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle

PubMed Central

Kirkness, Ewen F.; Haas, Brian J.; Sun, Weilin; Braig, Henk R.; Perotti, M. Alejandra; Clark, John M.; Lee, Si Hyeock; Robertson, Hugh M.; Kennedy, Ryan C.; Elhaik, Eran; Gerlach, Daniel; Kriventseva, Evgenia V.; Elsik, Christine G.; Graur, Dan; Hill, Catherine A.; Veenstra, Jan A.; Walenz, Brian; Tubío, José Manuel C.; Ribeiro, José M. C.; Rozas, Julio; Johnston, J. Spencer; Reese, Justin T.; Popadic, Aleksandar; Tojo, Marta; Raoult, Didier; Reed, David L.; Tomoyasu, Yoshinori; Kraus, Emily; Mittapalli, Omprakash; Margam, Venu M.; Li, Hong-Mei; Meyer, Jason M.; Johnson, Reed M.; Romero-Severson, Jeanne; VanZee, Janice Pagel; Alvarez-Ponce, David; Vieira, Filipe G.; Aguadé, Montserrat; Guirao-Rico, Sara; Anzola, Juan M.; Yoon, Kyong S.; Strycharz, Joseph P.; Unger, Maria F.; Christley, Scott; Lobo, Neil F.; Seufferheld, Manfredo J.; Wang, NaiKuan; Dasch, Gregory A.; Struchiner, Claudio J.; Madey, Greg; Hannick, Linda I.; Bidwell, Shelby; Joardar, Vinita; Caler, Elisabet; Shao, Renfu; Barker, Stephen C.; Cameron, Stephen; Bruggner, Robert V.; Regier, Allison; Johnson, Justin; Viswanathan, Lakshmi; Utterback, Terry R.; Sutton, Granger G.; Lawson, Daniel; Waterhouse, Robert M.; Venter, J. Craig; Strausberg, Robert L.; Collins, Frank H.; Zdobnov, Evgeny M.; Pittendrigh, Barry R.

2010-01-01

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens. PMID:20566863
Combined Analysis of the Chloroplast Genome and Transcriptome of the Antarctic Vascular Plant Deschampsia antarctica Desv

PubMed Central

Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

2014-01-01

Background: Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. Results: The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5′- or 3′-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. Conclusions: We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome. PMID:24647560
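The region lengths and gene counts reported in this abstract can be cross-checked with simple arithmetic: a quadripartite plastome consists of one LSC, one SSC and two IR copies. The short sketch below only re-adds the published figures; the variable names are hypothetical.

```python
# Region lengths in bp, as reported in the abstract.
lsc = 79_881            # large single-copy region
ssc = 12_519            # small single-copy region
ir = 21_481             # one inverted repeat; the genome carries two copies
reported_total = 135_362

# Quadripartite structure: LSC + SSC + 2 x IR.
computed_total = lsc + ssc + 2 * ir

# Unique gene counts by class, as reported in the abstract.
gene_counts = {"protein_coding": 81, "tRNA": 29, "rRNA": 4}
unique_genes = sum(gene_counts.values())

print(computed_total == reported_total)   # the region lengths add up to the genome length
print(unique_genes)                       # matches the 114 unique genes reported
```

Both checks agree with the abstract: the four regions sum to 135,362 bp, and the three gene classes sum to 114.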
Combined analysis of the chloroplast genome and transcriptome of the Antarctic vascular plant Deschampsia antarctica Desv.

PubMed

Lee, Jungeun; Kang, Yoonjee; Shin, Seung Chul; Park, Hyun; Lee, Hyoungseok

2014-01-01

Antarctic hairgrass (Deschampsia antarctica Desv.) is the only natural grass species in the maritime Antarctic. It has been researched as an important ecological marker and as an extremophile plant for studies on stress tolerance. Despite its importance, little genomic information is available for D. antarctica. Here, we report the complete chloroplast genome, transcriptome profiles of the coding/noncoding genes, and the posttranscriptional processing by RNA editing in the chloroplast system. The complete chloroplast genome of D. antarctica is 135,362 bp in length with a typical quadripartite structure, including the large (LSC: 79,881 bp) and small (SSC: 12,519 bp) single-copy regions, separated by a pair of identical inverted repeats (IR: 21,481 bp). It contains 114 unique genes, including 81 unique protein-coding genes, 29 tRNA genes, and 4 rRNA genes. Sequence divergence analysis with other plastomes from the BEP clade of the grass family suggests a sister relationship between D. antarctica, Festuca arundinacea and Lolium perenne of the Poeae tribe, based on the whole plastome. In addition, we conducted high-resolution mapping of the chloroplast-derived transcripts. Thus, we created an expression profile for 81 protein-coding genes and identified ndhC, psbJ, rps19, psaJ, and psbA as the most highly expressed chloroplast genes. Small RNA-seq analysis identified 27 small noncoding RNAs of chloroplast origin that were preferentially located near the 5'- or 3'-ends of genes. We also found >30 RNA-editing sites in the D. antarctica chloroplast genome, with a dominance of C-to-U conversions. We assembled and characterized the complete chloroplast genome sequence of D. antarctica and investigated the features of the plastid transcriptome. These data may contribute to a better understanding of the evolution of D. antarctica within the Poaceae family for use in molecular phylogenetic studies and may also help researchers understand the characteristics of the chloroplast transcriptome.
Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome

PubMed Central

Durham, Bryndan P.; Grote, Jana; Whittaker, Kerry A.; Bender, Sara J.; Luo, Haiwei; Grim, Sharon L.; Brown, Julia M.; Casey, John R.; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M.; Mine, Aric H.; Nigro, Olivia D.; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K.; Weber, Thomas S.; Wilson, Jesse M.; Church, Matthew J.; DeLong, Edward F.; Karl, David M.; Steward, Grieg F.; Eppley, John M.; Kyrpides, Nikos C.; Schuster, Stephan; Rappé, Michael S.

2014-01-01

Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters. PMID:25197450
Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome.

PubMed

Durham, Bryndan P; Grote, Jana; Whittaker, Kerry A; Bender, Sara J; Luo, Haiwei; Grim, Sharon L; Brown, Julia M; Casey, John R; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M; Mine, Aric H; Nigro, Olivia D; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K; Weber, Thomas S; Wilson, Jesse M; Church, Matthew J; DeLong, Edward F; Karl, David M; Steward, Grieg F; Eppley, John M; Kyrpides, Nikos C; Schuster, Stephan; Rappé, Michael S

2014-06-15

Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters.
Genome and transcriptome of the regeneration-competent flatworm, Macrostomum lignano.

PubMed

Wasik, Kaja; Gurtowski, James; Zhou, Xin; Ramos, Olivia Mendivil; Delás, M Joaquina; Battistoni, Giorgia; El Demerdash, Osama; Falciatori, Ilaria; Vizoso, Dita B; Smith, Andrew D; Ladurner, Peter; Schärer, Lukas; McCombie, W Richard; Hannon, Gregory J; Schatz, Michael

2015-10-06

The free-living flatworm, Macrostomum lignano has an impressive regenerative capacity. Following injury, it can regenerate almost an entirely new organism because of the presence of an abundant somatic stem cell population, the neoblasts. This set of unique properties makes many flatworms attractive organisms for studying the evolution of pathways involved in tissue self-renewal, cell-fate specification, and regeneration. The use of these organisms as models, however, is hampered by the lack of well-assembled and annotated genome sequences, fundamental to modern genetic and molecular studies. Here we report the genomic sequence of M. lignano and an accompanying characterization of its transcriptome. The genome structure of M. lignano is remarkably complex, with ∼75% of its sequence being comprised of simple repeats and transposon sequences. This has made high-quality assembly from Illumina reads alone impossible (N50=222 bp). We therefore generated 130× coverage by long sequencing reads from the Pacific Biosciences platform to create a substantially improved assembly with an N50 of 64 Kbp. We complemented the reference genome with an assembled and annotated transcriptome, and used both of these datasets in combination to probe gene-expression patterns during regeneration, examining pathways important to stem cell function.
A High-Coverage Yersinia pestis Genome from a Sixth-Century Justinianic Plague Victim

PubMed Central

Feldman, Michal; Harbeck, Michaela; Keller, Marcel; Spyrou, Maria A.; Rott, Andreas; Trautmann, Bernd; Scholz, Holger C.; Päffgen, Bernd; Peters, Joris; McCormick, Michael; Bos, Kirsten; Herbig, Alexander; Krause, Johannes

2016-01-01

The Justinianic Plague, which started in the sixth century and lasted to the mid eighth century, is thought to be the first of three historically documented plague pandemics causing massive casualties. Historical accounts and molecular data suggest the bacterium Yersinia pestis as its etiological agent. Here we present a new high-coverage (17.9-fold) Y. pestis genome obtained from a sixth-century skeleton recovered from a southern German burial site close to Munich. The reconstructed genome enabled the detection of 30 unique substitutions as well as structural differences that have not been previously described. We report indels affecting a lacI family transcription regulator gene as well as nonsynonymous substitutions in the nrdE, fadJ, and pcp genes, which have been suggested as plague virulence determinants or have been shown to be upregulated in different models of plague infection. In addition, we identify 19 false positive substitutions in a previously published lower-coverage Y. pestis genome from another archaeological site of the same time period and geographical region that is otherwise genetically identical to the high-coverage genome sequence reported here, suggesting low genetic diversity of the plague during the sixth century in rural southern Germany. PMID:27578768
Morphology and genome organization of the virus PSV of the hyperthermophilic archaeal genera Pyrobaculum and Thermoproteus: a novel virus family, the Globuloviridae.

PubMed

Häring, Monika; Peng, Xu; Brügger, Kim; Rachel, Reinhard; Stetter, Karl O; Garrett, Roger A; Prangishvili, David

2004-06-01

A novel virus, termed Pyrobaculum spherical virus (PSV), is described that infects anaerobic hyperthermophilic archaea of the genera Pyrobaculum and Thermoproteus. Spherical enveloped virions, about 100 nm in diameter, contain a major multimeric 33-kDa protein and host-derived lipids. A viral envelope encases a superhelical nucleoprotein core containing linear double-stranded DNA. The PSV infection cycle does not cause lysis of host cells. The viral genome was sequenced and contains 28337 bp. The genome is unique for known archaeal viruses in that none of the genes, including that encoding the major structural protein, show any significant sequence matches to genes in public sequence databases. Exceptionally for an archaeal double-stranded DNA virus, almost all the recognizable genes are located on one DNA strand. The ends of the genome consist of 190-bp inverted repeats that contain multiple copies of short direct repeats. The two DNA strands are probably covalently linked at their termini. On the basis of the unusual morphological and genomic properties of this DNA virus, we propose to assign PSV to a new viral family, the Globuloviridae.
Plastome Sequences of Lygodium japonicum and Marsilea crenata Reveal the Genome Organization Transformation from Basal Ferns to Core Leptosporangiates

PubMed Central

Gao, Lei; Wang, Bo; Wang, Zhi-Wei; Zhou, Yuan; Su, Ying-Juan; Wang, Ting

2013-01-01

Previous studies have shown that core leptosporangiates, the most species-rich group of extant ferns (monilophytes), have a distinct plastid genome (plastome) organization pattern from basal fern lineages. However, the details of genome structure transformation from ancestral ferns to core leptosporangiates remain unclear because of limited plastome data available. Here, we have determined the complete chloroplast genome sequences of Lygodium japonicum (Lygodiaceae), a member of schizaeoid ferns (Schizaeales), and Marsilea crenata (Marsileaceae), a representative of heterosporous ferns (Salviniales). The two species represent the sister and the basal lineages of core leptosporangiates, respectively, for which the plastome sequences are currently unavailable. Comparative genomic analysis of all sequenced fern plastomes reveals that the gene order of L. japonicum plastome occupies an intermediate position between that of basal ferns and core leptosporangiates. The two exons of the fern ndhB gene have a unique pattern of intragenic copy number variances. Specifically, the substitution rate heterogeneity between the two exons is congruent with their copy number changes, confirming the constraint role that inverted repeats may play on the substitution rate of chloroplast gene sequences. PMID:23821521
Global Implementation of Genomic Medicine: We Are Not Alone

PubMed Central

Manolio, Teri A.; Abramowicz, Marc; Al-Mulla, Fahd; Anderson, Warwick; Balling, Rudi; Berger, Adam C.; Bleyl, Steven; Chakravarti, Aravinda; Chantratita, Wasun; Chisholm, Rex L.; Dissanayake, Vajira H. W.; Dunn, Michael; Dzau, Victor J.; Han, Bok-Ghee; Hubbard, Tim; Kolbe, Anne; Korf, Bruce; Kubo, Michiaki; Lasko, Paul; Leego, Erkki; Mahasirimongkol, Surakameth; Majumdar, Partha P.; Matthijs, Gert; McLeod, Howard L.; Metspalu, Andres; Meulien, Pierre; Miyano, Satoru; Naparstek, Yaakov; O’Rourke, P. Pearl; Patrinos, George P.; Rehm, Heidi L.; Relling, Mary V.; Rennert, Gad; Rodriguez, Laura Lyman; Roden, Dan M.; Shuldiner, Alan R.; Sinha, Sukdev; Tan, Patrick; Ulfendahl, Mats; Ward, Robyn; Williams, Marc S.; Wong, John E.L.; Green, Eric D.; Ginsburg, Geoffrey S.

2016-01-01

Advances in high-throughput genomic technologies coupled with a growing number of genomic results potentially useful in clinical care have led to ground-breaking genomic medicine implementation programs in various nations. Many of these innovative programs capitalize on unique local capabilities arising from the structure of their health care systems or their cultural or political milieu, as well as from unusual burdens of disease or risk alleles. Many such programs are being conducted in relative isolation and might benefit from sharing of approaches and lessons learned in other nations. The National Human Genome Research Institute recently brought together 25 of these groups from around the world to describe and compare projects, examine the current state of implementation and desired near-term capabilities, and identify opportunities for collaboration to promote the responsible implementation of genomic medicine. The wide variety of nascent programs in diverse settings demonstrates that implementation of genomic medicine is expanding globally in varied and highly innovative ways. Opportunities for collaboration abound in the areas of evidence generation, health information technology, education, workforce development, pharmacogenomics, and policy and regulatory issues. Several international organizations that are already facilitating effective research collaborations should engage to ensure implementation proceeds collaboratively without potentially wasteful duplication. Efforts to coalesce these groups around concrete but compelling signature projects, such as global eradication of genetically-mediated drug reactions or developing a truly global genomic variant data resource across a wide number of ethnicities, would accelerate appropriate implementation of genomics to improve clinical care world-wide. PMID:26041702
Comparative Genomic Analysis of Phylogenetically Closely Related Hydrogenobaculum sp. Isolates from Yellowstone National Park

PubMed Central

Romano, Christine; D'Imperio, Seth; Woyke, Tanja; Mavromatis, Konstantinos; Lasken, Roger; Shock, Everett L.

2013-01-01

We describe the complete genome sequences of four closely related Hydrogenobaculum sp. isolates (≥99.7% 16S rRNA gene identity) that were isolated from the outflow channel of Dragon Spring (DS), Norris Geyser Basin, in Yellowstone National Park (YNP), WY. The genomes range in size from 1,552,607 to 1,552,931 bp, contain 1,667 to 1,676 predicted genes, and are highly syntenic. There are subtle differences among the DS isolates, which as a group are different from Hydrogenobaculum sp. strain Y04AAS1 that was previously isolated from a geographically distinct YNP geothermal feature. Genes unique to the DS genomes encode arsenite [As(III)] oxidation, NADH-ubiquinone-plastoquinone (complex I), NADH-ubiquinone oxidoreductase chain, a DNA photolyase, and elements of a type II secretion system. Functions unique to strain Y04AAS1 include thiosulfate metabolism, nitrate respiration, and mercury resistance determinants. DS genomes contain seven CRISPR loci that are almost identical but are different from the single CRISPR locus in strain Y04AAS1. Other differences between the DS and Y04AAS1 genomes include average nucleotide identity (94.764%) and percentage conserved DNA (80.552%). Approximately half of the genes unique to Y04AAS1 are predicted to have been acquired via horizontal gene transfer. Fragment recruitment analysis and marker gene searches demonstrated that the DS metagenome was more similar to the DS genomes than to the Y04AAS1 genome, but that the DS community is likely composed of a continuum of Hydrogenobaculum genotypes that span from the DS genomes described here to a Y04AAS1-like organism, which appears to represent a distinct ecotype relative to the DS genomes characterized. PMID:23435891
Whole-Genome Sequencing for Detecting Antimicrobial Resistance in Nontyphoidal Salmonella

PubMed Central

Tyson, Gregory H.; Kabera, Claudine; Chen, Yuansha; Li, Cong; Folster, Jason P.; Ayers, Sherry L.; Lam, Claudia; Tate, Heather P.; Zhao, Shaohua

2016-01-01

Laboratory-based in vitro antimicrobial susceptibility testing is the foundation for guiding anti-infective therapy and monitoring antimicrobial resistance trends. We used whole-genome sequencing (WGS) technology to identify known antimicrobial resistance determinants among strains of nontyphoidal Salmonella and correlated these with susceptibility phenotypes to evaluate the utility of WGS for antimicrobial resistance surveillance. Six hundred forty Salmonella of 43 different serotypes were selected from among retail meat and human clinical isolates that were tested for susceptibility to 14 antimicrobials using broth microdilution. The MIC for each drug was used to categorize isolates as susceptible or resistant based on Clinical and Laboratory Standards Institute clinical breakpoints or National Antimicrobial Resistance Monitoring System (NARMS) consensus interpretive criteria. Each isolate was subjected to whole-genome shotgun sequencing, and resistance genes were identified from assembled sequences. A total of 65 unique resistance genes, plus mutations in two structural resistance loci, were identified. There were more unique resistance genes (n = 59) in the 104 human isolates than in the 536 retail meat isolates (n = 36). Overall, resistance genotypes and phenotypes correlated in 99.0% of cases. Correlations approached 100% for most classes of antibiotics but were lower for aminoglycosides and beta-lactams. We report the first finding of extended-spectrum β-lactamases (ESBLs) (blaCTX-M1 and blaSHV2a) in retail meat isolates of Salmonella in the United States. Whole-genome sequencing is an effective tool for predicting antibiotic resistance in nontyphoidal Salmonella, although the use of more appropriate surveillance breakpoints and increased knowledge of new resistance alleles will further improve correlations. PMID:27381390
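The 99.0% genotype-phenotype correlation reported above can be read as a concordance rate: the fraction of isolate-drug comparisons in which the presence or absence of a known resistance determinant agrees with the resistant or susceptible call from susceptibility testing. The Python sketch below illustrates that arithmetic under this assumption; the function name and the toy records are hypothetical, and the study's actual procedure may differ.

```python
def concordance(calls):
    """Fraction of comparisons where genotype and phenotype agree.

    `calls` is an iterable of (genotype_resistant, phenotype_resistant)
    boolean pairs, one pair per isolate-drug comparison.
    """
    calls = list(calls)
    if not calls:
        raise ValueError("no calls to compare")
    agreeing = sum(1 for genotype, phenotype in calls if genotype == phenotype)
    return agreeing / len(calls)


# Hypothetical isolate-drug comparisons (illustration only).
toy_calls = [
    (True, True),    # resistance gene found, isolate resistant
    (False, False),  # no gene found, isolate susceptible
    (True, False),   # gene found, isolate susceptible (discordant)
    (False, False),
]
print(f"{concordance(toy_calls):.1%}")
```

Computing the rate separately per antibiotic class would expose the kind of lower agreement the abstract notes for aminoglycosides and beta-lactams.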
The Genome of the Netherlands: design, and project goals.

PubMed

Boomsma, Dorret I; Wijmenga, Cisca; Slagboom, Eline P; Swertz, Morris A; Karssen, Lennart C; Abdellaoui, Abdel; Ye, Kai; Guryev, Victor; Vermaat, Martijn; van Dijk, Freerk; Francioli, Laurent C; Hottenga, Jouke Jan; Laros, Jeroen F J; Li, Qibin; Li, Yingrui; Cao, Hongzhi; Chen, Ruoyan; Du, Yuanping; Li, Ning; Cao, Sujie; van Setten, Jessica; Menelaou, Androniki; Pulit, Sara L; Hehir-Kwa, Jayne Y; Beekman, Marian; Elbers, Clara C; Byelas, Heorhiy; de Craen, Anton J M; Deelen, Patrick; Dijkstra, Martijn; den Dunnen, Johan T; de Knijff, Peter; Houwing-Duistermaat, Jeanine; Koval, Vyacheslav; Estrada, Karol; Hofman, Albert; Kanterakis, Alexandros; Enckevort, David van; Mai, Hailiang; Kattenberg, Mathijs; van Leeuwen, Elisabeth M; Neerincx, Pieter B T; Oostra, Ben; Rivadeneira, Fernando; Suchiman, Eka H D; Uitterlinden, Andre G; Willemsen, Gonneke; Wolffenbuttel, Bruce H; Wang, Jun; de Bakker, Paul I W; van Ommen, Gert-Jan; van Duijn, Cornelia M

2014-02-01

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
The zebrafish reference genome sequence and its relationship to the human genome.

PubMed

Howe, Kerstin; Clark, Matthew D; Torroja, Carlos F; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T; Guerra-Assunção, José A; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F; Laird, Gavin K; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Elliot, David; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Begum, Sharmin; Mortimore, Beverley; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Lloyd, Christine; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; 
Woodmansey, Rebecca; Clark, Graham; Cooper, James D; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Lanz, Christa; Raddatz, Günter; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Schuster, Stephan C; Carter, Nigel P; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M J; Enright, Anton; Geisler, Robert; Plasterk, Ronald H A; Lee, Charles; Westerfield, Monte; de Jong, Pieter J; Zon, Leonard I; Postlethwait, John H; Nüsslein-Volhard, Christiane; Hubbard, Tim J P; Roest Crollius, Hugues; Rogers, Jane; Stemple, Derek L

2013-04-25

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
The zebrafish reference genome sequence and its relationship to the human genome

PubMed Central

Howe, Kerstin; Clark, Matthew D.; Torroja, Carlos F.; Torrance, James; Berthelot, Camille; Muffato, Matthieu; Collins, John E.; Humphray, Sean; McLaren, Karen; Matthews, Lucy; McLaren, Stuart; Sealy, Ian; Caccamo, Mario; Churcher, Carol; Scott, Carol; Barrett, Jeffrey C.; Koch, Romke; Rauch, Gerd-Jörg; White, Simon; Chow, William; Kilian, Britt; Quintais, Leonor T.; Guerra-Assunção, José A.; Zhou, Yi; Gu, Yong; Yen, Jennifer; Vogel, Jan-Hinnerk; Eyre, Tina; Redmond, Seth; Banerjee, Ruby; Chi, Jianxiang; Fu, Beiyuan; Langley, Elizabeth; Maguire, Sean F.; Laird, Gavin K.; Lloyd, David; Kenyon, Emma; Donaldson, Sarah; Sehra, Harminder; Almeida-King, Jeff; Loveland, Jane; Trevanion, Stephen; Jones, Matt; Quail, Mike; Willey, Dave; Hunt, Adrienne; Burton, John; Sims, Sarah; McLay, Kirsten; Plumb, Bob; Davis, Joy; Clee, Chris; Oliver, Karen; Clark, Richard; Riddle, Clare; Eliott, David; Threadgold, Glen; Harden, Glenn; Ware, Darren; Mortimer, Beverly; Kerry, Giselle; Heath, Paul; Phillimore, Benjamin; Tracey, Alan; Corby, Nicole; Dunn, Matthew; Johnson, Christopher; Wood, Jonathan; Clark, Susan; Pelan, Sarah; Griffiths, Guy; Smith, Michelle; Glithero, Rebecca; Howden, Philip; Barker, Nicholas; Stevens, Christopher; Harley, Joanna; Holt, Karen; Panagiotidis, Georgios; Lovell, Jamieson; Beasley, Helen; Henderson, Carl; Gordon, Daria; Auger, Katherine; Wright, Deborah; Collins, Joanna; Raisen, Claire; Dyer, Lauren; Leung, Kenric; Robertson, Lauren; Ambridge, Kirsty; Leongamornlert, Daniel; McGuire, Sarah; Gilderthorp, Ruth; Griffiths, Coline; Manthravadi, Deepa; Nichol, Sarah; Barker, Gary; Whitehead, Siobhan; Kay, Michael; Brown, Jacqueline; Murnane, Clare; Gray, Emma; Humphries, Matthew; Sycamore, Neil; Barker, Darren; Saunders, David; Wallis, Justene; Babbage, Anne; Hammond, Sian; Mashreghi-Mohammadi, Maryam; Barr, Lucy; Martin, Sancha; Wray, Paul; Ellington, Andrew; Matthews, Nicholas; Ellwood, Matthew; Woodmansey, Rebecca; Clark, Graham; Cooper, James; Tromans, Anthony; Grafham, Darren; Skuce, Carl; Pandian, Richard; Andrews, Robert; Harrison, Elliot; Kimberley, Andrew; Garnett, Jane; Fosker, Nigel; Hall, Rebekah; Garner, Patrick; Kelly, Daniel; Bird, Christine; Palmer, Sophie; Gehring, Ines; Berger, Andrea; Dooley, Christopher M.; Ersan-Ürün, Zübeyde; Eser, Cigdem; Geiger, Horst; Geisler, Maria; Karotki, Lena; Kirn, Anette; Konantz, Judith; Konantz, Martina; Oberländer, Martina; Rudolph-Geiger, Silke; Teucke, Mathias; Osoegawa, Kazutoyo; Zhu, Baoli; Rapp, Amanda; Widaa, Sara; Langford, Cordelia; Yang, Fengtang; Carter, Nigel P.; Harrow, Jennifer; Ning, Zemin; Herrero, Javier; Searle, Steve M. J.; Enright, Anton; Geisler, Robert; Plasterk, Ronald H. A.; Lee, Charles; Westerfield, Monte; de Jong, Pieter J.; Zon, Leonard I.; Postlethwait, John H.; Nüsslein-Volhard, Christiane; Hubbard, Tim J. P.; Crollius, Hugues Roest; Rogers, Jane; Stemple, Derek L.

2013-01-01

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination. PMID:23594743
Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

PubMed

Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

2016-12-01

Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.
Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

PubMed Central

Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

2016-01-01

Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408
Molecular structures of centromeric heterochromatin and karyotypic evolution in the Siamese crocodile (Crocodylus siamensis) (Crocodylidae, Crocodylia).

PubMed

Kawagoshi, Taiki; Nishida, Chizuko; Ota, Hidetoshi; Kumazawa, Yoshinori; Endo, Hideki; Matsuda, Yoichi

2008-01-01

Crocodilians have several unique karyotypic features, such as small diploid chromosome numbers (30-42) and the absence of dot-shaped microchromosomes. Of the extant crocodilian species, the Siamese crocodile (Crocodylus siamensis) has no more than 2n = 30, comprising mostly bi-armed chromosomes with large centromeric heterochromatin blocks. To investigate the molecular structures of C-heterochromatin and genomic compartmentalization in the karyotype, characterized by the disappearance of tiny microchromosomes and reduced chromosome number, we performed molecular cloning of centromeric repetitive sequences and chromosome mapping of the 18S-28S rDNA and telomeric (TTAGGG)n sequences. The centromeric heterochromatin was composed mainly of two repetitive sequence families whose characteristics were quite different. Two types of GC-rich CSI-HindIII family sequences, the 305 bp CSI-HindIII-S (G+C content, 61.3%) and 424 bp CSI-HindIII-M (63.1%), were localized to the intensely PI-stained centric regions of all chromosomes, except for chromosome 2 with PI-negative heterochromatin. The 94 bp CSI-DraI (G+C content, 48.9%) was tandem-arrayed satellite DNA and localized to chromosome 2 and four pairs of small-sized chromosomes. The chromosomal size-dependent genomic compartmentalization that is supposedly unique to the Archosauromorpha was probably lost in the crocodilian lineage with the disappearance of microchromosomes followed by the homogenization of centromeric repetitive sequences between chromosomes, except for chromosome 2.
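The G+C content figures quoted for the repeat families above are the percentage of bases in a sequence that are guanine or cytosine. The Python sketch below shows that calculation; the function name and the example sequence are hypothetical and are not the cloned crocodile repeats.

```python
def gc_content(sequence):
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so gaps and
    ambiguity codes such as N do not affect the result.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return 100.0 * gc_count / len(bases)


# Hypothetical sequence (illustration only).
print(round(gc_content("GGCCATGC"), 1))
```

Ignoring ambiguity codes is a design choice of this sketch; other tools may count them in the denominator and report slightly different values.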

Transport genes and chemotaxis in Laribacter hongkongensis: a genome-wide analysis

PubMed Central

2011-01-01

Background Laribacter hongkongensis is a Gram-negative, sea gull-shaped rod associated with community-acquired gastroenteritis. The bacterium has been found in diverse freshwater environments including fish, frogs and drinking water reservoirs. Using the complete genome sequence data of L. hongkongensis, we performed a comprehensive analysis of putative transport-related genes and genes related to chemotaxis, motility and quorum sensing, which may help the bacterium adapt to changing environments and combat harmful substances. Results A genome-wide analysis using the Transporter Classification Database (TCDB), similarity and keyword searches revealed the presence of a large diversity of transporters (n = 457) and genes related to chemotaxis (n = 52) and flagellar biosynthesis (n = 40) in the L. hongkongensis genome. The transporters included those from all seven major transporter categories, which may allow the uptake of essential nutrients or ions, and extrusion of metabolic end products and hazardous substances. L. hongkongensis is unique among closely related members of the Neisseriaceae family in possessing a higher number of proteins related to transport of ammonium, urea and dicarboxylate, which may reflect the importance of nitrogen and dicarboxylate metabolism in this asaccharolytic bacterium. Structural modeling of two C4-dicarboxylate transporters showed that they possessed similar structures to the determined structures of other DctP-TRAP transporters, with one having an unusual disulfide bond. Diverse mechanisms for iron transport, including hemin transporters for iron acquisition from host proteins, were also identified. In addition to the chemotaxis and flagella-related genes, the L. hongkongensis genome also contained two copies of qseB/qseC homologues of the AI-3 quorum sensing system.
Conclusions The large number of diverse transporters and genes involved in chemotaxis, motility and quorum sensing suggested that the bacterium may utilize a complex system to adapt to different environments. Structural modeling will provide useful insights on the transporters in L. hongkongensis. PMID:21849034
The Genome of the Obligately Intracellular Bacterium Ehrlichia canis Reveals Themes of Complex Membrane Structure and Immune Evasion Strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mavromatis, K; Doyle, C Kuyler; Lykidis, A

2006-01-01

Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
Digital Family Histories for Data Mining

PubMed Central

Hoyt, Robert; Linnville, Steven; Chung, Hui-Min; Hutfless, Brent; Rice, Courtney

2013-01-01

As we move closer to ubiquitous electronic health records (EHRs), genetic, familial, and clinical information will need to be incorporated into EHRs as structured data that can be used for data mining and clinical decision support. While the Human Genome Project has produced new and exciting genomic data, the cost to sequence the human personal genome is high, and significant controversies regarding how to interpret genomic data exist. Many experts feel that the family history is a surrogate marker for genetic information and should be part of any paper-based or electronic health record. A digital family history is now part of the Meaningful Use Stage 2 menu objectives for EHR reimbursement, projected for 2014. In this study, a secure online family history questionnaire was designed to collect data on a unique cohort of Vietnam-era repatriated male veterans and a comparison group in order to compare participant and family disease rates on common medical disorders with a genetic component. This article describes our approach to create the digital questionnaire and the results of analyzing family history data on 319 male participants. PMID:24159269
Differential nuclear scaffold/matrix attachment marks expressed genes.

PubMed

Linnemann, Amelia K; Platts, Adrian E; Krawetz, Stephen A

2009-02-15

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14-18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5' of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution.
Differential nuclear scaffold/matrix attachment marks expressed genes

PubMed Central

Linnemann, Amelia K.; Platts, Adrian E.; Krawetz, Stephen A.

2009-01-01

It is well established that nuclear architecture plays a key role in poising regions of the genome for transcription. This may be achieved using scaffold/matrix attachment regions (S/MARs) that establish loop domains. However, the relationship between changes in the physical structure of the genome as mediated by attachment to the nuclear scaffold/matrix and gene expression is not clearly understood. To define the role of S/MARs in organizing our genome and to resolve the often contradictory loci-specific studies, we have surveyed the S/MARs in HeLa S3 cells on human chromosomes 14–18 by array comparative genomic hybridization. Comparison of LIS (lithium 3,5-diiodosalicylate) extraction to identify SARs and 2 M NaCl extraction to identify MARs revealed that approximately one-half of the sites were in common. The results presented in this study suggest that SARs 5′ of a gene are associated with transcript presence whereas MARs contained within a gene are associated with silenced genes. The varied functions of the S/MARs as revealed by the different extraction methods highlight their unique functional contribution. PMID:19017725
Digital family histories for data mining.

PubMed

Hoyt, Robert; Linnville, Steven; Chung, Hui-Min; Hutfless, Brent; Rice, Courtney

2013-01-01

As we move closer to ubiquitous electronic health records (EHRs), genetic, familial, and clinical information will need to be incorporated into EHRs as structured data that can be used for data mining and clinical decision support. While the Human Genome Project has produced new and exciting genomic data, the cost to sequence the human personal genome is high, and significant controversies regarding how to interpret genomic data exist. Many experts feel that the family history is a surrogate marker for genetic information and should be part of any paper-based or electronic health record. A digital family history is now part of the Meaningful Use Stage 2 menu objectives for EHR reimbursement, projected for 2014. In this study, a secure online family history questionnaire was designed to collect data on a unique cohort of Vietnam-era repatriated male veterans and a comparison group in order to compare participant and family disease rates on common medical disorders with a genetic component. This article describes our approach to create the digital questionnaire and the results of analyzing family history data on 319 male participants.
The genome of obligately intracellular Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.

2005-09-01

Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified including a small group of proteins (12) with tandem repeats and another with eukaryotic-like ankyrin domains (7).
Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM)

PubMed Central

Skinnider, Michael A.; Dejong, Chris A.; Rees, Philip N.; Johnston, Chad W.; Li, Haoxin; Webster, Andrew L. H.; Wyatt, Morgan A.; Magarvey, Nathan A.

2015-01-01

Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. PMID:26442528
Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals

PubMed Central

KANEKO-ISHINO, Tomoko; ISHINO, Fumitoshi

2015-01-01

Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes occurred in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is “mammalian-specific genomic functions”, a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also for mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of “mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons”, based on the finding that LTR retrotransposons served as a critical driving force in mammalian evolution by generating mammalian-specific genes. PMID:26666304
Mammalian-specific genomic functions: Newly acquired traits generated by genomic imprinting and LTR retrotransposon-derived genes in mammals.

PubMed

Kaneko-Ishino, Tomoko; Ishino, Fumitoshi

2015-01-01

Mammals, including human beings, have evolved a unique viviparous reproductive system and a highly developed central nervous system. How did these unique characteristics emerge in mammalian evolution, and what kinds of changes occurred in the mammalian genomes as evolution proceeded? A key conceptual term in approaching these issues is "mammalian-specific genomic functions", a concept covering both mammalian-specific epigenetics and genetics. Genomic imprinting and LTR retrotransposon-derived genes are reviewed as the representative mammalian-specific genomic functions that are essential not only for the current mammalian developmental system, but also for mammalian evolution itself. First, the essential roles of genomic imprinting in mammalian development, especially related to viviparous reproduction via placental function, as well as the emergence of genomic imprinting in mammalian evolution, are discussed. Second, we introduce the novel concept of "mammalian-specific traits generated by mammalian-specific genes from LTR retrotransposons", based on the finding that LTR retrotransposons served as a critical driving force in mammalian evolution by generating mammalian-specific genes.
Radiation-induced gene expression in the nematode Caenorhabditis elegans

NASA Technical Reports Server (NTRS)

Nelson, Gregory A.; Jones, Tamako A.; Chesnut, Aaron; Smith, Anna L.

2002-01-01

We used the nematode C. elegans to characterize the genotoxic and cytotoxic effects of ionizing radiation in a simple animal model emphasizing the unique effects of charged particle radiation. Here we demonstrate by RT-PCR differential display and whole genome microarray hybridization experiments that gamma rays, accelerated protons and iron ions at the same physical dose lead to unique transcription profiles. 599 of 17871 genes analyzed (3.4%) showed differential expression 3 hrs after exposure to 3 Gy of radiation. 193 were up-regulated, 406 were down-regulated and 90% were affected only by a single species of radiation. A novel statistical clustering technique identified the regulatory relationships between the radiation-modulated genes and showed that genes affected by each radiation species were associated with unique regulatory clusters. This suggests that independent homeostatic mechanisms are activated in response to radiation exposure as a function of track structure or ionization density.
Genome-wide SNP discovery and population structure analysis in pepper (Capsicum annuum) using genotyping by sequencing.

PubMed

Taranto, F; D'Agostino, N; Greco, B; Cardi, T; Tripodi, P

2016-11-21

Knowledge of population structure and genetic diversity in vegetable crops is essential for association mapping studies and genomic selection. Genotyping by sequencing (GBS) represents an innovative method for large scale SNP detection and genotyping of genetic resources. Herein we used the GBS approach for the genome-wide identification of SNPs in a collection of Capsicum spp. accessions and for the assessment of the level of genetic diversity in a subset of 222 cultivated pepper (Capsicum annuum) genotypes. GBS analysis generated a total of 7,568,894 master tags, of which 43.4% uniquely aligned to the reference genome CM334. A total of 108,591 SNP markers were identified, of which 105,184 were in C. annuum accessions. In order to explore the genetic diversity of C. annuum and to select a minimal core set representing most of the total genetic variation with minimum redundancy, a subset of 222 C. annuum accessions was analysed using 32,950 high quality SNPs. Based on Bayesian and hierarchical clustering it was possible to divide the collection into three clusters. Cluster I had the majority of varieties and landraces mainly from Southern and Northern Italy, and from Eastern Europe, whereas clusters II and III comprised accessions of different geographical origins. Considering the genome-wide genetic variation among the accessions included in cluster I, a second round of Bayesian (K = 3) and hierarchical (K = 2) clustering was performed. These analyses showed that genotypes were grouped not only based on geographical origin, but also on fruit-related features. GBS data has proven useful to assess the genetic diversity in a collection of C. annuum accessions. The high number of SNP markers, uniformly distributed on the 12 chromosomes, allowed the accessions to be distinguished according to geographical origin and fruit-related features.
SNP markers and information on population structure developed in this study will undoubtedly support genome-wide association mapping studies and marker-assisted selection programs.
T4-Like Genome Organization of the Escherichia coli O157:H7 Lytic Phage AR1

PubMed Central

Liao, Wei-Chao; Ng, Wailap Victor; Lin, I-Hsuan; Syu, Wan-Jr; Liu, Tze-Tze; Chang, Chuan-Hsiung

2011-01-01

We report the genome organization and analysis of the first completely sequenced T4-like phage, AR1, of Escherichia coli O157:H7. Unlike most of the other sequenced phages of O157:H7, which belong to the temperate Podoviridae and Siphoviridae families, AR1 is a T4-like phage known to efficiently infect this pathogenic bacterial strain. The 167,435-bp AR1 genome is currently the largest among all the sequenced E. coli O157:H7 phages. It carries a total of 281 potential open reading frames (ORFs) and 10 putative tRNA genes. Of these, 126 predicted proteins could be classified into six viral orthologous group categories, with at least 18 proteins of the structural protein category having been detected by tandem mass spectrometry. Comparative genomic analysis of AR1 and four other completely sequenced T4-like genomes (RB32, RB69, T4, and JS98) indicated that they share a well-organized and highly conserved core genome, particularly in the regions encoding DNA replication and virion structural proteins. The major diverse features between these phages include the modules of distal tail fibers and the types and numbers of internal proteins, tRNA genes, and mobile elements. Codon usage analysis suggested that the presence of AR1-encoded tRNAs may be relevant to the codon usage of structural proteins. Furthermore, protein sequence analysis of AR1 gp37, a potential receptor binding protein, indicated that eight residues in the C terminus are unique to O157:H7 T4-like phages AR1 and PP01. These residues are known to be located in the T4 receptor recognition domain, and they may contribute to specificity for adsorption to the O157:H7 strain. PMID:21507986
Fine Mapping of Bone Structure and Strength QTLs in Heterogeneous Stock Rat

PubMed Central

Alam, Imranul; Koller, Daniel L.; Cañete, Toni; Blázquez, Gloria; Mont-Cardona, Carme; López-Aumatell, Regina; Martínez-Membrives, Esther; Díaz-Morán, Sira; Tobeña, Adolf; Fernández-Teruel, Alberto; Stridh, Pernilla; Diez, Margarita; Olsson, Tomas; Johannesson, Martina; Baud, Amelie; Econs, Michael J.; Foroud, Tatiana

2015-01-01

We previously demonstrated that skeletal structure and strength phenotypes vary considerably in heterogeneous stock (HS) rats. These phenotypes were found to be strongly heritable, suggesting that the HS rat model represents a unique genetic resource for dissecting the complex genetic etiology underlying bone fragility. The purpose of this study was to identify and localize genes associated with bone structure and strength phenotypes using 1524 adult male and female HS rats between 17 and 20 weeks of age. Structure measures included femur length, neck width, head width; femur and lumbar spine (L3-5) areas obtained by DXA; and cross-sectional areas (CSA) at the midshaft, distal femur and femoral neck, and the 5th lumbar vertebra measured by CT. In addition, measures of strength of the whole femur and femoral neck were obtained. Approximately 70,000 polymorphic SNPs distributed throughout the rat genome were selected for genotyping, with a mean linkage disequilibrium coefficient between neighboring SNPs of 0.95. Haplotypes were estimated across the entire genome for each rat using a multipoint haplotype reconstruction method, which calculates the probability of descent at each locus from each of the 8 HS founder strains. The haplotypes were then tested for association with each structure and strength phenotype via a mixed model with covariate adjustment. We identified quantitative trait loci (QTLs) for structure phenotypes on chromosomes 3, 8, 10, 12, 17 and 20, and QTLs for strength phenotypes on chromosomes 5, 10 and 11 that met a conservative genome-wide empiric significance threshold (FDR = 5%; P < 3 × 10⁻⁶). Importantly, most QTLs were localized to very narrow genomic regions (as small as 0.3 Mb and up to 3 Mb), each harboring a small set of candidate genes, both novel and previously shown to have roles in skeletal development and homeostasis. PMID:26297441
Molecular complexity of successive bacterial epidemics deconvoluted by comparative pathogenomics.

PubMed

Beres, Stephen B; Carroll, Ronan K; Shea, Patrick R; Sitkiewicz, Izabela; Martinez-Gutierrez, Juan Carlos; Low, Donald E; McGeer, Allison; Willey, Barbara M; Green, Karen; Tyrrell, Gregory J; Goldman, Thomas D; Feldgarden, Michael; Birren, Bruce W; Fofanov, Yuriy; Boos, John; Wheaton, William D; Honisch, Christiane; Musser, James M

2010-03-02

Understanding the fine-structure molecular architecture of bacterial epidemics has been a long-sought goal of infectious disease research. We used short-read-length DNA sequencing coupled with mass spectroscopy analysis of SNPs to study the molecular pathogenomics of three successive epidemics of invasive infections involving 344 serotype M3 group A Streptococcus in Ontario, Canada. Sequencing the genome of 95 strains from the three epidemics, coupled with analysis of 280 biallelic SNPs in all 344 strains, revealed an unexpectedly complex population structure composed of a dynamic mixture of distinct clonally related complexes. We discovered that each epidemic is dominated by micro- and macrobursts of multiple emergent clones, some with distinct strain genotype-patient phenotype relationships. On average, strains were differentiated from one another by only 49 SNPs and 11 insertion-deletion events (indels) in the core genome. Ten percent of SNPs are strain specific; that is, each strain has a unique genome sequence. We identified nonrandom temporal-spatial patterns of strain distribution within and between the epidemic peaks. The extensive full-genome data permitted us to identify genes with significantly increased rates of nonsynonymous (amino acid-altering) nucleotide polymorphisms, thereby providing clues about selective forces operative in the host. Comparative expression microarray analysis revealed that closely related strains differentiated by seemingly modest genetic changes can have significantly divergent transcriptomes. We conclude that enhanced understanding of bacterial epidemics requires a deep-sequencing, geographically centric, comparative pathogenomics strategy.
Structural Characterization and Evolutionary Relationship of High-Molecular-Weight Glutenin Subunit Genes in Roegneria nakaii and Roegneria alashanica.

PubMed

Zhang, Lujun; Li, Zhixin; Fan, Renchun; Wei, Bo; Zhang, Xiangqi

2016-07-19

The Roegneria of Triticeae is a large genus including about 130 allopolyploid species. Little is known about its high-molecular-weight glutenin subunits (HMW-GSs). Here, we reported six novel HMW-GS genes from R. nakaii and R. alashanica. Sequencing indicated that Rny1, Rny3, and Ray1 possessed intact open reading frames (ORFs), whereas Rny2, Rny4, and Ray2 harbored in-frame stop codons. All of the six genes possessed a similar primary structure to known HMW-GS, while showing some unique characteristics. Their coding regions were significantly shorter than Glu-1 genes in wheat. The amino acid sequences revealed that all of the six genes were intermediate towards the y-type. The phylogenetic analysis showed that the HMW-GSs from species with St, StY, or StH genome(s) clustered in an independent clade, varying from the typical x- and y-type clusters. Thus, the Glu-1 locus in R. nakaii and R. alashanica is a very primitive glutenin locus across evolution. The six genes were phylogenetically split into two groups that clustered in different clades, and each of the two clades included the HMW-GSs from species with St (diploid and tetraploid species), StY, and StH genomes. Hence, it is concluded that the six Roegneria HMW-GS genes are from two St genomes undergoing slight differentiation.
Adaptive genomic divergence under high gene flow between freshwater and brackish-water ecotypes of prickly sculpin (Cottus asper) revealed by Pool-Seq.

PubMed

Dennenmoser, Stefan; Vamosi, Steven M; Nolte, Arne W; Rogers, Sean M

2017-01-01

Understanding the genomic basis of adaptive divergence in the presence of gene flow remains a major challenge in evolutionary biology. In prickly sculpin (Cottus asper), an abundant euryhaline fish in northwestern North America, high genetic connectivity among brackish-water (estuarine) and freshwater (tributary) habitats of coastal rivers does not preclude the build-up of neutral genetic differentiation and emergence of different life history strategies. Because these two habitats present different osmotic niches, we predicted high genetic differentiation at known teleost candidate genes underlying salinity tolerance and osmoregulation. We applied whole-genome sequencing of pooled DNA samples (Pool-Seq) to explore adaptive divergence between two estuarine and two tributary habitats. Paired-end sequence reads were mapped against genomic contigs of European Cottus, and the gene content of candidate regions was explored based on comparisons with the threespine stickleback genome. Genes showing signals of repeated differentiation among brackish-water and freshwater habitats included functions such as ion transport and structural permeability in freshwater gills, which suggests that local adaptation to different osmotic niches might contribute to genomic divergence among habitats. Overall, the presence of both repeated and unique signatures of differentiation across many loci scattered throughout the genome is consistent with polygenic adaptation from standing genetic variation and locally variable selection pressures in the early stages of life history divergence. © 2016 John Wiley & Sons Ltd.
The Medicago Genome Provides Insight into the Evolution of Rhizobial Symbioses

PubMed Central

Young, Nevin D.; Debellé, Frédéric; Oldroyd, Giles E. D.; Geurts, Rene; Cannon, Steven B.; Udvardi, Michael K.; Benedito, Vagner A.; Mayer, Klaus F. X.; Gouzy, Jérôme; Schoof, Heiko; Van de Peer, Yves; Proost, Sebastian; Cook, Douglas R.; Meyers, Blake C.; Spannagl, Manuel; Cheung, Foo; De Mita, Stéphane; Krishnakumar, Vivek; Gundlach, Heidrun; Zhou, Shiguo; Mudge, Joann; Bharti, Arvind K.; Murray, Jeremy D.; Naoumkina, Marina A.; Rosen, Benjamin; Silverstein, Kevin A. T.; Tang, Haibao; Rombauts, Stephane; Zhao, Patrick X.; Zhou, Peng; Barbe, Valérie; Bardou, Philippe; Bechner, Michael; Bellec, Arnaud; Berger, Anne; Bergès, Hélène; Bidwell, Shelby; Bisseling, Ton; Choisne, Nathalie; Couloux, Arnaud; Denny, Roxanne; Deshpande, Shweta; Dai, Xinbin; Doyle, Jeff; Dudez, Anne-Marie; Farmer, Andrew D.; Fouteau, Stéphanie; Franken, Carolien; Gibelin, Chrystel; Gish, John; Goldstein, Steven; González, Alvaro J.; Green, Pamela J.; Hallab, Asis; Hartog, Marijke; Hua, Axin; Humphray, Sean; Jeong, Dong-Hoon; Jing, Yi; Jöcker, Anika; Kenton, Steve M.; Kim, Dong-Jin; Klee, Kathrin; Lai, Hongshing; Lang, Chunting; Lin, Shaoping; Macmil, Simone L.; Magdelenat, Ghislaine; Matthews, Lucy; McCorrison, Jamison; Monaghan, Erin L.; Mun, Jeong-Hwan; Najar, Fares Z.; Nicholson, Christine; Noirot, Céline; O’Bleness, Majesta; Paule, Charles R.; Poulain, Julie; Prion, Florent; Qin, Baifang; Qu, Chunmei; Retzel, Ernest F.; Riddle, Claire; Sallet, Erika; Samain, Sylvie; Samson, Nicolas; Sanders, Iryna; Saurat, Olivier; Scarpelli, Claude; Schiex, Thomas; Segurens, Béatrice; Severin, Andrew J.; Sherrier, D. Janine; Shi, Ruihua; Sims, Sarah; Singer, Susan R.; Sinharoy, Senjuti; Sterck, Lieven; Viollet, Agnès; Wang, Bing-Bing; Wang, Keqin; Wang, Mingyi; Wang, Xiaohong; Warfsmann, Jens; Weissenbach, Jean; White, Doug D.; White, Jim D.; Wiley, Graham B.; Wincker, Patrick; Xing, Yanbo; Yang, Limei; Yao, Ziyun; Ying, Fu; Zhai, Jixian; Zhou, Liping; Zuber, Antoine; Dénarié, Jean; Dixon, Richard A.; May, Gregory D.; Schwartz, David C.; Rogers, Jane; Quétier, Francis; Town, Christopher D.; Roe, Bruce A.

2011-01-01

Legumes (Fabaceae or Leguminosae) are unique among cultivated plants for their ability to carry out endosymbiotic nitrogen fixation with rhizobial bacteria, a process that takes place in a specialized structure known as the nodule. Legumes belong to one of the two main groups of eurosids, the Fabidae, which includes most species capable of endosymbiotic nitrogen fixation 1. Legumes comprise several evolutionary lineages derived from a common ancestor 60 million years ago (Mya). Papilionoids are the largest clade, dating nearly to the origin of legumes and containing most cultivated species 2. Medicago truncatula (Mt) is a long-established model for the study of legume biology. Here we describe the draft sequence of the Mt euchromatin based on a recently completed BAC-assembly supplemented with Illumina-shotgun sequence, together capturing ~94% of all Mt genes. A whole-genome duplication (WGD) approximately 58 Mya played a major role in shaping the Mt genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation. Subsequent to the WGD, the Mt genome experienced higher levels of rearrangement than two other sequenced legumes, Glycine max (Gm) and Lotus japonicus (Lj). Mt is a close relative of alfalfa (M. sativa), a widely cultivated crop with limited genomics tools and complex autotetraploid genetics. As such, the Mt genome sequence provides significant opportunities to expand alfalfa’s genomic toolbox. PMID:22089132
Accessing the genomic effects of naked nanoceria in murine neuronal cells.

PubMed

Lee, Tin-Lap; Raitano, Joan M; Rennert, Owen M; Chan, Siu-Wai; Chan, Wai-Yee

2012-07-01

Cerium oxide nanoparticles (nanoceria) are engineered nanoparticles whose versatility is due to their unique redox properties. We and others have demonstrated that naked nanoceria can act as antioxidants to protect cells against oxidative damage. Although the redox properties may be beneficial, the genome-wide effects of nanoceria on gene transcription and associated biological processes remain elusive. Here we applied a functional genomic approach to examine the genome-wide effects of nanoceria on global gene transcription and cellular functions in mouse neuronal cells. Importantly, we demonstrated that nanoceria induced chemical- and size-specific changes in the murine neuronal cell transcriptome. The nanoceria contributed more than 83% of the population of uniquely altered genes and were associated with a unique spectrum of genes related to neurological disease, cell cycle control, and growth. These observations suggest that an in-depth assessment of potential health effects of naked nanoceria and other naked nanoparticles is both necessary and imminent. Cerium oxide nanoparticles are important antioxidants, with potential applications in neurodegenerative conditions. This team of investigators demonstrated the genomic effects of nanoceria, showing that it induced chemical- and size-specific changes in the murine neuronal cell transcriptome. Published by Elsevier Inc.
The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1.

PubMed

Vaz, Paola K; Mahony, Timothy J; Hartley, Carol A; Fowler, Elizabeth V; Ficorilli, Nino; Lee, Sang W; Gilkerson, James R; Browning, Glenn F; Devlin, Joanne M

2016-01-22

While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexviruses, sharing 67% nucleotide sequence identity with Human herpesviruses 1 and 2. The MaHV-1 genome contained 66 predicted open reading frames (ORFs) homologous to those in other herpesvirus genomes, but lacked homologues of UL3, UL4, UL56 and glycoprotein J. This is the first alphaherpesvirus genome that has been found to lack the UL3 and UL4 homologues. We identified six novel ORFs and confirmed their transcription by RT-PCR. This is the first genome sequence of a herpesvirus that infects metatherians, a taxonomically unique mammalian clade. Members of the Simplexvirus genus are remarkably conserved, so the absence of ORFs otherwise retained in eutherian and avian alphaherpesviruses contributes to our understanding of the Alphaherpesvirinae. Further study of metatherian herpesvirus genetics and pathogenesis provides a unique approach to understanding herpesvirus-mammalian interactions.

Salvage of failed protein targets by reductive alkylation.

PubMed

Tan, Kemin; Kim, Youngchang; Hatzos-Skintges, Catherine; Chang, Changsoo; Cuff, Marianne; Chhor, Gekleng; Osipiuk, Jerzy; Michalska, Karolina; Nocek, Boguslaw; An, Hao; Babnigg, Gyorgy; Bigelow, Lance; Joachimiak, Grazyna; Li, Hui; Mack, Jamey; Makowska-Grzyska, Magdalena; Maltseva, Natalia; Mulligan, Rory; Tesar, Christine; Zhou, Min; Joachimiak, Andrzej

2014-01-01

The growth of diffraction-quality single crystals is of primary importance in protein X-ray crystallography. Chemical modification of proteins can alter their surface properties and crystallization behavior. The Midwest Center for Structural Genomics (MCSG) has previously reported how reductive methylation of lysine residues in proteins can improve crystallization of unique proteins that initially failed to produce diffraction-quality crystals. Recently, this approach has been expanded to include ethylation and isopropylation in the MCSG protein crystallization pipeline. Applying standard methods, 180 unique proteins were alkylated and screened using standard crystallization procedures. Crystal structures of 12 new proteins were determined, including the first ethylated and the first isopropylated protein structures. In a few cases, the structures of native and methylated or ethylated states were obtained and the impact of reductive alkylation of lysine residues was assessed. Reductive methylation tends to be more efficient and produces the most alkylated protein structures. Structures of methylated proteins typically have higher resolution limits. A number of well-ordered alkylated lysine residues have been identified, which make both intermolecular and intramolecular contacts. The previous report is updated and complemented with the following new data: a description of a detailed alkylation protocol with results, structural features, and roles of alkylated lysine residues in protein crystals. These contribute to improved crystallization properties of some proteins.
Salvage of Failed Protein Targets by Reductive Alkylation

PubMed Central

Tan, Kemin; Kim, Youngchang; Hatzos-Skintges, Catherine; Chang, Changsoo; Cuff, Marianne; Chhor, Gekleng; Osipiuk, Jerzy; Michalska, Karolina; Nocek, Boguslaw; An, Hao; Babnigg, Gyorgy; Bigelow, Lance; Joachimiak, Grazyna; Li, Hui; Mack, Jamey; Makowska-Grzyska, Magdalena; Maltseva, Natalia; Mulligan, Rory; Tesar, Christine; Zhou, Min; Joachimiak, Andrzej

2014-01-01

The growth of diffraction-quality single crystals is of primary importance in protein X-ray crystallography. Chemical modification of proteins can alter their surface properties and crystallization behavior. The Midwest Center for Structural Genomics (MCSG) has previously reported how reductive methylation of lysine residues in proteins can improve crystallization of unique proteins that initially failed to produce diffraction-quality crystals. Recently, this approach has been expanded to include ethylation and isopropylation in the MCSG protein crystallization pipeline. Applying standard methods, 180 unique proteins were alkylated and screened using standard crystallization procedures. Crystal structures of 12 new proteins were determined, including the first ethylated and the first isopropylated protein structures. In a few cases, the structures of native and methylated or ethylated states were obtained and the impact of reductive alkylation of lysine residues was assessed. Reductive methylation tends to be more efficient and produces the most alkylated protein structures. Structures of methylated proteins typically have higher resolution limits. A number of well-ordered alkylated lysine residues have been identified, which make both intermolecular and intramolecular contacts. The previous report is updated and complemented with the following new data; a description of a detailed alkylation protocol with results, structural features, and roles of alkylated lysine residues in protein crystals. These contribute to improved crystallization properties of some proteins. PMID:24590719
Genetic variability of mutans streptococci revealed by wide whole-genome sequencing

PubMed Central

2013-01-01

Background: Mutans streptococci are a group of bacteria that contribute significantly to tooth decay. Their genetic variability, however, is still not well understood. Results: Genomes of 6 clinical S. mutans isolates of different origins, one isolate of S. sobrinus (DSM 20742) and one isolate of S. ratti (DSM 20564) were sequenced and comparatively analyzed. Genome alignment revealed a mosaic-like structure of genome arrangement. Genes related to pathogenicity were found to vary highly among the strains, whereas genes for oxidative stress resistance are well conserved, indicating the importance of this trait in the dental biofilm community. Analysis of genome-scale metabolic networks revealed significant differences in 42 pathways. A striking dissimilarity is the unique presence of two lactate oxidases in S. sobrinus DSM 20742, probably indicating an unusual capability of this strain to produce H2O2 and expand its ecological niche. In addition, lactate oxidases may form, together with other enzymes, a novel energetic pathway in S. sobrinus DSM 20742 that can remedy its deficiency in the citrate utilization pathway. Using the 67 S. mutans genomes currently available, including the strains sequenced in this study, we estimated the theoretical core genome size of S. mutans and modeled the S. mutans pan-genome by applying different fitting models. An “open” pan-genome was inferred. Conclusions: The comparative genome analyses revealed diversity in the mutans streptococci group, especially with respect to virulence-related genes and metabolic pathways. The results are helpful for better understanding the evolution and adaptive mechanisms of these oral pathogens and for combating them. PMID:23805886
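The abstract does not say which fitting models were applied. One commonly used option for this kind of pan-genome modeling is a power-law (Heaps' law) fit, in which the number of new genes contributed by the N-th genome is modeled as n(N) = kappa * N^(-alpha), and alpha <= 1 is conventionally read as an "open" pan-genome. The sketch below is only an illustration under that assumption: the function names are hypothetical, and the input counts are synthetic values generated from the formula itself, not data from the study.

```python
import math

def fit_power_law(genome_counts, new_genes):
    """Least-squares fit of log(n) = log(kappa) - alpha * log(N)."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(g) for g in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    alpha = -slope                              # decay exponent
    kappa = math.exp(mean_y - slope * mean_x)   # scale factor
    return kappa, alpha

def is_open_pan_genome(alpha):
    """Conventional reading: alpha <= 1 means new genes keep accumulating."""
    return alpha <= 1.0

# Synthetic illustration only: 67 genomes, counts drawn from an exact power law.
genome_counts = list(range(2, 68))
new_genes = [120.0 * n ** -0.6 for n in genome_counts]

kappa, alpha = fit_power_law(genome_counts, new_genes)
print(f"kappa={kappa:.1f}, alpha={alpha:.2f}, open={is_open_pan_genome(alpha)}")
```

With real data, the counts of new genes would come from repeated random orderings of the genomes, and the fitted exponent would carry a confidence interval rather than being recovered exactly.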
Primordial germ cell-mediated transgenesis and genome editing in birds.

PubMed

Han, Jae Yong; Park, Young Hyun

2018-01-01

Transgenesis and genome editing in birds are based on a unique germline transmission system using primordial germ cells (PGCs), which is quite different from the mammalian transgenic and genome editing system. PGCs are progenitor cells of gametes that can deliver genetic information to the next generation. Since avian PGCs were first discovered in the nineteenth century, there have been numerous efforts to reveal their origin, specification, and unique migration pattern, and to improve germline transmission efficiency. Recent advances in the isolation and in vitro culture of avian PGCs with genetic manipulation and genome editing tools enable the development of valuable avian models that were unavailable before. However, many challenges remain in the production of transgenic and genome-edited birds, including the precise control of germline transmission, introduction of exogenous genes, and genome editing in PGCs. Therefore, establishing reliable germline-competent PGCs and applying precise genome editing systems are critical current issues in the production of avian models. Here, we present a historical overview of avian PGCs and their application, including improved techniques and methodologies in the production of transgenic and genome-edited birds, and we discuss the future potential applications of transgenic and genome-edited birds to provide opportunities and benefits for humans.
Comprehensive Genome Analysis of Carbapenemase-Producing Enterobacter spp.: New Insights into Phylogeny, Population Structure, and Resistance Mechanisms

PubMed Central

Chavda, Kalyan D.; Chen, Liang; Fouts, Derrick E.; Sutton, Granger; Brinkac, Lauren; Jenkins, Stephen G.; Bonomo, Robert A.

2016-01-01

Knowledge regarding the genomic structure of Enterobacter spp., the second most prevalent carbapenemase-producing Enterobacteriaceae, remains limited. Here we sequenced 97 clinical Enterobacter species isolates, both carbapenem susceptible and carbapenem resistant, from various geographic regions to decipher the molecular origins of carbapenem resistance and to understand the changing phylogeny of these emerging and drug-resistant pathogens. Of the carbapenem-resistant isolates, 30 possessed blaKPC-2, 40 had blaKPC-3, 2 had blaKPC-4, and 2 had blaNDM-1. Twenty-three isolates were carbapenem susceptible. Six genomes were sequenced to completion, and their sizes ranged from 4.6 to 5.1 Mbp. Phylogenomic analysis placed 96 of these genomes, 351 additional Enterobacter genomes downloaded from NCBI GenBank, and six newly sequenced type strains into 19 phylogenomic groups: 18 groups (A to R) in the Enterobacter cloacae complex and Enterobacter aerogenes. Diverse mechanisms underlying the molecular evolutionary trajectory of these drug-resistant Enterobacter spp. were revealed, including the acquisition of an antibiotic resistance plasmid, followed by clonal spread, horizontal transfer of blaKPC-harboring plasmids between different phylogenomic groups, and repeated transposition of the blaKPC gene among different plasmid backbones. Group A, which comprises multilocus sequence type 171 (ST171), was the most commonly identified (23% of isolates). Genomic analysis showed that ST171 isolates evolved from a common ancestor and formed two different major clusters, each acquiring unique blaKPC-harboring plasmids, followed by clonal expansion. The data presented here represent the first comprehensive study of phylogenomic interrogation and the relationship between antibiotic resistance and plasmid discrimination among carbapenem-resistant Enterobacter spp., demonstrating the genetic diversity and complexity of the molecular mechanisms driving antibiotic resistance in this genus. PMID:27965456
Genetic linkage map of a wild genome: genomic structure, recombination and sexual dimorphism in bighorn sheep

PubMed Central

2010-01-01

Background: The construction of genetic linkage maps in free-living populations is a promising tool for the study of evolution. However, such maps are rare because it is difficult to develop both wild pedigrees and corresponding sets of molecular markers that are sufficiently large. We took advantage of two long-term field studies of pedigreed individuals and genomic resources originally developed for domestic sheep (Ovis aries) to construct a linkage map for bighorn sheep, Ovis canadensis. We then assessed variability in genomic structure and recombination rates between bighorn sheep populations and sheep species. Results: Bighorn sheep population-specific maps differed slightly in contiguity but were otherwise very similar in terms of genomic structure and recombination rates. The joint analysis of the two pedigrees resulted in a highly contiguous map composed of 247 microsatellite markers distributed along all 26 autosomes and the X chromosome. The map is estimated to cover about 84% of the bighorn sheep genome and contains 240 unique positions spanning a sex-averaged distance of 3051 cM with an average inter-marker distance of 14.3 cM. Marker synteny, order, sex-averaged interval lengths and sex-averaged total map lengths were all very similar between sheep species. However, in contrast to domestic sheep, but consistent with the usual pattern for a placental mammal, recombination rates in bighorn sheep were significantly greater in females than in males (~12% difference), resulting in an autosomal female map of 3166 cM and an autosomal male map of 2831 cM. Despite differing genome-wide patterns of heterochiasmy between the sheep species, sexual dimorphism in recombination rates was correlated between orthologous intervals. Conclusions: We have developed a first-generation bighorn sheep linkage map that will facilitate future studies of the genetic architecture of trait variation in this species. While domestication has been hypothesized to be responsible for the elevated mean recombination rate observed in domestic sheep, our results suggest that it is a characteristic of Ovis species. However, domestication may have played a role in altering patterns of heterochiasmy. Finally, we found that interval-specific patterns of sexual dimorphism were preserved among closely related Ovis species, possibly due to the conserved position of these intervals relative to the centromeres and telomeres. This study exemplifies how transferring genomic resources from domesticated species to close wild relatives can benefit evolutionary ecologists while providing insights into the evolution of genomic structure and recombination rates of domesticated species. PMID:20920197
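The map statistics quoted in this abstract can be cross-checked with a little arithmetic. The sketch below uses only the figures stated above; the one assumption (not stated in the abstract) is that the average inter-marker distance is the sex-averaged map length divided by the number of intervals, taken here as unique positions minus linkage groups. Variable names are illustrative.

```python
# Figures quoted in the abstract.
female_autosomal_cm = 3166
male_autosomal_cm = 2831
sex_averaged_cm = 3051
unique_positions = 240
linkage_groups = 26 + 1  # 26 autosomes plus the X chromosome

# Heterochiasmy: how much longer the female map is than the male map.
female_excess_pct = (female_autosomal_cm - male_autosomal_cm) / male_autosomal_cm * 100

# Assumption: each linkage group with k positions contributes k - 1 intervals,
# so the total number of intervals is positions minus linkage groups.
intervals = unique_positions - linkage_groups
mean_spacing_cm = sex_averaged_cm / intervals

print(f"female excess: {female_excess_pct:.1f}%")
print(f"intervals: {intervals}, mean spacing: {mean_spacing_cm:.1f} cM")
```

Under that assumption the mean spacing comes out at about 14.3 cM and the female excess at roughly 12%, both in line with the values reported.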
Analysis of strain-specific genes in glutamic acid-producing Corynebacterium glutamicum ssp. lactofermentum AJ 1511.

PubMed

Nishio, Yousuke; Koseki, Chie; Tonouchi, Naoto; Matsui, Kazuhiko; Sugimoto, Shinichi; Usuda, Yoshihiro

2017-07-11

Strains of the bacterium, Corynebacterium glutamicum, are widely used for the industrial production of L-glutamic acid and various other substances. C. glutamicum ssp. lactofermentum AJ 1511, formerly classified as Brevibacterium lactofermentum, and the closely related C. glutamicum ATCC 13032 have been used as industrial strains for more than 50 years. We determined the whole genome sequence of C. glutamicum AJ 1511 and performed genome-wide comparative analysis with C. glutamicum ATCC 13032 to determine strain-specific genetic differences. This analysis revealed that the genomes of the two industrial strains are highly similar despite the phenotypic differences between the two strains. Both strains harbored unique genes but gene transpositions or inversions were not observed. The largest unique region, a 220-kb AT-rich region located between 1.78 and 2.00 Mb position in C. glutamicum ATCC 13032 genome, was missing in the genome of C. glutamicum AJ 1511. The next two largest unique regions were present in C. glutamicum AJ 1511. The first region (413-484 kb position) contains several predicted transport proteins, enzymes involved in sugar metabolism, and transposases. The second region (1.47-1.50 Mb position) encodes restriction modification systems. A gene predicted to encode NADH-dependent glutamate dehydrogenase, which is involved in L-glutamate biosynthesis, is present in C. glutamicum AJ 1511. Strain-specific genes identified in this study are likely to govern phenotypes unique to each strain.
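The sizes of the three strain-unique regions follow directly from the coordinates given in the abstract. The short sketch below converts the quoted positions to kilobases and ranks the regions by length; the labels are illustrative, and no coordinates beyond those quoted above are assumed.

```python
# (start_kb, end_kb) taken from the positions quoted in the abstract.
regions_kb = {
    "AT-rich region, ATCC 13032 only": (1780, 2000),  # 1.78-2.00 Mb
    "region 1, AJ 1511 only": (413, 484),              # 413-484 kb
    "region 2, AJ 1511 only": (1470, 1500),            # 1.47-1.50 Mb
}

# Length of each region in kb.
lengths_kb = {name: end - start for name, (start, end) in regions_kb.items()}

# Rank from largest to smallest.
ranked = sorted(lengths_kb, key=lengths_kb.get, reverse=True)

for name in ranked:
    print(f"{name}: {lengths_kb[name]} kb")
```

The computed lengths (220 kb, 71 kb, and 30 kb) agree with the ordering given in the text, where the 220-kb region is described as the largest.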
Mutagenic consequences of a single G-quadruplex demonstrate mitotic inheritance of DNA replication fork barriers

PubMed Central

Lemmens, Bennie; van Schendel, Robin; Tijsterman, Marcel

2015-01-01

Faithful DNA replication is vital to prevent disease-causing mutations, chromosomal aberrations and malignant transformation. However, accuracy conflicts with pace and flexibility and cells rely on specialized polymerases and helicases to ensure effective and timely replication of genomes that contain DNA lesions or secondary structures. If and how cells can tolerate a permanent barrier to replication is, however, unknown. Here we show that a single unresolved G-quadruplexed DNA structure can persist through multiple mitotic divisions without changing conformation. Failed replication across a G-quadruplex causes single-strand DNA gaps that give rise to DNA double-strand breaks in subsequent cell divisions, which are processed by polymerase theta (POLQ)-mediated alternative end joining. Lineage tracing experiments further reveal that persistent G-quadruplexes cause genetic heterogeneity during organ development. Our data demonstrate that a single lesion can cause multiple unique genomic rearrangements, and that alternative end joining enables cells to proliferate in the presence of mitotically inherited replication blocks. PMID:26563448
Mutagenic consequences of a single G-quadruplex demonstrate mitotic inheritance of DNA replication fork barriers.

PubMed

Lemmens, Bennie; van Schendel, Robin; Tijsterman, Marcel

2015-11-13

Faithful DNA replication is vital to prevent disease-causing mutations, chromosomal aberrations and malignant transformation. However, accuracy conflicts with pace and flexibility and cells rely on specialized polymerases and helicases to ensure effective and timely replication of genomes that contain DNA lesions or secondary structures. If and how cells can tolerate a permanent barrier to replication is, however, unknown. Here we show that a single unresolved G-quadruplexed DNA structure can persist through multiple mitotic divisions without changing conformation. Failed replication across a G-quadruplex causes single-strand DNA gaps that give rise to DNA double-strand breaks in subsequent cell divisions, which are processed by polymerase theta (POLQ)-mediated alternative end joining. Lineage tracing experiments further reveal that persistent G-quadruplexes cause genetic heterogeneity during organ development. Our data demonstrate that a single lesion can cause multiple unique genomic rearrangements, and that alternative end joining enables cells to proliferate in the presence of mitotically inherited replication blocks.
The Dynamic Interplay Between DNA Topoisomerases and DNA Topology.

PubMed

Seol, Yeonee; Neuman, Keir C

2016-09-01

Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, alter the topology of the genome while introducing additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that established the fundamental mechanistic basis of topoisomerase activity, the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases have begun to be explored. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
The dynamic interplay between DNA topoisomerases and DNA topology.

PubMed

Seol, Yeonee; Neuman, Keir C

2016-11-01

Topological properties of DNA influence its structure and biochemical interactions. Within the cell, DNA topology is constantly in flux. Transcription and other essential processes, including DNA replication and repair, not only alter the topology of the genome but also introduce additional complications associated with DNA knotting and catenation. These topological perturbations are counteracted by the action of topoisomerases, a specialized class of highly conserved and essential enzymes that actively regulate the topological state of the genome. This dynamic interplay among DNA topology, DNA processing enzymes, and DNA topoisomerases is a pervasive factor that influences DNA metabolism in vivo. Building on the extensive structural and biochemical characterization over the past four decades that has established the fundamental mechanistic basis of topoisomerase activity, scientists have begun to explore the unique roles played by DNA topology in modulating and influencing the activity of topoisomerases. In this review we survey established and emerging DNA topology-dependent protein-DNA interactions with a focus on in vitro measurements of the dynamic interplay between DNA topology and topoisomerase activity.
Human retinoblastoma susceptibility gene: genomic organization and analysis of heterozygous intragenic deletion mutants.

PubMed Central

Bookstein, R; Lee, E Y; To, H; Young, L J; Sery, T W; Hayes, R C; Friedmann, T; Lee, W H

1988-01-01

A gene in chromosome region 13q14 has been identified as the human retinoblastoma susceptibility (RB) gene on the basis of altered gene expression found in virtually all retinoblastomas. In order to further characterize the RB gene and its structural alterations, we examined genomic clones of the RB gene isolated from both a normal human genomic library and a library made from DNA of the retinoblastoma cell line Y79. First, a restriction and exon map of the RB gene was constructed by aligning overlapping genomic clones, yielding three contiguous regions ("contigs") of 150 kilobases total length separated by two gaps. At least 20 exons were identified in genomic clones, and these were provisionally numbered. Second, two overlapping genomic clones that demonstrated a DNA deletion of exons 2 through 6 from one RB allele were isolated from the Y79 library. To confirm and extend this result, a unique sequence probe from intron 1 was used to detect similar and possibly identical heterozygous deletions in genomic DNA from three retinoblastoma cell lines, thereby explaining the origins of their shortened RB mRNA transcripts. The same probe detected genomic rearrangements in fibroblasts from two hereditary retinoblastoma patients, indicating that intron 1 includes a frequent site for mutations conferring predisposition to retinoblastoma. Third, this probe also detected a polymorphic site for BamHI with allele frequencies near 0.5/0.5. Identification of commonly mutated regions will contribute significantly to genetic diagnosis in retinoblastoma patients and families. PMID:2895471
Accumulation of slightly deleterious mutations in the mitochondrial genome: a hallmark of animal domestication.

PubMed

Hughes, Austin L

2013-02-15

The hypothesis that domestication leads to a relaxation of purifying selection on mitochondrial (mt) genomes was tested by comparative analysis of mt genes from dog, pig, chicken, and silkworm. The three vertebrate species showed mt genome phylogenies in which domestic and wild isolates were intermingled, whereas the domestic silkworm (Bombyx mori) formed a distinct cluster nested within its closest wild relative (Bombyx mandarina). In spite of these differences in phylogenetic pattern, significantly greater proportions of nonsynonymous SNPs than of synonymous SNPs were unique to the domestic populations of all four species. Likewise, in all four species, significantly greater proportions of RNA-encoding SNPs than of synonymous SNPs were unique to the domestic populations. Thus, domestic populations were characterized by an excess of unique polymorphisms in two categories generally subject to purifying selection: nonsynonymous sites and RNA-encoding sites. Many of these unique polymorphisms thus seem likely to be slightly deleterious; the latter hypothesis was supported by the generally lower gene diversities of polymorphisms unique to domestic populations in comparison to those of polymorphisms shared by domestic and wild populations.
Contribution of transposable elements and distal enhancers to evolution of human-specific features of interphase chromatin architecture in embryonic stem cells.

PubMed

Glinsky, Gennadi V

2018-03-01

Transposable elements have made major evolutionary impacts on the creation of primate-specific and human-specific genomic regulatory loci and species-specific genomic regulatory networks (GRNs). Molecular and genetic definitions of human-specific changes to GRNs contributing to the development of phenotypes unique to humans remain a highly significant challenge. Genome-wide proximity placement analysis of diverse families of human-specific genomic regulatory loci (HSGRL) identified topologically associating domains (TADs) that are significantly enriched for HSGRL and designated rapidly evolving in human TADs. Here, the analysis of HSGRL, hESC-enriched enhancers, super-enhancers (SEs), and specific sub-TAD structures termed super-enhancer domains (SEDs) has been performed. In the hESC genome, 331 of 504 (66%) SED-harboring TADs contain HSGRL and 68% of SEDs co-localize with HSGRL, suggesting that emergence of HSGRL may have rewired SED-associated GRNs within specific TADs by inserting novel and/or erasing existing non-coding regulatory sequences. Consequently, markedly distinct features of the principal regulatory structures of interphase chromatin evolved in the hESC genome compared to mouse: the SED quantity is 3-fold higher and the median SED size is significantly larger. Concomitantly, the overall TAD quantity is increased by 42% while the median TAD size is significantly decreased (p = 9.11E-37) in the hESC genome. Present analyses illustrate a putative global role for transposable elements and HSGRL in shaping the human-specific features of the interphase chromatin organization and functions, which are facilitated by accelerated creation of novel transcription factor binding sites and new enhancers driven by targeted placement of HSGRL at defined genomic coordinates. A trend toward the convergence of TAD and SED architectures of interphase chromatin in the hESC genome may reflect changes of 3D-folding patterns of linear chromatin fibers designed to enhance both regulatory complexity and functional precision of GRNs by creating predominantly a single gene (or a set of functionally linked genes) per regulatory domain structures. Collectively, present analyses reveal critical evolutionary contributions of transposable elements and distal enhancers to the creation of thousands of primate- and human-specific elements of a chromatin folding code, which defines the 3D context of interphase chromatin, both restricting and facilitating biological functions of GRNs.
The Atlantic salmon genome provides insights into rediploidization

USDA-ARS's Scientific Manuscript database

The common ancestor of salmonids underwent an autotetraploid whole genome duplication event (Ss4R) approximately eighty million years ago, which provides unique opportunities to study the early evolutionary fate of a duplicated vertebrate genome in different extant lineages. Here, we present a high ...
Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

USDA-ARS's Scientific Manuscript database

Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...
De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing.

PubMed

Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

2012-01-01

Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high-throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. A sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%), which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.
De Novo Assembly, Characterization and Functional Annotation of Pineapple Fruit Transcriptome through Massively Parallel Sequencing

PubMed Central

Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

2012-01-01

Background: Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanisms and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high-throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. Methodology/Principal Findings: To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. A sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%), which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. Conclusions: The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple. PMID:23091603
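The annotation percentages in this abstract can be reproduced from the transcript counts it reports. The sketch below recomputes them; the helper name `pct` is illustrative, and the only inputs are the counts quoted above.

```python
# Counts quoted in the abstract.
total_transcripts = 28728
with_significant_hits = 16932
with_go_terms = 15507
with_kegg_annotation = 13598

def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

hits_pct = pct(with_significant_hits, total_transcripts)
kegg_pct = pct(with_kegg_annotation, total_transcripts)
go_of_hits_pct = pct(with_go_terms, with_significant_hits)

print(f"significant hits: {hits_pct:.2f}% of all transcripts")
print(f"KEGG-annotated:   {kegg_pct:.2f}% of all transcripts")
print(f"GO-assigned:      {go_of_hits_pct:.1f}% of transcripts with hits")
```

The KEGG figure matches the reported 47.33%. The hit rate evaluates to about 58.94% when rounded, so the reported 58.93% appears to be truncated rather than rounded.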
The Comprehensive Antibiotic Resistance Database

PubMed Central

McArthur, Andrew G.; Waglechner, Nicholas; Nizam, Fazmin; Yan, Austin; Azad, Marisa A.; Baylay, Alison J.; Bhullar, Kirandeep; Canova, Marc J.; De Pascale, Gianfranco; Ejim, Linda; Kalan, Lindsay; King, Andrew M.; Koteva, Kalinka; Morar, Mariya; Mulvey, Michael R.; O'Brien, Jonathan S.; Pawlowski, Andrew C.; Piddock, Laura J. V.; Spanogiannopoulos, Peter; Sutherland, Arlene D.; Tang, Irene; Taylor, Patricia L.; Thaker, Maulik; Wang, Wenliang; Yan, Marie; Yu, Tennison

2013-01-01

The field of antibiotic drug discovery and the monitoring of new antibiotic resistance elements have yet to fully exploit the power of the genome revolution. Despite the fact that the first genomes sequenced of free-living organisms were those of bacteria, there have been few specialized bioinformatic tools developed to mine the growing amount of genomic data associated with pathogens. In particular, there are few tools to study the genetics and genomics of antibiotic resistance and how it impacts bacterial populations, ecology, and the clinic. We have initiated development of such tools in the form of the Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca). The CARD integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in new unannotated genome sequences. This unique platform provides an informatic tool that bridges antibiotic resistance concerns in health care, agriculture, and the environment. PMID:23650175
Structures of membrane proteins

PubMed Central

Vinothkumar, Kutti R.; Henderson, Richard

2010-01-01

In reviewing the structures of membrane proteins determined up to the end of 2009, we present in words and pictures the most informative examples from each family. We group the structures together according to their function and architecture to provide an overview of the major principles and variations on the most common themes. The first structures, determined 20 years ago, were those of naturally abundant proteins with limited conformational variability, and each membrane protein structure determined was a major landmark. With the advent of complete genome sequences and efficient expression systems, there has been an explosion in the rate of membrane protein structure determination, with many classes represented. New structures are published every month and more than 150 unique membrane protein structures have been determined. This review analyses the reasons for this success, discusses the challenges that still lie ahead, and presents a concise summary of the key achievements with illustrated examples selected from each class. PMID:20667175

Structural and Phylogenetic Analysis of a Conserved Actinobacteria-Specific Protein (ASP1; SCO1997) from Streptomyces Coelicolor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gao, B.; Sugiman-Marangos, S; Junop, M

2009-01-01

The Actinobacteria phylum represents one of the largest and most diverse groups of bacteria, encompassing many important and well-characterized organisms including Streptomyces, Bifidobacterium, Corynebacterium and Mycobacterium. Members of this phylum are remarkably diverse in terms of life cycle, morphology, physiology and ecology. Recent comparative genomic analysis of 19 actinobacterial species determined that only 5 genes of unknown function uniquely define this large phylum [1]. The cellular functions of these actinobacteria-specific proteins (ASP) are not known.
Structure of large dsDNA viruses

PubMed Central

Klose, Thomas; Rossmann, Michael G.

2015-01-01

Nucleocytoplasmic large dsDNA viruses (NCLDVs) encompass an ever-increasing group of large eukaryotic viruses, infecting a wide variety of organisms. The set of core genes shared by all these viruses includes a major capsid protein with a double jelly-roll fold forming an icosahedral capsid, which surrounds a double layer membrane that contains the viral genome. Furthermore, some of these viruses, such as the members of the Mimiviridae and Phycodnaviridae have a unique vertex that is used during infection to transport DNA into the host. PMID:25003382
Genomics Portals: integrative web-platform for mining genomics data.

PubMed

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Genomics Portals: integrative web-platform for mining genomics data

PubMed Central

2010-01-01

Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909
Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus)

PubMed Central

Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

2016-01-01

Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance and widely distributed in northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) for each population was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai, respectively. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus. PMID:27336696
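The FST values quoted above measure how far allele frequencies differ between the two populations. As a rough illustration only, the sketch below computes a simple heterozygosity-based FST for one biallelic SNP from the allele frequencies in two equally weighted populations. The function name `nei_fst` is hypothetical, and this is not the estimator implemented in BAYESCAN or LOSITAN; it also ignores the sample-size corrections a real analysis would apply.

```python
def nei_fst(p1: float, p2: float) -> float:
    """Heterozygosity-based FST for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and 2.
    Assumes equal population weights and no sample-size correction.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        # Locus is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total
```

Identical frequencies give 0, and populations fixed for alternative alleles give 1; genome-wide values are usually summarized across many SNPs rather than read from a single locus.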
Ancient, recurrent phage attacks and recombination shaped dynamic sequence-variable mosaics at the root of phytoplasma genome evolution.

PubMed

Wei, Wei; Davis, Robert E; Jomantiene, Rasa; Zhao, Yan

2008-08-19

Mobile genetic elements have impacted biological evolution across all studied organisms, but evidence for a role in evolutionary emergence of an entire phylogenetic clade has not been forthcoming. We suggest that mobile element predation played a formative role in emergence of the phytoplasma clade. Phytoplasmas are cell wall-less bacteria that cause numerous diseases in plants. Phylogenetic analyses indicate that these transkingdom parasites descended from Gram-positive walled bacteria, but events giving rise to the first phytoplasma have remained unknown. Previously we discovered a unique feature of phytoplasmal genome architecture, genes clustered in sequence-variable mosaics (SVMs), and suggested that such structures formed through recurrent, targeted attacks by mobile elements. In the present study, we discovered that cryptic prophage remnants, originating from phages in the order Caudovirales, formed SVMs and comprised exceptionally large percentages of the chromosomes of 'Candidatus Phytoplasma asteris'-related strains OYM and AYWB, occupying nearly all major nonsyntenic sections, and accounting for most of the size difference between the two genomes. The clustered phage remnants formed genomic islands exhibiting distinct DNA physical signatures, such as dinucleotide relative abundance and codon position GC values. Phytoplasma strain-specific genes identified as phage morons were located in hypervariable regions within individual SVMs, indicating that prophage remnants played important roles in generating phytoplasma genetic diversity. Because no SVM-like structures could be identified in genomes of ancestral relatives including Acholeplasma spp., we hypothesize that ancient phage attacks leading to SVM formation occurred after divergence of phytoplasmas from acholeplasmas, triggering evolution of the phytoplasma clade.
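Dinucleotide relative abundance, one of the DNA physical signatures mentioned above, compares how often a dinucleotide occurs with how often it would be expected from its two component bases. The sketch below is a minimal single-strand version, assuming the usual odds-ratio definition f(xy) / (f(x) f(y)); the function name is hypothetical, and the abstract does not say which variant the authors used (published analyses often also combine the sequence with its reverse complement).

```python
from collections import Counter


def dinucleotide_relative_abundance(seq: str) -> dict[str, float]:
    """Return observed/expected ratios for each dinucleotide seen in seq.

    Single-strand sketch: the sequence is taken as given, without
    merging in its reverse complement.
    """
    seq = seq.upper()
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    ratios = {}
    for pair, count in di.items():
        f_x = mono[pair[0]] / n_mono
        f_y = mono[pair[1]] / n_mono
        ratios[pair] = (count / n_di) / (f_x * f_y)
    return ratios
```

Comparing these ratios between a candidate genomic island and the rest of the chromosome is one way such a signature could be used to flag regions of foreign origin.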
Genome-Wide SNP Discovery, Genotyping and Their Preliminary Applications for Population Genetic Inference in Spotted Sea Bass (Lateolabrax maculatus).

PubMed

Wang, Juan; Xue, Dong-Xiu; Zhang, Bai-Dong; Li, Yu-Long; Liu, Bing-Jian; Liu, Jin-Xian

2016-01-01

Next-generation sequencing and the collection of genome-wide single-nucleotide polymorphisms (SNPs) allow identifying fine-scale population genetic structure and genomic regions under selection. The spotted sea bass (Lateolabrax maculatus) is a non-model species of ecological and commercial importance and widely distributed in northwestern Pacific. A total of 22 648 SNPs was discovered across the genome of L. maculatus by paired-end sequencing of restriction-site associated DNA (RAD-PE) for 30 individuals from two populations. The nucleotide diversity (π) for each population was 0.0028±0.0001 in Dandong and 0.0018±0.0001 in Beihai, respectively. Shallow but significant genetic differentiation was detected between the two populations analyzed by using both the whole data set (FST = 0.0550, P < 0.001) and the putatively neutral SNPs (FST = 0.0347, P < 0.001). However, the two populations were highly differentiated based on the putatively adaptive SNPs (FST = 0.6929, P < 0.001). Moreover, a total of 356 SNPs representing 298 unique loci were detected as outliers putatively under divergent selection by FST-based outlier tests as implemented in BAYESCAN and LOSITAN. Functional annotation of the contigs containing putatively adaptive SNPs yielded hits for 22 of 55 (40%) significant BLASTX matches. Candidate genes for local selection constituted a wide array of functions, including binding, catalytic and metabolic activities, etc. The analyses with the SNPs developed in the present study highlighted the importance of genome-wide genetic variation for inference of population structure and local adaptation in L. maculatus.
Natural history of the ERVWE1 endogenous retroviral locus

PubMed Central

Bonnaud, Bertrand; Beliaeff, Jean; Bouton, Olivier; Oriol, Guy; Duret, Laurent; Mallet, François

2005-01-01

Background The human HERV-W multicopy family includes a unique proviral locus, termed ERVWE1, whose full-length envelope ORF was preserved through evolution by the action of a selective pressure. The encoded Env protein (Syncytin) is involved in hominoid placental physiology. Results In order to infer the natural history of this domestication process, a comparative genomic analysis of the human 7q21.2 syntenic regions in eutherians was performed. In primates, this region was progressively colonized by LTR-elements, leading to two different evolutionary pathways in Cercopithecidae and Hominidae, a genetic drift versus a domestication, respectively. Conclusion The preservation in Hominoids of a genomic structure consisting in the juxtaposition of a retrotransposon-derived MaLR LTR and the ERVWE1 provirus suggests a functional link between both elements. PMID:16176588
Phylogenetic and Evolutionary Patterns in Microbial Carotenoid Biosynthesis Are Revealed by Comparative Genomics

PubMed Central

Klassen, Jonathan L.

2010-01-01

Background Carotenoids are multifunctional, taxonomically widespread and biotechnologically important pigments. Their biosynthesis serves as a model system for understanding the evolution of secondary metabolism. Microbial carotenoid diversity and evolution have hitherto been analyzed primarily from structural and biosynthetic perspectives, with the few phylogenetic analyses of microbial carotenoid biosynthetic proteins either using limited datasets or lacking methodological rigor. Given the recent accumulation of microbial genome sequences, a reappraisal of microbial carotenoid biosynthetic diversity and evolution from the perspective of comparative genomics is warranted to validate and complement models of microbial carotenoid diversity and evolution based upon structural and biosynthetic data. Methodology/Principal Findings Comparative genomics were used to identify and analyze in silico microbial carotenoid biosynthetic pathways. Four major phylogenetic lineages of carotenoid biosynthesis are suggested, composed of: (i) Proteobacteria; (ii) Firmicutes; (iii) Chlorobi, Cyanobacteria and photosynthetic eukaryotes; and (iv) Archaea, Bacteroidetes and two separate sub-lineages of Actinobacteria. Using this phylogenetic framework, specific evolutionary mechanisms are proposed for carotenoid desaturase CrtI-family enzymes and carotenoid cyclases. Several phylogenetic lineage-specific evolutionary mechanisms are also suggested, including: (i) horizontal gene transfer; (ii) gene acquisition followed by differential gene loss; (iii) co-evolution with other biochemical structures such as proteorhodopsins; and (iv) positive selection. Conclusions/Significance Comparative genomics analyses of microbial carotenoid biosynthetic proteins indicate a much greater taxonomic diversity than that identified based on structural and biosynthetic data, and divide microbial carotenoid biosynthesis into several well-supported phylogenetic lineages not evident previously.
This phylogenetic framework is applicable to understanding the evolution of specific carotenoid biosynthetic proteins or the unique characteristics of carotenoid biosynthetic evolution in a specific phylogenetic lineage. Together, these analyses suggest a “bramble” model for microbial carotenoid biosynthesis whereby later biosynthetic steps exhibit greater evolutionary plasticity and reticulation compared to those closer to the biosynthetic “root”. Structural diversification may be constrained (“trimmed”) where selection is strong, but less so where selection is weaker. These analyses also highlight likely productive avenues for future research and bioprospecting by identifying both gaps in current knowledge and taxa which may particularly facilitate carotenoid diversification. PMID:20582313
Recent history of artificial outcrossing facilitates whole-genome association mapping in elite inbred crop varieties

PubMed Central

Rostoks, Nils; Ramsay, Luke; MacKenzie, Katrin; Cardle, Linda; Bhat, Prasanna R.; Roose, Mikeal L.; Svensson, Jan T.; Stein, Nils; Varshney, Rajeev K.; Marshall, David F.; Graner, Andreas; Close, Timothy J.; Waugh, Robbie

2006-01-01

Genomewide association studies depend on the extent of linkage disequilibrium (LD), the number and distribution of markers, and the underlying structure in populations under study. Outbreeding species generally exhibit limited LD, and consequently, a very large number of markers are required for effective whole-genome association genetic scans. In contrast, several of the world's major food crops are self-fertilizing inbreeding species with narrow genetic bases and theoretically extensive LD. Together these are predicted to result in a combination of low resolution and a high frequency of spurious associations in LD-based studies. However, inbred elite plant varieties represent a unique human-induced pseudooutbreeding population that has been subjected to strong selection for advantageous alleles. By assaying 1,524 genomewide SNPs we demonstrate that, after accounting for population substructure, the level of LD exhibited in elite northwest European barley, a typical inbred cereal crop, can be effectively exploited to map traits by using whole-genome association scans with several hundred to thousands of biallelic SNPs. PMID:17085595
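The extent of linkage disequilibrium discussed above is commonly summarized as r² between pairs of SNPs. The sketch below is a minimal illustration for two biallelic SNPs, assuming phased haplotypes coded 0/1 are available (a reasonable simplification for highly inbred lines, where genotypes are almost entirely homozygous). The function name and input layout are hypothetical and are not taken from the study.

```python
def ld_r_squared(haplotypes: list[tuple[int, int]]) -> float:
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: one (allele_at_snp_a, allele_at_snp_b) pair per
    haplotype, with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation of the joint frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

Plotting r² against the genetic or physical distance between marker pairs shows how quickly LD decays, which in turn indicates how many markers a whole-genome scan needs.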
Isolation of a Novel Fusogenic Orthoreovirus from Eucampsipoda africana Bat Flies in South Africa

PubMed Central

Jansen van Vuren, Petrus; Wiley, Michael; Palacios, Gustavo; Storm, Nadia; McCulloch, Stewart; Markotter, Wanda; Birkhead, Monica; Kemp, Alan; Paweska, Janusz T.

2016-01-01

We report on the isolation of a novel fusogenic orthoreovirus from bat flies (Eucampsipoda africana) associated with Egyptian fruit bats (Rousettus aegyptiacus) collected in South Africa. Complete sequences of the ten dsRNA genome segments of the virus, tentatively named Mahlapitsi virus (MAHLV), were determined. Phylogenetic analysis places this virus into a distinct clade with Baboon orthoreovirus, Bush viper reovirus and the bat-associated Broome virus. All genome segments of MAHLV contain a 5' terminal sequence (5'-GGUCA) that is unique to all currently described viruses of the genus. The smallest genome segment is bicistronic encoding for a 14 kDa protein similar to p14 membrane fusion protein of Bush viper reovirus and an 18 kDa protein similar to p16 non-structural protein of Baboon orthoreovirus. This is the first report on isolation of an orthoreovirus from an arthropod host associated with bats, and phylogenetic and sequence data suggests that MAHLV constitutes a new species within the Orthoreovirus genus. PMID:27011199
Ketide Synthase (KS) Domain Prediction and Analysis of Iterative Type II PKS Gene in Marine Sponge-Associated Actinobacteria Producing Biosurfactants and Antimicrobial Agents

PubMed Central

Selvin, Joseph; Sathiyanarayanan, Ganesan; Lipton, Anuj N.; Al-Dhabi, Naif Abdullah; Valan Arasu, Mariadhas; Kiran, George S.

2016-01-01

Marine actinobacteria that produce important biological macromolecules, such as lipopeptide and glycolipid biosurfactants, were analyzed, and their potential linkage to type II polyketide synthase (PKS) genes was explored. A unique feature of type II PKS genes is their high amino acid (AA) sequence homology and conserved gene organization. These enzymes mediate the biosynthesis of polyketide natural products with enormous structural complexity and chemical nature by combinatorial use of various domains. Deciphering how the order of AA sequences encoded by PKS domains tailors the chemical structure of polyketide analogs therefore remains a great challenge. The present work deals with an in vitro and in silico analysis of PKS type II genes from five actinobacterial species to correlate KS domain architecture and structural features. Our present analysis reveals the unique protein domain organization of iterative type II PKS and KS domain of marine actinobacteria. The findings of this study would have implications in metabolic pathway reconstruction and design of semi-synthetic genomes to achieve rational design of novel natural products. PMID:26903957
Identification of a novel bovine enterovirus possessing highly divergent amino acid sequences in capsid protein.

PubMed

Tsuchiaka, Shinobu; Rahpaya, Sayed Samim; Otomaru, Konosuke; Aoki, Hiroshi; Kishimoto, Mai; Naoi, Yuki; Omatsu, Tsutomu; Sano, Kaori; Okazaki-Terashima, Sachiko; Katayama, Yukie; Oba, Mami; Nagai, Makoto; Mizutani, Tetsuya

2017-01-17

Bovine enterovirus (BEV) belongs to the species Enterovirus E or F, genus Enterovirus and family Picornaviridae. Although numerous studies have identified BEVs in the feces of cattle with diarrhea, the pathogenicity of BEVs remains unclear. Previously, we reported the detection of novel kobu-like virus in calf feces, by metagenomics analysis. In the present study, we identified a novel BEV in diarrheal feces collected for that survey. Complete genome sequences were determined by deep sequencing in feces. Secondary RNA structure analysis of the 5' untranslated region (UTR), phylogenetic tree construction and pairwise identity analysis were conducted. The complete genome sequences of BEV were genetically distant from other EVs and the VP1 coding region contained novel and unique amino acid sequences. We named this strain as BEV AN12/Bos taurus/JPN/2014 (referred to as BEV-AN12). According to genome analysis, the genome length of this virus is 7414 nucleotides excluding the poly (A) tail and its genome consists of a 5'UTR, open reading frame encoding a single polyprotein, and 3'UTR. The results of secondary RNA structure analysis showed that in the 5'UTR, BEV-AN12 had an additional clover leaf structure and small stem loop structure, similarly to other BEVs. In pairwise identity analysis, BEV-AN12 showed high amino acid (aa) identities to Enterovirus F in the polyprotein, P2 and P3 regions (aa identity ≥82.4%). Therefore, BEV-AN12 is closely related to Enterovirus F. However, aa sequences in the capsid protein regions, particularly the VP1 encoding region, showed significantly low aa identity to other viruses in genus Enterovirus (VP1 aa identity ≤58.6%). In addition, BEV-AN12 branched separately from Enterovirus E and F in phylogenetic trees based on the aa sequences of P1 and VP1, although it clustered with Enterovirus F in trees based on sequences in the P2 and P3 genome region. 
We identified novel BEV possessing highly divergent aa sequences in the VP1 coding region in Japan. According to species definition, we proposed naming this strain as "Enterovirus K", which is a novel species within genus Enterovirus. Further genomic studies are needed to understand the pathogenicity of BEVs.
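The pairwise identity figures in this abstract (for example, VP1 aa identity ≤58.6%) are percentages of matching residues between two aligned sequences. The sketch below shows one simple way such a value could be computed, assuming the two sequences are already aligned to equal length with `-` as the gap character and that gapped columns are skipped. The function name and the gap-handling rule are assumptions; tools differ in how they count gaps, and the abstract does not state which convention was used.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Applied per region (P1, P2, P3, VP1), values like these can then be compared against whatever species demarcation thresholds the analysis adopts.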
A proteome view of structural, functional, and taxonomic characteristics of major protein domain clusters.

PubMed

Sun, Chia-Tsen; Chiang, Austin W T; Hwang, Ming-Jing

2017-10-27

Proteome-scale bioinformatics research is increasingly conducted as the number of completely sequenced genomes increases, but analysis of protein domains (PDs) usually relies on similarity in their amino acid sequences and/or three-dimensional structures. Here, we present results from a bi-clustering analysis on presence/absence data for 6,580 unique PDs in 2,134 species with a sequenced genome, thus covering a complete set of proteins, for the three superkingdoms of life, Bacteria, Archaea, and Eukarya. Our analysis revealed eight distinctive PD clusters, which, following an analysis of enrichment of Gene Ontology functions and CATH classification of protein structures, were shown to exhibit structural and functional properties that are taxa-characteristic. For example, the largest cluster is ubiquitous in all three superkingdoms, constituting a set of 1,472 persistent domains created early in evolution and retained in living organisms and characterized by basic cellular functions and ancient structural architectures, while an Archaea and Eukarya bi-superkingdom cluster suggests its PDs may have existed in the ancestor of the two superkingdoms, and others are single superkingdom- or taxa (e.g. Fungi)-specific. These results increase our appreciation of PD diversity and our knowledge of how PDs are used in species, with implications for species evolution.
Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

USDA-ARS's Scientific Manuscript database

The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...
RNA secondary structures of the bacteriophage phi6 packaging regions.

PubMed

Pirttimaa, M J; Bamford, D H

2000-06-01

Bacteriophage phi6 genome consists of three segments of double-stranded RNA. During maturation, single-stranded copies of these segments are packaged into preformed polymerase complex particles. Only phi6 RNA is packaged, and each particle contains only one copy of each segment. An in vitro packaging and replication assay has been developed for phi6, and the packaging signals (pac sites) have been mapped to the 5' ends of the RNA segments. In this study, we propose secondary structure models for the pac sites of phi6 single-stranded RNA segments. Our models accommodate data from structure-specific chemical modifications, free energy minimizations, and phylogenetic comparisons. Previously reported pac site deletion studies are also discussed. Each pac site possesses a unique architecture, that, however, contains common structural elements.
The genome of Eucalyptus grandis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.

Eucalypts are the world's most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood, have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.
Whole-Genome Sequencing for Detecting Antimicrobial Resistance in Nontyphoidal Salmonella.

PubMed

McDermott, Patrick F; Tyson, Gregory H; Kabera, Claudine; Chen, Yuansha; Li, Cong; Folster, Jason P; Ayers, Sherry L; Lam, Claudia; Tate, Heather P; Zhao, Shaohua

2016-09-01

Laboratory-based in vitro antimicrobial susceptibility testing is the foundation for guiding anti-infective therapy and monitoring antimicrobial resistance trends. We used whole-genome sequencing (WGS) technology to identify known antimicrobial resistance determinants among strains of nontyphoidal Salmonella and correlated these with susceptibility phenotypes to evaluate the utility of WGS for antimicrobial resistance surveillance. Six hundred forty Salmonella of 43 different serotypes were selected from among retail meat and human clinical isolates that were tested for susceptibility to 14 antimicrobials using broth microdilution. The MIC for each drug was used to categorize isolates as susceptible or resistant based on Clinical and Laboratory Standards Institute clinical breakpoints or National Antimicrobial Resistance Monitoring System (NARMS) consensus interpretive criteria. Each isolate was subjected to whole-genome shotgun sequencing, and resistance genes were identified from assembled sequences. A total of 65 unique resistance genes, plus mutations in two structural resistance loci, were identified. There were more unique resistance genes (n = 59) in the 104 human isolates than in the 536 retail meat isolates (n = 36). Overall, resistance genotypes and phenotypes correlated in 99.0% of cases. Correlations approached 100% for most classes of antibiotics but were lower for aminoglycosides and beta-lactams. We report the first finding of extended-spectrum β-lactamases (ESBLs) (blaCTX-M1 and blaSHV2a) in retail meat isolates of Salmonella in the United States. Whole-genome sequencing is an effective tool for predicting antibiotic resistance in nontyphoidal Salmonella, although the use of more appropriate surveillance breakpoints and increased knowledge of new resistance alleles will further improve correlations. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
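The 99.0% figure above is a concordance rate: the share of isolate-drug comparisons in which the genotype-based prediction agreed with the measured susceptibility phenotype. The sketch below shows the arithmetic under simple assumptions, with each comparison reduced to a pair of booleans (resistance determinant detected, phenotypically resistant). The function name and data layout are hypothetical and do not reproduce the study's actual pipeline.

```python
from collections.abc import Iterable


def genotype_phenotype_concordance(
    records: Iterable[tuple[bool, bool]],
) -> float:
    """Percent of comparisons where genotype and phenotype agree.

    records: (determinant_detected, phenotypically_resistant) pairs,
    one per isolate-drug combination.
    """
    records = list(records)
    if not records:
        raise ValueError("no records supplied")
    agree = sum(1 for genotype, phenotype in records if genotype == phenotype)
    return 100.0 * agree / len(records)
```

Computing the same rate separately per antibiotic class would expose the kind of lower agreement the abstract reports for aminoglycosides and beta-lactams.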
The Genome of the Netherlands: design, and project goals

PubMed Central

Boomsma, Dorret I; Wijmenga, Cisca; Slagboom, Eline P; Swertz, Morris A; Karssen, Lennart C; Abdellaoui, Abdel; Ye, Kai; Guryev, Victor; Vermaat, Martijn; van Dijk, Freerk; Francioli, Laurent C; Hottenga, Jouke Jan; Laros, Jeroen F J; Li, Qibin; Li, Yingrui; Cao, Hongzhi; Chen, Ruoyan; Du, Yuanping; Li, Ning; Cao, Sujie; van Setten, Jessica; Menelaou, Androniki; Pulit, Sara L; Hehir-Kwa, Jayne Y; Beekman, Marian; Elbers, Clara C; Byelas, Heorhiy; de Craen, Anton J M; Deelen, Patrick; Dijkstra, Martijn; den Dunnen, Johan T; de Knijff, Peter; Houwing-Duistermaat, Jeanine; Koval, Vyacheslav; Estrada, Karol; Hofman, Albert; Kanterakis, Alexandros; Enckevort, David van; Mai, Hailiang; Kattenberg, Mathijs; van Leeuwen, Elisabeth M; Neerincx, Pieter B T; Oostra, Ben; Rivadeneira, Fernanodo; Suchiman, Eka H D; Uitterlinden, Andre G; Willemsen, Gonneke; Wolffenbuttel, Bruce H; Wang, Jun; de Bakker, Paul I W; van Ommen, Gert-Jan; van Duijn, Cornelia M

2014-01-01

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project. PMID:23714750
LAMP detection assays for boxwood blight pathogens: A comparative genomics approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malapi-Wight, Martha; Demers, Jill E.; Veltri, Daniel

Rapid and accurate molecular diagnostic tools are critical to efforts to minimize the impact and spread of emergent pathogens. The identification of diagnostic markers for novel pathogens presents several challenges, especially in the absence of information about population diversity and where genetic resources are limited. The objective of this study was to use comparative genomics datasets to find unique target regions suitable for the diagnosis of two fungal species causing a newly emergent blight disease of boxwood. Candidate marker regions for loop-mediated isothermal amplification (LAMP) assays were identified from draft genomes of Calonectria henricotiae and C. pseudonaviculata, as well as three related species not associated with this disease. To increase the probability of identifying unique targets, we used three approaches to mine genome datasets, based on (i) unique regions, (ii) polymorphisms, and (iii) presence/absence of regions across datasets. From a pool of candidate markers, we demonstrate LAMP assay specificity by testing related fungal species, common boxwood pathogens, and environmental samples containing 445 diverse fungal taxa. In conclusion, this comparative-genomics-based approach to the development of LAMP diagnostic assays is the first of its kind for fungi and could be easily applied to diagnostic marker development for other newly emergent plant pathogens.

LAMP detection assays for boxwood blight pathogens: A comparative genomics approach

DOE PAGES

Malapi-Wight, Martha; Demers, Jill E.; Veltri, Daniel; ...

2016-05-20

Rapid and accurate molecular diagnostic tools are critical to efforts to minimize the impact and spread of emergent pathogens. The identification of diagnostic markers for novel pathogens presents several challenges, especially in the absence of information about population diversity and where genetic resources are limited. The objective of this study was to use comparative genomics datasets to find unique target regions suitable for the diagnosis of two fungal species causing a newly emergent blight disease of boxwood. Candidate marker regions for loop-mediated isothermal amplification (LAMP) assays were identified from draft genomes of Calonectria henricotiae and C. pseudonaviculata, as well as three related species not associated with this disease. To increase the probability of identifying unique targets, we used three approaches to mine genome datasets, based on (i) unique regions, (ii) polymorphisms, and (iii) presence/absence of regions across datasets. From a pool of candidate markers, we demonstrate LAMP assay specificity by testing related fungal species, common boxwood pathogens, and environmental samples containing 445 diverse fungal taxa. In conclusion, this comparative-genomics-based approach to the development of LAMP diagnostic assays is the first of its kind for fungi and could be easily applied to diagnostic marker development for other newly emergent plant pathogens.
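The presence/absence approach (iii) can be pictured as a set filter: keep regions found in every target genome and in none of the related, non-target genomes. The following is a minimal sketch under that assumption only. The region names, the "relative_N" labels, and the function name are hypothetical, and the abstract does not say how the authors actually implemented the step.

```python
def diagnostic_candidates(presence, targets, non_targets):
    """Regions present in every target genome and in no non-target genome."""
    targets, non_targets = set(targets), set(non_targets)
    return sorted(
        region for region, genomes in presence.items()
        if targets <= set(genomes) and not (set(genomes) & non_targets)
    )

# Toy presence table (made up): region -> genomes in which it was found.
presence = {
    "region_A": {"C. henricotiae", "C. pseudonaviculata"},
    "region_B": {"C. henricotiae", "C. pseudonaviculata", "relative_1"},
    "region_C": {"C. henricotiae"},
    "region_D": {"relative_2"},
}
candidates = diagnostic_candidates(
    presence,
    targets=["C. henricotiae", "C. pseudonaviculata"],
    non_targets=["relative_1", "relative_2", "relative_3"],
)
print(candidates)
```

Passing a single species as `targets` would instead select regions that separate one pathogen from the other, as long as the second pathogen is listed among `non_targets`.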
Mining the Giardia genome and proteome for conserved and unique basal body proteins

PubMed Central

Lauwaet, Tineke; Smith, Alias J.; Reiner, David S.; Romijn, Edwin P.; Wong, Catherine C. L.; Davids, Barbara J.; Shah, Sheila A.; Yates, John R.; Gillin, Frances D.

2015-01-01

Giardia lamblia is a flagellated protozoan parasite and a major cause of diarrhea in humans. Its microtubular cytoskeleton mediates trophozoite motility, attachment and cytokinesis, and is characterized by an attachment disk and eight flagella that are each nucleated in a basal body. To date, only 10 giardial basal body proteins have been identified, including universal signaling proteins that are important for regulating mitosis or differentiation. In this study, we have exploited bioinformatics and proteomic approaches to identify new Giardia basal body proteins and confocal microscopy to confirm their localization in interphase trophozoites. This approach identified 75 homologs of conserved basal body proteins in the genome including 65 not previously known to be associated with Giardia basal bodies. Thirteen proteins were confirmed to co-localize with centrin to the Giardia basal bodies. We also demonstrate that most basal body proteins localize to additional cytoskeletal structures in interphase trophozoites. This might help to explain the roles of the four pairs of flagella and Giardia-specific organelles in motility and differentiation. A deeper understanding of the composition of the Giardia basal bodies will contribute insights into the complex signaling pathways that regulate its unique cytoskeleton and the biological divergence of these conserved organelles. PMID:21723868
The draft genomes of soft–shell turtle and green sea turtle yield insights into the development and evolution of the turtle–specific body plan

PubMed Central

Niimura, Yoshihito; Huang, Zhiyong; Li, Chunyi; White, Simon; Xiong, Zhiqiang; Fang, Dongming; Wang, Bo; Ming, Yao; Chen, Yan; Zheng, Yuan; Kuraku, Shigehiro; Pignatelli, Miguel; Herrero, Javier; Beal, Kathryn; Nozawa, Masafumi; Li, Qiye; Wang, Juan; Zhang, Hongyan; Yu, Lili; Shigenobu, Shuji; Wang, Junyi; Liu, Jiannan; Flicek, Paul; Searle, Steve; Wang, Jun; Kuratani, Shigeru; Yin, Ye; Aken, Bronwen; Zhang, Guojie; Irie, Naoki

2014-01-01

The unique anatomical features of turtles have raised unanswered questions about the origin of their unique body plan. We generated and analyzed draft genomes of the soft-shell turtle (Pelodiscus sinensis) and the green sea turtle (Chelonia mydas); our results indicated the close relationship of the turtles to the bird-crocodilian lineage, from which they split ~267.9–248.3 million years ago (Upper Permian to Triassic). We also found extensive expansion of olfactory receptor genes in these turtles. Embryonic gene expression analysis identified an hourglass-like divergence of turtle and chicken embryogenesis, with maximal conservation around the vertebrate phylotypic period, rather than at later stages that show the amniote-common pattern. Wnt5a expression was found in the growth zone of the dorsal shell, supporting the possible co-option of limb-associated Wnt signaling in the acquisition of this turtle-specific novelty. Our results suggest that turtle evolution was accompanied by an unexpectedly conservative vertebrate phylotypic period, followed by turtle-specific repatterning of development to yield the novel structure of the shell. PMID:23624526
Unique physiology of host-parasite interactions in microsporidia infections.

PubMed

Williams, Bryony A P

2009-11-01

Microsporidia are intracellular parasites of all major animal lineages and have a described diversity of over 1200 species and an actual diversity that is estimated to be much higher. They are important pathogens of mammals, and are now one of the most common infections among immunocompromised humans. Although related to fungi, microsporidia are atypical in genomic biology, cell structure and infection mechanism. Host cell infection involves the rapid expulsion of a polar tube from a dormant spore to pierce the host cell membrane and allow the direct transfer of the spore contents into the host cell cytoplasm. This intimate relationship between parasite and host is unique. It allows the microsporidia to be highly exploitative of the host cell environment and cause such diverse effects as the induction of hypertrophied cells to harbour prolific spore development, host sex ratio distortion and host cell organelle and microtubule reorganization. Genome sequencing has revealed that microsporidia have achieved this high level of parasite sophistication with radically reduced proteomes and with many typical eukaryotic pathways pared-down to what appear to be minimal functional units. These traits make microsporidia intriguing model systems for understanding the extremes of reductive parasite evolution and host cell manipulation.
Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity.

PubMed

Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A; Awosika, Joy; Briska, Adam; Ptashkin, Ryan N; Wagner, Trevor; Rajanna, Chythanya; Tsang, Hsinyi; Johnson, Shannon L; Mokashi, Vishwesh P; Chain, Patrick S G; Sozhamannan, Shanmuga

2015-01-01

Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.
Scanning the landscape of genome architecture of non-O1 and non-O139 Vibrio cholerae by whole genome mapping reveals extensive population genetic diversity

DOE PAGES

Chapman, Carol; Henry, Matthew; Bishop-Lilly, Kimberly A.; ...

2015-03-20

Historically, cholera outbreaks have been linked to V. cholerae O1 serogroup strains or its derivatives of the O37 and O139 serogroups. A genomic study on the 2010 Haiti cholera outbreak strains highlighted the putative role of non O1/non-O139 V. cholerae in causing cholera and the lack of genomic sequences of such strains from around the world. Here we address these gaps by scanning a global collection of V. cholerae strains as a first step towards understanding the population genetic diversity and epidemic potential of non O1/non-O139 strains. Whole Genome Mapping (Optical Mapping) based bar coding produces a high resolution, ordered restriction map, depicting a complete view of the unique chromosomal architecture of an organism. To assess the genomic diversity of non-O1/non-O139 V. cholerae, we applied a Whole Genome Mapping strategy on a well-defined and geographically and temporally diverse strain collection, the Sakazaki serogroup type strains. Whole Genome Map data on 91 of the 206 serogroup type strains support the hypothesis that V. cholerae has an unprecedented genetic and genomic structural diversity. Interestingly, we discovered chromosomal fusions in two unusual strains that possess a single chromosome instead of the two chromosomes usually found in V. cholerae. We also found pervasive chromosomal rearrangements such as duplications and indels in many strains. The majority of Vibrio genome sequences currently in public databases are unfinished draft sequences. The Whole Genome Mapping approach presented here enables rapid screening of large strain collections to capture genomic complexities that would not have been otherwise revealed by unfinished draft genome sequencing and thus aids in assembling and finishing draft sequences of complex genomes. Furthermore, Whole Genome Mapping allows for prediction of novel V. cholerae non-O1/non-O139 strains that may have the potential to cause future cholera outbreaks.
Effective normalization for copy number variation detection from whole genome sequencing.

PubMed

Janevski, Angel; Varadan, Vinay; Kamalakaran, Sitharthan; Banerjee, Nilanjana; Dimitrova, Nevenka

2012-01-01

Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to the genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV estimates, but the impact of these changes on the estimated CNVs is not well characterized. We evaluate in detail the effect of normalization methodologies in two CNV algorithms, FREEC and CNV-seq, using whole genome sequencing data from 8 individuals spanning four populations. We apply FREEC and CNV-seq to a sequencing data set consisting of 8 genomes. We use multiple configurations corresponding to different read-count normalization methodologies in FREEC, and statistically characterize the concordance of the CNV calls between FREEC configurations and the analogous output from CNV-seq. The normalization methodologies evaluated in FREEC are: GC content, mappability and control genome. We further stratify the concordance analysis within genic, non-genic, and a collection of validated variant regions. The GC content normalization methodology generates the highest number of altered copy number regions. Both mappability and control genome normalization reduce the total number and length of copy number regions. Mappability normalization yields Jaccard indices in the 0.07–0.3 range, whereas using a control genome normalization yields Jaccard index values around 0.4 with normalization based on GC content. The most critical impact of using mappability as a normalization factor is a substantial reduction of deletion CNV calls. The output of another method based on control genome normalization, CNV-seq, resulted in comparable CNV call profiles, and substantial agreement in variable gene and CNV region calls. Choice of read-count normalization methodology has a substantial effect on CNV calls, and the use of genomic mappability or an appropriately chosen control genome can optimize the output of CNV analysis.
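To make the concordance measure concrete, here is a minimal sketch of a Jaccard index between two sets of CNV calls. It assumes the index is computed over base pairs (shared length divided by combined length) on a single chromosome, with half-open (start, end) coordinates. The abstract does not state whether the authors counted base pairs or whole regions, so treat this as one plausible reading. The function names and toy coordinates are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def covered_length(intervals):
    """Number of base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))

def jaccard(calls_a, calls_b):
    """Base-pair Jaccard index: |A intersect B| / |A union B|."""
    len_a = covered_length(calls_a)
    len_b = covered_length(calls_b)
    union = covered_length(list(calls_a) + list(calls_b))
    intersection = len_a + len_b - union  # inclusion-exclusion
    return intersection / union if union else 0.0

# Toy calls (made up) from two normalization configurations.
gc_calls = [(100, 200), (500, 700)]
mappability_calls = [(150, 250), (600, 650)]
score = jaccard(gc_calls, mappability_calls)
print(round(score, 3))
```

A value near 0 means the two configurations call almost disjoint regions, and 1 means identical coverage. Under that reading, the reported 0.07–0.3 range indicates limited overlap.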
Unexpected effects of different genetic backgrounds on identification of genomic rearrangements via whole-genome next generation sequencing.

PubMed

Chen, Zhangguo; Gowan, Katherine; Leach, Sonia M; Viboolsittiseri, Sawanee S; Mishra, Ameet K; Kadoishi, Tanya; Diener, Katrina; Gao, Bifeng; Jones, Kenneth; Wang, Jing H

2016-10-21

Whole genome next generation sequencing (NGS) is increasingly employed to detect genomic rearrangements in cancer genomes, especially in lymphoid malignancies. We recently established a unique mouse model by specifically deleting a key non-homologous end-joining DNA repair gene, Xrcc4, and a cell cycle checkpoint gene, Trp53, in germinal center B cells. This mouse model spontaneously develops mature B cell lymphomas (termed G1XP lymphomas). Here, we attempt to employ whole genome NGS to identify novel structural rearrangements, in particular inter-chromosomal translocations (CTXs), in these G1XP lymphomas. We sequenced six lymphoma samples, aligned our NGS data with the mouse reference genome (in the C57BL/6J (B6) background) and identified CTXs using the CREST algorithm. Surprisingly, we detected widespread CTXs in both lymphomas and wildtype control samples, the majority of which were false positives attributable to different genetic backgrounds. In addition, we validated our NGS pipeline by sequencing multiple control samples from distinct tissues of mice with different genetic backgrounds (B6 vs non-B6). Lastly, our studies showed that widespread false positive CTXs can be generated simply by aligning sequences from mice of different genetic backgrounds. We conclude that mapping and alignment to a reference genome might not be a preferred method for analyzing whole-genome NGS data obtained from a genetic background different from that of the reference genome. Given the complex genetic background of different mouse strains or the heterogeneity of cancer genomes in human patients, a preferred method for minimizing such systematic artifacts and uncovering novel CTXs might be de novo assembly of a personalized normal control genome and the cancer cell genome, instead of mapping and aligning NGS data to the mouse or human reference genome. Thus, our studies have a critical impact on the manner of data analysis for cancer genomics.
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.

PubMed

Šatović, Eva; Plohl, Miroslav

2017-10-01

Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, we describe in the clam Donax trunculus two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity and lack any nucleotide sequence similarity to each other. In a search for related elements in databases, a BLAST search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
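The two structural hallmarks named above, terminal inverted repeats and A + T richness, are easy to check on a candidate sequence. The sketch below measures the longest perfect TIR (the 5' end matching the reverse complement of the 3' end) and the A + T fraction. The toy sequence is invented for illustration and is not DTC M1 or DTC M2; the function names are hypothetical. Real TIR detection usually tolerates mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def tir_length(element):
    """Length of the longest perfect terminal inverted repeat:
    the number of leading bases that equal the reverse complement
    of the trailing bases, capped at half the element length."""
    rc = reverse_complement(element)
    length = 0
    for left, right in zip(element[: len(element) // 2], rc):
        if left != right:
            break
        length += 1
    return length

def at_fraction(seq):
    """Fraction of bases that are A or T."""
    return (seq.count("A") + seq.count("T")) / len(seq)

# Toy element (made up): 8 bp TIRs flanking an A-rich core.
toy_mite = "GGCCAGTC" + "AAAAAAAA" + "GACTGGCC"
print(tir_length(toy_mite), at_fraction(toy_mite))
```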
Comparative genome analysis of rice-pathogenic Burkholderia provides insight into capacity to adapt to different environments and hosts.

PubMed

Seo, Young-Su; Lim, Jae Yun; Park, Jungwook; Kim, Sunyoung; Lee, Hyun-Hee; Cheong, Hoon; Kim, Sang-Mok; Moon, Jae Sun; Hwang, Ingyu

2015-05-06

In addition to human and animal diseases, bacteria of the genus Burkholderia can cause plant diseases. The representative species of rice-pathogenic Burkholderia are Burkholderia glumae, B. gladioli, and B. plantarii, which primarily cause grain rot, sheath rot, and seedling blight, respectively, resulting in severe reductions in rice production. Though Burkholderia rice pathogens cause problems in rice-growing countries, comprehensive studies of these rice-pathogenic species aiming to control Burkholderia-mediated diseases are only in the early stages. We first sequenced the complete genome of B. plantarii ATCC 43733T. Second, we conducted comparative analysis of the newly sequenced B. plantarii ATCC 43733T genome with eleven complete or draft genomes of B. glumae and B. gladioli strains. Furthermore, we compared the genomes of the three rice Burkholderia pathogens with those of other Burkholderia species such as those found in environmental habitats and those known as animal/human pathogens. These B. glumae, B. gladioli, and B. plantarii strains have unique genes involved in toxoflavin or tropolone toxin production and the clustered regularly interspaced short palindromic repeats (CRISPR)-mediated bacterial immune system. Although the genome of B. plantarii ATCC 43733T has many common features with those of B. glumae and B. gladioli, this B. plantarii strain has several unique features, including quorum sensing and CRISPR/CRISPR-associated protein (Cas) systems. The complete genome sequence of B. plantarii ATCC 43733T and publicly available genomes of B. glumae BGR1 and B. gladioli BSR3 enabled comprehensive comparative genome analyses among three rice-pathogenic Burkholderia species responsible for tissue rotting and seedling blight. Our results suggest that B. glumae has evolved rapidly, or has undergone rapid genome rearrangements or deletions, in response to the hosts. It also clarifies the unique features of rice-pathogenic Burkholderia species relative to other animal and human Burkholderia species.
A High-Coverage Yersinia pestis Genome from a Sixth-Century Justinianic Plague Victim.

PubMed

Feldman, Michal; Harbeck, Michaela; Keller, Marcel; Spyrou, Maria A; Rott, Andreas; Trautmann, Bernd; Scholz, Holger C; Päffgen, Bernd; Peters, Joris; McCormick, Michael; Bos, Kirsten; Herbig, Alexander; Krause, Johannes

2016-11-01

The Justinianic Plague, which started in the sixth century and lasted to the mid eighth century, is thought to be the first of three historically documented plague pandemics causing massive casualties. Historical accounts and molecular data suggest the bacterium Yersinia pestis as its etiological agent. Here we present a new high-coverage (17.9-fold) Y. pestis genome obtained from a sixth-century skeleton recovered from a southern German burial site close to Munich. The reconstructed genome enabled the detection of 30 unique substitutions as well as structural differences that have not been previously described. We report indels affecting a lacI family transcription regulator gene as well as nonsynonymous substitutions in the nrdE, fadJ, and pcp genes, which have been suggested as plague virulence determinants or have been shown to be upregulated in different models of plague infection. In addition, we identify 19 false positive substitutions in a previously published lower-coverage Y. pestis genome from another archaeological site of the same time period and geographical region that is otherwise genetically identical to the high-coverage genome sequence reported here, suggesting low genetic diversity of the plague during the sixth century in rural southern Germany.
Genetics, Molecular, and Proteomics Advances in Filamentous Fungi.

PubMed

Sharma Ghimire, Prakriti; Jin, Cheng

2017-10-01

Filamentous fungi play a dynamic role in health and the environment. Their unique and complex hyphal structures are involved in their morphogenesis, integrity, synthesis, and degradation, according to environmental and physiological conditions and resource availability. In biotechnology, they have great value in the production of enzymes, pharmaceuticals, and food ingredients. The nomenclature of fungi overall began in early 1990, after which the study of their categorization, interior and exterior mechanisms, function, and molecular biology and genetics gathered pace. This mini-review emphasizes some of the important aspects of filamentous fungi: their life cycle pattern, their history, and the development of the different strategic methods applied to exploit these unique organisms. New trends and concepts related to genomics and systems biology that have been applied to overcome obstacles posed by their basic structure are presented. Furthermore, the future aspects and challenges that need to be deciphered to get a bigger and better picture of filamentous fungi are discussed.
The complete chloroplast genome sequence of Epipremnum aureum and its comparative analysis among eight Araceae species

PubMed Central

Han, Limin; Chen, Chen; Wang, Zhezhi

2018-01-01

Epipremnum aureum is an important foliage plant in the Araceae family. In this study, we have sequenced the complete chloroplast genome of E. aureum by using Illumina Hiseq sequencing platforms. This genome is a double-stranded circular DNA sequence of 164,831 bp that contains 35.8% GC. The two inverted repeats (IRa and IRb; 26,606 bp) are spaced by a small single-copy region (22,868 bp) and a large single-copy region (88,751 bp). The chloroplast genome has 131 (113 unique) functional genes, including 86 (79 unique) protein-coding genes, 37 (30 unique) tRNA genes, and eight (four unique) rRNA genes. Tandem repeats comprise the majority of the 43 long repetitive sequences. In addition, 111 simple sequence repeats are present, with mononucleotides being the most common type and di- and tetranucleotides being infrequent events. Positive selection pressure on rps12 in the E. aureum chloroplast has been demonstrated via synonymous and nonsynonymous substitution rates and selection pressure sites analyses. Ycf15 and infA are pseudogenes in this species. We constructed a Maximum Likelihood phylogenetic tree based on the complete chloroplast genomes of 38 species from 13 families. Those results strongly indicated that E. aureum is positioned as the sister of Colocasia esculenta within the Araceae family. This work may provide information for further study of the molecular phylogenetic relationships within Araceae, as well as molecular markers and breeding novel varieties by chloroplast genetic-transformation of E. aureum in particular. PMID:29529038
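The reported sizes can be cross-checked with simple arithmetic, assuming the usual quadripartite chloroplast layout in which the 26,606 bp figure applies to each of the two inverted repeat copies. The short sketch below uses only numbers from the abstract; the variable names are ours.

```python
# Region lengths in bp, as reported in the abstract.
large_single_copy = 88_751
small_single_copy = 22_868
inverted_repeat = 26_606  # assumed to be the length of each IR copy

genome_length = large_single_copy + small_single_copy + 2 * inverted_repeat

# Gene counts as reported: totals, and unique genes in parentheses.
total_genes = 86 + 37 + 8    # protein-coding + tRNA + rRNA
unique_genes = 79 + 30 + 4

print(genome_length, total_genes, unique_genes)
```

The sum matches the stated genome length of 164,831 bp, and the gene counts add up to the stated 131 total and 113 unique genes, so the figures in the abstract are internally consistent under this reading.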
Novel rod-shaped viruses isolated from garlic, Allium sativum, possessing a unique genome organization.

PubMed

Sumi, S; Tsuneyoshi, T; Furutani, H

1993-09-01

Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail of an estimated size of 10 kb as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that also at least four proteins are encoded in the viral cDNA, their M(r)s being estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.
Genomic expression patterns in medication overuse headaches

PubMed Central

Hershey, Andrew D; Burdine, Danny; Kabbouche, Marielle A; Powers, Scott W

2016-01-01

Background: Chronic daily headache (CDH) and chronic migraine (CM) are among the most frequent problems encountered in neurology, are often difficult to treat, and are frequently complicated by medication-overuse headache (MOH). Proper recognition of MOH may alter treatment outcome and prevent long term disability. Objective: This study identifies the unique genomic expression pattern of MOH that responds to cessation of the overused medication. Methods: Baseline occurrence of MOH and typical pattern of response to medication cessation were measured from a large database. Whole blood samples from patients with CM with or without MOH were obtained and their genomic profile was assessed. Affymetrix human U133 plus2 arrays were used to examine the genomic expression patterns prior to treatment and 6–12 weeks later. Headache characterisation and response to treatment based on headache frequency and disability were compared. Results: Of 1311 patients reporting daily or continuous headaches, 513 (39.1%) reported overusing analgesic medication. At follow-up, 44.5% had a 50% or greater reduction in headache frequency, while 41.6% had no change. Blood genomic expression patterns were obtained on 33 patients, 19 (57.6%) of whom were overusing analgesic medication, and showed a unique genomic expression pattern in MOH that responded to cessation of analgesics. Gene ontology of these samples indicated that a significant number of genes were involved with brain and immunological tissues, including multiple signalling pathways and apoptosis. Conclusions: Blood genomic patterns can accurately identify MOH patients that respond to medication cessation. These results suggest that MOH involves a unique molecular biology pathway that can be identified with a specific biomarker. PMID:20974594
Staphylococcus aureus genomics and the impact of horizontal gene transfer.

PubMed

Lindsay, Jodi A

2014-03-01

Whole genome sequencing and microarrays have revealed the population structure of Staphylococcus aureus, and identified epidemiological shifts, transmission routes, and adaptation of major clones. S. aureus genomes are highly diverse. This is partly due to a population structure of conserved lineages, each with unique combinations of genes encoding surface proteins, regulators, immune evasion and virulence pathways. Even more variable are the mobile genetic elements (MGE), which encode key proteins for antibiotic resistance, virulence and host-adaptation. MGEs can transfer at high frequency between isolates of the same lineage by horizontal gene transfer (HGT). There is increasing evidence that HGT is key to bacterial adaptation and success. Recent studies have shed light on new mechanisms of DNA transfer such as transformation, the identification of receptors for transduction, on integration of DNA pathways, mechanisms blocking transfer including CRISPR and new restriction systems, strategies for evasion of restriction barriers, as well as factors influencing MGE selection and stability. These studies have also led to new tools enabling construction of genetically modified clinical S. aureus isolates. This review will focus on HGT mechanisms and their importance in shaping the evolution of new clones adapted to antibiotic resistance, healthcare, communities and livestock.
Genome-Wide Association Study of Cardiac Structure and Systolic Function in African Americans: The Candidate Gene Association Resource (CARe) Study

PubMed Central

Fox, Ervin R.; Musani, Solomon K.; Barbalic, Maja; Lin, Honghuang; Yu, Bing; Ogunyankin, Kofo O.; Smith, Nicholas L.; Kutlar, Abdullah; Glazer, Nicole L.; Post, Wendy S.; Paltoo, Dina N.; Dries, Daniel L.; Farlow, Deborah N.; Duarte, Christine W.; Kardia, Sharon L.; Meyers, Kristin J.; Sun, Yan V.; Arnett, Donna K.; Patki, Amit A.; Sha, Jin; Cui, Xiangqui; Samdarshi, Tandaw E.; Penman, Alan D.; Bibbins-Domingo, Kirsten; Bůžková, Petra; Benjamin, Emelia J.; Bluemke, David A.; Morrison, Alanna C.; Heiss, Gerardo; Carr, J. Jeffrey; Tracy, Russell P.; Mosley, Thomas H.; Taylor, Herman A.; Psaty, Bruce M.; Heckbert, Susan R.; Cappola, Thomas P.; Vasan, Ramachandran S.

2013-01-01

Background Using data from four community-based cohorts of African Americans (AA), we tested the association between genome-wide markers (SNPs) and cardiac phenotypes in the Candidate-gene Association REsource (CARe) study. Methods and Results Among 6,765 AA, we related age, sex, height and weight-adjusted residuals for nine cardiac phenotypes (assessed by echocardiogram or MRI) to 2.5 million SNPs genotyped using Genome-Wide Affymetrix Human SNP Array 6.0 (Affy6.0) and the remainder imputed. Within-cohort genome-wide association analysis was conducted followed by meta-analysis across cohorts using inverse variance weights (genome-wide significance threshold=4.0 × 10^−07). Supplementary pathway analysis was performed. We attempted replication in 3 smaller cohorts of African ancestry and tested look-ups in one consortium of European ancestry (EchoGEN). Across the 9 phenotypes, variants in 4 genetic loci reached genome-wide significance: rs4552931 in UBE2V2 (p=1.43 × 10^−07) for left ventricular mass (LVM); rs7213314 in WIPI1 (p=1.68 × 10^−07) for LV internal diastolic diameter (LVIDD); rs1571099 in PPAPDC1A (p=2.57 × 10^−08) for interventricular septal wall thickness (IVST); and rs9530176 in KLF5 (p=4.02 × 10^−07) for ejection fraction (EF). Associated variants were enriched in three signaling pathways involved in cardiac remodeling. None of the 4 loci replicated in cohorts of African ancestry were confirmed in look-ups in EchoGEN. Conclusions In the largest GWAS of cardiac structure and function to date in AA, we identified 4 genetic loci related to LVM, IVST, LVIDD and EF that reached genome-wide significance. Replication results suggest that these loci may be unique to individuals of African ancestry. Additional large-scale studies are warranted for these complex phenotypes. PMID:23275298
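The abstract above pools within-cohort results by inverse-variance weighting. As a minimal sketch of that general technique (a standard fixed-effect model, not the CARe pipeline itself), each cohort's effect estimate is weighted by the reciprocal of its squared standard error. The function name and the per-cohort numbers below are hypothetical; only the significance threshold is taken from the abstract.

```python
import math

# Threshold quoted in the abstract; everything else here is illustrative.
GENOME_WIDE_THRESHOLD = 4.0e-7


def inverse_variance_meta(betas, std_errors):
    """Fixed-effect meta-analysis of per-cohort effect estimates.

    Each estimate is weighted by 1 / SE^2, so more precise cohorts
    contribute more to the pooled effect.
    """
    if not betas or len(betas) != len(std_errors):
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical estimates for one SNP in four cohorts (not data from the study).
cohort_betas = [0.12, 0.09, 0.15, 0.11]
cohort_ses = [0.04, 0.05, 0.06, 0.03]
beta, se, p = inverse_variance_meta(cohort_betas, cohort_ses)
print(f"pooled beta={beta:.4f}, SE={se:.4f}, p={p:.3g}, "
      f"genome-wide significant={p < GENOME_WIDE_THRESHOLD}")
```

A fixed-effect model assumes the same true effect in every cohort; a random-effects model would be the usual alternative when heterogeneity between cohorts is suspected.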
Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.

PubMed

Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L

2005-01-01

The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.
Novel genomic rearrangements mediated by multiple genetic elements in Streptococcus pyogenes M23ND confer potential for evolutionary persistence

PubMed Central

Bao, Yun-Juan; Liang, Zhong; Mayfield, Jeffrey A.; McShan, William M.; Lee, Shaun W.; Ploplis, Victoria A.; Castellino, Francis J.

2016-01-01

Symmetric genomic rearrangements around replication axes are commonly observed in prokaryotic genomes, including Group A Streptococcus (GAS). However, asymmetric rearrangements are rare. Our previous studies showed that the hypervirulent invasive GAS strain, M23ND, containing an inactivated transcriptional regulator system, covRS, exhibits unique extensive asymmetric rearrangements, which reconstructed a genomic structure distinct from other GAS genomes. In the current investigation, we identified the rearrangement events and examined the genetic consequences and evolutionary implications underlying the rearrangements. By comparison with a close phylogenetic relative, M18-MGAS8232, we propose a molecular model wherein a series of asymmetric rearrangements have occurred in M23ND, involving translocations, inversions and integrations mediated by multiple factors, viz., rRNA-comX (factor for late competence), transposons and phage-encoded gene segments. Assessments of the cumulative gene orientations and GC skews reveal that the asymmetric genomic rearrangements did not affect the general genomic integrity of the organism. However, functional distributions reveal re-clustering of a broad set of CovRS-regulated actively transcribed genes, including virulence factors and metabolic genes, to the same leading strand, with high confidence (p-value ~10^−10). The re-clustering of the genes suggests a potential selection advantage for the spatial proximity to the transcription complexes, which may contain the global transcriptional regulator, CovRS, and other RNA polymerases. Their proximities allow for efficient transcription of the genes required for growth, virulence and persistence. A new paradigm of survival strategies of GAS strains is provided through multiple genomic rearrangements, while, at the same time, maintaining genomic integrity. PMID:27329479
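The GC-skew assessment mentioned in the abstract above can be illustrated with a short sketch. It computes the per-window skew (G − C)/(G + C) and its running sum; in many bacterial chromosomes the cumulative curve turns near the replication origin and terminus, which is why it is used as a check on overall genome organization. The function name, window size and toy sequence are assumptions for illustration, not details from the study.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Return (per-window GC skew, cumulative GC skew) for a DNA sequence.

    Skew for a window is (G - C) / (G + C); windows with no G or C
    are assigned a skew of 0.0.
    """
    seq = sequence.upper()
    skews = []
    cumulative = []
    running_total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g_count = chunk.count("G")
        c_count = chunk.count("C")
        gc_total = g_count + c_count
        skew = (g_count - c_count) / gc_total if gc_total else 0.0
        running_total += skew
        skews.append(skew)
        cumulative.append(running_total)
    return skews, cumulative


# Toy sequence: a G-rich half followed by a C-rich half.
toy = "GGGAGGTG" * 4 + "CCCACCTC" * 4
window_skews, running = cumulative_gc_skew(toy, window=8)
print(window_skews)
print(running)
```

On a real genome the cumulative values would normally be plotted against genome position to locate the turning points.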

Evolutionary dynamics of protein domain architecture in plants

PubMed Central

2012-01-01

Background Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes. Results Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. Clear examples are seen of both the loss and gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplications. Indeed, there has been a decrease in the number of unique domain architectures when the genomes were normalized into a presumed ancestral genome that has not undergone whole genome duplications. Conclusions Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. 
Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution. PMID:22252370
The complete mitochondrial genome of Setaria digitata (Nematoda: Filarioidea): Mitochondrial gene content, arrangement and composition compared with other nematodes.

PubMed

Yatawara, Lalani; Wickramasinghe, Susiji; Rajapakse, R P V J; Agatsuma, Takeshi

2010-09-01

In the present study, we determined the complete mitochondrial (mt) genome sequence (13,839 bp) of the parasitic nematode Setaria digitata and compared its structure and organization with those of Onchocerca volvulus, Dirofilaria immitis and Brugia malayi. The mt genome of S. digitata is slightly larger than the mt genomes of other filarial nematodes. The S. digitata mt genome contains 36 genes (12 protein-coding genes, 22 transfer RNAs and 2 ribosomal RNAs) that are typically found in metazoans. This genome has a high A+T content (75.1%) and a low G+C content (24.9%). The mt gene order for S. digitata is the same as those for O. volvulus, D. immitis and B. malayi but it is distinctly different from other nematodes compared. The start codons inferred in the mt genome of S. digitata are TTT, ATT, TTG, ATG, GTT and ATA. Interestingly, the initiation codon TTT is unique to the S. digitata mt genome and four protein-coding genes use this codon as a translation initiation codon. Five protein-coding genes use TAG as a stop codon whereas three genes use TAA and four genes use T as a termination codon. Out of 64 possible codons, only 57 are used for mitochondrial protein-coding genes of S. digitata. T-rich codons such as TTT (18.9%), GTT (7.9%), TTG (7.8%), TAT (7%), ATT (5.7%), TCT (4.8%) and TTA (4.1%) are used more frequently. This pattern of codon usage reflects the strong bias for T in the mt genome of S. digitata. In conclusion, the present investigation provides new molecular data for future studies of the comparative mitochondrial genomics and systematics of parasitic nematodes of socio-economic importance. 2010 Elsevier B.V. All rights reserved.
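Codon-usage percentages and A+T content like those reported above are simple frequency counts over the sequence. The sketch below shows one way such figures could be computed; the function names and the toy coding sequence are hypothetical and are not taken from the S. digitata genome.

```python
from collections import Counter


def codon_usage(coding_sequence):
    """Return the relative frequency of each codon in an in-frame coding sequence.

    Trailing bases that do not form a complete codon are ignored.
    """
    seq = coding_sequence.upper()
    usable_length = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable_length, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: count / total for codon, count in counts.items()}


def at_content(sequence):
    """Return the fraction of A and T bases in a sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


# Toy T-rich coding sequence (illustrative only).
toy_cds = "TTTATTGTTTTTTTGTATTTTTAG"
for codon, frequency in sorted(codon_usage(toy_cds).items(),
                               key=lambda item: -item[1]):
    print(f"{codon}: {frequency:.1%}")
print(f"A+T content: {at_content(toy_cds):.1%}")
```

For a whole mitochondrial genome, the same counting would be applied to the concatenated protein-coding genes, each read in its own frame.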
Genomic Evolution of Saccharomyces cerevisiae under Chinese Rice Wine Fermentation

PubMed Central

Li, Yudong; Zhang, Weiping; Zheng, Daoqiong; Zhou, Zhan; Yu, Wenwen; Zhang, Lei; Feng, Lifang; Liang, Xinle; Guan, Wenjun; Zhou, Jingwen; Chen, Jian; Lin, Zhenguo

2014-01-01

Rice wine fermentation represents a unique environment for the evolution of the budding yeast, Saccharomyces cerevisiae. To understand how the selection pressure shaped the yeast genome and gene regulation, we determined the genome sequence and transcriptome of a S. cerevisiae strain YHJ7 isolated from Chinese rice wine (Huangjiu), a popular traditional alcoholic beverage in China. By comparing the genome of YHJ7 to the lab strain S288c, a Japanese sake strain K7, and a Chinese industrial bioethanol strain YJSH1, we identified many genomic sequence and structural variations in YHJ7, which are mainly located in subtelomeric regions, suggesting that these regions play an important role in genomic evolution between strains. In addition, our comparative transcriptome analysis between YHJ7 and S288c revealed a set of differentially expressed genes, including those involved in glucose transport (e.g., HXT2, HXT7) and oxidoreductase activity (e.g., AAD10, ADH7). Interestingly, many of these genomic and transcriptional variations are directly or indirectly associated with the adaptation of YHJ7 strain to its specific niches. Our molecular evolution analysis suggested that Japanese sake strains (K7/UC5) were derived from Chinese rice wine strains (YHJ7) at least approximately 2,300 years ago, providing the first molecular evidence elucidating the origin of Japanese sake strains. Our results depict interesting insights regarding the evolution of yeast during rice wine fermentation, and provide a valuable resource for genetic engineering to improve industrial wine-making strains. PMID:25212861
Genome comparison of two Magnaporthe oryzae field isolates reveals genome variations and potential virulence effectors

PubMed Central

2013-01-01

Background Rice blast caused by the fungus Magnaporthe oryzae is an important disease in virtually every rice growing region of the world, which leads to significant annual decreases of grain quality and yield. To prevent disease, resistance genes in rice have been cloned and introduced into susceptible cultivars. However, introduced resistance can often be broken within a few years of release, often due to mutation of cognate avirulence genes in fungal field populations. Results To better understand the pattern of mutation of M. oryzae field isolates under natural selection forces, we used a next generation sequencing approach to analyze the genomes of two field isolates FJ81278 and HN19311, as well as the transcriptome of FJ81278. By comparing the de novo genome assemblies of the two isolates against the finished reference strain 70–15, we identified extensive polymorphisms including unique genes, SNPs (single nucleotide polymorphisms) and indels, structural variations, copy number variations, and loci under strong positive selection. The 1.75 MB of isolate-specific genome content carrying 118 novel genes from FJ81278, and 0.83 MB from HN19311 were also identified. By analyzing secreted proteins carrying polymorphisms, in total 256 candidate virulence effectors were found and 6 were chosen for functional characterization. Conclusions We provide results from genome comparison analysis showing extensive genome variation, and generated a list of M. oryzae candidate virulence effectors for functional characterization. PMID:24341723
Culture independent genomic comparisons reveal environmental adaptations for Altiarchaeales

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bird, Jordan T.; Baker, Brett J.; Probst, Alexander J.

The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. 
These data suggested that the hamus structures found in Candidatus Altiarchaeum hamiconexum are not present outside of stream-adapted Altiarchaeales. Homologs to a Na+ transporter and membrane bound coenzyme A disulfide reductase that were unique to the brackish sediment Alti-2 genomes, could indicate adaptations to the estuarine, sulfur-rich environment.
Culture Independent Genomic Comparisons Reveal Environmental Adaptations for Altiarchaeales

PubMed Central

Baker, Brett J.; Probst, Alexander J.; Podar, Mircea; Lloyd, Karen G.

2016-01-01

The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. 
These data suggested that the hamus structures found in Candidatus Altiarchaeum hamiconexum are not present outside of stream-adapted Altiarchaeales. Homologs to a Na+ transporter and membrane bound coenzyme A disulfide reductase that were unique to the brackish sediment Alti-2 genomes, could indicate adaptations to the estuarine, sulfur-rich environment. PMID:27547202
Culture independent genomic comparisons reveal environmental adaptations for Altiarchaeales

DOE PAGES

Bird, Jordan T.; Baker, Brett J.; Probst, Alexander J.; ...

2016-08-05

The recently proposed candidatus order Altiarchaeales remains an uncultured archaeal lineage composed of genetically diverse, globally widespread organisms frequently observed in anoxic subsurface environments. In spite of 15 years of studies on the psychrophilic biofilm-producing Candidatus Altiarchaeum hamiconexum and its close relatives, very little is known about the phylogenetic and functional diversity of the widespread free-living marine members of this taxon. From methanogenic sediments in the White Oak River Estuary, NC, USA, we sequenced a single cell amplified genome (SAG), WOR_SM1_SCG, and used it to identify and refine two high-quality genomes from metagenomes, WOR_SM1_79 and WOR_SM1_86-2, from the same site. These three genomic reconstructions form a monophyletic group, which also includes three previously published genomes from metagenomes from terrestrial springs and a SAG from Sakinaw Lake in a group previously designated as pMC2A384. A synapomorphic mutation in the Altiarchaeales tRNA synthetase β subunit, pheT, caused the protein to be encoded as two subunits at non-adjacent loci. Consistent with the terrestrial spring clades, our estuarine genomes contained a near-complete autotrophic metabolism, H2 or CO as potential electron donors, a reductive acetyl-CoA pathway for carbon fixation, and methylotroph-like NADP(H)-dependent dehydrogenase. Phylogenies based on 16S rRNA genes and concatenated conserved proteins identified two distinct sub-clades of Altiarchaeales, Alti-1 populated by organisms from actively flowing springs, and Alti-2 which was more widespread, diverse, and not associated with visible mats. The core Alti-1 genome suggested Alti-1 is adapted for the stream environment with lipopolysaccharide production capacity and extracellular hami structures. The core Alti-2 genome suggested members of this clade are free-living with distinct mechanisms for energy maintenance, motility, osmoregulation, and sulfur redox reactions. 
These data suggested that the hamus structures found in Candidatus Altiarchaeum hamiconexum are not present outside of stream-adapted Altiarchaeales. Homologs to a Na+ transporter and membrane bound coenzyme A disulfide reductase that were unique to the brackish sediment Alti-2 genomes, could indicate adaptations to the estuarine, sulfur-rich environment.
Making the Bend: DNA Tertiary Structure and Protein-DNA Interactions

PubMed Central

Harteis, Sabrina; Schneider, Sabine

2014-01-01

DNA structure functions as an overlapping code to the DNA sequence. Rapid progress in understanding the role of DNA structure in gene regulation, DNA damage recognition and genome stability has been made. The three dimensional structure of both proteins and DNA plays a crucial role for their specific interaction, and proteins can recognise the chemical signature of DNA sequence (“base readout”) as well as the intrinsic DNA structure (“shape recognition”). These recognition mechanisms do not exist in isolation but, depending on the individual interaction partners, are combined to various extents. The driving force for the interaction between protein and DNA remains the unique thermodynamics of each individual DNA-protein pair. In this review we focus on the structures and conformations adopted by DNA, both influenced by and influencing the specific interaction with the corresponding protein binding partner, as well as their underlying thermodynamics. PMID:25026169
Biochemical studies of the Saccharomyces cerevisiae Mph1 helicase on junction-containing DNA structures

PubMed Central

Kang, Young-Hoon; Munashingha, Palinda Ruvan; Lee, Chul-Hwan; Nguyen, Tuan Anh; Seo, Yeon-Soo

2012-01-01

Saccharomyces cerevisiae Mph1 is a 3′–5′ DNA helicase, required for the maintenance of genome integrity. In order to understand the ATPase/helicase role of Mph1 in genome stability, we characterized its helicase activity with a variety of DNA substrates, focusing on its action on junction structures containing three or four DNA strands. Consistent with its 3′ to 5′ directionality, Mph1 displaced 3′-flap substrates in double-fixed or equilibrating flap substrates. Surprisingly, Mph1 displaced the 5′-flap strand more efficiently than the 3′ flap strand from double-flap substrates, which is not expected for a 3′–5′ DNA helicase. For this to occur, Mph1 required a threshold size (>5 nt) of 5′ single-stranded DNA flap. Based on the unique substrate requirements of Mph1 defined in this study, we propose that the helicase/ATPase activity of Mph1 plays roles in converting multiple-stranded DNA structures into structures cleavable by processing enzymes such as Fen1. We also found that the helicase activity of Mph1 was used to cause structural alterations required for restoration of replication forks stalled due to damaged template. The helicase properties of Mph1 reported here could explain how it resolves D-loop structure, and are in keeping with a model proposed for the error-free damage avoidance pathway. PMID:22090425
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.

PubMed

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-03-01

Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
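The Smith-Waterman algorithm named in the abstract above is a dynamic-programming method that finds the best-scoring local alignment between two sequences. The sketch below computes only the optimal score, keeping two rows of the matrix in memory. The linear gap penalty and the match/mismatch values are simplifying assumptions for illustration; the abstract does not state the scoring scheme used for ProteinWorldDB, and protein comparisons would normally use a substitution matrix rather than a single match score.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty:
    each cell is the maximum of 0, the diagonal move (match/mismatch),
    and the two gap moves.
    """
    previous_row = [0] * (len(b) + 1)
    best_score = 0
    for i in range(1, len(a) + 1):
        current_row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current_row[j] = max(
                0,                                # start a new local alignment
                previous_row[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous_row[j] + gap,            # gap in b
                current_row[j - 1] + gap,         # gap in a
            )
            best_score = max(best_score, current_row[j])
        previous_row = current_row
    return best_score


print(smith_waterman_score("TGTTACGG", "GGTTGACTA"))
```

Because every cell is filled, the cost grows with the product of the two sequence lengths, which is why exhaustive all-against-all comparisons of millions of sequences call for distributed computing of the kind the abstract describes.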
Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM).

PubMed

Skinnider, Michael A; Dejong, Chris A; Rees, Philip N; Johnston, Chad W; Li, Haoxin; Webster, Andrew L H; Wyatt, Morgan A; Magarvey, Nathan A

2015-11-16

Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The molecular diversity of α-gliadin genes in the tribe Triticeae.

PubMed

Qi, Peng-Fei; Chen, Qing; Ouellet, Thérèse; Wang, Zhao; Le, Cheng-Xing; Wei, Yu-Ming; Lan, Xiu-Jin; Zheng, You-Liang

2013-09-01

Many of the unique properties of wheat flour are derived from seed storage proteins such as the α-gliadins. In this study the α-gliadin genes from diploid Triticeae species were systematically characterized, and divided into 3 classes according to the distinct organization of their protein domains. Our analyses indicated that these α-gliadins varied in the number of cysteine residues they contained. Most of the α-gliadin genes were grouped according to their genomic origins within the phylogenetic tree. As expected, sequence alignments suggested that the repetitive domain and the two polyglutamine regions were responsible for length variations of α-gliadins as were the insertion/deletion of structural domains within the three different classes (I, II, and III) of α-gliadins. A screening of celiac disease toxic epitopes indicated that the class II α-gliadins, derived from the Ns genome, contain no epitope, and that some other genomes contain far fewer epitopes than the A, S(B) and D genomes of wheat. Our results suggest that the observed genetic differences in α-gliadins of Triticeae might indicate their use as a fertile ground for the breeding of less CD-toxic wheat varieties.
Genomics and bioinformatics in undergraduate curricula: Contexts for hybrid laboratory/lecture courses for entering and advanced science students.

PubMed

Temple, Louise; Cresawn, Steven G; Monroe, Jonathan D

2010-01-01

Emerging interest in genomics in the scientific community prompted biologists at James Madison University to create two courses at different levels to modernize the biology curriculum. The courses are hybrids of classroom and laboratory experiences. An upper level class uses raw sequence of a genome (plasmid or virus) as the subject on which to base the experience of genomic analysis. Students also learn bioinformatics and software programs needed to support a project linking structure and function in proteins and showing evolutionary relatedness of similar genes. An optional entry-level course taken in addition to the required first-year curriculum and sponsored in part by the Howard Hughes Medical Institute, engages first year students in a primary research project. In the first semester, they isolate and characterize novel bacteriophages that infect soil bacteria. In the second semester, these young scientists annotate the genes on one or more of the unique viruses they discovered. These courses are demanding but exciting for both faculty and students and should be accessible to any interested faculty member. Copyright © 2010 International Union of Biochemistry and Molecular Biology, Inc.
Insights into the genomic plasticity of Pseudomonas putida KF715, a strain with unique biphenyl-utilizing activity and genome instability properties.

PubMed

Suenaga, Hikaru; Fujihara, Hidehiko; Kimura, Nobutada; Hirose, Jun; Watanabe, Takahito; Futagami, Taiki; Goto, Masatoshi; Shimodaira, Jun; Furukawa, Kensuke

2017-10-01

Pseudomonas putida KF715 exhibits unique properties in both catabolic activity and genome plasticity. Our previous studies revealed that the DNA region containing biphenyl and salicylate metabolism gene clusters (termed the bph-sal element) was frequently deleted and transferred by conjugation to closely related P. putida strains. In this study, we first determined the complete nucleotide sequence of the KF715 genome. Next, to determine the underlying cause of genome plasticity in KF715, we compared the KF715 genome with the genomes of one KF715 defective mutant, two transconjugants, and several P. putida strains available from public databases. The gapless KF715 genome sequence revealed five replicons: one circular chromosome, and four plasmids. Southern blot analysis indicated that most of the KF715 cell population carries the bph-sal element on the chromosome whereas a small number carry it on a huge plasmid, pKF715A. Moreover, the bph-sal element is stably present on the plasmid and does not integrate into the chromosome of its transconjugants. Comparative genome analysis and experiments showed that a number of diverse putative genetic elements are present in KF715 and are likely involved in genome rearrangement. These data provide insights into the genetic plasticity and adaptability of microorganisms for survival in various ecological niches. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.
Long-read sequencing uncovers the adaptive topography of a carnivorous plant genome

PubMed Central

Lan, Tianying; Renner, Tanya; Ibarra-Laclette, Enrique; Farr, Kimberly M.; Chang, Tien-Hao; Cervantes-Pérez, Sergio Alan; Zheng, Chunfang; Sankoff, David; Tang, Haibao; Purbojati, Rikky W.; Putra, Alexander; Drautz-Moses, Daniela I.; Schuster, Stephan C.; Herrera-Estrella, Luis; Albert, Victor A.

2017-01-01

Utricularia gibba, the humped bladderwort, is a carnivorous plant that retains a tiny nuclear genome despite at least two rounds of whole genome duplication (WGD) since common ancestry with grapevine and other species. We used a third-generation genome assembly with several complete chromosomes to reconstruct the two most recent lineage-specific ancestral genomes that led to the modern U. gibba genome structure. Patterns of subgenome dominance in the most recent WGD, both architectural and transcriptional, are suggestive of allopolyploidization, which may have generated genomic novelty and led to instantaneous speciation. Syntenic duplicates retained in polyploid blocks are enriched for transcription factor functions, whereas gene copies derived from ongoing tandem duplication events are enriched in metabolic functions potentially important for a carnivorous plant. Among these are tandem arrays of cysteine protease genes with trap-specific expression that evolved within a protein family known to be useful in the digestion of animal prey. Further enriched functions among tandem duplicates (also with trap-enhanced expression) include peptide transport (intercellular movement of broken-down prey proteins), ATPase activities (bladder-trap acidification and transmembrane nutrient transport), hydrolase and chitinase activities (breakdown of prey polysaccharides), and cell-wall dynamic components possibly associated with active bladder movements. Whereas independently polyploid Arabidopsis syntenic gene duplicates are similarly enriched for transcriptional regulatory activities, Arabidopsis tandems are distinct from those of U. gibba, while still metabolic and likely reflecting unique adaptations of that species. Taken together, these findings highlight the special importance of tandem duplications in the adaptive landscapes of a carnivorous plant genome. PMID:28507139
Complete chloroplast genome sequence of a major invasive species, crofton weed (Ageratina adenophora).

PubMed

Nie, Xiaojun; Lv, Shuzuo; Zhang, Yingxin; Du, Xianghong; Wang, Le; Biradar, Siddanagouda S; Tan, Xiufang; Wan, Fanghao; Weining, Song

2012-01-01

Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, which causes serious economic losses and environmental damages worldwide. However, the sequence resource and genome information of A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. The A. adenophora cp genome is 150,689 bp in length including a small single-copy (SSC) region of 18,358 bp and a large single-copy (LSC) region of 84,815 bp separated by a pair of inverted repeats (IRs) of 23,755 bp. The genome contains 130 unique genes and 18 duplicated in the IR regions, with the gene content and organization similar to other Asteraceae cp genomes. Comparative analysis identified five DNA regions (ndhD-ccsA, psbI-trnS, ndhF-ycf1, ndhI-ndhG and atpA-trnR) containing parsimony-informative characters higher than 2%, which may be potential informative markers for barcoding and phylogenetic analysis. Repeat structure, codon usage and contraction of the IR were also investigated to reveal the pattern of evolution. Phylogenetic analysis demonstrated a sister relationship between A. adenophora and Guizotia abyssinica and supported a monophyly of the Asterales. We have assembled and analyzed the chloroplast genome of A. adenophora in this study, which was the first sequenced plastome in the Eupatorieae tribe. The complete chloroplast genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.
The First Complete Chloroplast Genome Sequences in Actinidiaceae: Genome Structure and Comparative Analysis.

PubMed

Yao, Xiaohong; Tang, Ping; Li, Zuozhou; Li, Dawei; Liu, Yifei; Huang, Hongwen

2015-01-01

Actinidia chinensis is an important economic plant belonging to the basal lineage of the asterids. Availability of a complete Actinidia chloroplast genome sequence is crucial to understanding phylogenetic relationships among major lineages of angiosperms and facilitates kiwifruit genetic improvement. We report here the complete nucleotide sequences of the chloroplast genomes for Actinidia chinensis and A. chinensis var. deliciosa obtained through de novo assembly of Illumina paired-end reads produced by total DNA sequencing. The total genome size ranges from 155,446 to 157,557 bp, with an inverted repeat (IR) of 24,013 to 24,391 bp, a large single copy region (LSC) of 87,984 to 88,337 bp and a small single copy region (SSC) of 20,332 to 20,336 bp. The genome encodes 113 different genes, including 79 unique protein-coding genes, 30 tRNA genes and 4 ribosomal RNA genes, with 16 duplicated in the inverted repeats, and a tRNA gene (trnfM-CAU) duplicated once in the LSC region. Comparisons of IR boundaries among four asterid species showed that IR/LSC borders were extended into the 5' portion of the psbA gene and IR contraction occurred in Actinidia. The clpP gene has been lost from the chloroplast genome in Actinidia, and may have been transferred to the nucleus during chloroplast evolution. Twenty-seven polymorphic simple sequence repeat (SSR) loci were identified in the Actinidia chloroplast genome. Maximum parsimony analyses of a 72-gene, 16 taxa angiosperm dataset strongly support the placement of Actinidiaceae in Ericales within the basal asterids.
Modular assembly of transposable element arrays by microsatellite targeting in the guayule and rice genomes.

PubMed

Valdes Franco, José A; Wang, Yi; Huo, Naxin; Ponciano, Grisel; Colvin, Howard A; McMahan, Colleen M; Gu, Yong Q; Belknap, William R

2018-04-19

Guayule (Parthenium argentatum A. Gray) is a rubber-producing desert shrub native to Mexico and the United States. Guayule represents an alternative to Hevea brasiliensis as a source for commercial natural rubber. The efficient application of modern molecular/genetic tools to guayule improvement requires characterization of its genome. The 1.6 Gb guayule genome was sequenced, assembled and annotated. The final 1.5 Gb assembly, while fragmented (N50 = 22 kb), maps >95% of the shotgun reads and is essentially complete. Approximately 40,000 transcribed, protein encoding genes were annotated on the assembly. Further characterization of this genome revealed 15 families of small, microsatellite-associated, transposable elements (TEs) with unexpected chromosomal distribution profiles. These SaTar (Satellite Targeted) elements, which are non-autonomous Mu-like elements (MULEs), were frequently observed in multimeric linear arrays of unrelated individual elements within which no individual element is interrupted by another. This uniformly non-nested TE multimer architecture has not been previously described in either eukaryotic or prokaryotic genomes. Five families of similarly distributed non-autonomous MULEs (microsatellite associated, modularly assembled) were characterized in the rice genome. Families of TEs with similar structures and distribution profiles were identified in sorghum and citrus. The sequencing and assembly of the guayule genome provides a foundation for application of current crop improvement technologies to this plant. In addition, characterization of this genome revealed SaTar elements with distribution profiles unique among TEs. SaTar targeting appears based on an alternative MULE recombination mechanism with the potential to impact gene evolution.
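The N50 statistic quoted in the abstract above summarizes how fragmented an assembly is: it is the contig length at which contigs of that length or longer account for at least half of the assembled bases. A minimal sketch of the calculation follows. The function name and the toy contig lengths are illustrative assumptions, not data or code from the guayule study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in kb (not guayule data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: the two longest contigs cover 150 of 300 kb
```

A low N50 relative to genome size, as reported for this assembly, indicates many short contigs even when nearly all reads map back to the assembly.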
Applied genomics in ruminants-new discoveries and model for predictive medicine

USDA-ARS's Scientific Manuscript database

An overview of the progress for Dr. Sonstegard’s work in applied genomics in dairy cattle will be presented. The overview will include how applied research in livestock offers unique investigative models to discover gene function as a result of genetic load or inbreeding and also how genome selectio...
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52

PubMed Central

2018-01-01

We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. PMID:29748396

Fanconi anemia proteins in telomere maintenance.

PubMed

Sarkar, Jaya; Liu, Yie

2016-07-01

Mammalian chromosome ends are protected by nucleoprotein structures called telomeres. Telomeres ensure genome stability by preventing chromosome termini from being recognized as DNA damage. Telomere length homeostasis is inevitable for telomere maintenance because critical shortening or over-lengthening of telomeres may lead to DNA damage response or delay in DNA replication, and hence genome instability. Due to their repetitive DNA sequence, unique architecture, bound shelterin proteins, and high propensity to form alternate/secondary DNA structures, telomeres are like common fragile sites and pose an inherent challenge to the progression of DNA replication, repair, and recombination apparatus. It is conceivable that the longer the telomeres are, the greater the severity of such challenges. Recent studies have linked excessively long telomeres with increased tumorigenesis. Here we discuss telomere abnormalities in a rare recessive chromosomal instability disorder called Fanconi Anemia and the role of the Fanconi Anemia pathway in telomere biology. Reports suggest that Fanconi Anemia proteins play a role in maintaining long telomeres, including processing telomeric joint molecule intermediates. We speculate that ablation of the Fanconi Anemia pathway would lead to inadequate aberrant structural barrier resolution at excessively long telomeres, thereby causing replicative burden on the cell. Published by Elsevier B.V.
Genome Re-Sequencing of Semi-Wild Soybean Reveals a Complex Soja Population Structure and Deep Introgression

PubMed Central

Wu, Sanling; Wang, Ying-Ying; Ye, Chu-Yu; Bai, Xuefei; Li, Zefeng; Yan, Chenghai; Wang, Weidi; Wang, Ziqiang; Shu, Qingyao; Xie, Jiahua; Lee, Suk-Ha; Fan, Longjiang

2014-01-01

Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou) and a wild line (Lanxi 1) collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1) no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2) besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3) high heterozygous rates (0.19–0.49) were observed in several semi-wild lines; and (4) over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure. PMID:25265539
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
An archaeal genomic signature

NASA Technical Reports Server (NTRS)

Graham, D. E.; Overbeek, R.; Olsen, G. J.; Woese, C. R.

2000-01-01

Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).
The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

PubMed Central

Rusch, Douglas B; Halpern, Aaron L; Sutton, Granger; Heidelberg, Karla B; Williamson, Shannon; Yooseph, Shibu; Wu, Dongying; Eisen, Jonathan A; Hoffman, Jeff M; Remington, Karin; Beeson, Karen; Tran, Bao; Smith, Hamilton; Baden-Tillson, Holly; Stewart, Clare; Thorpe, Joyce; Freeman, Jason; Andrews-Pfannkoch, Cynthia; Venter, Joseph E; Li, Kelvin; Kravitz, Saul; Heidelberg, John F; Utterback, Terry; Rogers, Yu-Hui; Falcón, Luisa I; Souza, Valeria; Bonilla-Rosso, Germán; Eguiarte, Luis E; Karl, David M; Sathyendranath, Shubha; Platt, Trevor; Bermingham, Eldredge; Gallardo, Victor; Tamayo-Castillo, Giselle; Ferrari, Michael R; Strausberg, Robert L; Nealson, Kenneth; Friedman, Robert; Frazier, Marvin; Venter, J. Craig

2007-01-01

The world's oceans contain a complex mixture of micro-organisms that are for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed “fragment recruitment,” addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed “extreme assembly,” made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. 
We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS. PMID:17355176
Molecular Evolution and Functional Diversification of Replication Protein A1 in Plants

PubMed Central

Aklilu, Behailu B.; Culligan, Kevin M.

2016-01-01

Replication protein A (RPA) is a heterotrimeric, single-stranded DNA binding complex required for eukaryotic DNA replication, repair, and recombination. RPA is composed of three subunits, RPA1, RPA2, and RPA3. In contrast to single RPA subunit genes generally found in animals and yeast, plants encode multiple paralogs of RPA subunits, suggesting subfunctionalization. Genetic analysis demonstrates that five Arabidopsis thaliana RPA1 paralogs (RPA1A to RPA1E) have unique and overlapping functions in DNA replication, repair, and meiosis. We hypothesize here that RPA1 subfunctionalities will be reflected in major structural and sequence differences among the paralogs. To address this, we analyzed amino acid and nucleotide sequences of RPA1 paralogs from 25 complete genomes representing a wide spectrum of plants and unicellular green algae. We find here that the plant RPA1 gene family is divided into three general groups termed RPA1A, RPA1B, and RPA1C, which likely arose from two progenitor groups in unicellular green algae. In the family Brassicaceae the RPA1B and RPA1C groups have further expanded to include two unique sub-functional paralogs RPA1D and RPA1E, respectively. In addition, RPA1 groups have unique domains, motifs, cis-elements, gene expression profiles, and pattern of conservation that are consistent with proposed functions in monocot and dicot species, including a novel C-terminal zinc-finger domain found only in plant RPA1C-like sequences. These results allow for improved prediction of RPA1 subunit functions in newly sequenced plant genomes, and potentially provide a unique molecular tool to improve classification of Brassicaceae species. PMID:26858742
Genome and metagenome analyses reveal adaptive evolution of the host and interaction with the gut microbiota in the goose

PubMed Central

Gao, Guangliang; Zhao, Xianzhi; Li, Qin; He, Chuan; Zhao, Wenjing; Liu, Shuyun; Ding, Jinmei; Ye, Weixing; Wang, Jun; Chen, Ye; Wang, Haiwei; Li, Jing; Luo, Yi; Su, Jian; Huang, Yong; Liu, Zuohua; Dai, Ronghua; Shi, Yixiang; Meng, He; Wang, Qigui

2016-01-01

The goose is an economically important waterfowl that exhibits unique characteristics and abilities, such as liver fat deposition and fibre digestion. Here, we report de novo whole-genome assemblies for the goose and swan goose and describe the evolutionary relationships among 7 bird species, including domestic and wild geese, which diverged approximately 3.4–6.3 million years ago (Mya). In contrast to chickens as a proximal species, the expanded and rapidly evolving genes found in the goose genome are mainly involved in metabolism, including energy, amino acid and carbohydrate metabolism. Further integrated analysis of the host genome and gut metagenome indicated that the most widely shared functional enrichment of genes occurs for functions such as glycolysis/gluconeogenesis, starch and sucrose metabolism, propanoate metabolism and the citrate cycle. We speculate that the unique physiological abilities of geese benefit from the adaptive evolution of the host genome and symbiotic interactions with gut microbes. PMID:27608918
The complete chloroplast genome sequence of strawberry (Fragaria × ananassa Duch.) and comparison with related species of Rosaceae

PubMed Central

Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong

2017-01-01

Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. 
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no less than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria. PMID:29038765
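The quadripartite structure described in the strawberry abstract above implies a simple consistency check: the total plastome length should equal the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below applies that arithmetic to the lengths reported in the abstract. The function name is an illustrative assumption.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) reported for F. x ananassa 'Benihoppe'.
total = quadripartite_length(lsc=85_531, ssc=18_146, ir=25_936)
print(total)  # 155549, which matches the reported genome length of 155,549 bp
```

The same check is a quick way to catch transcription errors when region lengths are copied between tables and text.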
Enhanced guide-RNA design and targeting analysis for precise CRISPR genome editing of single and consortia of industrially relevant and non-model organisms.

PubMed

Mendoza, Brian J; Trinh, Cong T

2018-01-01

Genetic diversity of non-model organisms offers a repertoire of unique phenotypic features for exploration and cultivation for synthetic biology and metabolic engineering applications. To realize this enormous potential, it is critical to have an efficient genome editing tool for rapid strain engineering of these organisms to perform novel programmed functions. To accommodate the use of CRISPR/Cas systems for genome editing across organisms, we have developed a novel method, named CRISPR Associated Software for Pathway Engineering and Research (CASPER), for identifying on- and off-targets with enhanced predictability coupled with an analysis of non-unique (repeated) targets to assist in editing any organism with various endonucleases. Utilizing CASPER, we demonstrated a modest 2.4% and significant 30.2% improvement (F-test, P < 0.05) over the conventional methods for predicting on- and off-target activities, respectively. Further we used CASPER to develop novel applications in genome editing: multitargeting analysis (i.e. simultaneous multiple-site modification on a target genome with a sole guide-RNA requirement) and multispecies population analysis (i.e. guide-RNA design for genome editing across a consortium of organisms). Our analysis on a selection of industrially relevant organisms revealed a number of non-unique target sites associated with genes and transposable elements that can be used as potential sites for multitargeting. The analysis also identified shared and unshared targets that enable genome editing of single or multiple genomes in a consortium of interest. We envision CASPER as a useful platform to enhance the precise CRISPR genome editing for metabolic engineering and synthetic biology applications. https://github.com/TrinhLab/CASPER. ctrinh@utk.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
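The distinction drawn in the abstract above between unique and non-unique (repeated) targets can be illustrated with a small scan. This is a minimal sketch under stated assumptions, not the CASPER algorithm: it assumes an NGG PAM (as used by SpCas9), searches only the forward strand, and ignores mismatches and scoring. The function names and the toy sequence are hypothetical.

```python
import re
from collections import Counter


def find_targets(seq, spacer_len=20):
    """Return every spacer that is immediately followed by an NGG PAM.

    Only the given strand is scanned (an assumption made for brevity).
    A lookahead is used so that overlapping sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    return [m.group(1) for m in pattern.finditer(seq.upper())]


def split_unique_repeated(spacers):
    """Partition spacers into those seen once and those seen more than once."""
    counts = Counter(spacers)
    unique = sorted(s for s, n in counts.items() if n == 1)
    repeated = sorted(s for s, n in counts.items() if n > 1)
    return unique, repeated


# Toy sequence with a shortened 4-nt spacer so the result is easy to follow.
toy = "AAAATGGCCCCAGGAAAATGG"
unique, repeated = split_unique_repeated(find_targets(toy, spacer_len=4))
print(unique)    # ['CCCC']
print(repeated)  # ['AAAA']
```

In this toy case the repeated spacer would be a candidate for multitargeting with a single guide, while the unique spacer would address one site only.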
Elucidation of the genome organization of tobacco mosaic virus.

PubMed Central

Zaitlin, M

1999-01-01

Proteins unique to tobacco mosaic virus (TMV)-infected plants were detected in the 1970s by electrophoretic analyses of extracts of virus-infected tissues, comparing their proteins to those generated in extracts of uninfected tissues. The genome organization of TMV was deduced principally from studies involving in vitro translation of proteins from the genomic and subgenomic messenger RNAs. The ultimate analysis of the TMV genome came in 1982 when P. Goelet and colleagues sequenced the entire genome. Studies leading to the elucidation of the TMV genome organization are described below. PMID:10212938
Analysis of BAC end sequences in oak, a keystone forest tree species, providing insight into the composition of its genome

PubMed Central

2011-01-01

Background: One of the key goals of oak genomics research is to identify genes of adaptive significance. This information may help to improve the conservation of adaptive genetic variation and the management of forests to increase their health and productivity. Deep-coverage large-insert genomic libraries are a crucial tool for attaining this objective. We report herein the construction of a BAC library for Quercus robur, its characterization and an analysis of BAC end sequences. Results: The EcoRI library generated consisted of 92,160 clones, 7% of which had no insert. Levels of chloroplast and mitochondrial contamination were below 3% and 1%, respectively. Mean clone insert size was estimated at 135 kb. The library represents 12 haploid genome equivalents, and the likelihood of finding a particular oak sequence of interest is greater than 99%. Genome coverage was confirmed by PCR screening of the library with 60 unique genetic loci sampled from the genetic linkage map. In total, about 20,000 high-quality BAC end sequences (BESs) were generated by sequencing 15,000 clones. Roughly 5.88% of the combined BAC end sequence length corresponded to known retroelements while ab initio repeat detection methods identified 41 additional repeats. Collectively, characterized and novel repeats account for roughly 8.94% of the genome. Further analysis of the BESs revealed 1,823 putative genes suggesting at least 29,340 genes in the oak genome. BESs were aligned with the genome sequences of Arabidopsis thaliana, Vitis vinifera and Populus trichocarpa. One putative collinear microsyntenic region encoding an alcohol acyl transferase protein was observed between oak and chromosome 2 of V. vinifera. Conclusions: This BAC library provides a new resource for genomic studies, including SSR marker development, physical mapping, comparative genomics and genome sequencing. BES analysis provided insight into the structure of the oak genome. 
These sequences will be used in the assembly of a future genome sequence for oak. PMID:21645357
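The link between library coverage and the chance of recovering a given sequence, stated in the oak BAC abstract above, can be illustrated with the standard Poisson approximation for a random clone library, P = 1 - exp(-c), where c is the number of genome equivalents. The abstract does not say which formula the authors used, so this is a sketch under that assumption. The function name is illustrative.

```python
import math


def prob_sequence_present(coverage):
    """Probability that a given locus is in at least one clone.

    Uses the Poisson approximation for a randomly sampled library,
    where `coverage` is the number of haploid genome equivalents.
    """
    return 1.0 - math.exp(-coverage)


# The abstract reports 12 haploid genome equivalents.
p = prob_sequence_present(12)
print(p > 0.99)  # True, consistent with the reported ">99%" likelihood
```

Under this approximation, about 4.6 genome equivalents already give a 99% chance, so a 12-fold library leaves a wide margin for non-random cloning bias.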
Distinctive Architecture of the Chloroplast Genome in the Chlorodendrophycean Green Algae Scherffelia dubia and Tetraselmis sp. CCMP 881.

PubMed

Turmel, Monique; de Cambiaire, Jean-Charles; Otis, Christian; Lemieux, Claude

2016-01-01

The Chlorodendrophyceae is a small class of green algae belonging to the core Chlorophyta, an assemblage that also comprises the Pedinophyceae, Trebouxiophyceae, Ulvophyceae and Chlorophyceae. Here we describe for the first time the chloroplast genomes of chlorodendrophycean algae (Scherffelia dubia, 137,161 bp; Tetraselmis sp. CCMP 881, 100,264 bp). Characterized by a very small single-copy (SSC) region devoid of any gene and an unusually large inverted repeat (IR), the quadripartite structures of the Scherffelia and Tetraselmis genomes are unique among all core chlorophytes examined thus far. The lack of genes in the SSC region is offset by the rich and atypical gene complement of the IR, which includes genes from the SSC and large single-copy regions of prasinophyte and streptophyte chloroplast genomes having retained an ancestral quadripartite structure. Remarkably, seven of the atypical IR-encoded genes have also been observed in the IRs of pedinophycean and trebouxiophycean chloroplast genomes, suggesting that they were already present in the IR of the common ancestor of all core chlorophytes. Considering that the relationships among the main lineages of the core Chlorophyta are still unresolved, we evaluated the impact of including the Chlorodendrophyceae in chloroplast phylogenomic analyses. The trees we inferred using data sets of 79 and 108 genes from 71 chlorophytes indicate that the Chlorodendrophyceae is a deep-diverging lineage of the core Chlorophyta, although the placement of this class relative to the Pedinophyceae remains ambiguous. Interestingly, some of our phylogenomic trees together with our comparative analysis of gene order data support the monophyly of the Trebouxiophyceae, thus offering further evidence that the previously observed affiliation between the Chlorellales and Pedinophyceae is the result of systematic errors in phylogenetic reconstruction.
Lineage tracing of genome-edited alleles reveals high fidelity axolotl limb regeneration.

PubMed

Flowers, Grant Parker; Sanor, Lucas D; Crews, Craig M

2017-09-16

Salamanders are unparalleled among tetrapods in their ability to regenerate many structures, including entire limbs, and the study of this ability may provide insights into human regenerative therapies. The complex structure of the limb poses challenges to the investigation of the cellular and molecular basis of its regeneration. Using CRISPR/Cas, we genetically labelled unique cell lineages within the developing axolotl embryo and tracked the frequency of each lineage within amputated and fully regenerated limbs. This allowed us, for the first time, to assess the contributions of multiple low frequency cell lineages to the regenerating limb at once. Our comparisons reveal that regenerated limbs are high fidelity replicas of the originals even after repeated amputations.
Major repeat components covering one-third of the ginseng (Panax ginseng C.A. Meyer) genome and evidence for allotetraploidy.

PubMed

Choi, Hong-Il; Waminal, Nomar E; Park, Hye Mi; Kim, Nam-Hoon; Choi, Beom Soon; Park, Minkyu; Choi, Doil; Lim, Yong Pyo; Kwon, Soo-Jin; Park, Beom-Seok; Kim, Hyun Hee; Yang, Tae-Jin

2014-03-01

Ginseng (Panax ginseng) is a famous medicinal herb, but the composition and structure of its genome are largely unknown. Here we characterized the major repeat components and inspected their distribution in the ginseng genome. By analyzing three repeat-rich bacterial artificial chromosome (BAC) sequences from ginseng, we identified complex insertion patterns of 34 long terminal repeat retrotransposons (LTR-RTs) and 11 LTR-RT derivatives accounting for more than 80% of the BAC sequences. The LTR-RTs were classified into three Ty3/gypsy (PgDel, PgTat and PgAthila) and two Ty1/Copia (PgTork and PgOryco) families. Mapping of 30-Gbp Illumina whole-genome shotgun reads to the BAC sequences revealed that these five LTR-RT families occupy at least 34% of the ginseng genome. The Ty3/Gypsy families were predominant, comprising 74 and 33% of the BAC sequences and the genome, respectively. In particular, the PgDel family accounted for 29% of the genome and presumably played major roles in enlargement of the size of the ginseng genome. Fluorescence in situ hybridization (FISH) revealed that the PgDel1 elements are distributed throughout the chromosomes along dispersed heterochromatic regions except for ribosomal DNA blocks. The intensity of the PgDel2 FISH signals was biased toward 24 out of 48 chromosomes. Unique gene probes showed two pairs of signals with different locations, one pair in subtelomeric regions on PgDel2-rich chromosomes and the other in interstitial regions on PgDel2-poor chromosomes, demonstrating allotetraploidy in ginseng. Our findings promote understanding of the evolution of the ginseng genome and of that of related species in the Araliaceae. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

PubMed

Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.
Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis

PubMed Central

Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

2016-01-01

Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros ‘Jinzaoshi’ were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. ‘Jinzaoshi’, support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales. PMID:27442423
Parallel evolution of Streptococcus pneumoniae and Streptococcus mitis to pathogenic and mutualistic lifestyles.

PubMed

Kilian, Mogens; Riley, David R; Jensen, Anders; Brüggemann, Holger; Tettelin, Hervé

2014-07-22

The bacterium Streptococcus pneumoniae is one of the leading causes of fatal infections affecting humans. Intriguingly, phylogenetic analysis shows that the species constitutes one evolutionary lineage in a cluster of the otherwise commensal Streptococcus mitis strains, with which humans live in harmony. In a comparative analysis of 35 genomes, including phylogenetic analyses of all predicted genes, we have shown that the pathogenic pneumococcus has evolved into a master of genomic flexibility while lineages that evolved into the nonpathogenic S. mitis secured harmonious coexistence with their host by stabilizing an approximately 15%-reduced genome devoid of many virulence genes. Our data further provide evidence that interspecies gene transfer between S. pneumoniae and S. mitis occurs in a unidirectional manner, i.e., from S. mitis to S. pneumoniae. Import of genes from S. mitis and other mitis, anginosus, and salivarius group streptococci ensured allelic replacements and antigenic diversification and has been driving the evolution of the remarkable structural diversity of capsular polysaccharides of S. pneumoniae. Our study explains how the unique structural diversity of the pneumococcal capsule emerged and conceivably will continue to increase and reveals a striking example of the fragile border between the commensal and pathogenic lifestyles. While genomic plasticity enabling quick adaptation to environmental stress is a necessity for the pathogenic streptococci, the commensal lifestyle benefits from stability. Importance: One of the leading causes of fatal infections affecting humans, Streptococcus pneumoniae, and the commensal Streptococcus mitis are closely related obligate symbionts associated with hominids. Faced with a shortage of accessible hosts, the two opposing lifestyles evolved in parallel. We have shown that the nonpathogenic S. mitis secured harmonious coexistence with its host by stabilizing a reduced genome devoid of many virulence genes. 
Meanwhile, the pathogenic pneumococcus evolved into a master of genomic flexibility and imports genes from S. mitis and other related streptococci. This process ensured antigenic diversification and has been driving the evolution of the remarkable structural diversity of capsular polysaccharides of S. pneumoniae, which conceivably will continue to increase and present a challenge to disease prevention. Copyright © 2014 Kilian et al.
Extensive genetic diversity, unique population structure and evidence of genetic exchange in the sexually transmitted parasite Trichomonas vaginalis.

PubMed

Conrad, Melissa D; Gorman, Andrew W; Schillinger, Julia A; Fiori, Pier Luigi; Arroyo, Rossana; Malla, Nancy; Dubey, Mohan Lal; Gonzalez, Jorge; Blank, Susan; Secor, William E; Carlton, Jane M

2012-01-01

Trichomonas vaginalis is the causative agent of human trichomoniasis, the most common non-viral sexually transmitted infection world-wide. Despite its prevalence, little is known about the genetic diversity and population structure of this haploid parasite due to the lack of appropriate tools. The development of a panel of microsatellite markers and SNPs from mining the parasite's genome sequence has paved the way to a global analysis of the genetic structure of the pathogen and association with clinical phenotypes. Here we utilize a panel of T. vaginalis-specific genetic markers to genotype 235 isolates from Mexico, Chile, India, Australia, Papua New Guinea, Italy, Africa and the United States, including 19 clinical isolates recently collected from 270 women attending New York City sexually transmitted disease clinics. Using population genetic analysis, we show that T. vaginalis is a genetically diverse parasite with a unique population structure consisting of two types present in equal proportions world-wide. Parasites belonging to the two types (type 1 and type 2) differ significantly in the rate at which they harbor the T. vaginalis virus, a dsRNA virus implicated in parasite pathogenesis, and in their sensitivity to the widely-used drug, metronidazole. We also uncover evidence of genetic exchange, indicating a sexual life-cycle of the parasite despite an absence of morphologically-distinct sexual stages. Our study represents the first robust and comprehensive evaluation of global T. vaginalis genetic diversity and population structure. Our identification of a unique two-type structure, and the clinically relevant phenotypes associated with them, provides a new dimension for understanding T. vaginalis pathogenesis. In addition, our demonstration of the possibility of genetic exchange in the parasite has important implications for genetic research and control of the disease.
15N indicates an active N-cycling microbial community in low carbon, freshwater sediments.

NASA Astrophysics Data System (ADS)

Sheik, C.

2017-12-01

Earth's large lakes are unique aquatic ecosystems, but we know little of the microbial life driving sedimentary biogeochemical cycles and ultimately the isotopic record. In several of these large lakes, water column productivity is constrained by element limitation, such as phosphorus and iron, creating oligotrophic water column conditions that drive low organic matter content in sediments. Yet, these sediments are biogeochemically active and have been shown to have oxygen consumption rates akin to pelagic ocean sediments and complex sulfur cycling dynamics. Thus, large oligotrophic lakes provide unique and interesting biogeochemical contrast to highly productive freshwater and coastal marine systems. Using Lake Superior as our study site, we found microbial community structure followed patterns in bulk sediment carbon and nitrogen concentrations. These observed patterns were loosely driven by land proximity, as some stations are more coastal and have higher rates of sedimentation, allochthonous carbon inputs and productivity than pelagic sites. Interestingly, upper sediment carbon and nitrogen stable isotopes were quite different from water column. Sediment carbon and nitrogen isotopes correlated significantly with microbial community structure. However, 15N showed much stronger correlation than 13C, and became heavier with core depth. Coinciding with the increase in 15N values, we see evidence of both denitrification and anammox processes in 16S rRNA gene libraries and metagenome assembled genomes. Given that microorganisms prefer light isotopes and that these N-cycling processes both contribute to N2 production and efflux from the sediment, the increase in 15N with sediment depth suggests microbial turnover. Abundance of these genomes also varies with depth suggesting these novel microorganisms are partitioning into specific sediment geochemical zones. 
Additionally, several of these genomes contain genes involved in sulfur cycling, suggesting a dual biogeochemical role and potential for a cryptic sulfur cycle. Together, Lake Superior sediments offer a glimpse into microbial metabolism in carbon-limited environments. Further, the pervasiveness of co-metabolic pathways suggests interpretation of isotopic records may be messier than previously thought.
The sea cucumber genome provides insights into morphological evolution and visceral regeneration

PubMed Central

Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B.; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng

2017-01-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs. PMID:29023486

The sea cucumber genome provides insights into morphological evolution and visceral regeneration.

PubMed

Zhang, Xiaojun; Sun, Lina; Yuan, Jianbo; Sun, Yamin; Gao, Yi; Zhang, Libin; Li, Shihao; Dai, Hui; Hamel, Jean-François; Liu, Chengzhang; Yu, Yang; Liu, Shilin; Lin, Wenchao; Guo, Kaimin; Jin, Songjun; Xu, Peng; Storey, Kenneth B; Huan, Pin; Zhang, Tao; Zhou, Yi; Zhang, Jiquan; Lin, Chenggang; Li, Xiaoni; Xing, Lili; Huo, Da; Sun, Mingzhe; Wang, Lei; Mercier, Annie; Li, Fuhua; Yang, Hongsheng; Xiang, Jianhai

2017-10-01

Apart from sharing common ancestry with chordates, sea cucumbers exhibit a unique morphology and exceptional regenerative capacity. Here we present the complete genome sequence of an economically important sea cucumber, A. japonicus, generated using Illumina and PacBio platforms, to achieve an assembly of approximately 805 Mb (contig N50 of 190 Kb and scaffold N50 of 486 Kb), with 30,350 protein-coding genes and high continuity. We used this resource to explore key genetic mechanisms behind the unique biological characters of sea cucumbers. Phylogenetic and comparative genomic analyses revealed the presence of marker genes associated with notochord and gill slits, suggesting that these chordate features were present in ancestral echinoderms. The unique shape and weak mineralization of the sea cucumber adult body were also preliminarily explained by the contraction of biomineralization genes. Genome, transcriptome, and proteome analyses of organ regrowth after induced evisceration provided insight into the molecular underpinnings of visceral regeneration, including a specific tandem-duplicated prostatic secretory protein of 94 amino acids (PSP94)-like gene family and a significantly expanded fibrinogen-related protein (FREP) gene family. This high-quality genome resource will provide a useful framework for future research into biological processes and evolution in deuterostomes, including remarkable regenerative abilities that could have medical applications. Moreover, the multiomics data will be of prime value for commercial sea cucumber breeding programs.
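
The contig and scaffold N50 values quoted above are summary statistics of assembly contiguity: N50 is the length L such that sequences of length L or longer together cover at least half of the total assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are illustrative assumptions and do not come from the A. japonicus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least 50% of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not real data.
toy_contigs = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_contigs))  # total is 32; the two longest contigs reach 16, so this prints 8
```

Applied separately to contig lengths and to scaffold lengths, the same function would give the two kinds of N50 reported in the abstract.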
Structural and Functional Studies of Archaeal Viruses

PubMed Central

Lawrence, C. Martin; Menon, Smita; Eilers, Brian J.; Bothner, Brian; Khayat, Reza; Douglas, Trevor; Young, Mark J.

2009-01-01

Viruses populate virtually every ecosystem on the planet, including the extreme acidic, thermal, and saline environments where archaeal organisms can dominate. For example, recent studies have identified crenarchaeal viruses in the hot springs of Yellowstone National Park and other high temperature environments worldwide. These viruses are often morphologically and genetically unique, with genomes that show little similarity to genes of known function, complicating efforts to understand their viral life cycles. Here, we review progress in understanding these fascinating viruses at the molecular level and the evolutionary insights coming from these studies. PMID:19158076
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis.

PubMed

Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.
Genome sequence of an aflatoxigenic pathogen of Argentinian peanut, Aspergillus arachidicola

USDA-ARS's Scientific Manuscript database

In this study we sequenced the genome of the A. arachidicola Type strain (CBS 117610) and found its genome size to be 38.9 Mb, and its number of predicted genes to be 12,091, which are values comparable to those in other sequenced Aspergilli. Of its predicted genes, 691 were identified as unique to ...
Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52.

PubMed

Kent, Brenna; Raymond, Thomas; Mosier, Philip D; Johnson, Allison A

2018-05-10

We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. Copyright © 2018 Kent et al.
Odonata (dragonflies and damselflies) as a bridge between ecology and evolutionary genomics.

PubMed

Bybee, Seth; Córdoba-Aguilar, Alex; Duryea, M Catherine; Futahashi, Ryo; Hansson, Bengt; Lorenzo-Carballa, M Olalla; Schilder, Ruud; Stoks, Robby; Suvorov, Anton; Svensson, Erik I; Swaegers, Janne; Takahashi, Yuma; Watts, Phillip C; Wellenreuther, Maren

2016-01-01

Odonata (dragonflies and damselflies) present an unparalleled insect model to integrate evolutionary genomics with ecology for the study of insect evolution. Key features of Odonata include their ancient phylogenetic position, extensive phenotypic and ecological diversity, several unique evolutionary innovations, ease of study in the wild and usefulness as bioindicators for freshwater ecosystems worldwide. In this review, we synthesize studies on the evolution, ecology and physiology of odonates, highlighting those areas where the integration of ecology with genomics would yield significant insights into the evolutionary processes that would not be gained easily by working on other animal groups. We argue that the unique features of this group combined with their complex life cycle, flight behaviour, diversity in ecological niches and their sensitivity to anthropogenic change make odonates a promising and fruitful taxon for genomics focused research. Future areas of research that deserve increased attention are also briefly outlined.
Directed evolution approach to a structural genomics project: Rv2002 from Mycobacterium tuberculosis.

PubMed

Yang, Jin Kuk; Park, Min S; Waldo, Geoffrey S; Suh, Se Won

2003-01-21

One of the serious bottlenecks in structural genomics projects is overexpression of the target proteins in soluble form. We have applied the directed evolution technique and prepared soluble mutants of the Mycobacterium tuberculosis Rv2002 gene product, the wild type of which had been expressed as inclusion bodies in Escherichia coli. A triple mutant I6T/V47M/T69K (Rv2002-M3) was chosen for structural and functional characterizations. Enzymatic assays indicate that the Rv2002-M3 protein has a high catalytic activity as a NADH-dependent 3α,20β-hydroxysteroid dehydrogenase. We have determined the crystal structures of a binary complex with NAD(+) and a ternary complex with androsterone and NADH. The structure reveals that Asp-38 determines the cofactor specificity. The catalytic site includes the triad Ser-140/Tyr-153/Lys-157. Additionally, it has an unusual feature, Glu-142. Enzymatic assays of the E142A mutant of Rv2002-M3 indicate that Glu-142 reverses the effect of Lys-157 in influencing the pKa of Tyr-153. This study suggests that the Rv2002 gene product is a unique member of the SDR family and is likely to be involved in steroid metabolism in M. tuberculosis. Our work demonstrates the power of the directed evolution technique as a general way of overcoming the difficulties in overexpressing the target proteins in soluble form.
MACF1 gene structure: a hybrid of plectin and dystrophin.

PubMed

Gong, T W; Besirli, C G; Lomax, M I

2001-11-01

Mammalian MACF1 (Macrophin1; previously named ACF7) is a giant cytoskeletal linker protein with three known isoforms that arise by alternative splicing. We isolated a 19.1-kb cDNA encoding a fourth isoform (MACF1-4) with a unique N-terminus. Instead of an N-terminal actin-binding domain found in the other three isoforms, MACF1-4 has eight plectin repeats. The MACF1 gene is located on human Chr 1p32, contains at least 102 exons, spans over 270 kb, and gives rise to four major isoforms with different N-termini. The genomic organization of the actin-binding domain is highly conserved in mammalian genes for both plectin and BPAG1. All eight plectin repeats are encoded by one large exon; this feature is similar to the genomic structure of plectin. The intron positions within spectrin repeats in MACF1 are very similar to those in the dystrophin gene. This demonstrates that MACF1 has characteristic features of genes for two classes of cytoskeletal proteins, i.e., plectin and dystrophin.
Mutational analysis of three predicted 5'-proximal stem-loop structures in the genome of tick-borne encephalitis virus indicates different roles in RNA replication and translation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rouha, Harald; Hoenninger, Verena M.; Thurner, Caroline

2011-08-15

Flavivirus gene expression is modulated by RNA secondary structure elements at the terminal ends of the viral RNA molecule. For tick-borne encephalitis virus (TBEV), four stem-loop (SL) elements have been predicted in the first 180 nucleotides of the viral genome: 5'-SL1, 5'-SL2, 5'-SL3 and 5'-SL4. The last three of these appear to be unique to tick-borne flaviviruses. Here, we report their characterization by mutagenesis in a TBEV luciferase reporter system. By manipulating their thermodynamic properties, we found that an optimal stability of the 5'-SL2 is required for efficient RNA replication. 5'-SL3 formation is also important for viral RNA replication, but although it contains the viral start codon, its formation is dispensable for RNA translation. 5'-SL4 appears to facilitate both RNA translation and replication. Our data suggest that maintenance of the balanced thermodynamic stability of these SL elements is important for temporal regulation of its different functions.
The novel asymmetric entry intermediate of a picornavirus captured with nanodiscs

PubMed Central

Lee, Hyunwook; Shingler, Kristin L.; Organtini, Lindsey J.; Ashley, Robert E.; Makhov, Alexander M.; Conway, James F.; Hafenstein, Susan

2016-01-01

Many nonenveloped viruses engage host receptors that initiate capsid conformational changes necessary for genome release. Structural studies on the mechanisms of picornavirus entry have relied on in vitro approaches of virus incubated at high temperatures or with excess receptor molecules to trigger the entry intermediate or A-particle. We have induced the coxsackievirus B3 entry intermediate by triggering the virus with full-length receptors embedded in lipid bilayer nanodiscs. These asymmetrically formed A-particles were reconstructed using cryo-electron microscopy and a direct electron detector. These first high-resolution structures of a picornavirus entry intermediate captured at a membrane with and without imposing icosahedral symmetry (3.9 and 7.8 Å, respectively) revealed a novel A-particle that is markedly different from the classical A-particles. The asymmetric receptor binding triggers minimal global capsid expansion but marked local conformational changes at the site of receptor interaction. In addition, viral proteins extrude from the capsid only at the site of extensive protein remodeling adjacent to the nanodisc. Thus, the binding of the receptor triggers formation of a unique site in preparation for genome release. PMID:27574701
Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin

PubMed Central

Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W.; Reeder, Nancy L.; Reilman, Raymond A.; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S.; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L.

2015-01-01

Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin’s carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin. PMID:26539826
Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin.

PubMed

Wu, Guangxi; Zhao, He; Li, Chenhao; Rajapakse, Menaka Priyadarsani; Wong, Wing Cheong; Xu, Jun; Saunders, Charles W; Reeder, Nancy L; Reilman, Raymond A; Scheynius, Annika; Sun, Sheng; Billmyre, Blake Robert; Li, Wenjun; Averette, Anna Floyd; Mieczkowski, Piotr; Heitman, Joseph; Theelen, Bart; Schröder, Markus S; De Sessions, Paola Florez; Butler, Geraldine; Maurer-Stroh, Sebastian; Boekhout, Teun; Nagarajan, Niranjan; Dawson, Thomas L

2015-11-01

Malassezia is a unique lipophilic genus in class Malasseziomycetes in Ustilaginomycotina, (Basidiomycota, fungi) that otherwise consists almost exclusively of plant pathogens. Malassezia are typically isolated from warm-blooded animals, are dominant members of the human skin mycobiome and are associated with common skin disorders. To characterize the genetic basis of the unique phenotypes of Malassezia spp., we sequenced the genomes of all 14 accepted species and used comparative genomics against a broad panel of fungal genomes to comprehensively identify distinct features that define the Malassezia gene repertoire: gene gain and loss; selection signatures; and lineage-specific gene family expansions. Our analysis revealed key gene gain events (64) with a single gene conserved across all Malassezia but absent in all other sequenced Basidiomycota. These likely horizontally transferred genes provide intriguing gain-of-function events and prime candidates to explain the emergence of Malassezia. A larger set of genes (741) were lost, with enrichment for glycosyl hydrolases and carbohydrate metabolism, concordant with adaptation to skin's carbohydrate-deficient environment. Gene family analysis revealed extensive turnover and underlined the importance of secretory lipases, phospholipases, aspartyl proteases, and other peptidases. Combining genomic analysis with a re-evaluation of culture characteristics, we establish the likely lipid-dependence of all Malassezia. Our phylogenetic analysis sheds new light on the relationship between Malassezia and other members of Ustilaginomycotina, as well as phylogenetic lineages within the genus. Overall, our study provides a unique genomic resource for understanding Malassezia niche-specificity and potential virulence, as well as their abundance and distribution in the environment and on human skin.
Development and molecular-genetic characterization of a stable Brassica allohexaploid.

PubMed

Gupta, Mehak; Atri, Chhaya; Agarwal, Neha; Banga, Surinder Singh

2016-11-01

We report first-time synthesis of a stable Brassica allohexaploid. It may evolve into a new species and also advance our understanding of pairing regulation and genome evolution in complex allopolyploids. Crop Brassicas include both monogenomic and digenomic species. A trigenomic Brassica (AABBCC) is not known to exist in nature. Past attempts to synthesize a stable allohexaploid were not successful due to aberrant meiosis and a very high proportion of aneuploid plants in the selfed progenies. We report the development of a stable allohexaploid Brassica (2n = 54; AABBCC). Genomic in situ hybridization confirmed the complete assemblage of three genomes. Only allohexaploids involving B. rapa cv. R01 (2n = 20; AA) as pollinator with a set of B. carinata (2n = 34; BBCC) were stable. These exhibited a high proportion (0.78-0.94) of pollen mother cells with normal meiosis and an excellent hexaploid ratio (0.80-0.94) in the selfed progenies. Stability of two allohexaploid combinations was demonstrated from H1 to H4 generations at two very diverse locations in India. Graphical genotyping of allohexaploids allowed detection of chromosome fragment exchanges among three genomes. These were much smaller for meiotically stable allohexaploids as compared to unstable ones. The putative hexaploids were morphologically closer to the female donor, B. carinata, for leaf morphology, inflorescence structure and flower shape. The newly formed allohexaploid may also provide unique opportunities to investigate the immediate genetic and genomic consequences of a Brassica allohexaploid with three resident genomes.
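The chromosome counts quoted in this abstract (2n = 20, 34 and 54) can be checked with a short sketch. The haploid numbers for the A, B and C genomes below are textbook Brassica values rather than figures taken from the abstract, and the route through a triploid hybrid followed by chromosome doubling is an assumption about how such a cross would typically reach 2n = 54. The function name is hypothetical.

```python
# Haploid chromosome numbers of the three basic Brassica genomes.
# Assumption: textbook values, not stated in the abstract itself.
BASE = {"A": 10, "B": 8, "C": 9}

def somatic_count(genome_formula: str) -> int:
    """Return 2n for a formula such as 'AABBCC' (one letter per genome copy)."""
    return sum(BASE[g] for g in genome_formula)

b_rapa = somatic_count("AA")          # pollen parent, 2n = 20
b_carinata = somatic_count("BBCC")    # seed parent, 2n = 34
hybrid = (b_rapa + b_carinata) // 2   # ABC hybrid from reduced gametes (assumed route)
allohexaploid = 2 * hybrid            # after chromosome doubling (assumed route)

print(b_rapa, b_carinata, hybrid, allohexaploid)
```

The final value agrees with summing the formula directly, `somatic_count("AABBCC")`.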
Comparative analysis of complete chloroplast genome sequence and inversion variation in Lasthenia burkei (Madieae, Asteraceae).

PubMed

Walker, Joseph F; Zanis, Michael J; Emery, Nancy C

2014-04-01

Complete chloroplast genome studies can help resolve relationships among large, complex plant lineages such as Asteraceae. We present the first whole plastome from the Madieae tribe and compare its sequence variation to other chloroplast genomes in Asteraceae. We used high throughput sequencing to obtain the Lasthenia burkei chloroplast genome. We compared sequence structure and rates of molecular evolution in the small single copy (SSC), large single copy (LSC), and inverted repeat (IR) regions to those for eight Asteraceae accessions and one Solanaceae accession. The chloroplast sequence of L. burkei is 150 746 bp and contains 81 unique protein coding genes and 4 coding ribosomal RNA sequences. We identified three major inversions in the L. burkei chloroplast, all of which have been found in other Asteraceae lineages, and a previously unreported inversion in Lactuca sativa. Regions flanking inversions contained tRNA sequences, but did not have particularly high G + C content. Substitution rates varied among the SSC, LSC, and IR regions, and rates of evolution within each region varied among species. Some observed differences in rates of molecular evolution may be explained by the relative proportion of coding to noncoding sequence within regions. Rates of molecular evolution vary substantially within and among chloroplast genomes, and major inversion events may be promoted by the presence of tRNAs. Collectively, these results provide insight into different mechanisms that may promote intramolecular recombination and the inversion of large genomic regions in the plastome.
Polygenic risk score, genome-wide association, and gene set analyses of cognitive domain deficits in schizophrenia.

PubMed

Nakahara, Soichiro; Medland, Sarah; Turner, Jessica A; Calhoun, Vince D; Lim, Kelvin O; Mueller, Bryon A; Bustillo, Juan R; O'Leary, Daniel S; Vaidya, Jatin G; McEwen, Sarah; Voyvodic, James; Belger, Aysenil; Mathalon, Daniel H; Ford, Judith M; Guffanti, Guia; Macciardi, Fabio; Potkin, Steven G; van Erp, Theo G M

2018-06-12

This study assessed genetic contributions to six cognitive domains, identified by the MATRICS Cognitive Consensus Battery as relevant for schizophrenia, cognition-enhancing, clinical trials. Psychiatric Genomics Consortium Schizophrenia polygenic risk scores showed significant negative correlations with each cognitive domain. Genome-wide association analyses identified loci associated with attention/vigilance (rs830786 within HNF4G), verbal memory (rs67017972 near NDUFS4), and reasoning/problem solving (rs76872642 within HDAC9). Gene set analysis identified unique and shared genes across cognitive domains. These findings suggest involvement of common and unique mechanisms across cognitive domains and may contribute to the discovery of new therapeutic targets to treat cognitive deficits in schizophrenia. Copyright © 2018 Elsevier B.V. All rights reserved.
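As a rough illustration of what a polygenic risk score is, the sketch below computes the usual weighted sum of risk-allele dosages. It is not the pipeline used in the study: the SNP identifiers, effect sizes and genotype are made up, and the function name is hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) over SNPs present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)

# Hypothetical per-SNP effect sizes (e.g. log odds ratios from a discovery GWAS).
weights = {"snpA": 0.10, "snpB": -0.05, "snpC": 0.20}
# Hypothetical genotype of one individual; snpD has no weight and is ignored.
dosages = {"snpA": 2, "snpB": 1, "snpC": 0, "snpD": 1}

score = polygenic_risk_score(dosages, weights)
print(round(score, 3))
```

A score computed this way for each participant could then be correlated with each cognitive domain score, which is the kind of relationship the abstract reports.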
Transposable elements generate population-specific insertional patterns and allelic variation in genes of wild emmer wheat (Triticum turgidum ssp. dicoccoides).

PubMed

Domb, Katherine; Keidar, Danielle; Yaakov, Beery; Khasdan, Vadim; Kashkush, Khalil

2017-10-27

Natural populations of the tetraploid wild emmer wheat (genome AABB) were previously shown to demonstrate eco-geographically structured genetic and epigenetic diversity. Transposable elements (TEs) might make up a significant part of the genetic and epigenetic variation between individuals and populations because they comprise over 80% of the wild emmer wheat genome. In this study, we performed detailed analyses to assess the dynamics of transposable elements in 50 accessions of wild emmer wheat collected from 5 geographically isolated sites. The analyses included: the copy number variation of TEs among accessions in the five populations, population-unique insertional patterns, and the impact of population-unique/specific TE insertions on structure and expression of genes. We assessed the copy numbers of 12 TE families using real-time quantitative PCR, and found significant copy number variation (CNV) in the 50 wild emmer wheat accessions, in a population-specific manner. In some cases, the CNV difference reached up to 6-fold. However, the CNV was TE-specific, namely some TE families showed higher copy numbers in one or more populations, and other TE families showed lower copy numbers in the same population(s). Furthermore, we assessed the insertional patterns of 6 TE families using transposon display (TD), and observed significant population-specific insertional patterns. The polymorphism levels of TE-insertional patterns reached 92% among all wild emmer wheat accessions, in some cases. In addition, we observed population-specific/unique TE insertions, some of which were located within or close to protein-coding genes, creating allelic variations in a population-specific manner. We also showed that those genes are differentially expressed in wild emmer wheat. 
For the first time, this study shows that TEs proliferate in wild emmer wheat in a population-specific manner, creating new alleles of genes, which contribute to the divergent evolution of homeologous genes from the A and B subgenomes.
A base-modified PNA-graphene oxide platform as a turn-on fluorescence sensor for the detection of human telomeric repeats

NASA Astrophysics Data System (ADS)

Sabale, Pramod M.; George, Jerrin Thomas; Srivatsan, Seergazhi G.

2014-08-01

Given the biological and therapeutic significance of telomeres and other G-quadruplex forming sequences in human genome, it is highly desirable to develop simple methods to study these structures, which can also be implemented in screening formats for the discovery of G-quadruplex binders. The majority of telomere detection methods developed so far are laborious and use elaborate assay and instrumental setups, and hence, are not amenable to discovery platforms. Here, we describe the development of a simple homogeneous fluorescence turn-on method, which uses a unique combination of an environment-sensitive fluorescent nucleobase analogue, the superior base pairing property of PNA, and DNA-binding and fluorescence quenching properties of graphene oxide, to detect human telomeric DNA repeats of varying lengths. Our results demonstrate that this method, which does not involve a rigorous assay setup, would provide new opportunities to study G-quadruplex structures.
Electronic supplementary information (ESI) available. Figures, tables, experimental procedures and NMR spectra. See DOI: 10.1039/c4nr00878b
Peering down the barrel of a bacteriophage portal: the genome packaging and release valve in p22.

PubMed

Tang, Jinghua; Lander, Gabriel C; Olia, Adam S; Olia, Adam; Li, Rui; Casjens, Sherwood; Prevelige, Peter; Cingolani, Gino; Baker, Timothy S; Johnson, John E

2011-04-13

The encapsidated genome in all double-strand DNA bacteriophages is packaged to liquid crystalline density through a unique vertex in the procapsid assembly intermediate, which has a portal protein dodecamer in place of five coat protein subunits. The portal orchestrates DNA packaging and exit, through a series of varying interactions with the scaffolding, terminase, and closure proteins. Here, we report an asymmetric cryoEM reconstruction of the entire P22 virion at 7.8 Å resolution. X-ray crystal structure models of the full-length portal and of the portal lacking 123 residues at the C terminus in complex with gene product 4 (Δ123portal-gp4) obtained by Olia et al. (2011) were fitted into this reconstruction. The interpreted density map revealed that the 150 Å, coiled-coil, barrel portion of the portal entraps the last DNA to be packaged and suggests a mechanism for head-full DNA signaling and transient stabilization of the genome during addition of closure proteins. Copyright © 2011 Elsevier Ltd. All rights reserved.
Genome mining of the sordarin biosynthetic gene cluster from Sordaria araneosa Cain ATCC 36386: characterization of cycloaraneosene synthase and GDP-6-deoxyaltrose transferase.

PubMed

Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi

2016-07-01

Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway that may include a Diels-Alder-type [4+2]cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases was identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed.
Transcriptome profiling reveals mosaic genomic origins of modern cultivated barley.

PubMed

Dai, Fei; Chen, Zhong-Hua; Wang, Xiaolei; Li, Zefeng; Jin, Gulei; Wu, Dezhi; Cai, Shengguan; Wang, Ning; Wu, Feibo; Nevo, Eviatar; Zhang, Guoping

2014-09-16

The domestication of cultivated barley has been used as a model system for studying the origins and early spread of agrarian culture. Our previous results indicated that the Tibetan Plateau and its vicinity is one of the centers of domestication of cultivated barley. Here we reveal multiple origins of domesticated barley using transcriptome profiling of cultivated and wild-barley genotypes. Approximately 48-Gb of clean transcript sequences in 12 Hordeum spontaneum and 9 Hordeum vulgare accessions were generated. We reported 12,530 de novo assembled transcripts in all of the 21 samples. Population structure analysis showed that Tibetan hulless barley (qingke) might have existed in the early stage of domestication. Based on the large number of unique genomic regions showing the similarity between cultivated and wild-barley groups, we propose that the genomic origin of modern cultivated barley is derived from wild-barley genotypes in the Fertile Crescent (mainly in chromosomes 1H, 2H, and 3H) and Tibet (mainly in chromosomes 4H, 5H, 6H, and 7H). This study indicates that the domestication of barley may have occurred over time in geographically distinct regions.

CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database

PubMed Central

Jia, Baofeng; Raphenya, Amogelang R.; Alcock, Brian; Waglechner, Nicholas; Guo, Peiyao; Tsang, Kara K.; Lago, Briony A.; Dave, Biren M.; Pereira, Sheldon; Sharma, Arjun N.; Doshi, Sachin; Courtot, Mélanie; Lo, Raymond; Williams, Laura E.; Frye, Jonathan G.; Elsayegh, Tariq; Sardar, Daim; Westman, Erin L.; Pawlowski, Andrew C.; Johnson, Timothy A.; Brinkman, Fiona S.L.; Wright, Gerard D.; McArthur, Andrew G.

2017-01-01

The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis. PMID:27789705
Breed-specific ancestry studies and genome-wide association analysis highlight an association between the MYH9 gene and heat tolerance in Alaskan sprint racing sled dogs.

PubMed

Huson, Heather J; vonHoldt, Bridgett M; Rimbault, Maud; Byers, Alexandra M; Runstadler, Jonathan A; Parker, Heidi G; Ostrander, Elaine A

2012-02-01

Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high-performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long-distance racers, and combined that with genome-wide association studies (GWAS) to identify regions that correlate with performance-enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5-10 and 2.5-3.75 kb, respectively). Furthermore, we identified eight regions with the genomic signal from either a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor-performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog.
Genomic evolution of Saccharomyces cerevisiae under Chinese rice wine fermentation.

PubMed

Li, Yudong; Zhang, Weiping; Zheng, Daoqiong; Zhou, Zhan; Yu, Wenwen; Zhang, Lei; Feng, Lifang; Liang, Xinle; Guan, Wenjun; Zhou, Jingwen; Chen, Jian; Lin, Zhenguo

2014-09-10

Rice wine fermentation represents a unique environment for the evolution of the budding yeast, Saccharomyces cerevisiae. To understand how the selection pressure shaped the yeast genome and gene regulation, we determined the genome sequence and transcriptome of a S. cerevisiae strain YHJ7 isolated from Chinese rice wine (Huangjiu), a popular traditional alcoholic beverage in China. By comparing the genome of YHJ7 to the lab strain S288c, a Japanese sake strain K7, and a Chinese industrial bioethanol strain YJSH1, we identified many genomic sequence and structural variations in YHJ7, which are mainly located in subtelomeric regions, suggesting that these regions play an important role in genomic evolution between strains. In addition, our comparative transcriptome analysis between YHJ7 and S288c revealed a set of differentially expressed genes, including those involved in glucose transport (e.g., HXT2, HXT7) and oxidoreductase activity (e.g., AAD10, ADH7). Interestingly, many of these genomic and transcriptional variations are directly or indirectly associated with the adaptation of YHJ7 strain to its specific niches. Our molecular evolution analysis suggested that Japanese sake strains (K7/UC5) were derived from Chinese rice wine strains (YHJ7) at least approximately 2,300 years ago, providing the first molecular evidence elucidating the origin of Japanese sake strains. Our results depict interesting insights regarding the evolution of yeast during rice wine fermentation, and provide a valuable resource for genetic engineering to improve industrial wine-making strains. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Functionally Structured Genomes in Lactobacillus kunkeei Colonizing the Honey Crop and Food Products of Honeybees and Stingless Bees

PubMed Central

Tamarit, Daniel; Ellegaard, Kirsten M.; Wikander, Johan; Olofsson, Tobias; Vásquez, Alejandra; Andersson, Siv G.E.

2015-01-01

Lactobacillus kunkeei is the most abundant bacterial species in the honey crop and food products of honeybees. The 16S rRNA genes of strains isolated from different bee species are nearly identical in sequence and therefore inadequate as markers for studies of coevolutionary patterns. Here, we have compared the 1.5 Mb genomes of ten L. kunkeei strains isolated from all recognized Apis species and another two strains from Meliponini species. A gene flux analysis, including previously sequenced Lactobacillus species as outgroups, indicated the influence of reductive evolution. The genome architecture is unique in that vertically inherited core genes are located near the terminus of replication, whereas genes for secreted proteins and putative host-adaptive traits are located near the origin of replication. We suggest that these features have resulted from a genome-wide loss of genes, with integrations of novel genes mostly occurring in regions flanking the origin of replication. The phylogenetic analyses showed that the bacterial topology was incongruent with the host topology, and that strains of the same microcluster have recombined frequently across the host species barriers, arguing against codiversification. Multiple genotypes were recovered in the individual hosts and transfers of mobile elements could be demonstrated for strains isolated from the same host species. Unlike other bacteria with small genomes, short generation times and multiple rRNA operons suggest that L. kunkeei evolves under selection for rapid growth in its natural growth habitat. The results provide an extended framework for reductive genome evolution and functional genome organization in bacteria. PMID:25953738
Error Correcting Optical Mapping Data.

PubMed

Mukherjee, Kingshuk; Washimkar, Darshan; Muggli, Martin D; Salmela, Leena; Boucher, Christina

2018-05-26

Optical mapping is a unique system that is capable of producing high-resolution, high-throughput genomic map data that gives information about the structure of a genome [21]. Recently it has been used for scaffolding contigs and assembly validation for large-scale sequencing projects, including the maize [32], goat [6], and amborella [4] genomes. However, a major impediment in the use of this data is the variety and quantity of errors in the raw optical mapping data, which are called Rmaps. The challenges associated with using Rmap data are analogous to dealing with insertions and deletions in the alignment of long reads. Moreover, they are arguably harder to tackle since the data is numerical and susceptible to inaccuracy. We develop cOMET to error correct Rmap data, which to the best of our knowledge is the only optical mapping error correction method. Our experimental results demonstrate that cOMET has high precision and corrects 82.49% of insertion errors and 77.38% of deletion errors in Rmap data generated from the E. coli K-12 reference genome. Out of the deletion errors corrected, 98.26% are true errors. Similarly, out of the insertion errors corrected, 82.19% are true errors. It also successfully scales to large genomes, improving the quality of 78% and 99% of the Rmaps in the plum and goat genomes, respectively. Lastly, we show the utility of error correction by demonstrating how it improves the assembly of Rmap data. Error corrected Rmap data results in an assembly that is more contiguous, and covers a larger fraction of the genome.
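To make the two error types concrete, the sketch below models an Rmap as an ordered list of fragment sizes between consecutive cut sites. It assumes that an insertion error is a spurious cut site that splits one fragment in two, and that a deletion error is a missed cut site that merges two neighbouring fragments. This is a toy model for illustration only, not the cOMET algorithm; the function names and fragment sizes are hypothetical.

```python
def add_cut_site(rmap, index, offset):
    """Insertion error: a spurious cut splits fragment `index` at `offset`."""
    frag = rmap[index]
    if not 0 < offset < frag:
        raise ValueError("offset must fall strictly inside the fragment")
    return rmap[:index] + [offset, frag - offset] + rmap[index + 1:]

def drop_cut_site(rmap, index):
    """Deletion error: the cut between fragments `index` and `index + 1` is missed."""
    if not 0 <= index < len(rmap) - 1:
        raise ValueError("index must point to a fragment with a right neighbour")
    return rmap[:index] + [rmap[index] + rmap[index + 1]] + rmap[index + 2:]

true_rmap = [12.0, 5.5, 20.0, 8.5]              # made-up fragment sizes in kb
with_insertion = add_cut_site(true_rmap, 2, 7.0)
with_deletion = drop_cut_site(true_rmap, 0)

print(with_insertion)
print(with_deletion)
```

Both errors change the number of fragments while leaving the total length unchanged, which is one reason they resemble indels in long-read alignment, as the abstract notes.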
Dynamics and Adaptive Benefits of Protein Domain Emergence and Arrangements during Plant Genome Evolution

PubMed Central

Kersting, Anna R.; Bornberg-Bauer, Erich; Moore, Andrew D.; Grath, Sonja

2012-01-01

Plant genomes are generally very large, mostly paleopolyploid, and have numerous gene duplicates and complex genomic features such as repeats and transposable elements. Many of these features have been hypothesized to enable plants, which cannot easily escape environmental challenges, to rapidly adapt. Another mechanism, which has recently been well described as a major facilitator of rapid adaptation in bacteria, animals, and fungi but not yet for plants, is modular rearrangement of protein-coding genes. Due to the high precision of profile-based methods, rearrangements can be well captured at the protein level by characterizing the emergence, loss, and rearrangements of protein domains, their structural, functional, and evolutionary building blocks. Here, we study the dynamics of domain rearrangements and explore their adaptive benefit in 27 plant and 3 algal genomes. We use a phylogenomic approach by which we can explain the formation of 88% of all arrangements by single-step events, such as fusion, fission, and terminal loss of domains. We find many domains are lost along every lineage, but at least 500 domains are novel, that is, they are unique to green plants and emerged more or less recently. These novel domains duplicate and rearrange more readily within their genomes than ancient domains and are overproportionally involved in stress response and developmental innovations. Novel domains more often affect regulatory proteins and show a higher degree of structural disorder than ancient domains. Whereas a relatively large and well-conserved core set of single-domain proteins exists, long multi-domain arrangements tend to be species-specific. We find that duplicated genes are more often involved in rearrangements. Although fission events typically impact metabolic proteins, fusion events often create new signaling proteins essential for environmental sensing. 
Taken together, the high volatility of single domains and complex arrangements in plant genomes demonstrate the importance of modularity for environmental adaptability of plants. PMID:22250127
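The single-step events named in this abstract (fusion, fission and terminal loss) can be illustrated with a minimal sketch in which a domain arrangement is a tuple of domain names read from N- to C-terminus. The predicates below are hypothetical helpers for illustration, not the authors' phylogenomic method, and the domain names are placeholders.

```python
def is_fusion(parent_a, parent_b, child):
    """Child arrangement is the two parent arrangements joined end to end."""
    return child in (parent_a + parent_b, parent_b + parent_a)

def is_fission(parent, child_a, child_b):
    """Parent arrangement splits into two contiguous child arrangements."""
    return parent in (child_a + child_b, child_b + child_a)

def is_terminal_loss(parent, child):
    """Child equals the parent minus one domain at either terminus."""
    return (
        len(child) >= 1
        and len(child) == len(parent) - 1
        and child in (parent[1:], parent[:-1])
    )

# Placeholder domain names, not taken from the study.
receptor = ("LRR", "TM")
kinase = ("Pkinase",)
fused = ("LRR", "TM", "Pkinase")

print(is_fusion(receptor, kinase, fused))
print(is_fission(fused, receptor, kinase))
print(is_terminal_loss(fused, receptor))
```

An arrangement that cannot be reached from any ancestral arrangement by one such step would fall into the remainder that the abstract leaves unexplained by single-step events.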
IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

PubMed

Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

2015-07-14

In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.
Chromosome Evolution in the Free-Living Flatworms: First Evidence of Intrachromosomal Rearrangements in Karyotype Evolution of Macrostomum lignano (Platyhelminthes, Macrostomida)

PubMed Central

Zadesenets, Kira S.; Ershov, Nikita I.; Berezikov, Eugene; Rubtsov, Nikolay B.

2017-01-01

The free-living flatworm Macrostomum lignano is a hidden tetraploid. Its genome was formed by a recent whole genome duplication followed by chromosome fusions. Its karyotype (2n = 8) consists of a pair of large chromosomes (MLI1), which contain regions of all other chromosomes, and three pairs of small metacentric chromosomes. Comparison of MLI1 with metacentrics was performed by painting with microdissected DNA probes and fluorescent in situ hybridization of unique DNA fragments. Regions of MLI1 homologous to small metacentrics appeared to be contiguous. Besides the loss of DNA repeat clusters (pericentromeric and telomeric repeats and the 5S rDNA cluster) from MLI1, differences between small metacentrics MLI2 and MLI4 and regions homologous to them in MLI1 were revealed. Abnormal karyotypes found in the inbred DV1/10 subline were analyzed, and structurally rearranged chromosomes were described with the painting technique, suggesting the mechanism of their origin. The revealed chromosomal rearrangements generate additional diversity, opening the way toward massive loss of duplicated genes from a duplicated genome. Our findings suggest that the karyotype of M. lignano is in the early stage of genome diploidization after whole genome duplication, and further studies on M. lignano and closely related species can address many questions about karyotype evolution in animals. PMID:29084138
First draft genome of an iconic clownfish species (Amphiprion frenatus).

PubMed

Marcionetti, Anna; Rossier, Victor; Bertrand, Joris A M; Litsios, Glenn; Salamin, Nicolas

2018-02-17

Clownfishes (or anemonefishes) form an iconic group of coral reef fishes, principally known for their mutualistic interaction with sea anemones. They are characterized by particular life history traits, such as a complex social structure and mating system involving sequential hermaphroditism, coupled with an exceptionally long lifespan. Additionally, clownfishes are considered to be one of the rare groups to have experienced an adaptive radiation in the marine environment. Here, we assembled and annotated the first genome of a clownfish species, the tomato clownfish (Amphiprion frenatus). We obtained 17,801 assembled scaffolds, containing a total of 26,917 genes. The completeness of the assembly and annotation was satisfactory, with 96.5% of the Actinopterygii Benchmarking Universal Single-Copy Orthologs (BUSCOs) being retrieved in the A. frenatus assembly. The quality of the resulting assembly is comparable to other bony fish assemblies. This resource is valuable for advancing studies of the particular life history traits of clownfishes, as well as being useful for population genetic studies and the development of new phylogenetic markers. It will also open the way to comparative genomics. Indeed, future genomic comparison among closely related fishes may provide means to identify genes related to the unique adaptations to different sea anemone hosts, as well as better characterize the genomic signatures of an adaptive radiation. © 2018 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
Comparative Genome Analysis of Campylobacter fetus Subspecies Revealed Horizontally Acquired Genetic Elements Important for Virulence and Niche Specificity

PubMed Central

Kienesberger, Sabine; Sprenger, Hanna; Wolfgruber, Stella; Halwachs, Bettina; Thallinger, Gerhard G.; Perez-Perez, Guillermo I.; Blaser, Martin J.; Zechner, Ellen L.; Gorkiewicz, Gregor

2014-01-01

Campylobacter fetus are important animal and human pathogens and the two major subspecies differ strikingly in pathogenicity. C. fetus subsp. venerealis is highly niche-adapted, mainly infecting the genital tract of cattle. C. fetus subsp. fetus has a wider host-range, colonizing the genital- and intestinal-tract of animals and humans. We report the complete genomic sequence of C. fetus subsp. venerealis 84-112 and comparisons to the genome of C. fetus subsp. fetus 82-40. Functional analysis of genes predicted to be involved in C. fetus virulence was performed. The two subspecies are highly syntenic with 92% sequence identity but C. fetus subsp. venerealis has a larger genome and an extra-chromosomal element. Aside from apparent gene transfer agents and hypothetical proteins, the unique genes in both subspecies comprise two known functional groups: lipopolysaccharide production, and type IV secretion machineries. Analyses of lipopolysaccharide-biosynthesis genes in C. fetus isolates showed linkage to particular pathotypes, and mutational inactivation demonstrated their roles in regulating virulence and host range. The comparative analysis presented here broadens knowledge of the genomic basis of C. fetus pathogenesis and host specificity. It further highlights the importance of surface-exposed structures to C. fetus pathogenicity and demonstrates how evolutionary forces optimize the fitness and host-adaptation of these pathogens. PMID:24416416
The Apostasia genome and the evolution of orchids.

PubMed

Zhang, Guo-Qiang; Liu, Ke-Wei; Li, Zhen; Lohaus, Rolf; Hsiao, Yu-Yun; Niu, Shan-Ce; Wang, Jie-Yu; Lin, Yao-Cheng; Xu, Qing; Chen, Li-Jun; Yoshida, Kouki; Fujiwara, Sumire; Wang, Zhi-Wen; Zhang, Yong-Qiang; Mitsuda, Nobutaka; Wang, Meina; Liu, Guo-Hui; Pecoraro, Lorenzo; Huang, Hui-Xia; Xiao, Xin-Ju; Lin, Min; Wu, Xin-Yi; Wu, Wan-Lin; Chen, You-Yi; Chang, Song-Bin; Sakamoto, Shingo; Ohme-Takagi, Masaru; Yagi, Masafumi; Zeng, Si-Jin; Shen, Ching-Yu; Yeh, Chuan-Ming; Luo, Yi-Bo; Tsai, Wen-Chieh; Van de Peer, Yves; Liu, Zhong-Jian

2017-09-21

Constituting approximately 10% of flowering plant species, orchids (Orchidaceae) display unique flower morphologies, possess an extraordinary diversity in lifestyle, and have successfully colonized almost every habitat on Earth. Here we report the draft genome sequence of Apostasia shenzhenica, a representative of one of two genera that form a sister lineage to the rest of the Orchidaceae, providing a reference for inferring the genome content and structure of the most recent common ancestor of all extant orchids and improving our understanding of their origins and evolution. In addition, we present transcriptome data for representatives of Vanilloideae, Cypripedioideae and Orchidoideae, and novel third-generation genome data for two species of Epidendroideae, covering all five orchid subfamilies. A. shenzhenica shows clear evidence of a whole-genome duplication, which is shared by all orchids and occurred shortly before their divergence. Comparisons between A. shenzhenica and other orchids and angiosperms also permitted the reconstruction of an ancestral orchid gene toolkit. We identify new gene families, gene family expansions and contractions, and changes within MADS-box gene classes, which control a diverse suite of developmental processes, during orchid evolution. This study sheds new light on the genetic mechanisms underpinning key orchid innovations, including the development of the labellum and gynostemium, pollinia, and seeds without endosperm, as well as the evolution of epiphytism; reveals relationships between the Orchidaceae subfamilies; and helps clarify the evolutionary history of orchids within the angiosperms.
The PathoYeastract database: an information system for the analysis of gene and genomic transcription regulation in pathogenic yeasts.

PubMed

Monteiro, Pedro Tiago; Pais, Pedro; Costa, Catarina; Manna, Sauvagya; Sá-Correia, Isabel; Teixeira, Miguel Cacho

2017-01-04

We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata. Upon data retrieval from hundreds of publications, followed by curation, the database currently includes 28 000 unique documented regulatory associations between transcription factors (TF) and target genes and 107 DNA binding sites, considering 134 TFs in both species. Following the structure used for the YEASTRACT database, PathoYeastract makes available bioinformatics tools that enable the user to exploit the existing information to predict the TFs involved in the regulation of a gene or genome-wide transcriptional response, while ranking those TFs in order of their relative importance. Each search can be filtered based on the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. Promoter analysis tools and interactive visualization tools for the representation of TF regulatory networks are also provided. The PathoYeastract database further provides simple tools for the prediction of gene and genomic regulation based on orthologous regulatory associations described for other yeast species, a comparative genomics setup for the study of cross-species evolution of regulatory networks. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
MIPS: a database for protein sequences, homology data and yeast genome information.

PubMed Central

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peng, Jamy C.

Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability.
My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in euchromatin. Remarkably, human euchromatin and fly heterochromatin share similar features; such as repeated DNA content, intron lengths and open reading frame sizes. Human cells likely stabilize their DNA content via mechanisms and factors similar to those in Drosophila heterochromatin. Furthermore, my thesis work raises implications for H3K9me and chromatin functions in complex-DNA genome stability, repeated DNA homogenization by molecular drive, and in genome reorganization through evolution.
Complete chloroplast genome of Trachelium caeruleum: extensive rearrangements are associated with repeats and tRNAs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Haberle, Rosemarie C.; Fourcade, Matthew L.; Boore, Jeffrey L.

2006-01-09

Chloroplast genome structure, gene order and content are highly conserved in land plants. We sequenced the complete chloroplast genome sequence of Trachelium caeruleum (Campanulaceae) a member of an angiosperm family known for highly rearranged chloroplast genomes. The total genome size is 162,321 bp with an IR of 27,273 bp, LSC of 100,113 bp and SSC of 7,661 bp. The genome encodes 115 unique genes, with 19 duplicated in the IR, a tRNA (trnI-CAU) duplicated once in the LSC and a protein coding gene (psbJ) duplicated twice, for a total of 137 genes. Four genes (ycf15, rpl23, infA and accD) are truncated and likely nonfunctional; three others (clpP, ycf1 and ycf2) are so highly diverged that they may now be pseudogenes. The most conspicuous feature of the Trachelium genome is the presence of eighteen internally unrearranged blocks of genes that have been inverted or relocated within the genome, relative to the typical gene order of most angiosperm chloroplast genomes. Recombination between repeats or tRNAs has been suggested as two means of chloroplast genome rearrangements. We compared the relative number of repeats in Trachelium to eight other angiosperm chloroplast genomes, and evaluated the location of repeats and tRNAs in relation to rearrangements. Trachelium has the highest number and largest repeats, which are concentrated near inversion endpoints or other rearrangements. tRNAs occur at many but not all inversion endpoints. There is likely no single mechanism responsible for the remarkable number of alterations in this genome, but both repeats and tRNAs are clearly associated with these rearrangements. Land plant chloroplast genomes are highly conserved in structure, gene order and content. The chloroplast genomes of ferns, the gymnosperm Ginkgo, and most angiosperms are nearly collinear, reflecting the gene order in lineages that diverged from lycopsids and the ancestral chloroplast gene order over 350 million years ago (Raubeson and Jansen, 1992).
Although earlier mapping studies identified a number of taxa in which several rearrangements have occurred (reviewed in Raubeson and Jansen, 2005), an extraordinary number of chloroplast genome alterations are concentrated in several families in the angiosperm order Asterales (sensu APGII, Bremer et al., 2003). Gene mapping studies of representatives of the Campanulaceae (Cosner, 1993; Cosner et al., 1997, 2004) and Lobeliaceae (Knox et al., 1993; Knox and Palmer, 1999) identified large inversions, contraction and expansion of the inverted repeat regions, and several insertions and deletions in the cpDNAs of these closely related taxa. Detailed restriction site and gene mapping of the chloroplast genome of Trachelium caeruleum (Campanulaceae) identified seven to ten large inversions, families of repeats associated with rearrangements, possible transpositions, and even the disruption of operons (Cosner et al., 1997). Seventeen other members of the Campanulaceae were mapped and exhibit many additional rearrangements (Cosner et al., 2004). What happened in this lineage that made it susceptible to so many chloroplast genome rearrangements? How do normally very conserved chloroplast genomes change? The cause of rearrangements in this group is unclear based on the limited resolution available with mapping techniques. Several mechanisms have been proposed to explain how rearrangements occur: recombination between repeats, transposition, or temporary instability due to loss of the inverted repeat (Raubeson and Jansen, 2005). Sequencing whole chloroplast genomes within the Campanulaceae offers a unique opportunity to examine both the extent and mechanisms of rearrangements within a phylogenetic framework. We report here the first complete chloroplast genome sequence of a member of the Campanulaceae, Trachelium caeruleum.
This work will serve as a benchmark for subsequent, comparative sequencing and analysis of other members of this family and close relatives, with the goal of further understanding chloroplast genome evolution. We confirmed features previously identified through mapping, and discovered many additional structural changes, including several partial to entire gene duplications, deterioration of at least four normally conserved chloroplast genes into gene fragments, and the nature and position of numerous repeat elements at or near inversion endpoints. The focus of this paper is on analyses of sequences at or near these rearrangements in Trachelium caeruleum. Inversions are believed to occur due to the presence of repeat elements subject to homologous recombination (Palmer, 1991; Knox et al., 1993). Repeats may facilitate inversions or other genome rearrangements (Achaz et al., 2003), and higher incidences of repeats have been correlated with greater numbers of rearrangements (Rocha, 2003). Alternatively, repeats may proliferate within a genome as a result of DNA strand repair mechanisms following a rearrangement event such as an inversion.
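The gene tally quoted in the Trachelium record above (115 unique genes, a total of 137) can be checked with a small worked sum. This is only an illustrative sketch: the function name total_genes is hypothetical, and it assumes that "duplicated once" means one extra copy and "duplicated twice" means two extra copies.

```python
# Worked tally of the gene counts quoted in the Trachelium caeruleum record.
# Assumption (not stated in the record): "duplicated once" adds one extra copy,
# "duplicated twice" adds two extra copies.

def total_genes(unique, *extra_copies):
    """Return the total gene count: unique genes plus all extra copies."""
    return unique + sum(extra_copies)

unique_genes = 115    # unique genes reported
ir_extra = 19         # genes duplicated in the inverted repeat (one extra copy each)
trni_cau_extra = 1    # trnI-CAU, duplicated once in the LSC
psbj_extra = 2        # psbJ, duplicated twice

total = total_genes(unique_genes, ir_extra, trni_cau_extra, psbj_extra)
print(total)
```

Under these assumptions the sum agrees with the total of 137 genes stated in the record.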
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

PubMed Central

Seaver, Samuel M. D.; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M. T.; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D.; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D.; Henry, Christopher S.

2014-01-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today’s annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed. PMID:24927599
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource.

PubMed

Seaver, Samuel M D; Gerdes, Svetlana; Frelin, Océane; Lerma-Ortiz, Claudia; Bradbury, Louis M T; Zallot, Rémi; Hasnain, Ghulam; Niehaus, Thomas D; El Yacoubi, Basma; Pasternak, Shiran; Olson, Robert; Pusch, Gordon; Overbeek, Ross; Stevens, Rick; de Crécy-Lagard, Valérie; Ware, Doreen; Hanson, Andrew D; Henry, Christopher S

2014-07-01

The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

PubMed Central

2014-01-01

Background: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
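The record above reports an N50 scaffold size. For readers unfamiliar with the statistic, the following is a minimal sketch of the usual definition (the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the assembly). The function name n50 and the toy lengths are hypothetical and are not taken from the loblolly pine assembly.

```python
def n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    Scaffolds are sorted from longest to shortest; the N50 is the length of
    the scaffold at which the running total first reaches half of the sum.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")

# Hypothetical toy assembly, not real data.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))
```

For the toy list the total is 300, so the running total reaches the halfway point of 150 at the second scaffold, and the function returns 80.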
Personalized medicine, genomics, and pharmacogenomics: a primer for nurses.

PubMed

Blix, Andrew

2014-08-01

Personalized medicine is the study of patients' unique environmental influences as well as the totality of their genetic code-their genome-to tailor personalized risk assessments, diagnoses, prognoses, and treatments. The study of how patients' genomes affect responses to medications, or pharmacogenomics, is a related field. Personalized medicine and genomics are particularly relevant in oncology because of the genetic basis of cancer. Nurses need to understand related issues such as the role of genetic and genomic counseling, the ethical and legal questions surrounding genomics, and the growing direct-to-consumer genomics industry. As genomics research is incorporated into health care, nurses need to understand the technology to provide advocacy and education for patients and their families.
A diversity study of Saccharomycopsis fibuligera in rice wine starter nuruk, reveals the evolutionary process associated with its interspecies hybrid.

PubMed

Farh, Mohamed El-Agamy; Cho, Yunjoo; Lim, Jae Yun; Seo, Jeong-Ah

2017-05-01

The amylolytic yeast Saccharomycopsis fibuligera is the predominant yeast in the starter product, nuruk, which is utilized for rice wine production in South Korea. Recent molecular studies have explored a recently developed interspecific hybridization among strains of S. fibuligera with a unique genetic feature. However, the origin of the natural hybridization occurrence is still unclear. Thus, to distinguish parental and hybrid strains, specific primer sets were applied to 141 yeast strains isolated from different nuruk samples fermented in different provinces. Sixty-seven strains were accordingly defined as the parental species with genome A, while 8 strains were defined as hybrid strains. Unexpectedly, the other parental species, with genome B, has not yet been found among the strain pools. Furthermore, it was observed that hybrid strains are phenotypically different from A genome strains; asci containing tetrad ascospores were observed more frequently in A genome strains than in hybrid strains. Nevertheless, hybrid strains were slightly more thermotolerant than A genome strains. Interestingly, all hybrid strains were located only in Jeju province. Based on these sets of data, we speculated that the unique climate of Jeju province might play an evolutionary role in the interspecific hybridization between A genome strains, as well as the unculturable allopatric B genome strains.

The 193-base pair Gsg2 (haspin) promoter region regulates germ cell-specific expression bidirectionally and synchronously.

PubMed

Tokuhiro, Keizo; Miyagawa, Yasushi; Yamada, Shuichi; Hirose, Mika; Ohta, Hiroshi; Nishimune, Yoshitake; Tanaka, Hiromitsu

2007-03-01

Haspin is a unique protein kinase expressed predominantly in haploid male germ cells. The genomic structure of haspin (Gsg2) has revealed it to be intronless, and the entire transcription unit is in an intron of the integrin alphaE (Itgae) gene. Transcription occurs from a bidirectional promoter that also generates an alternatively spliced integrin alphaE-derived mRNA (Aed). In mice, the testis-specific alternative splicing of Aed is expressed bidirectionally downstream from the Gsg2 transcription initiation site, and a segment consisting of 26 bp transcribes both genomic DNA strands between Gsg2 and the Aed transcription initiation sites. To investigate the mechanisms for this unique gene regulation, we cloned and characterized the Gsg2 promoter region. The 193-bp genomic fragment from the 5' end of the Gsg2 and Aed genes, fused with EGFP and DsRed genes, drove the expression of both proteins in haploid germ cells of transgenic mice. This promoter element contained only a GC-rich sequence, and not the previously reported DNA sequences known to bind various transcription factors--with the exception of E2F1, TCFAP2A1 (AP2), and SP1. Here, we show that the 193-bp DNA sequence is sufficient for the specific, bidirectional, and synchronous expression in germ cells in the testis. We also demonstrate the existence of germ cell nuclear factors specifically bound to the promoter sequence. This activity may be regulated by binding to the promoter sequence with germ cell-specific nuclear complex(es) without regulation via DNA methylation.
Evolution of insect proteomes: insights into synapse organization and synaptic vesicle life cycle

PubMed Central

Yanay, Chava; Morpurgo, Noa; Linial, Michal

2008-01-01

Background: The molecular components in synapses that are essential to the life cycle of synaptic vesicles are well characterized. Nonetheless, many aspects of synaptic processes, in particular how they relate to complex behaviour, remain elusive. The genomes of flies, mosquitoes, the honeybee and the beetle are now fully sequenced and span an evolutionary breadth of about 350 million years; this provides a unique opportunity to conduct a comparative genomics study of the synapse. Results: We compiled a list of 120 gene prototypes that comprise the core of presynaptic structures in insects. Insects lack several scaffolding proteins in the active zone, such as bassoon and piccolo, and the most abundant protein in the mammalian synaptic vesicle, namely synaptophysin. The pattern of evolution of synaptic protein complexes is analyzed. According to this analysis, the components of presynaptic complexes as well as proteins that take part in organelle biogenesis are tightly coordinated. Most synaptic proteins are involved in rich protein interaction networks. Overall, the number of interacting proteins and the degrees of sequence conservation between human and insects are closely correlated. Such a correlation holds for exocytotic but not for endocytotic proteins. Conclusion: This comparative study of human with insects sheds light on the composition and assembly of protein complexes in the synapse. Specifically, the nature of the protein interaction graphs differentiates exocytotic from endocytotic proteins and suggests unique evolutionary constraints for each set. General principles in the design of proteins of the presynaptic site can be inferred from a comparative study of human and insect genomes. PMID:18257909
Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fernandez-Fueyo, Elena; Ruiz-Duenas, Francisco J.; Ferreira, Patricia

Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium.
Comparative genomics of Ceriporiopsis subvermispora and Phanerochaete chrysosporium provide insight into selective ligninolysis

PubMed Central

Fernandez-Fueyo, Elena; Ruiz-Dueñas, Francisco J.; Ferreira, Patricia; Floudas, Dimitrios; Hibbett, David S.; Canessa, Paulo; Larrondo, Luis F.; James, Tim Y.; Seelenfreund, Daniela; Lobos, Sergio; Polanco, Rubén; Tello, Mario; Honda, Yoichi; Watanabe, Takahito; Watanabe, Takashi; Ryu, Jae San; Kubicek, Christian P.; Schmoll, Monika; Gaskell, Jill; Hammel, Kenneth E.; St. John, Franz J.; Vanden Wymelenberg, Amber; Sabat, Grzegorz; Splinter BonDurant, Sandra; Syed, Khajamohiddin; Yadav, Jagjit S.; Doddapaneni, Harshavardhan; Subramanian, Venkataramanan; Lavín, José L.; Oguiza, José A.; Perez, Gumer; Pisabarro, Antonio G.; Ramirez, Lucia; Santoyo, Francisco; Master, Emma; Coutinho, Pedro M.; Henrissat, Bernard; Lombard, Vincent; Magnuson, Jon Karl; Kües, Ursula; Hori, Chiaki; Igarashi, Kiyohiko; Samejima, Masahiro; Held, Benjamin W.; Barry, Kerrie W.; LaButti, Kurt M.; Lapidus, Alla; Lindquist, Erika A.; Lucas, Susan M.; Riley, Robert; Salamov, Asaf A.; Hoffmeister, Dirk; Schwenk, Daniel; Hadar, Yitzhak; Yarden, Oded; de Vries, Ronald P.; Wiebenga, Ad; Stenlid, Jan; Eastwood, Daniel; Grigoriev, Igor V.; Berka, Randy M.; Blanchette, Robert A.; Kersten, Phil; Martinez, Angel T.; Vicuna, Rafael; Cullen, Dan

2012-01-01

Efficient lignin depolymerization is unique to the wood decay basidiomycetes, collectively referred to as white rot fungi. Phanerochaete chrysosporium simultaneously degrades lignin and cellulose, whereas the closely related species, Ceriporiopsis subvermispora, also depolymerizes lignin but may do so with relatively little cellulose degradation. To investigate the basis for selective ligninolysis, we conducted comparative genome analysis of C. subvermispora and P. chrysosporium. Genes encoding manganese peroxidase numbered 13 and five in C. subvermispora and P. chrysosporium, respectively. In addition, the C. subvermispora genome contains at least seven genes predicted to encode laccases, whereas the P. chrysosporium genome contains none. We also observed expansion of the number of C. subvermispora desaturase-encoding genes putatively involved in lipid metabolism. Microarray-based transcriptome analysis showed substantial up-regulation of several desaturase and MnP genes in wood-containing medium. MS identified MnP proteins in C. subvermispora culture filtrates, but none in P. chrysosporium cultures. These results support the importance of MnP and a lignin degradation mechanism whereby cleavage of the dominant nonphenolic structures is mediated by lipid peroxidation products. Two C. subvermispora genes were predicted to encode peroxidases structurally similar to P. chrysosporium lignin peroxidase and, following heterologous expression in Escherichia coli, the enzymes were shown to oxidize high redox potential substrates, but not Mn2+. Apart from oxidative lignin degradation, we also examined cellulolytic and hemicellulolytic systems in both fungi. In summary, the C. subvermispora genetic inventory and expression patterns exhibit increased oxidoreductase potential and diminished cellulolytic capability relative to P. chrysosporium. PMID:22434909
Novel Positive-Sense, Single-Stranded RNA (+ssRNA) Virus with Di-Cistronic Genome from Intestinal Content of Freshwater Carp (Cyprinus carpio)

PubMed Central

Pankovics, Péter; Simmonds, Peter

2011-01-01

A novel positive-sense, single-stranded RNA (+ssRNA) virus (Halastavi árva RNA virus, HalV; JN000306) with di-cistronic genome organization was serendipitously identified in intestinal contents of freshwater carp (Cyprinus carpio) caught by line-fishing from the fishpond “Lőrinte halastó” located in Veszprém County, Hungary. The complete nucleotide (nt) sequence of the genomic RNA is 9565 nt in length and contains two long, non-in-frame open reading frames (ORFs), which are separated by an intergenic region. ORF1 (replicase) is preceded by an untranslated sequence of 827 nt, while an untranslated region of 139 nt follows ORF2 (capsid proteins). The deduced amino acid (aa) sequences of the ORFs showed only low (less than 32%) and partial similarity to the non-structural (2C-like helicase, 3C-like cysteine protease and 3D-like RNA-dependent RNA polymerase) and structural proteins (VP2/VP4/VP3) of virus families in Picornavirales, especially to viruses with a dicistronic genome. Halastavi árva RNA virus is present in intestinal contents of omnivorous freshwater carp, but the origin and the host species of this virus remain unknown. The unique viral sequence and the actual position indicate that Halastavi árva RNA virus seems to be the first member of a new di-cistronic ssRNA virus. Further studies are required to investigate the specific host species (and spectrum), ecology and role of Halastavi árva RNA virus in nature. PMID:22195010
Structural variation and rates of genome evolution in the grass family seen through comparison of sequences of genomes greatly differing in size.

PubMed

Dvorak, Jan; Wang, Le; Zhu, Tingting; Jorgensen, Chad M; Deal, Karin R; Dai, Xiongtao; Dawson, Matthew W; Müller, Hans-Georg; Luo, Ming-Cheng; Ramasamy, Ramesh K; Dehghani, Hamid; Gu, Yong Q; Gill, Bikram S; Distelfeld, Assaf; Devos, Katrien M; Qi, Peng; You, Frank M; Gulick, Patrick J; McGuire, Patrick E

2018-05-16

Homology was searched with genes annotated in the Aegilops tauschii pseudomolecules against genes annotated in the pseudomolecules of tetraploid wild emmer wheat, Brachypodium distachyon, sorghum, and rice. Similar searches were initiated with genes annotated in the rice pseudomolecules. Matrices of colinear genes and rearrangements in their order were constructed. Optical Bionano genome maps were constructed and used to validate rearrangements unique to the wild emmer and Ae. tauschii genomes. Most common rearrangements were short paracentric inversions and short intrachromosomal translocations. Intrachromosomal translocations outnumbered segmental intrachromosomal duplications. The densities of paracentric inversion lengths were approximated by exponential distributions in all six genomes. Densities of colinear genes along the Ae. tauschii chromosomes were highly correlated with meiotic recombination rates but those of rearrangements were not, suggesting different causes of the erosion of gene colinearity and evolution of major chromosome rearrangements. Frequent rearrangements sharing breakpoints suggested that chromosomes have been rearranged recurrently at some sites. The distal 4 Mb of the short arms of rice chromosomes Os11 and Os12 and corresponding regions in the sorghum, B. distachyon, and Triticeae genomes contain clusters of interstitial translocations including from 1 to 7 colinear genes. The rates of acquisition of major rearrangements were greater in the wild emmer wheat and Ae. tauschii genomes than in the lineage preceding their divergence or in the B. distachyon, rice, and sorghum lineages. It is suggested that synergy between large quantities of dynamic transposable elements and annual growth habit caused the fast evolution of the Triticeae genomes. This article is protected by copyright. All rights reserved.
Draft genome sequence of non-shiga toxin-producing Escherichia coli O157 NCCP15738.

PubMed

Kwon, Taesoo; Kim, Jung-Beom; Bak, Young-Seok; Yu, Young-Bin; Kwon, Ki Sung; Kim, Won; Cho, Seung-Hak

2016-01-01

The non-shiga toxin-producing Escherichia coli (non-STEC) O157 is a pathogenic strain that causes diarrhea but does not cause hemolytic-uremic syndrome or hemorrhagic colitis. Here, we present the 5-Mb draft genome sequence of non-STEC O157 NCCP15738, which was isolated from the feces of a Korean patient with diarrhea, and describe its features and the structural basis for its genome evolution. A total of 565 Mbp of paired-end reads was generated using the Illumina HiSeq 2000 platform. The reads were assembled de novo into 135 scaffolds. The assembled genome size of NCCP15738 was 5,005,278 bp with an N50 value of 142,450 bp and 50.65% G+C content. Using Rapid Annotation using Subsystem Technology analysis, we predicted 4780 ORFs and 31 RNA genes. The evolutionary tree was inferred from multiple sequence alignment of 45 E. coli strains. The most closely related neighbor of NCCP15738 indicated by whole-genome phylogeny was E. coli UMNK88, but that indicated by multilocus sequence analysis was E. coli DH1(ME8569). A comparison between the NCCP15738 genome and those of the reference strains E. coli K-12 substr. MG1655 and EHEC O157:H7 EDL933 by bioinformatics analyses revealed unique genes in NCCP15738 associated with lysis protein S, two-component signal transduction system, conjugation, the flagellum, nucleotide-binding proteins, and metal-ion binding proteins. Notably, NCCP15738 has a dual flagella system like that in Vibrio parahaemolyticus, Aeromonas spp., and Rhodospirillum centenum. The draft genome sequence and the results of bioinformatics analysis of NCCP15738 provide the basis for understanding the genomic evolution of this strain.
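The N50 value quoted in this abstract is a standard assembly-contiguity statistic: the length of the scaffold at which the running total, counted from the longest scaffold downward, first reaches half of the assembly size. The sketch below illustrates that definition on a hypothetical list of scaffold lengths (the real 135 scaffold lengths are not given in the abstract). It also estimates mean sequencing depth on the assumption that "565 Mbp" is the total number of read bases. The function names are illustrative only.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length


def mean_coverage(total_read_bases, assembly_size):
    """Approximate mean depth: sequenced bases divided by assembly size."""
    return total_read_bases / assembly_size


# Hypothetical toy assembly, NOT the NCCP15738 scaffolds.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds)

# Figures taken from the abstract; treating 565 Mbp as total read bases
# is an assumption.
approx_coverage = mean_coverage(565_000_000, 5_005_278)
print(toy_n50, round(approx_coverage, 1))
```

Under that assumption the depth comes out at a little over 100-fold, which is a plausible range for an Illumina bacterial draft assembly.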
Comparative Analysis of the Genomes of Two Field Isolates of the Rice Blast Fungus Magnaporthe oryzae

PubMed Central

Li, Zhigang; Hu, Songnian; Yao, Nan; Dean, Ralph A.; Zhao, Wensheng; Shen, Mi; Zhang, Haiwang; Li, Chao; Liu, Liyuan; Cao, Lei; Xu, Xiaowen; Xing, Yunfei; Hsiang, Tom; Zhang, Ziding; Xu, Jin-Rong; Peng, You-Liang

2012-01-01

Rice blast caused by Magnaporthe oryzae is one of the most destructive diseases of rice worldwide. The fungal pathogen is notorious for its ability to overcome host resistance. To better understand its genetic variation in nature, we sequenced the genomes of two field isolates, Y34 and P131. In comparison with the previously sequenced laboratory strain 70-15, both field isolates had a similar genome size but slightly more genes. Sequences from the field isolates were used to improve genome assembly and gene prediction of 70-15. Although the overall genome structure is similar, a number of gene families that are likely involved in plant-fungal interactions are expanded in the field isolates. Genome-wide analysis of nonsynonymous to synonymous nucleotide substitution rates revealed that many infection-related genes underwent diversifying selection. The field isolates also have hundreds of isolate-specific genes and a number of isolate-specific gene duplication events. Functional characterization of randomly selected isolate-specific genes revealed that they play diverse roles, some of which affect virulence. Furthermore, each genome contains thousands of loci of transposon-like elements, but less than 30% of them are conserved among different isolates, suggesting active transposition events in M. oryzae. A total of approximately 200 genes were disrupted in these three strains by transposable elements. Interestingly, transposon-like elements tend to be associated with isolate-specific or duplicated sequences. Overall, our results indicate that gain or loss of unique genes, DNA duplication, gene family expansion, and frequent translocation of transposon-like elements are important factors in genome variation of the rice blast fungus. PMID:22876203
The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.).

PubMed

Yang, Meng; Zhang, Xiaowei; Liu, Guiming; Yin, Yuxin; Chen, Kaifu; Yun, Quanzheng; Zhao, Duojun; Al-Mssallem, Ibrahim S; Yu, Jun

2010-09-15

Date palm (Phoenix dactylifera L.), a member of the Arecaceae family, is one of the three major economically important woody palms--the two other palms being oil palm and coconut tree--and its fruit is a staple food among Middle East and North African nations, as well as many other tropical and subtropical regions. Here we report a complete sequence of the date palm chloroplast (cp) genome based on pyrosequencing. After extracting 369,022 cp sequencing reads from our whole-genome-shotgun data, we put together an assembly and validated it with intensive PCR-based verification, coupled with PCR product sequencing. The date palm cp genome is 158,462 bp in length and has a typical quadripartite structure of the large (LSC, 86,198 bp) and small single-copy (SSC, 17,712 bp) regions separated by a pair of inverted repeats (IRs, 27,276 bp). Similar to what has been found among most angiosperms, the date palm cp genome harbors 112 unique genes and 19 duplicated fragments in the IR regions. The junctions between LSC/IRs and SSC/IRs show different features of sequence expansion in evolution. We identified 78 SNPs as major intravarietal polymorphisms within the population of a specific cp genome, most of which were located in genes with vital functions. Based on RNA-sequencing data, we also found 18 polycistronic transcription units and three highly expression-biased genes--atpF, trnA-UGC, and rrn23. Unlike most monocots, date palm has a typical cp genome similar to that of tobacco--with little rearrangement and gene loss or gain. High-throughput sequencing technology facilitates the identification of intravarietal variations in cp genomes among different cultivars. Moreover, transcriptomic analysis of cp genes provides clues for uncovering regulatory mechanisms of transcription and translation in chloroplasts.
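The region sizes in this abstract can be checked against the reported total, since a quadripartite chloroplast genome consists of one LSC, one SSC, and two copies of the IR. The short sketch below does that arithmetic, reading the quoted 27,276 bp as the length of a single IR copy (an assumption that the sum supports). The function name is illustrative only.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular cp genome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the abstract.
LSC, SSC, IR = 86_198, 17_712, 27_276

total = quadripartite_length(LSC, SSC, IR)
ir_fraction = 2 * IR / total  # share of the genome occupied by both IR copies

print(total)                   # 158462, matching the reported genome length
print(round(ir_fraction, 3))   # roughly a third of the genome lies in the IRs
```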
Genome Features of “Dark-Fly”, a Drosophila Line Reared Long-Term in a Dark Environment

PubMed Central

Zhou, Jun; Sugiyama, Yuzo; Nishimura, Osamu; Aizu, Tomoyuki; Toyoda, Atsushi; Fujiyama, Asao; Agata, Kiyokazu

2012-01-01

Organisms are remarkably adapted to diverse environments by specialized metabolisms, morphology, or behaviors. To address the molecular mechanisms underlying environmental adaptation, we have utilized a Drosophila melanogaster line, termed “Dark-fly”, which has been maintained in constant dark conditions for 57 years (1400 generations). We found that Dark-fly exhibited higher fecundity in dark than in light conditions, indicating that Dark-fly possesses some traits advantageous in darkness. Using next-generation sequencing technology, we determined the whole genome sequence of Dark-fly and identified approximately 220,000 single nucleotide polymorphisms (SNPs) and 4,700 insertions or deletions (InDels) in the Dark-fly genome compared to the genome of the Oregon-R-S strain, a control strain. Of the SNPs, 1.8% were classified as non-synonymous SNPs (nsSNPs: i.e., they alter the amino acid sequence of gene products). Among them, we detected 28 nonsense mutations (i.e., they produce a stop codon in the protein sequence) in the Dark-fly genome. These included genes encoding an olfactory receptor and a light receptor. We also searched for runs of homozygosity (ROH) regions as putative regions selected during the population history, and found 21 ROH regions in the Dark-fly genome. We identified 241 genes carrying nsSNPs or InDels in the ROH regions. These include a cluster of alpha-esterase genes that are involved in detoxification processes. Furthermore, analysis of structural variants in the Dark-fly genome showed the deletion of a gene related to fatty acid metabolism. Our results revealed unique features of the Dark-fly genome and provided a list of potential candidate genes involved in environmental adaptation. PMID:22432011
A cool tool for hot and sour Archaea: proteomics of Sulfolobus solfataricus.

PubMed

Kort, Julia Christin; Esser, Dominik; Pham, Trong Khoa; Noirel, Josselin; Wright, Phillip C; Siebers, Bettina

2013-10-01

In recent years, much progress has been made in proteomic studies to unravel metabolic pathways and basic cellular processes. This is especially interesting for members of the Archaea, the third domain of life. Archaea exhibit extraordinary features and many of their cultivable representatives are adaptable to extreme environments. Archaea harbor many unique traits besides bacterial attributes, such as size, shape, and DNA structure, and eukaryal characteristics like information processing. Sulfolobus solfataricus P2, a thermoacidophilic archaeal representative, is a well-established model organism adapted to low-pH environments (pH 2-3) and high temperatures (80°C). The genome has a size of 3 Mbp and its sequence has been deciphered. Approximately 3033 predicted open reading frames have been identified and the genome is characterized by a great number of diverse insertion sequence elements. In unraveling the organism's metabolism and lifestyle, proteomic analyses have played a major role. Much effort has been directed at this organism and is reviewed here. With the help of proteomics, unique metabolic pathways were resolved in S. solfataricus, targets for regulatory protein phosphorylation identified, and cellular responses upon virus infection as well as oxidative stress analyzed. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Revealing the transcriptomic complexity of switchgrass by PacBio long-read sequencing.

PubMed

Zuo, Chunman; Blow, Matthew; Sreedasyam, Avinash; Kuo, Rita C; Ramamoorthy, Govindarajan Kunde; Torres-Jerez, Ivone; Li, Guifen; Wang, Mei; Dilworth, David; Barry, Kerrie; Udvardi, Michael; Schmutz, Jeremy; Tang, Yuhong; Xu, Ying

2018-01-01

Switchgrass (Panicum virgatum L.) is an important bioenergy crop widely used for lignocellulosic research. While extensive transcriptomic analyses have been conducted on this species using short read-based sequencing techniques, very little has been reliably derived regarding alternatively spliced (AS) transcripts. We present an analysis of transcriptomes of six switchgrass tissue types pooled together, sequenced using Pacific Biosciences (PacBio) single-molecule long-read technology. Our analysis identified 105,419 unique transcripts covering 43,570 known genes and 8795 previously unknown genes. Of these, 45,168 are novel transcripts of known genes. A total of 60,096 AS transcripts are identified, 45,628 being novel. We have also predicted 1549 transcripts of genes involved in cell wall construction and remodeling, 639 being novel transcripts of known cell wall genes. Most of the predicted transcripts are validated against Illumina-based short reads. Specifically, 96% of the splice junction sites in all the unique transcripts are validated by at least five Illumina reads. Comparisons between genes derived from our identified transcripts and the current genome annotation revealed that among the gene set predicted by both analyses, 16,640 have different exon-intron structures. Overall, a substantial amount of new information is derived from the PacBio RNA data regarding both the transcriptome and the genome of switchgrass.
Cryptic Population Structuring and the Role of the Isthmus of Tehuantepec as a Gene Flow Barrier in the Critically Endangered Central American River Turtle

PubMed Central

González-Porter, Gracia P.; Maldonado, Jesús E.; Flores-Villela, Oscar; Vogt, Richard C.; Janke, Axel; Fleischer, Robert C.; Hailer, Frank

2013-01-01

The critically endangered Central American River Turtle (Dermatemys mawii) is the only remaining member of the Dermatemydidae family, yet little is known about its population structuring. In a previous study of mitochondrial (mt) DNA in the species, three main lineages were described. One lineage (Central) was dominant across most of the range, while two other lineages were restricted to Papaloapan (PAP; isolated by the Isthmus of Tehuantepec and the Sierra de Santa Marta) or the south-eastern part of the range (1D). Here we provide data from seven polymorphic microsatellite loci and the R35 intron to re-evaluate these findings using DNA from the nuclear genome. Based on a slightly expanded data set of a total of 253 samples from the same localities, we find that mtDNA and nuclear DNA markers yield a highly congruent picture of the evolutionary history and population structuring of D. mawii. While resolution provided by the R35 intron (sequenced for a subset of the samples) was very limited, the microsatellite data revealed pronounced population structuring. Within the Grijalva-Usumacinta drainage basin, however, many populations separated by more than 300 kilometers showed signals of high gene flow. Across the entire range, neither mitochondrial nor nuclear DNA show a significant isolation-by-distance pattern, but both genomes highlight that the D. mawii population in the Papaloapan basin is genetically distinctive. Further, both marker systems detect unique genomic signals in four individuals with mtDNA clade 1D sampled on the southeast edge of the Grijalva-Usumacinta basin. These individuals may represent a separate cryptic taxon that is likely impacted by recent admixture. PMID:24086253
Isolation and Characterization of Metallosphaera turreted icosahedral virus (MTIV), a founding member of a new family of archaeal viruses.

PubMed

Wagner, Cassia; Reddy, Vijay; Asturias, Francisco; Khoshouei, Maryam; Johnson, John E; Manrique, Pilar; Munson-McGee, Jacob; Baumeister, Wolfgang; Lawrence, C Martin; Young, Mark J

2017-08-02

Our understanding of archaeal virus diversity and structure is just beginning to emerge. Here we describe a new archaeal virus, tentatively named Metallosphaera turreted icosahedral virus (MTIV), that was isolated from an acidic hot spring in Yellowstone National Park, USA. Two strains of the virus were identified and found to replicate in an archaeal host species closely related to Metallosphaera yellowstonensis. Each strain has a 9.8-9.9 kb linear dsDNA genome with large inverted terminal repeats. Each genome encodes 21 ORFs. Between the strains the ORFs display high homology, but they are quite distinct from other known viral genes. The 70-nm diameter virion is built upon a T=28 icosahedral lattice. Both single particle cryo-electron microscopy and cryo-tomography reconstructions reveal an unusual structure that has 42 turret-like projections: 12 from each of the 5-fold axes and 30 hexameric units positioned on icosahedral 2-fold axes. Both the virion structural properties and genome content support MTIV as the founding member of a new family of archaeal viruses. Importance: Many archaeal viruses are quite different from viruses infecting bacteria and eukaryotes. Initial characterization of MTIV reveals a virus distinct from other known bacterial, eukaryotic, and archaeal viruses; this finding suggests that viruses infecting Archaea are still an understudied group of viruses. As the first known virus infecting Metallosphaera, MTIV provides a new system for exploring archaeal virology by examining host-virus interactions and the unique features of MTIV structure-function relationships. These studies will likely expand our understanding of virus ecology and evolution. Copyright © 2017 American Society for Microbiology.
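The "T=28" label in this abstract refers to the triangulation number of Caspar-Klug theory, T = h² + hk + k² for non-negative integer lattice steps h and k. The sketch below finds which (h, k) pair yields T = 28, and checks that the 42 turrets equal the number of 5-fold vertices plus the number of 2-fold edges on an icosahedron. The helper names are illustrative only, and this is a geometric consistency check rather than a description of how the authors analysed the structure.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice steps h and k."""
    return h * h + h * k + k * k


def lattice_indices(t, limit=10):
    """All (h, k) with h >= k >= 0 and h > 0 whose triangulation number equals t."""
    return [
        (h, k)
        for h in range(1, limit + 1)
        for k in range(0, h + 1)
        if triangulation_number(h, k) == t
    ]


# An icosahedron has 12 vertices (5-fold axes) and 30 edges (2-fold axes).
FIVE_FOLD_AXES = 12
TWO_FOLD_AXES = 30

pairs_for_28 = lattice_indices(28)
turrets = FIVE_FOLD_AXES + TWO_FOLD_AXES

print(pairs_for_28)  # [(4, 2)], since 16 + 8 + 4 = 28
print(turrets)       # 42, matching the turret count in the abstract
```

Because h and k differ and both are non-zero, a T=28 lattice is skewed and exists in two mirror-image forms, (4, 2) and (2, 4).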
Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont

PubMed Central

Lindsey, Amelia R. I.; Werren, John H.; Richards, Stephen; Stouthamer, Richard

2016-01-01

Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. PMID:27194801
Comparative Genomics of a Parthenogenesis-Inducing Wolbachia Symbiont.

PubMed

Lindsey, Amelia R I; Werren, John H; Richards, Stephen; Stouthamer, Richard

2016-07-07

Wolbachia is an intracellular symbiont of invertebrates responsible for inducing a wide variety of phenotypes in its host. These host-Wolbachia relationships span the continuum from reproductive parasitism to obligate mutualism, and provide a unique system to study genomic changes associated with the evolution of symbiosis. We present the genome sequence from a parthenogenesis-inducing Wolbachia strain (wTpre) infecting the minute parasitoid wasp Trichogramma pretiosum. The wTpre genome is the most complete parthenogenesis-inducing Wolbachia genome available to date. We used comparative genomics across 16 Wolbachia strains, representing five supergroups, to identify a core Wolbachia genome of 496 sets of orthologous genes. Only 14 of these sets are unique to Wolbachia when compared to other bacteria from the Rickettsiales. We show that the B supergroup of Wolbachia, of which wTpre is a member, contains a significantly higher number of ankyrin repeat-containing genes than other supergroups. In the wTpre genome, there is evidence for truncation of the protein coding sequences in 20% of ORFs, mostly as a result of frameshift mutations. The wTpre strain represents a conversion from cytoplasmic incompatibility to a parthenogenesis-inducing lifestyle, and is required for reproduction in the Trichogramma host it infects. We hypothesize that the large number of coding frame truncations has accompanied the change in reproductive mode of the wTpre strain. Copyright © 2016 Lindsey et al.
Draft genome analysis provides insights into the fiber yield, crude protein biosynthesis, and vegetative growth of domesticated ramie (Boehmeria nivea L. Gaud).

PubMed

Liu, Chan; Zeng, Liangbin; Zhu, Siyuan; Wu, Lingqing; Wang, Yanzhou; Tang, Shouwei; Wang, Hongwu; Zheng, Xia; Zhao, Jian; Chen, Xiaorong; Dai, Qiuzhong; Liu, Touming

2017-11-15

Plentiful bast fiber, a high crude protein content, and vigorous vegetative growth make ramie a popular fiber and forage crop. Here, we report the draft genome of ramie, along with a genomic comparison and evolutionary analysis. The draft genome contained a sequence of approximately 335.6 Mb with 42,463 predicted genes. A high-density genetic map with 4,338 single nucleotide polymorphisms (SNPs) was developed and used to anchor the genome sequence, thus, creating an integrated genetic and physical map containing a 58.2-Mb genome sequence and 4,304 molecular markers. A genomic comparison identified 1,075 unique gene families in ramie, containing 4,082 genes. Among these unique genes, five were cellulose synthase genes that were specifically expressed in stem bark, and 3 encoded a WAT1-related protein, suggesting that they are probably related to high bast fiber yield. An evolutionary analysis detected 106 positively selected genes, 22 of which were related to nitrogen metabolism, indicating that they are probably responsible for the crude protein content and vegetative growth of domesticated varieties. This study is the first to characterize the genome and develop a high-density genetic map of ramie and provides a basis for the genetic and molecular study of this crop. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Giraffe genome sequence reveals clues to its unique morphology and physiology

PubMed Central

Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

2016-01-01

The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions. PMID:27187213
Nucleoprotein from the unique human infecting Orthobunyavirus of Simbu serogroup (Oropouche virus) forms higher order oligomers in complex with nucleic acids in vitro.

PubMed

Murillo, Juliana Londoño; Cabral, Aline Diniz; Uehara, Mabel; da Silva, Viviam Moura; Dos Santos, Juliete Vitorino; Muniz, João Renato Carvalho; Estrozi, Leandro Farias; Fenel, Daphna; Garcia, Wanius; Sperança, Márcia Aparecida

2018-06-01

Oropouche virus (OROV) is the only known human pathogen belonging to serogroup Simbu of the Orthobunyavirus genus and Bunyaviridae family. OROV is transmitted by wild mosquito species to sloths, rodents, monkeys and birds in sylvatic environments, and by midges (Culicoides paraensis and Culex quinquefasciatus) to humans, causing explosive outbreaks in urban locations. OROV infection causes dengue fever-like symptoms and, in a few cases, can cause clinical symptoms of aseptic meningitis. OROV contains a tripartite negative RNA genome encapsidated by the viral nucleocapsid protein (NP), which is essential for viral genome encapsidation, transcription and replication. Here, we report the first study on the structural properties of a recombinant NP from the human pathogen Oropouche virus (OROV-rNP). OROV-rNP was successfully expressed in E. coli in soluble form and purified using affinity and size-exclusion chromatography. Purified OROV-rNP was analyzed using a series of biophysical tools and molecular modeling. The results showed that OROV-rNP formed stable oligomers in solution coupled with endogenous E. coli nucleic acids (RNA) of different sizes. Finally, electron microscopy revealed a total of eleven OROV-rNP oligomer classes, with tetramers (42%) and pentamers (43%) as the two main populations and minor amounts of other larger oligomeric states, such as hexamers, heptamers or octamers. The different RNA sizes and nucleotide composition may explain the diversity of oligomer classes observed. In addition, structural differences among bunyavirus NPs can be used to help in the development of tools for specific diagnosis and epidemiological studies of this group of viruses.
Insights into Morphology and Disease from the Dog Genome Project

PubMed Central

Schoenebeck, Jeffrey J.; Ostrander, Elaine A.

2017-01-01

Although most modern dog breeds are less than 200 years old, the symbiosis between man and dog is ancient. Since prehistoric times, repeated selection events have transformed the wolf into man’s guardians, laborers, athletes, and companions. The rapid transformation from pack predator to loyal companion is a feat that is arguably unique among domesticated animals. How this transformation came to pass remained a biological mystery until recently: Within the past decade, the deployment of genomic approaches to study population structure, detect signatures of selection, and identify genetic variants that underlie canine phenotypes has brought into focus novel biological mechanisms that make dogs remarkable. Ironically, the very practices responsible for breed formation also spurred morbidity; today, many diseases are correlated with breed identity. In this review, we discuss man’s best friend in the context of a genetic model to understand paradigms of heritable phenotypes, both desirable and disadvantageous. PMID:25062362

The Nucleolus: In Genome Maintenance and Repair.

PubMed

Tsekrekou, Maria; Stratigi, Kalliopi; Chatzinikolaou, Georgia

2017-07-01

The nucleolus is the subnuclear membrane-less organelle where rRNA is transcribed and processed and ribosomal assembly occurs. During the last 20 years, however, the nucleolus has emerged as a multifunctional organelle, regulating processes that go well beyond its traditional role. Moreover, the unique organization of rDNA in tandem arrays and its unusually high transcription rates make it prone to unscheduled DNA recombination events and frequent RNA:DNA hybrids leading to DNA double strand breaks (DSBs). If not properly repaired, rDNA damage may contribute to premature disease onset and aging. Deregulation of ribosomal synthesis at any level from transcription and processing to ribosomal subunit assembly elicits a stress response and is also associated with disease onset. Here, we discuss how genome integrity is maintained within nucleoli and how such structures are functionally linked to nuclear DNA damage response and repair giving an emphasis on the newly emerging roles of the nucleolus in mammalian physiology and disease.
Protoparvovirus Knocking at the Nuclear Door.

PubMed

Mäntylä, Elina; Kann, Michael; Vihinen-Ranta, Maija

2017-10-02

Protoparvoviruses target the nucleus due to their dependence on the cellular reproduction machinery during the replication and expression of their single-stranded DNA genome. In recent years, our understanding of the multistep process of the capsid nuclear import has improved, and led to the discovery of unique viral nuclear entry strategies. Preceded by endosomal transport, endosomal escape and microtubule-mediated movement to the vicinity of the nuclear envelope, the protoparvoviruses interact with the nuclear pore complexes. The capsids are transported actively across the nuclear pore complexes using nuclear import receptors. The nuclear import is sometimes accompanied by structural changes in the nuclear envelope, and is completed by intranuclear disassembly of capsids and chromatinization of the viral genome. This review discusses the nuclear import strategies of protoparvoviruses and describes its dynamics comprising active and passive movement, and directed and diffusive motion of capsids in the molecularly crowded environment of the cell.
Giant Viruses of Amoebae: A Journey Through Innovative Research and Paradigm Changes.

PubMed

Colson, Philippe; La Scola, Bernard; Raoult, Didier

2017-09-29

Giant viruses of amoebae were discovered serendipitously in 2003; they are visible via optical microscopy, making them bona fide microbes. Their lifestyle, structure, and genomes break the mold of classical viruses. Giant viruses of amoebae are complex microorganisms. Their genomes harbor between 444 and 2,544 genes, including many that are unique to viruses, and encode translation components; their virions contain >100 proteins as well as mRNAs. Mimiviruses have a specific mobilome, including virophages, provirophages, and transpovirons, and can resist virophages through a system known as MIMIVIRE (mimivirus virophage resistance element). Giant viruses of amoebae bring upheaval to the definition of viruses and tend to separate the current virosphere into two categories: very simple viruses and viruses with complexity similar to that of other microbes. This new paradigm is propitious for enhanced detection and characterization of giant viruses of amoebae, and a particular focus on their role in humans is warranted.
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
The mitochondrial genome of the ascalaphid owlfly Libelloides macaronius and comparative evolutionary mitochondriomics of neuropterid insects

PubMed Central

2011-01-01

Background The insect order Neuroptera encompasses more than 5,700 described species. To date, only three neuropteran mitochondrial genomes have been fully and one partly sequenced. Current knowledge on neuropteran mitochondrial genomes is limited, and new data are strongly required. In the present work, the mitochondrial genome of the ascalaphid owlfly Libelloides macaronius is described and compared with the known neuropterid mitochondrial genomes: Megaloptera, Neuroptera and Raphidioptera. These analyses are further extended to other endopterygotan orders. Results The mitochondrial genome of L. macaronius is a circular molecule 15,890 bp long. It includes the entire set of 37 genes usually present in animal mitochondrial genomes. The gene order of this newly sequenced genome is unique among Neuroptera and differs from the ancestral type of insects in the translocation of trnC. The L. macaronius genome shows the lowest A+T content (74.50%) among known neuropterid genomes. Protein-coding genes possess the typical mitochondrial start codons, except for cox1, which has an unusual ACG. Comparisons among endopterygotan mitochondrial genomes showed that A+T content and AT/GC-skews exhibit a broad range of variation among 84 analyzed taxa. Comparative analyses showed that neuropterid mitochondrial protein-coding genes experienced complex evolutionary histories, involving features ranging from codon usage to rate of substitution, that make them potential markers for population genetics/phylogenetics studies at different taxonomic ranks. The 22 tRNAs show variable substitution patterns in Neuropterida, with higher sequence conservation in genes located on the α strand. Inferred secondary structures for neuropterid rrnS and rrnL genes largely agree with those known for other insects. For the first time, a model is provided for domain I of an insect rrnL. 
The control region in Neuropterida, as in other insects, is a fast-evolving genomic region characterized by AT-rich motifs. Conclusions The new genome shares many features with known neuropteran genomes but differs in its low A+T content. Comparative analysis of neuropterid mitochondrial genes showed that they experienced distinct evolutionary patterns. Both tRNA families and ribosomal RNAs show composite substitution pathways. The neuropterid mitochondrial genome is characterized by a complex evolutionary history. PMID:21569260
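The composition statistics quoted in this record (A+T content, AT-skew, GC-skew) can be illustrated with a minimal sketch. It assumes the commonly used definitions AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C); the abstract does not state which formulas the authors applied, and the function name `composition_stats` and the toy sequence are hypothetical.

```python
from collections import Counter


def composition_stats(seq: str) -> dict:
    """Return A+T content (%) and AT/GC skews for one strand of a DNA sequence.

    Assumed definitions (not taken from the abstract):
      AT-skew = (A - T) / (A + T),  GC-skew = (G - C) / (G + C)
    Characters other than A, C, G and T (e.g. N) are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_content": 100.0 * (a + t) / total,
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }


# Toy sequence for illustration only; not real mitochondrial data.
stats = composition_stats("AAATGC")
print(stats)
```

Applied to a full mitochondrial genome sequence, the same function would return the kind of A+T percentage reported above (74.50% for L. macaronius), provided the strand convention matches the one used by the authors.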
Genomic Resources of Three Pulsatilla Species Reveal Evolutionary Hotspots, Species-Specific Sites and Variable Plastid Structure in the Family Ranunculaceae.

PubMed

Szczecińska, Monika; Sawicki, Jakub

2015-09-15

The European continent is presently colonized by nine species of the genus Pulsatilla, five of which are encountered only in mountainous regions of southwest and south-central Europe. The remaining four species inhabit lowlands in the north-central and eastern parts of the continent. Most plants of the genus Pulsatilla are rare and endangered, which is why most research efforts focused on their biology, ecology and hybridization. The objective of this study was to develop genomic resources, including complete plastid genomes and nuclear rRNA clusters, for three sympatric Pulsatilla species that are most commonly found in Central Europe. The results will supply valuable information about genetic variation, which can be used in the process of designing primers for population studies and conservation genetics research. The complete plastid genomes together with the nuclear rRNA cluster can serve as a useful tool in hybridization studies. Six complete plastid genomes and nuclear rRNA clusters were sequenced from three species of Pulsatilla using the Illumina sequencing technology. Four junctions between single copy regions and inverted repeats and junctions between the identified locally-collinear blocks (LCB) were confirmed by Sanger sequencing. Pulsatilla genomes of 120 unique genes had a total length of approximately 161-162 kb, and 21 were duplicated in the inverted repeats (IR) region. Comparative plastid genomes of newly-sequenced Pulsatilla and the previously-identified plastomes of Aconitum and Ranunculus species belonging to the family Ranunculaceae revealed several variations in the structure of the genome, but the gene content remained constant. The nuclear rRNA cluster (18S-ITS1-5.8S-ITS2-26S) of studied Pulsatilla species is 5795 bp long. Among five analyzed regions of the rRNA cluster, only Internal Transcribed Spacer 2 (ITS2) enabled the molecular delimitation of closely-related Pulsatilla patens and Pulsatilla vernalis. 
The determination of complete plastid genome and nuclear rRNA cluster sequences in three species of the genus Pulsatilla is an important contribution to our knowledge of the evolution and phylogeography of those endangered taxa. The resulting data can be used to identify regions that are particularly useful for barcoding, phylogenetic and phylogeographic studies. The investigated taxa can be identified at each stage of development based on their species-specific SNPs. The nuclear and plastid genomic resources enable advanced studies on hybridization, including identification of parent species, including their roles in that process. The identified nonsynonymous mutations could play an important role in adaptations to changing environments. The results of the study will also provide valuable information about the evolution of the plastome structure in the family Ranunculaceae.
Genomic Resources of Three Pulsatilla Species Reveal Evolutionary Hotspots, Species-Specific Sites and Variable Plastid Structure in the Family Ranunculaceae

PubMed Central

Szczecińska, Monika; Sawicki, Jakub

2015-01-01

Background: The European continent is presently colonized by nine species of the genus Pulsatilla, five of which are encountered only in mountainous regions of southwest and south-central Europe. The remaining four species inhabit lowlands in the north-central and eastern parts of the continent. Most plants of the genus Pulsatilla are rare and endangered, which is why most research efforts focused on their biology, ecology and hybridization. The objective of this study was to develop genomic resources, including complete plastid genomes and nuclear rRNA clusters, for three sympatric Pulsatilla species that are most commonly found in Central Europe. The results will supply valuable information about genetic variation, which can be used in the process of designing primers for population studies and conservation genetics research. The complete plastid genomes together with the nuclear rRNA cluster can serve as a useful tool in hybridization studies. Methodology/principal findings: Six complete plastid genomes and nuclear rRNA clusters were sequenced from three species of Pulsatilla using the Illumina sequencing technology. Four junctions between single copy regions and inverted repeats and junctions between the identified locally-collinear blocks (LCB) were confirmed by Sanger sequencing. Pulsatilla genomes of 120 unique genes had a total length of approximately 161–162 kb, and 21 were duplicated in the inverted repeats (IR) region. Comparative plastid genomes of newly-sequenced Pulsatilla and the previously-identified plastomes of Aconitum and Ranunculus species belonging to the family Ranunculaceae revealed several variations in the structure of the genome, but the gene content remained constant. The nuclear rRNA cluster (18S-ITS1-5.8S-ITS2-26S) of studied Pulsatilla species is 5795 bp long. 
Among five analyzed regions of the rRNA cluster, only Internal Transcribed Spacer 2 (ITS2) enabled the molecular delimitation of closely-related Pulsatilla patens and Pulsatilla vernalis. Conclusions/significance: The determination of complete plastid genome and nuclear rRNA cluster sequences in three species of the genus Pulsatilla is an important contribution to our knowledge of the evolution and phylogeography of those endangered taxa. The resulting data can be used to identify regions that are particularly useful for barcoding, phylogenetic and phylogeographic studies. The investigated taxa can be identified at each stage of development based on their species-specific SNPs. The nuclear and plastid genomic resources enable advanced studies on hybridization, including identification of parent species, including their roles in that process. The identified nonsynonymous mutations could play an important role in adaptations to changing environments. The results of the study will also provide valuable information about the evolution of the plastome structure in the family Ranunculaceae. PMID:26389887
The resurrection genome of Boea hygrometrica: A blueprint for survival of dehydration.

PubMed

Xiao, Lihong; Yang, Ge; Zhang, Liechi; Yang, Xinhua; Zhao, Shuang; Ji, Zhongzhong; Zhou, Qing; Hu, Min; Wang, Yu; Chen, Ming; Xu, Yu; Jin, Haijing; Xiao, Xuan; Hu, Guipeng; Bao, Fang; Hu, Yong; Wan, Ping; Li, Legong; Deng, Xin; Kuang, Tingyun; Xiang, Chengbin; Zhu, Jian-Kang; Oliver, Melvin J; He, Yikun

2015-05-05

"Drying without dying" is an essential trait in land plant evolution. Unraveling how a unique group of angiosperms, the Resurrection Plants, survive desiccation of their leaves and roots has been hampered by the lack of a foundational genome perspective. Here we report the ∼1,691-Mb sequenced genome of Boea hygrometrica, an important resurrection plant model. The sequence revealed evidence for two historical genome-wide duplication events, a complement of 49,374 protein-coding genes, 29.15% of which are unique (orphan) to Boea and 20% of which (9,888) significantly respond to desiccation at the transcript level. Expansion of early light-inducible protein (ELIP) and 5S rRNA genes highlights the importance of the protection of the photosynthetic apparatus during drying and the rapid resumption of protein synthesis in the resurrection capability of Boea. Transcriptome analysis reveals extensive alternative splicing of transcripts and a focus on cellular protection strategies. The lack of desiccation tolerance-specific genome organizational features suggests the resurrection phenotype evolved mainly by an alteration in the control of dehydration response genes.
Incoming human papillomavirus type 16 genome resides in a vesicular compartment throughout mitosis.

PubMed

DiGiuseppe, Stephen; Luszczek, Wioleta; Keiffer, Timothy R; Bienkowska-Haba, Malgorzata; Guion, Lucile G M; Sapp, Martin J

2016-05-31

During the entry process, the human papillomavirus (HPV) capsid is trafficked to the trans-Golgi network (TGN), whereupon it enters the nucleus during mitosis. We previously demonstrated that the minor capsid protein L2 assumes a transmembranous conformation in the TGN. Here we provide evidence that the incoming viral genome dissociates from the TGN and associates with microtubules after the onset of mitosis. Deposition onto mitotic chromosomes is L2-mediated. Using differential staining of an incoming viral genome by small molecular dyes in selectively permeabilized cells, nuclease protection, and flotation assays, we found that HPV resides in a membrane-bound vesicle until mitosis is completed and the nuclear envelope has reformed. As a result, expression of the incoming viral genome is delayed. Taken together, these data provide evidence that HPV has evolved a unique strategy for delivering the viral genome to the nucleus of dividing cells. Furthermore, it is unlikely that nuclear vesicles are unique to HPV, and thus we may have uncovered a hitherto unrecognized cellular pathway that may be of interest for future cell biological studies.
Personalized Medicine in a New Genomic Era: Ethical and Legal Aspects.

PubMed

Shoaib, Maria; Rameez, Mansoor Ali Merchant; Hussain, Syed Ather; Madadin, Mohammed; Menezes, Ritesh G

2017-08-01

The genomes of two completely unrelated individuals are quite similar apart from minor variations called single nucleotide polymorphisms, which contribute to the uniqueness of each and every person. These single nucleotide polymorphisms are of great clinical interest because they help determine the susceptibility of certain individuals to particular diseases and explain varied responses to pharmacological interventions. This gives rise to the idea of 'personalized medicine' as an exciting new therapeutic science in this genomic era. Personalized medicine suggests a unique treatment strategy based on an individual's genetic make-up. Its key principles revolve around applied pharmaco-genomics, pharmaco-kinetics and pharmaco-proteomics. Herein, the ethical and legal aspects of personalized medicine in a new genomic era are briefly addressed. The ultimate goal is to comprehensively recognize all relevant forms of genetic variation in each individual and be able to interpret this information in a clinically meaningful manner within the ambit of ethical and legal considerations. The authors of this article firmly believe that personalized medicine has the potential to revolutionize the current landscape of medicine as it makes its way into clinical practice.
Complete genome sequence of uropathogenic Escherichia coli isolate UPEC 26-1.

PubMed

Subhadra, Bindu; Kim, Dong Ho; Kim, Jaeseok; Woo, Kyungho; Sohn, Kyung Mok; Kim, Hwa-Jung; Han, Kyudong; Oh, Man Hwan; Choi, Chul Hee

2018-06-01

Urinary tract infections (UTIs) are among the most common infections in humans, predominantly caused by uropathogenic Escherichia coli (UPEC). The diverse genomes of UPEC strains mostly impede disease prevention and control measures. In this study, we comparatively analyzed the whole genome sequence of a highly virulent UPEC strain, namely UPEC 26-1, which was isolated from a urine sample of a patient suffering from UTI in Korea. Whole genome analysis showed that the genome consists of one circular chromosome of 5,329,753 bp, comprising 5064 protein-coding genes, 122 RNA genes (94 tRNA, 22 rRNA and 6 ncRNA genes), and 100 pseudogenes, with an average G+C content of 50.56%. In addition, we identified 8 prophage regions (5 intact, 2 incomplete and 1 questionable) and 63 genomic islands (GIs), suggesting the possibility of horizontal gene transfer in this strain. Comparative genome analysis of UPEC 26-1 with the UPEC strain CFT073 revealed an average nucleotide identity of 99.7%. The genome comparison with CFT073 reveals major differences in the genome of UPEC 26-1 that would explain its increased virulence and biofilm formation. Nineteen of the total GIs were unique to UPEC 26-1 compared to CFT073 and nine of them harbored unique genes that are involved in virulence, multidrug resistance, biofilm formation and bacterial pathogenesis. The data from this study will assist in future studies of UPEC strains to develop effective control measures.
Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies.

PubMed

Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric

2017-02-01

Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
phiGENOME: an integrative navigation throughout bacteriophage genomes.

PubMed

Stano, Matej; Klucar, Lubos

2011-11-01

phiGENOME is a web-based genome browser that generates dynamic and interactive graphical representations of phage genomes stored in phiSITE, a database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and was optimised for visualisation of phage genomes with an emphasis on gene regulatory elements. phiGENOME consists of three components: (i) a genome map viewer built using Adobe Flash technology, providing a dynamic and interactive graphical display of phage genomes; (ii) a sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features at the sequence level; and (iii) a regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulation. With 542 complete genome sequences accompanied by rich annotations and references, phiGENOME is a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes

PubMed Central

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-01-01

Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
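The Smith–Waterman algorithm named in this record is a dynamic programming method for local sequence alignment. The following is a minimal sketch of its scoring recurrence only, assuming a simple match/mismatch score and a linear gap penalty. The parameter values and the function name `smith_waterman_score` are illustrative assumptions; the abstract does not describe the project's actual implementation, substitution matrix or gap model.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only: a fixed match/mismatch score and a linear
    gap penalty, rather than a protein substitution matrix with affine gaps.
    """
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols      # current row; column 0 stays 0
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap         # gap in b
            left = curr[j - 1] + gap   # gap in a
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences for illustration only.
print(smith_waterman_score("XXACGTXX", "ACGT"))  # 8: four matches at +2 each
```

Because every cell of the matrix is filled, the cost grows with the product of the two sequence lengths, which is why an exhaustive all-against-all comparison of millions of sequences calls for the kind of distributed computing effort the abstract describes.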
Genome-scale model reveals metabolic basis of biomass partitioning in a model diatom

DOE PAGES

Levering, Jennifer; Broddrick, Jared; Dupont, Christopher L.; ...

2016-05-06

Diatoms are eukaryotic microalgae that contain genes from various sources, including bacteria and the secondary endosymbiotic host. Due to this unique combination of genes, diatoms are taxonomically and functionally distinct from other algae and vascular plants and confer novel metabolic capabilities. Based on the genome annotation, we performed a genome-scale metabolic network reconstruction for the marine diatom Phaeodactylum tricornutum. Due to their endosymbiotic origin, diatoms possess a complex chloroplast structure which complicates the prediction of subcellular protein localization. Based on previous work we implemented a pipeline that exploits a series of bioinformatics tools to predict protein localization. The manually curated reconstructed metabolic network iLB1027_lipid accounts for 1,027 genes associated with 4,456 reactions and 2,172 metabolites distributed across six compartments. To constrain the genome-scale model, we determined the organism specific biomass composition in terms of lipids, carbohydrates, and proteins using Fourier transform infrared spectrometry. Our simulations indicate the presence of a yet unknown glutamine-ornithine shunt that could be used to transfer reducing equivalents generated by photosynthesis to the mitochondria. Furthermore, the model reflects the known biochemical composition of P. tricornutum in defined culture conditions and enables metabolic engineering strategies to improve the use of P. tricornutum for biotechnological applications.
Microbial Genomics of a Host-Associated Commensal Bacterium in Fragmented Populations of Endangered Takahe.

PubMed

Grange, Zoë L; Gartrell, Brett D; Biggs, Patrick J; Nelson, Nicola J; Anderson, Marti; French, Nigel P

2016-05-01

Isolation of wildlife into fragmented populations as a consequence of anthropogenic-mediated environmental change may alter host-pathogen relationships. Our understanding of some of the epidemiological features of infectious disease in vulnerable populations can be enhanced by the use of commensal bacteria as a proxy for invasive pathogens in natural ecosystems. The distinctive population structure of a well-described meta-population of a New Zealand endangered flightless bird, the takahe (Porphyrio hochstetteri), provided a unique opportunity to investigate the influence of host isolation on enteric microbial diversity. The genomic epidemiology of a prevalent rail-associated endemic commensal bacterium was explored using core genome and ribosomal multilocus sequence typing (rMLST) of 70 Campylobacter sp. nova 1 isolated from one third of the takahe population resident in multiple locations. While there was evidence of recombination between lineages, bacterial divergence appears to have occurred and multivariate analysis of 52 rMLST genes revealed location-associated differentiation of C. sp. nova 1 sequence types. Our results indicate that fragmentation and anthropogenic manipulation of populations can influence host-microbial relationships, with potential implications for niche adaptation and the evolution of micro-organisms in remote environments. This study provides a novel framework in which to explore the complex genomic epidemiology of micro-organisms in wildlife populations.
Using Arabidopsis to understand centromere function: progress and prospects.

PubMed

Copenhaver, Gregory P

2003-01-01

Arabidopsis thaliana has emerged in recent years as a leading model for understanding the structure and function of higher eukaryotic centromeres. Arabidopsis centromeres, like those of virtually all higher eukaryotes, encompass large DNA domains consisting of a complex combination of unique, dispersed middle repetitive and highly repetitive DNA. For this reason, they have required creative analysis using molecular, genetic, cytological and genomic techniques. This synergy of approaches, reinforced by rapid progress in understanding how proteins interact with the centromere DNA to form a complete functional unit, has made Arabidopsis one of the best understood centromere systems. Yet major problems remain to be solved: gaining a complete structural definition of the centromere has been surprisingly difficult, and developing synthetic mini-chromosomes in plants has been even more challenging.
Putative and unique gene sequence utilization for the design of species specific probes as modeled by Lactobacillus plantarum

USDA-ARS's Scientific Manuscript database

The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
Genotyping-by-Sequencing Analysis for Determining Population Structure of Finger Millet Germplasm of Diverse Origins.

PubMed

Kumar, Anil; Sharma, Divya; Tiwari, Apoorv; Jaiswal, J P; Singh, N K; Sood, Salej

2016-07-01

Finger millet [Eleusine coracana (L.) Gaertn.] is grown mainly by subsistence farmers in arid and semiarid regions of the world. To broaden its genetic base and to boost its production, it is of paramount importance to characterize and genotype the diverse gene pool of this important food and nutritional security crop. However, because the genome sequence of finger millet has not been available, little progress has been made in understanding the molecular basis of the crop's unique qualities. In the present investigation, a genetically diverse collection of 113 finger millet accessions was characterized through whole-genome genotyping-by-sequencing (GBS), which resulted in a genome-wide set of 23,000 single-nucleotide polymorphisms (SNPs) segregating across the entire collection and several thousand SNPs segregating within every accession. A model-based population structure analysis reveals the presence of three subpopulations among the finger millet accessions, consistent with the results of phylogenetic analysis. The observed population structure is consistent with the hypothesis that finger millet was domesticated first in Africa, and from there it was introduced to India some 3000 yr ago. A total of 1128 gene ontology (GO) terms were assigned to SNP-carrying genes for three main categories: biological process, cellular component, and molecular function. Facilitated access to high-throughput genotyping and sequencing technologies is likely to improve the breeding process in developing countries, and as such, these data will be very useful to breeders who are working for the genetic improvement of finger millet. Copyright © 2016 Crop Science Society of America.
Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.

PubMed

Sahl, Jason W; Johnson, J Kristie; Harris, Anthony D; Phillippy, Adam M; Hsiao, William W; Thom, Kerri A; Rasko, David A

2011-06-04

Acinetobacter baumannii has recently emerged as a significant global pathogen, with a surprisingly rapid acquisition of antibiotic resistance and spread within hospitals and health care institutions. This study examines the genomic content of three A. baumannii strains isolated from distinct body sites. Isolates from blood, peri-anal, and wound sources were examined in an attempt to identify genetic features that could be correlated to each isolation source. Pulsed-field gel electrophoresis, multi-locus sequence typing and antibiotic resistance profiles demonstrated genotypic and phenotypic variation. Each isolate was sequenced to high-quality draft status, which allowed for comparative genomic analyses with existing A. baumannii genomes. A high resolution, whole genome alignment method detailed the phylogenetic relationships of sequenced A. baumannii and found no correlation between phylogeny and body site of isolation. This method identified genomic regions unique to both those isolates found on the surface of the skin or in wounds, termed colonization isolates, and those identified from body fluids, termed invasive isolates; these regions may play a role in the pathogenesis and spread of this important pathogen. A PCR-based screen of 74 A. baumannii isolates demonstrated that these unique genes are not exclusive to either phenotype or isolation source; however, a conserved genomic region exclusive to all sequenced A. baumannii was identified and verified. The results of the comparative genome analysis and PCR assay show that A. baumannii is a diverse and genomically variable pathogen that appears to have the potential to cause a range of human disease regardless of the isolation source.

Crystal structure of reverse gyrase: insights into the positive supercoiling of DNA

PubMed Central

Rodríguez, A. Chapin; Stock, Daniela

2002-01-01

Reverse gyrase is the only topoisomerase known to positively supercoil DNA. The protein appears to be unique to hyperthermophiles, where its activity is believed to protect the genome from denaturation. The 120 kDa enzyme is the only member of the type I topoisomerase family that requires ATP, which is bound and hydrolysed by a helicase-like domain. We have determined the crystal structure of reverse gyrase from Archaeoglobus fulgidus in the presence and absence of nucleotide cofactor. The structure provides the first view of an intact supercoiling enzyme, explains mechanistic differences from other type I topoisomerases and suggests a model for how the two domains of the protein cooperate to positively supercoil DNA. Coordinates have been deposited in the Protein Data Bank under accession codes 1GKU and 1GL9. PMID:11823434
Comparative Genomics of Carp Herpesviruses

PubMed Central

Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

2013-01-01

Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803
Complete mitochondrial genome of Ostrea denselamellosa (Bivalvia, Ostreidae).

PubMed

Yu, Hong; Kong, Lingfeng; Li, Qi

2016-01-01

The complete mitochondrial (mt) genome of the flat oyster, Ostrea denselamellosa, was determined using Long-PCR and genome walking techniques in this study. The total length of the mt genome sequence of O. denselamellosa was 16,227 bp, which is the smallest reported Ostreidae mt genome to date. It contained 12 protein-coding genes (lacking ATP8), 23 transfer RNA genes, and two ribosomal RNA genes. A bias towards a higher representation of nucleotides A and T (60.7%) was detected in the mt genome of O. denselamellosa. The rrnL was split into two fragments (3' half, 711 bp; 5' half, 509 bp), which seems to be a unique characteristic of Ostreidae mt genomes.
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession numbers EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional functional annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a high-performance, web-based, user-friendly relational database called the EST model database (ESTMD), version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
Structural insights into 5′ flap DNA unwinding and incision by the human FAN1 dimer

NASA Astrophysics Data System (ADS)

Zhao, Qi; Xue, Xiaoyu; Longerich, Simonne; Sung, Patrick; Xiong, Yong

2014-12-01

Human FANCD2-associated nuclease 1 (FAN1) is a DNA structure-specific nuclease involved in the processing of DNA interstrand crosslinks (ICLs). FAN1 maintains genomic stability and prevents tissue decline in multiple organs, yet it confers ICL-induced anti-cancer drug resistance in several cancer subtypes. Here we report three crystal structures of human FAN1 in complex with a 5′ flap DNA substrate, showing that two FAN1 molecules form a head-to-tail dimer to locate the lesion, orient the DNA and unwind a 5′ flap for subsequent incision. Biochemical experiments further validate our model for FAN1 action, as structure-informed mutations that disrupt protein dimerization, substrate orientation or flap unwinding impair the structure-specific nuclease activity. Our work elucidates essential aspects of FAN1-DNA lesion recognition and a unique mechanism of incision. These structural insights shed light on the cellular mechanisms underlying organ degeneration protection and cancer drug resistance mediated by FAN1.
A novel member of the split betaalphabeta fold: Solution structure of the hypothetical protein YML108W from Saccharomyces cerevisiae.

PubMed

Pineda-Lucena, Antonio; Liao, Jack C C; Cort, John R; Yee, Adelinda; Kennedy, Michael A; Edwards, Aled M; Arrowsmith, Cheryl H

2003-05-01

As part of the Northeast Structural Genomics Consortium pilot project focused on small eukaryotic proteins and protein domains, we have determined the NMR structure of the protein encoded by ORF YML108W from Saccharomyces cerevisiae. YML108W belongs to one of the numerous structural proteomics targets whose biological function is unknown. Moreover, this protein does not have sequence similarity to any other protein. The NMR structure of YML108W consists of a four-stranded beta-sheet with strand order 2143 and two alpha-helices, with an overall topology of betabetaalphabetabetaalpha. Strand beta1 runs parallel to beta4, and beta2:beta1 and beta4:beta3 pairs are arranged in an antiparallel fashion. Although this fold belongs to the split betaalphabeta family, it appears to be unique among this family; it is a novel arrangement of secondary structure, thereby expanding the universe of protein folds.
Novel modes of RNA editing in mitochondria

PubMed Central

Moreira, Sandrine; Valach, Matus; Aoulad-Aissa, Mohamed; Otto, Christian; Burger, Gertraud

2016-01-01

Gene structure and expression in diplonemid mitochondria are unparalleled. Genes are fragmented into pieces (modules) that are separately transcribed, followed by the joining of module transcripts to contiguous RNAs. Some instances of unique uridine insertion RNA editing at module boundaries were noted, but the extent and potential occurrence of other editing types remained unknown. Comparative analysis of deep transcriptome and genome data from Diplonema papillatum mitochondria reveals ∼220 post-transcriptional insertions of uridines, but no insertions of other nucleotides nor deletions. In addition, we detect in total 114 substitutions of cytosine by uridine and adenosine by inosine, amassed into unusually compact clusters. Inosines in transcripts were confirmed experimentally. This is the first report of adenosine-to-inosine editing of mRNAs and ribosomal RNAs in mitochondria. In mRNAs, editing causes mostly amino-acid additions and non-synonymous substitutions; in ribosomal RNAs, it permits formation of canonical secondary structures. Two extensively edited transcripts were compared across four diplonemids. The pattern of uridine-insertion editing is strictly conserved, whereas substitution editing has diverged dramatically, while still rendering diplonemid proteins more similar to other eukaryotic orthologs. We posit that RNA editing not only compensates but also sustains, or even accelerates, ultra-rapid evolution of genome structure and sequence in diplonemid mitochondria. PMID:27001515
Genomic characterization of a core set of the USDA-NPGS Ethiopian sorghum germplasm collection: implications for germplasm conservation, evaluation, and utilization in crop improvement.

PubMed

Cuevas, Hugo E; Rosa-Valentin, Giseiry; Hayes, Chad M; Rooney, William L; Hoffmann, Leo

2017-01-26

The USDA Agriculture Research Service National Plant Germplasm System (NPGS) preserves the largest sorghum germplasm collection in the world, which includes 7,217 accessions from the center of diversity in Ethiopia. The characterization of this exotic germplasm at a genome-wide scale will improve conservation efforts and its utilization in research and breeding programs. Therefore, we phenotyped a representative core set of 374 Ethiopian accessions at two locations for agronomic traits and characterized the genomes. Using genotyping-by-sequencing, we identified 148,476 single-nucleotide polymorphism (SNP) markers distributed across the entire genome. Over half of the alleles were rare (frequency < 0.05). The genetic profile of each accession was unique (i.e., no duplicates), and the average genetic distance among accessions was 0.70. Based on population structure and cluster analyses, we separated the collection into 11 populations with pairwise FST values ranging from 0.11 to 0.47. In total, 198 accessions (53%) were assigned to one of these populations with an ancestry membership coefficient greater than 0.60; these covered 90% of the total genomic variation. We characterized these populations based on agronomic and seed compositional traits. We performed a cluster analysis with the sorghum association panel based on 26,026 SNPs and determined that nine of the Ethiopian populations expanded the genetic diversity in the panel. Genome-wide association analysis demonstrated that these low-coverage data and the observed population structure could be employed for the genomic dissection of important phenotypes in this core set of Ethiopian sorghum germplasm. The NPGS Ethiopian sorghum germplasm is a genetically and phenotypically diverse collection comprising 11 populations with high levels of admixture. Genetic associations with agronomic traits can be used to improve the screening of exotic germplasm for selection of specific populations.
We detected many rare alleles, suggesting that this germplasm contains potentially useful undiscovered alleles, but their discovery and characterization will require extensive effort. The genotypic data available for these accessions provide a valuable resource for sorghum breeders and geneticists to effectively improve crops.
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

PubMed

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.
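Note: the 16S rRNA tally in the abstract above (four copies in each of eight complete genomes plus one in each of 11 drafts, giving 43 sequences that collapse to 24 unique ones) can be illustrated with a minimal sketch. The function name and the toy sequences are hypothetical and are not taken from the study; only the copy-number arithmetic comes from the abstract.

```python
def count_unique_sequences(sequences):
    """Count distinct sequences after removing whitespace and case differences."""
    normalised = {"".join(s.split()).upper() for s in sequences}
    return len(normalised)


# Copy-number arithmetic stated in the abstract.
complete_genomes, copies_per_complete = 8, 4
draft_genomes, copies_per_draft = 11, 1
total_16s = complete_genomes * copies_per_complete + draft_genomes * copies_per_draft

# Toy input (hypothetical): three records, two of which are the same sequence.
toy = ["ACGTTGCA", "acgt tgca", "ACGTTGCC"]
n_unique = count_unique_sequences(toy)
```

Exact-match deduplication like this is only one possible reading of "unique sequences"; the abstract does not say how identity was defined.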
The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

2005-02-01

We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.
In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

PubMed Central

Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

2017-01-01

Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563
The mitochondrial genome sequences of the round goby and the sand goby reveal patterns of recent evolution in gobiid fish.

PubMed

Adrian-Kalchhauser, Irene; Svensson, Ola; Kutschera, Verena E; Alm Rosenblad, Magnus; Pippel, Martin; Winkler, Sylke; Schloissnig, Siegfried; Blomberg, Anders; Burkhardt-Holm, Patricia

2017-02-16

Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequences. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. This allowed us, among other things, to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.
Systematic CpT (ApG) depletion and CpG excess are unique genomic signatures of large DNA viruses infecting invertebrates.

PubMed

Upadhyay, Mohita; Sharma, Neha; Vivekanandan, Perumal

2014-01-01

Differences in the relative abundance of dinucleotides, if any, may provide important clues on host-driven evolution of viruses. We studied dinucleotide frequencies of large DNA viruses infecting vertebrates (n = 105; viruses infecting mammals = 99; viruses infecting aves = 6; viruses infecting reptiles = 1) and invertebrates (n = 88; viruses infecting insects = 84; viruses infecting crustaceans = 4). We have identified systematic depletion of CpT(ApG) dinucleotides and over-representation of CpG dinucleotides as the unique genomic signature of large DNA viruses infecting invertebrates. Detailed investigation of this unique genomic signature suggests the existence of invertebrate host-induced pressures specifically targeting CpT(ApG) and CpG dinucleotides. The depletion of CpT dinucleotides among large DNA viruses infecting invertebrates is, at least in part, explained by non-canonical DNA methylation by the infected host. Our findings highlight the role of invertebrate host-related factors in shaping virus evolution and they also provide the necessary framework for future studies on evolution, epigenetics and molecular biology of viruses infecting this group of hosts.
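Note: the abstract above does not state which statistic was used for "relative abundance of dinucleotides". One widely used measure is the observed/expected odds ratio, rho(XY) = f(XY) / (f(X) f(Y)), where values well below 1 suggest depletion and values well above 1 suggest excess. The sketch below assumes that measure and works on a single strand only; the function name and the toy sequence are hypothetical.

```python
from collections import Counter

BASES = frozenset("ACGT")


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: observed/expected ratio} for one strand of DNA.

    Positions holding anything other than A, C, G or T are ignored, and a
    dinucleotide is counted only when both of its bases are unambiguous.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    ratios = {}
    for pair, count in di.items():
        observed = count / n_di
        expected = (mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono)
        ratios[pair] = observed / expected
    return ratios


# Toy sequence (hypothetical): a strict ACGT repeat, so CG is enriched and CT is absent.
ratios = dinucleotide_odds_ratios("ACGT" * 10)
```

A strand-symmetric version, which is what a label such as CpT(ApG) implies, could be obtained by appending the reverse complement before counting; that step is left out here to keep the sketch short.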
Complete genome sequence of the acetylene-fermenting Pelobacter sp. strain SFB93

USGS Publications Warehouse

Sutton, John M.; Baesman, Shaun; Fierst, Janna L.; Poret-Peterson, Amisha T.; Oremland, Ronald S.; Dunlap, Darren S.; Akob, Denise M.

2017-01-01

Acetylene fermentation is a rare metabolism that was previously reported as being unique to Pelobacter acetylenicus. Here, we report the genome sequence of Pelobacter sp. strain SFB93, an acetylene-fermenting bacterium isolated from sediments collected in San Francisco Bay, CA.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chauhan, Archana; Layton, Alice; Williams, Daniel W

Pseudomonas fluorescens strain HK44 (DSM 6700) is a genetically engineered lux-based bioluminescent bioreporter. Here we report the draft genome sequence of strain HK44. Annotation of ~6.1 Mb sequence indicates that 30% of the traits are unique and distributed over 5 genomic islands, a prophage and two plasmids.
The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods

PubMed Central

Cao, Zhijian; Yu, Yao; Wu, Yingliang; Hao, Pei; Di, Zhiyong; He, Yawen; Chen, Zongyun; Yang, Weishan; Shen, Zhiyong; He, Xiaohua; Sheng, Jia; Xu, Xiaobo; Pan, Bohu; Feng, Jing; Yang, Xiaojuan; Hong, Wei; Zhao, Wenjuan; Li, Zhongjie; Huang, Kai; Li, Tian; Kong, Yimeng; Liu, Hui; Jiang, Dahe; Zhang, Binyan; Hu, Jun; Hu, Youtian; Wang, Bin; Dai, Jianliang; Yuan, Bifeng; Feng, Yuqi; Huang, Wei; Xing, Xiaojing; Zhao, Guoping; Li, Xuan; Li, Yixue; Li, Wenxin

2013-01-01

Representing a basal branch of arachnids, scorpions are known as ‘living fossils’ that maintain an ancient anatomy and are adapted to have survived extreme climate changes. Here we report the genome sequence of Mesobuthus martensii, containing 32,016 protein-coding genes, the most among sequenced arthropods. Although M. martensii appears to evolve conservatively, it has a greater gene family turnover than the insects that have undergone diverse morphological and physiological changes, suggesting the decoupling of the molecular and morphological evolution in scorpions. Underlying the long-term adaptation of scorpions is the expansion of the gene families enriched in basic metabolic pathways, signalling pathways, neurotoxins and cytochrome P450, and the different dynamics of expansion between the shared and the scorpion lineage-specific gene families. Genomic and transcriptomic analyses further illustrate the important genetic features associated with prey, nocturnal behaviour, feeding and detoxification. The M. martensii genome reveals a unique adaptation model of arthropods, offering new insights into the genetic bases of the living fossils. PMID:24129506
The genome of Th17 cell-inducing segmented filamentous bacteria reveals extensive auxotrophy and adaptations to the intestinal environment

PubMed Central

Sczesnak, Andrew; Segata, Nicola; Qin, Xiang; Gevers, Dirk; Petrosino, Joseph F.; Huttenhower, Curtis; Littman, Dan R.; Ivanov, Ivaylo I.

2011-01-01

Summary Perturbations of the composition of the symbiotic intestinal microbiota can have profound consequences for host metabolism and immunity. In mice, segmented filamentous bacteria (SFB) direct the accumulation of potentially pro-inflammatory Th17 cells in the intestinal lamina propria. We present the genome sequence of SFB isolated from mono-colonized mice, which classifies SFB phylogenetically as a unique member of Clostridiales with a highly reduced genome. Annotation analysis demonstrates that SFB depends on its environment for amino acids and essential nutrients and may utilize host and dietary glycans for carbon, nitrogen, and energy. Comparative analyses reveal that SFB is functionally related to members of the genus Clostridium and several pathogenic or commensal “minimal” genera, including Finegoldia, Mycoplasma, Borrelia, and Phytoplasma. However, SFB is functionally distinct from all 1,200 examined genomes, indicating a gene complement representing biology relatively unique to its role as a gut commensal closely tied to host metabolism and immunity. PMID:21925113
Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts

PubMed Central

Ma, Liang; Chen, Zehua; Huang, Da Wei; Kutty, Geetha; Ishihara, Mayumi; Wang, Honghui; Abouelleil, Amr; Bishop, Lisa; Davey, Emma; Deng, Rebecca; Deng, Xilong; Fan, Lin; Fantoni, Giovanna; Fitzgerald, Michael; Gogineni, Emile; Goldberg, Jonathan M.; Handley, Grace; Hu, Xiaojun; Huber, Charles; Jiao, Xiaoli; Jones, Kristine; Levin, Joshua Z.; Liu, Yueqin; Macdonald, Pendexter; Melnikov, Alexandre; Raley, Castle; Sassi, Monica; Sherman, Brad T.; Song, Xiaohong; Sykes, Sean; Tran, Bao; Walsh, Laura; Xia, Yun; Yang, Jun; Young, Sarah; Zeng, Qiandong; Zheng, Xin; Stephens, Robert; Nusbaum, Chad; Birren, Bruce W.; Azadi, Parastoo; Lempicki, Richard A.; Cuomo, Christina A.; Kovacs, Joseph A.

2016-01-01

Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses. PMID:26899007
Challenges and Strategies for Breeding Resistance in Capsicum annuum to the Multifarious Pathogen, Phytophthora capsici

PubMed Central

Barchenger, Derek W.; Lamour, Kurt H.; Bosland, Paul W.

2018-01-01

Phytophthora capsici is the most devastating pathogen for chile pepper production worldwide and current management strategies are not effective. The population structure of the pathogen is highly variable and few sources of widely applicable host resistance have been identified. Recent genomic advancements in the host and the pathogen provide important insights into the difficulties reported by epidemiological and physiological studies published over the past century. This review highlights important challenges unique to this complex pathosystem and suggests strategies for resistance breeding to help limit losses associated with P. capsici. PMID:29868083
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis

PubMed Central

Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667

Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples

PubMed Central

Liu, Zhandong; Venkatesh, Santosh S; Maley, Carlo C

2008-01-01

Background Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. Results We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while < 2% of 19 bp oligomers are present. Other species showed different ranges of > 98% to < 2% of possible oligomers in D. melanogaster (12–17 bp), C. elegans (11–17 bp), A. thaliana (11–17 bp), S. cerevisiae (10–16 bp) and E. coli (9–15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. 
Conclusion Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect novel microbes in human tissues. PMID:18973670
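Note: the "sequence space coverage" described in this abstract, that is, the share of all possible oligomers of a given length that actually occur in a genome, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: it scans one strand only, skips windows containing ambiguous bases, and keeps every observed k-mer in memory, which is workable for small k or short sequences but not for a whole vertebrate genome at k = 19. The function name is hypothetical.

```python
ACGT = frozenset("ACGT")


def kmer_space_coverage(seq, k):
    """Fraction of the 4**k possible k-mers that occur at least once in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows with N or other ambiguity codes.
        if ACGT.issuperset(kmer):
            seen.add(kmer)
    return len(seen) / 4 ** k


# Toy example (hypothetical): AC, CG, GT and TA are the only 2-mers present.
coverage = kmer_space_coverage("ACGTACGT", 2)  # 4 / 16 = 0.25
```

Repeating the calculation over a range of k would give the kind of transition the abstract reports, from nearly full coverage at short lengths to sparse coverage at longer ones.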
Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples.

PubMed

Liu, Zhandong; Venkatesh, Santosh S; Maley, Carlo C

2008-10-30

Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while < 2% of 19 bp oligomers are present. Other species showed different ranges of > 98% to < 2% of possible oligomers in D. melanogaster (12-17 bp), C. elegans (11-17 bp), A. thaliana (11-17 bp), S. cerevisiae (10-16 bp) and E. coli (9-15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. 
Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect novel microbes in human tissues.
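The probe criterion in this abstract (15-mers that are more than 1 nucleotide different from every 15-mer in the human genome) amounts to rejecting any candidate that matches a reference k-mer exactly or with one substitution. A minimal sketch follows; the helper names are hypothetical, mismatches are assumed to be substitutions only, and the reverse strand is ignored for brevity.

```python
def kmers(sequence: str, k: int) -> set:
    """Collect every k-mer on the given strand of `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def one_mismatch_neighbors(kmer: str):
    """Yield all strings at Hamming distance exactly 1 from `kmer`."""
    for i, base in enumerate(kmer):
        for alt in "ACGT":
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def is_distinct_probe(candidate: str, reference_kmers: set) -> bool:
    """True if `candidate` differs from every reference k-mer at 2+ positions."""
    if candidate in reference_kmers:
        return False
    return all(n not in reference_kmers
               for n in one_mismatch_neighbors(candidate))


if __name__ == "__main__":
    reference = kmers("ACGTACGT", 4)  # toy stand-in for a host genome
    for probe in ("ACGT", "ACGA", "TTTT"):
        print(probe, is_distinct_probe(probe, reference))
```

Checking the 3k one-substitution neighbors of each candidate against a hash set avoids comparing the candidate with every reference k-mer directly.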
The three-dimensional genome organization of Drosophila melanogaster through data integration.

PubMed

Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

2017-07-31

Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.
Breed-Specific Ancestry Studies and Genome-Wide Association Analysis Highlight an Association Between the MYH9 Gene and Heat Tolerance in Alaskan Sprint Racing Sled Dogs

PubMed Central

Huson, Heather J.; vonHoldt, Bridgett M.; Rimbault, Maud; Byers, Alexandra M.; Runstadler, Jonathan A.; Parker, Heidi G.; Ostrander, Elaine A.

2012-01-01

Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long distance racers, and combined that with genome wide association studies (GWAS) to identify regions correlating with performance enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5–10 and 2.5–3.75 kb, respectively). Further, we identified eight regions with the genomic signal either from a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog. PMID:22105876
Drechslerella stenobrocha genome illustrates the mechanism of constricting rings and the origin of nematode predation in fungi

PubMed Central

2014-01-01

Background Nematode-trapping fungi are a unique group of organisms that can capture nematodes using sophisticated trapping structures. The genome of Drechslerella stenobrocha, a constricting-ring-forming fungus, has been sequenced and reported, and provided new insights into the evolutionary origins of nematode predation in fungi, the trapping mechanisms, and the dual lifestyles of saprophagy and predation. Results The genome of the fungus Drechslerella stenobrocha, which mechanically traps nematodes using a constricting ring, was sequenced. The genome was 29.02 Mb in size and was found to contain rarer instances of transposons and repeat-induced point mutations than that of Arthrobotrys oligospora. The functional proteins involved in nematode infection, such as chitinases, subtilisins, and adhesive proteins, underwent a significant expansion in the A. oligospora genome, while there were fewer lectin genes that mediate fungus-nematode recognition in the D. stenobrocha genome. The carbohydrate-degrading enzyme catalogs in both species were similar to those of efficient cellulolytic fungi, suggesting a saprophytic origin of nematode-trapping fungi. In D. stenobrocha, the down-regulation of saprophytic enzyme genes and the up-regulation of infection-related genes during the capture of nematodes indicated a transition between dual life strategies of saprophagy and predation. The transcriptional profiles also indicated that trap formation was related to the protein kinase C (PKC) signal pathway and regulated by Zn(2)–C6 type transcription factors. Conclusions The genome of D. stenobrocha provides support for the hypothesis that nematode-trapping fungi evolved from saprophytic fungi in a high carbon and low nitrogen environment. It reveals the transition between saprophagy and predation of these fungi and also provides new insights into the mechanisms of mechanical trapping. PMID:24507587
RNA Structural Analysis by Evolving SHAPE Chemistry

PubMed Central

Spitale, Robert C.; Flynn, Ryan A.; Torre, Eduardo A.; Kool, Eric T.; Chang, Howard Y.

2017-01-01

RNA is central to the flow of biological information. From transcription to splicing, RNA localization, translation, and decay, RNA is intimately involved in regulating every step of the gene expression program, and is thus essential for health and understanding disease. RNA has the unique ability to base-pair with itself and other nucleic acids to form complex structures. Hence the information content in RNA is not simply its linear sequence of bases, but is also encoded in complex folding of RNA molecules. A general chemical functionality that all RNAs have is a 2’-hydroxyl group in the ribose ring, and the reactivity of the 2'-hydroxyl in RNA is gated by local nucleotide flexibility. In other words, the 2'-hydroxyl is reactive at single-stranded and conformationally flexible positions but is unreactive at nucleotides constrained by base pairing. Recent efforts have been focused on developing reagents that modify RNA as a function of RNA 2’ hydroxyl group flexibility. Such RNA structure probing techniques can be read out by primer extension in experiments termed RNA SHAPE (Selective 2’ Hydroxyl Acylation and Primer Extension). Herein we describe the efforts devoted to the design and utilization of SHAPE probes for characterizing RNA structure. We also describe current technological advances that are being used to utilize SHAPE chemistry with deep sequencing to probe many RNAs in parallel. The merger of chemistry with genomics is sure to open the door to genome-wide exploration of RNA structure and function. PMID:25132067
Comparative genomic analysis shows that Streptococcus suis meningitis isolate SC070731 contains a unique 105K genomic island.

PubMed

Wu, Zongfu; Wang, Weixue; Tang, Min; Shao, Jing; Dai, Chen; Zhang, Wei; Fan, Hongjie; Yao, Huochun; Zong, Jie; Chen, Dai; Wang, Junning; Lu, Chengping

2014-02-10

Streptococcus suis (SS) is an important swine pathogen worldwide that occasionally causes serious infections in humans. SS infection may result in meningitis in pigs and humans. The pathogenic mechanisms of SS are poorly understood. Here, we provide the complete genome sequence of S. suis serotype 2 (SS2) strain SC070731 isolated from a pig with meningitis. The chromosome is 2,138,568 bp in length. There are 1933 predicted protein coding sequences and 96.7% (57/59) of the known virulence-associated genes are present in the genome. Strain SC070731 showed virulence similar to that of SS2 virulent strains HA9801 and ZY05719, but was more virulent than SS2 virulent strain P1/7 in the zebrafish infection model. Comparative genomic analysis revealed a unique 105K genomic island in strain SC070731 that is absent in seven other sequenced SS2 strains. Further analysis of the 105K genomic island indicated that it contained a complete nisin locus similar to the nisin U locus in S. uberis strain 42, a prophage similar to S. oralis phage PH10 and several antibiotic resistance genes. Several proteins in the 105K genomic island, including nisin and RelBE toxin-antitoxin system, contribute to the bacterial fitness and virulence in other pathogenic bacteria. Further investigation of newly identified gene products, including four putative new virulence-associated surface proteins, will improve our understanding of SS pathogenesis. Copyright © 2013 Elsevier B.V. All rights reserved.
Large-Scale Bioinformatics Analysis of Bacillus Genomes Uncovers Conserved Roles of Natural Products in Bacterial Physiology.

PubMed

Grubbs, Kirk J; Bleich, Rachel M; Santa Maria, Kevin C; Allen, Scott E; Farag, Sherif; Shank, Elizabeth A; Bowers, Albert A

2017-01-01

Bacteria possess an amazing capacity to synthesize a diverse range of structurally complex, bioactive natural products known as specialized (or secondary) metabolites. Many of these specialized metabolites are used as clinical therapeutics, while others have important ecological roles in microbial communities. The biosynthetic gene clusters (BGCs) that generate these metabolites can be identified in bacterial genome sequences using their highly conserved genetic features. We analyzed an unprecedented 1,566 bacterial genomes from Bacillus species and identified nearly 20,000 BGCs. By comparing these BGCs to one another as well as a curated set of known specialized metabolite BGCs, we discovered that the majority of Bacillus natural products are comprised of a small set of highly conserved, well-distributed, known natural product compounds. Most of these metabolites have important roles influencing the physiology and development of Bacillus species. We identified, in addition to these characterized compounds, many unique, weakly conserved BGCs scattered across the genus that are predicted to encode unknown natural products. Many of these "singleton" BGCs appear to have been acquired via horizontal gene transfer. Based on this large-scale characterization of metabolite production in the Bacilli, we go on to connect the alkylpyrones, natural products that are highly conserved but previously biologically uncharacterized, to a role in Bacillus physiology: inhibiting spore development. IMPORTANCE Bacilli are capable of producing a diverse array of specialized metabolites, many of which have gained attention for their roles as signals that affect bacterial physiology and development. Up to this point, however, the Bacillus genus's metabolic capacity has been underexplored. We undertook a deep genomic analysis of 1,566 Bacillus genomes to understand the full spectrum of metabolites that this bacterial group can make.
We discovered that the majority of the specialized metabolites produced by Bacillus species are highly conserved, known compounds with important signaling roles in the physiology and development of this bacterium. Additionally, there is significant unique biosynthetic machinery distributed across the genus that might lead to new, unknown metabolites with diverse biological functions. Inspired by the findings of our genomic analysis, we speculate that the highly conserved alkylpyrones might have an important biological activity within this genus. We go on to validate this prediction by demonstrating that these natural products are developmental signals in Bacillus and act by inhibiting sporulation.
Gene-enriched draft genome of the cattle tick Rhipicephalus microplus: assembly by the hybrid Pacific Biosciences/Illumina approach enabled analysis of the highly repetitive genome.

PubMed

Barrero, Roberto A; Guerrero, Felix D; Black, Michael; McCooke, John; Chapman, Brett; Schilkey, Faye; Pérez de León, Adalberto A; Miller, Robert J; Bruns, Sara; Dobry, Jason; Mikhaylenko, Galina; Stormo, Keith; Bell, Callum; Tao, Quanzhou; Bogden, Robert; Moolhuijzen, Paula M; Hunter, Adam; Bellgard, Matthew I

2017-08-01

The genome of the cattle tick Rhipicephalus microplus, an ectoparasite with global distribution, is estimated to be 7.1 Gbp in length and consists of approximately 70% repetitive DNA. We report the draft assembly of a tick genome that utilized a hybrid sequencing and assembly approach to capture the repetitive fractions of the genome. Our hybrid approach produced an assembly consisting of 2.0 Gbp represented in 195,170 scaffolds with an N50 of 60,284 bp. The Rmi v2.0 assembly is 51.46% repetitive with a large fraction of unclassified repeats, short interspersed elements, long interspersed elements and long terminal repeats. We identified 38,827 putative R. microplus gene loci, of which 24,758 were protein coding genes (≥100 amino acids). OrthoMCL comparative analysis against 11 selected species including insects and vertebrates identified 10,835 and 3,423 protein coding gene loci that are unique to R. microplus or common to both R. microplus and Ixodes scapularis ticks, respectively. We identified 191 microRNA loci, of which 168 have similarity to known miRNAs and 23 represent novel miRNA families. We identified the genomic loci of several highly divergent R. microplus esterases with sequence similarity to acetylcholinesterase. Additionally, we report the finding of a novel cytochrome P450 CYP41 homolog that shows protein folding structures similar to those of CYP41 proteins known to be involved in acaricide resistance. Copyright © 2017 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.
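The N50 figure quoted in this abstract is a standard assembly-contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that definition is given below; the function name `n50` and the toy lengths are illustrative assumptions, not data from the Rmi v2.0 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are taken from longest to shortest until their cumulative
    length reaches half of the total; the last length taken is the N50.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe form of running >= total / 2
            return length


if __name__ == "__main__":
    toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]  # toy lengths in bp
    print(n50(toy_scaffolds))
```

Because a few very long scaffolds can dominate the statistic, N50 is usually reported together with the scaffold count and total assembly length, as the abstract does.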
Genomic Variation in Natural Populations of Drosophila melanogaster

PubMed Central

Langley, Charles H.; Stevens, Kristian; Cardeno, Charis; Lee, Yuh Chwen G.; Schrider, Daniel R.; Pool, John E.; Langley, Sasha A.; Suarez, Charlyn; Corbett-Detig, Russell B.; Kolaczkowski, Bryan; Fang, Shu; Nista, Phillip M.; Holloway, Alisha K.; Kern, Andrew D.; Dewey, Colin N.; Song, Yun S.; Hahn, Matthew W.; Begun, David J.

2012-01-01

This report of independent genome sequences of two natural populations of Drosophila melanogaster (37 from North America and 6 from Africa) provides unique insight into forces shaping genomic polymorphism and divergence. Evidence of interactions between natural selection and genetic linkage is abundant not only in centromere- and telomere-proximal regions, but also throughout the euchromatic arms. Linkage disequilibrium, which decays within 1 kbp, exhibits a strong bias toward coupling of the more frequent alleles and provides a high-resolution map of recombination rate. The juxtaposition of population genetics statistics in small genomic windows with gene structures and chromatin states yields a rich, high-resolution annotation, including the following: (1) 5′- and 3′-UTRs are enriched for regions of reduced polymorphism relative to lineage-specific divergence; (2) exons overlap with windows of excess relative polymorphism; (3) epigenetic marks associated with active transcription initiation sites overlap with regions of reduced relative polymorphism and relatively reduced estimates of the rate of recombination; (4) the rate of adaptive nonsynonymous fixation increases with the rate of crossing over per base pair; and (5) both duplications and deletions are enriched near origins of replication and their density correlates negatively with the rate of crossing over. Available demographic models of X and autosome descent cannot account for the increased divergence on the X and loss of diversity associated with the out-of-Africa migration. Comparison of the variation among these genomes to variation among genomes from D. simulans suggests that many targets of directional selection are shared between these species. PMID:22673804
Comparative Genomics Suggests an Independent Origin of Cytoplasmic Incompatibility in Cardinium hertigii

PubMed Central

Kelly, Suzanne E.; Cass, Bodil N.; Müller, Anneliese; Woyke, Tanja; Malfatti, Stephanie A.; Hunter, Martha S.; Horn, Matthias

2012-01-01

Terrestrial arthropods are commonly infected with maternally inherited bacterial symbionts that cause cytoplasmic incompatibility (CI). In CI, the outcome of crosses between symbiont-infected males and uninfected females is reproductive failure, increasing the relative fitness of infected females and leading to spread of the symbiont in the host population. CI symbionts have profound impacts on host genetic structure and ecology and may lead to speciation and the rapid evolution of sex determination systems. Cardinium hertigii, a member of the Bacteroidetes and symbiont of the parasitic wasp Encarsia pergandiella, is the only known bacterium other than the Alphaproteobacteria Wolbachia to cause CI. Here we report the genome sequence of Cardinium hertigii cEper1. Comparison with the genomes of CI–inducing Wolbachia pipientis strains wMel, wRi, and wPip provides a unique opportunity to pinpoint shared proteins mediating host cell interaction, including some candidate proteins for CI that have not previously been investigated. The genome of Cardinium lacks all major biosynthetic pathways but harbors a complete biotin biosynthesis pathway, suggesting a potential role for Cardinium in host nutrition. Cardinium lacks known protein secretion systems but encodes a putative phage-derived secretion system distantly related to the antifeeding prophage of the entomopathogen Serratia entomophila. Lastly, while Cardinium and Wolbachia genomes show only a functional overlap of proteins, they show no evidence of laterally transferred elements that would suggest common ancestry of CI in both lineages. Instead, comparative genomics suggests an independent evolution of CI in Cardinium and Wolbachia and provides a novel context for understanding the mechanistic basis of CI. PMID:23133394
The genome and structural proteome of YuA, a new Pseudomonas aeruginosa phage resembling M6.

PubMed

Ceyssens, Pieter-Jan; Mesyanzhinov, Vadim; Sykilinda, Nina; Briers, Yves; Roucourt, Bart; Lavigne, Rob; Robben, Johan; Domashin, Artem; Miroshnikov, Konstantin; Volckaert, Guido; Hertveldt, Kirsten

2008-02-01

Pseudomonas aeruginosa phage YuA (Siphoviridae) was isolated from a pond near Moscow, Russia. It has an elongated head, encapsulating a circularly permuted genome of 58,663 bp, and a flexible, noncontractile tail, which is terminally and subterminally decorated with short fibers. The YuA genome is neither Mu- nor lambda-like and encodes 78 gene products that cluster in three major regions involved in (i) DNA metabolism and replication, (ii) host interaction, and (iii) phage particle formation and host lysis. At the protein level, YuA displays significant homology with phages M6, phiJL001, 73, B3, DMS3, and D3112. Eighteen YuA proteins were identified as part of the phage particle by mass spectrometry analysis. Five different bacterial promoters were experimentally identified using a promoter trap assay, three of which have a sigma54-specific binding site and regulate transcription in the genome region involved in phage particle formation and host lysis. The dependency of these promoters on the host sigma54 factor was confirmed by analysis of an rpoN mutant strain of P. aeruginosa PAO1. At the DNA level, YuA is 91% identical to the recently (July 2007) annotated phage M6 of the Lindberg typing set. Despite this level of DNA homology throughout the genome, both phages combined have 15 unique genes that do not occur in the other phage. The genome organization of both phages differs substantially from those of the other known Pseudomonas-infecting Siphoviridae, delineating them as a distinct genus within this family.
Unique transposon landscapes are pervasive across Drosophila melanogaster genomes

PubMed Central

Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.

2015-01-01

To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579
Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome.

PubMed

Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

2015-06-19

Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.
Advancing stroke genomic research in the age of Trans-Omics big data science: Emerging priorities and opportunities.

PubMed

Owolabi, Mayowa; Peprah, Emmanuel; Xu, Huichun; Akinyemi, Rufus; Tiwari, Hemant K; Irvin, Marguerite R; Wahab, Kolawole Wasiu; Arnett, Donna K; Ovbiagele, Bruce

2017-11-15

We systematically reviewed the genetic variants associated with stroke in genome-wide association studies (GWAS) and examined the emerging priorities and opportunities for rapidly advancing stroke research in the era of Trans-Omics science. Using the PRISMA guideline, we searched PubMed and the NHGRI-EBI GWAS Catalog for stroke studies from 2007 to May 2017. We included 31 studies. The major challenge is that the few validated variants could not account for the full genetic risk of stroke and have not been translated for clinical use. None of the studies included continental Africans. Genomic study of stroke among Africans presents a unique opportunity for the discovery, validation, functional annotation, Trans-Omics study and translation of genomic determinants of stroke with implications for global populations. This is because all humans originated from Africa, a continent with a unique genomic architecture and a distinctive epidemiology of stroke, as well as substantially higher heritability and resolution of fine mapping of stroke genes. Understanding the genomic determinants of stroke and the corresponding molecular mechanisms will revolutionize the development of a new set of precise biomarkers for stroke prediction, diagnosis and prognostic estimates as well as personalized interventions for reducing the global burden of stroke. Copyright © 2017 Elsevier B.V. All rights reserved.
Comparative Genomic Analyses of Clavibacter michiganensis subsp. insidiosus and Pathogenicity on Medicago truncatula.

PubMed

Lu, You; Ishimaru, Carol A; Glazebrook, Jane; Samac, Deborah A

2018-02-01

Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen, with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,917 genes, with 379 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.
Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

PubMed Central

Liu, Qiang; Wang, Xue-Feng; Ma, Jian; He, Xi-Jun; Wang, Xiao-Jun; Zhou, Jian-Hua

2015-01-01

Human immunodeficiency virus (HIV)-1 has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors. PMID:26102582
Behavioral Economics: A New Lens for Understanding Genomic Decision Making.

PubMed

Moore, Scott Emory; Ulbrich, Holley H; Hepburn, Kenneth; Holaday, Bonnie; Mayo, Rachel; Sharp, Julia; Pruitt, Rosanne H

2018-05-01

This article seeks to take the next step in examining the insights that nurses and other healthcare providers can derive from applying behavioral economic concepts to support genomic decision making. As genomic science continues to permeate clinical practice, nurses must continue to adapt practice to meet new challenges. Decisions associated with genomics are often not simple and dichotomous in nature. They can be complex and challenging for all involved. This article offers an introduction to behavioral economics as a possible tool to help support patients', families', and caregivers' decision making related to genomics. Using current writings from nursing, ethics, behavioral economic, and other healthcare scholars, we review key concepts of behavioral economics and discuss their relevance to supporting genomic decision making. Behavioral economic concepts (particularly relativity, deliberation, and choice architecture) are specifically examined as new ways to view the complexities of genomic decision making. Each concept is explored through patient decision making and clinical practice examples. This article also discusses next steps and practice implications for further development of the behavioral economic lens in nursing. Behavioral economics provides valuable insight into the unique nature of genetic decision-making practices. Nurses are often a source of information and support for patients during clinical decision making. This article seeks to offer behavioral economic concepts as a framework for understanding and examining the unique nature of genomic decision making. As genetic and genomic testing become more common in practice, it will continue to grow in importance for nurses to be able to support the autonomous decision making of patients, their families, and caregivers. © 2018 Sigma Theta Tau International.
Comparative genomics of Eucalyptus and Corymbia reveals low rates of genome structural rearrangement.

PubMed

Butler, J B; Vaillancourt, R E; Potts, B M; Lee, D J; King, G J; Baten, A; Shepherd, M; Freeman, J S

2017-05-22

Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomic information between these important taxa and provide further insights into the rate of structural change in tree genomes. We constructed three high density linkage maps for two Corymbia species (Corymbia citriodora subsp. variegata and Corymbia torelliana) which were used to compare genome structure between both species and Eucalyptus grandis. Genome structure was highly conserved between the Corymbia species. However, the comparison of Corymbia and E. grandis suggests large (from 1-13 Mb) intra-chromosomal rearrangements have occurred on seven of the 11 chromosomes. Most rearrangements were supported through comparisons of the three independent Corymbia maps to the E. grandis genome sequence, and to other independently constructed Eucalyptus linkage maps. These are the first large scale chromosomal rearrangements discovered between eucalypts. Nonetheless, in the general context of plants, the genomic structure of the two genera was remarkably conserved, adding to a growing body of evidence that conservation of genome structure is common amongst woody angiosperms.
OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis.

PubMed

Verzotto, Davide; Teo, Audrey S M; Hillmer, Axel M; Nagarajan, Niranjan

2016-01-01

Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.
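
The seeding step of a seed-and-extend map aligner can be illustrated with a small sketch. This is not OPTIMA's index or its statistical model, which the abstract does not specify; it only shows one simple way to look up continuous-valued fragment sizes within an error tolerance. The function names `build_index` and `find_seeds`, the window length `k`, the relative tolerance `tol`, and the example fragment sizes are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_index(ref, k):
    """Index every window of k consecutive reference fragments by its first size.

    The sorted list allows range queries on a continuous value, which a
    hash-based k-mer index cannot do directly.
    """
    return sorted((ref[i], i) for i in range(len(ref) - k + 1))


def find_seeds(query, ref, k=3, tol=0.1):
    """Return sorted (query_pos, ref_pos) pairs where k consecutive fragment
    sizes agree within a relative tolerance (assumed error model: each
    reference size lies within tol * query size of the query size)."""
    index = build_index(ref, k)
    keys = [size for size, _ in index]
    seeds = []
    for q in range(len(query) - k + 1):
        first = query[q]
        # Range lookup on the first fragment size of each reference window.
        lo = bisect_left(keys, first * (1 - tol))
        hi = bisect_right(keys, first * (1 + tol))
        for _, r in index[lo:hi]:
            # Verify the remaining fragments of the candidate window.
            if all(abs(query[q + j] - ref[r + j]) <= tol * query[q + j]
                   for j in range(k)):
                seeds.append((q, r))
    return sorted(seeds)


# Hypothetical fragment sizes in kbp; the query is a noisy copy of ref[1:4].
ref = [12.0, 45.5, 30.2, 8.1, 60.0, 22.4, 45.0, 70.0, 8.0]
query = [46.0, 29.5, 8.3]
print(find_seeds(query, ref))  # [(0, 1)]
```

In a full aligner each seed would then be extended in both directions and scored, and the window starting at reference position 6 (first size 45.0) shows why verification matters: it passes the range lookup but fails on the second fragment.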
